python - OpenCV/C++ program slower than its numpy counterpart, what should I do? -

September 15, 2011

I implemented the Procrustes analysis algorithm in Python some time ago and was recently told to port it to OpenCV/C++. After finishing, I ran some tests, and for the same input/instances the C++ code takes twice as long as the Python code (roughly 8 vs. 4 seconds, respectively; I'm repeating the tests a thousand times to make sure I'm not measuring them over too small a period). I'm baffled by these results.
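For reference, the "repeat it a thousand times" measurement can be sketched in Python with the standard `timeit` module. This is only an illustration: `workload` is a hypothetical stand-in for one run of the algorithm, and `time_repeated` is a helper name made up for this sketch, not something from the original code.

```python
import timeit


def workload():
    # Hypothetical stand-in for one run of the algorithm under test.
    return sum(i * i for i in range(1000))


def time_repeated(func, repeats=1000, rounds=5):
    """Return the best total wall-clock time (seconds) for `repeats` calls."""
    # timeit.repeat runs `rounds` independent batches of `repeats` calls each.
    # Taking the minimum reduces noise from other processes on the machine.
    totals = timeit.repeat(func, number=repeats, repeat=rounds)
    return min(totals)


best = time_repeated(workload)
print("best batch: %.4f s for 1000 calls" % best)
```

Timing a whole batch rather than a single call keeps the measurement well above the clock's resolution, which is the point of repeating the test.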

I've used gprof to try to understand what's going on, but I can't tell a whole lot about what's wrong, besides the fact that cv::Mat::~Mat() is taking 34.67% of the execution time and is being called 100+ times more often than the other functions. I'm not sure what I should do about that either, unless I'm supposed to replace the cv::Mats with std::vectors or raw arrays, both of which seem like bad practice to me.

    void align(const cv::Mat& points, const cv::Mat& pointsref, cv::Mat& res, cv::Mat& ops) {
        // Work on copies so the inputs stay untouched.
        cv::Mat pts(points.rows, points.cols, CV_64FC1);
        cv::Mat ptsref(points.rows, points.cols, CV_64FC1);
        points.copyTo(pts);
        pointsref.copyTo(ptsref);

        // Translation: subtract each column's mean from both point sets.
        cv::Mat avgs = meanofcolumns(pts);
        for(int i = 0; i < avgs.cols; i++) {
            pts.col(i) -= avgs.col(i);
        }
        cv::Mat avgsr = meanofcolumns(ptsref);
        for(int i = 0; i < avgsr.cols; i++) {
            ptsref.col(i) -= avgsr.col(i);
        }

        // Scale: ratio of the mean distances from the origin.
        cv::Mat x2(pts.rows, 1, CV_64FC1);
        cv::Mat y2(pts.rows, 1, CV_64FC1);
        cv::Mat x2r(pts.rows, 1, CV_64FC1);
        cv::Mat y2r(pts.rows, 1, CV_64FC1);
        cv::pow(pts.col(0), 2, x2);
        cv::pow(pts.col(1), 2, y2);
        cv::pow(ptsref.col(0), 2, x2r);
        cv::pow(ptsref.col(1), 2, y2r);
        cv::Mat sqrootp(pts.rows, 1, CV_64FC1);
        cv::Mat sqrootpr(pts.rows, 1, CV_64FC1);
        cv::sqrt(x2r + y2r, sqrootpr);
        cv::sqrt(x2 + y2, sqrootp);
        double offsets = (cv::mean(sqrootpr) / cv::mean(sqrootp))[0];
        pts *= offsets;

        // Rotation: mean difference between the points' angles.
        cv::Mat rot(pts.rows, 1, CV_64FC1);
        cv::Mat rotr(pts.rows, 1, CV_64FC1);
        rot = arctan2(pts.col(1), pts.col(0));
        rotr = arctan2(ptsref.col(1), ptsref.col(0));
        double offsetr = -cv::mean((rot - rotr))[0];
        cv::Mat angrot(pts.rows, 1, CV_64FC1);
        angrot = rot + offsetr;

        // Rebuild x and y from each point's distance and rotated angle.
        cv::Mat dist(pts.rows, 1, CV_64FC1);
        cv::pow(pts.col(0), 2, x2);
        cv::pow(pts.col(1), 2, y2);
        cv::sqrt(x2 + y2, dist);
        copycolumn(dist.mul(cosine(angrot)), res, 0, 0);
        copycolumn(dist.mul(sine(angrot)), res, 0, 1);

        // Record the applied operations: translation, then scaled rotation terms.
        ops.at<double>(0, 0) = -avgs.at<double>(0, 0);
        ops.at<double>(0, 1) = -avgs.at<double>(0, 1);
        ops.at<double>(0, 2) = offsets * cv::cos(offsetr / radians_to_degrees);
        ops.at<double>(0, 3) = offsets * cv::sin(offsetr / radians_to_degrees);
    }

This code aligns two sets of points. It calls some functions that aren't shown, but they're simple and I can explain them if necessary, though I hope their names are enough to understand what they do.
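To make the steps easier to follow, here is a rough pure-Python sketch of the same idea (translate, scale, rotate) for 2-D points. It is not the original Python implementation, which isn't shown. The function names are made up for illustration, angles are assumed to be in radians, and like the C++ code above it ignores angle wrap-around at +/- pi.

```python
import math


def mean(values):
    return sum(values) / len(values)


def align(points, points_ref):
    """Align 2-D `points` to `points_ref` (lists of (x, y) tuples)."""
    # 1. Translate both sets so their centroids sit at the origin.
    cx, cy = mean([p[0] for p in points]), mean([p[1] for p in points])
    rx, ry = mean([p[0] for p in points_ref]), mean([p[1] for p in points_ref])
    pts = [(x - cx, y - cy) for x, y in points]
    ref = [(x - rx, y - ry) for x, y in points_ref]

    # 2. Scale so the mean distance from the origin matches the reference.
    scale = (mean([math.hypot(x, y) for x, y in ref])
             / mean([math.hypot(x, y) for x, y in pts]))
    pts = [(x * scale, y * scale) for x, y in pts]

    # 3. Rotation offset: negated mean of the per-point angle differences.
    offset = -mean([math.atan2(y, x) - math.atan2(v, u)
                    for (x, y), (u, v) in zip(pts, ref)])

    # 4. Rebuild Cartesian coordinates from distance and rotated angle.
    aligned = []
    for x, y in pts:
        radius, angle = math.hypot(x, y), math.atan2(y, x) + offset
        aligned.append((radius * math.cos(angle), radius * math.sin(angle)))
    return aligned


# Example: a rotated, scaled and shifted copy of a reference shape.
reference = [(2.0, 1.0), (-2.0, -1.0), (1.0, -2.0), (-1.0, 2.0)]
theta = 0.3
moved = [(2 * (x * math.cos(theta) - y * math.sin(theta)) + 5,
          2 * (x * math.sin(theta) + y * math.cos(theta)) - 3)
         for x, y in reference]
result = align(moved, reference)
```

In this example `result` should land back on `reference`, because the reference shape is already centred on the origin and none of the angles cross the +/- pi boundary.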

I'm a casual C++ programmer, so go easy on me, guys.

It would seem Ignacio Vazquez-Abrams has the right idea. Here is a more concise/direct example:

    #include <boost/date_time/posix_time/posix_time.hpp>
    #include <cv.hpp>
    #include <iostream>

    using namespace boost::posix_time;

    int main() {
        cv::Mat m1(1000, 1000, CV_64FC1);
        cv::Mat m2(1000, 1000, CV_64FC1);
        ptime firstvalue( microsec_clock::local_time() );
        // Multiply two 1000x1000 matrices 10 times.
        for(int i = 0; i < 10; i++) {
            cv::Mat m3 = m1 * m2;
        }
        ptime secondvalue( microsec_clock::local_time() );
        time_duration diff = secondvalue - firstvalue;
        std::cout << diff.seconds() << "." << diff.fractional_seconds() << " microsec" << std::endl;
    }

That takes around 14+ seconds on my machine. Now with Python:

    import datetime
    import numpy as np

    if __name__ == '__main__':
        print(datetime.datetime.now())
        m1 = np.zeros((1000, 1000), dtype=float)
        m2 = np.zeros((1000, 1000), dtype=float)
        # Multiply two 1000x1000 matrices 1000 times.
        for i in range(1000):
            m3 = np.dot(m1, m2)
        print(datetime.datetime.now())

That takes 4+ seconds, even though the C++ example is doing it only 10 times, whereas the Python (Fortran) one is doing it 1000 times.
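Put per multiplication, the gap is easier to see. The figures below are just the rounded timings quoted above ("14+" and "4+" seconds), so the result is only a ballpark:

```python
# Rounded timings quoted in the post.
cpp_total_s, cpp_iterations = 14.0, 10      # C++: 10 matrix products
py_total_s, py_iterations = 4.0, 1000       # Python: 1000 matrix products

cpp_per_product = cpp_total_s / cpp_iterations   # seconds per product in C++
py_per_product = py_total_s / py_iterations      # seconds per product in Python
ratio = cpp_per_product / py_per_product

print("C++:    %.3f s per product" % cpp_per_product)
print("Python: %.3f s per product" % py_per_product)
print("ratio:  about %d times" % round(ratio))
```

That works out to roughly 1.4 s versus 0.004 s per product on this machine, a difference of a few hundred times for this one operation.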

Well, okay, update time.

I reviewed the Python code I was using and realized it was loading only a subset of the points (about 5%)... which means my C++ tests were actually running about 20 times more instances than the Python code. So the C++ code is actually around 10 times faster, since it was only twice as slow. It still seems as if numpy has OpenCV beat in some operations, though.
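The arithmetic behind that estimate, spelled out with the rounded numbers from the post. It assumes the running time grows roughly linearly with the number of instances, which wasn't verified:

```python
python_time_s = 4.0       # Python run, on about 5% of the points
cpp_time_s = 8.0          # C++ run, on all of the points
subset_fraction = 0.05    # share of the points the Python code loaded

workload_ratio = 1 / subset_fraction        # C++ handled ~20x more instances
time_ratio = cpp_time_s / python_time_s     # C++ took ~2x as long
speedup = workload_ratio / time_ratio       # ~10x faster per instance

print("C++ is about %.0fx faster per instance" % speedup)
```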

    for(int i = 0; i < 10; i++) {
        cv::Mat m3 = m1 * m2;
    }

This is totally pointless in C++: m3 is destroyed on each iteration of the loop, and that's why you're seeing all the destructor calls.

Edit:

    cv::Mat m3 = m1 * m2;

and

    m3 = np.dot(m1, m2)

aren't the same thing. Have you tried comparing the cross product in numpy or the dot product in OpenCV?

c++ python image-processing opencv numpy
