Exploring OpenCV’s ML module

OpenCV is a de facto standard in the computer vision world. Besides its many useful features, it has a machine learning module that the community has paid comparatively little attention to. In this tutorial, we will see how to use famous ML models to classify handwritten digits. As always, the code is in C++ and available on GitHub.


2 replies
  1. Bartek Zdanowski says:

    Hello.
    Thank you for this and all the other posts. I’ve just discovered your page and I’ve been reading article after article for the past hour, even though it’s 00:05 at night :)

    Will you continue to write such great posts? Are you willing to write the CNN article mentioned at the end?
    Best regards,
    Bartek

    • Zana Zakaryaie says:

      Hi Bartek,
      Thanks for your warm words :)
      I’m quite willing to write posts on CNNs and other computer vision topics, but unfortunately I’ve been very busy during the last couple of months. I hope to write new posts as soon as possible.



Become an embedded computer vision engineer

Becoming an embedded computer vision engineer has been my main goal over the last several years. It all arose from my interest in image processing and the thirst for running code faster and faster! Now, after 7 years, when I look back and compare myself with the early days, I see lots of things that are worth sharing. In this post, I will try to shed some light on how to start down this path, get good jobs, and stay up to date in this fast-changing branch of engineering.

0 replies


Object trackers you deserve!

You want to continuously detect an object, but your object detection algorithm fails to do so in some frames. The easiest way to solve the issue is neither training the detector with more data nor using a fancier detection algorithm. Clone this code and read the post to see how adding a good tracker will give you cleaner outputs at a higher FPS!


6 replies
    • Zana Zakaryaie says:

      Thanks Saeed. Honestly, I’m not a Python coder, but you can use the Python API of the Dlib library to run the multi-scale MOSSE tracker. For the multi-scale KCF, there is a Python wrapper here, but you should change small parts so that your tracker can report failures. I have explained the details of this for the C++ code; I don’t think it would be hard to do the same for the Python wrapper. The FAST-KLT tracker, however, requires more work: you have to replace the OpenCV C++ functions with their Python versions, plus some object-oriented programming.

  1. Rich says:

    Hi, Zana. I’m porting your fast_klt C++ tracker to Python, but I’m having trouble following it without code documentation. Specifically, I’m struggling to grasp the idea behind the FastKltTracker::keepStrongest function (line 160 of main/fast_klt/FastKltTracker.cpp).

    It seems you sort the keypoints calculated by FAST and then erase them based on a threshold (or pivot), but I’m not sure, since std::nth_element is not quite clear to me in this context. Can you give me a few more details about this operation? What is the criterion for sorting the keypoints? (It seems the vector ends up with elements larger than the pivot on its left side and elements smaller than the pivot on its right — is that right?)

    Then comes the keypoint filtering, using erase. You seem to delete all the elements from the position of the (now ordered) pivot until the end, is that right? This last bit is confusing me: you are not looking at the actual value of the pivot but rather at its position, and you delete all the items from this position (inclusive) to the end of the vector, and I’m not sure why. It would be great if you could explain the idea a little more, so I can translate it to pure Python.

    Thanks!

    • Zana Zakaryaie says:

      Hi Rich,
      FAST might detect hundreds of points in a bounding box, but estimating the motion between two boxes does not really need that many points. We prefer to have fewer BUT stronger points because they are easier and more robust to track. This is what the keepStrongest function does: we sort the keypoints by their response (descending), pick the first N points, and get rid of the rest. This kind of sorting (where we are only interested in the first N items and not the others) can be done much faster with std::nth_element than with std::sort. I don’t know the equivalent function in Python, but you can simply sort your array of keypoints by response and then pick the first N points. If the speed is not satisfactory, take a look at this link.
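The trick described above can be sketched in a few lines of standalone C++. This is only a minimal illustration of the idea, not the actual code from the fast_klt repo; the Keypoint struct here is a stand-in for cv::KeyPoint:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for cv::KeyPoint: only the response field matters here.
struct Keypoint {
    float response;
};

// Keep only the n strongest keypoints (highest response).
// std::nth_element partially sorts the range: afterwards, the element at
// position n is the one a full descending sort would put there, everything
// before it has a response >= it, and everything after has a response <= it.
// That partial ordering is all we need, and it runs in O(n) on average
// instead of O(n log n) for std::sort.
void keepStrongest(std::vector<Keypoint>& kps, std::size_t n)
{
    if (kps.size() <= n)
        return;
    std::nth_element(kps.begin(), kps.begin() + n, kps.end(),
                     [](const Keypoint& a, const Keypoint& b) {
                         return a.response > b.response;  // descending order
                     });
    kps.erase(kps.begin() + n, kps.end());  // drop the weak tail
}
```

This also answers the "position vs. value" question: after the partial sort, everything at or beyond position n is guaranteed to be weaker than the kept points, so erasing by position is safe.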



Transfer learning for face mask recognition using libtorch (Pytorch C++ API)

In the previous post, we learned how to load a pre-trained model in libtorch and classify images with it. But real-world applications often include objects that are not in the ImageNet dataset, so we need a network that classifies our custom targets. In this tutorial, we will use transfer learning to fine-tune ResNet18 for face mask recognition. The code for this series of libtorch posts can be found here.


0 replies


Image classification with pre-trained models using libtorch (Pytorch C++ API)

Deep learning has revolutionized computer vision. There are thousands of Python code snippets to start from, but few in C++. If you like C++ like me and want to deploy your models on edge devices, then this series of posts is for you. As a gentle introduction, I will explain how to use libtorch to do image classification with pre-trained models. But there will be much more exciting posts in the future ;) Stay tuned.


10 replies
  1. Ayhan says:

    Hi Zana,
    Thanks for sharing this great post.

    1. How can I define the torch includes in the C++ path?
    #include
    #include

    2. Is Torch Script currently the only way to use pre-trained models in the C++ API?

    • Zana Zakaryaie says:

      Thanks Ayhan.
      1. CMake handles the header files in the code. But if you want to work inside an IDE, give it the path of the pre-built libtorch. Depending on your IDE, there will be places to set the path of the header files (the include folder) as well as the static and shared libraries (the lib folder).

      2. To the best of my knowledge, yes.

      • Ayhan says:

        Thanks,
        Which IDE do you use for OpenCV C++ and PyTorch C++, and where do I set the OpenCV and PyTorch C++ paths in it?

        I also have a question about comparing C++ and Python code on embedded devices: I know that for heavy computations like matrix operations it’s better to use C++, but suppose I want to do operations like NMS in a detector, or assigning IDs and matching boxes for object tracking. Is it better to use C++, or can I get the same performance with Python?

        • Zana Zakaryaie says:

          Hi Ayhan

          Regarding the IDE:
          I use Code::Blocks. Go to the “Settings -> Compiler” tab. Under “Search directories”, set the path to the header files of your library (either OpenCV or libtorch). Under “Linker settings”, set the path to the shared (.so) or static (.a) libraries of OpenCV or libtorch. But if you run into issues, don’t waste time on that: use Code::Blocks (or any other IDE) just as a text editor. Once you think your code is ready, prepare a “CMakeLists.txt”, then use the “cmake” command to generate a “Makefile”, and finally build your code with the “make” command. You can find excerpts of this routine in my GitHub repos.
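For reference, the CMake routine described above boils down to a short “CMakeLists.txt”. This is a minimal sketch for an OpenCV project; the project name and source file are placeholders:

```cmake
cmake_minimum_required(VERSION 3.10)
project(my_app)                        # hypothetical project name

find_package(OpenCV REQUIRED)          # locates the installed OpenCV config

add_executable(my_app main.cpp)        # hypothetical source file
target_include_directories(my_app PRIVATE ${OpenCV_INCLUDE_DIRS})
target_link_libraries(my_app ${OpenCV_LIBS})
```

Then, from the project folder: `mkdir build && cd build && cmake .. && make`.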

          Regarding performance:
          I think Python can give you good performance for the tasks you mentioned, but try to use NumPy’s vectorized operations as much as possible. If that doesn’t meet your requirements, then go with C++.



Low-latency Full-HD video streaming from Raspberry Pi

Raspberry Pi boards are getting more and more widespread, but when it comes to real-time video streaming you may find yourself lost in a bunch of long, serpentine shell commands! In this post, I will give you some crystal-clear instructions for receiving a low-latency stream from a CSI or USB camera. The key to achieving this is to do the H.264 encoding on the RPi’s GPU (not the CPU). The stream is then received frame-by-frame in an OpenCV program, where you can add text, overlays, or any other processing you wish.


17 replies
  1. Deepu VP says:

    Hi Zana, this article is really helping me stream video from my Pi 4, but I’m not able to play it with VLC on a Windows box. Are you able to help me with this?
    I tried:
    udp://@:5000
    rtp://@:5000
    with :demux=h264 and without, but none of them seem to work.

    Thanks

    • Zana Zakaryaie says:

      Hi,
      Honestly, I didn’t test it with VLC. Regarding Windows, I don’t think it requires anything different from Linux, because OpenCV, FFmpeg, and GStreamer are all cross-platform libraries.
      Just make sure you have installed them and that you are using the correct IPs. My Linux IP was 10.42.0.1 and I used port 5000, but you can use other ports too. By the way, if you can see the frames on your laptop via cv::imshow, then you can use cv::VideoWriter to save them as a video and play it later with VLC or any other video player.

  2. David B says:

    Not to ask a dumb question (I’m not a coder), but I’d like to build this because I’m looking for a low-latency solution for streaming from the Pi to a laptop. When you say “Back on the laptop build and run this code”, what exactly does that mean? What app do I use to build it, how do I run it, etc.? Again, sorry if that’s basic. I’m using Windows 10.

    Thanks

    • Zana Zakaryaie says:

      Hi David,
      Your question (as someone with no coding experience) is not dumb at all. The code below the section “Back in the laptop build and run this code” is C++ code that uses the OpenCV library. To run this code (and generally any other C++ code), we first have to build it. This can be done with MinGW on Windows, so make sure you have it installed on your machine. The next step is to build the OpenCV library. That is not easy enough to explain in a comment; I’ve given the instructions for Linux here, but since you are using Windows, I would suggest reading this post.

  3. David says:

    Do you know a solution for streaming this to a web server on the Pi? I don’t want to use any software, just open the IP in my browser. I’ve tried a Python web server with the picamera library, but that gives me a very poor frame rate. Any ideas?

    Good post tho

    • pete says:

      You could definitely run Zana’s code and send the stream to a server (i.e. another Raspberry Pi; more powerful is better though, like a 3B+), then transcode it from UDP to HLS (or, for lower latency, the new LL-HLS). That gives you something like playlist.m3u8, which you can make available to the network through an Apache or nginx web server (on the Raspberry Pi) and then request in normal HTML through the tag. This is rather complicated, but it’s one of the only solutions I can think of.
      If you want P2P streaming, i.e. from the camera directly to your phone or PC, you can just install an “IP cam viewer” and enter the corresponding URL.

  4. Farzad says:

    Hi Zana,

    I was searching for a method to stream RPi cameras with low latency and came across your blog. It was very helpful for me!

    I just wanted to mention that, for streaming the camera, this command also works fine:

    ‘raspivid -t 0 -w 800 -h 600 -fps 30 -hf -vf -b 50000000 -o - | gst-launch-1.0 -e -vvvv fdsrc ! h264parse ! rtph264pay pt=96 config-interval=1 ! udpsink host=xx.xx.xx.xx port=5000’

    • Zana Zakaryaie says:

      I don’t remember, to be honest. But since that post was published on November 5, 2020, and since I used git clone to get raspicam, you can find the latest version available at that time by checking their GitHub repo.

  5. Ulfberth says:

    Hi Zana! Thanks for your tutorial!
    RPi newbie here :D
    What hardware did you use in your tutorial?
    Is the Zero W (not the Zero 2 W) capable of doing 640×480 at 25 fps with <100 ms latency?

  6. Nejc says:

    Hello, after running “sudo ./autogen.sh ……” I get an error saying that I have to install or upgrade the GStreamer development packages. I already have them, and their version is 1.22. Too new?
    Thank you



Looking for fast object detection on ARM CPUs?

In the previous post, I explained the idea behind cascade classifiers. In this post, I will give you some clear instructions for easily training an accurate custom object detector using my C++ toolbox. We will also see how easily we can accelerate this accurate detector by 20%. Stay tuned :)


0 replies


A brief explanation of cascade classifiers

The cascade classifier is an old algorithm that was originally proposed for real-time face detection on CPUs. In this post, I will cover its nice and powerful idea, and then, in the next post, give you some clear instructions for easily training an accurate custom object detector using my C++ toolbox.
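The core idea — cheap early stages rejecting most candidate windows before the expensive stages ever run — can be sketched in a few lines of standalone C++. This is only a toy illustration of early rejection, not OpenCV’s actual implementation:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A stage is a cheap binary classifier over some feature value.
// In a real cascade each stage evaluates Haar-like features over a
// detection window; here a single double stands in for the window.
using Stage = std::function<bool(double /*feature*/)>;

// A window is accepted only if it passes EVERY stage. Because most
// windows are background, the vast majority get rejected by the first
// cheap stages and never reach the later, more expensive ones --
// that early exit is what makes cascades fast.
bool cascadeAccepts(const std::vector<Stage>& stages, double feature)
{
    for (const auto& stage : stages)
        if (!stage(feature))
            return false;  // early rejection: stop at the first failed stage
    return true;           // survived all stages -> positive detection
}
```

Later stages are trained only on the samples that survive the earlier ones, which is why each stage can stay simple.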


0 replies


Advanced tips to optimize C++ codes for computer vision (III): Parallel Processing

Multicore processing was a paradigm shift in computer science. The shift was so big that today it’s really hard to find single-core CPUs, even on low-power SBCs. Computer vision algorithms, from simple pixel manipulations to more complex tasks like classification with deep neural networks, have the potential to run in parallel on multiple cores. In this post, we will see how to easily parallelize the Gaussian-blur function that we implemented in the previous post. Our code will run almost 3x faster than the single-threaded version.
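To give a flavor of the approach, here is a generic row-splitting sketch in standalone C++. This is an illustration with std::thread and a placeholder per-pixel kernel, not the post’s actual Gaussian-blur code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Row-parallel sketch: split the rows of an image evenly across the
// available hardware threads and run the same per-row kernel on each
// chunk. Here the "kernel" just scales pixel values; a Gaussian-blur
// row pass would slot in the same way.
void processRows(std::vector<float>& img, std::size_t rows, std::size_t cols)
{
    const std::size_t nThreads =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());
    const std::size_t chunk = (rows + nThreads - 1) / nThreads;  // ceil division

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < nThreads; ++t) {
        const std::size_t r0 = t * chunk;
        const std::size_t r1 = std::min(rows, r0 + chunk);
        if (r0 >= r1) break;  // more threads than rows: nothing left to assign
        workers.emplace_back([&img, cols, r0, r1] {
            for (std::size_t r = r0; r < r1; ++r)
                for (std::size_t c = 0; c < cols; ++c)
                    img[r * cols + c] *= 2.0f;  // placeholder kernel
        });
    }
    for (auto& w : workers)
        w.join();  // wait for every chunk to finish
}
```

Each thread writes to a disjoint range of rows, so no locking is needed; compile with `-pthread` on Linux.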


0 replies


Advanced tips to optimize C++ codes for computer vision tasks (II): SIMD

Single Instruction, Multiple Data (SIMD), also known as vectorization, is a powerful technique for accelerating computer vision algorithms. In this post, I will explain the concept and then introduce an easy way to use it in your code. We will see how we can benefit from SIMD to further reduce the runtime of the Gaussian-blur function that we implemented in the previous post.
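To give a flavor of the concept, here is a minimal standalone example using raw SSE2 intrinsics. This is an x86-only illustration of what “one instruction, multiple data” means, not the (more portable) approach the post itself introduces:

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics (x86/x86-64 only)

// Add two float arrays four elements at a time. A single _mm_add_ps
// instruction performs four float additions at once -- that is the
// "single instruction, multiple data" in SIMD. For simplicity this
// assumes n is a multiple of 4; real code would handle the leftover
// tail with a scalar loop.
void addFloatsSse2(const float* a, const float* b, float* dst, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  // load 4 floats (unaligned is ok)
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));  // 4 additions at once
    }
}
```

SSE2 is part of the x86-64 baseline, so this compiles without extra flags on 64-bit x86; on ARM the equivalent would use NEON intrinsics.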

2 replies
  1. Wei says:

    Hey Zana,

    I downloaded the tutorial code and ran the SIMD code on my x86 PC, and got the error “‘CV_CPU_HAS_SUPPORT_SSE2’ was not declared in this scope”. I switched to running the code on an ARM platform and got a similar error: “‘CV_CPU_HAS_SUPPORT_NEON’ was not declared in this scope”. Do you have any idea how to solve it?

    Thanks for your reply.

    • Zana Zakaryaie says:

      Hi Wei,
      Sorry for my late reply.
      For any runtime or compilation error, please open an issue in the GitHub repo. That way, other users can also see and follow the solution if they face the same issue.

