Exploring OpenCV’s ML module

OpenCV is a de facto standard in the computer vision world. Besides its many useful features, it has a machine learning module to which the community has paid less attention. In this tutorial, we will see how to use well-known ML models to classify handwritten digits. As always, the code is in C++ and available on GitHub.

2 replies

Bartek Zdanowski says:
February 6, 2022 at 2:36 am

Hello.
Thank you for this and all other posts. I’ve just discovered your page and I’ve been reading article after article for the past hour, even though it’s 00:05 at night :)

Will you continue to write such great posts? Are you willing to write the CNN article mentioned at the end?
Best regards,
Bartek
- Zana Zakaryaie says:
  February 7, 2022 at 9:59 am
  
  Hi Bartek
  Thanks for your warm words :)
  I’m quite willing to write posts on CNNs and other computer vision topics but, unfortunately, I’ve been very busy during the last couple of months. I hope to write new posts as soon as possible.

Become an embedded computer vision engineer

Becoming an embedded computer vision engineer has been my main goal for the last several years. It all grew out of my interest in image processing and a thirst for running code faster and faster! Now, after 7 years, when I look back and compare myself with the early days, I see lots of things that are worth sharing. In this post, I will try to shed some light on how to start down this path, get good jobs, and stay up to date in this fast-changing branch of engineering.

0 replies

Object trackers you deserve!

You want to continuously detect an object but your object detection algorithm fails to do so in some frames. The easiest way to solve the issue is neither training the detector with more data nor using a fancier detection algorithm. Clone this code and read the post to see how adding a good tracker will give you cleaner outputs with higher FPS!

6 replies

Saeed says:
June 1, 2021 at 10:36 pm

Great job Zana. Do you happen to have this in python?
- Zana Zakaryaie says:
  June 2, 2021 at 8:15 am
  
  Thanks Saeed. Honestly, I’m not a Python coder. But you can use the python API of the Dlib library for running the multi-scale MOSSE tracker. For the multi-scale KCF, there is a Python wrapper here but you should change small parts so that your tracker can report failures. I have explained the details of this for the CPP code. I don’t think it would be hard to do the same for the Python wrapper. The FAST-KLT tracker however requires more work. You have to replace the OpenCV CPP functions with their python versions plus some object-oriented programming.
Rich says:
December 21, 2021 at 7:18 am

Hi, Zana. I’m porting your fast_klt C++ tracker to Python, but I’m having trouble following you without code documentation. I’m having trouble grasping the idea behind the void FastKltTracker::keepStrongest (line 160 of main/fast_klt/FastKltTracker.cpp) function.

It seems you sort the keypoints calculated by FAST and then erase them based on a threshold (or pivot), but I’m not sure since std::nth_element is not quite clear to me in this context. Can you give me a few more details about this operation? What’s the criterion for sorting the keypoints (it seems that you sort the vector like this: left side of the pivot, larger than the pivot; right side of the pivot, smaller than the pivot – right?).

Then comes the keypoint filtering, using erase. You seem to delete all the elements starting from the position of the original pivot (now ordered) until the end, is that right? This last bit is confusing me: you are not looking at the actual value of the pivot, but rather its position, and you delete all the items starting from this position (inclusive) until the end of the vector. I’m not sure why. It would be great if you could explain a little more of the actual idea to me, so I can translate it to pure Python.

Thanks!
- Zana Zakaryaie says:
  December 21, 2021 at 11:11 am
  
  Hi Rich
  FAST might detect hundreds of points in a bounding box, but estimating the motion between two boxes does not really need that many. We prefer to have fewer BUT strong points because they are easier to track and more robust. This is done by the keepStrongest function. We order the keypoints by their response (descending), pick the first N points, and get rid of the others. This kind of partial ordering (where we only care about which items are in the first N, not about the order of the rest) can be done much faster with the std::nth_element function than with std::sort. I don’t know the equivalent function in Python, but you can just sort your array of keypoints by response and then pick the first N points. If the speed is not satisfactory, take a look at this link.
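To make the idea above concrete, here is a minimal, self-contained sketch of the "keep the N strongest" step. It is not the code from the repository: the KeyPoint struct is a hypothetical stand-in for cv::KeyPoint, and the function body is only an assumption of how std::nth_element followed by erase can be combined.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::KeyPoint: only the fields this sketch needs.
struct KeyPoint {
    float x;
    float y;
    float response;  // corner strength reported by the detector
};

// Keep the n keypoints with the highest response and drop the rest.
// The survivors are NOT fully sorted: nth_element only guarantees that
// everything before position n is at least as strong as everything after it.
void keepStrongest(std::vector<KeyPoint>& keypoints, std::size_t n) {
    if (keypoints.size() <= n) return;  // nothing to discard
    const auto pivot = keypoints.begin() + static_cast<std::ptrdiff_t>(n);
    std::nth_element(keypoints.begin(), pivot, keypoints.end(),
                     [](const KeyPoint& a, const KeyPoint& b) {
                         return a.response > b.response;  // descending
                     });
    // Erase by position: the weakest points now sit from the pivot onwards.
    keypoints.erase(pivot, keypoints.end());
    assert(keypoints.size() == n);
}
```

This is why the erase works on the pivot's position rather than its value: after the partial ordering, position alone separates the strong points from the weak ones.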
  - Rich says:
    January 4, 2022 at 7:42 am
    
    Thanks, Zana. The port is (mostly) complete. At least an initial version – I’m still improving it, there are some tricks to reproduce the same functionality within Python (The sorting problem in my previous question, for example, was solved using a Heap Queue algorithm). Here’s the repository: https://github.com/gone-still/fastKLT-tracker
    - Zana Zakaryaie says:
      January 4, 2022 at 5:23 pm
      
      Glad to hear of your success, Rich. I will mention your repo so that more Python-friendly people can use it :)
      Thank you

Transfer learning for face mask recognition using libtorch (PyTorch C++ API)

In the previous post, we learned how to load a pre-trained model in libtorch and classify images with it. But real-world applications often include objects that are not in the ImageNet dataset. We need a network to classify our custom targets. In this tutorial, we will use transfer learning to fine-tune ResNet18 for face mask recognition. The code for this series of libtorch posts can be found here.

0 replies

Image classification with pre-trained models using libtorch (PyTorch C++ API)

Deep learning has revolutionized computer vision. There are thousands of Python code snippets to start from but few in C++. If you like C++ as I do and want to deploy your models on the edge, then this series of posts is for you. As a gentle introduction, I will explain how to use libtorch to do image classification using pre-trained models. But there will be many more exciting posts in the future ;) Stay tuned.

10 replies

Ayhan says:
November 17, 2020 at 11:04 pm

Hi zana,
Thanks for sharing this great post.

1- How can I add the torch includes to the C++ include path?
#include
#include

2- Is TorchScript currently the only way to use pre-trained models in the C++ API?
- Zana Zakaryaie says:
  November 18, 2020 at 12:48 am
  
  Thanks Ayhan.
  1. CMake handles the header files in the code. But if you want to work inside an IDE, then give it the path of the pre-built libtorch. Depending on your IDE, there must be places to give the address of header files (include folder) as well as static and shared libraries (lib folder).
  
  2. To the best of my knowledge, yes
  - Ayhan says:
    December 4, 2020 at 7:35 pm
    
    Thanks,
    Which IDE do you use for OpenCV C++ and PyTorch C++ so that the paths are simple to define? Where must I set the OpenCV and PyTorch C++ paths in that IDE?
    
    I have a question about comparing C++ and Python code on embedded devices:
    I know that for heavy computations such as matrix operations it’s better to use C++. But suppose I want to do operations like NMS in detectors, or assigning IDs and checking matching boxes for object tracking. Is it better to use C++, or can I get the same performance with Python?
    - Zana Zakaryaie says:
      December 4, 2020 at 10:29 pm
      
      Hi Ayhan
      
      Regarding IDE:
      I use Code::Blocks. Go to the “settings->compiler” tab. Then, in “search directories”, set the path of the header files of your library (either OpenCV or Libtorch). In “linker settings”, set the path of the shared (.so) or static (.a) libraries of OpenCV or Libtorch. But if you run into issues, don’t waste time on that. Use Code::Blocks (or any other IDE) just as a text editor. Once you think your code is ready, prepare a “CMakeLists.txt” and then use the “cmake” command to generate a “Makefile”. Finally, build your code with the “make” command. You can find excerpts of this routine in my GitHub repos.
      
      Regarding Performance:
      I think Python can give you good performance for the tasks you mentioned. But try to use NumPy vectorized commands as much as possible. If it doesn’t meet your requirements, then go with C++.
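For readers who do end up moving NMS to C++, a minimal sketch of greedy non-maximum suppression could look like the following. Nothing here comes from the posts: the Detection struct, the function names, and the corner-based box layout are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical detection type: axis-aligned box plus a confidence score.
struct Detection {
    float x1, y1, x2, y2;  // top-left and bottom-right corners
    float score;
};

// Intersection-over-union of two boxes; 0 when they do not overlap.
float iou(const Detection& a, const Detection& b) {
    const float w = std::min(a.x2, b.x2) - std::max(a.x1, b.x1);
    const float h = std::min(a.y2, b.y2) - std::max(a.y1, b.y1);
    if (w <= 0.f || h <= 0.f) return 0.f;
    const float inter = w * h;
    const float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (areaA + areaB - inter);
}

// Greedy NMS: visit boxes from highest to lowest score and keep a box only
// if it does not overlap an already kept box by more than iouThreshold.
std::vector<Detection> nms(const std::vector<Detection>& dets, float iouThreshold) {
    assert(iouThreshold >= 0.f && iouThreshold <= 1.f);
    std::vector<std::size_t> order(dets.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(), [&](std::size_t i, std::size_t j) {
        return dets[i].score > dets[j].score;
    });

    std::vector<Detection> kept;
    for (std::size_t idx : order) {
        const bool suppressed =
            std::any_of(kept.begin(), kept.end(), [&](const Detection& k) {
                return iou(k, dets[idx]) > iouThreshold;
            });
        if (!suppressed) kept.push_back(dets[idx]);
    }
    return kept;
}
```

With a handful of boxes per frame, the quadratic loop is rarely the bottleneck, which is consistent with the advice above to try vectorized Python first.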
John Rogers says:
December 1, 2020 at 5:40 am

just found your blog, it’s very helpful, many thanks
- Zana Zakaryaie says:
  December 1, 2020 at 10:23 am
  
  Thanks John. I’m glad you find it useful :)
Raghunath says:
May 14, 2022 at 7:00 pm

Amazing posts, really impressed with the content, especially w.r.t. C++

Thanks Zana Zakaryaie ji
- Zana Zakaryaie says:
  May 15, 2022 at 1:33 pm
  
  Thanks for your kind words, Raghunath. I’m glad you find it useful
jhnam says:
July 15, 2022 at 1:25 pm

hello, thanks for your post.
where is data? (txt file and animal pictures)
- Zana Zakaryaie says:
  July 15, 2022 at 5:36 pm
  
  Hi
  Please check the Github repository:
  https://github.com/zanazakaryaie/libtorch_examples/tree/main/image_classification_pretrained/data

Low-latency Full-HD video streaming from Raspberry Pi

Raspberry Pi boards are getting more and more widespread. But when it comes to real-time video streaming, you may find yourself lost in a bunch of long-reptile shell commands! In this post, I will give you some crystal clear instructions to receive a low-latency stream from a CSI or USB camera. The key to achieving this is to do the h264 encoding on the RPi GPU (not the CPU). The stream is then received frame by frame in OpenCV code. You can later add text or layers, or do any other processing you wish.

17 replies

Deepu VP says:
July 20, 2021 at 10:37 pm

Hi Zana, this article is really helping me stream the video from my Pi 4. But I’m not able to play it with VLC on a Windows box. Are you able to help me with this?
I tried:
udp://@:5000
rtp://@:5000
with :demux=h264 and without but none of them seem to work.

Thanks
- Zana Zakaryaie says:
  July 21, 2021 at 8:45 am
  
  Hi
  Honestly, I didn’t test it with VLC. Regarding Windows, I don’t think it requires anything different from Linux because OpenCV, FFmpeg, and GStreamer are all cross-platform libs.
  Just make sure you have installed them and that you are using the correct IPs. My Linux IP was 10.42.0.1 and I used port #5000, but you can use other ports too. Btw, if you can see the frames on your laptop via cv::imshow, then you can use cv::VideoWriter to save them as a video and later play it with VLC or any other video player.
David B says:
March 4, 2022 at 12:33 am

Not to ask a dumb question (I’m not a coder), I’d like to build this because I’m looking for a low latency solution for streaming from the Pi to a laptop. When you say “Back on the laptop build and run this code”, what exactly does that mean? What app do I use to build that, how do I run it etc? Again, sorry if that’s basic. I’m using windows 10.

Thanks
- Zana Zakaryaie says:
  March 4, 2022 at 10:28 am
  
  Hi David.
  Your question (as someone with no coding experience) is not dumb at all. The code below the section “Back in the laptop build and run this code” is C++ code that uses the OpenCV library. To run this code (and generally any other C++ code), we first have to build it. This can be done with MinGW on Windows, so make sure you have it installed on your machine. The next step is to build the OpenCV library. This is not easy enough to be explained in a comment. I’ve mentioned the instructions for Linux here, but since you are using Windows, I would suggest reading this post.
David says:
March 12, 2022 at 8:07 pm

Do you know of a way to stream this to a web server on the Pi? I don’t want to use any software, just open the IP in my browser. I’ve tried a Python web server with the picamera library, but that gives me a very poor frame rate. Any ideas?

Good post tho
- Zana Zakaryaie says:
  March 12, 2022 at 10:06 pm
  
  Hi David
  I’ve no idea how to receive the stream in a browser. I hope someone here will give you a hint
- pete says:
  June 13, 2022 at 11:59 pm
  
  You could definitely run the given code from Zana, send it to a server (i.e. another raspberry pi, more powerful is better though, like 3b+) transcode it from UDP to HLS or for lower latency the new LL-HLS then you get something like playlist.m3u8 which you can make available through an apache or nginx webserver (on the raspberry pi) to the network and then request the stream in normal HTML through the tag. This is rather complicated but one of the only solutions I can think of.
  If you want to implement P2P streaming so from the camera directly to your phone or PC, you can just install an “IP cam viewer” and enter the according URL.
Farzad says:
October 20, 2022 at 2:03 pm

Hi Zana,

I was searching for a method to stream RPi cameras with low latency and came across your blog. It was very helpful for me!

I just wanted to mention, for streaming the camera, this command also works fine:

raspivid -t 0 -w 800 -h 600 -fps 30 -hf -vf -b 50000000 -o - | gst-launch-1.0 -e -vvvv fdsrc ! h264parse ! rtph264pay pt=96 config-interval=1 ! udpsink host=xx.xx.xx.xx port=5000
- Zana Zakaryaie says:
  October 22, 2022 at 8:57 am
  
  Hi Farzad. Thank you very much for sharing
- Nil Palau says:
  August 11, 2023 at 3:00 am
  
  Hi Farzad! Do you know how to connect to the stream using VLC once I run the command you suggested? Thanks!
ali says:
May 22, 2023 at 5:57 pm

Hi Zana
What version of raspicam did you use in the first section of your test (CSI)?
- Zana Zakaryaie says:
  May 30, 2023 at 6:41 pm
  
  I don’t remember, to be honest. But since that post was published on November 5, 2020, and since I used git clone to get raspicam, you can find the latest version available at that time by checking their GitHub repo.
Ulfberth says:
September 9, 2023 at 7:35 pm

Hi Zana! Thanks for your tutorial!
Rpi newbie here :D
What hardware did you use in your tutorial?
Is the Zero W (not the Zero 2 W) capable of doing 640×480 at 25 fps with <100 ms latency?
- Zana Zakaryaie says:
  September 10, 2023 at 6:56 am
  
  Hi. Glad to see you like the post!
  I used an RPi 3B+. Since the encoding is done on the GPU, you should get results similar to what I’ve reported, because the Zero W and the 3B+ have the same GPU
  - Ulfberth says:
    September 10, 2023 at 1:58 pm
    
    For some reason I get “failed” on this step: sudo ./autogen.sh --prefix=/usr --libdir=/usr/lib/arm-linux-gnueabihf/
    So I had to also install libgstreamer-plugins-base1.0-dev.
    
    This is just in case someone gets a similar error
    
    full log is here https://pastebin.com/tnrB9RbS
Ulfberth says:
September 10, 2023 at 3:57 pm

I can’t get it to work :(

When I launch the code on my laptop and then on the RPi, I get this error on the RPi: https://pastebin.com/g1VAgdMF
and this on laptop: https://pastebin.com/UHYkPUf3
Nejc says:
December 9, 2023 at 12:51 am

Hello, after running “sudo ./autogen.sh ……” I get an error saying that I have to install or upgrade the GStreamer development packages. I already have them, and the versions are 1.22. Too new?
Thank you

Looking for fast object detection on ARM CPUs?

In the previous post, I explained the idea behind cascade classifiers. In this post, I will give you some clear instructions to easily train an accurate custom object detector using my C++ toolbox. We will see how easily we can accelerate this accurate detector by 20%. Stay tuned :)

0 replies

A brief explanation of cascade classifiers

The cascade classifier is an old algorithm that was originally proposed for real-time face detection on CPUs. In this post, I will cover its nice and powerful idea, and then in the next post give you some clear instructions to easily train an accurate custom object detector using my C++ toolbox.

0 replies

Advanced tips to optimize C++ codes for computer vision (III): Parallel Processing

Multicore processing was a paradigm shift in computer science. The shift was so big that today it’s really hard to find single-core CPUs, even on low-power SBCs. Computer vision algorithms, from simple pixel manipulations to more complex tasks like classification with deep neural networks, have the potential to run in parallel on multiple cores. In this post, we will see how to easily parallelize the Gaussian-blur function that we implemented in the previous post. Our code will run almost 3x faster than the single-threaded version.
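To give a flavour of the idea before reading the full post, here is a rough sketch of splitting an image into row stripes and blurring each stripe on its own thread. It is an illustration only, not the post's implementation: it uses plain std::thread, a simplified 1-2-1 horizontal kernel instead of a full Gaussian blur, and hypothetical function names (blurRows, blurParallel).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Blur rows [rowBegin, rowEnd) of a row-major grayscale image with a 1-2-1
// horizontal kernel (a small Gaussian approximation). Borders are clamped.
void blurRows(const std::vector<float>& src, std::vector<float>& dst,
              std::size_t width, std::size_t rowBegin, std::size_t rowEnd) {
    for (std::size_t r = rowBegin; r < rowEnd; ++r) {
        const float* in = &src[r * width];
        float* out = &dst[r * width];
        for (std::size_t c = 0; c < width; ++c) {
            const float left = in[c == 0 ? 0 : c - 1];
            const float right = in[c + 1 == width ? c : c + 1];
            out[c] = 0.25f * left + 0.5f * in[c] + 0.25f * right;
        }
    }
}

// Split the rows into contiguous stripes, one per thread. Stripes write to
// disjoint parts of dst, so no locking is needed.
std::vector<float> blurParallel(const std::vector<float>& src, std::size_t width,
                                std::size_t height, unsigned numThreads) {
    assert(src.size() == width * height);
    std::vector<float> dst(src.size());
    if (width == 0 || height == 0) return dst;
    numThreads = std::max(1u, numThreads);
    const std::size_t stripe = (height + numThreads - 1) / numThreads;  // ceil
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < height; begin += stripe) {
        const std::size_t end = std::min(height, begin + stripe);
        workers.emplace_back(blurRows, std::cref(src), std::ref(dst), width, begin, end);
    }
    for (std::thread& t : workers) t.join();
    return dst;
}
```

Each output pixel depends only on the input image, so the stripes are independent and the result matches the single-threaded version. On GCC or Clang you may need to build with -pthread.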

0 replies

Advanced tips to optimize C++ codes for computer vision tasks (II): SIMD

Single Instruction Multiple Data (SIMD), also known as vectorization, is a powerful technique for accelerating computer vision algorithms. In this post, I will explain the concept and then introduce an easy way to use it in your code. We will see how we can benefit from SIMD to further reduce the runtime of the Gaussian-blur function that we implemented in the previous post.
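As a small taste of what "single instruction, multiple data" means in practice, the sketch below computes a weighted sum of two float arrays four elements at a time. It is only an illustration and not the approach taken in the post: it uses raw x86 SSE intrinsics when the compiler reports SSE2 support and falls back to a scalar loop otherwise, and the function name weightedSum is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE/SSE2 intrinsics (x86 only)
#endif

// dst[i] = alpha * a[i] + beta * b[i], the kind of per-pixel arithmetic
// that a blur repeats for every kernel tap.
void weightedSum(const float* a, const float* b, float* dst, std::size_t n,
                 float alpha, float beta) {
    assert(n == 0 || (a && b && dst));
    std::size_t i = 0;
#if defined(__SSE2__)
    const __m128 va = _mm_set1_ps(alpha);  // alpha copied into all 4 lanes
    const __m128 vb = _mm_set1_ps(beta);
    for (; i + 4 <= n; i += 4) {
        const __m128 x = _mm_loadu_ps(a + i);  // unaligned load of 4 floats
        const __m128 y = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(_mm_mul_ps(va, x), _mm_mul_ps(vb, y)));
    }
#endif
    // Scalar tail (or the whole array when SSE2 is unavailable).
    for (; i < n; ++i) dst[i] = alpha * a[i] + beta * b[i];
}
```

The vector loop handles four floats per iteration and the scalar loop finishes whatever is left, so the function works for any length and on non-x86 targets as well.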

2 replies

Wei says:
September 8, 2021 at 7:17 am

Hey, Zana,

I downloaded the tutorial code and ran the SIMD code on my x86 PC, and got the error “‘CV_CPU_HAS_SUPPORT_SSE2’ was not declared in this scope”. I switched to running the code on an ARM platform and got a similar error: “‘CV_CPU_HAS_SUPPORT_NEON’ was not declared in this scope”. Do you have any idea how to solve it?

Thanks for your reply.
- Zana Zakaryaie says:
  September 8, 2021 at 2:22 pm
  
  Hi Wei
  Sorry for my late reply
  For any runtime or compilation error, please open an issue in the GitHub repo. This way, other users can also see and follow the solution if they face the same issue
