C++ – Simple object detection using OpenCV and machine learning

Tags: c++, image-processing, object-detection, opencv

I have to code an object detector (in this case, for a ball) using OpenCV. The problem is that every single Google search returns something about FACE DETECTION. So I need help on where to start, what to use, etc.

Some info:

The ball doesn't have a fixed color. It will probably be white, but that might change.
I HAVE to use machine learning. It doesn't have to be complex or reliable; the suggestion is KNN (which is WAY simpler and easier).
After all my searching, I found that computing histograms of sample ball-only images and training the ML with them could be useful. My main concern is that the ball's size can and will change (closer to or farther from the camera), and I have no idea what to pass to the ML to classify. I mean, I can't (or can I?) just test every position in the image at every possible window size (from, let's say, 5×5 to W×H) and hope to find a positive result.
The background might be non-uniform: people, cloth behind the ball, etc.
As I said, I have to use an ML algorithm, which means no Haar or Viola algorithms.
I also thought about using contours to find circles in a Canny'ed image; I just have to find a way to transform a contour into a row of data to train the KNN.
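One possible way to turn a contour into a fixed-length row, sketched here in plain C++ so it does not depend on any particular OpenCV version. The `Pt` struct, the `contourToRow` name, and the choice of two features (circularity and bounding-box aspect ratio) are assumptions made for illustration, not something prescribed by the question. Both numbers are ratios, so they do not change when the ball moves closer to or farther from the camera.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type; with OpenCV you would copy the contour points in.
struct Pt { double x, y; };

// Turns a closed contour (vertices in order) into a two-number feature row:
//   row[0] = circularity 4*pi*area / perimeter^2  (1.0 for a perfect circle)
//   row[1] = bounding-box aspect ratio, short side / long side (1.0 if square)
std::vector<double> contourToRow(const std::vector<Pt>& contour)
{
    assert(contour.size() >= 3);
    const double kPi = 3.14159265358979323846;

    double twiceArea = 0.0, perimeter = 0.0;
    double minX = contour[0].x, maxX = contour[0].x;
    double minY = contour[0].y, maxY = contour[0].y;

    for (std::size_t i = 0; i < contour.size(); ++i) {
        const Pt& a = contour[i];
        const Pt& b = contour[(i + 1) % contour.size()];  // wrap to close the contour
        twiceArea += a.x * b.y - b.x * a.y;               // shoelace formula term
        perimeter += std::hypot(b.x - a.x, b.y - a.y);    // length of edge a -> b
        minX = std::min(minX, a.x);  maxX = std::max(maxX, a.x);
        minY = std::min(minY, a.y);  maxY = std::max(maxY, a.y);
    }

    const double area = std::fabs(twiceArea) / 2.0;
    const double circularity =
        perimeter > 0.0 ? 4.0 * kPi * area / (perimeter * perimeter) : 0.0;

    const double w = maxX - minX, h = maxY - minY;
    const double aspect = (w > 0.0 && h > 0.0) ? std::min(w, h) / std::max(w, h) : 0.0;

    return {circularity, aspect};
}
```

A roughly circular contour should give a row close to {1, 1}, while a long thin contour gives small values for both entries. More features (for example a normalized histogram of the pixels inside the contour) could be appended to the same row.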

So… suggestions?

Thanks in advance.
😉

Best Answer

Well, basically you need to detect circles. Have you seen cvHoughCircles()? Are you allowed to use it?

This page has good info on how to detect things with OpenCV. You might be more interested in section 2.5.

This is a small demo I just wrote to detect coins in this picture. Hopefully you can use some part of the code to your advantage.

Input: input img

Outputs: output opencv img

// compiled with: g++ circles.cpp -o circles `pkg-config --cflags --libs opencv`
#include <stdio.h>
#include <cv.h>
#include <highgui.h>
#include <math.h>

int main(int argc, char** argv)
{
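    if (argc < 2)
    {
        printf("usage: %s <image>\n", argv[0]);
        return -1;
    }

    IplImage* img = cvLoadImage(argv[1]);
    if (img == NULL)
    {
        // bail out: everything below needs a valid image
        printf("cvLoadImage failed\n");
        return -1;
    }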

    IplImage* gray = cvCreateImage(cvGetSize(img), IPL_DEPTH_8U, 1);
    CvMemStorage* storage = cvCreateMemStorage(0);

    cvCvtColor(img, gray, CV_BGR2GRAY);

    // This is done so as to prevent a lot of false circles from being detected
    cvSmooth(gray, gray, CV_GAUSSIAN, 7, 7);

    IplImage* canny = cvCreateImage(cvGetSize(img),IPL_DEPTH_8U,1);
    IplImage* rgbcanny = cvCreateImage(cvGetSize(img),IPL_DEPTH_8U,3);
    cvCanny(gray, canny, 50, 100, 3);

    CvSeq* circles = cvHoughCircles(gray, storage, CV_HOUGH_GRADIENT, 1, gray->height/3, 250, 100);
    cvCvtColor(canny, rgbcanny, CV_GRAY2BGR);

    for (int i = 0; i < circles->total; i++)
    {
         // each element is 3 floats (x, y, radius); round them to ints
         float* p = (float*)cvGetSeqElem(circles, i);
         CvPoint center = cvPoint(cvRound(p[0]), cvRound(p[1]));
         int radius = cvRound(p[2]);

         // draw the circle center
         cvCircle(rgbcanny, center, 3, CV_RGB(0,255,0), -1, 8, 0 );

         // draw the circle outline
         cvCircle(rgbcanny, center, radius+1, CV_RGB(0,0,255), 2, 8, 0 );

         printf("x: %d y: %d r: %d\n",center.x,center.y, radius);
    }


    cvNamedWindow("circles", 1);
    cvShowImage("circles", rgbcanny);

    cvSaveImage("out.png", rgbcanny);
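    cvWaitKey(0);

    // release everything allocated above
    cvDestroyWindow("circles");
    cvReleaseMemStorage(&storage);
    cvReleaseImage(&rgbcanny);
    cvReleaseImage(&canny);
    cvReleaseImage(&gray);
    cvReleaseImage(&img);

    return 0;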
}

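The detection of the circles depends a lot on the parameters of cvHoughCircles(). Note that this demo uses Canny as well, but only to produce the edge image that the circles are drawn on; cvHoughCircles() itself is given the smoothed grayscale image.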
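To tie this back to the machine-learning requirement, one option is to treat each circle found by cvHoughCircles() as a candidate and let a KNN accept or reject it. Since the Hough step already proposes candidates at any radius, this avoids testing every window size by hand. The sketch below is a plain C++ k-nearest-neighbours vote. The `Sample` struct, the `knnClassify` name, and the idea of describing each candidate with a short row of numbers are assumptions for illustration. OpenCV also ships its own KNN in its ml module, which is probably the better choice in practice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One labelled training example (hypothetical layout).
struct Sample {
    std::vector<double> row;  // feature row describing a candidate
    int label;                // 1 = ball, 0 = not a ball
};

// Classifies `query` by majority vote among its k nearest training samples,
// using squared Euclidean distance. A tie is resolved as 0 ("not a ball").
int knnClassify(const std::vector<Sample>& train,
                const std::vector<double>& query,
                std::size_t k)
{
    assert(!train.empty() && k > 0);

    // (squared distance, label) for every training sample
    std::vector<std::pair<double, int> > dist;
    dist.reserve(train.size());
    for (const Sample& s : train) {
        assert(s.row.size() == query.size());
        double d = 0.0;
        for (std::size_t i = 0; i < query.size(); ++i) {
            const double diff = s.row[i] - query[i];
            d += diff * diff;
        }
        dist.emplace_back(d, s.label);
    }

    // Only the k smallest distances need to be in order.
    k = std::min(k, dist.size());
    std::partial_sort(dist.begin(), dist.begin() + k, dist.end());

    std::size_t ballVotes = 0;
    for (std::size_t i = 0; i < k; ++i) {
        if (dist[i].second == 1) ++ballVotes;
    }
    return (2 * ballVotes > k) ? 1 : 0;
}
```

In use, you would build `train` once from hand-labelled candidates (balls and non-balls), then call something like `knnClassify(train, row, 3)` for the row computed from each detected circle and keep only the candidates that come back as 1. An odd k avoids ties in the two-class case.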

Related Solutions

Python – Peak detection in a 2D array

I detected the peaks using a local maximum filter. Here is the result on your first dataset of 4 paws: Peaks detection result

I also ran it on the second dataset of 9 paws and it worked as well.

Here is how you do it:

import numpy as np
# the scipy.ndimage.filters / .morphology namespaces are deprecated;
# the same functions are available directly from scipy.ndimage
from scipy.ndimage import maximum_filter, generate_binary_structure, binary_erosion
import matplotlib.pyplot as pp

#for some reason I had to reshape. Numpy ignored the shape header.
paws_data = np.loadtxt("paws.txt").reshape(4,11,14)

#getting a list of images
paws = [p.squeeze() for p in np.vsplit(paws_data,4)]


def detect_peaks(image):
    """
    Takes an image and detects the peaks using the local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """

    # define an 8-connected neighborhood
    neighborhood = generate_binary_structure(2,2)

    #apply the local maximum filter; all pixels of maximal value 
    #in their neighborhood are set to 1
    local_max = maximum_filter(image, footprint=neighborhood)==image
    #local_max is a mask that contains the peaks we are 
    #looking for, but also the background.
    #In order to isolate the peaks we must remove the background from the mask.

    #we create the mask of the background
    background = (image==0)

    #a little technicality: we must erode the background in order to 
    #successfully subtract it from local_max, otherwise a line will 
    #appear along the background border (artifact of the local maximum filter)
    eroded_background = binary_erosion(background, structure=neighborhood, border_value=1)

    #we obtain the final mask, containing only peaks, 
    #by removing the background from the local_max mask (xor operation)
    detected_peaks = local_max ^ eroded_background

    return detected_peaks


#applying the detection and plotting results
for i, paw in enumerate(paws):
    detected_peaks = detect_peaks(paw)
    pp.subplot(4,2,(2*i+1))
    pp.imshow(paw)
    pp.subplot(4,2,(2*i+2) )
    pp.imshow(detected_peaks)

pp.show()

All you need to do afterwards is use scipy.ndimage.label on the mask to label all distinct objects. Then you'll be able to work with them individually.

Note that the method works well because the background is not noisy. If it were, you would detect a bunch of unwanted peaks in the background. Another important factor is the size of the neighborhood. You will need to adjust it if the peak size changes (the two should remain roughly proportional).

C++ – Image Processing: Algorithm Improvement for ‘Coca-Cola Can’ Recognition

An alternative approach would be to extract features (keypoints) using the scale-invariant feature transform (SIFT) or Speeded Up Robust Features (SURF).

You can find a nice OpenCV code example in Java, C++, and Python on this page: Features2D + Homography to find a known object

Both algorithms are invariant to scaling and rotation. Since they work with features, you can also handle occlusion (as long as enough keypoints are visible).
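The occlusion point can be illustrated without OpenCV: an object counts as found when enough of its descriptors have a convincing match in the scene, even if the rest are hidden. The sketch below is an assumption-laden illustration, not the tutorial's code. The `Descriptor` type, the function names, the `minMatches` threshold, and the 0.75 ratio are all hypothetical, and the ratio test (keep a match only if the best distance is clearly smaller than the second best) is a commonly used filter that the answer itself does not mention.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

typedef std::vector<float> Descriptor;  // e.g. one SIFT or SURF descriptor

// Squared Euclidean distance between two descriptors of equal length.
static float squaredDistance(const Descriptor& a, const Descriptor& b)
{
    assert(a.size() == b.size());
    float d = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        d += diff * diff;
    }
    return d;
}

// Counts object descriptors whose nearest scene descriptor is clearly
// closer than the second nearest one (ratio test on brute-force matches).
std::size_t countGoodMatches(const std::vector<Descriptor>& object,
                             const std::vector<Descriptor>& scene,
                             float ratio)
{
    const float inf = std::numeric_limits<float>::infinity();
    std::size_t good = 0;
    for (const Descriptor& o : object) {
        float best = inf, second = inf;
        for (const Descriptor& s : scene) {
            const float d = squaredDistance(o, s);
            if (d < best)        { second = best; best = d; }
            else if (d < second) { second = d; }
        }
        // Distances are squared, so the ratio is squared as well.
        // A match needs two neighbours to compare, hence the `second < inf`.
        if (second < inf && best < ratio * ratio * second) ++good;
    }
    return good;
}

// The object is considered visible if at least `minMatches` descriptors
// survive the ratio test, so partial occlusion is tolerated.
bool objectVisible(const std::vector<Descriptor>& object,
                   const std::vector<Descriptor>& scene,
                   std::size_t minMatches,
                   float ratio = 0.75f)
{
    return countGoodMatches(object, scene, ratio) >= minMatches;
}
```

With real data, `object` would hold the descriptors extracted from the reference image and `scene` those from the current frame. The right `minMatches` depends on how many keypoints the object produces, so it has to be tuned.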

Image source: tutorial example

Processing takes a few hundred ms for SIFT. SURF is a bit faster, but it is still not suitable for real-time applications. ORB uses FAST, which is weaker regarding rotation invariance.
