Hand detection through Haartraining: A hands-on approach

Detecting objects in images or video is no trivial task. We need to use some type of machine learning algorithm and train it to detect features while keeping misses and false positives low. The haartraining algorithm does just this. It creates a cascade of haarclassifiers arranged so that regions which do not contain the object are rejected quickly.

This post highlights the steps required to build a haarclassifier for detecting a hand or any other object of interest. It is a sequel to my earlier post (OpenCV: Haartraining and all that jazz!) and goes into a lot more detail. To train the haarclassifier, it is suggested that at least 1000 positive samples (images containing the object of interest, a hand in this case) and 2000 negative samples (any other images) are required.

As before, haartraining involves the following 3 steps:
1) Create samples (createsamples.cpp)
2) Haar training (haartraining.cpp)
3) Performance testing of the classifier (performance.cpp)

In order to build the above 3 utilities please refer to my earlier post OpenCV: Haartraining and all that jazz!

The steps required for training a haarcascade to recognize a normal open palm are given below.

Create Samples: This step can be further broken down into the following 3 steps
A) Creation of positive samples
B) Superimposing the positive sample on the negative samples
C) Merging the vector files of samples

A) Creation of positive samples:

a) Get a series of images with objects of interest (positive samples): For this step, take photos of the objects you are interested in using a webcam (or camera). I took several snapshots of my hand (I later simply downloaded images from Google Images because of the excessive clutter in the snaps I had taken).

b) Crop images: In this step you need to crop each image so that it contains only the object of interest. You can use any photo editing tool of your choice. Save all the positive images in a directory, e.g. ./hands
c) Mark the object: Now the object in each positive image has to be marked. This will be used for creating the samples.
The tool to use for marking is objectmarker.cpp. You can download the source code for it from Achu Wilson’s blog. Now build objectmarker.cpp with the include directories and libraries of OpenCV. Once you have successfully built objectmarker you are ready to mark the positive samples. Samples have to be marked because the description file for the positive samples must be in the following format

[filename] [# of objects] [[x y width height] [… 2nd object] …]

This file will be used with the createsamples utility to create positive training samples.
The command to run objectmarker is as follows
$ objectmarker <output file> <dir>
e.g.
$ objectmarker pos_desc.txt ./images
where dir is the directory containing the positive images
and output file will contain the positions of the marked objects, as follows
pos_desc.txt
/images/hand1.bmp 1 0 0 246 50
/images/hand2.bmp 1 187 26 333 400
….
Using the objectmarker utility is an art 😉 and it takes some practice to get proper data. Make sure you get sensible widths and heights. There are times when you get negative widths and heights, which are clearly wrong.

An alternative easy way is to open the jpg/bmp in a photo editor and check the width and height in pixels (188 x 200 say) and create the description file as
/images/hand1.bmp 1 0 0 188 200
Ideally, objectmarker should give you values close to this if the sample image contains only the object of interest (the hand).
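Since a bad description line silently produces bad samples, it can help to check the file before running createsamples. The following is a minimal Python sketch, not part of OpenCV or objectmarker: the function names are my own, and it assumes file names without spaces and integer pixel values.

```python
def format_description_line(filename, boxes):
    """Build one line: file name, object count, then x y width height per object."""
    fields = [filename, str(len(boxes))]
    for x, y, w, h in boxes:
        fields.extend(str(v) for v in (x, y, w, h))
    return " ".join(fields)


def check_description_line(line):
    """Return a list of problems found in one description line (empty if none)."""
    parts = line.split()
    if len(parts) < 2:
        return ["expected a file name and an object count"]
    try:
        count = int(parts[1])
        numbers = [int(p) for p in parts[2:]]
    except ValueError:
        return ["object count and coordinates must be integers"]
    problems = []
    if len(numbers) != 4 * count:
        problems.append("expected %d numbers, found %d" % (4 * count, len(numbers)))
    # Walk over every complete group of four values (x, y, width, height).
    for i in range(0, len(numbers) - 3, 4):
        x, y, w, h = numbers[i:i + 4]
        if x < 0 or y < 0:
            problems.append("negative x or y in object %d" % (i // 4 + 1))
        if w <= 0 or h <= 0:
            problems.append("non-positive width or height in object %d" % (i // 4 + 1))
    return problems


# Example: the hand-written line from the photo editor, and a line with a bad width.
print(format_description_line("/images/hand1.bmp", [(0, 0, 188, 200)]))
print(check_description_line("/images/hand2.bmp 1 187 26 -333 400"))
```

Running the check over every line of pos_desc.txt before creating the samples catches the negative widths and heights mentioned above.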
Create positive training samples
The createsamples utility can now be used for creating positive training sample files. The command is
createsamples -info pos_desc.txt -vec pos.vec -w 20 -h 20

This will create positive training samples with the object of interest, in our case the “hand”. Now verify that the tool has done something sensible by checking the generated training samples. To display the positive training samples captured in pos.vec, run
createsamples -vec pos.vec -w 20 -h 20
If the objects have been marked accurately you should see a series of hands (positive samples). If the samples have not been marked correctly the positive training file will give incorrect results.
Create negative background training samples
This step is used to create training samples with one positive image superimposed against a set of negative background samples. The command to use is
createsamples -img ./image/hand_1.BMP -num 9 -bg bg.txt -vec neg1.vec -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0 -w 20 -h 20
where bg.txt will contain the list of negative samples in the following format
./negative/airplane.jpg
./negative/baboon.jpg
./negative/cat.jpg
…
The positive hand images will be superimposed against the negative background in various angles of distortion.
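Typing bg.txt by hand gets tedious once there are hundreds of negative images. Here is a minimal Python sketch that builds the list from a directory. The function names and the set of file extensions are assumptions of mine, not something createsamples prescribes.

```python
import os

# Extensions treated as images; adjust to whatever your negatives use.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def list_background_images(directory):
    """Return sorted paths of image files found directly inside `directory`."""
    paths = []
    for name in sorted(os.listdir(directory)):
        if name.lower().endswith(IMAGE_EXTENSIONS):
            paths.append(directory.rstrip("/") + "/" + name)
    return paths


def write_background_file(directory, output_path="bg.txt"):
    """Write one image path per line, the layout shown for bg.txt above."""
    paths = list_background_images(directory)
    with open(output_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return paths
```

Calling write_background_file("./negative") would then produce a bg.txt like the one listed above, with one path per line.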

All the negative training samples will be collected in the training file neg1.vec as above. As before you can verify that the createsamples utility has done something reasonable by executing

createsamples -vec neg1.vec -w 20 -h 20
This should show a series of images of hands in various angles of distortion against the negative image background.
Creating several negative training samples with all positive samples
The createsamples utility takes one positive sample and superimposes it against the negative samples listed in bg.txt. However, we need to repeat this process for each positive sample (hand) that we have. Since this can be a laborious process, you can use the following shell script that I wrote, create_neg_training.sh, to repeat the step for every positive sample.

create_neg_training.sh
#!/bin/bash
# Run createsamples once per positive image named in hands.txt,
# writing neg_training1.vec, neg_training2.vec, ...
j=1
for i in $(cat hands.txt)
do
    createsamples -img "./image/$i" -num 9 -bg bg.txt -vec "neg_training$j.vec" -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0 -w 20 -h 20
    j=$((j+1))
done
where hands.txt lists the file names of the positive samples, which are under the ./image directory. For each positive hand sample a negative training sample file is created, namely neg_training1.vec, neg_training2.vec, etc.
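The same loop can also be sketched in Python, which sidesteps shell quoting problems when file names contain spaces. This is only an illustration: the function name and its defaults are hypothetical, the flags are simply the ones used in the createsamples command above, and nothing is executed unless you pass each list to subprocess.run yourself.

```python
def build_createsamples_commands(image_names, image_dir="./image",
                                 bg_file="bg.txt", num=9, width=20, height=20):
    """Return one createsamples argument list per positive image."""
    commands = []
    for j, name in enumerate(image_names, start=1):
        commands.append([
            "createsamples",
            "-img", f"{image_dir}/{name}",
            "-num", str(num),
            "-bg", bg_file,
            "-vec", f"neg_training{j}.vec",  # same naming as the shell script
            "-maxxangle", "0.6", "-maxyangle", "0", "-maxzangle", "0.3",
            "-maxidev", "100", "-bgcolor", "0", "-bgthresh", "0",
            "-w", str(width), "-h", str(height),
        ])
    return commands


# Print the commands for two example images instead of running them.
for command in build_createsamples_commands(["hand1.bmp", "hand2.bmp"]):
    print(" ".join(command))
```

Printing the commands first is a cheap way to confirm the paths and output names before committing to a long run.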

Merging samples:
As seen above, the createsamples utility superimposes only one positive sample (hand) against a series of negative samples. Since we run the utility once for each positive sample, there needs to be a way of merging all the resulting training samples. A useful utility (mergevec.cpp) has been created for this and is available for download from Naotoshi Seo’s blog.
Download mergevec.cpp and build it along with (cvboost.cpp, cvhaarclassifier.cpp, cvhaartraining.cpp, cvsamples.cpp, cvcommon.cpp), using the usual include files and link libraries.
Once built, it can be run as
mergevec pos_neg.txt pos_neg.vec -w 20 -h 20
where pos_neg.txt lists both the positive and negative training sample files, as follows
pos_neg.txt
./vec/pos.vec
./vec/neg_training1.vec
./vec/neg_training2.vec
….
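With many positive samples the list file is better generated than typed. A minimal Python sketch, assuming the .vec files are named as in the shell script above and kept under ./vec (the function names are mine):

```python
def merge_list_lines(num_neg_files, vec_dir="./vec", positive_vec="pos.vec"):
    """Return the lines of the list file: pos.vec first, then neg_training1..N."""
    lines = [f"{vec_dir}/{positive_vec}"]
    lines += [f"{vec_dir}/neg_training{j}.vec" for j in range(1, num_neg_files + 1)]
    return lines


def write_merge_list(path, num_neg_files):
    """Write the list file that is passed to mergevec as its first argument."""
    with open(path, "w") as handle:
        handle.write("\n".join(merge_list_lines(num_neg_files)) + "\n")


# Example: the list for two negative training files.
print("\n".join(merge_list_lines(2)))
```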

As before you can verify that the entire training file pos_neg.vec is sensible by executing
createsamples -vec pos_neg.vec -w 20 -h 20
Now the pos_neg.vec will contain all the training samples that are required for the haartraining process.

HaarTraining
The haartraining can be run with the training samples generated from the mergevec utility described above.
The command is
./haartrainer -data haarcascade -vec pos_neg.vec -bg bg.txt -nstages 20 -nsplits 2 -minhitrate 0.999 -maxfalsealarm 0.5 -npos 7 -nneg 9 -w 20 -h 20 -nonsym -mem 512 -mode ALL
Several posts have suggested that -nstages should ideally be 20 and -nsplits should be 2.
-npos indicates the number of positive samples and -nneg the number of negative samples. In my case I used just 7 positive and 9 negative samples. This step is extremely CPU intensive and can take several hours or even days to complete. I reduced the number of stages to 14 for my run. The haartraining utility will create a haarcascade directory and a haarcascade.xml file.
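As a rough guide to what -minhitrate and -maxfalsealarm mean: they are per-stage targets, and a window has to pass every stage to count as a detection, so the overall rates of the cascade are approximately the per-stage values raised to the number of stages. A small Python sketch of that arithmetic follows. The function name is mine, and treating the stages as independent is only an approximation.

```python
def overall_rates(min_hit_rate, max_false_alarm, n_stages):
    """Estimate whole-cascade hit rate and false alarm rate from per-stage targets."""
    return min_hit_rate ** n_stages, max_false_alarm ** n_stages


# Compare the 14 stages I used with the suggested 20.
for stages in (14, 20):
    hit, false_alarm = overall_rates(0.999, 0.5, stages)
    print(f"{stages} stages: hit rate ~{hit:.3f}, false alarm ~{false_alarm:.2e}")
```

With the values from the command above, 20 stages work out to a hit rate of about 0.98 and a false alarm rate of about one in a million, while 14 stages give a false alarm rate of roughly 6 in 100,000. Those are targets the trainer aims for, not results it can reach with a handful of samples.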

Performance testing: The first way to test the integrity of the haarcascade is to run the performance utility described in my earlier post OpenCV: Haartraining and all that jazz! If you want to use the performance utility you should also create test samples, which can be done with the command

createsamples -img ./image/hand-1.bmp -num 10 -bg bg.txt -info test.dat -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0
The test.dat can be used with the performance utility as follows
./performance -data haarcascade.xml -info test.dat -w 20 -h 20

I wanted something more visually satisfying than the output of the performance utility, which just spews out textual information about hits and misses.
So I decided to use facedetect.c with the haarcascade trained on my positive and negative samples to check whether it was working. The test below uses facedetect.c for detecting the hand.
The code for facedetect.c can be downloaded from the Willow Garage Wiki.

Compile and link facedetect.c as handdetect with the usual suspects. Once this builds successfully you can use
handdetect --cascade=haarcascade.xml hands.jpg

where hands.jpg was my test image. If handdetect does indeed detect the hand in the image, it will enclose the detected object in a rectangle. See the output below

While my haarcascade does detect 5 hands, the detections appear shifted. This is probably partly because I used only 7 positive and 9 negative training samples, which is very low, and only 14 stages in the haartraining. As mentioned at the beginning, at least 1000 positive and 2000 negative samples should be used, and haartraining should be run with 20 stages and 2 nsplits. Hopefully, if you follow this, you will end up with a fairly accurate haarcascade.
If you are adventurous enough you could run the above with the webcam as
handdetect --cascade=haarcascade.xml 0
where 0 indicates the webcam. Assuming that your haarcascade is perfect you should be able to track your hand in real time.
Haarpy training!
See also
– OpenCV: Haartraining and all that jazz!
– De-blurring revisited with Wiener filter using OpenCV
– Dabbling with Wiener filter using OpenCV
– Deblurring with OpenCV: Wiener filter reloaded
– Re-working the Lucy-Richardson Algorithm in OpenCV

Find me on Google+

OpenCV: Haartraining and all that jazz!

Object detection in OpenCV can be done through Haartraining. OpenCV haartraining applications provide the ability to detect objects of interest to us like faces, eyes, moving cars etc. What is required is that we need to identify positive and negative samples and use the utilities that OpenCV provides us to train the application to be able to recognize the objects we want. The positive and negative samples are used to create classifiers which are then utilized to detect the objects that we intend to.

Fortunately the tar file (OpenCV-2.3.1a.tar.bz2), which can be downloaded from the OpenCV site, already comes bundled with all the utilities necessary to create a fully trained haar classifier. We just have to build the utilities and run the commands to create our own classifiers.

This post will look at the necessary steps for creating a haarclassifier. To get started with OpenCV please look at my earlier post Computer Vision: Getting started with OpenCV.

Once you have installed OpenCV look under modules/haartraining. All the necessary files are included.

There are 3 steps in this process

1) Create samples

2) Train and create a haar classifier

3) Performance testing

Create Samples (createsamples.cpp) : This is the first utility that has to be executed. This utility takes as input positive samples (images of the object that we are interested in) and negative samples (images that do not include the object that we are interested in). The createsamples utility superimposes the objects that we want to recognize in various degrees of rotation against the background of the negative samples. The composite images file is then used to train the haar application.

The first step is to build createsamples.cpp. Make sure you include ../OpenCV-2.3./modules/haartraining. Build the following files

(createsamples.cpp, cvboost.cpp, cvhaarclassifier.cpp, cvhaartraining.cpp, cvsamples.cpp, cvcommon.cpp)

Once createsamples.cpp successfully builds we can create samples required for the training.

The command is

$./myhaartraining -img logo.png -vec samples1.jpg -bg bg.txt -w 20 -h 20

Where I have chosen the OpenCV logo as my object of interest.

The samples are created in samples1.jpg

The bg.txt contains a list of the negative samples included as below

./img/airplane.jpg

./img/baboon.jpg

./img/kid.jpg

./img/lena.jpg

-w stands for the width and -h stands for the height of the samples.

This will create positive samples superimposed on the negative sample backgrounds. A few of these are shown below

Make sure that the width and height are small, i.e. < 50, otherwise the haartraining application core dumps because of lack of memory.

To check that the samples are created properly, do a test run with the following options

$./myhaartraining -img logo.png -vec samples1.jpg -bg bg.txt -num 10 -show -w 20 -h 20

where -num is the number of samples to be generated (default is 1000) and -show displays a series of images with the positive samples superimposed on the negative samples. This can be used to check that the createsamples utility is working properly. For a more thorough and detailed explanation see my post Hand detection through Haartraining: A hands-on approach

Haartrainer (haartraining.cpp): This utility takes as input the samples from the createsamples utility and creates a trained haar classifier. To build the haartrainer, build haartraining.cpp. As before, make sure you include the appropriate files along with all the OpenCV libraries. Build the following files

(haartraining.cpp, cvboost.cpp, cvhaarclassifier.cpp, cvhaartraining.cpp, cvsamples.cpp, cvcommon.cpp)

The command to use is

./haartrainer -data test2 -vec samples1.jpg -bg bg.txt -npos 1 -nneg 4 -nstages 20 -mem 500 -w 20 -h 20

In the above command test2 is the name of the directory in which the trained classifier is stored. The -vec option names the samples file created by the createsamples utility above. bg.txt lists the negative sample files. The width and height have to be the same as those used with the createsamples utility. As mentioned before, if the width and height are too large the haartrainer will bail out with “Insufficient memory”.

If the haartrainer executes successfully the test2 directory will have all the trained files and the directory will also have the haarclassifier as a test2.xml.

Once these two steps go through successfully we have to run the performance step

Performance (performance.cpp): This step is run to ensure that we have trained our application to properly recognize the object we intended to. As before, the main file to build is performance.cpp. The files to build are

(performance.cpp, cvboost.cpp, cvhaarclassifier.cpp, cvhaartraining.cpp, cvsamples.cpp, cvcommon.cpp)

Once this has built successfully you can run it using the command

./performance -data test2.xml -info bg.txt -w 20 -h 20

Make sure that bg.txt lists the images in which the object has to be detected, in the following format

[positive filename] [# of objects] [x y width height]

bg.txt

./img/logo5.jpg 1 145 100 20 20

./img/logo6.jpg 1 145 100 20 20

./img/logo3.jpg 1 145 100 45 45

./img/logo4.jpg 1 145 100 45 45

./img/airplane.jpg 1 145 100 45 45

./img/baboon.jpg 1 145 100 45 45

./img/kid.jpg 1 145 100 45 45

./img/lena.jpg 1 145 100 45 45

./img/opencv-logo2.png 1 145 100 35 35

./img/logo.png 1 145 100 45 45

When this is run we get

[ganesh@localhost Debug]$ ./performance -data test3.xml -info bg.txt -w 20 -h 20

+================================+======+======+======+
|            File Name           | Hits |Missed| False|
+================================+======+======+======+
|                 ./img/logo5.jpg|     0|     1|    43|
+--------------------------------+------+------+------+
|                 ./img/logo6.jpg|     0|     1|    51|
+--------------------------------+------+------+------+
|                 ./img/logo3.jpg|     1|     0|    37|
+--------------------------------+------+------+------+
|                 ./img/logo4.jpg|     0|     1|     7|
+--------------------------------+------+------+------+
|              ./img/airplane.jpg|     1|     0|   226|
+--------------------------------+------+------+------+
|                ./img/baboon.jpg|     0|     1|   236|
+--------------------------------+------+------+------+
|                   ./img/kid.jpg|     0|     1|  1291|
+--------------------------------+------+------+------+
|                  ./img/lena.jpg|     0|     1|   188|
+--------------------------------+------+------+------+
|          ./img/opencv-logo2.png|     0|     1|     3|
+--------------------------------+------+------+------+
|                  ./img/logo.png|     0|     1|    33|
+--------------------------------+------+------+------+
|                           Total|     2|     8|  2115|
+================================+======+======+======+

As can be seen, the training has not been too accurate. There are only 2 hits against 8 misses, along with 2115 false positives. The number of positive and negative samples, the co-ordinates and the number of stages all have to be fine-tuned to get the correct result.
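To compare runs while tuning, it helps to pull the numbers out of the table rather than read them by eye. The Python sketch below is one way to do that. The function names are mine, and it assumes the three numeric columns are hits, misses and false positives in that order, as in the totals discussed above.

```python
def parse_performance_rows(text):
    """Extract (name, hits, missed, false) tuples from '|'-delimited table rows."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue  # separator lines and anything that is not a table row
        name, counts = cells[0], cells[1:]
        if name == "Total" or not all(c.isdigit() for c in counts):
            continue  # skip the totals row and any header row
        rows.append((name, *[int(c) for c in counts]))
    return rows


def summarize(rows):
    """Return total hits, misses, false positives and the hit rate."""
    hits = sum(r[1] for r in rows)
    missed = sum(r[2] for r in rows)
    false = sum(r[3] for r in rows)
    hit_rate = hits / (hits + missed) if hits + missed else 0.0
    return hits, missed, false, hit_rate
```

For the run above, 2 hits out of 10 marked objects is a hit rate of 20%, which gives a single number to track as the samples and stages are tuned.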

Watch this space!

I will be back! Hasta la vista!

Please do take a look at my sequel to this post Hand detection through Haartraining: A Hands-on approach for a more robust haartraining method
