The following tutorial describes one way to use a vision system to recognize hand gestures
seen by an overhead webcam looking down at a hand held over a solid background.
The setup allows various gestures (the digits 1 to 10) made with one hand to be recognized by the
vision system.
The basic principle behind this recognition is to compare the shape of the hand, once extracted
from the background, to a database of existing images that can be used to determine which
digit is being indicated. Note that the digits 1 to 10 can be shown on a single hand with
the hand in frontal view. We start the recognition process
with an example image of what the camera sees.
The hand can easily be extracted from the background using an
AutoThreshold module.
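
As a rough illustration of automatic thresholding, the sketch below uses Otsu's method in plain Python. This is an assumption made for illustration only: the tutorial does not say which algorithm the AutoThreshold module uses, and the names otsu_threshold and binarize are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray levels at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break             # no pixels left in the bright class
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, hand_is_brighter=True):
    """Turn a 2D list of gray levels into a 0/1 mask (1 = hand)."""
    t = otsu_threshold([p for row in gray for p in row])
    return [[1 if (p > t) == hand_is_brighter else 0 for p in row]
            for row in gray]
```

If the hand is darker than the background, pass hand_is_brighter=False so that the hand still ends up as the 1-valued blob.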
The problem now is that the visible length of the wrist/arm varies depending
on how far into the image the hand reaches. This is enough to throw a shape
detection system off from the real shape of the hand. To remove the wrist
from the shape, we replace the hand blob with its largest inscribed
circle.
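
One way to picture the largest inscribed circle is as the hand pixel that lies farthest from any background pixel, with that distance as the radius. The sketch below is a brute-force version of that idea for small masks. It is not the module's implementation, and the function name is hypothetical.

```python
import math


def largest_inscribed_circle(mask):
    """Return (row, col, radius) of the largest circle that fits in the blob.

    mask is a 2D list of 0/1 values (1 = hand). The image border is not
    treated as background, so an arm that runs off the edge of the image
    does not shrink the circle.
    """
    background = [(r, c)
                  for r, row in enumerate(mask)
                  for c, v in enumerate(row) if not v]
    best = (0, 0, 0.0)
    for r, row in enumerate(mask):
        for c, v in enumerate(row):
            if not v:
                continue
            # Distance from this hand pixel to the nearest background pixel.
            d = min((math.hypot(r - br, c - bc) for br, bc in background),
                    default=math.inf)
            if d > best[2]:
                best = (r, c, d)
    return best
```

The brute-force search costs one pass over the background for every hand pixel, so a real-time system would more likely use a distance transform. The result is the same in principle.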
By dilating this circle and subtracting it from the blob image, we can separate out the individual
parts of the hand.
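
The subtraction step can be sketched as follows. Dilating a disc with a circular structuring element simply enlarges its radius, so the sketch models the dilation with a grow parameter. The parameter name and its default are assumptions, not values from the tutorial.

```python
def subtract_circle(mask, center, radius, grow=1.0):
    """Clear every blob pixel that falls inside the dilated circle.

    mask   : 2D list of 0/1 values (1 = hand)
    center : (row, col) of the inscribed circle
    radius : radius of the inscribed circle in pixels
    grow   : extra radius that stands in for the dilation (assumed value)
    """
    cr, cc = center
    limit = (radius + grow) ** 2
    return [[1 if v and (r - cr) ** 2 + (c - cc) ** 2 > limit else 0
             for c, v in enumerate(row)]
            for r, row in enumerate(mask)]
```

What survives is whatever sticks out beyond the palm: the fingers on one side and the wrist on the other, now disconnected from each other.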
The result can then be processed to remove any blobs that are touching
the border of the image. Since the wrist should be the only part touching
the border, it can easily be removed using the
Blob Filter, leaving just
the fingers.
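
The border test can be sketched as a connected-component pass that drops any blob containing a pixel on the image edge. This is a plain-Python stand-in for that one filtering rule, not the Blob Filter module itself, and the function name is hypothetical.

```python
from collections import deque


def remove_border_blobs(mask):
    """Return a copy of mask without blobs that touch the image border."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill one blob using 8-connectivity.
            blob, touches_border = [], False
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                if y in (0, h - 1) or x in (0, w - 1):
                    touches_border = True
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if not touches_border:
                for y, x in blob:
                    out[y][x] = 1
    return out
```

Note that a fingertip reaching the edge of the image would be removed by the same rule, so the hand needs to stay well inside the frame.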
By merging the remaining fingers with the part of the blob that was
removed by the inscribed circle, we are left with just the palm and the
fingers of the hand. Because the inscribed circle is relative to the
size and location of the hand, this process is quite repeatable as
long as the hand is frontal.
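
The merge amounts to a per-pixel OR of two masks: the finger blobs, and the portion of the original hand blob that lies inside the dilated circle. A minimal sketch, with hypothetical names and an assumed grow value that must match the one used for the subtraction:

```python
def merge_palm_and_fingers(blob, fingers, center, radius, grow=1.0):
    """Combine the finger mask with the palm region of the original blob.

    blob    : original hand mask (2D list of 0/1 values), wrist included
    fingers : mask holding only the finger blobs
    center, radius, grow : the same circle used for the subtraction step
    """
    cr, cc = center
    limit = (radius + grow) ** 2
    return [[1 if fingers[r][c]
             or (v and (r - cr) ** 2 + (c - cc) ** 2 <= limit) else 0
             for c, v in enumerate(row)]
            for r, row in enumerate(blob)]
```

The wrist never reappears because it is neither in the finger mask nor inside the circle.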
After smoothing this final shape to remove any irregularities, it is passed to the
Shape Match module, which is trained
on the following images. Note that this database is trained on images
previously saved to the file system from this point in the pipeline.
This module will generate a SHAPE_LABEL variable which can be used
to indicate which trained shape image the module matched.
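
To illustrate the general idea of matching a shape against a small labeled database, the sketch below scales each mask to a fixed grid and picks the template with the highest overlap (intersection over union). The tutorial does not describe how the Shape Match module compares shapes, so this is only a simple stand-in. All names here are hypothetical.

```python
def normalize(mask, size=16):
    """Crop mask to its bounding box and resample it to size x size.

    Nearest-neighbor sampling is used, and the aspect ratio is not
    preserved. That is a simplification of this sketch.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return [[0] * size for _ in range(size)]
    top, left = rows[0], cols[0]
    h = rows[-1] - top + 1
    w = cols[-1] - left + 1
    return [[1 if mask[top + i * h // size][left + j * w // size] else 0
             for j in range(size)]
            for i in range(size)]


def iou(a, b):
    """Intersection over union of two equally sized 0/1 masks."""
    inter = sum(x and y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x or y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0


def match_shape(mask, templates, size=16):
    """Return (label, score) of the best-matching template.

    templates maps a label, such as a digit name, to a 0/1 mask.
    """
    query = normalize(mask, size)
    scored = ((iou(query, normalize(t, size)), label)
              for label, t in templates.items())
    score, label = max(scored)
    return label, score
```

The returned label plays the same role as SHAPE_LABEL in the pipeline: it names the stored shape that the current hand most resembles.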
The following robofile can
be used to replicate these results, which are also shown in the following video. The database used to train
the Shape Match module can be downloaded as a Zip File.
Note that the gesture for 10 is recognized very inconsistently due to the failure of the
inscribed circle method. Once we move the hand to a profile view, the inscribed
circle will move on each frame, since the hand profile does not have a flat inner
palm area that dictates the placement of the inscribed circle. Once the circle moves erratically, the
rest of the algorithm will not function well.