Ingolf's Robotics Project:
Hand - Eye Coordination
Robots should have eyes, see the environment, and perform their actions guided by vision. Of course, this has long been a topic of artificial intelligence: as early as the 1960s, AI researchers like Marvin Minsky and Seymour Papert (the father of Lego Mindstorms) hooked together computers, TV cameras, and robotic actuators to build towers from individual blocks and to navigate through buildings. Nowadays some industrial robots have vision.
Here is a homebrew version:
My experimental setup consists of a webcam (QuickCam VC), a set of mirrors to generate a stereoscopic (3D) view, and a robotic arm with a hand. The robotic arm can move left/right (x-direction), up/down (y-direction), and back/forth (z-direction). The hand can open and close. All these movements are generated by 4 servos (Futaba FP-S28).
The hand is from a Lady-Robot and it has two red-colored fingernails.


The software is written in Visual Basic 5 (VB5).
The camera comes with a Visual Portal Module that interfaces the camera to Visual Basic programs. I wrote a program that calculates the x, y, z coordinates of the objects in the playground from the picture. The program then calculates the necessary movements of the arm and sends the data via a serial port to the servos. An 8031 microcontroller converts the signals to pulse-width modulation (PWM). An 8031 assembler (written by my son Hanno) allows the servo routines to be modified.
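To make the path from a computed position to a servo pulse more concrete, here is a minimal sketch in Python (not the VB5 of the original program). It rests on assumptions that are not from this page: a two-byte command format (servo number, then a position byte) and a pulse range of 1000 to 2000 microseconds, which is typical for hobby servos but was not measured on the FP-S28. The names position_to_pulse_us and build_command are hypothetical.

```python
def position_to_pulse_us(position, min_us=1000, max_us=2000):
    """Map a position byte (0..255) linearly to a pulse width in microseconds.

    The 1000..2000 us range is an assumed, typical hobby-servo range.
    """
    if not 0 <= position <= 255:
        raise ValueError("position must be in 0..255")
    return min_us + (max_us - min_us) * position / 255


def build_command(servo_index, position):
    """Pack one command as two bytes: servo number, then position.

    This wire format is hypothetical; the real protocol is not documented here.
    """
    if not 0 <= servo_index <= 3:
        raise ValueError("the setup has 4 servos, numbered 0..3 in this sketch")
    if not 0 <= position <= 255:
        raise ValueError("position must be in 0..255")
    return bytes([servo_index, position])
```

In such a scheme the PC would write the two bytes to the serial port, and the microcontroller would do the equivalent of position_to_pulse_us when generating the PWM signal.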

First, the robot should know where objects are - if possible in 3 dimensions:

3D Vision
A simple example of 3D vision is shown in the following picture:

lightpath.gif (19531 bytes)    lego3d.JPG (27944 bytes)

For stereoscopic vision I use one camera with 4 mirrors. The light path is shown above. The mirrors generate two virtual eyes, El and Er, which look at the scene from a left and a right viewpoint.

lego3dcam.JPG (8543 bytes)    lego3d-LR.JPG (4153 bytes)    lego3dbird1.jpg (2107 bytes)   

The left and the right views are slightly different (disparity). The second picture shows the light intensity along a scan line through the left and the right part of the image. The program finds the corresponding edges of the objects and calculates the distance of the objects from the disparity. The result is shown in the third picture, a bird's-eye view: there is a big object in the back and a small object in the front.
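The edge matching and distance calculation can be sketched as follows. This is a simplified Python illustration, not the original VB5 code. It assumes ideal parallel virtual eyes, edges that appear in the same order in both views, and the standard relation Z = f * B / d, where f is the focal length in pixels, B is the distance between the virtual eyes, and d is the disparity in pixels. The function names and the threshold parameter are hypothetical.

```python
def find_edges(scan, threshold):
    """Return the indices where intensity jumps by more than threshold."""
    return [i for i in range(1, len(scan))
            if abs(scan[i] - scan[i - 1]) > threshold]


def depth_from_disparity(x_left, x_right, focal_px, baseline):
    """Distance Z = f * B / d, with disparity d = x_left - x_right in pixels."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("expected a positive disparity")
    return focal_px * baseline / disparity


def depths_along_scan(left_scan, right_scan, threshold, focal_px, baseline):
    """Pair the edges of both scan lines in order and compute their distances."""
    left_edges = find_edges(left_scan, threshold)
    right_edges = find_edges(right_scan, threshold)
    # Assumption: the n-th edge on the left corresponds to the n-th on the right.
    return [depth_from_disparity(xl, xr, focal_px, baseline)
            for xl, xr in zip(left_edges, right_edges)]
```

Nearby objects produce a large disparity and distant objects a small one, which is why the small object in the front and the big object in the back separate in the bird's view.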


colorlegos.JPG (7222 bytes)
Test Colors


RGBcolors.JPG (12101 bytes)
Young's 3 Color Theory


The first picture shows several test colors. According to Young's theory, colors are the sum of RED, GREEN, and BLUE components; e.g., yellow is the sum of red and green. A scan across the test colors shows the RGB signals of the camera. Yellow has strong red and green signals and only a weak blue component. For further signal processing it is helpful to calculate color opponents; the ratios red/green and blue/yellow are shown in the color map. The different regions for blue, green, yellow, white, and red are clearly separated and can be evaluated by a computer program.
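A rough Python sketch of the color-opponent idea: compute the two ratios for one pixel and assign it to a region of the color map. The yellow signal is taken here as the mean of red and green, and the margin of 1.5 that separates the regions is an arbitrary illustrative value; neither is taken from the original program, and the function names are hypothetical.

```python
def color_opponents(r, g, b):
    """Return the ratios (red/green, blue/yellow) for one pixel.

    Yellow is assumed to be the mean of red and green.
    """
    eps = 1e-6  # avoids division by zero for black pixels
    yellow = (r + g) / 2
    return r / (g + eps), b / (yellow + eps)


def classify(r, g, b, margin=1.5):
    """Assign a pixel to blue, red, green, yellow, or white."""
    rg, by = color_opponents(r, g, b)
    if by > margin:
        return "blue"
    if rg > margin:
        return "red"
    if rg < 1 / margin:
        return "green"
    # Red and green are balanced: weak blue means yellow, otherwise white.
    return "yellow" if by < 1 / margin else "white"
```

For example, a pixel with strong red and green and weak blue, such as (200, 200, 40), has a red/green ratio near 1 and a small blue/yellow ratio, so it lands in the yellow region.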
The apparent colors depend on the illumination: fluorescent light contains more blue and incandescent light more red. The automatic white balance algorithm in the camera adjusts the gain of the R, G, and B channels so that "white" will be in the middle of the color map.
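The camera's own white balance algorithm is not documented here. As an illustration of what adjusting the channel gains means, the following Python sketch uses the common "gray world" approach, which is swapped in as an assumption: each channel is scaled so that its average over the picture equals the overall average. The function names are hypothetical.

```python
def gray_world_gains(pixels):
    """Return (gain_r, gain_g, gain_b) that equalize the channel averages.

    pixels is a list of (r, g, b) tuples.
    """
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    if min(means) <= 0:
        raise ValueError("every channel needs a positive average")
    gray = sum(means) / 3
    return tuple(gray / m for m in means)


def apply_gains(pixel, gains):
    """Scale one (r, g, b) pixel by the gains, clipping at 255."""
    return tuple(min(255.0, value * gain) for value, gain in zip(pixel, gains))
```

After such a correction a neutral surface has a red/green and a blue/yellow ratio close to 1, which places it in the middle of the color map.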

colorscan.JPG (8396 bytes)
Scan across colors

colormap.JPG (3152 bytes)
Color Map

Object Recognition
Object recognition for simple objects is shown in the following pictures. 

objectrecognition.JPG (15116 bytes)     

correlation.JPG (4766 bytes)
Correlation of a pair of scissors and a wrench

First, several objects are shown to the camera; the program frames each object, resizes it to a standard size, and adds the picture to the gallery. In the recognition phase the test picture is compared to the pictures in the gallery. Comparison is done by correlation, where corresponding picture values are multiplied and summed.
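The comparison step can be sketched in a few lines of Python. Pictures are represented as flat lists of pixel values of equal length, as they would be after resizing to a standard size. The multiply-and-sum correlation follows the description above; the normalization by the pictures' own correlations is an addition of this sketch, so that a bright picture does not win merely because its values are large. The function names are hypothetical.

```python
import math


def correlate(a, b):
    """Multiply corresponding picture values and sum the products."""
    if len(a) != len(b):
        raise ValueError("pictures must have the same size")
    return sum(x * y for x, y in zip(a, b))


def normalized_correlation(a, b):
    """Correlation scaled by both pictures' self-correlations (an added step)."""
    denom = math.sqrt(correlate(a, a) * correlate(b, b))
    return correlate(a, b) / denom if denom else 0.0


def recognize(test_picture, gallery):
    """Return the name of the gallery picture that correlates best."""
    return max(gallery,
               key=lambda name: normalized_correlation(test_picture, gallery[name]))
```

Here gallery would be a dictionary that maps a name such as "scissors" or "wrench" to the stored picture.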
Vision Guided Grasp
Finally, some pictures show how the robot grasps an object guided by vision. The camera looks at the scene and sees the hand with the two red fingernails and a blue box. The program finds the red and blue objects, calculates their coordinates x, y, z, and sends commands to move the hand.
What I see    What the camera sees    Calculated bird's view and side view

vgg1-eye.jpg (5453 bytes)    vgg1c.JPG (4612 bytes)    vgg1bv.JPG (2826 bytes)
The hand is in the back in a high position. The program sends the hand more to the front.

vgg2-eye.jpg (5259 bytes)    vgg2c.JPG (4473 bytes)    vgg2bv.JPG (2823 bytes)
Now the hand is over the blue box. The program moves the hand down.

vgg3-eye.jpg (5264 bytes)    vgg3c.JPG (4671 bytes)    vgg3bv.JPG (2854 bytes)
The hand is in the right position. The program closes the hand.

vgg4-eye.jpg (5019 bytes)    vgg4c.JPG (4877 bytes)    vgg4bv.JPG (2951 bytes)
The hand grabs the blue box.
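The sequence in the pictures (move to the front, move down, close the hand) can be written as a small decision function that is called after every new picture. This Python sketch is an illustration, not the original VB5 logic. It assumes that x grows to the right, y grows upward, and z grows toward the back; the tolerance value and the command names are hypothetical.

```python
def next_action(hand, box, tolerance=1.0):
    """Decide the next move from the (x, y, z) positions of hand and box."""
    hx, hy, hz = hand
    bx, by, bz = box
    # First line the hand up sideways and in depth, so that it is over the box.
    if abs(hx - bx) > tolerance:
        return "right" if bx > hx else "left"
    if abs(hz - bz) > tolerance:
        return "forward" if bz < hz else "back"
    # Then correct the height.
    if abs(hy - by) > tolerance:
        return "down" if hy > by else "up"
    # The hand is in the right position.
    return "close"
```

Because the positions are measured again from the camera picture after each move, errors in the arm movement are corrected on the next step instead of accumulating.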


Next, I want to teach my robot to build towers from blocks and pour wine from a bottle into a glass. But as Minsky writes in "The Society of Mind", "easy things are hard". What we think is intellectually very difficult, like playing chess reasonably well, can easily be done by small computer programs, but things that are easy even for a 2-year-old kid (understanding the environment, picking up blocks, finding the center of gravity, and balancing pieces) are still very difficult for robots to do.



Webdesign by Ingrit