Stereo Depth Mapping
Anonymous from United States  [28 posts]
12 year
 I've been doing more and more research trying to find the best alternative to stereo depth mapping. I've come across a couple of things and have just built a camera rig that houses two cameras 4" (101.6mm) apart. I'm able to use the Mosaic, Stereo, and other modules to grab both images, but when it comes to doing any kind of matching between the two I'm at a loss.

I'd like to do a feature match between them, SURF or SIFT style, just to get the major points; then I could use my algorithm to calculate the offset of the other pixels and apply an 8-bit grayscale to get the depth map. I'm not trying to track one object, as that has been pretty simple: determine the distance to it by flipping between camera one and camera two and checking the COG_X value for that item. However, I now need depth for the entire scene, so I believe finding the feature points that match between the images and then triangulating between those matches will be the way to handle it. I could write an offsite plugin for the Pipes or API using OpenCV, but I was hoping that someone had some insight for doing this straight in RoboRealm.

STeven, I noticed after reading a bit that the Stereo module is not supported. I couldn't get it to work, as I believe you are set up to run two cameras positioned differently from mine (4" apart), and I believe I could get it to work better if I knew the offset I needed.

Currently I can use this to figure depth:

distance = tan(pi/2 - tan^-1(offset / factor)) * camera_separation
factor = offset / tan(pi/2 - tan^-1(distance / camera_separation))

I had to build a table of factors for multiple distances, and am working on an algorithm to process it from there. This really depends on one object though, and I would like to do the entire scene.
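In code, those two formulas look like this; since tan(pi/2 - atan(x)) is just 1/x, the distance formula collapses to camera_separation * factor / offset. This is only a quick sketch, not something I've run on the rig:

```cpp
#include <cmath>

// distance = tan(pi/2 - atan(offset/factor)) * camera_separation
// Since tan(pi/2 - atan(x)) == 1/x, this is just
// camera_separation * factor / offset.
double distanceFromOffset(double offsetPx, double factor,
                          double cameraSeparationMm) {
    const double halfPi = std::acos(0.0); // pi/2
    return std::tan(halfPi - std::atan(offsetPx / factor)) * cameraSeparationMm;
}

// factor = offset / tan(pi/2 - atan(distance/camera_separation))
double factorFromDistance(double offsetPx, double distanceMm,
                          double cameraSeparationMm) {
    const double halfPi = std::acos(0.0);
    return offsetPx / std::tan(halfPi - std::atan(distanceMm / cameraSeparationMm));
}
```

With my numbers (offset 40 px at 1000mm, separation 101.6mm) the factor comes out around 393.7.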
My camera_separation = 101.6mm, and here is my list of factors by distance; the value is the pixel offset between the object in the two frames:

 200 mm - offset = 190
 300 mm - offset = 135
 400 mm - offset = 100
 500 mm - offset = 80
 600 mm - offset = 67
 700 mm - offset = 57
 800 mm - offset = 47
 900 mm - offset = 42
1000 mm - offset = 40
1200 mm - offset = 32
1500 mm - offset = 25
2000 mm - offset = 15

These were very rough estimates, but I believe they'll be close enough to do what I need. I think that by feature matching between the two images, I can run those pixels through my algorithm by checking the offset between the features. Doing this across the image with a given window size and search range should yield pretty good results; I just need a way to run it through a feature scanner to match camera 1 to camera 2. Any ideas are greatly appreciated!

I'm including a side-by-side of the images that I last captured. Again, the two cameras are parallel and 4" (101.6mm) apart. Thanks!

-Chris
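For the window-size-and-search idea, a brute-force sum-of-absolute-differences match along each row would look something like this. Just a sketch of the idea: it assumes the images are rectified so a match stays on the same row, and the caller keeps the window inside the image bounds.

```cpp
#include <climits>
#include <cstdlib>
#include <vector>

// Brute-force SAD block matching: for one pixel in the left image, slide a
// (2*halfWin+1)-square window along the same row of the right image and
// return the disparity (x_left - x_right) with the lowest sum of absolute
// differences. Images are row-major 8-bit grayscale. Caller must keep
// y +/- halfWin and x + halfWin inside the image.
int sadDisparity(const std::vector<unsigned char>& left,
                 const std::vector<unsigned char>& right,
                 int width, int height,
                 int x, int y, int halfWin, int maxDisparity) {
    int best = 0;
    long bestCost = LONG_MAX;
    for (int d = 0; d <= maxDisparity; ++d) {
        if (x - d - halfWin < 0) break; // window would leave the right image
        long cost = 0;
        for (int wy = -halfWin; wy <= halfWin; ++wy)
            for (int wx = -halfWin; wx <= halfWin; ++wx) {
                int l = left[(y + wy) * width + (x + wx)];
                int r = right[(y + wy) * width + (x - d + wx)];
                cost += std::abs(l - r);
            }
        if (cost < bestCost) { bestCost = cost; best = d; }
    }
    return best;
}
```

Running that at every pixel (or every feature point) and pushing each disparity through the depth formula would give the whole-scene map; it will be slow compared to a real stereo matcher, but it shows the mechanics.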
Anonymous from United States  [28 posts] 12 year
 Did more calculating, and it seems that my table is a little off, as I had thought. Going back to the Society of Robots site: http://www.societyofrobots.com/programming_computer_vision_tutorial_pt3.shtml#stereo_vision

I've worked backwards to figure my focal length as 393.7. With my cameras separated by b = 101.6mm, I can derive Z_actual with the following formula:

Z_actual = (b * focal_length) / (x_camL - x_camR)

Working backwards from the table above, I was uncertain what my focal_length was, but I was able to find it with a tape measure against the offsets I recorded at each distance. 393.7 was the number that kept popping up. My final formula is:

Z_actual = 40000 / (x_camL - x_camR)

So as I'm doing my feature matching across the scene from the two images, I can calculate the offset in the X values and then figure out how many millimeters I am from that object. Working my way across the entire scene will give me depth for the whole scene. However, I'm still lacking a way to check features from one camera against the other. Hopefully what I've done so far will help, but maybe someone could give me insight into matching features between the two images. Thanks!

Chris
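As a sanity check, here is that formula in code, with my b = 101.6mm and focal_length = 393.7 baked in as defaults (a sketch; anyone else's numbers will differ):

```cpp
// Z_actual = (b * focal_length) / (x_camL - x_camR)
// With b = 101.6mm and focal_length = 393.7px, b * focal_length ~= 40000.
double depthFromDisparity(double disparityPx, double baselineMm = 101.6,
                          double focalPx = 393.7) {
    return (baselineMm * focalPx) / disparityPx;
}

// Inverse, for rebuilding the offset table from known distances.
double disparityFromDepth(double depthMm, double baselineMm = 101.6,
                          double focalPx = 393.7) {
    return (baselineMm * focalPx) / depthMm;
}
```

Plugging the table back in checks out: a 40 px offset gives roughly 1000mm, and 500mm gives back roughly the 80 px offset I measured.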
Anonymous 12 year
 Chris,

While valuable for later use, the above does not help much with the actual determination of depth from an image. I've tried a couple of things with mixed success and need more information about the intended purpose in order to make the best recommendation.

You mention that you need a full depth map. Is this to be used for obstacle avoidance? Indoors or outdoors? The reason for my question is that your 4" separation is most likely too large for close-range purposes (note that the tape is completely off in the second image, which is causing matching issues). Are you able to change the rig so that the cameras are as close as physically possible and take an image then? For close objects even 1cm is sufficient, and left/right matching is much improved by that. If not, and your intent is distant objects, perhaps you can try taking images in that situation ... the close proximity of the floor is causing a lot of errors.

Note that you probably did get the stereo to work (just keep increasing the offset and check the left/right switch to see which one is better), but the best result I can get looks like crap anyhow! If you don't know what to expect, it's hard to fine-tune the module to the best result. If you need to use the rig as is, I can forward the best image I could come up with ... but again, in my opinion, it looks really bad and is not very usable. Closer images would improve it ...

Thanks,
STeven.
Anonymous 12 year
 You can see what I mean by the attached. STeven.
Anonymous from United States  [28 posts] 12 year
 STeven,

This is for an outside rig, and yeah, I will need to do the entire image, mostly for obstacle avoidance. I've been playing around with some of the other features, trying to second-guess my attempts. I've redone your object detection/avoidance tutorial, but I can't get it to work well with an edge detector outdoors; with grass and sidewalks it thinks there is an object in the way.

I *can* move the cameras closer together. I was hoping for about 6" to around 84-96" of depth out of this rig; that's why I went with the 4" spacing. The stereo image that you have is a VAST improvement over what I was looking at, LOL.
Anonymous from United States  [28 posts] 12 year
 I will take the rig outside, get a snapshot from each camera, and get them up here as soon as I can.
Anonymous from United States  [28 posts] 12 year
 STeven,

Here are 4 pictures taken outside at two different scenes with a fair amount of 'objects' in them. I had to snap them individually while holding the rig, so the Y may be slightly off from me moving a little, but the basic idea is there. Two are from the left camera, two are from the right camera. I must say, I LOVE this webcam for outdoor use.
Anonymous from United States  [28 posts] 12 year
 I've got the pipe function running over into C++ for now; at least the data is there and I can grab/modify the pixels. I'm still playing with the routine to push that data into an OpenCV buffer to see if I can do any processing from there, and then get it back into RR format.

I'm using the 3D viewer to correct the alignment of my right image and setting it up as a side-by-side; then I'm piping the whole image (1280x360) over to C++ and piping it back. It's really fast and I'm impressed! Right now I'm just overwriting the pixels with a solid color to make sure I can at least grab the left and right frames correctly out of the solid buffer that RR ships over. Seems easy enough; now flipping it into OpenCV will take me a few days, gotta read up on all of that... :)

STeven: Is there a reason you always start with the bottom-left pixel instead of the top-left pixel? Just curious... I'd love to see a quick and easy way to merge the data between RR and OpenCV and back, so I plan to publish whatever I can get going to you guys.
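In case it helps anyone else doing the same hand-off: my guess is the bottom-left start means the buffer is laid out like a Windows DIB, which stores rows bottom-up, while OpenCV wants rows top-down. If that assumption holds, the conversion is just a row reversal (a sketch; the 3-bytes-per-pixel, no-row-padding layout is my assumption, not confirmed):

```cpp
#include <algorithm>
#include <vector>

// Flip a bottom-up interleaved image buffer (3 bytes per pixel, like a
// Windows DIB, which I'm guessing is why the buffer starts at the
// bottom-left) into the top-down row order OpenCV expects. Swaps rows
// in place.
void flipRowsInPlace(std::vector<unsigned char>& img, int width, int height) {
    const int stride = width * 3; // bytes per row, assuming no padding
    for (int top = 0, bot = height - 1; top < bot; ++top, --bot)
        std::swap_ranges(img.begin() + top * stride,
                         img.begin() + (top + 1) * stride,
                         img.begin() + bot * stride);
}
```

The same call converts in both directions, so the round trip back to RR is just another flip.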
Anonymous 12 year
 Chris,

Sounds great ... I would skip a couple of steps right now and test your images directly in OpenCV. We already did that and didn't like the results due to the large disparity (sorry, back to the 4" spacing again). So be sure you get good results in OpenCV before doing this integration.

If you can't make the spacing smaller, can you point the cameras slightly towards each other, as in pivoting your eyes to focus on a closer object? This will help align the images better. What you will probably notice in OpenCV is that a lot of noise is generated by the near ground areas, so unless you plan to just ignore that entire area you will need images with closer alignment. It does not make much sense to detect an object 10ft away when you are about to hit a 1ft-away object that you cannot see!

Anyhow, we are working to improve our current Stereo module so that the results will be better defined with less blurring and fewer errors ... but that will take additional research/testing.

STeven.
Anonymous from United States  [28 posts] 12 year
 STeven,

Sounds great. I can easily redesign the bracket for less spacing between the cameras. The speed we're hoping to travel at is the reason I wanted to shoot for objects farther away, but I don't mind testing and testing and testing; it's what this is all about. For the next bracket I will put them as close together as I can and do more testing. I'll post a couple of images with that setup to give you something to play with as well.

I did manage to play with your module a bit more, but I had to decrease min disparity to -20 or so, and then I couldn't tell what was directly in front of me; it only picked up distant objects. I really wish the Kinect would work for this application, as it'd be a ton easier to implement :) Stupid sunlight....

If I can do anything to help you guys out, please let me know; I'm completely willing to open up my rovers to your testing.
Anonymous from United States  [28 posts] 12 year
 Just changed my rig, will try to get images from the cameras tonight.
Anonymous from United States  [28 posts] 12 year
 Here's the latest indoor shot. The new rig I posted a pic of above has 1.5" spacing between the cameras.
Anonymous from United States  [28 posts] 12 year