Vision for navigation
John Christian from Norway  [25 posts]
15 year
I don't know if this is the right forum for these questions, but I have been thinking long and hard about how to make my robot navigate autonomously between rooms or POIs (points of interest). At the moment my robot simply uses an ultrasonic sensor in front to figure out where it can move, and uses that to roam about freely. My robot has 2 wheels with encoders and a flaky magnetic compass. Both are very unreliable, and my robot's position in the "mental map" would be out of sync after a couple of meters of moving about.

The compass is a Devantech/Robot-Electronics one, and I only get somewhat reliable readings over about 180 degrees of the full rotation. I suspect the surrounding electronics interfere with the readings (I have calibrated it many times without getting better results). The encoders are somewhat precise, but naturally only good enough for an approximate estimate. I have experimented with a behaviour where I ask the robot to rotate 360 degrees based on the encoder values, and it ends up off by +/- 5-10 degrees when it stops after a full rotation. So as you can see, encoder data is only good for a rough location estimate.

The other option is to buy very expensive localisation hardware that uses some sort of echo (ultrasonic or IR) for triangulation. But really I wanted to explore using the camera feed to figure out where the robot is. This is naturally a very challenging task and I have thought hard about how to do this. Perhaps some blob detection, where the robot calculates its position relative to the blobs from their size and distribution when they match a known pattern? I am not sure how to go about this.

The other vision alternative is to place markers in the room, such as bright two-colored patterns that are easy to match. Based on how big a marker appears in the image the robot knows approximately how far away it is, and the marker's offset in the camera view gives its direction. I could then use this for some sort of triangulation. I could perhaps even use pictures on the walls as markers if I can find a way to detect which picture is which. Correlated with data from the Vanishing Point module and the floor/wall finder, this might give a better guesstimate.
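
The marker idea can be sketched with a simple pinhole camera model. This is only an illustration: the function name, the marker's real width and the camera's horizontal field of view are assumptions, not values from the robot described here.

```python
import math

def marker_range_and_bearing(marker_px_width, marker_center_x, image_width,
                             marker_real_width_m, horizontal_fov_deg):
    """Estimate distance (m) and bearing (deg, positive = right of centre)."""
    # Focal length in pixels, derived from the assumed horizontal field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    # Pinhole model: apparent width shrinks in proportion to distance.
    distance_m = marker_real_width_m * focal_px / marker_px_width
    # The horizontal offset from the image centre gives the bearing.
    offset_px = marker_center_x - image_width / 2.0
    bearing_deg = math.degrees(math.atan(offset_px / focal_px))
    return distance_m, bearing_deg
```

With two or more markers at known positions, the resulting ranges and bearings are the inputs a triangulation step would need.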

Today I mostly use RoboRealm for some fun vision features: the robot can identify simple shapes, follow a tennis ball and read text (using Tesseract) on a piece of paper you hold in front of it. More about my robot at: http://robot.lonningdal.net

I hope to use RoboRealm for many more tasks. One that I will add soon is motion detection, so that the robot can find the highest point of a motion and turn its head towards it (chances are that my face detector will find a face there). The information can also be used to distinguish between people, since the motion data would probably cover a full person and I can use simple blob detection to find the colors of their clothing. This of course requires the robot to be introduced to a person first to learn the color of their clothes. The COGs of the same color blobs could serve as navigation targets too, if the robot wants to drive to the person it is looking for after finding them.
John Christian from Norway  [25 posts] 15 year
I've played around a bit with flood fill, blob separate and blob filter, and I think these could be used for general scene information. Using some pictures of our living room, I was able to figure out the following just by visual analysis:

- The biggest blob at the bottom is always the floor, especially if its color is brownish (wooden flooring).

- Windows are always the brightest white. If there are two bright blobs, the topmost is the window and the bottom one is the reflection of the window in the floor (so it can be merged with the brownish floor blob).

- Our white walls are never completely white but a light gray, and they are usually fairly large blobs in the picture. So generally big light gray blobs are walls. Naturally other walls would have other colors, but generally they are among the 3 biggest blobs (the other two being floor and shadows).

- Big black/dark colored blobs are usually shadows, often merged with other shadows and black items.

- In the current picture the large gray blobs are our gray sofa and recliner chairs, which usually sit low in the image.
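
The rules in the list above could be expressed roughly as follows. This is a sketch, not RoboRealm code: the blob dictionaries (`area`, `cy`, `rgb`) and all thresholds are hypothetical and would need tuning for a real room.

```python
def brightness(rgb):
    """Mean of the three colour channels, 0-255."""
    return sum(rgb) / 3.0

def label_blobs(blobs, image_height, min_area=500):
    """Label blobs given as dicts with 'area', 'cy' (centre row) and 'rgb'."""
    labels = ["unknown"] * len(blobs)
    # Floor: the biggest blob whose centre lies in the lower half of the image.
    lower = [i for i, b in enumerate(blobs) if b["cy"] > image_height / 2.0]
    if lower:
        labels[max(lower, key=lambda i: blobs[i]["area"])] = "floor"
    # Near-white blobs: the topmost is the window, any below it are reflections.
    bright = [i for i, b in enumerate(blobs)
              if labels[i] == "unknown" and brightness(b["rgb"]) >= 250]
    bright.sort(key=lambda i: blobs[i]["cy"])
    for rank, i in enumerate(bright):
        labels[i] = "window" if rank == 0 else "reflection"
    # Remaining big blobs: light ones are walls, dark ones are shadows.
    for i, b in enumerate(blobs):
        if labels[i] != "unknown" or b["area"] < min_area:
            continue
        level = brightness(b["rgb"])
        if level >= 170:
            labels[i] = "wall"
        elif level <= 60:
            labels[i] = "shadow"
    return labels
```

Anything that matches no rule stays "unknown", which leaves room for the sofa and other room-specific landmarks to be labeled separately.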

I have several images which I can post analysis of, and perhaps I can get some clues about how to turn this into navigational data. The general idea is that the relative positions of blobs say something about the robot's likely position in a scene. E.g. a gray blob low in the picture is the sofa. If it's to the left in the view, the robot is facing northish; if it's to the right, the robot is facing southish. Based on where identified blobs appear in the view you can also say something rough about the robot's position in the room. Rough information is more than enough for a robot to find its way to another room, so this might work.

I also tried the Vanishing Point module, but I very rarely got a vanishing point from a rather cluttered living room scene.

Finally, a question: how can I read the colors of the blobs?
John Christian from Norway  [25 posts] 15 year
I have found a couple of good reference points for localisation, and those are windows, doorways and pictures on the wall.

During daylight, windows are usually the brightest area in the view, so as I previously posted you can pick the big 100% white blobs as windows, each with a similar reflection blob in the floor immediately below it, which is usually also 100% white.

Doorways are also interesting since they mark a point of interest that the robot has to pass through to move between two rooms in its mental map. The idea is that a room is a known entity and there is at least one door connecting it to other rooms. So how do I detect doors? I did some simple tests in RoboRealm, and doorways usually have long parallel vertical lines; if you can see the top or bottom, you also have a horizontal line connecting these. The full doorway is usually not in the camera's view though, since the robot might be close to the door and only see the vertical lines of the doorway. But how does the robot actually find the door opening? We need more information about these vertical lines to identify them as a doorway. The distance between them could be used, but we might also look at the uniformity of the colors in the areas to the left and right of each line. The trick then is to figure out how wide an area to sample for this analysis. That could be hard since the area could contain the door itself (in a white room with white doors you would get two white areas on either side of the vertical line even when the door is open, since the perspective would show parts of the door). I still haven't figured out a good discriminator to locate the door, but perhaps if the robot looks up to find the vertical line of the doorway it can make a better judgement.

Finally, pictures or any kind of rectangular frame on a wall would be excellent for localisation, since they are usually easy to edge detect because of the uniform wall colors around them. The shape of the perspective lines a frame consists of says a lot about the angle you are seeing the picture from, as well as the distance. I am sure that with some clever math you could transform the shape of the 4 lines (by looking at line angles and the width between the parallel vertical lines) into an estimate of the robot's position relative to the picture (or TV for that matter). Cleverly positioned pictures on the walls could then be very good reference points for the robot to find its way about. The algorithm would of course need the real width of the picture to find the distance. The only problem might be whether the robot can see the picture at all, since this only works if the robot is looking straight forward to get parallel vertical lines. So the robot can only use this information if the picture is at some distance.
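
One possible version of that "clever math", as a sketch under a pinhole camera model. Note the swap: it uses the frame's real height rather than its width, because the two vertical edges are what the camera measures directly; the recovered width can then be checked against the known width. The function name and all parameters are assumptions for illustration.

```python
import math

def frame_pose(left_x_px, left_h_px, right_x_px, right_h_px,
               frame_height_m, focal_px, image_center_x):
    """Return (distance_m, yaw_deg, width_m) for a rectangular frame.

    yaw_deg is 0 when the frame is seen head-on and positive when its
    right edge is farther away than its left edge.
    """
    def edge_position(x_px, h_px):
        # An edge of real height H at depth z spans f * H / z pixels.
        z = focal_px * frame_height_m / h_px
        # Back-project the pixel column to a sideways offset at that depth.
        x = z * (x_px - image_center_x) / focal_px
        return x, z

    lx, lz = edge_position(left_x_px, left_h_px)
    rx, rz = edge_position(right_x_px, right_h_px)
    # Distance to the middle of the frame, in the floor plane.
    distance_m = math.hypot((lx + rx) / 2.0, (lz + rz) / 2.0)
    yaw_deg = math.degrees(math.atan2(rz - lz, rx - lx))
    width_m = math.hypot(rx - lx, rz - lz)
    return distance_m, yaw_deg, width_m
```

The sketch assumes the camera looks straight ahead so that the frame's vertical edges stay vertical in the image, which is the same restriction noted above.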

I am not quite sure how to store this information, but I guess my code needs to form some kind of graph with feature relations in a 2D space, perhaps structured by compass directions. The robot doesn't really need to know which way is actually north, just that any position has a set of blobs/features of interest at estimated angles (the wheel encoders are good enough for one rotation). The robot could then stand in a spot, rotate around, and whenever it sees a window/door/picture/big colored blob it can ask whether that's a window, where a door leads, whether a box is a picture/TV, or what a big colored blob is. As it rotates it stores some distance information (either calculated or read from the ultrasonic sensor) along with the feature detected. The sensor can also indicate where it should go next to perform a new scan. It could then move 1 meter in that direction.

Perhaps it should start by asking what room it is in and what color the walls and floor are, to better filter interesting blobs from the common ones. The room name would of course be the entity under which all the information from a scan around the room is stored.

If anyone has done any work on this or has good references I would be very happy to get some pointers.
John Christian from Norway  [25 posts] 15 year
While testing further ways to find doorways, I found an interesting feature to add to my robot: one that checks whether the stairs gate is closed if the kids are upstairs. I have found a reasonable way of detecting if there are kids in the room, simply because their motion is usually at the height of the robot's vision, while an adult's would be much higher in the view (when the motion is also near the robot). This gives me a somewhat reliable detection of kids in the robot's vicinity.

The idea then is that the robot would turn or move to look towards the area where the safety gate to the stairs is. Using the Lines sample I can identify the gate very quickly, simply because it will have lots of vertical lines next to each other if it's closed. Since the stairway is almost like a door, the real task then becomes figuring out where the "door" is, again detected by its tall vertical lines.

I've had some problems with the Straight Lines module, as it sometimes has these vertical lines in it even if I define a Min angle of 80 and Max angle of 100 (assuming 90 degrees is vertical). I have attached the picture of the stairs gate and two doorways. I'd love to find a good, solid way of detecting where the doors are. As I have mentioned before, it seems I have to look at the intensity of the areas on either side of the taller lines, where a bright area would be the wall and a darker area would be the open door. (To detect a closed door I am lucky that our doors have square patterns in them, so I could look for some rectangular shaped lines.)
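
A sketch of the gate check done in the robot's own code rather than in RoboRealm. It assumes line segments arrive as hypothetical `(x1, y1, x2, y2)` tuples, keeps those between 80 and 100 degrees (90 being vertical), and calls the gate closed when enough of them sit close together. The gap and bar-count thresholds are made up.

```python
import math

def line_angle_deg(x1, y1, x2, y2):
    """Angle of a segment in [0, 180), where 90 is vertical."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def vertical_lines(lines, min_angle=80.0, max_angle=100.0):
    """Keep only segments whose angle falls inside the given window."""
    return [l for l in lines if min_angle <= line_angle_deg(*l) <= max_angle]

def looks_like_closed_gate(lines, max_gap_px=25, min_bars=5):
    """True if at least min_bars near-vertical lines are tightly spaced."""
    xs = sorted((l[0] + l[2]) / 2.0 for l in vertical_lines(lines))
    run = best = 1 if xs else 0
    for a, b in zip(xs, xs[1:]):
        # Extend the current run of bars, or start a new one after a wide gap.
        run = run + 1 if b - a <= max_gap_px else 1
        best = max(best, run)
    return best >= min_bars
```

Two widely spaced tall verticals, as from a door frame, do not trigger it, so the same filtered lines can feed a separate doorway test.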

from United States  [214 posts] 15 year
Hey John,

Thanks for posting your results on navigation.  I am working on a similar robot project and I am also using RoboRealm to process vision.  I've read some papers on animal navigation and I think you are on a good track; i.e. determine a bunch of landmarks characteristic of each room and/or doorway and store a kind of topological map linking the landmarks together.

Perhaps you are already doing this, but I wonder if you would get better recognition of landmarks if you combined visual features with sonar readings.  For example, a pan of a sonar beam across a doorway will reveal a "rectangular hole".  Also, a complete 360 degree sonar scan of a room will reveal a certain topographical profile that could be used to characterize the particular room and possibly even determine robot orientation using autocorrelation on the profile.  Or one could simply use the average distance reading from the scan to figure out if you're in a "big" room or a "small" room.  If the rooms in your house can be more-or-less rank ordered by size, this average value might give a good clue as to the room the robot is currently in.  Of course, there will be a lot of noise due to furniture, etc., so combining this with some visual data might do the trick.
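
A minimal sketch of the scan-profile idea. It uses a circular cross-correlation between a stored reference scan and the current scan (a close relative of the autocorrelation mentioned above), and assumes both scans have the same number of evenly spaced range readings. Function names are hypothetical.

```python
def heading_offset_deg(reference, scan):
    """Degrees by which the current scan is rotated relative to the reference."""
    n = len(reference)

    def score(shift):
        # Correlation of the reference with the scan shifted by `shift` samples.
        return sum(reference[i] * scan[(i + shift) % n] for i in range(n))

    best_shift = max(range(n), key=score)
    return best_shift * 360.0 / n

def mean_range(scan):
    """Average range reading, a crude 'big room or small room' cue."""
    return sum(scan) / len(scan)
```

With noisy real scans the correlation peak will be broader, so the result is better treated as a rough hint to combine with visual landmarks than as a precise heading.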

Another cool approach I saw was to use a 360-degree video image.  For example, take a snapshot looking down at a polished Christmas ornament, then use RoboRealm to get the color histogram of the image.  Then use this histogram as a kind of color signature for that particular room.
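
The colour-signature idea might look like this outside RoboRealm. It is a sketch assuming the image is available as an iterable of `(r, g, b)` tuples; the bin count and the use of histogram intersection as the similarity measure are illustrative choices.

```python
def color_signature(pixels, bins_per_channel=4):
    """Normalised RGB histogram of (r, g, b) tuples with 0-255 channels."""
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        # Map each channel to a bin index in [0, bins_per_channel).
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    return [h / count for h in hist] if count else hist

def similarity(sig_a, sig_b):
    """Histogram intersection: 1.0 for identical signatures, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))

def best_matching_room(signature, known_rooms):
    """known_rooms maps room name -> stored signature."""
    return max(known_rooms, key=lambda name: similarity(signature, known_rooms[name]))
```

Coarse bins make the signature more tolerant of lighting changes, at the cost of telling similar rooms apart less well.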

When I get time to get back to my own project, I'll let you know if I come up with anything that works.  In the meantime, keep us posted!


John Christian from Norway  [25 posts] 15 year
Hi Patrick and thank you for your reply.

I currently have a sonar mapping feature using one ultrasonic sensor in the front. My robot rotates 360 degrees using the encoders on the wheels (but misses by +/- 5 degrees, depending on the position of the caster wheel I think; I want to swap this for a ball caster some day). The sonar then creates a radar map (I can view this on the robot GUI through a VNC client). I then have a behaviour that lets the robot pick the angle where there was the most "opening" and move in that direction. This is how I plan to automap the area too, with a threshold on how far it should be able to go in one direction before that direction is added to the "explore queue". The only problem is that it is out of sync after some meters of exploration, simply because it has no recalibration. As you say, markers in the room are probably the best way: colored blobs of significance, vertical lines that could be doors, lines on walls that form a square (obviously a different shape from other angles, but still identifiable). These would be very nice re-alignment points for the robot and could help it on every meter it explores of the room, by centering certain markers during a rotation so that its "virtual compass" is corrected on every "step".

About using the sonar for detecting doors: that would only work when the robot is really close to the open door. I'd like the robot to mark its "mental map" with possible doorways as early as possible, in which case the sides of the doors would be detected as obstructions. The sonar has a fairly wide detection area, so it frequently picks up objects that are to the left or right of the robot as well. Still, if two tall vertical lines are detected I could assume there is a doorway, position the robot so that the sides would not be regarded as obstructions, and see if the sonar "sees" the opening.
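
The "pick the most open angle" behaviour and the explore queue threshold could be sketched like this. It assumes the scan is a list of range readings at evenly spaced angles over 360 degrees. Taking the worst reading within a small window is an added assumption, meant to keep a single lucky echo from winning.

```python
def most_open_heading(scan, window=1):
    """Heading (deg) whose neighbourhood has the largest worst-case range."""
    n = len(scan)

    def openness(i):
        # Smallest range among this reading and its neighbours on each side.
        return min(scan[(i + k) % n] for k in range(-window, window + 1))

    best = max(range(n), key=openness)
    return best * 360.0 / n

def explore_candidates(scan, threshold_m):
    """Headings (deg) with enough free range to join the explore queue."""
    n = len(scan)
    return [i * 360.0 / n for i, r in enumerate(scan) if r >= threshold_m]
```

For example, with readings every 45 degrees, a lone 3.0 m reading flanked by 0.6 m and 0.4 m loses to a run of readings around 2 m.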

As soon as I have the missing cables to get my new Intel mini-ITX board going, I will start prototyping and post results here. For now I am just analysing pictures I take around the rooms the robot is supposed to explore, trying to find a good way of mapping them and of representing the "mental map" so that it is useful for navigation. In this process I have tons of interesting things I want to try as well, like calculating the distance to an object (a well defined colored blob of some size) that seems to be rooted on the floor, simply based on its lowest y coordinate. I will also experiment with aligning the robot to detected straight lines, so I can have it face the wall and generally align itself with a room's angles. A good way might be for the robot to look up so that it sees where the ceiling meets the walls, and rotate to "straighten" the lines. Many rooms also have furniture aligned non-diagonally, so when an opening is detected it is likely to continue for some length, parallel to the walls.

Btw, Patrick (or anyone else), how can I get the array of blobs from the blob filter? It only seems to provide XY COG info and not the color/pixel count of the blobs. I really want to use RoboRealm just for quick data gathering and have the code on the robot do the calculations, instead of reducing the data to the minimum set in RoboRealm all the time (that would require many CPU-expensive passes in RoboRealm). The general idea is to perform many common analyses on the picture in one sweep, get all the data back, and only "look" at the data needed for the current task. The sum of these data can make the robot behave more intelligently based on several stimuli, instead of simple insect-like monkey-see-monkey-do behaviour.
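
The floor-distance idea can be sketched with basic trigonometry, assuming a flat floor, a known camera height and a known vertical field of view. All parameter names and values here are hypothetical.

```python
import math

def floor_distance(bottom_y_px, image_height_px, vertical_fov_deg,
                   camera_height_m, tilt_down_deg=0.0):
    """Distance (m) along the floor to a blob's lowest row, or None.

    Row 0 is the top of the image. Returns None when the row lies at or
    above the horizon, where the floor assumption cannot hold.
    """
    focal_px = (image_height_px / 2.0) / math.tan(math.radians(vertical_fov_deg) / 2.0)
    # Angle of the pixel row below the camera's optical axis.
    below_axis = math.atan((bottom_y_px - image_height_px / 2.0) / focal_px)
    below_horizon = below_axis + math.radians(tilt_down_deg)
    if below_horizon <= 0:
        return None
    return camera_height_m / math.tan(below_horizon)
```

Rows close to the horizon map to very large distances, so the estimate is most useful for nearby objects.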
Anonymous 15 year
Hi John. You're obviously way ahead of me in what I want to accomplish with NINA's visual navigation! Bravo!

A little suggestion for your robot's "mental compass": if you're going to use RoboRealm, why not just use the Visual Anchor module? That way, you wouldn't need a compass. Your robot could just track the shift along the x-axis to see how far it has turned. (I think they said 90 degrees was 600 pixels.)
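
Taking the recalled figure at face value (600 pixels per 90 degrees, which is unverified), turning a horizontal pixel shift into a heading update is a one-liner. How the shift is read out of RoboRealm is not shown, and the sign convention is an assumption.

```python
def rotation_from_shift(shift_px, px_per_90_deg=600.0):
    """Degrees turned for a given horizontal shift of the anchored view."""
    return shift_px * 90.0 / px_per_90_deg

def corrected_heading(previous_heading_deg, shift_px):
    """New heading in [0, 360) after applying the visually measured turn."""
    return (previous_heading_deg + rotation_from_shift(shift_px)) % 360.0
```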

I haven't tried it yet (I haven't gotten that far with the NINA project). But I'm willing to bet it's very useful!

Good luck with your robot!

Loren John Presley
John Christian from Norway  [25 posts] 15 year
Hi, and thank you Loren for that tip.

I haven't looked into that module, but it seems to be a good candidate for aligning the vision input so that the robot can return to the same direction after some turning and movement. This was just what I had in mind, but using "important" blobs and other key visual elements stored for each location. This module could probably do the same at a much higher level of detail. I just have to figure out how picky it is about a scene and whether it allows much variation (people and non-stationary objects can clutter the scene).

Visual navigation is a fun challenge. There is something special about using monoscopic vision as the only navigational input, as it tries to mimic how humans navigate (although we also have mental maps that work blindfolded for at least a few meters; our feet have built-in encoders in that sense). In time I also plan to detect obstructions this way: by assuming that objects are rooted to the floor, the robot can get some idea of how close it is to an object. For now I use the ultrasonic sonar for this. I'd prefer the robot to be able to position itself in such a way that it doesn't have to rely too much on sonar data to get itself out of a tight spot. Now it frequently sweeps so close to objects that it bangs its backside into them when it turns. :)
from United States  [214 posts] 15 year
Hi John and Loren,

This is a most excellent thread.  Before I forget to ask: John, do you have any video clips of your robot in action?  I checked your website but did not see any there.  I am still working on a simple website on which I will post some pictures and videos of my own project.

Regarding the imprecision of sonar scans, yes I remember running into that issue too when scanning for doorways from too far away.  A laser scanner would be ideal but that would pretty much triple the cost of my current robot with just one component!

I too have been thinking that a good way to store a "mental map" of a set of rooms would be some form of directed graph data structure where the nodes are landmarks and the links are relations between landmarks.  These could be quite rough, like giving driving directions: "Go straight until you see the drug store, then turn left until you see the big yellow arches."  Like John said, one of the properties of each node would be the name of the room it is in.  So imagine your robot is facing a big black blob which is a chair, then it rotates left and sees a big rectangular blob which is your TV.  We could then store "black blob"(living room) and "rectangular blob"(living room) with the link "left 30 degrees" between "black blob" and "rectangular blob".  The "30 degrees" part does not have to be very accurate.  All the robot needs to know is that if it has identified the "black blob" in the living room, then looking left it should see the TV.  Once the TV comes into view, then the robot can center itself on the image, then move on to the next landmark in its map.

If you let your robot explore and store enough of these relations, you might be able to use standard graph methods to get from A to B.  Perhaps the biggest issue will be identifying a landmark as the same as it has seen before.  So the more distinct and robust each landmark can be, the better.
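
As one example of a standard graph method, a breadth-first search over such landmark links returns the shortest chain of "directions". This sketch assumes the map is a plain dictionary; the landmark names beyond the black blob and the TV are made up.

```python
from collections import deque

def find_route(links, start, goal):
    """links maps landmark -> list of (action, next_landmark) pairs.

    Returns the list of (action, landmark) steps from start to goal,
    or None if the goal cannot be reached.
    """
    queue = deque([start])
    came_from = {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for action, nxt in links.get(node, []):
            if nxt not in came_from:
                came_from[nxt] = (node, action)
                queue.append(nxt)
    if goal not in came_from:
        return None
    # Walk back from the goal to the start, then reverse the steps.
    steps = []
    node = goal
    while came_from[node] is not None:
        prev, action = came_from[node]
        steps.append((action, node))
        node = prev
    return steps[::-1]

links = {
    "black blob": [("left 30 degrees", "TV")],
    "TV": [("right 60 degrees", "hall door")],
    "hall door": [("drive through", "kitchen window")],
}
route = find_route(links, "black blob", "hall door")
```

Because the links are directed, a route back needs its own links, which matches the idea that "left 30 degrees" only makes sense from one landmark to the next.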

Regarding your Blob Filter question, I haven't used the array variables yet, but it seems that if you click the "Create array variable" checkbox beneath the Weight Threshold box, and then add a Watch Variables module, you can see that each feature you are filtering on gets a variable and value assigned.  Not sure if that's what you were looking for.

John Christian from Norway  [25 posts] 15 year
Hi again. I have not made any videos yet, but I can assure you that some are coming as soon as I get the robot up and running again (I'll probably post them on YouTube with links on my page). I'm still waiting for a couple of things to get it back on wheels, and will kit it out with a nice Optimus Mini Three screen/button thingy as well as a slim slot-in DVD drive.

About the mental map: you describe approximately what I had in mind, as I think humans also navigate using these associations between landmarks. I planned to have an object-oriented structure with a Room class and a Location class. Each Location has an array of Links to interesting Landmarks, and in each Link I store the angle and distance to the landmark (a rough estimate based on the sensor and the vision-based distance on the floor). Landmark will perhaps have subclasses for Door, Colored blob, Picture/Frame and Circular object (the latter can be identified using the circle detector); many of these will probably subclass Colored blob as that is more generic. As the robot asks about these landmarks, you can give it names for them (door to XYZ, picture of XYZ, TV, sofa, piano, etc.) as well as tell it which blobs to ignore (non-stationary objects).
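
A rough sketch of that class structure using Python dataclasses. The field names, the `visible_near` helper and its 30 degree tolerance are illustrative assumptions rather than the actual design, and only two of the Landmark subclasses are shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Landmark:
    name: Optional[str] = None      # e.g. "sofa", filled in when the robot asks
    ignore: bool = False            # True for non-stationary objects

@dataclass
class ColoredBlob(Landmark):
    rgb: Tuple[int, int, int] = (0, 0, 0)

@dataclass
class Door(Landmark):
    leads_to: Optional[str] = None  # name of the room behind the door

@dataclass
class Link:
    landmark: Landmark
    angle_deg: float                # rough bearing from the scan location
    distance_m: float               # rough range from sonar or vision

@dataclass
class Location:
    links: List[Link] = field(default_factory=list)

    def visible_near(self, heading_deg, tolerance_deg=30.0):
        """Landmarks expected within tolerance of the given heading."""
        def diff(a, b):
            # Smallest absolute difference between two angles in degrees.
            return abs((a - b + 180.0) % 360.0 - 180.0)
        return [l.landmark for l in self.links
                if not l.landmark.ignore
                and diff(l.angle_deg, heading_deg) <= tolerance_deg]

@dataclass
class Room:
    name: str
    locations: List[Location] = field(default_factory=list)
```

Two Locations can hold Links to the same Landmark object, which is how a match found during mapping would be recorded.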

The challenge, as you say, is finding the same landmarks when the robot moves, and especially matching them from different angles. Thankfully the robot can make a fair number of assumptions, since the facing direction limits the number of previously found Landmarks based on the angle they were spotted at. As matches are found during the mapping stage, the Link objects will point to the same Landmarks. It will be interesting, however, to figure out how to represent the robot's current location, since it can be between scanned locations. A good pathfinding algorithm doesn't require the robot to move strictly on these locations; it can be at any position between them. Indeed, it can be hard to know if it's even at a specific location, although we need to know approximately which one it is near to base further movement on.

About the Blob Filter question: I believe the two arrays only return the COGs and weights of the blobs, not their color or size. I need the blob color and size. Although I could get the picture back on the robot and do this myself after a flood fill, I'd love it if the RoboRealm authors could add support for this so one could further process the blobs in one's own code.

Anonymous 15 year
Hi All,

First, great thread! We have not yet had a chance to process everything that has been discussed here, but we will.

I just wanted to point out to John that you can get the color and size of all the blobs into your own application by using the Geometry_Statistics module after the blob filter and clicking on the "individual blobs" radio button in that Geometry GUI. This creates a bunch of arrays with all the associated attributes that can be exported to other applications. You can see all those arrays using the Watch module.

Let me know if this does not work for y'all.

Hopefully we can add more help to this thread, but everyone here seems to be on top of that!

John Christian from Norway  [25 posts] 15 year
Hi Steven and thanks for answering!

Yes, the Geometry Statistics and Color Statistics were just what I was looking for! Actually the Color Statistics seems to provide both color and area (pixel count).

The Geometry Statistics can probably give me some extra clues about the shape of a detected blob, which might help somewhat in picking out interesting landmarks. For example, a blob that is very irregular and spreads all over the image is probably not very interesting, since it's probably a merge of several objects (probably the shadows of several objects). The two that stand out as interesting, though, are the floor and the walls, which might both be irregular too. Another problem is that objects of interest often have several shades of a color, which a flood fill might split into several blobs. So perhaps a good vision algorithm does several passes with different fill thresholds to make a better guess at the shape of an object.

As an example of the problem I outlined above, what would be a good way of picking out an object on the floor? Looking at the colored blobs does not work in this case, as the fill would probably divide the object into many blobs. So in some cases a mask built from the floor colors might be the best way to go, but that would probably also fail if the object is lying against a colored wall or furniture. So I'll probably end up with several passes per image, and I guess as long as I can process 1 or 2 images per second, a robot should be able to navigate around the rooms. Btw, here is a sample picture of a Teletubby toy that needs identification. Basically I need to separate its shape from the floor and get an average color (weighted towards its brightest colors) for the whole blob it defines. This could be a good general object detector, as I can use the shape detector for this if I provide it with several angles of the object as well, and categorize the different objects by color range and real size.

Note however that the pictures have a terrible color cast, and I doubt the webcamera would provide me with an image this "off". A manual correction should be applied here based on indoor lighting (auto color filter tends to turn things blue imo).

John Christian from Norway  [25 posts] 15 year
Just a quick question. I have realised that a blob's smoothness or roughness can be of great interest when distinguishing blobs (it can tell one similarly colored blob from another). How would I go about using the smooth and rough texture filters to get the amount of roughness? The blobs are only labeled with weights, so they are relative to the smoothest or roughest one in the scene (the most extreme being 1.0 and all others weighted down towards 0.0). I'd rather have a non-relative smoothness or roughness value. How can I get that?
from United States  [214 posts] 15 year
Hi John et al.,

Check out this interesting paper.  Looks like I better get back to setting up a 360 degree camera...


