Visual Localization

The ideas introduced in this tutorial have not been verified against the FIRST rules. Please check the rule manual before making any modifications to, or unusual use of, items provided in the FIRST kit of parts.

This tutorial assumes that you have acquired an image of some form and need to process it to determine the distance from the camera to the target. This information can be used to set the shooter speed for the measured distance or to position the robot in the best possible location for shooting.

Let's assume the following four images are to be processed. They show the target at different distances since the techniques used are meant to scale to different sizes so that we can range near and far targets.

By using a model of the pattern to detect we can use the Target Localization module to provide us with additional information about the target. The model is just an image that indicates specific areas using gray, white and black areas that are used to match against potential targets.

The white areas represent the retro-reflective tape, the black areas represent the background areas where no edges should exist, and the gray areas represent the "don't care" areas (i.e. they do not contribute towards the pattern matching and should be ignored).

The results of the module appear as

where the green represents the detected model and the blue quadrilateral represents the 4 points that were used to match the image against the target. Note that not all of the target needs to be visible in order to match. This allows your robot to move closer to the target or behind the pyramid without losing track of the target. Also note that the same module configuration is used for IR or LED enhanced images.

Once the target is detected, several variables are set that indicate the location and distance to the target. These include:

TL_Y_ROTATION - the Y axis rotation of the target relative to the robot. This value reflects what orientation the targeting system needs to rotate to in order to focus on the center of the target (middle goal).

TL_TARGET_DISTANCE - the distance from the robot to the target. This can be used to calibrate the power of the shooting system.

These variables can be transmitted to the CRio for translation into motor or servo values using the Network Tables module.
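
As a rough illustration of that translation step, the sketch below shows one way robot-side code might turn the two values into outputs once they have arrived. It is written in Python for readability and does not use the Network Tables API. The calibration table, the gain, the units of TL_TARGET_DISTANCE and the sign convention of TL_Y_ROTATION are all assumptions that you would replace with your own measurements.

```python
from bisect import bisect_left

# Hypothetical calibration table: (distance, shooter power in 0..1).
# Placeholder numbers only; measure your own shooter on a practice field.
CALIBRATION = [(5.0, 0.40), (10.0, 0.55), (20.0, 0.75), (30.0, 1.00)]


def shooter_power(distance):
    """Linearly interpolate a shooter power for a TL_TARGET_DISTANCE value."""
    distances = [d for d, _ in CALIBRATION]
    # Clamp to the ends of the table instead of extrapolating.
    if distance <= distances[0]:
        return CALIBRATION[0][1]
    if distance >= distances[-1]:
        return CALIBRATION[-1][1]
    i = bisect_left(distances, distance)
    (d0, p0), (d1, p1) = CALIBRATION[i - 1], CALIBRATION[i]
    return p0 + (p1 - p0) * (distance - d0) / (d1 - d0)


def turn_command(y_rotation_deg, gain=0.02, limit=0.5):
    """Proportional turn output from TL_Y_ROTATION, clamped to +/- limit.

    Assumes a positive rotation value means a positive turn output is needed.
    """
    return max(-limit, min(limit, gain * y_rotation_deg))


if __name__ == "__main__":
    print(shooter_power(7.5), turn_command(10.0))
```

A lookup table with interpolation is used here because shooter behavior is rarely linear over the whole range. A single formula may work just as well for your mechanism.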


Lighting - You will need to either use the Kinect or have a very bright LED light with the Axis. One LED light ring is not enough to provide sufficient contrast. We recommend that you get 3 ring lights and place them inside of each other. Refer to the image on the right for the specific configuration that we are referring to. Note that you will need to test the polarity (negative versus positive) of each wire, connect them in parallel (negative to negative, positive to positive) and attach them to a 12V source. Note that as the battery voltage drops so will the lighting, so if detection suddenly starts to fail, check your light power levels.

Specifically, we used the following 60, 80 and 100 mm rings from superbrightleds to create the 3 ring lighting that does a great job.

LED Angel Eye Headlight Accent Lights White 60mm
LED Angel Eye Headlight Accent Lights White 80mm
LED Angel Eye Headlight Accent Lights White 100mm

According to the SuperBrightLed documentation the Natural White LEDs provide the highest lumens (light level).

Many teams are attempting to use colored LED lighting. This is not recommended for two reasons. The first is that blue or red LED lighting will get confused with the red and blue backboards of the two alliances. When looking for contrast, you do NOT want to light the reflective tape with the same color as what is immediately next to it. The second reason is how information is transmitted from the Axis camera. The camera uses JPEG as its image compression format. This compression GREATLY reduces the color information within the image prior to transmission unless you are using VERY low compression (i.e. high quality images). JPEG takes advantage of the fact that the human eye is much less sensitive to color changes than to intensity changes, so it preserves intensity much better than color. Thus, intensity changes will be MUCH sharper than color changes and provide better alignment.
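
To put rough numbers on the intensity versus color point, the sketch below computes the intensity (luma) of a few pure colors using the standard JFIF / BT.601 weights, and counts how many samples survive per plane when the color planes are subsampled 2x2 (often written 4:2:0). The 2x2 subsampling is an assumption. It is a common JPEG encoder default, but we have not confirmed which setting the Axis camera uses.

```python
def luma(r, g, b):
    """Luma (Y) of an 8-bit RGB pixel using the JFIF / BT.601 weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def sample_counts(width, height, h_sub=2, v_sub=2):
    """Samples kept for the luma plane and for EACH chroma plane.

    h_sub and v_sub are the assumed chroma subsampling factors (2x2 = 4:2:0).
    """
    luma_samples = width * height
    chroma_samples = (width // h_sub) * (height // v_sub)
    return luma_samples, chroma_samples


if __name__ == "__main__":
    white, red, blue = (255, 255, 255), (255, 0, 0), (0, 0, 255)
    print("luma of white, red, blue:", luma(*white), luma(*red), luma(*blue))
    print("luma vs chroma samples at 640x480:", sample_counts(640, 480))
```

Under these assumptions white tape stands out from a red or blue board by a large luma difference, and the luma plane keeps four times as many samples as each color plane.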

Image Use - The ideal camera image for the target localization module is a high contrast white on black image, i.e. the retro-reflective tape is white and the rest of the image is black. While ideal for the module, this image will not be of much use to a driver. If your intent is to use the Axis camera to feed human visible images back to the driver, then you will have to use a second camera or an onboard computing solution.

The target localization module is tuned to work on 640x480 images. While it will function on smaller images, the precision will be less and the generated numbers will experience more noise. It is NOT recommended to stream 640x480 images from the Axis camera over WiFi as there are enforced bandwidth throttles that will be exceeded by this image size and your video stream will become very choppy.

Computing - The Target Localization module requires a decent amount of CPU power in order to keep up with fast movements. The Classmate PC provided in the Kit of Parts will not be sufficient. As fast a CPU as possible (preferably an i7) is desirable, but an i3 or i5 (i.e. a multi-core CPU) should be sufficient. Keep in mind that vision processing is very CPU bound, so a machine with limited graphics, screen size, disk size, etc. is fine for vision processing and can help keep the cost under the FRC purchase limit. For example, this laptop from microcenter at $270 should work fine and is well below the $400 purchase limit. Other possibilities are:

NewEgg - i3 - $355 - DELL Inspiron
NewEgg - i3 - $380 - HP Pavilion
NewEgg - i3 - $380 - DELL Inspiron
TigerDirect - i3 - $380 - ASUS X54C-BBK19
FRYS - i3 - $360 - Acer E1-571-6659

If you're looking for something smaller but not as fast (just dual core Atoms), have a look at
Mini-Box Car PC Dual Core Atom 2.13 GHz - $300 (still need an OS)

* This list is provided for convenience only. We have not tested any of the aforementioned systems. Purchase at your own risk!

We will continue to attempt to optimize and reduce the system requirements for the module in the next couple of weeks but having more CPU power than needed is always advised when working with vision processing.

Onboard versus Remote - While it is possible to use the module on a remote machine using a streaming image, it is not recommended to process images using a remote machine. The reasons are as follows:

  • Bandwidth - This year the bandwidth usage of every team will be throttled to below what a 30fps 640x480 stream needs. Even with a 320x240 image, FIRST is recommending a compression ratio which, while acceptable for human viewing, will lose a lot of precision for a localization calculation.
  • Reality Lag - It takes time to compress, send and uncompress a JPG image. By the time the remote laptop gets the image (even at 30fps) the real time of the image can be up to 1 second old. While this is fine for slow motion and stationary analysis, if your robot is moving based on a processed image, it will be reacting to data up to 1 second in the past. This can cause oscillations or waggling of your robot.
  • Image Use - As mentioned above in Image Use, once you use your Axis camera for target localization, its settings are not appropriate for human use. If you plan to use your camera to aid your driver, it's best left as a regular RGB image. (The recommended settings of the Axis camera for target detection are much too dark to see anything but the targets).
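
The bandwidth and lag points above can be checked with some back-of-the-envelope arithmetic. The sketch below estimates the bit rate of a video stream and how far a turning robot has rotated by the time a delayed image is processed. The compression ratio and turn rate are hypothetical inputs, and the actual bandwidth cap is not encoded here, so compare the result against the current FIRST documentation.

```python
def stream_mbps(width, height, fps, compression_ratio, bytes_per_pixel=3):
    """Approximate stream bit rate in Mbit/s.

    compression_ratio is raw size / compressed size (1.0 = uncompressed).
    """
    raw_bits_per_second = width * height * bytes_per_pixel * 8 * fps
    return raw_bits_per_second / compression_ratio / 1e6


def stale_angle_deg(turn_rate_deg_per_s, lag_s):
    """Degrees the robot has already turned when a lagged image is processed."""
    return turn_rate_deg_per_s * lag_s


if __name__ == "__main__":
    # A compression ratio of 20 is an illustrative guess, not a measured value.
    for size in ((640, 480), (320, 240)):
        print(size, round(stream_mbps(size[0], size[1], 30, 20.0), 2), "Mbit/s")
    # Hypothetical robot turning at 90 degrees per second with 1 second of lag.
    print(stale_angle_deg(90.0, 1.0), "degrees out of date")
```

Uncompressed, a 640x480 stream at 30fps works out to about 221 Mbit/s, which is why heavy compression or a smaller image is unavoidable when streaming.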

Target Construction - You will need a FULL target setup to test. Having just a single target (i.e. one square) will NOT work. You also have to adhere to the correct aspect ratio and relative placement when creating your targets (i.e. you can't cheat!). The Target Localization module expects a certain pattern to be seen (as indicated in the above model image) and will not identify the targets correctly if the dimensions are off.

This year, an 18ft target may be difficult to assemble for many teams. In lieu of a wooden target you can get the same desired effect using cardboard. We created our target from colored project display board purchased from HobbyLobby. The reason we purchased these was that they are pre-colored cardboard and still inexpensive. They do have pre-bent folds in them but those can be eliminated/straightened with enough tape. By taping 6 of these together we got the desired width.

We'd also recommend picking up some colored tape with the same color as the board. For example, this red duct tape from Home Depot was useful to secure the front side of the poster boards without changing the color of the face (blue is also available).

If you do not have the room for something this large, you can also create a smaller version by printing out the image to the left and taping small strips of retro-reflective tape over the gray areas to create a scaled target. We also glued ours to a cardboard backing to allow for easy handling. You can click on the image to the left to get a full image that you can print out. As it is much smaller than the real thing, you will need a steady hand to cut and tape the reflective tape over the appropriate areas.

Miniature Target

While not ideal, this miniature target will allow you to test your vision system. Keep in mind that as everything is much smaller, the workable distances will also be very small (less than 2 ft).
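
The reason the distances shrink is simple similar-triangle geometry: a target scaled down by some factor looks the same to the camera as the full-size target placed that many times farther away. The sketch below assumes an ideal pinhole camera, and the widths used are hypothetical (measure your own printout and use the official target dimensions).

```python
def equivalent_full_size_distance(test_distance, mini_width, full_width):
    """Distance at which the full-size target would look the same size as the
    miniature target does at test_distance (all values in the same unit)."""
    return test_distance * full_width / mini_width


if __name__ == "__main__":
    # Hypothetical: a printout one tenth the width of the real target,
    # viewed from 2 ft away.
    print(equivalent_full_size_distance(2.0, 1.0, 10.0), "ft")
```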

Retro-Reflective Tape - Naturally, to create the target and other test boards you'll need quite a bit of retro-reflective tape. The tape used in the actual competition is really expensive (~$500 a roll) so we instead used this White Reflective Tape from ULINE. At $20 per roll (you'll need at least 2) it's much less expensive than the competition tape. So what's the catch? Well, this tape is OK in terms of reflectivity from straight ahead but is not as good from the sides, i.e. you'll need more light in order to get a good reflection ... but for testing purposes it is more than sufficient. If you test with this tape, then at the actual regionals/finals your lighting will work even better.

Configurations - So what's the best configuration for the target localization module?

  1. Best - Onboard Computing, Kinect - The best configuration is to use an onboard computing solution plugged into the Kinect with a blurring mask covering the Kinect projector. This solution provides the cleanest image free of most background objects which can be processed the fastest. In this case an i3 will have no trouble keeping up with the needed processing. No bandwidth issues are present, no calibration of the camera is needed (the Kinect is precalibrated) which allows for maximum speed of processing and quickest results. As the Kinect acts like a webcam, there is minimal lag from reality and the image information is not compressed or changed in any way. For those looking to further increase the range, IR LEDs such as the 315mW 850nm Infrared LED from SuperBrightLeds can help with the IR illumination seen by the Kinect's IR camera.

    The disadvantage is that you need an onboard laptop with appropriate padding to help protect from any sudden crashes (although laptops are quite durable these days). It also requires mounting the Kinect in the appropriate spot which is much larger than the typical webcam or Axis camera. The Kinect also requires a 12V power supply and cutting of its power cable in order to attach to your onboard battery.

  2. Good - Onboard Computing, Webcam, LED lighting - If you don't have a Kinect handy, the 3 ring lighting described above combined with any regular webcam that can be set to a fast shutter speed or low exposure will offer similar advantages. With enough lighting, an image very similar to the Kinect's can be produced, which offers the same speed advantages as using the Kinect. With the webcam directly connected to a PC you get the same minimal lag in communication and quickest response time with no bandwidth concerns.

    An advantage of the above solution is that a webcam is less expensive than the Kinect and typically much smaller in size. You still need to wire a power supply and purchase the additional LEDs, so in total the price will be similar to but slightly under the cost of a Kinect (~$120).

    A disadvantage is that you may need to have a calibration module like the Radial Distortion module to help straighten the target lines prior to processing with the Target Localization module.

  3. Ok - Onboard Computing, Axis, LED lighting - The main disadvantage is that the time taken to compress and decompress the JPG image will cause a slight delay from reality. Since the Axis is local and connected directly to the PC you can set the lowest compression (best quality image) so that JPG artifacts and color reduction are reduced.

    Another disadvantage is that if you use the Axis for target detection, the image will not be very usable for the driver station (i.e. it will be too dark).

  4. Poor - Remote Computing, Axis, LED lighting - You will have significant time lag so your robot will need to be moving very slowly or stopped. You will not have as much precision due to the need to stream the image at 320x240. You will suffer from JPG artifacts and the image will still not be usable to the driver. You will also need the calibration module in order to straighten out the target lines (the Axis camera has significant radial distortion).

  5. Forget it! - Classmate, Axis, no LED - Not even worth trying ... the Classmate is too slow, the images will be lagged from reality, and the target will not be reliably detected due to the lack of significant contrast.

You can read more about this module at the Target Localization module documentation.

To try this out yourself:

  1. Download RoboRealm
  2. Install and Run RoboRealm
  3. Load and Run the following zip file which contains the FRC 2013 target graphic and contest field which should produce the above results. It includes the Axis module to grab images, Radial Distortion to correct the image, Target Localization to locate the target and finally the Network Tables module to send out the values.
  4. You may need to configure the camera's Field Of View depending on which camera you are using. You can also change that number slightly to help improve the accuracy of your results.
  5. You may need to change the focal length if matching from the side view is not very accurate.
  6. Configure the Axis module IP address for your camera
  7. Configure the Network Tables module to the IP address of the CRio.
  8. Note that this will send out the 3 hotspot locations in the target to the CRio.
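
To see why the Field Of View setting in step 4 affects the reported distance, it helps to look at a simple pinhole camera model. The sketch below converts a horizontal field of view into a focal length in pixels, then estimates the distance to a target of known width seen straight on. This is only an illustration of the geometry, not the Target Localization module's actual algorithm, and the function names and example numbers are hypothetical.

```python
import math


def focal_length_px(image_width_px, horizontal_fov_deg):
    """Focal length in pixels for an ideal pinhole camera."""
    half_fov = math.radians(horizontal_fov_deg) / 2
    return (image_width_px / 2) / math.tan(half_fov)


def distance_to_target(real_width, width_px, image_width_px, horizontal_fov_deg):
    """Distance (in the unit of real_width) to a target facing the camera."""
    f = focal_length_px(image_width_px, horizontal_fov_deg)
    return real_width * f / width_px


if __name__ == "__main__":
    # Hypothetical: a 2 ft wide target spanning 64 pixels of a 640 pixel image.
    for fov in (45.0, 50.0, 55.0):
        print(fov, "degrees ->", round(distance_to_target(2.0, 64, 640, fov), 2), "ft")
```

In this model a field of view that is set too wide makes every target read closer than it really is, and one set too narrow makes it read farther away. That is why small adjustments to the number can improve accuracy.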

If you have any problems with your images and can't figure things out, let us know and post it in the forum.