Low-cost Localization

My system is distributed as a ROS package in the Stanford AI Lab (SAIL) ROS package repository, sail-ros-pkg.svn.sourceforge.net.

I don't currently have a scalable way to distribute my test data files, but let me know if you want access and we'll figure something out. The following sections describe how to run the system on test data to produce a localization estimate that fuses WiFi sniffs and computer vision.

Performance Considerations

I have tried to parallelize my implementation wherever possible. Be sure to set NUM_CORES to the number of threads your CPU can handle (e.g., 8 for a current Core i7). This symbol is defined in the top-level CMakeLists.txt file.
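
For reference, the NUM_CORES definition looks something like the following; the exact line in your checkout may differ, so treat this as a sketch:

```cmake
# Hypothetical sketch of the top-level CMakeLists.txt line.
# Set the value to the number of hardware threads on your machine.
add_definitions(-DNUM_CORES=8)
```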

Ground Truth

Ground-truth data is at the heart of my system. It is used to build maps of how the cheap sensors (wifi, camera, etc.) respond to the environment at various positions and orientations.

./localize_lidar gates2a.backpack_height.png backpack-06mar2010.laser backpack-06mar2010.walking

That program will generate a movie of the particle cloud as well as a file containing the centroid of the cloud for each laser scan.
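
The centroid computation itself is simple: for each laser scan, average the particle positions (weighted, if your filter keeps weights). A minimal sketch in Python with hypothetical names, not the actual localize_lidar code:

```python
import numpy as np

def particle_centroid(positions, weights=None):
    """Weighted centroid of a particle cloud.

    positions: (N, 2) array of particle (x, y) positions.
    weights:   optional (N,) array of particle weights.
    """
    positions = np.asarray(positions, dtype=float)
    if weights is None:
        return positions.mean(axis=0)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize weights
    return (positions * w[:, None]).sum(axis=0)

# Example: four particles in a unit square -> centroid at the center.
print(particle_centroid([[0, 0], [1, 0], [0, 1], [1, 1]]))  # [0.5 0.5]
```

Note that averaging orientations this way is wrong for angles near the wrap-around; a circular mean would be needed if you also want a centroid heading.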

Gathering your own data

If you need/want your own datasets, you can use the laser fanny pack. This is a fragile experimental device, so please please please be careful with it! Once you have the “locloc” ROS package built via rosmake from the sail-ros-pkg repository, you can strap on the fanny pack, plug in its USB cable, and do this:

roscd locloc
roslaunch hardware/log_data.launch

This will fire up a node that talks to the laser range finder and accelerometers onboard the fanny pack, as well as a rosbag instance that logs all messages to disk, producing a ROS “bag file” full of your messages.

Please note that the fanny pack is powered by a pack of ten AA NiMH batteries, and the laser draws about 10 watts. This means that, at best, the laser can run for two hours. Use the power switch on the side to turn off the laser when you aren't actually gathering data! Measure the battery voltage between data runs; when it drops to 11 volts or so, the batteries are exhausted and need to be recharged to prevent damage.
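
The two-hour figure follows from simple battery arithmetic. Assuming typical ~2000 mAh AA NiMH cells (an assumption; check the rating on your actual cells):

```python
cells = 10
cell_voltage = 1.2        # nominal NiMH volts per cell
cell_capacity_ah = 2.0    # assumed ~2000 mAh cells; check yours
laser_power_w = 10.0      # laser draw, from the text above

pack_energy_wh = cells * cell_voltage * cell_capacity_ah   # 24 Wh
runtime_h = pack_energy_wh / laser_power_w                 # 2.4 h
print(round(runtime_h, 1))  # roughly two hours at best, less in practice
```

The 11-volt cutoff is consistent with this: 11 V across ten cells is 1.1 V per cell, about the safe discharge floor for NiMH chemistry.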

Once you have your .bag file, you can run the fannylog_to_ascii program to extract ASCII logs from the (binary) rosbag format:

roscd locloc
bin/fannylog_to_ascii BAGFILE

That will drop BAGFILE.laser.txt and BAGFILE.imu.txt files in the same directory that held BAGFILE. Note that the bagfiles are (by default) stored in $ROS_HOME (typically ~/ros/ros or ~/.ros). You can tweak the launch file to change this.

First, you need to extract the segments from your log where you were actually walking. To do this, run the detect_walking_fannypack.m script in the src/pedometry directory. This will produce a .walking file, which is just a two-column ASCII file with a row for each accelerometer timestamp, and a “0” or “1” indicating if the person was walking or not at that time. This is used later to select which motion model to employ at each timestep.
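
If you want a rough idea of what a walking detector like detect_walking_fannypack.m does (this is a sketch of the idea, not the actual script): walking shows up as high short-term variance in the accelerometer magnitude, so a sliding-window variance threshold is enough to produce the 0/1 column. A hypothetical Python version:

```python
import numpy as np

def detect_walking(t, accel_mag, window=1.0, thresh=0.5):
    """Label each accelerometer sample 1 (walking) or 0 (still).

    t:         (N,) timestamps in seconds.
    accel_mag: (N,) magnitudes of the acceleration vector.
    window:    sliding-window half-width in seconds.
    thresh:    variance threshold; units depend on your IMU, so tune it.
    """
    t = np.asarray(t, dtype=float)
    a = np.asarray(accel_mag, dtype=float)
    labels = np.zeros(len(t), dtype=int)
    for i in range(len(t)):
        mask = np.abs(t - t[i]) <= window       # samples near this timestamp
        labels[i] = int(a[mask].var() > thresh)  # high variance -> walking
    return labels
```

Writing `np.column_stack([t, labels])` out with `np.savetxt` would give the two-column ASCII format described above.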

Now you should have a BAGFILE.laser.txt file and a BAGFILE.walking file, and you can fire up localize_fannypack. This program assumes you wore the fanny pack with the laser facing forward.

bin/localize_fannypack MAP LASER WALKING

It should fire up a GUI which shows the particles. It may require motion-model or measurement-model tweaking, depending on how your walking style differs from mine. It will be writing a ground-truth file with the centroid of the particle cloud, which you can use as ground-truth data for the subsequent steps of this system.
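
The walking flag is what selects the motion model at each timestep: when walking, particles are propagated forward by a noisy stride length plus heading drift; when still, only small jitter is applied. A hedged sketch with illustrative parameters of my own choosing, not the ones localize_fannypack actually uses:

```python
import numpy as np

def predict(particles, walking, rng,
            step=0.7, step_sigma=0.15, heading_sigma=0.1, idle_sigma=0.02):
    """One particle-filter prediction step.

    particles: (N, 3) array of (x, y, theta) poses.
    walking:   0/1 flag from the .walking file for this timestep.
    """
    p = particles.copy()
    n = len(p)
    if walking:
        d = rng.normal(step, step_sigma, n)           # noisy stride length
        p[:, 2] += rng.normal(0.0, heading_sigma, n)  # heading drift
        p[:, 0] += d * np.cos(p[:, 2])
        p[:, 1] += d * np.sin(p[:, 2])
    else:
        p[:, :2] += rng.normal(0.0, idle_sigma, (n, 2))  # small jitter only
    return p
```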

Feature Extraction

This is the most computation- and disk-intensive part of the process. For each frame in the test data, the programs extract visual features using SURF, HOG, and color-histogram methods.

feature_extract VIDEO_LOG surf
feature_extract VIDEO_LOG densehog
feature_extract VIDEO_LOG color

where VIDEO_LOG is a raw video stream. I'm using logs created by my


package in ROS. This works with UVC webcams (Logitech makes the best linux-friendly ones right now), and just dumps the raw YUV frames to disk, interleaving a timestamp.


To perform the bag-of-words vision algorithm, the feature descriptors must be “binned” somehow to ensure that each image has a descriptor that is independent of the number of features in it. My current system uses k-means to find reasonable centroids in the high-dimensional feature descriptor space.

kmeans_surf SURF_LOG NUM_MEANS
kmeans_hog  HOG_LOG  NUM_MEANS

where SURF_LOG and HOG_LOG were created by the previous step and will be in the same directory as the original video log. I haven't experimented with different values of K, but the system seems to run fine with 150. It would be interesting to plot performance as a function of K; doing so would require a fair amount of CPU time, perhaps a few days with my current parameter values.

Unfortunately, k-means is never really “done,” since you can't tell whether a solution is just a local minimum. My program simply restarts k-means 10 times, saving the best result as it goes. You can ctrl-c it if you don't want to wait for all 10 attempts to converge.
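
The restart logic amounts to running Lloyd's algorithm from several random initializations and keeping the solution with the lowest total within-cluster squared distance. A compact numpy sketch (illustrative only; the real programs operate on high-dimensional SURF/HOG descriptors):

```python
import numpy as np

def kmeans(X, k, rng, iters=50):
    """One run of Lloyd's algorithm; returns (centroids, inertia)."""
    C = X[rng.choice(len(X), k, replace=False)]          # random init
    for _ in range(iters):
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)  # sq. distances
        assign = d.argmin(1)
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                C[j] = pts.mean(0)                        # move centroid
    inertia = ((X - C[assign]) ** 2).sum()
    return C, inertia

def kmeans_restarts(X, k, restarts=10, seed=0):
    """Keep the best of several random restarts (lowest inertia wins)."""
    rng = np.random.default_rng(seed)
    return min((kmeans(X, k, rng) for _ in range(restarts)),
               key=lambda r: r[1])
```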

Image Histograms

With the clusters produced by k-means in the previous step, the next task is to create a histogram for each image. This is done by assigning every feature in an image to the nearest k-means cluster and counting how many features fall into each cluster. To simplify later steps, I merge this with the ground-truth data by assigning each image the ground-truth position whose timestamp is closest to the image's timestamp.
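
In code, the binning step is just a nearest-centroid lookup followed by a bincount. A minimal sketch with hypothetical names (the real pipeline also attaches the ground-truth pose, and the "pyr" files add spatial-pyramid levels):

```python
import numpy as np

def image_histogram(features, centroids):
    """Bag-of-words histogram: count features per nearest cluster.

    features:  (N, D) descriptors from one image.
    centroids: (K, D) cluster centers from k-means.
    Returns a length-K count vector.
    """
    F = np.asarray(features, dtype=float)
    C = np.asarray(centroids, dtype=float)
    d = ((F[:, None, :] - C[None, :, :]) ** 2).sum(-1)  # (N, K) sq. dists
    nearest = d.argmin(axis=1)                          # cluster per feature
    return np.bincount(nearest, minlength=len(C))
```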


Image-only Localization

./localize_surf clusters.150.txt backpack-06mar2010.feature.surf.hist.pyr.txt backpack-08mar2010-train.feature.surf.hist.pyr.txt gates2a.backpack_height.png backpack-08mar2010-train.laser.groundtruth 2000 10000 2 10000 100

Wifi Localization

First, you have to make “heat maps” of each access point from a training set.
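
A heat map here means, for each access point, a grid over the map where each cell stores the mean signal strength observed there in the training set. A hedged sketch of the accumulation step (hypothetical structure and parameters; the real program's file formats differ):

```python
import numpy as np

def build_heatmap(positions, rssi, extent, cell=1.0):
    """Mean RSSI per grid cell for one access point.

    positions: (N, 2) ground-truth (x, y) for each sniff, in meters.
    rssi:      (N,) signal strengths observed for this AP.
    extent:    (xmax, ymax) of the map; grid covers [0, xmax] x [0, ymax].
    cell:      grid resolution in meters.
    """
    nx = int(np.ceil(extent[0] / cell))
    ny = int(np.ceil(extent[1] / cell))
    total = np.zeros((nx, ny))
    count = np.zeros((nx, ny))
    for (x, y), s in zip(positions, rssi):
        i = min(int(x / cell), nx - 1)
        j = min(int(y / cell), ny - 1)
        total[i, j] += s
        count[i, j] += 1
    # Cells with no observations become NaN ("unknown").
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)
```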