Showing posts with label photosynth. Show all posts

2009-07-03

Building Bundler v0.3 on Ubuntu

'The Office Box' requested help with running Bundler on Linux (specifically Ubuntu 9; I have a VirtualBox VM of Ubuntu 8.04 fully updated as of today, but I'll try this out on Ubuntu 9 soon), so I went through the process myself.

The binary version depends on libgfortran.so.3, which I couldn't find with aptitude, so I tried building from source, and it turned out to be not that hard. There is no 'configure' for Bundler 0.3 to search for dependencies that aren't installed, so I built incrementally and installed packages as I ran into build failures. I might be missing a few that I already had installed for other purposes, but do a sudo aptitude install on the following:

build-essential
gfortran-4.2
zlib1g-dev
libjpeg-dev


A missing gfortran produces the cryptic "error trying to exec 'f951': execvp: No such file or directory" message.

These might also be necessary, I'm not sure:

lapack3
libminpack1
f2c


After that, run the provided makefile, add the bundler bin folder to your LD_LIBRARY_PATH, then go into the examples/kermit directory and run ../../RunBundler.sh and check that there are good ply files in the bundle directory. Bundler is a lot slower than Photosynth for big jobs, though I haven't tried the Intel math libs.

The full output from a successful kermit RunBundler run looks like this:

Using directory '.'
0
Image list is list_tmp.txt
[Extracting exif tags from image ./kermit000.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit001.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit002.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit003.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit004.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit005.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit006.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit007.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit008.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit009.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Extracting exif tags from image ./kermit010.jpg]
[Focal length = 5.400mm]
[Couldn't find CCD width for camera Canon Canon PowerShot A10]
[Found in EXIF tags]
[CCD width = 5.230mm]
[Resolution = 640 x 480]
[Focal length (pixels) = 660.803
[Found 11 good images]
[- Extracting keypoints -]
Finding keypoints...
1245 keypoints found.
Finding keypoints...
1305 keypoints found.
Finding keypoints...
1235 keypoints found.
Finding keypoints...
1220 keypoints found.
Finding keypoints...
1104 keypoints found.
Finding keypoints...
1159 keypoints found.
Finding keypoints...
949 keypoints found.
Finding keypoints...
1108 keypoints found.
Finding keypoints...
1273 keypoints found.
Finding keypoints...
1160 keypoints found.
Finding keypoints...
1122 keypoints found.
[- Matching keypoints (this can take a while) -]
../../bin/KeyMatchFull list_keys.txt matches.init.txt
[KeyMatchFull] Reading keys took 1.020s
[KeyMatchFull] Matching to image 0
[KeyMatchFull] Matching took 0.010s
[KeyMatchFull] Matching to image 1
[KeyMatchFull] Matching took 0.170s
[KeyMatchFull] Matching to image 2
[KeyMatchFull] Matching took 0.380s
[KeyMatchFull] Matching to image 3
[KeyMatchFull] Matching took 0.560s
[KeyMatchFull] Matching to image 4
[KeyMatchFull] Matching took 0.740s
[KeyMatchFull] Matching to image 5
[KeyMatchFull] Matching took 0.960s
[KeyMatchFull] Matching to image 6
[KeyMatchFull] Matching took 1.060s
[KeyMatchFull] Matching to image 7
[KeyMatchFull] Matching took 1.210s
[KeyMatchFull] Matching to image 8
[KeyMatchFull] Matching took 1.410s
[KeyMatchFull] Matching to image 9
[KeyMatchFull] Matching took 1.600s
[KeyMatchFull] Matching to image 10
[KeyMatchFull] Matching took 1.760s
[- Running Bundler -]
[- Done -]

2009-01-29

Bundler - the Photosynth core algorithms GPLed

bundler 212009 65922 AM.bmp
[Update: the output of Bundler is less misaligned than it looks here. I was displaying the results incorrectly in this image and in the video.]

Bundler (http://phototour.cs.washington.edu/bundler) takes photographs and creates 3D point clouds and camera positions derived from them, similar to what Photosynth does. This is called structure from motion. It's hard to believe this has been out as long as the publicly available Photosynth and I hadn't heard about it; it seems to be in stealth mode.


Bundler - GPLed Photosynth - Car from binarymillenium on Vimeo.

From that video it is apparent that highly textured flat surfaces do best. The car is reflective and dull grey and so generates few correspondences, but the hubcaps, license plate, parking strip lines, and grass and trees work well. I wonder if this could be combined with a space carving technique to get a better car out of it.

It's a lot rougher around the edges, lacking the Microsoft Live Labs contribution. A few sets I've tried have crashed with messages like "RunBundler.sh: line 60: 2404 Segmentation fault (core dumped) $MATCHKEYS list_keys.txt matches.init.txt", and sometimes individual images throw it with "This application has requested the Runtime to terminate it...", but it appears to plow through (until it reaches that former error).

Images without good EXIF data trip it up. The other day I was trying to search Flickr for only images that have EXIF data and allow full-size viewing, but I haven't been successful so far. Some search strings are supposed to limit results by focal length, which seems like it would limit results to images with EXIF data, but that wasn't the case.

Bundler outputs ply files, which can be read in Meshlab provided these two lines are added to the ply header:

element face 0
property list uchar int vertex_index

Without this Meshlab will give an error about there being no faces, and give up.
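The header edit can also be scripted. Here is a minimal Python sketch, assuming the ply file is ASCII and that the two lines belong just before the end_header line (after the vertex element); the function name add_empty_face_element is my own and not part of Bundler or Meshlab:

```python
def add_empty_face_element(ply_text):
    """Return ply_text with an empty face element inserted before end_header.

    Assumes an ASCII ply file. If the header already declares a face
    element, the text is returned unchanged.
    """
    face_lines = [
        "element face 0",
        "property list uchar int vertex_index",
    ]
    out = []
    patched = False
    for line in ply_text.split("\n"):
        if not patched:
            if line.strip().startswith("element face"):
                # Header already has a face element, nothing to do
                return ply_text
            if line.strip() == "end_header":
                out.extend(face_lines)
                patched = True
        out.append(line)
    if not patched:
        raise ValueError("no end_header line found; is this a ply file?")
    return "\n".join(out)


# Example use (the file names are placeholders):
# with open("points.ply") as f:
#     fixed = add_empty_face_element(f.read())
# with open("points_meshlab.ply", "w") as f:
#     f.write(fixed)
```

Everything after end_header is passed through untouched, so the point data itself is not modified.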

Also I have some Processing software that is a little less user friendly but doesn't require the editing:

http://code.google.com/p/binarymillenium/source/browse/trunk/processing/bundler/


Bundler can't handle filenames with spaces right now. I think I can fix this myself without too much work; it's mostly a matter of making sure names are passed everywhere with quotes around them.

Multi-megapixel files load up sift significantly until it crashes after taking a couple of gigabytes of memory (and probably not being able to get more from Windows):

...
[Found in EXIF tags]
[CCD width = 5.720mm]
[Resolution = 3072 x 2304]
[Focal length (pixels) = 3114.965
[Found 18 good images]
[- Extracting keypoints -]

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.


Resizing them to 1600x1200 worked without crashing and took only a few hundred megabytes of memory per image, so more megapixels may work as well.

The most intriguing feature is the incremental option. I haven't tested it yet, but it promises to be able to take new images and incorporate them into existing bundles. Unfortunately each new image has a matching time proportional to the number of previous images. Maybe it would be possible to incrementally remove images as well, or remove found points that are in regions that already have high point densities?

2008-08-31

Photosynth Export Process Tutorial

It looks like I have unofficial recognition/support for my export process, but I get the feeling it's still too user unfriendly:

http://getsatisfaction.com/livelabs/topics/pointcloud_exporter


What to do

Get Wireshark http://www.wireshark.org/

Allow it to install the special software to intercept packets.

Start Wireshark. Put

http.request

into the filter field.

Quit any unnecessary network activity like playing YouTube videos, since this will dump a lot of data into Wireshark that will make finding the bin files harder.

*** Update ***
Some users have found the bin files stored locally in %temp%/photosynther, which makes finding them much easier than using Wireshark. For me on Vista the directory exists for one user but not others, but the bin files have to be stored locally somewhere, right?
***

Open the photosynth site in a browser. Find a synth with a good point cloud; it will probably be one with several hundred photos and a synthiness of > 70%. There are some synths that are 100% synthy but have point clouds that are flat billboards rather than cool 3D features, and you don't want those. Press p or hold ctrl to see the underlying point cloud.

*** Update ***
Use the direct 3d viewer option to view the synth, otherwise you won't get the synth files. (thanks losap)
***

Start a capture in Wireshark: click the upper-left button and then click the proper interface (experiment if necessary).

Hit reload on the browser window showing the synth. Wireshark should then start showing what files are being sent to your computer. Stop the capture once the browser has finished reloading. There may be a couple of screenfuls, but near the bottom there should be a few listings of bin files.



Select one of the lines that shows a bin file request, right-click, and hit Copy | Summary (text). Then paste that into the address field of a new browser window. Delete the parts before and after /d8/348345348.../points_0_0.bin. Look back in Wireshark to discover what http address to put before that; it should be http://mslabs-nnn.vo.llnwd.net, where nnn is any three-digit number. TBD: is there a way to cut and paste the fully formed url less manually?

Hit return to make the browser load the file. If the url is correct a dialog will pop up; save the file to disk. If there were many points bin files, increment the last number in the file name and get them all. If you have cygwin, a bash script works well:

for i in $(seq 0 23)
do
    wget "http://someurl/points_0_$i.bin"
done
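For anyone without cygwin, roughly the same loop can be written in Python. This is only a sketch using the standard library: http://someurl is the same placeholder as in the bash loop, and the function names points_urls and download_all are made up for this example.

```python
import urllib.request


def points_urls(base_url, group, count):
    """Build the urls for points_<group>_0.bin .. points_<group>_<count-1>.bin."""
    base = base_url.rstrip("/")
    return ["%s/points_%d_%d.bin" % (base, group, n) for n in range(count)]


def download_all(urls):
    """Save each url into the current directory under its own file name."""
    for url in urls:
        filename = url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, filename)


# Equivalent of the bash loop above (24 files, group 0):
# download_all(points_urls("http://someurl", 0, 24))
```

The count of 24 matches `seq 0 23`; use however many points files Wireshark showed for your synth.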


Python

Install python. If you have cygwin, install the cygwin python with setup.exe; otherwise go to http://www.python.org/download/ and download the Windows installer version.
*** UPDATE *** It appears the 2.5.2 Windows python doesn't work correctly, which I'll look into. The best solution for now is to use Linux or Cygwin with the python that can be installed there. ***

Currently the script http://binarymillenium.googlecode.com/svn/trunk/processing/psynth/bin_to_csv.py works like this from the command line:

python bin_to_csv.py somefile.bin > output.csv


The '>' redirects the script's output to a file; this works in cygwin and should also work in the Windows command prompt. I'll update the script to optionally take a second argument that is the output file.

If there are multiple points bin files it's easy to do another bash loop to process them all in a row, otherwise manually do the command above and create n different csvs for n bin files, and then cut and paste the contents of each into one complete csv file.
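The cut-and-paste step can be skipped with a few lines of Python. A minimal sketch, assuming each csv was produced by bin_to_csv.py and has no header row; merge_csv_files is a name I made up for this example:

```python
def merge_csv_files(input_paths, output_path):
    """Concatenate several csv files into one, skipping blank lines.

    Returns the number of lines written.
    """
    written = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as f:
                for line in f:
                    if line.strip():
                        # Normalize line endings so files without a
                        # trailing newline don't run together
                        out.write(line.rstrip("\r\n") + "\n")
                        written += 1
    return written


# Example (file names are placeholders):
# merge_csv_files(["points_0_0.csv", "points_0_1.csv"], "points_all.csv")
```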

The output will be a file with a long listing of numbers; each line looks like this:

-4.17390823, -1.38746762, 0.832364499, 24, 21, 16
-4.07660007, -1.83771312, 1.971277475, 17, 14, 9
-4.13320493, -2.56310105, 2.301105737, 10, 6, 0
-2.97198987, -1.44950056, 0.194522276, 15, 12, 8
-2.96658635, -1.45545017, 0.181564241, 15, 13, 10
-4.20609378, -2.08472299, 1.701148629, 25, 22, 18


The first three numbers are the xyz coordinates of a point, and the last three are the red, green, and blue components of the color. In order to get a conventional 0-255 number for each color channel, red and blue would have to be multiplied by 8, and green by 4. The python script could easily be changed to do that, or even to convert the color channels to 0.0-1.0 floating point numbers.
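As a sketch of that scaling (this is not what bin_to_csv.py currently does, and the function names are only for illustration), one line of the csv could be converted like this:

```python
def scale_color(red, green, blue):
    """Scale 5-bit red/blue and 6-bit green up to the 0-255 range."""
    return red * 8, green * 4, blue * 8


def scale_csv_line(line):
    """Rewrite one 'x, y, z, r, g, b' line with 0-255 color values."""
    fields = [field.strip() for field in line.split(",")]
    x, y, z = fields[0:3]
    r, g, b = scale_color(int(fields[3]), int(fields[4]), int(fields[5]))
    # The coordinates are passed through as text so no precision is lost
    return "%s, %s, %s, %d, %d, %d" % (x, y, z, r, g, b)


print(scale_csv_line("-4.17390823, -1.38746762, 0.832364499, 24, 21, 16"))
# prints: -4.17390823, -1.38746762, 0.832364499, 192, 84, 128
```

Note that plain multiplication tops out at 248 for red and blue and 252 for green, so pure white comes out slightly dim.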

Point Clouds - What Next?
The processing files here can use the point clouds:
http://binarymillenium.googlecode.com/svn/trunk/processing/psynth/

Also programs like Meshlab can use them with some modification- I haven't experimented with it much but I'll look into that and make a post about it.

2008-08-28

Color Correction



I have the colors figured out now: I was forgetting to byteswap the two color bytes, and after that the rgb elements line up nicely. And it's 5:6:5 bits per color channel rather than 4 as I thought previously, thanks to Marvin who commented below.

The sphinx above looks right, but earlier the boxer shown below looked so wrong I colored it falsely to make the video:



The Boxer - Photosynth Export from binarymillenium on Vimeo.

But I've fixed the boxer now:



The python script is updated with this code:


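# Swap the two bytes of each 16-bit color value before unpacking it
bin.byteswap()
# Colors are packed 5:6:5: red in the top 5 bits, green in the middle 6, blue in the low 5
red = (bin[0] >> 11) & 0x1f
green = (bin[0] >> 5) & 0x3f
blue = (bin[0] >> 0) & 0x1f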
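The same unpacking as a standalone sketch that can be tried without a bin file. The helper names are made up here, and reading the two bytes as big-endian is my assumption about what the byteswap amounts to on a little-endian PC:

```python
def unpack_rgb565(value):
    """Split a 16-bit 5:6:5 color value into (red, green, blue)."""
    red = (value >> 11) & 0x1f
    green = (value >> 5) & 0x3f
    blue = value & 0x1f
    return red, green, blue


def color_from_bytes(two_bytes):
    """Decode the two raw color bytes, assuming big-endian byte order."""
    return unpack_rgb565(int.from_bytes(two_bytes, "big"))


print(unpack_rgb565(0xF800))           # prints: (31, 0, 0)
print(color_from_bytes(b"\x07\xe0"))   # prints: (0, 63, 0)
```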

2008-08-27

Exporting Point Clouds From Photosynth


Since my last post about photosynth I've revisited the site and discovered that the pictures can be toggled off with the 'p' key, and the viewing experience is much improved given there is a good point cloud underneath. But what use is a point cloud inside a browser window if it can't be exported to be manipulated into random videos that could look like all the lidar videos I've made, or turned into 3D meshes and used in Maya or any other program?

Supposedly export will be added in the future, but I'm impatient like one of the posters on that thread so I've gone forward and figured out my own export method without any deep hacking that might violate the terms of use.

Using one of those programs to intercept 3D api calls might work, though maybe not with DirectX or however the photosynth browser window is working. What I found with Wireshark is that http requests for a series of points_m_n.bin files are made. The m is the group number; if the photosynth is 100% synthy then there will only be one group, labeled 0. The n splits up the point cloud into smaller files; for a small synth there could just be points_0_0.bin.

Inside each bin file is raw binary data. There is a variable length header which I have no idea how to interpret, sometimes it is 15 bytes long and sometimes hundreds or thousands of bytes long (though it seems to be shorter in smaller synths).

But after the header there is a regular set of position and color values each 14 bytes long. The first 3 sets of 4 bytes are the xyz position in floating point values. In python I had to do a byteswap on those bytes (presumably from network order) to get them to be read in right with the readfile command.
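Here is a minimal sketch of reading those 14-byte records with Python's struct module. It assumes big-endian (network order) values, and it assumes you already know the header length for your file, since I can't decode the header; the names RECORD and read_points are made up for this example. The color is left as a raw 16-bit integer.

```python
import struct

# Three big-endian 32-bit floats (x, y, z) followed by one 16-bit color value
RECORD = struct.Struct(">fffH")  # 4 + 4 + 4 + 2 = 14 bytes


def read_points(data, header_length):
    """Return a list of (x, y, z, raw_color) tuples from raw bin file bytes.

    header_length is the number of bytes to skip at the start; it varies
    from file to file and has to be supplied by the caller.
    """
    points = []
    offset = header_length
    while offset + RECORD.size <= len(data):
        points.append(RECORD.unpack_from(data, offset))
        offset += RECORD.size
    return points


# Example (file name and header length are placeholders):
# with open("points_0_0.bin", "rb") as f:
#     points = read_points(f.read(), 15)
```

Any trailing bytes that don't make up a full record are ignored.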

The last 2 bytes are the color of the point. It's only 4 bits per color channel, which is strange. The first four bits I don't know about; the last three sets of 4 bits are red, blue, and green. Why not 8 bits per channel? Does the photosynth process not produce that level of precision because it is only loosely matching the color of corresponding points in photos? Anyway, as the picture above shows, I'm doing the color wrong: if I have a pure red or green synth it looks right, but maybe a different color model than standard rgb is at work.

I tried making a photosynth of photos that were masked to be blue only- and zero synthiness resulted - is it ignoring blue because it doesn't want to synth up the sky in photos?

Anyway here is the python script for interpreting the bin files.

The sceneviewer (taken from the Radiohead sceneviewer) in that source dir works well for displaying them also.

Anyway, to repeat this for any synth, use Wireshark to figure out where the bin files are served from (filter with http.request); then they can be downloaded in Firefox or with wget or curl, my script can be run on them, and Processing can view them. The terms of use don't clearly specify how the point clouds are covered, so redistribution of point clouds, especially those not from your own synths or from someone who didn't CC license theirs, may not be kosher.

2008-08-23

Photosynth

When I first saw the original demo I was really impressed, but now that it's been released I feel like it hasn't advanced enough since that demo to really be useful. I tried a few random synths when the server was having problems; it looks like it isn't being hammered any longer, so I ought to try it again soon when I'm using a compatible OS.

Overall it's confused and muddled to use and look at- like a broken quicktime VR.

Photosynth seems to work best in terms of interface and experience when it is simply a panoramic viewer of stitched together images- where all the images are taken from a point of buildings or scenery around the viewer. It's easy to click left or right to rotate left or right and have the view intuitively change. But we've had photostitching software that produces smooth panoramas that look better than this for years, so there's nothing to offer here.

When viewing more complicated synths, the UI really breaks down. I don't understand why when I click and drag the mouse the view rotates to where I'd like, but then it snaps back to where it used to be when I let go of the button. It's very hard to move naturally through 3D space- I think the main problem is that the program is too photo-centric: it always wants to feature a single photograph prominently rather than a more synthetic view. Why can't I pull back to view all the photos, or at least a jumble of outlines of all the photos?

It seems like there is an interesting 3D point cloud of points found to be common to multiple photos underlying the synth, but it can't be viewed on its own (much less downloaded...); there are always photos obscuring it. The photograph prominence is constantly causing nearby photos to become blurry or transparent in visually disruptive ways.

Finally, it seems like the natural end-point of technology like this is to generate 3D textured models of a location, with viewing of the source photos as a feature but not the most prominent mode. Can this be done with photosynth-like technology, or are all the aspects I don't like a way of covering up that it can't actually do that? Maybe it can produce 3D models but they all come out horribly distorted (so then provide a UI to manually undistort them).

Hopefully they will improve on this, or another well-backed site will deliver fully on the promise shown here.