I just downloaded the OSM globe, and am testing out running an extract using the demonstration code and osmosis:

 bzcat downloaded.osm.bz2 | osmosis\
  --read-xml enableDateParsing=no file=-\
  --bounding-box top=49.5138 left=10.9351 bottom=49.3866 right=11.201 --write-xml file=-\
  | bzip2 > extracted.osm.bz2

Roughly how long should this extract take? I'm using a computer with four 2.6 GHz cores and 4 GB of RAM, running Ubuntu 10.10. I'm curious whether this should take 4 hours or 4 days...

This is the output from my terminal:

user@computer:~/Downloads$  bzcat downloaded.osm.bz2 | osmosis\
>   --read-xml enableDateParsing=no file=-\
>   --bounding-box top=49.5138 left=10.9351 bottom=49.3866 right=11.201 --write-xml file=-\
>   | bzip2 > extracted.osm.bz2
May 4, 2011 11:36:20 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.34
log4j:WARN No appenders could be found for logger (
log4j:WARN Please initialize the log4j system properly.
May 4, 2011 11:36:21 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
May 4, 2011 11:36:21 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
May 4, 2011 11:36:21 AM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.

It is now 13.37, and nothing else has appeared in the terminal. I'm not in a rush - I just want to check that this is a reasonable amount of time to wait, or whether there is something I have done incorrectly. I'm completely new to running queries on data-sets of this size, and osmosis in general. Thanks!

I can't tell you for sure, but I would expect it to take hours rather than days.

However, there are two potential ways of speeding things up.

1) Use the binary format .osm.pbf instead of the xml encoded version of the planet file. It is much more efficient to parse.

2) If you only need a city, you probably don't want to start off with the whole planet. There are country sized extracts of the planet available that are much more manageable.

@apmon Can I just download the .osm.pbf format and run the query in the same way if I change the file name? I did look for country-level extracts to use after the test city runs. However, I want data for Singapore, and did not see any regional extracts.

(04 May '11, 18:47) celenius

Otherwise, yes, you should be able to run the command the same way, although you will presumably have to replace --read-xml with something like --read-pbf.

(05 May '11, 01:00) apmon

Thanks for the links. I previously looked at CloudMade and Geofabrik, and they are missing a lot of information. I'll try --read-pbf.

(05 May '11, 15:45) celenius

You can also use bboxSplit, which is much faster than osmosis. To compile it:

g++ -O2 bboxSplit.cpp -o bboxSplit

To run it:

bzcat planet-latest.osm.bz2 | ./bboxSplit \
  -85.05113   73.12500    9.44906  180.00000 gzip someRegion.osm.gz

The 4 values are minlat, minlon, maxlat and maxlon. If a node is in the bbox, it is included in someRegion. If a way refers to 1 or more nodes in the bbox, it is included in someRegion. If a relation refers to 1 or more nodes or ways that have already been included in someRegion, then that relation is also included in someRegion. I think osmosis follows the same rules.
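To make those inclusion rules concrete, here is a minimal Python sketch of them. It is not bboxSplit's actual code: the `BBox` class, the `extract` function and the tiny in-memory data layout are all hypothetical names made up for illustration, and treating the bbox edges as inclusive is an assumption. It only decides which ids are kept; it does not read or write OSM files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    # Same order as the 4 values on the command line above.
    minlat: float
    minlon: float
    maxlat: float
    maxlon: float

    def contains(self, lat, lon):
        # Assumption: points exactly on the edge count as inside.
        return (self.minlat <= lat <= self.maxlat
                and self.minlon <= lon <= self.maxlon)


def extract(bbox, nodes, ways, relations):
    """Return the ids kept for one region.

    nodes:     {node_id: (lat, lon)}
    ways:      {way_id: [node_id, ...]}
    relations: {relation_id: [("node" or "way", member_id), ...]}
    """
    # Rule 1: a node is kept if it lies in the bbox.
    kept_nodes = {nid for nid, (lat, lon) in nodes.items()
                  if bbox.contains(lat, lon)}
    # Rule 2: a way is kept if it refers to at least one kept node.
    kept_ways = {wid for wid, refs in ways.items()
                 if any(ref in kept_nodes for ref in refs)}
    # Rule 3: a relation is kept if it refers to a kept node or way.
    kept_relations = set()
    for rid, members in relations.items():
        for kind, ref in members:
            if ((kind == "node" and ref in kept_nodes)
                    or (kind == "way" and ref in kept_ways)):
                kept_relations.add(rid)
                break
    return kept_nodes, kept_ways, kept_relations


# Made-up sample data: node 1 is near Singapore, node 2 near Nuremberg.
bbox = BBox(-85.05113, 73.12500, 9.44906, 180.00000)
nodes = {1: (1.35, 103.82), 2: (49.45, 11.08)}
ways = {10: [1, 2], 11: [2]}
relations = {100: [("way", 10)], 101: [("way", 11), ("node", 2)]}

kept_nodes, kept_ways, kept_relations = extract(bbox, nodes, ways, relations)
print(kept_nodes, kept_ways, kept_relations)
```

With this sample data, way 10 is kept because one of its two nodes is inside the bbox, even though its other node is not, and relation 100 is kept because it refers to way 10.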

If you don't want to compress the output, change gzip to cat.

You can have as many regions as you like (I've tested up to 200).

