The simplest version of the command I'm using is (pretty-printed here for easier reading; it's usually one line):
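The command itself did not survive the archive. A typical invocation of the shape being discussed in this thread might look like the following; the input file, polygon files, and output names are all my assumptions, not the poster's actual values:

```shell
# Hypothetical sketch of a multi-extract osmosis pipeline:
# read one PBF, duplicate the stream with --tee, and clip each
# copy against a different .poly boundary. All file names assumed.
osmosis \
  --read-pbf file=germany-latest.osm.pbf \
  --tee 2 \
  --bounding-polygon file=region1.poly --write-pbf file=region1.osm.pbf \
  --bounding-polygon file=region2.poly --write-pbf file=region2.osm.pbf
```

With more polygons, the `--tee` count and the number of `--bounding-polygon ... --write-pbf` pairs grow together.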
The command gets more complex when I add
I've tried adding
In all of my attempts with 2 or more cores, the process tops out at nearly 200% CPU use (as viewed in a process monitor). I verified that I can get a process to use all cores (4 in this case) by running
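The verification command didn't survive the archive either. A generic way to confirm that a machine can saturate every core (not necessarily what the poster ran) is to spawn one busy loop per core and watch CPU usage in top/htop:

```shell
# Sanity check: start one busy loop per CPU core, hold them for a few
# seconds so per-core usage can be observed, then clean up.
# Generic test only -- not the poster's actual command.
cores=$(nproc)
for i in $(seq "$cores"); do
  yes > /dev/null &
done
sleep 3            # during this window, each core should sit near 100%
kill $(jobs -p)    # stop the busy loops
wait 2>/dev/null
echo "saturated $cores cores"
```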
asked 16 Jan '14, 19:12 JamesChevalier
I would suggest placing one "buffer" task before and after each --bounding-polygon directive. I haven't tried more than about 50 "tee" threads, but I guess more would still be possible. Keep in mind, though, that the number of "point in polygon" checks osmosis has to make is the size of your input file multiplied by the number of --bounding-polygon threads you're using: each object will be checked against all (thousands of) polygons. It is therefore more efficient to first divide your input file into a couple of smaller regions and then extract your files from those. Here's an older blog entry that describes how we used to run the Geofabrik extracts. Nowadays we use the history splitter, which offers better performance when doing a large number of polygon splits at once.

answered 16 Jan '14, 19:26 Frederik Ramm ♦

Thanks! (I actually came across your blog post the other day, and thought about contacting you directly.)
I've mostly used a country-level OSM file (like germany-latest.osm.pbf), but I did run some tests with a state-level file (like brandenburg-latest.osm.pbf). Those definitely finished quicker, but still didn't make full use of the CPU.
I'll try again with your suggestion of adding "buffer" tasks.
(16 Jan '14, 19:37) JamesChevalier
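For reference, Frederik's suggestion amounts to inserting a --buffer task on each side of every --bounding-polygon, which decouples the pipeline stages so each polygon check can run on its own thread. A hedged sketch, with all file names assumed:

```shell
# Hypothetical sketch of the buffered pipeline: each --bounding-polygon
# is wrapped in --buffer tasks so the stages run concurrently.
# File names are assumptions, not the thread's actual data.
osmosis \
  --read-pbf file=germany-latest.osm.pbf \
  --tee 2 \
  --buffer --bounding-polygon file=region1.poly --buffer --write-pbf file=region1.osm.pbf \
  --buffer --bounding-polygon file=region2.poly --buffer --write-pbf file=region2.osm.pbf
```

The buffers trade memory for parallelism: each one holds a queue of entities between stages, letting the upstream reader keep feeding data while the polygon tests run.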