Hi, I just got a dedicated server for our planet osm2pgsql database. It has 32 GB RAM, two Opteron 6212 processors (16 cores in total), a software RAID1 for the system and a software RAID0 for the database. The database disks are 10k Raptor drives, so nothing too slow. I tested a planet import over the weekend and I think the import performance could be better, especially compared with Frederik's SotM 2010 presentation, where the import finishes within a few hours (see the slim planet import with R0, -C8000). The planet.osm.bz2 is on the RAID1; the import command was:
So there was actually plenty of cache available. During the import I observed that the node processing speed never got above 80k/s, which seems rather low for that machine. In top, the osm2pgsql process was always at 100% CPU. To find out whether disk speed was the limit I used iostat, and the values suggested that the drives were never near their limits. So I tend to believe that I am running into a CPU limit while processing the XML.
Unfortunately the import was interrupted: there were too many connections to the database (the limit was set too low by the pgtune tool). So I don't have any final numbers, just the XML processing stats below.
So, finally, my questions: Can someone confirm that Opterons suck at XML processing? Does anyone have experience with Opterons and an osm2pgsql import? What's your node processing speed? The switch2osm site has a quote that shows roughly 2.5 times that speed (200k/s), which is mainly why I am asking here for figures from other planet "importers" ... My stats for the processing:
Many thanks, Frank
asked 02 Apr '12, 09:04 frabron
First of all, forget my performance figures; they are two years old, and that's a very long time in OSM! You can use the .pbf planet instead of the .osm.bz2, which saves the time spent on bz2 decompression and XML parsing. However, I would expect over 80% of processing time to be spent building geometries and indexes, so even if your system is slow at reading the file, that should not affect overall performance too much.
answered 02 Apr '12, 09:27 Frederik Ramm ♦
You might check this wiki page collecting several osm2pgsql performance measurements (and add yours):
answered 02 Apr '12, 12:21 Pieren
Thanks all, I am downloading the planet.pbf and will post the results here, probably on Wednesday, and also in the wiki. Right now I want to test whether my Intel (i4) at home performs better than the Opteron at work at processing nodes.
OK, I downloaded great-britain.osm.bz2, and the node processing speed is definitely faster than on the Opterons. Right now it processes nodes at 93.6k/s and ways at between 10k/s and 11k/s. My test setup is simple: a VirtualBox VM with the latest Ubuntu server release and osm2pgsql built from source. The host is an i4 2400 Intel processor with 8 GB RAM, of which 4 GB is given to the VM. osm2pgsql runs with -C 2048 ...
Today I started a new import with the latest planet.osm.pbf. Node import speed is approx. 140k/s, already much better than the XML import. One thing I notice, though: the postgres process is now at 100%. It is the
action that is now limiting the import speed. The disks are nowhere near their limits and osm2pgsql is at 30% to 60% CPU usage.
FYI: I added my server and results (~50hrs) to the benchmark table ...
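To put the node rates quoted in this thread (80k/s, 140k/s, 200k/s) in perspective, here is a rough back-of-the-envelope sketch in Python. The node count of 1.5 billion is an assumed round figure for illustration only, not a number from this thread, and the function name `node_stage_hours` is made up for this example.

```python
# Back-of-the-envelope: how long does the node stage take at a given rate?
# ASSUMPTION: NODE_COUNT is a round illustrative figure, not a measured value.
NODE_COUNT = 1_500_000_000

# Node processing rates (nodes per second) quoted in this thread.
RATES = {
    "XML import (.osm.bz2)": 80_000,
    "PBF import (.osm.pbf)": 140_000,
    "switch2osm quote": 200_000,
}


def node_stage_hours(node_count: int, nodes_per_second: float) -> float:
    """Return the hours needed to read node_count nodes at a constant rate."""
    if nodes_per_second <= 0:
        raise ValueError("nodes_per_second must be positive")
    return node_count / nodes_per_second / 3600


if __name__ == "__main__":
    for label, rate in RATES.items():
        hours = node_stage_hours(NODE_COUNT, rate)
        print(f"{label}: about {hours:.1f} h for the node stage")
```

Under that assumed node count, the gap between 80k/s and 200k/s works out to roughly three hours, which is small next to a total import time of around 50 hours. That fits the point made above that most of the time goes into building geometries and indexes rather than reading the file.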
Dear frabron, I'm so glad you have solved your problem. I have now run into the same question as you. So your advice is to import the .pbf format data instead of the .bz2? Do you know of any way to speed up the import of the bz2 file? Does the "--number-processes 16" option in your command line make a big difference?
With many thanks!! David