I have successfully imported a planet file (Mar 04 2016) into postgres (9.4) using osm2pgsql (0.91.0-dev).

I have the following system:

MOBO = SuperMicro DAL-i-O
CPU = dual Xeon E5-2620V3 (24 cores total)
RAM = 8x16G DDR4 ECC RDIMM (128 gig total)
OS = Ubuntu 15.04 (I'm about to upgrade to 16.04 so don't hate)

The OS (as well as postgres database) lives on a 500GB (Samsung EVO) SSD (This drive has a 16GB swap partition.) The pbf planet file lives on a separate HDD. I also have a 1TB (Samsumg PRO) SSD where main_data, main_idx, slim_data, and slim_idx tablespaces are stored. Each tablespace is on its own partition which comes in handy when profiling with df and dstat (You really get a good picture of what is happening during the import.)

My postgresql.conf settings are all set to defaults except for:

max_connections = 200
shared_buffers = 32GB
work_mem = 100MB
maintenance_work_mem = 48GB
effective_io_concurrency = 2
fsync = off
synchronous_commit = off
full_page_writes = off
wal_buffers = 1MB
checkpoint_segments = 1024
checkpoint_timeout = 1h
checkpoint_completion_target = 0.9
random_page_cost = 2.5
effective_cache_size = 60GB
autovacuum = off

Note that the above settings are for IMPORT ONLY.

Here is a summary of the results:

osm2pgsql -c -d gis -U maddoxw --number-processes 16 --slim -C 30000 --flat-nodes /var/lib/mod_tile/planet.cache --tablespace-main-data main_data --tablespace-main-index main_idx --tablespace-slim-data slim_data --tablespace-slim-index slim_idx /media/Borg_LS/terrain/osm/pbf/planet-latest.osm.pbf
osm2pgsql version 0.91.0-dev (64 bit id space)

Using built-in tag processing pipeline
Using projection SRS 3857 (Spherical Mercator)
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=30000MB, maxblocks=480000*65536, allocation method=11
Mid: loading persistent node cache from /var/lib/mod_tile/planet.cache
Allocated space for persistent node cache file
Maximum node in persistent node cache: 0
Mid: pgsql, scale=100 cache=30000
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels

Reading in file: /media/Borg_LS/terrain/osm/pbf/planet-latest.osm.pbf
Using PBF parser.
Processing: Node(3333342k 2100.4k/s) Way(344218k 62.62k/s) Relation(4175840 263.11/s)  parse time: 22955s
Node stats: total(3333342858), max(4145675901) in 1587s
Way stats: total(344218973), max(413214616) in 5497s
Relation stats: total(4175849), max(6168030) in 15871s
Maximum node in persistent node cache: 4146069503

Going over pending ways...
    231812682 ways are pending

Using 16 helper-processes
Finished processing 231812682 ways in 5223 s

231812682 Pending ways took 5223s at a rate of 44383.05/s

Going over pending relations...
    0 relations are pending

Using 16 helper-processes
Finished processing 0 relations in 0 s

All indexes on planet_osm_roads created in 592s
All indexes on planet_osm_point created in 3441s
All indexes on planet_osm_line created in 6995s
All indexes on planet_osm_polygon created in 11933s
Stopped table: planet_osm_nodes in 0s
Stopped table: planet_osm_ways in 16613s
Stopped table: planet_osm_rels in 119s
Maximum node in persistent node cache: 4146069503
node cache: stored: 3303017669(99.09%), storage efficiency: 84.00% (dense blocks: 449090, sparse nodes: 126610290), hit rate: 99.10%

Osm2pgsql took 56882s overall

I monitored the whole install process with dstat and a homebrew script to record the results of df (and du) calls. I noticed that about 5 minutes into the "going over pending ways" section, my swap usage started climbing steadily. By the time I hit the "Building index on table: planet_osm_ways" section, the swap usage was 12GB and then 2 minutes into building the planet_osm_ways indexes the swap was completely full and stayed that way for the remainder of the import.

My postgresql memory settings (shared buffs, work mem, maintenance work mem, effective cache size) as well as the osm2pgsql cache setting were all calculated based on the wikis and the numerous forums and blog posts I've found. However, most of the suggestions I've seen are 2+ years old and none of them give good settings for systems with large SSDs and large amounts of RAM (I think 32GB is the largest system I've seen someone benchmark.) I've been experimenting with OSM imports for about 2 months now and have tried various combinations of the above settings, but have only recently (today) noticed that my swap is being used heavily.

Does anyone have any advice on how to best calculate the values of each of the memory settings such that osm2pgsql does not eat into the swap? For you experienced guys, do any of my settings look too high (or too low)?


asked 06 May '16, 08:18

placebo10's gravatar image

accept rate: 0%

edited 06 May '16, 08:20

I am also wondering the best tuning settings for resource heavy databases. I have a lot of ram (50G) and IOPS available, and I wonder what the best settings area.

(08 Aug, 07:16) sprutner
Be the first one to answer this question!
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:


question asked: 06 May '16, 08:18

question was seen: 1,905 times

last updated: 08 Aug, 07:20

powered by OSQA