I am new to Nominatim, and I was under the impression that Nominatim uses osm2pgsql to import the data into a PostgreSQL/PostGIS database.

However, I noticed there is a huge difference between the database size of a pure osm2pgsql import and the same map imported by Nominatim.

A map of just 3.2 MB ends up as an 80 MB database with osm2pgsql. The same map ends up at 265 MB when imported with Nominatim using `setup.php --osm-file --osm2pgsql-cache 1024`.

Why is Nominatim creating so much extra data for the same input? It is more than three times as much as pure osm2pgsql. I noticed the database tables are also quite different.

Is there a way to reduce the data size used by Nominatim, or have I configured something wrong?

I want to import the full planet OSM file, and I read that the full planet ends up taking something like 800 GB. But at the ratio I measured, it is going to end up taking 3 TB!
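To show where that number comes from, here is the naive linear extrapolation I am doing (a sketch; it assumes database size scales linearly with input file size, and the 3.2 MB / 265 MB / 37 GB figures are from my test):

```python
# Naive linear extrapolation from my small test import.
# Assumes the database-to-input ratio stays constant at planet scale,
# which ignores fixed overhead, indexes, etc.
test_input_mb = 3.2     # small country extract (input file)
test_db_mb = 265        # resulting Nominatim database size
planet_input_gb = 37    # full planet file

ratio = test_db_mb / test_input_mb        # roughly 83x blow-up
planet_db_gb = planet_input_gb * ratio    # extrapolated database size
print(f"ratio: {ratio:.0f}x, planet estimate: {planet_db_gb / 1024:.1f} TB")
```

Running this prints `ratio: 83x, planet estimate: 3.0 TB`, which is where my 3 TB worry comes from.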

It would be good if someone could explain the difference and the relationship between the two, because it is not clear to me what Nominatim is doing with osm2pgsql if the data structures are different.

Thanks.

asked 05 Mar '17, 13:19

jbx1


Nominatim uses a different osm2pgsql output when importing (it's actually C++ code compiled into osm2pgsql, so it's more like a plugin: https://github.com/openstreetmap/osm2pgsql/blob/master/output-gazetteer.cpp). It needs map features and tags that are irrelevant for rendering, e.g. postal-code boundaries or the operator tag.

The first step to reduce the Nominatim data size is to import less data: is every country in the world needed? Next would be skipping the US TIGER data import. A third option is not updating the data (hourly/daily/weekly); in that case you can delete a lot of tables that exist only to support updates.
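As a sketch of the first suggestion, you can feed Nominatim a single-country extract instead of the planet file. This assumes the classic `setup.php` workflow from this era; the country, URL, and cache size are illustrative, so adjust them to your setup:

```shell
# Download a single-country extract (file name and URL are illustrative;
# Geofabrik offers per-country extracts)
wget https://download.geofabrik.de/europe/malta-latest.osm.pbf

# Import it with Nominatim's setup script (Nominatim 2.x/3.x style);
# tune --osm2pgsql-cache to the RAM you can spare, in MB
./utils/setup.php --osm-file malta-latest.osm.pbf --all --osm2pgsql-cache 1024
```

A small country like this keeps the database in the hundreds of megabytes rather than hundreds of gigabytes.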

The 800 GB requirement (https://wiki.openstreetmap.org/wiki/Nominatim/Installation#Hardware) might be outdated. The last full-planet import I ran came in just short of 500 GB. OpenStreetMap data keeps growing every month, and Postgres fragments the database with every small update (let's say it leaves empty spots on the hard drive when deleting rows), so I wouldn't recommend a standard 480 GB drive just yet. But it's certainly not the 3 TB you calculated based on a 3.2 MB input file.

> it is not clear to me what is nominatim doing with osm2pgsql if the data structures are different.

Nominatim uses osm2pgsql to import the data structures it needs, based on its own configuration and use case. When you set up a server for both map-tile rendering and search (Nominatim), that data needs to go into separate databases. (Related question: https://help.openstreetmap.org/questions/52978/possible-to-use-one-db-for-nominatim-and-tile-server) Yes, it seems like a huge waste to have most of the data twice, but the systems aren't connected and there are no plans to have them use the same data structures.
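The plugin relationship shows up in osm2pgsql's output option: Nominatim's setup script runs osm2pgsql with the gazetteer output instead of the default pgsql output used for rendering. A sketch of the two invocations (database names and file name are illustrative):

```shell
# Rendering-style import: the default "pgsql" output,
# which creates the planet_osm_* tables a tile server renders from
osm2pgsql --slim -d gis extract.osm.pbf

# Nominatim-style import: the "gazetteer" output backend
# (the output-gazetteer.cpp linked above), which fills Nominatim's
# place table instead
osm2pgsql --slim --output=gazetteer -d nominatim extract.osm.pbf
```

Same tool, two different output backends, hence the completely different tables you saw in the two databases.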

answered 05 Mar '17, 20:52

mtmail

Thanks for your reply. Just to clarify, my use case is geocoding and reverse geocoding, not a tile server. I am not importing any US data; the test I did was with a small European country. My worry was that if a 3.2 MB map ended up taking 265 MB, then by simple proportion, without TIGER or anything, the 37 GB full-planet file would end up at some 3 TB. 500 GB (even 800 GB) is quite manageable.

(05 Mar '17, 23:00) jbx1