This is a static archive of our old OpenStreetMap Help Site. Please post any new questions and answers at community.osm.org.

Can I set up an Overpass server that holds only certain node types to limit database size?

Hi,

Currently I am operating my own Overpass API server based on full OSM data replication.

This is quite resource-intensive: about 450 GB of local storage is required, because it holds ALL OSM data and can be used to query all of it.

But my use case only requires a small subset of the data to be available.

Is there someone here who could help me find out how to set up the Overpass server so that only the required objects get stored and updated in the Overpass database, giving it a smaller footprint?

Specifically this is the query that needs to be answered:

[out:json][timeout:25];
(
  node[amenity=toilets];
  way[amenity=toilets];
  relation[amenity=toilets];
);
out body;
>;
out skel qt;

overpassapi overpass

asked 16 Aug '20, 09:00

bietiekay

One Answer:

Most systems that use Overpass in a "production" environment turn out to be not very well designed (or, to put it more favourably, turn out to be a proof of concept that has been promoted to production mode). Overpass is an all-purpose database system, striking a balance between being able to query very specific objects and also return data for large areas. Your use case would be much better served by an osm2pgsql import into a PostgreSQL database, using a style file that just imports toilets.

However - no matter which database system you use - you need to remember one thing regarding updates: If a way gets created then the nodes it uses will not be in the update. This means if someone constructs a way from pre-existing nodes and tags it amenity=toilets then you need to already have these nodes, or you don't know where the toilet is. Same for relations; if someone should construct a relation from two already-existing ways and tag it amenity=toilets, and you haven't kept the ways because they looked un-interesting, then you're screwed.
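
To make the problem concrete, here is a toy sketch in Python. The data structures, IDs, and coordinates are invented purely for illustration and do not correspond to any real OSM tool or format; the point is only that a way stores node references, not coordinates.

```python
# Toy model of an extract that kept only objects tagged amenity=toilets.
# All IDs and coordinates below are made up for illustration.
kept_nodes = {101: (52.5200, 13.4050)}  # node id -> (lat, lon)

# The full planet also contains untagged nodes that the filter dropped.
full_planet_nodes = {
    101: (52.5200, 13.4050),
    201: (48.1371, 11.5754),
    202: (48.1372, 11.5756),
    203: (48.1370, 11.5757),
}

# An update contains the newly created way, but not the unchanged,
# pre-existing nodes that the way refers to.
new_way = {
    "id": 900,
    "tags": {"amenity": "toilets"},
    "node_refs": [201, 202, 203, 201],  # closed ring, references only
}


def resolve(way, node_store):
    """Return the coordinates of the way's nodes, or None if any is missing."""
    try:
        return [node_store[ref] for ref in way["node_refs"]]
    except KeyError:
        return None


print(resolve(new_way, kept_nodes))         # filtered store: geometry unknown
print(resolve(new_way, full_planet_nodes))  # full store: geometry available
```

With the filtered store the new toilet way cannot be located at all, which is why the update has to be applied to full data first and the filter afterwards.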

If you can live with an update frequency of once a day or less, then the least resource-intensive process might be this:

1. Keep a full planet file lying around.
2. Update it at the end of the day, or week, with either osmosis or pyosmium-up-to-date (the latter is faster).
3. Filter the planet file to keep just amenity=toilets nodes/ways/relations, using osmosis or osmium (the latter is a lot faster).
4. Import the resulting toilet dataset into Overpass or PostgreSQL.
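
The update and filter steps above could be scripted roughly as follows. This is a minimal sketch, not a tested production setup: the file names planet-latest.osm.pbf and toilets.osm.pbf and the helper names build_commands and run_pipeline are assumptions, and it presumes that pyosmium-up-to-date and the osmium command line tool are installed. The import step is left out because it depends on the target database.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List


def build_commands(planet: Path, extract: Path) -> List[List[str]]:
    """Build the command lines for updating and then filtering the planet."""
    # Without an output option, pyosmium-up-to-date updates the file in place.
    update = ["pyosmium-up-to-date", str(planet)]
    # nwr = nodes, ways, relations. By default osmium tags-filter also keeps
    # the nodes and members referenced by matching ways and relations.
    tags_filter = [
        "osmium", "tags-filter",
        str(planet),
        "nwr/amenity=toilets",
        "-o", str(extract),
        "--overwrite",
    ]
    return [update, tags_filter]


def run_pipeline(planet: Path, extract: Path) -> None:
    """Run both steps in order; a non-zero exit status raises an exception."""
    for cmd in build_commands(planet, extract):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: only print the commands (file names are placeholders).
    for cmd in build_commands(Path("planet-latest.osm.pbf"), Path("toilets.osm.pbf")):
        print(shlex.join(cmd))
```

Calling run_pipeline from a daily or weekly cron job would match the update frequency described above. A planet file that is far behind may need more than one update run before it is current, so the strict error handling here may have to be relaxed for the update step.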

answered 16 Aug '20, 10:04

Frederik Ramm ♦

AH! I see. My knowledge of the underlying data structures wasn't good enough to have seen this connection between ways and nodes.

Thanks a lot for pointing it out. I will give both your proposals a try: the PostgreSQL and the Overpass solution.

Right now the big full sync works, but as the machine only has mechanical hard disks at the required size, I fear for their lives with the minutely updates running constantly.

Thanks a lot!

(16 Aug '20, 10:34) bietiekay