I am using the Overpass API (on overpass-api.de) to query for transport-related elements along a recorded GPS track/journey. Currently, I perform several "around" queries for each track point of the journey to get the IDs of, e.g., public transport routes and stops, roads, and railways within 50 m. This takes a few seconds per track point, and since each journey consists of about 100 track points, collecting all the data takes minutes instead of the desired <10 s. I have the following idea for optimizing this:
Since I am not experienced with OSM and this requires considerable effort:
Thanks in advance! Arik
asked 11 Dec '15, 13:35 Arik
A smaller database for Overpass API would indeed help to speed things up a bit: Here's a small comparison I ran for your sample query http://overpass-turbo.eu/s/ddY
Now comparing that with a smaller Berlin extract provided by Geofabrik and various database compression settings (lz4, gzip or no compression):
Database was populated using:
or with one of the following compression settings:
Query was run using:
answered 12 Dec '15, 10:14 mmd 1
Thanks for the effort! It's great news that reducing the database to a certain region helps. In my case, I want to reduce the database by including only the elements in my query (transport-related elements). I guess this is possible with Osmosis and will speed things up as well. I'll test it with the Berlin extract first, thanks for the idea :)
(13 Dec '15, 11:20) Arik
Is there any reason you particularly want to use Overpass locally? Simple PostGIS queries should be able to do this if you load OSM data into a spatial database using the standard tools.
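To make the PostGIS suggestion concrete, here is a minimal sketch of what such a query might look like. It assumes an osm2pgsql-style import: a table named planet_osm_line with a highway column and a geometry column way stored in EPSG:3857. None of that is stated in this thread, so adjust the names and the SRID to your own setup. The helper nearby_road_query is hypothetical and only builds the SQL plus its parameters.

```python
import math


def nearby_road_query(lat, lon, meters=50.0):
    """Build a parameterised SQL query asking whether any road is nearby.

    Assumption: osm2pgsql-style table planet_osm_line with geometry
    column `way` in EPSG:3857. Adjust to match your actual import.
    """
    # Web Mercator stretches distances by roughly 1/cos(latitude),
    # so the search radius has to be scaled to cover `meters` on the ground.
    radius = meters / math.cos(math.radians(lat))
    sql = (
        "SELECT EXISTS ("
        " SELECT 1 FROM planet_osm_line"
        " WHERE highway IS NOT NULL"
        " AND ST_DWithin(way,"
        " ST_Transform(ST_SetSRID(ST_MakePoint(%(lon)s, %(lat)s), 4326), 3857),"
        " %(radius)s)"
        ")"
    )
    return sql, {"lat": lat, "lon": lon, "radius": radius}
```

The returned pair can be handed to a database driver that accepts named parameters in this style, for example cursor.execute(sql, params) with psycopg2. A spatial index on the geometry column is what keeps ST_DWithin fast.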
Thanks for your comment! I was actually not aware of the fact that there is a different way to access the data. So, let's forget about Overpass in my question... I'll read about PostGIS to query a local database. Now, does the assumption with the prefiltering make sense?
@Arik Would it be possible to describe the sort of queries that you might want to do? Not in query language - just at the level of "I want to find houses near a road called X" or similar.
Sure, for example: I want to know whether there is at least one road within 50m around given coordinates. Then replace "road" with several other elements like "bus stop", "train station", "subway route" and query again. For routes, the ID of the relation would be useful as well.
P.S.: more generally, my goal is to find out what means of transport could have been used while recording the GPS track
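The check described above ("is there at least one X within 50m of these coordinates?") can be sketched independently of any database. The snippet below is only an illustration: it assumes the candidate features are already loaded as dictionaries with hypothetical keys kind, lat and lon, and it measures the distance to point features only. Roads and routes are lines, so a real implementation would need point-to-line distances.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def kinds_near(point, features, radius_m=50.0):
    """Return the kinds that have at least one feature within radius_m of point."""
    lat, lon = point
    return {
        f["kind"]
        for f in features
        if haversine_m(lat, lon, f["lat"], f["lon"]) <= radius_m
    }
```

Running kinds_near once per track point gives, for each point, the set of element types that could be relevant, which is the input needed for guessing the means of transport.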
Can you please post your original Overpass Query as overpass turbo link, so we can check if the issue persists with the latest beta version. Thanks.
I don't think it is an issue with Overpass; querying a lot of elements just takes time. Instead, I will probably use PostGIS as suggested by Richard. The only remaining question is whether it makes sense to first create a reduced dataset that includes only the transport-related elements. I would strongly guess that the answer is yes, and I would be happy for a confirmation from an experienced user before I waste my time setting everything up.
You mentioned a response time of several seconds. In the current beta, I would expect sub-second response time. I'm just asking to confirm that it's no longer an issue.
How large the savings from a reduced data set really are is hard to tell without running some tests. The biggest issue is setting up a full planet db for comparison, which may easily take a few days. In any case, I recommend starting out with a small extract (from download.geofabrik.de) and getting familiar with the toolchain.
Okay, sure, below you will find the link to an example. I use the regex because I found it to be significantly faster than using the union of the queries for all the key-value pairs. And you're right, it takes about one second. I recently also tried to get all the nodes belonging to the elements (see the part that is commented out), which takes longer.
http://overpass-turbo.eu/s/ddY
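For readers who do not follow the link, the difference between the regex form and the union form can be sketched as below. This is not the linked query: the key, the values, and the helper names regex_query and union_query are made up for illustration, and only ways are queried to keep it short.

```python
def regex_query(key, values, lat, lon, radius_m=50):
    """One statement matching any of the values via a regular expression."""
    # Values are inserted as-is; regex metacharacters in them are not escaped.
    pattern = "^(" + "|".join(values) + ")$"
    return (
        "[out:json][timeout:25];"
        f'way["{key}"~"{pattern}"](around:{radius_m},{lat},{lon});'
        "out ids;"
    )


def union_query(key, values, lat, lon, radius_m=50):
    """One statement per key-value pair, combined in a union block."""
    parts = "".join(
        f'way["{key}"="{value}"](around:{radius_m},{lat},{lon});'
        for value in values
    )
    return f"[out:json][timeout:25];({parts});out ids;"
```

Both strings are Overpass QL and should select the same ways; the regex form simply issues one statement instead of one per value.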
ok, thanks! On the dev instance I get 880ms response time for the official 0.7.52 version and about 550ms on the beta (running with osm3s_query, data in buffer cache). Response times on overpass-api.de are higher due to overall higher CPU utilization there. overpass-api.de only accepts one query at a time, but rambler.ru instance doesn't have that limit. If you set up your own instance, you can of course run as many queries in parallel as you wish (or your hardware can handle).
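As a rough sketch of the parallel approach on your own instance, the per-track-point queries could be fanned out with a thread pool. The function query_fn is a placeholder for whatever actually runs one query (an HTTP request, a call to a local binary, a database lookup); it is an assumption for this example and not part of any Overpass tooling.

```python
from concurrent.futures import ThreadPoolExecutor


def query_all(track_points, query_fn, max_workers=8):
    """Run query_fn for every track point in parallel.

    Results are returned in the same order as track_points.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if calls finish out of order.
        return list(pool.map(query_fn, track_points))
```

With about 100 track points and roughly one second per query, the total time then depends mostly on how many queries the server can handle at once, so max_workers should be tuned to the hardware.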