Hi, I want to know if there is a script to standardise OSM data. Some examples:
This would result in an OSM file that isn't suited for editing anymore, but can more easily be used by other tools. This should be possible, since it's only a fraction of the work that Nominatim does, but I wonder if there is also a separate tool or script to do it.
asked 14 Dec '10, 13:50 Sanderd17 |
In general, there is no tool currently available to do what you want. What you are looking for is a script that, for everything in the world that can be represented in OpenStreetMap in two or more different ways, picks one of the representations and converts all the others to it.
If you are just interested in sorting out tagging, then I would recommend the TagTransform plugin for Osmosis. I use this plugin daily in my rendering toolchain, for example to take the multiple different ways of tagging a zebra crossing and consolidate them into one tag. This makes the stylesheets easier to deal with.
More advanced geometry manipulation is done in a variety of tools, most noticeably during import by osm2pgsql / Nominatim, but these are special-purpose transformations into a specific data format rather than general-purpose routines that output OpenStreetMap data. There is nothing stopping such a utility from being created; it simply hasn't been done yet. The ultimate tool would let you pick between the representations depending on your needs - for example some people want
answered 16 Dec '10, 13:42 Andy Allan
The TagTransform plugin comes close. The only disadvantage is that it can't convert between relations, ways and nodes; it can only edit the tags. As for osm2pgsql, I've read that it loses the information about whether two ways are connected, and I would not want to lose that data. I agree with your vision of what the ultimate tool would be.
(16 Dec '10, 13:58)
Sanderd17
|
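The tag consolidation described in the answer above can be sketched in a few lines of Python. This is not the TagTransform plugin or its configuration format, only an illustration of the idea; the specific tag variants and the chosen canonical form are assumptions made up for the example.

```python
# Minimal sketch of tag consolidation: several ways of tagging the same
# feature are rewritten to one canonical form.
# ASSUMPTION: the variants listed here and the canonical tags are
# illustrative only; check real tagging practice before relying on them.

CANONICAL_ZEBRA = {"highway": "crossing", "crossing": "zebra"}


def is_zebra_variant(tags):
    """Return True if the tags look like one of the assumed zebra variants."""
    return (
        tags.get("crossing") == "zebra"
        or tags.get("crossing_ref") == "zebra"
    )


def normalise_tags(tags):
    """Return a new tag dict with zebra crossings in the canonical form."""
    if not is_zebra_variant(tags):
        return dict(tags)
    # Drop the alternative key, keep everything else, then set canonical tags.
    result = {k: v for k, v in tags.items() if k != "crossing_ref"}
    result.update(CANONICAL_ZEBRA)
    return result


if __name__ == "__main__":
    print(normalise_tags({"highway": "crossing", "crossing_ref": "zebra"}))
```

A stylesheet or other consumer then only has to test for the one canonical combination. Objects that match no rule pass through unchanged.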
You could probably do this sort of thing with PostGIS. You can use Osmosis or osm2pgsql to read from an .osm file and output to a PostgreSQL database. Then you can use PostGIS queries to find whether an object is within a boundary polygon, or near a place, and add tags as appropriate. You can also collapse areas into points (e.g. make a POI out of a building). Such queries could be scripted and automated if you want, though I don't know of any ready-made scripts that do the processing/standardisation you describe. As you say, Nominatim does some of this processing, and its source code is available, which might be helpful.
answered 16 Dec '10, 04:32 Vclaw Frederik Ramm ♦ |
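To make the "is this object within a boundary polygon, and if so add a tag" step concrete without a database, here is a rough pure-Python sketch. It is a stand-in for what a PostGIS containment query would do, not PostGIS itself. The node and ring structures, the function names and the `addr:city` value are assumptions for illustration, and longitude/latitude are treated as plain planar coordinates, which is only reasonable for small areas.

```python
# Sketch: tag every node that lies inside a boundary ring.
# ASSUMPTIONS: nodes are dicts with "lon", "lat" and "tags"; a ring is a
# list of (lon, lat) pairs; coordinates are treated as planar.


def point_in_polygon(point, ring):
    """Ray-casting test: True if point lies inside the ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tag_nodes_inside(nodes, ring, key, value):
    """Return copies of the nodes; those inside the ring get key=value
    unless they already carry that key."""
    tagged = []
    for node in nodes:
        tags = dict(node["tags"])
        if point_in_polygon((node["lon"], node["lat"]), ring):
            tags.setdefault(key, value)
        tagged.append({**node, "tags": tags})
    return tagged


if __name__ == "__main__":
    boundary = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
    nodes = [
        {"id": 1, "lon": 1, "lat": 1, "tags": {"amenity": "cafe"}},
        {"id": 2, "lon": 3, "lat": 1, "tags": {}},
    ]
    for n in tag_nodes_inside(nodes, boundary, "addr:city", "Exampletown"):
        print(n)
```

Existing values are left alone on purpose (`setdefault`), so explicitly mapped data always wins over the guess.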
I'm afraid I think the reason no one's answered this question is that, like me, they don't understand it. I don't know what "a script where all data is kept" means. I thought you wanted address data, but now I'm just too confused, and I'm afraid others here are as well. I'm going to take a stab at answering what I think your question is.
The first part seems to be "How can I programmatically access OSM data?" If so, the answer has several parts. OSM is a database of geographic information, and we provide several mechanisms for storing that data. The most popular consumable format of OSM data is the OSM XML file format. Usually when you see a .osm file, that file is in the OSM XML format, which is described in detail at http://wiki.openstreetmap.org/wiki/.osm
The second part seems to be about programmatically modifying OSM features to (using your terminology) "standardise them". The short answer is that we don't provide a mechanism to do this, and you shouldn't consider modifying OSM data in an automated way without having a deep understanding of the data you're modifying (and often not even then).
Let's take your example of modifying addresses, using is_in to add street or town data. Speaking just about the US, I know this wouldn't work. There are places where the city/town you might think corresponds to an address doesn't. There's a long and complex story behind this, having to do with the names historically used for towns being the name of the post office and not the city, but the bottom line is that you don't know. Similarly, you might assume that a building in proximity to a street is addressed to that street, but you can't know that. And if you modify the data based on an assumption like that, you make it more difficult to find and correct the error in the future. Of course, if you know the information for an area, you should modify it yourself, but this is not the same as carrying out an automated edit (i.e. running a bot).
Bots are generally frowned upon because of their negative history with the project. More information on bots can be found here: http://wiki.openstreetmap.org/wiki/Bot
answered 16 Dec '10, 01:59 emacsen
The file shouldn't be uploaded back to OSM. It's just that, if less info is available, the script guesses what it is and places it in the file. That way, a tool which uses the file doesn't have to guess again. But more important: when the data is available, it is sometimes a node, sometimes a way and sometimes a relation. I would like to simplify it to nodes if possible. I don't want to re-upload the file; I know this would do bad things to the OSM database, since the database is made for editing, not for usage by tools.
(16 Dec '10, 09:49)
Sanderd17
|
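For the "programmatic access" part of the answer above, reading an OSM XML file needs nothing beyond a standard XML parser. The following is a minimal sketch using Python's standard library; the embedded sample document and the dict layout returned by `parse_osm` are assumptions for illustration, and relations are ignored to keep it short.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written document in the OSM XML layout (node / way / nd / tag).
# ASSUMPTION: ids, coordinates and tags below are made up for the example.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="51.0" lon="4.0"/>
  <node id="2" lat="51.0" lon="4.001"/>
  <node id="3" lat="51.001" lon="4.001">
    <tag k="amenity" v="cafe"/>
  </node>
  <way id="10">
    <nd ref="1"/><nd ref="2"/><nd ref="3"/><nd ref="1"/>
    <tag k="building" v="yes"/>
  </way>
</osm>"""


def read_tags(element):
    """Collect the <tag k=... v=...> children of an element into a dict."""
    return {t.get("k"): t.get("v") for t in element.findall("tag")}


def parse_osm(xml_text):
    """Return (nodes, ways) dicts keyed by the id string from the file."""
    root = ET.fromstring(xml_text)
    nodes = {}
    for el in root.findall("node"):
        nodes[el.get("id")] = {
            "lat": float(el.get("lat")),
            "lon": float(el.get("lon")),
            "tags": read_tags(el),
        }
    ways = {}
    for el in root.findall("way"):
        ways[el.get("id")] = {
            "refs": [nd.get("ref") for nd in el.findall("nd")],
            "tags": read_tags(el),
        }
    return nodes, ways


if __name__ == "__main__":
    nodes, ways = parse_osm(SAMPLE)
    print(len(nodes), "nodes,", len(ways), "ways")
```

For a real extract you would probably want a streaming parser rather than loading the whole file into memory, but the element structure is the same.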
I'm not entirely sure what you're asking in this question. When you say "OSM data", what specific kind of data do you mean? Addresses? Something else?
Well,
I would love to have a general script: where all data is kept, as long as it's documented and in one format.
But if you could give me an example that does the things I mentioned, I think I would be able to adapt it to other generalisations as well.
What I would want the most is:
apply a tag to all objects in a closed way or a closed relation
place a node on the centre of a closed way or relation with tags that come from that way/relation
Of course in script form, so that it can be automated.
Thanks
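The second item in the list above (a node at the centre of a closed way, carrying the way's tags) could look roughly like this in Python. It is only a sketch under stated assumptions: ways and nodes are plain dicts in a made-up layout, coordinates are treated as planar, the centre is the area-weighted centroid (which can fall outside a concave shape), the negative id for the new node is an arbitrary choice for the example, and relations are not handled.

```python
# Sketch: replace a closed way by a single node at its centroid.
# ASSUMPTIONS: a way is {"refs": [...], "tags": {...}}; nodes map a ref to
# {"lon": ..., "lat": ...}; lon/lat are treated as planar coordinates.


def ring_centroid(ring):
    """Area-weighted centroid of a ring of (x, y) pairs (shoelace formula)."""
    pts = ring[:-1] if ring[0] == ring[-1] else ring
    area2 = cx = cy = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        cross = x1 * y2 - x2 * y1
        area2 += cross  # accumulates twice the signed area
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    if area2 == 0:
        # Degenerate ring (all points on a line): fall back to the mean.
        return (sum(p[0] for p in pts) / len(pts),
                sum(p[1] for p in pts) / len(pts))
    return cx / (3 * area2), cy / (3 * area2)


def way_to_node(way, nodes, new_id):
    """Return a node dict for a closed way, or None if the way is not closed."""
    refs = way["refs"]
    if len(refs) < 4 or refs[0] != refs[-1]:
        return None
    ring = [(nodes[r]["lon"], nodes[r]["lat"]) for r in refs]
    lon, lat = ring_centroid(ring)
    return {"id": new_id, "lon": lon, "lat": lat, "tags": dict(way["tags"])}


if __name__ == "__main__":
    nodes = {
        "a": {"lon": 0.0, "lat": 0.0},
        "b": {"lon": 2.0, "lat": 0.0},
        "c": {"lon": 2.0, "lat": 2.0},
        "d": {"lon": 0.0, "lat": 2.0},
    }
    building = {"refs": ["a", "b", "c", "d", "a"], "tags": {"shop": "bakery"}}
    print(way_to_node(building, nodes, new_id=-1))
```

Open ways return `None` so that roads and similar linear features are left untouched.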