I've been exporting .osm files to .geojson using the osmium-export tool. I'm trying to keep the osm_ids (including the osm_type) so that we can look up the OSM objects downstream (e.g. through Nominatim).

I'm using the '--add-unique-id=type_id' option to include osm_ids, so my command is:

osmium export --add-unique-id='type_id' -o tmp/processed-data/planet-ports-231009.geojson tmp/processed-data/planet-ports-231009.osm --verbose

This produces rich objects and reading the output into a geopandas dataframe does indeed produce an 'id' column as expected.

According to the docs:

Or the TYPE is type_id in which case the ID is a string, the first character is the type of object (‘n’ for nodes, ‘w’ for linestrings created from ways, and ‘a’ for areas created from ways and/or relations, after that there is a unique ID based on the original OSM object ID(s).

However, when I look at the output, there are 2 entries with the name 'Heysham Port' (I'd post screenshots here but I'm not allowed) and they represent seemingly the same geographical object, a port in the UK. These 2 objects have different osm_ids in the .geojson output: w974090785 and a1948181570.

I'm able to look up the former on Nominatim, but the latter is nowhere to be found, whether as a node, relation or way (I tried n1948181570, r1948181570 and w1948181570). I might be misunderstanding this part of the docs: "based on the original OSM object ID(s)".

My question is, can I tell osmium-export to keep the original id? Alternatively, how could I recover the true osm_id from an object of type 'a'? (I might prefer those objects, since there are more of them.)

Thanks for any help!

There is no area datatype in OSM; areas are generated by Osmium from either closed ways or relations of type multipolygon. To still generate unique ids, Osmium uses a simple "trick": it multiplies the id of the way or relation by 2 and adds 1 for relations. So if you see an id for an area, divide it by 2 (rounding down) to get the original id. If the id you see is even, it was generated from a way; if it is odd, it was generated from a relation. So in your case w974090785 and a1948181570 are both generated from the same way 974090785.
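Since you are already loading the output with geopandas, the rule above could be applied in Python roughly as follows. This is only a minimal sketch of the arithmetic described here, not part of Osmium; the function name original_osm_id is made up for illustration, and it assumes the id strings have the 'n'/'w'/'a' prefix form produced by --add-unique-id=type_id.

```python
from typing import Tuple


def original_osm_id(unique_id: str) -> Tuple[str, int]:
    """Map a type_id string from osmium export back to (osm_type, osm_id).

    Hypothetical helper: it only encodes the rule described above
    (area id = 2 * way id, or 2 * relation id + 1).
    """
    prefix, number = unique_id[0], int(unique_id[1:])
    if prefix == "n":
        return ("node", number)
    if prefix == "w":
        return ("way", number)
    if prefix == "a":
        # Even area ids come from ways, odd ones from relations.
        osm_type = "relation" if number % 2 else "way"
        # Integer division drops the +1 that was added for relations.
        return (osm_type, number // 2)
    raise ValueError(f"unexpected id prefix in {unique_id!r}")


print(original_osm_id("a1948181570"))  # ('way', 974090785)
```

With a GeoDataFrame gdf, something like gdf['id'].map(original_osm_id) should then give the type and id to look up in Nominatim.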

This should be documented in the man page, but isn't. I have put it on my todo list to add that.


