This is a static archive of our old OpenStreetMap Help Site. Please post any new questions and answers at community.osm.org.

How to extract relation members from .o5m files

Edit: An update, I'll put it at the top because it is most relevant now:

I made some progress.

When I use the following code:

import osmium as osm
import pandas as pd

class OSMHandler(osm.SimpleHandler):
    def __init__(self):
        osm.SimpleHandler.__init__(self)
        self.osm_data = []

    def tag_inventory(self, elem, elem_type):
        for tag in elem.tags:
            if elem_type == 'relation':
                members = elem.members
            else:
                members = 'None'

            self.osm_data.append([elem_type, 
                               elem.id, 
                               elem.version,
                               elem.visible,
                               pd.Timestamp(elem.timestamp),
                               elem.uid,
                               elem.user,
                               elem.changeset,
                               len(elem.tags),
                               tag.k, 
                               tag.v,
                               members
                               ])


    def node(self, n):
        self.tag_inventory(n, "node")

    def way(self, w):
        self.tag_inventory(w, "way")

    def relation(self, r):
        self.tag_inventory(r, "relation")

osmhandler = OSMHandler()
osmhandler.apply_file("../data/world_mtb_routes.o5m")

I get the error:

RuntimeError: Relation callback keeps reference to OSM object. This is not allowed.

Strangely if I just use:

if elem_type == 'relation':
    print(elem.members)

instead of trying to add the info to self.osm_data, I do see the information I want scrolling by!

What am I missing? Why can I add elem.id etc to the self.osm_data but not elem.members?

====> The original question:

All,

I've been trying to build a website (in Django) which is to be an index of all MTB routes in the world. I'm a Pythonian so wherever I can I try to use Python.

I've successfully extracted data from the OSM API (Display relation (trail) in leaflet) but found that doing this for all MTB trails (tag: route=mtb) is too much data (processing takes very long). So I tried to do everything locally: I downloaded a torrent of the entire OpenStreetMap dataset (from planet.openstreetmap.org), converted the osm file to o5m, and filtered for the tag route=mtb using osmfilter (part of osmctools in Ubuntu 20.04), like this:

osmfilter $unzipped_osm_planet_file --keep="route=mtb" -o=$osm_planet_dir/world_mtb_routes.o5m

This produces a file of about 83.6 MB that, on closer inspection, seems to contain all the data I need. My goal was to transform the file into a pandas.DataFrame() so I could do some further filtering and transforming before pushing the relevant aspects into my Django DB. I tried to load the file as a regular XML file using pandas, but this crashed the Jupyter notebook kernel. I guess the data is too big.

My second approach was this solution: How to extract and visualize data from OSM file in Python. It worked for me, at least, I can get some of the information, like the tags of the relations in the file (and the other specified details). What I'm missing is the relation members (the ways) and then the way members (the nodes) and their latitude/longitudes. I need these to achieve what I did here: Plotting OpenStreetMap relations does not generate continuous lines.

I'm open to many solutions. At the moment I feel that I don't understand how pyosmium works, but I must be close. Any help is appreciated.

mtb o5m relations pyosmium

asked 28 Aug '21, 21:52

FreekvH

edited 29 Aug '21, 09:08

2 Answers:

pyosmium gives you only a temporary view of the data from the file. This makes processing fast because it avoids unnecessary copying but it means that you have to make a deep copy of any data you want to keep. The id field is just an integer, so assigning it to another variable gives you the full deep copy. members is an array which needs to be copied explicitly:

members = [(m.type, m.ref, m.role) for m in elem.members]

The same is true for the tag list (elem.tags). As a rule: don't just blindly copy all information you can get. Just keep what you really need later.
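As a minimal sketch of that copying step, the helper below turns a relation-like object into plain Python values. The function name copy_relation and the SimpleNamespace stand-ins are hypothetical and only mimic the attribute names used in this thread (id, tags with k/v, members with type/ref/role), so the snippet runs without pyosmium or an input file:

```python
from types import SimpleNamespace


def copy_relation(elem):
    """Return a snapshot of a relation-like object as built-in types.

    Ids are plain integers already; tags and members are rebuilt as a
    dict and a list of tuples so nothing points back at `elem`.
    """
    return {
        "id": elem.id,
        "tags": {tag.k: tag.v for tag in elem.tags},
        "members": [(m.type, m.ref, m.role) for m in elem.members],
    }


# Hypothetical stand-in for the object a relation callback receives.
fake_relation = SimpleNamespace(
    id=42,
    tags=[
        SimpleNamespace(k="route", v="mtb"),
        SimpleNamespace(k="name", v="Example trail"),
    ],
    members=[
        SimpleNamespace(type="w", ref=1001, role=""),
        SimpleNamespace(type="w", ref=1002, role="forward"),
    ],
)

snapshot = copy_relation(fake_relation)
```

In the handler from the question, the relation callback could then append copy_relation(r) to self.osm_data rather than storing elem.members itself.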

answered 29 Aug '21, 13:21

lonvia

Yes, this works! I didn't realize there was a difference in "id" and "members". Thanx a million!

I will complete my code and indeed make it more minimal (I just need a dict with some values, like data[relation][ways][nodes], and some metadata). I'll post my final version here when it works.
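A rough sketch of what such a nested dict could look like, assuming the relation members, way node lists and node coordinates have already been copied into plain Python values. The function name build_route_index, the three input dicts and the sample ids and coordinates are all made up for illustration:

```python
def build_route_index(relations, way_nodes, node_coords):
    """Nest coordinates as index[relation_id][way_id] -> [(lat, lon), ...].

    relations:   {relation_id: [(member_type, ref, role), ...]}
    way_nodes:   {way_id: [node_id, ...]}
    node_coords: {node_id: (lat, lon)}
    Members that are not ways, and ways or nodes missing from the
    lookups, are skipped.
    """
    index = {}
    for rel_id, members in relations.items():
        ways = {}
        for member_type, ref, _role in members:
            if member_type != "w" or ref not in way_nodes:
                continue
            ways[ref] = [node_coords[n] for n in way_nodes[ref] if n in node_coords]
        index[rel_id] = ways
    return index


# Hypothetical sample data, not taken from any real OSM extract.
relations = {42: [("w", 1001, ""), ("n", 7, "guidepost"), ("w", 9999, "")]}
way_nodes = {1001: [1, 2, 3]}
node_coords = {1: (52.0, 5.0), 2: (52.1, 5.1)}

data = build_route_index(relations, way_nodes, node_coords)
```

Keeping the three lookups separate until the end means each one only has to hold plain copies of ids and coordinates.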

(29 Aug '21, 17:35) FreekvH

I have finally solved this. I posted the answer on Stack Overflow and will link to it here, so that I have one place to update.

The answer: stackoverflow.com/questions/68622198/how-to-extract-relation-members-from-osm-xml-files

answered 14 Nov '21, 15:25

FreekvH

edited 14 Nov '21, 15:26