Edit: An update, I'll put it at the top because it is most relevant now:

I made some progress.

When I use the following code:

import osmium as osm
import pandas as pd

class OSMHandler(osm.SimpleHandler):
    def __init__(self):
        osm.SimpleHandler.__init__(self)
        self.osm_data = []

    def tag_inventory(self, elem, elem_type):
        for tag in elem.tags:
            if elem_type == 'relation':
                members = elem.members
            else:
                members = 'None'

            self.osm_data.append([elem_type, 
                               elem.id, 
                               elem.version,
                               elem.visible,
                               pd.Timestamp(elem.timestamp),
                               elem.uid,
                               elem.user,
                               elem.changeset,
                               len(elem.tags),
                               tag.k, 
                               tag.v,
                               members
                               ])


    def node(self, n):
        self.tag_inventory(n, "node")

    def way(self, w):
        self.tag_inventory(w, "way")

    def relation(self, r):
        self.tag_inventory(r, "relation")

osmhandler = OSMHandler()
osmhandler.apply_file("../data/world_mtb_routes.o5m")

I get the error:

RuntimeError: Relation callback keeps reference to OSM object. This is not allowed.

Strangely, if I just use:

if elem_type == 'relation':
    print(elem.members)

instead of trying to add the info to self.osm_data, I do see the information I want scrolling by!

What am I missing? Why can I add elem.id etc. to self.osm_data but not elem.members?

====> The original question:

All,

I've been trying to build a website (in Django) which is to be an index of all MTB routes in the world. I'm a Pythonian so wherever I can I try to use Python.

I've successfully extracted data from the OSM API (see "Display relation (trail) in leaflet") but found that doing this for all MTB trails (tag: route=mtb) is too much data (processing takes very long). So I tried to do everything locally: I downloaded a torrent of the entire OpenStreetMap dataset (from planet.openstreetmap.org), converted the OSM file to o5m, and filtered for the tag route=mtb using osmfilter (part of osmctools in Ubuntu 20.04), like this:

osmfilter $unzipped_osm_planet_file --keep="route=mtb" -o=$osm_planet_dir/world_mtb_routes.o5m

This produces a file of about 83.6 MB which, on closer inspection, seems to contain all the data I need. My goal was to transform the file into a pandas.DataFrame() so I could do some further filtering and transforming before pushing the relevant parts into my Django DB. I tried to load the file as a regular XML file using pandas, but this crashed the Jupyter notebook kernel. I guess the data is too big.

My second approach was this solution: "How to extract and visualize data from OSM file in Python". It worked for me, at least partly: I can get some of the information, like the tags of the relations in the file (and the other specified details). What I'm missing are the relation members (the ways), the way members (the nodes), and the nodes' latitudes/longitudes. I need these to achieve what I did here: "Plotting OpenStreetMap relations does not generate continuous lines".

I'm open to many solutions. At the moment I feel that I don't understand how pyosmium works, but I must be close. Any help is appreciated.

asked 28 Aug, 21:52

FreekvH

edited 29 Aug, 09:08


pyosmium gives you only a temporary view of the data from the file. This makes processing fast because it avoids unnecessary copying, but it means that you have to make a deep copy of any data you want to keep after the callback returns. The id field is just an integer, so assigning it to another variable already gives you a full copy. members is an array which needs to be copied explicitly:

members = [(m.type, m.ref, m.role) for m in elem.members]

The same is true for the tag list (elem.tags). As a rule: don't just blindly copy all information you can get. Just keep what you really need later.
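As a rough illustration of that advice, here is a minimal sketch of a helper that copies a relation into plain Python values. The name copy_relation and the stand-in object built with SimpleNamespace are hypothetical; the stand-in only mimics the attributes already used in this thread (id, tags with k/v, members with type/ref/role), so the snippet runs without an OSM file.

```python
from types import SimpleNamespace


def copy_relation(elem):
    """Return a plain-Python record that holds no reference to elem."""
    return {
        "id": elem.id,  # plain integer, safe to keep as-is
        # copy every tag into an ordinary dict
        "tags": {tag.k: tag.v for tag in elem.tags},
        # copy every member into a tuple of plain values
        "members": [(m.type, m.ref, m.role) for m in elem.members],
    }


# Inside the handler from the question, the callback would then read:
#     def relation(self, r):
#         self.osm_data.append(copy_relation(r))

# Hypothetical stand-in for a relation, used only to exercise the helper
fake_relation = SimpleNamespace(
    id=42,
    tags=[SimpleNamespace(k="route", v="mtb")],
    members=[SimpleNamespace(type="w", ref=1001, role="")],
)
record = copy_relation(fake_relation)
```

Because the record contains only ints, strings, tuples, a list, and a dict, nothing in self.osm_data points back at the temporary OSM object once the callback returns.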

answered 29 Aug, 13:21

lonvia

Yes, this works! I didn't realize there was a difference in "id" and "members". Thanx a million!

I will make my code complete and indeed make it more minimal (I just need a dict with some values, like data[relation][ways][nodes], and some metadata). I'll post my final version here when it works.

(29 Aug, 17:35) FreekvH
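
One possible shape for the data[relation][ways][nodes] structure mentioned in the comment, sketched under assumptions: the function build_routes and its three inputs are hypothetical, and the inputs are taken to be plain copies collected earlier (relation records with a "members" list of (type, ref, role) tuples, a dict from way id to node ids, and a dict from node id to a (lat, lon) tuple). This is not the poster's final version.

```python
def build_routes(relations, way_nodes, node_coords):
    """Nest copied data as routes[relation_id][way_id] -> list of (lat, lon)."""
    routes = {}
    for rel in relations:
        ways = {}
        for member_type, ref, _role in rel["members"]:
            # keep only way members that were actually collected
            if member_type != "w" or ref not in way_nodes:
                continue
            # skip node ids for which no coordinates were collected
            ways[ref] = [node_coords[n] for n in way_nodes[ref] if n in node_coords]
        routes[rel["id"]] = ways
    return routes


# Made-up sample values, only to show the resulting shape
relations = [{"id": 42, "members": [("w", 1001, ""), ("n", 7, "guidepost")]}]
way_nodes = {1001: [1, 2, 3]}
node_coords = {1: (52.0, 5.0), 2: (52.1, 5.1)}

routes = build_routes(relations, way_nodes, node_coords)
```

Keeping the three inputs as separate plain dicts and lists means each handler callback only has to copy its own element type, and the nesting happens after the file has been read.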