NOTICE: help.openstreetmap.org is no longer in use from 1st March 2024. Please use the OpenStreetMap Community Forum

On openSUSE 13.1 I am trying to do some GIS work with a large file, france-latest.osm.bz2, which I downloaded from here: http://download.geofabrik.de/europe.html

This is what I do with the file france-latest.osm.bz2:

   bzcat france-latest.osm.bz2

What is the goal? I want to extract everything that belongs to the POI type restaurant, i.e. longitude, latitude, name, address and so on.

I have the following up and running:

The package perl-XML-Twig, which provides xml_split, a command available on openSUSE for splitting XML files. Now I try to run the following command (I hope I have enough hard disk space, since the output is roughly 20 GB):

 bzcat france.osm.bz2 | xml_split -s 100M -b france -n 3 -

This results in a bunch of 100 MB XML files: france-001.xml, france-002.xml and so on. The XSLT then has to match the name of the root element in those files (xml_split:root), and of course I need a loop in bash to process the files one by one and collect all the results.

<xsl:stylesheet version="1.0"
        xmlns="http://www.w3.org/1999/xhtml"
        xmlns:xml_split="http://xmltwig.com/xml_split"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text" encoding="UTF-8"/>

    <!-- One tab-separated line per restaurant node: id, lat, lon, name -->
    <xsl:template match="/">
        <!-- Select the amenity=restaurant tag of every node in this piece -->
        <xsl:for-each select="xml_split:root/node/tag[@k='amenity' and @v='restaurant']">
            <!-- ".." is the parent node element, which carries id, lat and lon -->
            <xsl:value-of select="../@id"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:value-of select="../@lat"/>
            <xsl:text>&#x09;</xsl:text>
            <xsl:value-of select="../@lon"/>
            <xsl:text>&#x09;</xsl:text>
            <!-- The name is a sibling tag of the amenity tag -->
            <xsl:for-each select="../tag[@k='name']">
                <xsl:value-of select="@v"/>
            </xsl:for-each>
            <xsl:text>&#x0A;</xsl:text>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

Question: what do I need to do to get all of the desired data out of the dataset, i.e.

longitude, latitude, name, address and so on?

Below is a chunk of data from the XML file that I have parsed:

<node id="52768810" lat="48.2044749" lon="11.3249434" version="7" changeset="9490517" user="wheelmap_visitor" uid="290680" timestamp="2011-10-07T20:24:46Z">
    <tag k="addr:city" v="Olching" />
    <tag k="addr:country" v="DE" />
    <tag k="addr:housenumber" v="72" />
    <tag k="addr:postcode" v="82140" />
    <tag k="addr:street" v="Hauptstraße" />
    <tag k="amenity" v="restaurant" />
    <tag k="cuisine" v="mexican" />
    <tag k="email" v="info@cantina-olching.de" />
    <tag k="name" v="La Cantina" />
    <tag k="opening_hours" v="Mo-Su 17:00-01:00" />
    <tag k="phone" v="+49 (8142) 444393" />
    <tag k="website" v="http://www.cantina-olching.com/" />
    <tag k="wheelchair" v="no" />
</node>

So: how do I get all of the data out of the file mentioned above with XSLT processing?
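One possible way to pull every tag of a restaurant node, sketched in Python rather than XSLT: stream the OSM XML with the standard library's `xml.etree.ElementTree.iterparse` and turn each matching node into a dictionary. The function name `iter_restaurants` and the second sample node (id 1) are made up for illustration, and the sample is trimmed from the data chunk above; this is a sketch under those assumptions, not a tested replacement for the dedicated OSM tools.

```python
import io
import xml.etree.ElementTree as ET


def iter_restaurants(source):
    """Yield one dict per <node> tagged amenity=restaurant.

    `source` is a file name or a binary file object holding OSM XML.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end" or elem.tag not in ("node", "way", "relation"):
            continue
        if elem.tag == "node":
            # Collect all <tag k=... v=...> children into a plain dict.
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            if tags.get("amenity") == "restaurant":
                yield {"id": elem.get("id"), "lat": elem.get("lat"),
                       "lon": elem.get("lon"), **tags}
        root.clear()  # drop finished elements so memory use stays low


# Trimmed sample; the node with id 1 is hypothetical and only shows the filter.
SAMPLE = """<osm>
  <node id="52768810" lat="48.2044749" lon="11.3249434">
    <tag k="addr:city" v="Olching" />
    <tag k="addr:street" v="Hauptstraße" />
    <tag k="amenity" v="restaurant" />
    <tag k="name" v="La Cantina" />
  </node>
  <node id="1" lat="48.2" lon="11.3">
    <tag k="amenity" v="bench" />
  </node>
</osm>""".encode("utf-8")

rows = list(iter_restaurants(io.BytesIO(SAMPLE)))
for row in rows:
    print(row)
```

Because each node becomes a dictionary of all its tags, address fields such as `addr:street` or `addr:city` come along without having to list them one by one.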

asked 11 Apr '14, 11:38

say_hello_to_the_world

edited 11 Apr '14, 11:41

I would NOT recommend using xslt to extract data from OSM XML files. It's just a lot more work and more complicated than using some of the available OSM tools.

(11 Apr '14, 15:39) SK53 ♦

Hello SK53, many thanks. I will follow your advice and do as you recommend. By the way, if I have a very large file such as the one for Germany, should I split it into pieces using xml_split?

(12 Apr '14, 13:39) say_hello_to...

Personally, I'd use osmosis to extract data from a large downloaded .osm or .osm.pbf file.

(12 Apr '14, 14:16) SomeoneElse ♦

See also this forum question which seems to be related (and has a bit more info).

(12 Apr '14, 15:48) SomeoneElse ♦

I wouldn't get hung up on the fact that OSM data's in XML format. As @SK53 suggested above, there are lots of existing OSM tools for extracting data (most of which have had questions asked about before here).

I'd extract (an initially small) geographical area using osmosis, then look at using osmfilter to extract the data (possibly having used osmconvert to convert the data into a format that osmfilter can understand). Also perhaps consider osmium.
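To illustrate what "extracting a small geographical area" amounts to, here is a minimal plain-Python sketch of a bounding-box filter over nodes. It is a stand-in for illustration only, not osmosis itself: the function name `nodes_in_bbox`, the node ids and the box coordinates are all assumptions, and a real extract would also have to keep the ways and relations that reference the selected nodes, which this sketch ignores.

```python
import io
import xml.etree.ElementTree as ET


def nodes_in_bbox(source, south, west, north, east):
    """Yield (id, lat, lon) for each <node> inside the bounding box."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end" or elem.tag not in ("node", "way", "relation"):
            continue
        if elem.tag == "node":
            lat, lon = float(elem.get("lat")), float(elem.get("lon"))
            if south <= lat <= north and west <= lon <= east:
                yield elem.get("id"), lat, lon
        root.clear()  # drop finished elements so memory use stays low


# Two hypothetical nodes: one inside the box, one far outside it.
SAMPLE = b"""<osm>
  <node id="1" lat="48.2044749" lon="11.3249434"/>
  <node id="2" lat="48.8566" lon="2.3522"/>
</osm>"""

inside = list(nodes_in_bbox(io.BytesIO(SAMPLE), 48.0, 11.0, 48.5, 12.0))
print(inside)
```

Starting with a small box like this keeps test runs short before moving on to a whole country.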

answered 12 Apr '14, 15:56

SomeoneElse ♦

Many thanks for all your ideas. I will add all those packages on openSUSE 13.1. Hopefully I will get them installed, either via the command line or YaST.

(12 Apr '14, 21:07) say_hello_to...

Are you aware that bzcat and the .bz2 file name extension indicate a compressed OSM file?

You have to decompress it properly, so I would NOT recommend using anything like a pipe at your console prompt.

Instead of downloading france.osm.bz2, try the .osm.pbf file; it is a binary format.

Then you should get familiar with the tools called osmconvert and osmfilter; see the OSM wiki for how to use them.

Before processing the whole of France, I recommend running some tests with a smaller country extract or a region extract, also available from geofabrik.de.

With osmconvert you can produce a CSV file from raw OSM data, to load into a database or spreadsheet program.

Success?
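As a rough illustration of the two steps mentioned above (reading the compressed file, then producing a CSV), here is a minimal Python sketch that uses only the standard library: `bz2.open` to stream the .osm.bz2 file and `csv.DictWriter` to write the rows. This is not osmconvert; the function name `restaurants_to_csv`, the column list `FIELDS` and the temporary file names are assumptions chosen for the example.

```python
import bz2
import csv
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical column choice; any other OSM tag keys could be listed here.
FIELDS = ["id", "lat", "lon", "name", "addr:street",
          "addr:housenumber", "addr:postcode", "addr:city"]


def restaurants_to_csv(osm_bz2_path, csv_path):
    """Stream a .osm.bz2 file, write restaurant nodes to CSV, return the row count."""
    count = 0
    with bz2.open(osm_bz2_path, "rb") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=FIELDS)
        writer.writeheader()
        context = ET.iterparse(src, events=("start", "end"))
        _, root = next(context)  # the first event is the start of the root element
        for event, elem in context:
            if event != "end" or elem.tag not in ("node", "way", "relation"):
                continue
            if elem.tag == "node":
                tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
                if tags.get("amenity") == "restaurant":
                    row = {"id": elem.get("id"), "lat": elem.get("lat"),
                           "lon": elem.get("lon")}
                    # Copy only the tag columns that this node actually has.
                    row.update({k: tags[k] for k in FIELDS[3:] if k in tags})
                    writer.writerow(row)
                    count += 1
            root.clear()  # drop finished elements so memory use stays low
    return count


# Demo on a tiny sample written to a temporary directory.
SAMPLE = """<osm>
  <node id="52768810" lat="48.2044749" lon="11.3249434">
    <tag k="addr:city" v="Olching" />
    <tag k="amenity" v="restaurant" />
    <tag k="name" v="La Cantina" />
  </node>
</osm>"""

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "sample.osm.bz2")
    out_path = os.path.join(tmp, "restaurants.csv")
    with bz2.open(src_path, "wb") as f:
        f.write(SAMPLE.encode("utf-8"))
    written = restaurants_to_csv(src_path, out_path)
    with open(out_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
print(written, rows)
```

Columns that a node does not have (for example a missing `addr:street`) are simply left empty in the CSV.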

answered 11 Apr '14, 12:12

stephan75

He already uncompresses it via bzcat.

(11 Apr '14, 21:23) scai ♦

Hello Stephan, many thanks. I will follow your advice and do as you recommend.

I will try to work on a smaller country extract or region; I guess Geofabrik has some.

I would love to do some conversions to CSV, or to load the data into a database or spreadsheet.

(12 Apr '14, 13:37) say_hello_to...

By the way, if I have a very large file such as the one for Germany, should I split it into pieces using xml_split?

(12 Apr '14, 13:39) say_hello_to...

By the way, I also installed osmfilter; see here:

http://wiki.openstreetmap.org/wiki/Osmfilter

Download and build in one run:   wget -O - http://m.m.i24.cc/osmfilter.c |cc -x c - -O3 -o osmfilter

As usual: There is no warranty, to the extent permitted by law.

linux-70ce:/home/martin #  wget -O - http://m.m.i24.cc/osmfilter.c |cc -x c - -O3 -o osmfilter
--2014-04-12 22:34:49--  http://m.m.i24.cc/osmfilter.c
Resolving m.m.i24.cc (m.m.i24.cc)... 80.67.17.148, 2a00:1158:0:300:432f::1
Connecting to m.m.i24.cc (m.m.i24.cc)|80.67.17.148|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 213497 (208K) [text/plain]
Saving to: ‘STDOUT’

100%[==========================================================================================================================================>] 213,497     1.14MB/s   in 0.2s

2014-04-12 22:34:49 (1.14 MB/s) - written to stdout [213497/213497]

<stdin>: In function ‘oo__close’:
<stdin>:5166:5: warning: call to function ‘read_close’ without a real prototype [-Wunprototyped-calls]
<stdin>:1079:13: note: ‘read_close’ was declared here
linux-70ce:/home/martin #

I am not sure whether it succeeded or failed.

answered 12 Apr '14, 21:43

say_hello_to_the_world


question asked: 11 Apr '14, 11:38

question was seen: 7,499 times

last updated: 12 Apr '14, 21:43
