Hi,

I am looking to compile a list of UK street names for an infographic project. Would it be possible to scrape the OSM for every street name, and output that data into a spreadsheet? If anyone out there can help with this I'd be very grateful.

Thanks,

Chris

asked 18 Apr '12, 11:26

chrishall


We reserve the word "scraping" for people who, to our dismay, write clumsy scripts that make tons of individual requests against our API or web site. Don't do that - we're an open data project and we make our data available for download!

Grab a data extract for the UK, e.g. from the Geofabrik download server, then use a program like Osmosis to keep only the ways tagged as highways:

osmosis --read-pbf file.osm.pbf --tf accept-ways highway=\* --write-xml myfile.osm

From the resulting XML file, extract all names - easiest on Linux with something like

grep 'k="name"' myfile.osm | cut -d\" -f4

and you have your list. (If you prefer DBF files to XML, you could probably download the shp.zip file from the download server and simply open the roads.dbf file.)
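If you would rather end up with something a spreadsheet can open directly, the grep step can be sketched in Python instead. This is only a rough sketch, not a tested recipe for the full UK extract: it assumes the Osmosis output is plain OSM XML named as above, the function names `way_names` and `write_names_csv` are made up for illustration, and it reads names only from ways that carry a highway tag, in case the filtered file still contains named nodes or relations.

```python
import csv
import xml.etree.ElementTree as ET


def way_names(osm_path):
    """Yield the name of every way that has both a highway and a name tag."""
    # iterparse streams the file, so the whole document is never held at once.
    for _event, elem in ET.iterparse(osm_path, events=("end",)):
        if elem.tag == "way":
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            if "highway" in tags and "name" in tags:
                yield tags["name"]
        if elem.tag in ("node", "way", "relation"):
            # Drop the element's children and attributes to limit memory use.
            elem.clear()


def write_names_csv(osm_path, csv_path):
    """Write one name per row to a CSV file and return the number of rows."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name"])  # header row
        for name in way_names(osm_path):
            writer.writerow([name])
            count += 1
    return count


if __name__ == "__main__":
    # File names are placeholders; adjust them to your own paths.
    print(write_names_csv("myfile.osm", "street_names.csv"), "names written")
```

The resulting CSV opens in any spreadsheet program. Like the grep version, it still lists a name once per way, so the caveats below apply to it as well.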

Caveats:

  1. This procedure will yield names for everything tagged "highway", including cycleways, footways, steps, roundabouts.

  2. This procedure does not allow you to count how often each name occurs in reality, because a road may consist of several parts in OSM, so the same road might feature multiple times in your file. Should you want to eliminate such double mentions, some programming or GIS magic will be required.
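Both caveats can be softened with a little programming. The sketch below is one possible approach, under assumptions of my own: the set of highway values to skip is an illustrative choice rather than an official list, roundabouts are recognised by a `junction=roundabout` tag, and `unique_street_names` is a hypothetical helper name. It collapses repeated names into a single entry, so it gives you a list of distinct names. It does not tell you how many separate roads share a name, since all the "High Street"s in different towns end up as one entry too; counting real roads would need the geographic grouping mentioned above.

```python
import xml.etree.ElementTree as ET

# Assumption: highway values treated as "not a street" for this project.
EXCLUDED_HIGHWAY_VALUES = {"cycleway", "footway", "steps", "path"}


def unique_street_names(osm_path, excluded=EXCLUDED_HIGHWAY_VALUES):
    """Return a sorted list of distinct names of street-like ways."""
    names = set()
    for _event, elem in ET.iterparse(osm_path, events=("end",)):
        if elem.tag == "way":
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            name = tags.get("name")
            highway = tags.get("highway")
            if (
                name
                and highway
                and highway not in excluded
                and tags.get("junction") != "roundabout"
            ):
                # A set keeps each name once, however many way segments use it.
                names.add(name.strip())
        if elem.tag in ("node", "way", "relation"):
            # Drop the element's children and attributes to limit memory use.
            elem.clear()
    return sorted(names)


if __name__ == "__main__":
    # The file name is a placeholder for the Osmosis output.
    for street in unique_street_names("myfile.osm"):
        print(street)
```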

answered 18 Apr '12, 11:41

Frederik Ramm ♦

edited 18 Apr '12, 11:49

Frederik,

Thanks for coming back to me. Apologies re. 'scraping', I'm not looking to inconvenience anyone!

The second point you make is probably the most relevant - and thanks for bringing it to my attention. I'm not sure I know how to solve this myself - can you help, or recommend anyone who can? If it's time-consuming work I'm willing to pay for the research/make an appropriate donation.

Many thanks,

Chris

(18 Apr '12, 11:54) chrishall

It seems like the data set mentioned by Richard and Ed would conveniently circumvent this problem!

(18 Apr '12, 11:59) Frederik Ramm ♦

It might be better to start with something like OS OpenData, particularly the Locator dataset I think.

http://www.ordnancesurvey.co.uk/oswebsite/products/os-locator/index.html

As yet, OpenStreetMap does not have as comprehensive a coverage as the OS data.

answered 18 Apr '12, 11:44

EdLoach ♦

OpenStreetMap is arguably not the best data source for your application. You would be better served by using OS Locator, from the Ordnance Survey OpenData release, which has a better licence, a simpler file format, more consistent data, and is more complete.

answered 18 Apr '12, 11:44

Richard ♦

question asked: 18 Apr '12, 11:26

question was seen: 7,335 times

last updated: 18 Apr '12, 11:59