NOTICE: is no longer in use from 1st March 2024. Please use the OpenStreetMap Community Forum

Using Osmosis I've read the entire world PBF file. However, so far I've been unable to find house outline data. For example, my neighborhood in Forest Knolls, San Francisco has house outlines when viewed on the OSM website, but when I filter all the ways/nodes in the world PBF file to a bounding box of my neighborhood, I don't see house outlines. So, the question is, where can the house outline data be found in the PBF file?

In response to the comment requesting further information...

I downloaded the current "planet-latest.osm.pbf" about six weeks ago (the date on my copy of the file shows Aug 3, 2015). This file is about 30 GB and contains something close to 3.5 billion records (about 3 billion nodes and about 500 million ways). When filtered to US-only records it has about 560 million nodes and, I forget, maybe 30 million ways.

Describing the full filtering process is fairly complex, since the amount of data is quite large and various algorithms I attempted exceeded my 32GB of RAM (partially because Java built-in data structures have huge amounts of overhead). So I created a fairly elaborate multipass algorithm that first filtered to find all ways and nodes in the US and wrote the data out to intermediate files, and then repeated this process starting with the intermediate files to create state-level files (one of course being California). Then I finally filtered to a particular test neighborhood (my neighborhood in SF). So there are a lot of things that could have gone wrong, but if I knew where to start looking (where the house outline data is) then it would be easier for me to track what might have gone wrong.
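For readers following along, here is a rough sketch of one bounding-box step of such a multipass filter. It is not the poster's actual code: the `Node` and `Way` types and both method names are hypothetical stand-ins, and a real reader such as Osmosis supplies its own entity classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified element types used only for this illustration.
class BboxFilterSketch {

    static final class Node {
        final long id; final double lat; final double lon;
        Node(long id, double lat, double lon) { this.id = id; this.lat = lat; this.lon = lon; }
    }

    static final class Way {
        final long id; final long[] nodeIds;
        Way(long id, long... nodeIds) { this.id = id; this.nodeIds = nodeIds; }
    }

    // Pass 1: scan the nodes and remember only the IDs of those inside the box.
    static Set<Long> nodeIdsInBox(Iterable<Node> nodes,
                                  double minLat, double minLon,
                                  double maxLat, double maxLon) {
        Set<Long> ids = new HashSet<>();
        for (Node n : nodes) {
            if (n.lat >= minLat && n.lat <= maxLat && n.lon >= minLon && n.lon <= maxLon) {
                ids.add(n.id);
            }
        }
        return ids;
    }

    // Pass 2: keep every way that references at least one node kept in pass 1.
    static List<Way> waysTouchingBox(Iterable<Way> ways, Set<Long> keptNodeIds) {
        List<Way> kept = new ArrayList<>();
        for (Way w : ways) {
            for (long ref : w.nodeIds) {
                if (keptNodeIds.contains(ref)) {
                    kept.add(w);
                    break; // one matching node is enough
                }
            }
        }
        return kept;
    }
}
```

Note that a way kept this way may still reference nodes outside the box, so a later pass has to cope with node IDs that are missing from the filtered node set.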

So basically I'm wondering if house outlines are stored as ways, or as nodes, or what? I thought they would be stored as ways that then referred to a list of node Ids that would specify the exact shape of the house outline. However, so far I can't see anything in the PBF file that confirms my guess.

TIA, Roger

asked 22 Sep '15, 00:25

Mirage

accept rate: 0%

edited 22 Sep '15, 19:49

aseerel4c26 ♦


Could you amend your question with a description of what data you downloaded and how you filtered it? Perhaps there is something wrong with the process.

(22 Sep '15, 01:23) Frederik Ramm ♦

Your process is, let's say, overly complex.

First read up on the OSM data model and other pages (it helps to know what you are looking for before you start looking).

Then get a metro extract or a state-level extract. In general it is easier to first pare down the size/area of an extract before doing anything else. Then use, for example, osmfilter to extract the nodes and ways for the building outlines (you will need to convert to standard XML format first). You are looking for ways that have a building tag.

Note that after this process you will have a file containing nodes and ways, not already-built geometries.
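As a minimal sketch of what "ways that have a building tag" and "no already built geometries" mean in code: the `Way` class, the method names, and the `coordsByNodeId` map below are all hypothetical and not taken from any particular OSM library. The sketch also assumes simple closed ways only and ignores buildings mapped as multipolygon relations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class BuildingOutlineSketch {

    // Hypothetical, simplified way: an ordered list of node IDs plus key/value tags.
    static final class Way {
        final long id; final long[] nodeIds; final Map<String, String> tags;
        Way(long id, long[] nodeIds, Map<String, String> tags) {
            this.id = id; this.nodeIds = nodeIds; this.tags = tags;
        }
    }

    // A candidate outline carries a "building" tag (any value) and is closed,
    // i.e. its first and last node references are the same.
    static boolean isBuildingOutline(Way w) {
        return w.tags.containsKey("building")
                && w.nodeIds.length >= 4
                && w.nodeIds[0] == w.nodeIds[w.nodeIds.length - 1];
    }

    // Build the geometry yourself: replace each node reference with its {lat, lon}.
    // Returns null if a referenced node is missing from the map.
    static List<double[]> outline(Way w, Map<Long, double[]> coordsByNodeId) {
        List<double[]> ring = new ArrayList<>();
        for (long ref : w.nodeIds) {
            double[] latLon = coordsByNodeId.get(ref); // key is the node ID itself
            if (latLon == null) {
                return null;
            }
            ring.add(latLon);
        }
        return ring;
    }
}
```

The way itself stores no coordinates, so the outline only appears once every node reference has been resolved against the node data.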

answered 22 Sep '15, 08:32

SimonPoole ♦
accept rate: 18%


Some elements of your suggestion are exactly what I don't want to do. Mainly I don't want to work with XML at all, partially because it's hugely inefficient, and partially because I loathe it and think it's one of the worst file formats to ever be foisted on the world of programmers.

However, you do answer the core of my question, for which I'm truly grateful. So it is pretty much as I expected; now I just have to trace the migration of the building-tagged ways from the original PBF file to the final state way file and determine where data was lost.

I won't go into my reasons for creating this, in your words, "overly complex" approach, let's just say that I needed to do all these steps anyway, so it didn't seem overly complex to me.

BTW, I did read the pages, but reading and being sure that you understand what every element is truly representing are different things entirely.

Thanks again for your answer.

(22 Sep '15, 09:49) Mirage

You don't need to use XML for osmfilter; you can use .o5m instead.

(22 Sep '15, 10:08) Richard ♦

Working with a small XML file is far faster than trying to prune down the planet, but extracts are available as PBFs too, and osmfilter, osmconvert, and osmosis all work fine with them.

If, starting from an extract, you still have problems, you could post a new question, including the command lines.

(22 Sep '15, 10:26) pnorman

XML may be ugly but you can easily do a "grep building myfile.osm" and you'd probably have saved yourself a lot of time (and maybe wouldn't even have had to ask this question). It is not the worst idea to start with XML when you do something with OSM, and upgrade to an efficient format like PBF later - for easier debugging.

(22 Sep '15, 10:57) Frederik Ramm ♦

Thanks for all the help, after a small fix here is my first trivial render of Forest Knolls in San Francisco. So now I have house outlines. Now just need to work on OpenGL rendering so it will all look pretty.

[Image: house outlines]

(22 Sep '15, 12:48) Mirage

Was it a bug in your code or another issue? Maybe you can shed some light for others running into the same problem.

(22 Sep '15, 13:43) scai ♦

Yeah, it was a bug in my code, so probably not much help for others. I made a stupid error where in one pass I was attempting to grab node data from one of my HashMaps, but rather than looking up each way's indexed node Id in my HashMap, I looked up the index itself in my HashMap. Looking up a "for" loop index in my HashMap is basically a meaningless operation, so everything that happened from there forward was just garbage. Almost surprising that the program didn't just crash. I'm not sure if this is an indication that I write good fault tolerant code, or that I write bad code that helps hide actual bugs.
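To illustrate the kind of mix-up described here (a hypothetical reconstruction with made-up names, not the actual code from this project), compare a lookup keyed by the loop index with one keyed by the way's node ID:

```java
import java.util.Map;

class NodeLookupSketch {

    // Buggy variant: the loop index i is used as the map key, so the lookup asks
    // for nodes with IDs 0, 1, 2, ... instead of the IDs the way refers to.
    static int resolvedWithIndexBug(long[] wayNodeIds, Map<Long, double[]> coordsByNodeId) {
        int found = 0;
        for (int i = 0; i < wayNodeIds.length; i++) {
            if (coordsByNodeId.get((long) i) != null) {
                found++;
            }
        }
        return found;
    }

    // Correct variant: the index only selects the reference; the node ID is the key.
    static int resolvedCorrectly(long[] wayNodeIds, Map<Long, double[]> coordsByNodeId) {
        int found = 0;
        for (int i = 0; i < wayNodeIds.length; i++) {
            if (coordsByNodeId.get(wayNodeIds[i]) != null) {
                found++;
            }
        }
        return found;
    }
}
```

One plausible reason such a bug does not crash is that `Map.get` returns `null` for a missing key rather than throwing, so the wrong lookups quietly produce missing data.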

(23 Sep '15, 03:59) Mirage

question asked: 22 Sep '15, 00:25

question was seen: 4,432 times

last updated: 23 Sep '15, 03:59
