For my master's thesis, I need to geocode many postal addresses (my dataset provides zip code, city and street), all of which are located in Germany. However, geocoding failed for a subset of around 70,000 observations. Apparently, the geocoder could not convert these addresses because of minor mistakes in the data. From eyeballing the data, the most prevalent reasons appear to be the following:

  • minor spelling mistakes in the city name. Example: "Neunburg v. Wald" instead of "Neunburg vorm Wald" or "Hessheim" instead of "Heßheim"
  • a missing part of the name. Example: "Pohlheim" instead of "Pohlheim-Watzenborn"
  • the string in the city column refers to a village that belongs to the city of interest
  • a mismatch between zip code and city name (maybe because the address was recorded before a change in the zip code occurred, e.g. two zip codes were merged)
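
To illustrate the first two kinds of mismatch: city names can be compared after normalizing them (lowercasing, expanding "ß" and umlauts, dropping punctuation) and then allowing a prefix match or a small edit distance. This is only a minimal sketch; the class name `CityNames`, its method names and the edit threshold are hypothetical choices, not part of Photon or any other library.

```java
import java.util.Locale;

final class CityNames {
    // Lowercase, expand German sharp s and umlauts, then drop everything that is not a-z.
    static String normalize(String name) {
        String s = name.toLowerCase(Locale.GERMAN)
                .replace("\u00df", "ss")   // ß
                .replace("\u00e4", "ae")   // ä
                .replace("\u00f6", "oe")   // ö
                .replace("\u00fc", "ue");  // ü
        return s.replaceAll("[^a-z]", "");
    }

    // Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True if, after normalization, one name is a prefix of the other
    // or the two names differ by at most maxEdits single-character edits.
    static boolean probablySame(String a, String b, int maxEdits) {
        String x = normalize(a);
        String y = normalize(b);
        if (x.isEmpty() || y.isEmpty()) {
            return false;
        }
        return x.startsWith(y) || y.startsWith(x) || editDistance(x, y) <= maxEdits;
    }
}
```

With a threshold of 3, `probablySame("Neunburg v. Wald", "Neunburg vorm Wald", 3)` and `probablySame("Pohlheim", "Pohlheim-Watzenborn", 3)` both hold. A village name that belongs to a differently named city would not be caught this way.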

Photon appears to be able to identify many of these addresses, which is why I would like to use its API to auto-correct them. I would like to run Java code on my local machine that, for each address stored in a csv file, requests a GeoJSON object from the Photon API and then saves the answer in the very same file. This seems to be a standard problem, but unfortunately I could not find a straightforward tutorial (I have little to no knowledge of Java). Is there code available that I can build upon? Thank you for your help!

asked 08 Mar '15, 22:06

trefixxx

edited 09 Aug '16, 11:22

aseerel4c26 ♦


I don't quite see why learning enough Java to query an API and write the results to a file would be a significant hurdle, but you can of course use any other language to access the API.

A further note: while we, obviously, can't speak for Komoot here, I suspect you should plan on running a local install of Photon for your project.

answered 09 Mar '15, 14:06

SimonPoole ♦

@SimonPoole thanks for your reply. Would you mind giving me a link to material that is sufficient to tackle this problem? The problem is that I do not know what specifically I need to work through, and given my deadline I would like to avoid a lengthy search.

(09 Mar '15, 14:20) trefixxx

You haven't indicated at all what level of (computer/programming) knowledge you have, or whether Java is a hard requirement or there is another language you are comfortable with and would prefer, etc.

(09 Mar '15, 19:59) SimonPoole ♦

I have a very basic knowledge of programming: I started with PHP, continued with C (completed half an online course) and at some point took an introductory course in Java at university. I think that I know most basic concepts of programming but lack experience.

(09 Mar '15, 21:42) trefixxx

I believe any beginner's Java textbook plus a Google search for "Java API JSON" (which turns up literally dozens of suitable examples) will be sufficient.

Simply read the original address file, query Photon, and write the original address and the best match from Photon to the output file.
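
The loop described above could be sketched roughly as follows, using only the JDK (Java 11 or later). Several things here are assumptions rather than facts from this thread: the input is taken to be semicolon-separated in the order zip;city;street with no header row, the Photon base URL is passed on the command line (a local install or the public instance), only Photon's `q` and `limit` query parameters are used, and the class name `PhotonBatch` is made up. The raw GeoJSON response is written next to each input line; picking the best match from it is left out.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PhotonBatch {
    // Build the request URL: the address as free text in "q", at most "limit" candidates.
    static String buildUrl(String baseUrl, String street, String zip, String city, int limit) {
        String query = street + ", " + zip + " " + city;
        return baseUrl + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8) + "&limit=" + limit;
    }

    // Quote a value for CSV output by doubling embedded double quotes.
    static String csvQuote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length < 3) {
            System.err.println("usage: PhotonBatch <photonBaseUrl> <input.csv> <output.csv>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        try (BufferedReader in = Files.newBufferedReader(Path.of(args[1]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Path.of(args[2]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Assumed column layout: zip;city;street
                String[] f = line.split(";", -1);
                if (f.length < 3) {
                    continue; // skip malformed rows
                }
                String url = buildUrl(args[0], f[2], f[0], f[1], 5);
                HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
                HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
                // Original row, HTTP status, and the raw GeoJSON answer as one quoted field.
                out.write(line + ";" + response.statusCode() + ";" + csvQuote(response.body()));
                out.newLine();
                Thread.sleep(1000); // arbitrary pause so that a shared server is not flooded
            }
        }
    }
}
```

Writing to a separate output file rather than back into the input keeps the original data intact if a run is interrupted halfway through.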

(11 Mar '15, 08:13) SimonPoole ♦

@SimonPoole As you claimed, it turned out not to be a major problem. Though it took me a whole day, I now have my code compiling and running on a csv file. However, I am not sure yet what logic I should implement to pick "the best match" from Photon, so that the coordinates do not end up referring to a completely different point in the country. One possibility I can imagine is to exploit the fact that the postal address additionally provides the state. I could then check whether the state in Photon's response matches the state in my original data. But this is a pretty poor check, as states are big. Do you have a suggestion?
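
One way to make that check finer than the state alone is to score each candidate on several fields and reject the row when no candidate scores high enough. The sketch below assumes the candidates have already been parsed from the `postcode`, `city` and `state` properties of the GeoJSON features; the `Candidate` class, the weights and the threshold are all hypothetical and would need tuning on real data. It relies on the first two digits of a German postcode identifying the postal region, so a changed or merged zip code will often still share that prefix.

```java
import java.util.List;
import java.util.Optional;

final class BestMatch {
    // Hypothetical holder for the fields taken from one candidate in the response.
    static final class Candidate {
        final String postcode;
        final String city;
        final String state;
        final double lat;
        final double lon;

        Candidate(String postcode, String city, String state, double lat, double lon) {
            this.postcode = postcode;
            this.city = city;
            this.state = state;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Higher is better. The weights are arbitrary illustration values.
    static int score(String zip, String city, String state, Candidate c) {
        int s = 0;
        if (c.postcode != null && !zip.isEmpty()) {
            if (c.postcode.equals(zip)) {
                s += 4; // exact postcode match
            } else if (zip.length() >= 2 && c.postcode.startsWith(zip.substring(0, 2))) {
                s += 2; // same two-digit postal region
            }
        }
        if (c.city != null && c.city.equalsIgnoreCase(city)) {
            s += 2;
        }
        if (c.state != null && c.state.equalsIgnoreCase(state)) {
            s += 1;
        }
        return s;
    }

    // Return the highest-scoring candidate, or empty if none reaches minScore.
    static Optional<Candidate> pick(String zip, String city, String state,
                                    List<Candidate> candidates, int minScore) {
        Candidate best = null;
        int bestScore = minScore - 1;
        for (Candidate c : candidates) {
            int sc = score(zip, city, state, c);
            if (sc > bestScore) {
                best = c;
                bestScore = sc;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Rows for which `pick` returns an empty result can be written to a separate list for manual review instead of being assigned doubtful coordinates.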

(13 Mar '15, 08:36) trefixxx

As a response to your note in your original post: I have checked with Komoot to ensure that I stay within their usage policy, so there's no problem in this regard.

(13 Mar '15, 08:38) trefixxx
question asked: 08 Mar '15, 22:06

question was seen: 4,112 times

last updated: 09 Aug '16, 11:22
