I was recently looking at the Indian state of Arunachal Pradesh, and noticed that many rivers had been given Chinese names. Soon I found villages and lakes with Chinese names too.

This state has been administered by India since India's independence, but is claimed by China as part of Tibet, and therefore as part of China. I suspect that this claim is the reason some Chinese names have ended up on objects here.

I have been working on moving Chinese names to the name:zh tag, but it's hard to find everything. I would like to be able to construct an Overpass query that only returns objects whose name tags contain characters in Unicode's Chinese range (or the CJK range, as this would be accurate enough).

This is way over my head, as I know next to nothing about Overpass queries or character encoding. Can someone lend a hand?

asked 12 Jul, 08:32

keithonearth

edited 13 Jul, 08:19

I don't see anything like that in the Overpass documentation. Could you just download objects with name tags and do the analysis yourself offline?

(12 Jul, 15:16) neuhausr

Try to download all names, then sort them. This should lead to all Chinese names being next to each other. Maybe you can use the CSV output to generate a list with name,type,OSM ID or something similar.

(12 Jul, 15:36) scai ♦
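
A minimal sketch of that offline approach in Python, assuming the names were exported as comma-separated CSV with the hypothetical columns `type`, `id` and `name`. The two code point ranges checked (CJK Unified Ideographs and Extension A) are an assumption about what counts as "accurate enough" here, and the sample rows are made up for illustration.

```python
import csv
import io

# Code point ranges treated as "Chinese" for this sketch (an assumption):
# CJK Unified Ideographs and CJK Unified Ideographs Extension A.
CJK_RANGES = [(0x4E00, 0x9FFF), (0x3400, 0x4DBF)]


def has_cjk(text):
    """Return True if any character in text falls inside CJK_RANGES."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in CJK_RANGES)


def rows_with_cjk_names(csv_text):
    """Yield (type, id, name) for each CSV row whose name contains CJK characters."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if has_cjk(row["name"]):
            yield row["type"], row["id"], row["name"]


# Made-up sample rows, only for illustration.
SAMPLE = "type,id,name\nnode,1,Itanagar\nway,2,雅鲁藏布江\nnode,3,Tawang\n"
MATCHES = list(rows_with_cjk_names(SAMPLE))
```

With a real export, the text read from the file would replace `SAMPLE`, and the resulting list of type and ID pairs could be worked through in an editor.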

I guess you could find some of them with a regular expression that tests for some common characters in names of villages.

(12 Jul, 16:59) escada
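
A rough sketch of that idea, written as a small Python helper that builds an Overpass QL query string. The character list and the bounding box are illustrative assumptions, not values from this thread, and the pattern only catches names that contain one of the listed characters, so it will miss others.

```python
# Characters that often appear in Chinese place names (an illustrative choice):
# 村 village, 河 river, 江 river, 湖 lake.
COMMON_CHARS = ["村", "河", "江", "湖"]


def build_query(chars, bbox):
    """Build an Overpass QL query for objects whose name contains any of chars.

    bbox is (south, west, north, east) in degrees.
    """
    pattern = "|".join(chars)
    south, west, north, east = bbox
    area = f"({south},{west},{north},{east})"
    return (
        "[out:json][timeout:60];\n"
        "(\n"
        f'  node["name"~"{pattern}"]{area};\n'
        f'  way["name"~"{pattern}"]{area};\n'
        f'  relation["name"~"{pattern}"]{area};\n'
        ");\n"
        "out tags center;"
    )


# Rough, hypothetical bounding box around Arunachal Pradesh.
QUERY = build_query(COMMON_CHARS, (26.5, 91.5, 29.5, 97.5))
```

Printing `QUERY` gives text that can be pasted into overpass-turbo; the bounding box should be adjusted to the area actually being checked.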

One of the Overpass-API developers runs a server with prototype support for ICU character ranges in regex:

https://www.openstreetmap.org/user/mmd/diary/40197

This makes the query straightforward:

http://overpass-turbo.eu/s/qlv

answered 12 Jul, 18:00

maxerickson

Osmose-QA has a check for this; see: http://osmose.openstreetmap.fr/en/map/#zoom=7&lat=27.858&lon=94.465&layer=Mapnik&overlays=FFFFFFFFFFFFFFFFFFFFT&item=5070&level=1%2C2%2C3&tags=&fixable=

The check matches the content of each name against its language, or, for the plain "name" tag, against the default for the country.

answered 14 Jul, 18:53

frodrigo
