The OSM source data contains many thousands of objects that lie so close to another object of the same class that the arrangement is unnatural (impossible) in the real world. For illustration, see one example (of thousands) taken from the river-bank data layer in the latest dump: https://goo.gl/vpxQeQ. Of course, most solid data-preparation programs detect these anomalies, yet a dilemma remains, and that is my question: which of those "close" objects should one choose? There are many options: keep the larger one, the one with more detail (nodes), the latest, both, and so on.
asked 27 Sep '17, 14:53 sanser
In addition to what SomeoneElse has said: Our goal is to deliver quality data curated by humans. OpenStreetMap is not a dump where everyone can throw in some public data sets and then users (like yourself) somehow have to pick which of three overlapping river outlines they want to use. What you see here is a badly executed import in an area that has too few active mappers to notice and repair the problem. The import is 6 years old; today, we have much stricter import guidelines that would hopefully avert such disasters. Data as bad as this is not "normal" for OSM, and must never become so.
answered 27 Sep '17, 15:43 Frederik Ramm ♦
Thanks to you both for the quick answers. Implicitly, there are two suggestions: one, ignore all replications, and two, don't do anything (wait for human corrections). In my opinion both are extremes and hardly acceptable options for vector-based OSM GIS and digital map makers. The first would result in too many river breaks and empty places (the distribution of the rivers' geometry-related anomalies is presented here https://goo.gl/CaHTgz, where there are over ten thousand replication markers). The second results in too many logical anomalies in rendering, like the replication from the link in the question and a huge number of missing/overwritten islands like here https://osm.org/go/9DnlbY, not to mention formal traps in GIS apps, data generalisation, and so on. By the way, I am selecting the one with the most detail (node points) from the replicated (outer or inner) border polygons. But my question/dilemma remains open: is this the right decision?
(30 Sep '17, 10:08)
sanser
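The selection rule described in the comment above (keep the candidate with the most nodes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not sanser's actual tooling: each polygon ring is assumed to be a plain list of (lon, lat) tuples, and "near-duplicate" is approximated by a hypothetical bounding-box overlap threshold (`min_iou`), which a real pipeline would likely replace with a proper geometric comparison. All function names are made up for this example.

```python
from itertools import combinations


def bbox(ring):
    """Bounding box (min_x, min_y, max_x, max_y) of a ring of (lon, lat) tuples."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_iou(a, b):
    """Intersection-over-union of two bounding boxes (0.0 if they do not overlap)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    iw = min(ax1, bx1) - max(ax0, bx0)
    ih = min(ay1, by1) - max(ay0, by0)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union


def group_near_duplicates(rings, min_iou=0.8):
    """Group ring indices whose bounding boxes nearly coincide (union-find).

    min_iou is an assumed, tunable threshold, not a value from the thread.
    """
    parent = list(range(len(rings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    boxes = [bbox(r) for r in rings]
    for i, j in combinations(range(len(rings)), 2):
        if bbox_iou(boxes[i], boxes[j]) >= min_iou:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(rings)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def pick_most_detailed(rings, min_iou=0.8):
    """Per group of near-duplicates, keep the index of the ring with the most nodes."""
    groups = group_near_duplicates(rings, min_iou)
    return sorted(max(g, key=lambda i: len(rings[i])) for g in groups)
```

Note that the pairwise comparison is quadratic in the number of rings; for a full planet extract one would presumably pre-filter candidates with a spatial index. The sketch also says nothing about whether "most nodes" is the right criterion, which is exactly the open question.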
The third option, obviously, is: Help fix the problem! Sometimes individual imports can be identified that caused the problem, and the data from those imports can then simply be removed. If that is not the case, or proves unworkable since it would remove too many good data sets, then the second-best option is to use a QA tool to help mappers fix the problems. Since you seem to have identified many problematic locations already, you could think about creating a "MapRoulette" task that would point mappers to the areas in question, so that they can be fixed one by one. Don't assume the OSM community has no interest in fixing the issue - they might simply not be aware of it. You can help!
(30 Sep '17, 10:33)
Frederik Ramm ♦
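As a rough sketch of how a list of already-identified problem locations might be prepared for such a QA task: the snippet below writes them out as a GeoJSON FeatureCollection of points. It assumes the locations are available as (lon, lat, note) tuples and that the QA tool accepts GeoJSON input, which should be checked against the tool's own documentation. The function name, the `note` property, and the note text are hypothetical; the sample coordinate is the one from the OSM link in this thread.

```python
import json


def to_geojson(markers):
    """Build a GeoJSON FeatureCollection from (lon, lat, note) tuples.

    GeoJSON stores coordinates in [longitude, latitude] order.
    """
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"note": note},  # hypothetical property name
        }
        for lon, lat, note in markers
    ]
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Location taken from the openstreetmap.org link in this thread.
    markers = [(-0.93791, 46.94866, "near-duplicate river outlines")]
    print(json.dumps(to_geojson(markers), indent=2))
```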
Aha - I think I misunderstood the question that you're asking. From the perspective of a data consumer who doesn't get the option to remap, but only to pick and choose what they use, then yes, there will be occasions when you need to not use a certain class of data from a particular source. For example, at one point I dropped leisure=common from a rendering I maintain because it was frequently misused at the time. I can think of another example where someone who needed an urban/rural divide used a non-OSM source on a mostly OSM map because getting that data from OSM was considerably more cumbersome.
(30 Sep '17, 10:38)
SomeoneElse ♦
@sanser, I am the author of MapRoulette and am happy to help you set up a challenge to crowdsource the cleanup of these almost-duplicates. Send me an OSM message or email me at m@rtijn.org and we can take it from there if you are interested. An example of another MapRoulette 'cleanup' challenge that is currently running is here: http://maproulette.org/map/2774 -- cleaning up non-existent schools from an old import in the United States. Thanks, Frederik, for pointing me to this discussion.
(30 Sep '17, 19:30)
mvexel
The linked area in OSM https://www.openstreetmap.org/#map=19/46.94866/-0.93791 shows a bunch of duplicate crap river imports in France. So in this case - none of them. Look at the best sources you can find (survey it yourself if you can) and the best imagery, and map it properly. It would also help to comment on the importing changesets, explaining the problems that occurred.
answered 27 Sep '17, 14:59 SomeoneElse ♦