The Protocol Buffers definition of StringTable in the PBF format declares its entries as repeated bytes:

message StringTable {
   repeated bytes s = 1;
}

To the best of my knowledge, this could just as easily be declared as:

message StringTable {
   repeated string s = 1;
}

Protocol Buffers already defines string to be UTF-8 encoded text (or its 7-bit ASCII subset). As it stands, I don't see any definition of how strings should be encoded in PBF, and that is a serious gap.

asked 26 Jul '13, 06:32


he_the_great

edited 26 Jul '13, 06:33


This is a very technical question and you are more likely to hear answers to this on the dev list (lists.openstreetmap.org/listinfo/dev).

(26 Jul '13, 08:49) Frederik Ramm ♦

From the Protobuf documentation it looks like "bytes" and "string" are treated almost the same everywhere. On the wire both are length-delimited fields, so they are encoded identically. Differences only show up when setting or getting the data, depending on the language used, because some languages have a dedicated type for Unicode strings. In C++ both map to std::string, so there is no difference between the two types there.

I don't know what the original reason was; maybe it was to avoid any UTF-8 validity check that the library might do internally. Either way, I don't see any real difficulty: all strings in OSM are always UTF-8, so these are, too. If that's not documented, it should be.
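To illustrate the point about the wire format, here is a minimal Python sketch that hand-encodes and decodes a StringTable without the protobuf library. The function names are made up for this example and are not part of any PBF tool. It assumes the entries are UTF-8, as argued above, and shows that the serialized form is the same whether the field is declared as bytes or string: a key byte, a length varint, then the raw payload.

```python
from typing import List, Tuple

# Field 1 with wire type 2 (length-delimited); identical for bytes and string.
STRING_TABLE_KEY = (1 << 3) | 2


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_string_table(entries: List[bytes]) -> bytes:
    """Serialize entries as repeated field 1 of a StringTable message."""
    out = bytearray()
    for entry in entries:
        out += encode_varint(STRING_TABLE_KEY)
        out += encode_varint(len(entry))
        out += entry
    return bytes(out)


def decode_string_table(buf: bytes) -> List[str]:
    """Parse a StringTable and decode every entry as UTF-8 (assumed encoding)."""
    pos = 0
    strings = []
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        if key != STRING_TABLE_KEY:
            raise ValueError("unexpected field key: %d" % key)
        length, pos = decode_varint(buf, pos)
        # Strict decoding raises UnicodeDecodeError on invalid UTF-8.
        strings.append(buf[pos:pos + length].decode("utf-8"))
        pos += length
    return strings


table = ["", "highway", "Köln"]
wire = encode_string_table([s.encode("utf-8") for s in table])
print(decode_string_table(wire))
```

With a bytes field the reader gets the raw payload and has to do the `.decode("utf-8")` step itself, which is why the encoding should be stated in the PBF documentation. With a string field, the library may do that step (and the validity check) for you, depending on the language.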


answered 28 Jul '13, 08:01


Jochen Topf
