Does anyone know where to find proper documentation that walks through how OSM pulls its minutely diff files from the API database?
I know the documentation states that they use osmosis --read-apid. Warning: this gets a little into the weeds, but I am trying to find the proper channel to ask my question.
Looking at the Osmosis source code, this is the query passed to the DB:
(CREATE TEMPORARY TABLE tmp_nodes ON COMMIT DROP AS SELECT node_id, version FROM nodes WHERE (((xid_to_int4(xmin) >= ## AND xid_to_int4(xmin) <= ##))) AND redaction_id IS NULL;)
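To make the shape of that statement concrete, here is a minimal Python sketch that fills in the two ## placeholders. The function name and the bounds 1000 and 1050 are my own illustration, not taken from Osmosis; I am only assuming the placeholders are the lower and upper transaction IDs for the interval being extracted.

```python
def build_tmp_nodes_query(xid_min: int, xid_max: int) -> str:
    """Return the node-selection statement with both ## placeholders filled in.

    Hypothetical helper for illustration only; it just formats the SQL
    text quoted above and does not talk to a database.
    """
    if xid_min > xid_max:
        raise ValueError("xid_min must not exceed xid_max")
    return (
        "CREATE TEMPORARY TABLE tmp_nodes ON COMMIT DROP AS "
        "SELECT node_id, version FROM nodes "
        f"WHERE (((xid_to_int4(xmin) >= {int(xid_min)} "
        f"AND xid_to_int4(xmin) <= {int(xid_max)}))) "
        "AND redaction_id IS NULL;"
    )


# Made-up transaction ID range, purely to show the resulting statement.
query = build_tmp_nodes_query(1000, 1050)
print(query)
```

The point is that the filter is on an expression over xmin for every row in nodes, which is what prompts my question about how this stays fast.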
I don't see how OSM is able to read the nodes table and filter it down by transaction ID in under a minute. Scaling up from my small API DB, the nodes table in the OSM production API DB would seem to have to be around 350 GB on its own.
The data shows that the main server housing the API DB has 252 GB of RAM. That wouldn't be enough to read the entire nodes table into RAM, so I must be missing something.
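Here is the back-of-the-envelope arithmetic behind that concern as a small Python sketch. The 350 GB and 252 GB figures are the ones above; the scan throughput of 1000 MB/s is a number I made up for illustration, not a measurement of the production server.

```python
NODES_TABLE_GB = 350           # my extrapolated estimate of the nodes table size
SERVER_RAM_GB = 252            # RAM reported for the server housing the API DB
ASSUMED_SCAN_MB_PER_S = 1000   # hypothetical sequential read rate, for illustration


def full_scan_seconds(table_gb: float, mb_per_s: float) -> float:
    """Time to read the whole table once at a steady rate (1 GB = 1000 MB)."""
    return table_gb * 1000 / mb_per_s


# How much of the table could not be held in RAM even if nothing else used it.
shortfall_gb = NODES_TABLE_GB - SERVER_RAM_GB
scan_seconds = full_scan_seconds(NODES_TABLE_GB, ASSUMED_SCAN_MB_PER_S)

print(f"Table exceeds RAM by {shortfall_gb} GB")
print(f"One full scan at the assumed rate: {scan_seconds:.0f} s")
```

Under those assumptions a single pass over the table takes several minutes, which is why a full read every minute does not look feasible to me.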
If anyone has an idea how this is accomplished or where it is documented, I would be really interested in hearing about it.
asked 15 Feb, 16:06