Fast eventcounter etags for geoextracts #1657

brontolosone · 2025-10-26T16:23:52Z

Tests won't succeed until #1654 is merged.

Information leakage through ETags

Yes, we could hash (or deterministically obfuscate otherwise) the counter value instead of using it verbatim in the ETag.

I chose not to.

It'd be weird to be authorized to see a resource, yet not be authorized to know how many events have affected it since you last looked. I also don't think authorized users can do anything bad once they guess or deduce that the ETag is counter-derived: the fact that it's the string representation of a counter is immaterial to how it's processed for revalidation (opaquely, not as a number, and certainly not as a number with counter semantics).
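To illustrate what "opaquely" means here, a minimal sketch of the weak comparison that HTTP defines for If-None-Match. The function names are made up for this example and are not taken from the codebase, and the comma split is a simplification (it would mis-parse a tag that itself contains a comma).

```javascript
// Strip the optional weak prefix; what remains is the quoted opaque-tag.
function opaqueTag(etag) {
  return etag.startsWith('W/') ? etag.slice(2) : etag;
}

// Weak comparison: two tags match when their opaque-tags are identical strings.
// No parsing, no numeric interpretation.
function weakMatch(a, b) {
  return opaqueTag(a) === opaqueTag(b);
}

// ifNoneMatch is the raw request header value, e.g. 'W/"20220"' or '*'.
// Simplification: splitting on commas assumes no tag contains a comma.
function isFresh(ifNoneMatch, currentEtag) {
  if (ifNoneMatch.trim() === '*') return true;
  return ifNoneMatch
    .split(',')
    .map((tag) => tag.trim())
    .some((tag) => weakMatch(tag, currentEtag));
}
```

Under this comparison `W/"020220"` does not match `W/"20220"`, and a "lower" or "higher" value means nothing: a client that learns the counter gains no extra leverage over revalidation.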

Benchmarks:

10,000-feature GeoJSON body
response is 64 KB compressed
response is 2223 KB uncompressed (this is what the DB needs to shovel out to Node if the response body has to be recomputed in order to revalidate)
benched with hey, concurrency 1, 1000 requests, with the If-None-Match request header set to the ETag of the collection. All responses are 304s.

Before, old style: 17 revalidation requests/second
After this PR: 478 revalidation requests/second

So roughly 28x 🥳 (477.7 / 16.7 ≈ 28.6)
The difference will only grow with a slower DB and/or heavier contention.

Benchmark logs

hey, oldstyle

hey -c 1 -n 1000 -m GET -t 0 -T 'application/json' -H 'Authorization: Bearer thetoken' -H 'If-None-Match: W/"22bda5-We1sERCWxUOQw5sNcOfxqZvkO0c"' http://127.0.0.1:8383/v1/projects/1/forms/geotest/submissions.geojson
Summary:
Total:	59.7844 secs
Slowest:	0.1203 secs
Fastest:	0.0560 secs
Average:	0.0598 secs
Requests/sec:	16.7268

Response time histogram:
0.056 [1]	|
0.062 [959]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.069 [27]	|■
0.075 [0]	|
0.082 [2]	|
0.088 [3]	|
0.095 [6]	|
0.101 [0]	|
0.107 [0]	|
0.114 [0]	|
0.120 [2]	|

Latency distribution:
10% in 0.0577 secs
25% in 0.0584 secs
50% in 0.0593 secs
75% in 0.0602 secs
90% in 0.0612 secs
95% in 0.0622 secs
99% in 0.0870 secs

Details (average, fastest, slowest):
DNS+dialup:	0.0000 secs, 0.0560 secs, 0.1203 secs
DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
req write:	0.0000 secs, 0.0000 secs, 0.0003 secs
resp wait:	0.0597 secs, 0.0559 secs, 0.1197 secs
resp read:	0.0001 secs, 0.0000 secs, 0.0002 secs

Status code distribution:
[304]	1000 responses

hey, newstyle

hey -c 1 -n 1000 -m GET -t 0 -T 'application/json' -H 'Authorization: Bearer thetoken' -H 'If-None-Match: W/"20220"' http://127.0.0.1:8383/v1/projects/1/forms/geotest/submissions.geojson
Summary:
Total:	2.0935 secs
Slowest:	0.0335 secs
Fastest:	0.0012 secs
Average:	0.0021 secs
Requests/sec:	477.6793

Response time histogram:
0.001 [1]	|
0.004 [987]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.008 [11]	|
0.011 [0]	|
0.014 [0]	|
0.017 [0]	|
0.021 [0]	|
0.024 [0]	|
0.027 [0]	|
0.030 [0]	|
0.034 [1]	|

Latency distribution:
10% in 0.0013 secs
25% in 0.0013 secs
50% in 0.0018 secs
75% in 0.0026 secs
90% in 0.0031 secs
95% in 0.0034 secs
99% in 0.0045 secs

Details (average, fastest, slowest):
DNS+dialup:	0.0000 secs, 0.0012 secs, 0.0335 secs
DNS-lookup:	0.0000 secs, 0.0000 secs, 0.0000 secs
req write:	0.0000 secs, 0.0000 secs, 0.0003 secs
resp wait:	0.0021 secs, 0.0012 secs, 0.0329 secs
resp read:	0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
[304]	1000 responses

What has been done to verify that this works as intended?

Manual testing.

Why is this the best possible solution? Were any other approaches considered?

For the engineering background, see getodk/central#1439.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

N/A

Does this change require updates to the API documentation? If so, please update docs/api.yaml as part of this PR.

N/A

Before submitting this PR, please make sure you have:

run make test and confirmed all checks still pass OR confirm CircleCI build passes
verified that any code from external sources are properly credited in comments or that everything is internally sourced

matthew-white · 2025-10-28T05:19:13Z

lib/resources/geo-extracts.js

+
+    const acteeVersion = await Actees.getEventCount(foundDataset.acteeId);
+    // Weak etag, as the order in the resultset is undefined.
+    return withEtag(acteeVersion, createResponse, true);
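
For readers who don't have the rest of the diff at hand, a rough sketch of the general idea behind counter-based revalidation. This is not the PR's `withEtag` or `Actees.getEventCount`; `respondWithCounterEtag`, its parameters, and the response shape are hypothetical, and the header check is reduced to a single exact-match tag.

```javascript
// Hypothetical sketch: derive a weak ETag from a cheap event counter and
// only build the expensive response body when the client's tag is stale.
async function respondWithCounterEtag(request, getEventCount, computeBody) {
  const count = await getEventCount(); // cheap: one counter lookup
  const etag = `W/"${count}"`;

  // Simplification: assumes If-None-Match carries exactly one tag.
  if (request.headers['if-none-match'] === etag) {
    // Nothing changed since the client last looked; the body is never computed.
    return { status: 304, headers: { ETag: etag }, body: null };
  }

  const body = await computeBody(); // expensive: the full GeoJSON extract
  return { status: 200, headers: { ETag: etag }, body };
}
```

The point of the sketch is only the ordering: the counter is read first, so a revalidation that ends in a 304 never touches the query that produces the response body.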


Is it expected that the ETag doesn't change when query parameters change? I think so, just wanted to check.

+1 subscribing to this query

Very much expected, yes. In HTTP caching semantics, (with a few caveats, most prominently having to do with the Vary response header), the identity of a resource is the URL, including the query parameters.

If a browser sets an If-None-Match on a request for resource with identity A to the ETag received in an earlier response for a resource with identity B, then that'd be a serious bug (in the browser) ;-)

The fact that query parameters are part of the resource identity is actually even exploited for certain so-called "cache busting" approaches.

The downside is that the order of parameters in the query matters. So two URLs that effectively deliver the same data, as they have the same meaning for the application (eg ?offset=10&limit=20 vs ?limit=20&offset=10) are distinct resources to HTTP caches, and they don't reuse the cache of the one for the other. They fortunately don't as they absolutely shouldn't, because they don't know the application semantics!
I would be well within my rights to write an application that does something completely different for ?a=1&b=2 vs ?b=2&a=1, and HTTP caching should still work.

(quoting myself)

The downside is that the order of parameters in the query matters. So two URLs that effectively deliver the same data, as they have the same meaning for the application (eg ?offset=10&limit=20 vs ?limit=20&offset=10) are distinct resources to HTTP caches, and they don't reuse the cache of the one for the other. They fortunately don't as they absolutely shouldn't, because they don't know the application semantics!

So, to expand on that a bit, corollary:

If you want to share a cache of computation results between requests to ?offset=10&limit=20 and ?limit=20&offset=10, you can't really do that with HTTP caching semantics. Intermediate caching proxies don't want to presume that these are effectively the same to your application, and there's no way to tell them (or indeed the browser cache) otherwise.†
The component best situated to understand the application semantics is... the application! surprise!
So, when there is a desire to share a cache between ?offset=10&limit=20 and ?limit=20&offset=10, one would do (potentially additional) caching at the application. For us that'd mean we would, all "from nodejs", compute the result, come up with a caching key (a component of which in this case would be a normalized form of the query parameters), and then store the result in something like redis or memcached (or even just plain postgresql, or files in the filesystem). That's quite a common setup!
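
A minimal sketch of such an application-side cache key, assuming the application really does treat parameter order as insignificant. `cacheKey`, `cachedCompute`, and the in-memory `Map` are illustrative stand-ins (the `Map` for redis, memcached, a table, or files); `version` stands for whatever invalidates an entry, for example an event count.

```javascript
// Build a cache key in which the order of query parameters does not matter.
function cacheKey(url, version) {
  const u = new URL(url, 'http://localhost'); // base is only used for relative URLs
  u.searchParams.sort(); // stable sort by parameter name
  return `${u.pathname}?${u.searchParams.toString()}#v${version}`;
}

const cache = new Map(); // stand-in for an external store

// Compute the result once per normalized key, then serve it from the cache.
function cachedCompute(url, version, compute) {
  const key = cacheKey(url, version);
  if (!cache.has(key)) cache.set(key, compute());
  return cache.get(key);
}
```

With this, `/x?offset=10&limit=20` and `/x?limit=20&offset=10` share one entry, while to HTTP caches they remain two distinct resources.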

†) Although, nginx accommodates "bring your own caching key" setups. But that's in a reverse proxy role where you have knowledge of the application semantics.
Forward caching HTTP proxies such as Squid (largely outmoded because everything has become E2E TLS in the last 10-15 years) can maybe not even be configured to bend the identity rules.

In HTTP caching semantics, (with a few caveats, most prominently having to do with the Vary response header), the identity of a resource is the URL, including the query parameters.

👍 makes sense to me

Same, this makes sense to me.

I don't think we need to do anything special at this point related to the order of query parameters.

lognaturel · 2025-11-24T19:46:17Z

Moving back to draft while considering #1654 (comment)

brontolosone requested a review from alxndrsn October 26, 2025 16:24

brontolosone marked this pull request as ready for review October 26, 2025 16:24

matthew-white mentioned this pull request Oct 27, 2025

Show feature count while map loads getodk/central#1432

Closed

matthew-white reviewed Oct 28, 2025

brontolosone force-pushed the 1439_fast-eventcounter-etags-for-geoextracts_pr branch from 40eca0f to 42856a9 Compare October 28, 2025 15:38

alxndrsn reviewed Oct 29, 2025

brontolosone mentioned this pull request Oct 29, 2025

Event counter migrations, towards getodk/central#1439 #1654

Draft

2 tasks

brontolosone moved this to 🕒 backlog in ODK Central Nov 1, 2025

brontolosone added this to ODK Central Nov 1, 2025

brontolosone mentioned this pull request Nov 1, 2025

Revisit caching headers and etags for our endpoints getodk/central#1465

Open

brontolosone added 2 commits November 4, 2025 15:24

allow for generation of weak etags

2f33fd5

eventcounter-etag-based revalidation for geodata endpoints

b56d56c

brontolosone force-pushed the 1439_fast-eventcounter-etags-for-geoextracts_pr branch from 42856a9 to b56d56c Compare November 4, 2025 15:27

lognaturel marked this pull request as draft November 24, 2025 19:46
