add caching for service catalog indexes #6672
Merged
This PR adds caching of the indexes created in the `ServiceCatalog`.

## Problem
The `ServiceCatalog` is what powers the `ServiceNameParser` by providing indexes that allow fast lookup from service indicators to services (e.g., finding a service by its signing name or endpoint prefix). Creating the indexes involves loading all 300-something services and iterating over all 11k-something operations, which takes around 1500 ms on my machine. This happens the first time an HTTP call triggers the `ServiceNameParser`, making that first call take at least 1500 ms.
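For context, here's a minimal sketch of what building such an index involves, using botocore's loader directly. The function name and the choice of a signing-name index are illustrative, not the actual `ServiceCatalog` code:

```python
from collections import defaultdict

from botocore.loaders import create_loader


def build_signing_name_index() -> dict[str, list[str]]:
    """Illustrative sketch: map signing names to service names by loading
    every botocore service model (the expensive step this PR caches)."""
    loader = create_loader()
    index: dict[str, list[str]] = defaultdict(list)
    for service in loader.list_available_services("service-2"):
        metadata = loader.load_service_model(service, "service-2")["metadata"]
        # signingName is optional in the model and falls back to endpointPrefix
        signing_name = metadata.get("signingName", metadata["endpointPrefix"])
        index[signing_name].append(service)
    return dict(index)
```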
## Solution

@alexrashed raised some concerns in the past about shipping a pre-built index with the distribution, which was a valid point since that would involve an error-prone build step. Instead, I opportunistically create the cache and store it in the cache directory of the localstack volume. I restructured the service catalog a bit to separate concerns (providing access to the index vs. actually holding the data), and I added both the localstack and botocore version identifiers to the file name so the cache is re-created on new versions.
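A minimal sketch of that opportunistic caching scheme; the cache directory, the version constant, and serialization via pickle are assumptions for illustration, not necessarily what the PR does:

```python
import os
import pickle

import botocore

# assumptions for illustration: the actual cache location and the way the
# localstack version is obtained may differ in the PR
CACHE_DIR = "/var/lib/localstack/cache"
LOCALSTACK_VERSION = "1.0.0"


def cache_file_path() -> str:
    # both version identifiers in the file name ensure the cache is
    # re-created after a localstack or botocore upgrade
    return os.path.join(
        CACHE_DIR,
        f"service-catalog-{LOCALSTACK_VERSION}-{botocore.__version__}.pickle",
    )


def load_or_build_index() -> dict[str, list[str]]:
    path = cache_file_path()
    if os.path.exists(path):
        with open(path, "rb") as fd:
            return pickle.load(fd)  # fast path: ~8 ms
    index = build_signing_name_index()  # slow path: ~1500 ms (sketch above)
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "wb") as fd:
        pickle.dump(index, fd)
    return index
```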
## Result
Loading the cache takes about 8 ms, vs. roughly 1500 ms for creating the index lazily. This has a dramatic effect on cold starts: the first invocation of a service now only loads that service's model, which is much faster (previously we would also load all service models to build the index).
Here's me executing `awslocal opensearch list-domains`. "process" is the end-to-end server latency (a timer around `Gateway.process`), and `by_signing_name` is what's being executed to look up the service.

Without caching:
With caching:
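For reference, a rough sketch of the kind of timer behind these numbers. `Gateway.process` and `by_signing_name` are the names from the PR; everything else here is an assumption:

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str):
    # print the wall-clock duration of the wrapped block in milliseconds
    start = time.perf_counter()
    try:
        yield
    finally:
        print(f"{label}: {(time.perf_counter() - start) * 1000:.1f} ms")


# assumption: an index built as in the earlier sketches; "es" is the
# signing name the opensearch call above would be looked up by
index = load_or_build_index()
with timed("by_signing_name"):
    services = index.get("es", [])
```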
I also tried cold-starting several different services in a row. There's a measurable effect only for the first one, but it's a dramatic one: a 10x speedup for the first cold start.
## Drawbacks/Limitations
The index obviously has to be computed at some point. Previously we would do that on the first HTTP request of every localstack run; now we do it once, when localstack starts up for the very first time, and any subsequent run has no startup delay. This also means the PR has no effect on ephemeral instances of localstack (e.g., in CI environments). For that we would have to ship the index with the distribution.