8000 implement new runtime framework for bootstrapping localstack by thrau · Pull Request #10942 · localstack/localstack · GitHub
[go: up one dir, main page]

Skip to content

implement new runtime framework for bootstrapping localstack #10942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 12, 2024

Conversation

thrau
Copy link
Member
@thrau thrau commented Jun 3, 2024

Motivation

This is part of the effort to generalize certain aspects of the codebase to make them more re-usable (see #10946). One concept that was difficult to re-use was infra.py, which is responsible for starting what we now call the localstack Runtime. The reason is that it imports all sorts of aws-specific code.

The Runtime is roughly responsible for

  • running runtime lifecycle hooks
  • managing the localstack filesystem (creating directories and cleaning up afterwards)
  • clean up thread and process pools
  • serving the main Gateway

Another annoying thing with the previous runtime is that the server (either hypercorn or twisted), couldn't properly register signal handlers, such as a code reloader, because it can only do that from the main thread, and we were starting the server in a separate thread. Now the runtime server, i.e., the control/event loop of the underlying I/O framework, now (imo correctly) blocks the main thread.

💡 The primary change though is that you now have a dynamic and re-usable python -m localstack.runtime.main bootstrapping procedure, which follows the open-closed principle.

Usage

Here is an example to build a new localstack runtime distribution without needing to define a new main or new bootstrapping procedure. All you need to do is expose a Components plugin, and (provisionally) disable AWS plugins that the currently leaking from localstack-core using localstack/plux#22.

Then, in your distribution, you can python -m localstack.runtime.main, and instead of starting AWS, it will start the my-emulator.

from functools import cached_property

from localstack.runtime import components, hooks
from plux.runtime.filter import global_plugin_filter
from rolo import Response
from rolo.gateway import Gateway, HandlerChain, RequestContext


@hooks.on_runtime_create()  # <- new hook that this PR introduces
def _disable_aws_plugins():
    # disable unwanted plugins from localstack aws
    global_plugin_filter.add_exclusion(namespace="localstack.runtime.components", name="aws")
    global_plugin_filter.add_exclusion(namespace="localstack.aws.*")
    global_plugin_filter.add_exclusion(value="localstack.aws.*")
    global_plugin_filter.add_exclusion(value="localstack.services*")


class MyEmulatorComponents(components.BaseComponents):
    namespace = "localstack.runtime.components"
    name = "my-emulator"

    @cached_property
    def gateway(self) -> Gateway:
        def _dummy_handler(chain: HandlerChain, context: RequestContext, response: Response):
            chain.respond(200, "ok")

        return Gateway(request_handlers=[
            _dummy_handler
        ])

Architecture

This is a high-level diagram to help explain the components. It's an ad-hoc notation, I hope it helps to get the gist though 😅

localstack-runtime

Components

One critical aspect to add was an IoC mechanism to dynamically load, specifically the Gateway instance, but more generally a type of application context (now the Components).
We assume that, within a runtime, there is only one Components plugin. For instance, there are now AwsComponents which is a Components plugin. Within the context of the AWS emulator, only this plugin should be available and loaded.

RuntimeServer

The RuntimeServer represents the lower-level IO control loop that should block the main thread. In the case of twisted, this is the twisted reactor. In the case of hypercorn, this is the asyncio event loop.
There was no need to limit the RuntimeServer to serve only one Gateway instance. Basically an IO control loop should be able to serve multiple server sockets, so in the future we could serve TCP proxies for instance through the main control loop, rather than instantiating new threads. Thus, the register(Gateway, ...) method was conceived. Other than register, the RuntimeServer only has a run and a shutdown method.

Instantiation

Which type of RuntimeServer is instantiated is currently governed by config.GATEWAY_SERVER. The different server implementations are now registered as plugins instead of using static imports as we do in serve_gateway.

Global access

There are situations where you need global access to the runtime and runtime components (for instance in the in-memory client, we need access to the Gateway instance we inject the request into).
To this end, the runtime package provides a localstack.runtime.get_current_runtime() method, that represents global (or in the future: scoped) access to the current LocalstackRuntime object.

Bootstrapping

Bootstrapping (the process of creating a localstack runtime and starting it), is done by localstack.runtime.main. The idea is that this is generalized. Whether we start AWS, Snowflake, or Azure, we always run the generalized localstack.runtime.main, which gets its application context from the Components explained above.
Bootstrapping also involves setting the current runtime (as described above), and registering exit handlers. The runtime and subsequently the RuntimeServer's event loop, should now always block the main thread.

A special case is the in_memory_localstack.py for testing, which works similarly, only that it creates a new runtime running in a thread, which still works.

Changes

  • Introduce a new runtime that can be enabled with LEGACY_RUNTIME=0 (changed in the future)
  • The runtime server (twisted reactor or hypercorn eventloop) now blocks the main thread.
  • Integration tests already use the new runtime

Testing

  • Simply run make entrypoints and then python -m localstack.runtime.main - should already use the new runtime.
  • You can also test that different gateway servers are instantiated correctly by setting GATEWAY_SERVER=hypercorn

Limitations / Future work

  • We still have a lot of import errors when running the example in "Usage". This is mostly because of localstack.services.plugins which contains the Service and ServiceManager framework, but also imports from botocore. In a second step, we need to generalize the Service concept similar to what I've done in this PR.
  • There are a few provisional concepts like localstack.runtime.legacy that I don't like, but were necessary to decouple the new runtime from localstack.services.infra while maintaining backwards compatibility
  • I would like to remove the legacy runtime very soon, and start replacing global access to it with get_current_runtime(). maybe we can make the switch to the new runtime with 3.5, and then remove LEGACY_RUNTIME with 3.6

TODO

What's left to do:

  • Extend PR description
  • Greenify pipeline
  • Set default config back to LEGACY_RUNTIME (maybe?)

Copy link
github-actions bot commented Jun 3, 2024

S3 Image Test Results (AMD64 / ARM64)

  2 files    2 suites   3m 23s ⏱️
404 tests 352 ✅  52 💤 0 ❌
808 runs  704 ✅ 104 💤 0 ❌

Results for commit 7639225.

♻️ This comment has been updated with latest results.

@thrau thrau requested a review from alexrashed June 3, 2024 10:35
Copy link
github-actions bot commented Jun 3, 2024

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 35m 42s ⏱️ - 1m 44s
3 080 tests +6  2 735 ✅ +9  345 💤 ±0  0 ❌  - 3 
3 082 runs  +6  2 735 ✅ +9  347 💤 ±0  0 ❌  - 3 

Results for commit 7639225. ± Comparison against base commit de41959.

This pull request removes 2 and adds 8 tests. Note that renamed tests count towards both.
tests.aws.services.support.test_support.TestConfigService ‑ test_create_support_case
tests.aws.services.support.test_support.TestConfigService ‑ test_resolve_case
tests.aws.services.lambda_.test_lambda_api.TestPartialARNMatching ‑ test_cross_region_arn_function_access
tests.aws.services.lambda_.test_lambda_api.TestPartialARNMatching ‑ test_update_function_configuration_full_arn
tests.aws.services.stepfunctions.v2.services.test_aws_sdk_task_service.TestTaskServiceAwsSdk ‑ test_s3_put_object[bool]
tests.aws.services.stepfunctions.v2.services.test_aws_sdk_task_service.TestTaskServiceAwsSdk ‑ test_s3_put_object[dict]
tests.aws.services.stepfunctions.v2.services.test_aws_sdk_task_service.TestTaskServiceAwsSdk ‑ test_s3_put_object[list]
tests.aws.services.stepfunctions.v2.services.test_aws_sdk_task_service.TestTaskServiceAwsSdk ‑ test_s3_put_object[num]
tests.aws.services.stepfunctions.v2.services.test_aws_sdk_task_service.TestTaskServiceAwsSdk ‑ test_s3_put_object[str]
tests.aws.services.support.test_support.TestConfigService ‑ test_support_case_lifecycle

♻️ This comment has been updated with latest results.

@thrau thrau added the semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases label Jun 3, 2024
@thrau thrau changed the title implement new runtime framework to bootstrap localstack implement new runtime framework for bootstrapping localstack Jun 3, 2024
@thrau thrau force-pushed the runtime-experiments branch from 7cea57e to 1d9b1e1 Compare June 4, 2024 15:04
@thrau thrau force-pushed the runtime-experiments branch from 1d9b1e1 to c91a896 Compare June 6, 2024 12:45
@thrau
Copy link
Member Author
thrau commented Jun 6, 2024

current state: i think everything works, except i believe the runtime shuts down too eagerly and therefore a CFn cleanup isn't working correctly.

@thrau thrau force-pushed the runtime-experiments branch 2 times, most recently from 149a3b9 to 5104264 Compare June 9, 2024 13:08
@thrau thrau marked this pull request as ready for review June 9, 2024 13:28
@thrau thrau requested a review from dominikschubert as a code owner June 9, 2024 13:28
@alexrashed alexrashed added this to the 3.5 milestone Jun 10, 2024
Copy link
Member
@alexrashed alexrashed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! This is truly great, and an awesome push towards a proper separation between the runtime and the emulator specific code.
Currently, these parts are a bit intertwined, which you can see when you think about what code this PR supersedes:

  • localstack.aws.serving.*
  • localstack.aws.gateway

The separation into a Runtime, a RuntimeServer, as well as a Component is great!
However, I am not 100% sure if the current separation of concerns is the best way forward (but it's definitely a great step and this thinking should not block a merge of this PR):
You mentioned in the code that at some point it would be great to have the possibility to run multiple components in a single process (I know, it's quite far away from now).
These Component instances would then be served via different Gateway instances on a single RuntimeServer.
But the RuntimeServer is currently managed by the Component which basically means there's a 1:1 mapping between the Component and the RuntimeServer.
Instead it might be better to have the Runtime manage both, the Component instances as well as the RuntimeServer insteances and glue them together?

Another thing we need to think about is the deprecation path for the host mode (which is not supported anymore with the new runtime server).
Unfortunately, a quick and dirty GitHub search shows that there are quite a lot of usages out there in the wild which we need to phase out.
Maybe we could already include a deprecation notice for the usage of the host mode in the CLI in this PR?

Again, truly a great step forward! I'm really looking forward to separating the code into proper namespace packages! 🥳 🚀 🦸🏽

Comment on lines +14 to +19
# event flag indicating the infrastructure has been started and that the ready marker has been printed
# TODO: deprecated, use events.infra_ready
INFRA_READY = events.infra_ready

# event flag indicating that the infrastructure has been shut down
SHUTDOWN_INFRA = threading.Event()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think this is used anywhere (yet), could this maybe be removed?
Actually, I don't think there's even a usage across all repos in this org:
https://github.com/search?q=org%3Alocalstack+%22INFRA_READY%22&type=code
https://github.com/search?q=org%3Alocalstack+%22SHUTDOWN_INFRA%22&type=code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, I'll remove it as part of the cleanup I'm a second PR.

@thrau thrau force-pushed the runtime-experiments branch from 5104264 to 00bccf3 Compare June 11, 2024 21:25
@alexrashed
Copy link
Member

FYI: I pushed a small commit with a small adjustment to localstack.testing.pytest.in_memory_localstack

  • This module contains a pytest hook which starts LocalStack if the startup has not explicitly been disabled.
  • This hook has also been executed in "collect only" phases of LocalStack used in a preflight check.
    • This means that it unnecessarily started LocalStack in this check which should just collect tests.
    • We can see this for example in this run on master: https://app.circleci.com/pipelines/github/localstack/localstack/25701/workflows/b45d7ddd-e561-4c13-9501-61e8615d75e7/jobs/216512
      2024-06-12T12:00:26.374  WARN --- [  MainThread] localstack.dns.server      : DNS server did not come up within 5 seconds.
      Ready.
      2024-06-12T12:00:26.693  INFO --- [-functhread4] l.t.p.in_memory_localstack : stopping infra
      2024-06-12T12:00:26.693  INFO --- [  MainThread] l.t.p.in_memory_localstack : waiting for infra to stop
      2024-06-12T12:00:27.069  INFO --- [-functhread4] l.runtime.shutdown         : [shutdown] Stopping all services
      
  • Unfortunately, this hook is also executed in some preflight checks (in downstream dependencies using this module), which might not have the proper entrypoints available at that time.
  • With the new changes in this PR, this step then fails (because with this change LocalStack wouldn't even start without entrypoints because the components are discovered via the plugin mechanism).

All my change does is that it fixes the hook to avoid starting LocalStack if we are in a "collect only" session.

Copy link
Member
@dominikschubert dominikschubert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome progress and I love the much cleaner way of "how to bootstrap an e2e emulator instance".
Great attention to deprecations and I like that even the in memory client has been considered from the start 🚀

There are some lower hanging fruits for follow-ups that we should get in sooner rather than later, especially starting the deprecation warnings and communication around that, so that we can phase them out with 4.0.

The config also seems to become more critical again. Maybe we could do a first step where we split the config into "global" and component config, so that we can then use dynamic component loading based on plugin name as mentioned in one of the comments.

I had a few nits but nothing that would block a merge and couldn't be handled in a follow-up PR, if at all.

Comment on lines +45 to +46
image_sha = inspect_result["Image"]
print("LocalStack Docker image sha: %s" % image_sha)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be adapted for #10951 but it's not critical and can be done in a follow-up, just shouldn't forget about it 👍

"""Hooks that are executed right before the LocalstackRuntime is created. These can be used to apply
patches or otherwise configure the interpreter before any other code is imported."""

on_runtime_start = on_infra_start
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's start deprecating this asap so that we can remove it with 4.0 at least

Comment on lines +47 to +49
# avoid starting up localstack if we only collect the tests (-co / --collect-only)
if session.config.option.collectonly:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for adding this. I think I had this in at least one local branch somewhere already when playing around more with filtering tests. 😅

ssl_creds: tuple[str, str] | None = None,
):
"""
Registers the Gateway and the port configuration into the server. Some servers like ``twisted`` or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might it not make sense to word this a bit differently then: Attempts to... to signal that this could also not work depending on the used RuntimeServer plugin?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you have a particular potential RuntimeServer implementation in mind that would not allow this? Otherwise I guess let's just leave it as is and assume we can always add multiple gateways

Comment on lines +47 to +48
# TODO: rename to RUNTIME_SERVER
server_type = config.GATEWAY_SERVER
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: That comment should probably go into config.py

def get_current_runtime() -> "LocalstackRuntime":
with _runtime_lock:
if not _runtime:
raise ValueError("LocalStack runtime has not yet been set")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure a ValueError is necessarily the clearest Exception here.

What's the reason for not just returning an optional/None here? Downstream typing having to handle the None?

- Manage localstack filesystem directories
- Execute runtime lifecycle hook plugins from ``localstack.runtime.hooks``.
- Manage the localstack SSL certificate
- Serve the gateway (It uses a ``RuntimeServer`` to serve a ``Gateway`` instance coming from the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: What's the reason by the way why we don't just keep calling it the GatewayServer, like in the (now deprecated) config variable?

Creates a new runtime instance from the current environment. It uses a plugin manager to resolve the
necessary components from the ``localstack.runtime.components`` plugin namespace to start the runtime.

TODO: perhaps we could control which components should be instantiated with a config variable/constant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just directly map the environment variable to the name of the components plugin.
There's a few other implications there though that have been addressed in other code comments here like the need for a scoped config per component for example. (i.e. the config being part of the components plugin and thus only instantiated with them).

Comment on lines +207 to +211
if len(components) > 1:
LOG.warning(
"There are more than one component plugins, using the first one which is %s",
components[0].name,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Feels a bit too lenient for me. I'd personally prefer it to raise in that case. 🤷‍♂️

@thrau thrau merged commit 614dc96 into master Jun 12, 2024
37 checks passed
@thrau thrau deleted the runtime-experiments branch June 12, 2024 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0