OpenSearch: HTTP proxy support for plugin installation by viren-nadkarni · Pull Request #11723 · localstack/localstack · GitHub

OpenSearch: HTTP proxy support for plugin installation #11723

Merged
merged 6 commits into from
Oct 24, 2024

Conversation

viren-nadkarni
Member
@viren-nadkarni viren-nadkarni commented Oct 22, 2024

Background

LocalStack provides the OUTBOUND_HTTP_PROXY and OUTBOUND_HTTPS_PROXY env config options to set the HTTP/HTTPS proxy. These config options are mostly obeyed within the codebase, but certain external invocations do not respect them. Notable among them is ElasticSearch/OpenSearch. These are Java applications and require separate configuration.

Changes

This PR ensures that the relevant ElasticSearch/OpenSearch components that directly access the internet are passed the proper proxy settings.

The values in OUTBOUND_HTTP_PROXY and OUTBOUND_HTTPS_PROXY are translated to Java network system properties and passed to invocations of plugin managers.
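The translation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `proxy_system_properties` and `to_cli_args` are hypothetical helper names, while `http.proxyHost`, `http.proxyPort` and their `https.` counterparts are the standard Java networking properties.

```python
def proxy_system_properties(scheme: str, host: str, port: int) -> dict[str, str]:
    """Build the standard Java networking properties for one proxy scheme.

    Hypothetical helper for illustration; not taken from the LocalStack codebase.
    """
    return {
        f"{scheme}.proxyHost": host,
        f"{scheme}.proxyPort": str(port),
    }


def to_cli_args(properties: dict[str, str]) -> list[str]:
    """Render system properties as -Dkey=value arguments for a JVM invocation."""
    return [f"-D{key}={value}" for key, value in properties.items()]


# Example values matching a proxy listening on localhost:8080.
props = {
    **proxy_system_properties("http", "localhost", 8080),
    **proxy_system_properties("https", "localhost", 8080),
}
args = to_cli_args(props)
print(args)
```

The resulting list can be appended to whatever mechanism the target tool uses to receive JVM options; how each plugin manager accepts them is outside the scope of this sketch.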

Furthermore, the PEM CA bundle, if specified in REQUESTS_CA_BUNDLE, is converted to a JKS TrustStore and passed to the invocations. This ensures that SSL connections between the plugin managers and the proxy server succeed.
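One possible shape for the PEM to JKS conversion is sketched below. It only builds the `keytool` command lines and the matching `javax.net.ssl.*` properties; it does not run anything. The helper names, file paths, aliases and password are all placeholders, and splitting the bundle into one import per certificate is an assumption of this sketch rather than a description of what the PR does.

```python
import re

# Matches each PEM-encoded certificate inside a bundle.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_bundle(bundle_text: str) -> list[str]:
    """Return the individual certificates found in a PEM bundle."""
    return PEM_CERT.findall(bundle_text)


def keytool_import_cmd(
    keytool: str, cert_file: str, alias: str, store_path: str, store_passwd: str
) -> list[str]:
    """Build a keytool command that imports one trusted certificate into a JKS store."""
    return [
        keytool, "-importcert", "-noprompt", "-trustcacerts",
        "-alias", alias,
        "-file", cert_file,
        "-keystore", store_path,
        "-storetype", "JKS",
        "-storepass", store_passwd,
    ]


def truststore_properties(store_path: str, store_passwd: str) -> dict[str, str]:
    """Java system properties that point a JVM at the generated trust store."""
    return {
        "javax.net.ssl.trustStore": store_path,
        "javax.net.ssl.trustStorePassword": store_passwd,
        "javax.net.ssl.trustStoreType": "jks",
    }


# Placeholder bundle with two dummy certificate bodies.
bundle = (
    "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
    "-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n"
)
certs = split_pem_bundle(bundle)
commands = [
    keytool_import_cmd("keytool", f"/tmp/ca-{i}.pem", f"ca-{i}", "/tmp/truststore.jks", "changeit")
    for i in range(len(certs))
]
```

Each certificate would need to be written to its own file before the corresponding command is executed with the JRE's `keytool`.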

The execution of ElasticSearch/OpenSearch itself is not changed; I'm assuming that they will not connect to the internet during normal operation.

Tests

This PR includes unit tests for the code related to system properties.

The following methodology was used to test the actual proxy scenario:

  • Run mitmproxy. This starts an HTTP/HTTPS proxy at http://127.0.0.1:8080
  • Start LocalStack with the following config:
    • OUTBOUND_HTTP_PROXY=http://localhost:8080
    • OUTBOUND_HTTPS_PROXY=http://localhost:8080
    • REQUESTS_CA_BUNDLE=/home/viren/.mitmproxy/mitmproxy-ca-cert.pem because mitmproxy uses its own self-signed certificates (this config option is actually part of the Requests Python library but we piggyback on it)
  • Call the OpenSearch CreateDomain operation (with both Elasticsearch_7.10 and OpenSearch_1.1 as engine versions). The step where ElasticSearch/OpenSearch installs the plugins is recorded in the mitmproxy terminal. The download of ElasticSearch/OpenSearch itself is not seen in the screenshot because it was cached.
  • This validates that all network connections made by the plugin managers are proxied correctly

[Screenshot: mitmproxy terminal recording the plugin installation requests]

@viren-nadkarni viren-nadkarni self-assigned this Oct 22, 2024
github-actions bot commented Oct 22, 2024

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 42m 44s ⏱️ -37s
3 509 tests ±0  3 096 ✅ ±0  413 💤 ±0  0 ❌ ±0 
3 511 runs  ±0  3 096 ✅ ±0  415 💤 ±0  0 ❌ ±0 

Results for commit 6de0ca7. ± Comparison against base commit 56e0b77.

♻️ This comment has been updated with latest results.

@viren-nadkarni viren-nadkarni added the "semver: minor" label (Non-breaking changes which can be included in minor releases, but not in patch releases) Oct 22, 2024
@viren-nadkarni viren-nadkarni changed the title from "Opensearch: Use proxy config during plugin installation" to "OpenSearch: HTTP proxy support for plugin installation" Oct 22, 2024
@viren-nadkarni viren-nadkarni marked this pull request as ready for review October 23, 2024 05:47
Member
@alexrashed alexrashed left a comment

Wow! Great digging, great test setup with mitmproxy, and a great implementation! Thanks a lot for jumping on this!🦸🏽
I only added two questions (for future iterations) and a nitpick comment on how to make the URL parsing a bit simpler.

:param store_passwd: store password to use.
:return: path to the truststore file.
"""
store_path = new_tmp_file(suffix=".jks")
Member

Q: I don't know how expensive these operations are, or whether this is worth it, but could we cache this in config.dirs.tmp? (That depends on what the store itself depends on.) Right now we are generating the same trust store for the installation of every single plugin, right?

Member Author

Yes, currently it's one trust store used for a given version of Opensearch/Elasticsearch installation, reused for all its plugins.

The JRE, and thus the keytool executable, changes based on the Opensearch/Elasticsearch version (we use the bundled JRE), so it may make sense not to re-use the trust store across versions.
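If caching were pursued later, one way to key the cached store is sketched below. It assumes the trust store depends only on the CA bundle contents and the engine version; `truststore_cache_path` and the file naming scheme are hypothetical, not something this PR implements.

```python
import hashlib
from pathlib import Path


def truststore_cache_path(cache_dir: str, engine_version: str, ca_bundle: bytes) -> Path:
    """Derive a stable trust store path from the engine version and CA bundle contents.

    Hypothetical helper: a changed bundle or a different engine version
    (and thus a different bundled JRE) yields a different file name.
    """
    digest = hashlib.sha256(ca_bundle).hexdigest()[:16]
    return Path(cache_dir) / f"truststore-{engine_version}-{digest}.jks"


path_a = truststore_cache_path("/tmp/cache", "OpenSearch_1.1", b"bundle-one")
path_b = truststore_cache_path("/tmp/cache", "Elasticsearch_7.10", b"bundle-one")
```

A caller would generate the store only when the returned path does not exist yet.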

Comment on lines 29 to 42
for scheme, default_port, var in [
    ("http", "80", config.OUTBOUND_HTTP_PROXY),
    ("https", "443", config.OUTBOUND_HTTPS_PROXY),
]:
    if var:
        netloc = urlparse(var).netloc
        url = netloc.split(":")
        if len(url) == 2:
            hostname, port = url
        else:
            hostname, port = url[0], default_port

        props[f"{scheme}.proxyHost"] = hostname
        props[f"{scheme}.proxyPort"] = port
Member

I would suggest renaming var to what it really is (proxy_url?).
And parsing the urls is always a bit complicated (I guess because the URL spec is complicated?), but urlparse should do the hostname and port parsing for you already:

Suggested change
- for scheme, default_port, var in [
-     ("http", "80", config.OUTBOUND_HTTP_PROXY),
-     ("https", "443", config.OUTBOUND_HTTPS_PROXY),
- ]:
-     if var:
-         netloc = urlparse(var).netloc
-         url = netloc.split(":")
-         if len(url) == 2:
-             hostname, port = url
-         else:
-             hostname, port = url[0], default_port
-         props[f"{scheme}.proxyHost"] = hostname
-         props[f"{scheme}.proxyPort"] = port
+ for scheme, default_port, proxy_url in [
+     ("http", "80", config.OUTBOUND_HTTP_PROXY),
+     ("https", "443", config.OUTBOUND_HTTPS_PROXY),
+ ]:
+     if proxy_url:
+         parsed_proxy_url = urlparse(proxy_url)
+         props[f"{scheme}.proxyHost"] = parsed_proxy_url.hostname
+         props[f"{scheme}.proxyPort"] = parsed_proxy_url.port

If the URL does not have a port, parsed_proxy_url.port will be None
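A small standalone sketch of this behaviour, including a fallback for the missing port, might look like the following. `proxy_host_and_port` is a hypothetical helper and the URLs are made-up examples; only `urllib.parse.urlparse` and its `hostname` and `port` attributes come from the standard library.

```python
from urllib.parse import urlparse


def proxy_host_and_port(proxy_url: str, default_port: int) -> tuple[str, int]:
    """Extract host and port from a proxy URL, falling back to a default port."""
    parsed = urlparse(proxy_url)
    if parsed.hostname is None:
        # Happens when the URL has no scheme, e.g. "localhost:8080".
        raise ValueError(f"cannot determine proxy host from {proxy_url!r}")
    port = parsed.port if parsed.port is not None else default_port
    return parsed.hostname, port


with_port = proxy_host_and_port("http://localhost:8080", 80)
without_port = proxy_host_and_port("https://proxy.example.com", 443)
# Credentials in the URL would break a naive netloc.split(":"), but not urlparse.
with_credentials = proxy_host_and_port("http://user:secret@proxy.example.com:3128", 80)
```

Converting the port to a string before storing it as a system property is left to the caller.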

Member Author

Thanks, it's slightly simpler now with 6eeb953

#


def system_properties_to_cli_args(properties: dict[str, str]) -> list[str]:
Member

Q: It's a pity you can't set system properties via environment variables, otherwise this could be easily integrated generally for our Java packages by adding the properties to JavaInstallerMixin.get_java_env_vars.
ElasticSearch and OpenSearch don't use the Java installation from the package installer but their own. Still, what do you think it would take to get the proxy supported for all backends using Java in the future?

Member Author

We have the package/installer abstractions which handle the installation aspect. Personally I think we should look into introducing some abstractions for the execution/runtime aspect. Because strictly speaking, determining and setting the proxy config falls in that area. We have the JavaInstallerMixin which kind of blurs these responsibilities. But clearly demarcating the responsibilities would let us harmonise things just like the installers.

Some common patterns have already emerged, e.g. KinesisServerManager, which handles multiple instances of Kinesis-Mock to allow account/region namespacing, and MqttBrokerManager, which does the same for Mosquitto. It just needs fleshing out. I'll put this in the backlog.

@viren-nadkarni viren-nadkarni merged commit 62a302c into master Oct 24, 2024
34 checks passed
@viren-nadkarni viren-nadkarni deleted the opensearch-proxy branch October 24, 2024 04:55
Labels
semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases