8000 perf(ruleeng): employ `emqx_topic_index` to speed up topic matching by keynslug · Pull Request #11396 · emqx/emqx · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@keynslug
Copy link
Contributor
@keynslug keynslug commented Aug 4, 2023

Fixes EMQX-10712.

Follows up #11332, #11282, #11313 .

Summary

🤖 Generated by Copilot at 674847b

This pull request introduces a new module emqx_topic_index that provides a fast and efficient way to match topics to topic filters for the rule engine. It also refactors the rule creation, update, and deletion logic in the emqx_rule_engine module to use the topic index and a common validation function. Additionally, it adds a test suite for the topic index module and a macro for the topic index table name.

PR Checklist

Please convert it to a draft if any of the following conditions are not met. Reviewers may skip over until all the items are checked:

  • Added tests for the changes
  • Added property-based tests for code which performs user input validation
  • Changed lines covered in coverage report
  • Change log has been added to changes/(ce|ee)/(feat|perf|fix)-<PR-id>.en.md files
  • For internal contributor: there is a jira ticket to track this change
  • Created PR to emqx-docs if documentation update is required, or link to a follow-up jira ticket
  • Schema changes are backward compatible
-->

@keynslug keynslug requested review from a team, kjellwinblad and lafirest as code owners August 4, 2023 15:36
@coveralls
Copy link
Collaborator

Pull Request Test Coverage Report for Build 5764332092

  • 81 of 85 (95.29%) changed or added relevant lines in 3 files are covered.
  • 37 unchanged lines in 14 files lost coverage.
  • Overall coverage decreased (-0.04%) to 81.434%

Changes Missing Coverage Covered Lines Changed/Added Lines %
apps/emqx/src/emqx_topic_index.erl 55 59 93.22%
Files with Coverage Reduction New Missed Lines %
apps/emqx/src/emqx_broker.erl 1 88.33%
apps/emqx/src/emqx_limiter/src/emqx_limiter_schema.erl 1 81.48%
apps/emqx/src/emqx_quic_stream.erl 1 75.0%
apps/emqx/src/emqx_router_helper.erl 1 87.04%
apps/emqx_utils/src/emqx_utils.erl 1 82.63%
apps/emqx_bridge_gcp_pubsub/src/emqx_bridge_gcp_pubsub_consumer_worker.erl 2 93.31%
apps/emqx_conf/src/emqx_conf_app.erl 2 84.91%
apps/emqx_gateway_mqttsn/src/emqx_mqttsn_frame.erl 2 64.04%
apps/emqx/src/emqx_alarm.erl 2 89.92%
apps/emqx/src/emqx_channel.erl 2 87.56%
Totals Coverage Status
Change from base Build 5762894247: -0.04%
Covered Lines: 31976
Relevant Lines: 39266

💛 - Coveralls

@keynslug
Copy link
Contributor Author

Few more careful measurements regarding memory-per-process consumption.

Before / After.

Both are docker images built locally with:

env EMQX_DEFAULT_BUILDER=ghcr.io/emqx/emqx-builder/5.1-3:1.14.5-25.3.2-1-debian11 EMQX_DEFAULT_RUNNER=debian:11-slim ./build emqx docker
  • Before built from 9927890 (current PR base).
  • After built from bcf58904 (current PR head).

Configuration has only one rule, forwarding messages to the local mosquitto instance:

v5.1.4-gbcf58904(emqx@172.42.42.7)1> emqx_rule_engine:get_rules().
[#{actions => [{bridge,mqtt,'mosquitto-tls', <<"bridge:mqtt:mosquitto-tls">>}],
   conditions => {},created_at => 1674754150821,
   description => <<>>,doeach => [],enable => true, fields => ['*'],
   from => [<<"t/forward/+/+">>],
   id => <<"rule_mosquitto_tls">>,incase => {},
   is_foreach => false,name => <<>>,
   sql => <<"SELECT * FROM \"t/forward/+/+\"">>,
   updated_at => 1692003013244}]

Workload is:

emqttb --restapi --log-level info @g --host node1.emqx.dev @pub --num-clients 5000 --topic 't/forward/%h/%n' --qos 1 --size 256 --conninterval5ms --pubinterval 2500ms --publatency 200ms --metadata

https://snapshots.raintank.io/dashboard/snapshot/3jyRFjm6FNEIosspYobGjh8bmkxNiace

image

Observations:

  1. There are no noticeable difference in memory consumption per process.
  2. With this PR memory consumption per process is even slightly less on average, the measurements are volatile though.
  3. VM spends slightly more reductions when there's only one rule.

zmstone
zmstone previously approved these changes Aug 14, 2023
@keynslug keynslug force-pushed the ft/EMQX-10712/ruleeng-topic-index-vol-2 branch from bcf5890 to d0dce1f Compare August 14, 2023 10:00
@keynslug keynslug requested a review from Rory-Z as a code owner August 14, 2023 10:00
@keynslug keynslug changed the base branch from master to release-52 August 14, 2023 10:00
keynslug and others added 10 commits August 14, 2023 15:36
Somewhat similar to `emqx_trie` in design and logic, yet built on
top of a single, potentially pre-existing table.
Otherwise, topic with empty tokens (e.g. `a/b///c`) would have
some of their tokens ordered before `#` / `+`, because empty
token was represented as empty atom (`''`).
Thanks to @HJianBo for spotting this issue. The approach to fix it
is different though: we try to keep the "recurrency branch factor"
to a minimum, instead introducing new condition for the case when
filter does not match, but iteration with `ets:next/2` is not yet
finished for the prefix.

Co-Authored-By: JianBo He <heeejianbo@gmail.com>
Co-Authored-By: JianBo He <heeejianbo@gmail.com>
This should give less `ets:next/2` calls in general and much less
when index has relatively small number of long non-wildcard topics.
* emqx_rule_engine 5.0.23
@keynslug keynslug force-pushed the ft/EMQX-10712/ruleeng-topic-index-vol-2 branch from 7b8f3c1 to d302aaa Compare August 14, 2023 11:37
To (hopefully) better illustrate what is happening there.
This should further optimize the number of `ets:next/2` calls
required for a single match query.
@keynslug keynslug merged commit 75ed6aa into emqx:release-52 Aug 17, 2023
@zmstone zmstone mentioned this pull request Aug 22, 2023
9 tasks
@keynslug keynslug deleted the ft/EMQX-10712/ruleeng-topic-index-vol-2 branch September 15, 2023 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

0