Default Kafka user quotas are applied also to internal users #10367

Open
im-konge opened this issue Jul 20, 2024 · 7 comments

@im-konge
Member

When the default Kafka quotas plugin is configured in the .spec.kafka.quotas section of the Kafka resource, the quotas are applied to all users as default quotas. That means they are also applied to the internal users, which can then hit the quotas. For example, when we set the controller mutation rate quota, the Topic Operator can hit it during some of its operations.

For the strimzi quotas plugin type, this is handled using the "excluded principals" option of the plugin: we add the internal users to those specified in the .spec.kafka.quotas section of the Kafka resource, so they are all excluded from the quotas.

But for the default Kafka quotas plugin, there is no such option that we can use.
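To make the difference concrete, here is a rough model of the two behaviours described above. It is only a sketch: the function name, the principal string and the quota value are hypothetical, and it does not reflect how either plugin is actually implemented.

```python
def effective_quota(principal, default_quota, excluded_principals=frozenset()):
    """Return the quota that applies to `principal`, or None for "no limit".

    Simplified model: an excluded principal is never limited,
    every other principal gets the default quota.
    """
    if principal in excluded_principals:
        return None
    return default_quota


# Hypothetical principal name for an internal user, for illustration only.
INTERNAL_USERS = frozenset({"User:topic-operator"})

# strimzi plugin type: internal users are added to the excluded principals.
strimzi_limit = effective_quota("User:topic-operator", 10, INTERNAL_USERS)

# Default Kafka plugin type: no exclusion option, so the default applies.
kafka_limit = effective_quota("User:topic-operator", 10)
```

In this model `strimzi_limit` is `None` (no limit), while `kafka_limit` is `10`, which is the situation this issue is about.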

To solve this, we could set the quotas to null values for the internal users when the default quotas are configured in the Kafka resource. However, this is not that easy, as the quotas will be removed by the User Operator when they are created. Also, the information about the internal users would be accessible via the Kafka Admin API. This would not be trivial, and it would require a proposal covering the whole approach and all the components that would need changes (Cluster Operator, User Operator, ...).

Another option is to document this in our documentation, as it may be desired to limit the internal users as well. This would be the simplest way, but on the other hand it can cause issues, for example when someone would like to limit all other users while keeping the TO and other components and their users without limitations.

We should discuss how to proceed with this or if there are other options that we should take into account.

@scholzj
Member
scholzj commented Aug 8, 2024

Triaged on the Community call on 8.8.2024: @im-konge will prepare a summary of what Strimzi parts might be affected by this and how.

@im-konge
Member Author
im-konge commented Sep 5, 2024

These parts are (in my opinion and to my knowledge) affected by this issue:

  • Topic Operator -> when user sets the controller mutation rate, the operations done on TO side can be affected. That means the TO can be blocked from creating/updating/deleting the topics.
  • CruiseControl -> IIRC CC is sending messages to some internal topic to generate the model for rebalancing. When user sets the produce and fetch quotas, the CC can be affected as well.
  • I think that other components like MM2 or Connect/Connector can be affected as well, when we set the default quotas for produce and fetch.
  • I'm not sure if User Operator is affected, as the quotas should not be (but maybe I'm wrong) applied to creation of the users and managing additional quotas.

@scholzj
Member
scholzj commented Sep 5, 2024
  • CruiseControl -> IIRC CC is sending messages to some internal topic to generate the model for rebalancing. When user sets the produce and fetch quotas, the CC can be affected as well.

So, what do we consider the minimal produce / fetch limit for Cruise Control to work?

  • I think that other components like MM2 or Connect/Connector can be affected as well, when we set the default quotas for produce and fetch.

I do not think we care. The user deploys them separately.

  • I'm not sure if User Operator is affected, as the quotas should not be (but maybe I'm wrong) applied to creation of the users and managing additional quotas.

It manages SCRAM-SHA users, ACLs and quotas. Does the mutation rate apply to that as well? Or is it only topics?

@im-konge
Member Author
im-konge commented Sep 5, 2024

It manages SCRAM-SHA users, ACLs and quotas. Does the mutation rate apply to that as well? Or is it only topics?

From what I read, it is only topics.

So, what do we consider the minimal produce / fetch limit for Cruise Control to work?

I don't know .. @kyguy do you have an idea?

@scholzj
Member
scholzj commented Sep 5, 2024

Discussed on the community call on 5.9.2024: This should be documented as a warning for the users. We should make it clear:

  • That the controller mutation rate will affect the Topic Operator if enabled and should be configured high enough to keep it working (the exact value will depend on the number of topics etc. being added / changed)
  • That Cruise Control and Cruise Control metrics reporter need some minimal values to work properly (@kyguy will try to provide some values needed by Cruise Control)

@PaulRMellor PaulRMellor self-assigned this Oct 2, 2024
@kyguy
Member
kyguy commented Oct 3, 2024

That Cruise Control and Cruise Control metrics reporter need some minimal values to work properly (@kyguy will try to provide some values needed by Cruise Control)

Sorry I dropped the ball on this, let me do some calculations and provide an estimate for this tomorrow

@kyguy
Member
kyguy commented Oct 28, 2024

That Cruise Control and Cruise Control metrics reporter need some minimal values to work properly (@kyguy will try to provide some values needed by Cruise Control)

Apologies for the delay, here are some minimal produce/fetch limits (producer_byte_rate/consumer_byte_rate) for the Cruise Control producers/consumers that should suffice for small clusters with default Cruise Control configurations:

  • CruiseControlMetricsReporter (producer) - 1 kB/s
  • KafkaCruiseControlSampleStoreProducer (producer) - 1 kB/s
  • CruiseControlMetricsReporterSampler-consumer (consumer) - 1 kB/s
