[Cache] Improve reliability and performance of `TagAwareAdapter` by making tag versions an integral part of item value by sbelyshkin · Pull Request #42997 · symfony/symfony · GitHub

Merged · 2 commits merged on Mar 28, 2022

Conversation

sbelyshkin (Contributor)
| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Tickets       | -
| License       | MIT
| Doc PR        | -

I'd like to introduce a new family of tag-aware adapters which can be used with various volatile storage backends such as ephemeral Redis and Memcached. (Current implementations of TagAwareAdapter either do not support them explicitly or fail to guarantee tag-based invalidation of linked items.)

Besides the main purpose and functionality, they have a feature which, I feel, needs to be explained and defended: a form of optimistic concurrency control (OCC), or something like that :)

With the ephemeral tag-aware adapters, it's pretty easy to achieve OCC; at the very least, we can track changes (tag invalidations) which happen during the computation of a value. Though outdated writes are not "rolled back" by the adapters, the invalidated values will be rejected on subsequent reads. Without this feature, such situations usually leave us with stale items waiting for the next update or invalidation, sometimes for quite a long period. The unit tests may shed further light on the power of OCC.
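The read-time rejection described above can be sketched in a language-agnostic way. The following Python model (all names are hypothetical, not the Symfony API) shows how tying items to per-tag random versions makes invalidated values fail validation on the next read:

```python
import secrets

class EphemeralTagCache:
    """Toy model of tag-version OCC; names are hypothetical, not the Symfony API."""

    def __init__(self):
        self.store = {}  # key -> (value, {tag: version seen at save time})
        self.tags = {}   # tag -> current random version

    def _tag_version(self, tag):
        # A missing tag (evicted or invalidated) gets a fresh random version.
        if tag not in self.tags:
            self.tags[tag] = secrets.randbits(64)
        return self.tags[tag]

    def save(self, key, value, tags):
        self.store[key] = (value, {t: self._tag_version(t) for t in tags})

    def invalidate(self, tag):
        self.tags.pop(tag, None)  # invalidation is a DELetion of the tag

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, saved_versions = entry
        # Reject the item if any tag version changed since it was saved.
        if any(self._tag_version(t) != v for t, v in saved_versions.items()):
            del self.store[key]
            return None
        return value

cache = EphemeralTagCache()
cache.save("foo", 42, ["tag1"])
assert cache.get("foo") == 42
cache.invalidate("tag1")         # e.g. a concurrent invalidation
assert cache.get("foo") is None  # the outdated value is rejected on read
```

Note that an outdated write is not rolled back here either: it stays in the store until the next read notices the version mismatch and discards it.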

And one more thing in this PR is RetryProxyAdapter, which is complementary to get()'s probabilistic stampede protection and a lock-free alternative to its LockRegistry. Its main advantage is that it is suitable for all PSR-6 adapters as well as for the tag-aware ones. As a special case, it can be useful in distributed setups where replicas are used for reads and the replication lag may be significant from time to time.

Your comments and suggestions will be greatly appreciated.

@carsonbot

Hey!

I see that this is your first PR. That is great! Welcome!

Symfony has a contribution guide which I suggest you read.

In short:

  • Always add tests
  • Keep backward compatibility (see https://symfony.com/bc).
  • Bug fixes must be submitted against the lowest maintained branch where they apply (see https://symfony.com/releases)
  • Features and deprecations must be submitted against the 5.4 branch.

Review the GitHub status checks of your pull request and try to solve the reported issues. If some tests are failing, try to see if they are failing because of this change.

When two Symfony core team members approve this change, it will be merged and you will become an official Symfony contributor!
If this PR is merged in a lower version branch, it will be merged up to all maintained branches within a few days.

I am going to sit back now and wait for the reviews.

Cheers!

Carsonbot

@derrabus added this to the 5.4 milestone Sep 13, 2021
@nicolas-grekas (Member)

Hi, thanks for the PR, it's an impressive amount of work!

Current implementations of TagAwareAdapters either do not support them explicitly or fail to guarantee tag-based invalidation of linked items

While it may not be explicit, I feel like TagAwareAdapter already provides what you describe here about OCC (and that's not by mistake ;)). I'd like to understand what the new ephemeral adapters provide that TagAwareAdapter doesn't? If they provide an incremental improvement, then can't we bring that improvement to TagAwareAdapter?

one more thing in this PR is RetryProxyAdapter

Can you please split this proposal in a separate PR so that we can consider it separately? Note that I still need to understand when this is an appropriate strategy, vs recomputing or locking. I'm also wondering how this works when the cache is cold? (but let's discuss that in the new PR)

@sbelyshkin (Contributor Author)

Thank you for your feedback @nicolas-grekas!

Current implementations of TagAwareAdapters either do not support them explicitly or fail to guarantee tag-based invalidation of linked items

While it may not be explicit, I feel like TagAwareAdapter already provides what you describe here about OCC (and that's not by mistake ;)). I'd like to understand what the new ephemeral adapters provide that TagAwareAdapter doesn't? If they provide an incremental improvement, then can't we bring that improvement to TagAwareAdapter?

A few years ago I arrived at the conclusion that TagAwareAdapter is not safe with LRU caches, and a few weeks ago I decided to fix that, but I arrived at another conclusion: it's not feasible, at least from my point of view. I'll recheck the current TagAwareAdapter and be back with detailed information.

The main purpose of this family of adapters is to make tag-based invalidation reliable for LRU-like caches. OCC is just a possible positive side-effect.

one more thing in this PR is RetryProxyAdapter

Can you please split this proposal in a separate PR so that we can consider it separately? Note that I still need to understand when this is an appropriate strategy, vs recomputing or locking. I'm also wondering how this works when the cache is cold? (but let's discuss that in the new PR)

I'll cherry-pick the RetryProxyAdapter-related commits into a new branch. Should I also remove them from the current branch and keep the rest in this PR, or should I cherry-pick the rest and open a new PR for the ephemeral adapters? Which option is best?

@nicolas-grekas (Member)

I arrived at the conclusion that TagAwareAdapter is not safe with LRU caches

Currently, TagAwareAdapter stores tags and items under two separate keys. This means LRU can decide to evict the tags but not the item. That's maybe what you found? But since both keys are always accessed together, they should have the same LRU stats. Let me know if there is anything else.

Should I also remove them from the current branch and leave the rest in current PR

Yes please. Let's keep only ephemeral-related stuff in this very PR.

@sbelyshkin force-pushed the ephemeral-tag-aware-adapters branch from face50d to 5af1ff6 on September 28, 2021 11:24
@sbelyshkin force-pushed the ephemeral-tag-aware-adapters branch 3 times, most recently from a697b14 to c952262 on October 6, 2021 10:03
@sbelyshkin force-pushed the ephemeral-tag-aware-adapters branch from c952262 to a93b9ed on October 13, 2021 11:10
@sbelyshkin (Contributor Author) commented Oct 13, 2021

I am back :) Have made some fixes and improvements.

Currently, TagAwareAdapter stores tags and items under two separate keys. This means LRU can decide to evict the tags but not the item. That's maybe what you found? But since both keys are always accessed together, they should have the same LRU stats. Let me know if there is anything else.

Of course, this solution attracted my attention. Although TagAwareAdapter handles the absence of item tags gracefully (it requires both keys), there are different situations in which data may become outdated. For example, Redis may reject writes when it fails to reclaim enough memory (is out of memory): updating a relatively big item value may be rejected while the item's tags are successfully updated (ironically, instances with the 'noeviction' policy are most obviously affected by this scenario). Theoretically, two concurrent processes may also overwrite each other's keys.

With regard to LRU scores and evictions, in most cases Redis will evict these keys separately simply because of its probabilistic LRU algorithm (https://redis.io/topics/lru-cache), and sometimes, instead of evicting one item of a bigger size, Redis will have to evict many smaller items (such as item tags). Memcached, in its turn, uses separate queues for items of different sizes, so again these keys will most likely be evicted separately (https://memcached.org/blog/modern-lru/). I'd say that because of their split nature, items have more than one way to become corrupted.

With a sharded database, you may also be concerned by the fact that the ratio of items surviving when one of the shards goes down declines from (N-1)/N to ((N-1)/N)^2, since an item's value and its tags are stored randomly on different shards.

But my main concern is the versioning algorithm: it may be affected by evictions and by the OOM state.

$tagVersions[$tag = $tags[$tag]] = $version->get() ?: 0;
When a tag's version (another key, not one of the pair mentioned above) is evicted, it is effectively "reset" to 0. You may lose an invalidation that occurred before the eviction if the item's previous tag version was 0 and the item had not been updated (recomputed and saved) by the time of the eviction. Alternatively, the item's tag version may be non-zero but close to the current value, in which case an invalidation occurring after the eviction can be lost. In such situations, the item's value remains valid while it should be deemed invalidated.
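A minimal simulation of this failure mode (with invented names; the real adapter stores versions in the cache backend) shows how the `?: 0` fallback makes an eviction restore the very version a stale item was saved with:

```python
# Toy reproduction of the counter-based scheme criticized above: tag
# versions are integers starting at 0, and `->get() ?: 0` maps an evicted
# (missing) tag back to 0, indistinguishable from "never invalidated".
tag_versions = {}  # stands in for the tag-version keys in the cache backend

def tag_version(tag):
    return tag_versions.get(tag, 0)  # eviction and "never set" look identical

# An item is saved while tag1 is at version 0.
item = {"value": "stale", "tag1_version": tag_version("tag1")}

tag_versions["tag1"] = tag_version("tag1") + 1  # invalidation: 0 -> 1
del tag_versions["tag1"]                        # ...then the tag key is evicted

# On the next read, the evicted tag "resets" to 0 and matches the item's
# saved version, so the invalidation is silently lost.
is_valid = item["tag1_version"] == tag_version("tag1")
```

Here `is_valid` ends up true even though the item was invalidated before the eviction.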

Versions are incremented, and the increment operation is not atomic. Setting aside theoretical debates about atomicity, let's focus on implementation details specific to Memcached and Redis.

$invalidatedTags[$tag] = $version->set(++$tagVersions[$tag]);

When the memory limit is reached, Redis and Memcached evict other key(s) on SET even if the specified key already exists. In Memcached, the victim will very likely be another tag, since they are all in one queue for the smallest items. In Redis, as we already know, SET (and other memory-consuming commands) may be rejected, leaving the value not updated and responding with "-OOM command not allowed when used memory > 'maxmemory'". That is, a tag invalidation may be lost.

In EphemeralTagAwareAdapter, tag invalidation is the DELetion of a tag. Redis doesn't reject the DEL command, so you'll never lose an invalidation because of the OOM state. After an eviction or invalidation (which have the same effect), the tag version is set to a new random number from a fairly wide range (precisely 2^64 values, even on 32-bit platforms) which will be used for tagging items until the next invalidation or eviction.


In relation to OCC, there was one interesting bug/feature of TagAwareAdapter which was corrected by update #43302 (no 100% guarantee, though).

The order of operations on items and tags differs between adapters with a single pool and those with two separate pools for items and tags, because they use different deferred queues which are implicitly flushed at different points. In most cases this difference doesn't affect data validity much, but with a single pool the item value is saved to the cache before the current tag version is even loaded from the cache; hence an item may be saved with outdated data and then have the just-updated tag version attached.

Why did I say that this doesn't affect data validity much? Because data validity is already affected by the fact that tag versions are loaded after the data was computed and passed to save(). This is a pretty obvious fact for all PSR-6 adapters, which accept only pre-computed data, while that data may become outdated during its computation.

So achieving OCC with tags is possible only if it's allowed to pass a callback instead of a pre-computed value. Allowing this for the save() method probably violates PSR-6 restrictions, while an improvement of the Contracts' get() seems feasible.
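The difference can be sketched as follows (a Python model with invented helper names, not the actual PSR-6 or Contracts API): a callback lets the adapter snapshot tag versions before computation starts, so an invalidation arriving mid-computation becomes detectable on the next read.

```python
# Contrast between a PSR-6-style save() of a precomputed value and a
# Contracts-style get(callback). All helper names are invented.

def save_precomputed(cache, tag_versions, key, value, tags):
    # The value was computed *before* this call; tag versions are read only
    # now, so an invalidation during the computation goes unnoticed and the
    # stale value is saved with the *new* versions, i.e. it looks valid.
    cache[key] = (value, {t: tag_versions.get(t) for t in tags})

def get_with_callback(cache, tag_versions, key, tags, compute):
    # Snapshot the versions first, then compute. If a tag is invalidated
    # mid-computation, the item is stored with the *old* versions and will
    # fail validation (and be recomputed) on the next read.
    snapshot = {t: tag_versions.get(t) for t in tags}
    value = compute()
    cache[key] = (value, snapshot)
    return value

tag_versions = {"tag1": 1}
cache = {}

def slow_compute():
    tag_versions["tag1"] = 2  # an invalidation arrives mid-computation
    return "value"

get_with_callback(cache, tag_versions, "foo", ["tag1"], slow_compute)
_, saved = cache["foo"]
stale = saved != {t: tag_versions[t] for t in ["tag1"]}  # mismatch detected
```

With `save_precomputed` in the same scenario, the stale value would be tagged with the post-invalidation versions and pass validation.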

https://github.com/symfony/symfony/pull/42997/files#diff-3b97d36771c557028a9b6d24c305832ab72f78ac62c8951cf508bcaa0e0fc857R69

https://github.com/symfony/symfony/pull/42997/files#diff-9645113276660ed0fb1f3f55acb5fe71a88d270d63c2b2eca120f867c28de07fR56

== Appendix ==

Examples of Redis commands for computing and saving a new tagged item with the get() method, followed by a tag invalidation and then recomputing the item's value on the next get(). You may notice the lag between the first MGET and SET operations due to the computation of the value.

TagAwareAdapter, 1 pool (wrong order):

1633079229.164564 [0 127.0.0.1:49430] "MGET" "TagAwarePool:\x00tags\x00foo" "TagAwarePool:foo"
1633079229.279636 [0 127.0.0.1:49430] "SET" "TagAwarePool:foo" "\x00\x00\x00\x02\x00"
1633079229.279838 [0 127.0.0.1:49430] "MGET" "TagAwarePool:tag1\x00tags\x00"
1633079229.280168 [0 127.0.0.1:49430] "SET" "TagAwarePool:\x00tags\x00foo" "\x00\x00\x00\x02\x14\x01\x11\x04tag1\x06\x00"
. . .
1633079233.324239 [0 127.0.0.1:49430] "MGET" "TagAwarePool:tag1\x00tags\x00"
1633079233.324899 [0 127.0.0.1:49430] "SET" "TagAwarePool:tag1\x00tags\x00" "\x00\x00\x00\x02\x06\x01"
. . .
1633079237.011041 [0 127.0.0.1:49430] "MGET" "TagAwarePool:\x00tags\x00foo" "TagAwarePool:foo"
1633079237.011697 [0 127.0.0.1:49430] "MGET" "TagAwarePool:tag1\x00tags\x00"
1633079237.125972 [0 127.0.0.1:49430] "SET" "TagAwarePool:foo" "\x00\x00\x00\x02\x00"
1633079237.126200 [0 127.0.0.1:49430] "MGET" "TagAwarePool:tag1\x00tags\x00"
1633079237.126624 [0 127.0.0.1:49430] "SET" "TagAwarePool:\x00tags\x00foo" "\x00\x00\x00\x02\x14\x01\x11\x04tag1\x06\x01"

TagAwareAdapter, 2 pools (right order):

1633081096.268363 [0 127.0.0.1:57272] "MGET" "TagAwarePool:\x00tags\x00foo" "TagAwarePool:foo"
1633081096.382170 [0 127.0.0.1:57274] "MGET" "TagAwareTags:tag1\x00tags\x00"
1633081096.383040 [0 127.0.0.1:57272] "SET" "TagAwarePool:foo" "\x00\x00\x00\x02\x00"
1633081096.383062 [0 127.0.0.1:57272] "SET" "TagAwarePool:\x00tags\x00foo" "\x00\x00\x00\x02\x14\x01\x11\x04tag1\x06\a"
. . .
1633081101.967450 [0 127.0.0.1:57274] "MGET" "TagAwareTags:tag1\x00tags\x00"
1633081101.968720 [0 127.0.0.1:57274] "SET" "TagAwareTags:tag1\x00tags\x00" "\x00\x00\x00\x02\x06\b"
. . .
1633081104.507026 [0 127.0.0.1:57272] "MGET" "TagAwarePool:\x00tags\x00foo" "TagAwarePool:foo"
1633081104.507781 [0 127.0.0.1:57274] "MGET" "TagAwareTags:tag1\x00tags\x00"
1633081104.620087 [0 127.0.0.1:57274] "MGET" "TagAwareTags:tag1\x00tags\x00"
1633081104.620870 [0 127.0.0.1:57272] "SET" "TagAwarePool:foo" "\x00\x00\x00\x02\x00"
1633081104.620892 [0 127.0.0.1:57272] "SET" "TagAwarePool:\x00tags\x00foo" "\x00\x00\x00\x02\x14\x01\x11\x04tag1\x06\b"

EphemeralTagAwareAdapter without OCC, 1 pool:

1633438756.945548 [0 127.0.0.1:43578] "MGET" "EphemeralPool:$foo"
1633438757.059431 [0 127.0.0.1:43578] "MGET" "EphemeralPool:#tag1"
1633438757.059906 [0 127.0.0.1:43578] "SET" "EphemeralPool:#tag1" "\x00\x00\x00\x02\x11\tV\bL\xca\xbdo\xf0\xd5\xa8"
1633438757.060436 [0 127.0.0.1:43578] "SET" "EphemeralPool:$foo" "\x00\x00\x00\x02\x14\x02\x11\x01$\x06\x00\x11\x01#\x14\x01\x11\x04tag1\x11\tV\bL\xca\xbdo\xf0\xd5\xa8"
. . .
1633438760.593541 [0 127.0.0.1:43578] "UNLINK" "EphemeralPool:#tag1"
. . .
1633438763.509929 [0 127.0.0.1:43578] "MGET" "EphemeralPool:$foo"
1633438763.510409 [0 127.0.0.1:43578] "MGET" "EphemeralPool:#tag1"
1633438763.510951 [0 127.0.0.1:43578] "SET" "EphemeralPool:#tag1" "\x00\x00\x00\x02\x11\t4\xe1\xbb\xa6\xbbo\xf0\xd5\xa8"
1633438763.625100 [0 127.0.0.1:43578] "SET" "EphemeralPool:$foo" "\x00\x00\x00\x02\x14\x02\x11\x01$\x06\x00\x11\x01#\x14\x01\x11\x04tag1\x11\t4\xe1\xbb\xa6\xbbo\xf0\xd5\xa8"

EphemeralTagAwareAdapter with OCC, 1 pool:

1633439000.313231 [0 127.0.0.1:43588] "MGET" "EphemeralPool:$foo"
1633439000.313981 [0 127.0.0.1:43588] "MGET" "EphemeralPool:#tag1"
1633439000.314627 [0 127.0.0.1:43588] "SET" "EphemeralPool:#tag1" "\x00\x00\x00\x02\x11\t\x0e{\xaa\x05K\xbe\xdegu"
1633439000.428288 [0 127.0.0.1:43588] "SET" "EphemeralPool:$foo" "\x00\x00\x00\x02\x14\x02\x11\x01$\x06\x00\x11\x01#\x14\x01\x11\x04tag1\x11\t\x0e{\xaa\x05K\xbe\xdegu"
. . .
1633439004.192115 [0 127.0.0.1:43588] "UNLINK" "EphemeralPool:#tag1"
. . .
1633439007.144599 [0 127.0.0.1:43588] "MGET" "EphemeralPool:$foo"
1633439007.145116 [0 127.0.0.1:43588] "MGET" "EphemeralPool:#tag1"
1633439007.145622 [0 127.0.0.1:43588] "SET" "EphemeralPool:#tag1" "\x00\x00\x00\x02\x11\t1\xc4\xfb\x90\xea\xbe\xdegu"
1633439007.260152 [0 127.0.0.1:43588] "SET" "EphemeralPool:$foo" "\x00\x00\x00\x02\x14\x02\x11\x01$\x06\x00\x11\x01#\x14\x01\x11\x04tag1\x11\t1\xc4\xfb\x90\xea\xbe\xdegu"

EphemeralTagAwareAdapter with OCC, 2 pools:

1633439207.980599 [0 127.0.0.1:43596] "MGET" "EphemeralPool:$foo"
1633439207.981096 [0 127.0.0.1:43598] "MGET" "EphemeralTags:#tag1"
1633439207.981430 [0 127.0.0.1:43598] "SET" "EphemeralTags:#tag1" "\x00\x00\x00\x02\x11\tw;\xee^\x15`#r\xbc"
1633439208.101122 [0 127.0.0.1:43596] "SET" "EphemeralPool:$foo" "\x00\x00\x00\x02\x14\x02\x11\x01$\x06\x00\x11\x01#\x14\x01\x11\x04tag1\x11\tw;\xee^\x15`#r\xbc"
. . .
1633439215.342044 [0 127.0.0.1:43598] "UNLINK" "EphemeralTags:#tag1"
. . .
1633439219.108947 [0 127.0.0.1:43596] "MGET" "EphemeralPool:$foo"
1633439219.109352 [0 127.0.0.1:43598] "MGET" "EphemeralTags:#tag1"
1633439219.109975 [0 127.0.0.1:43598] "SET" "EphemeralTags:#tag1" "\x00\x00\x00\x02\x11\t\x0bM\x9d4\xbf`#r\xbc"
1633439219.225135 [0 127.0.0.1:43596] "SET" "EphemeralPool:$foo" "\x00\x00\x00\x02\x14\x02\x11\x01$\x06\x00\x11\x01#\x14\x01\x11\x04tag1\x11\t\x0bM\x9d4\xbf`#r\xbc"

@sbelyshkin force-pushed the ephemeral-tag-aware-adapters branch from a93b9ed to b0b2c35 on October 15, 2021 04:32
@nicolas-grekas (Member) commented Oct 18, 2021

Thanks for the extensive details, that's super interesting!

About split keys for value & tags, let's merge them. We don't guarantee that cached values will survive a minor upgrade so it's fine to break that. We just need to be sure that we gracefully discard old data.

About versioning, good points also! We have #43301 on the topic but you go further. I like deleting for invalidating.

I would really really prefer not adding a new adapter to the component. TagAwareAdapter should be the only ephemeral adapter.

Would you agree with sending PRs that incrementally converge to implementing your proposals into TagAwareAdapter? Eg by reviewing #43301 then contributing a PR on top, and by sending another PR for merging keys for value+tags, and another for the ordering issue if there is something to do there?

@sbelyshkin (Contributor Author)

@nicolas-grekas If all features of EphemeralTagAwareAdapter are desirable and we can discard cached items on upgrade, wouldn't it be easier to replace TagAwareAdapter with EphemeralTagAwareAdapter (keeping the old name) rather than fixing it? The only thing to add in this case is the $knownTagVersionsTtl parameter; I can do this.

@nicolas-grekas (Member) commented Oct 20, 2021

Yes, I'd be fine renaming your EphemeralTagAwareAdapter to TagAwareAdapter, with graceful discarding of old data if that's not already the case.

@sbelyshkin (Contributor Author)

Yes, I'd be fine renaming your EphemeralTagAwareAdapter to TagAwareAdapter, with graceful discarding of old data if that's not already the case.

EphemeralTagAwareAdapter uses its own prefixes for keys, so it will totally ignore old data. The only issue I see is that old tags are stored with no expiration and have to be cleared manually (which may be tricky). Do you anticipate any other problems?

@nicolas-grekas (Member)

As long as the code remains backward compatible, it should be fine :)

@sbelyshkin force-pushed the ephemeral-tag-aware-adapters branch 7 times, most recently from 6c63270 to 0e8d14d on October 25, 2021 03:51
@sbelyshkin (Contributor Author) commented Oct 25, 2021

As a proof of concept, I made TagAwareAdapter a descendant of EphemeralTagAwareAdapter. It looks very nice now, but there are a few particular concerns:

  1. The change of the invalidation algorithm makes it impossible to use different versions of TagAwareAdapter concurrently, including during a period of rolling updates. Because of the different prefixes, they are two different caches, and while they do not interfere with each other, invalidations coming from instances with different versions of the adapter will be mutually ignored. Even if we upgraded TagAwareAdapter incrementally and preserved the old prefixes for tags, invalidations from instances with the updated adapter (tag deletions) would most likely be ignored by instances with the old adapter. See [Cache] Decrease the probability of invalidation loss on tag eviction #43301
  2. The implicit commit on tag invalidations is not in effect anymore. This behavior is very specific to TagAwareAdapter, and I think users should be warned about this change.
    0e8d14d#diff-62a9d545beee431f11c3e05ba5d991976e559f7f95238382a26ded599a95fab6R51
  3. The private method getTagVersions is replaced with an inherited protected one, and there are other new protected methods.
    https://github.com/symfony/symfony/blob/0e8d14defdf69b5769a875e831108e8f1817d403/src/Symfony/Component/Cache/Adapter/TagAwareAdapter.php

Which of these points are real BC breaks? Are there other barriers?

@nicolas-grekas (Member) left a comment

Change of invalidation algorithm makes it impossible to use different versions of TagAwareAdapter concurrently including period of rolling updates

That's fine, we don't guarantee that.

Implicit commit on tag invalidations is not in effect anymore. This behavior is very specific to TagAwareAdapter and I think users should be warned about this change.

Can't we preserve it?

Private method getTagVersions is replaced with an inherited protected one, and there are other new protected methods.

I would prefer reducing the API surface, including protected methods. Let's mark the concrete classes final and the abstract one @internal at least.

@fabpot removed this from the 5.4 milestone Oct 29, 2021
@sbelyshkin force-pushed the ephemeral-tag-aware-adapters branch from 0f7c13f to c878e52 on March 21, 2022 05:22
@nicolas-grekas changed the title from [Cache] New Family of Ephemeral Tag Aware Adapters to [Cache] Improve reliability and performance of TagAwareAdapter by making tag versions an integral part of item value on Mar 21, 2022
@nicolas-grekas (Member) commented Mar 23, 2022

I'm proposing sbelyshkin#1 on top of your work. I added notes there.

Here's a quick summary:

  • instead of using a key with a special array, I propose to introduce a DTO. I named it Ͼ (in the root namespace) for brevity, since the length of the name impacts the storage needs.
  • compatibility with previously populated pools is preserved
  • the knownTagVersions cache is now limited to 1000 items with an LRU-based implementation
  • no tags are created anymore when only fetching them for comparison
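The bounded knownTagVersions cache could look roughly like the following sketch (names and structure invented for illustration; the PR caps the real cache at 1000 entries):

```python
from collections import OrderedDict

class KnownTagVersions:
    """Sketch of an LRU-bounded in-process cache of tag versions.
    Illustrative only, not the PR's actual code."""

    def __init__(self, max_items=1000):
        self.max_items = max_items
        self._data = OrderedDict()

    def get(self, tag):
        if tag not in self._data:
            return None
        self._data.move_to_end(tag)  # mark as most recently used
        return self._data[tag]

    def set(self, tag, version):
        self._data[tag] = version
        self._data.move_to_end(tag)
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict the least recently used

lru = KnownTagVersions(max_items=2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")     # touch "a": "b" becomes the least recently used
lru.set("c", 3)  # exceeds the cap, so "b" is evicted
```

The bound guarantees that a long-running process cannot leak memory through an ever-growing set of known tags.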

more compact packing of metadata
prefix for item keys

Somehow reverted by my DTO proposal, which would allow, in another iteration, using HSET for Redis, thus consuming even less storage.

expiration time instead of creation time for knownTagVersions

Reverted, because I don't get what this provides. What did I miss?

pruning of expired knownTagVersions

Replaced by LRU-based pruning, which is simpler and should be as effective.

inlined getItem()

Reverted also, this won't make a difference in practice, better keep the code simple.

Let me know if I missed anything!

@nicolas-grekas force-pushed the ephemeral-tag-aware-adapters branch from 210b4cb to 9961272 on March 23, 2022 19:14
@sbelyshkin (Contributor Author)

Thanks a lot @nicolas-grekas! I need some time to review all the changes you've proposed.
Right now I have a few questions and answers for you.

  • compatibility with previously populated pools is preserved

How is this achieved?

more compact packing of metadata
prefix for item keys

Somehow reverted by my DTO proposal, which would allow, in another iteration, using HSET for Redis, thus consuming even less storage.

In what way and where do you plan to use HSET?

expiration time instead of creation time for knownTagVersions

Reverted, because I don't get what this provides. What did I miss?

This was a tiny optimization: computing the expiration only once rather than every time a known tag is retrieved from the internal cache. Really tiny :)

pruning of expired knownTagVersions

Replaced by LRU-based pruning, which is simpler and should be as effective.

I'll check that.

inlined getItem()

Reverted also, this won't make a difference in practice, better keep the code simple.

It was another optimization. The only reason for it is better performance when retrieving small items (a few KBs). The degree of improvement depends on the ratio of CPU to I/O time for the whole operation and can be as high as 20% for TAA+FilesystemAdapter on a local FS, and as high as 10% for TAA+RedisAdapter with a local instance of Redis. But for big items (100 KBs and more) or for slow I/O, the increase is not so impressive.

@nicolas-grekas force-pushed the ephemeral-tag-aware-adapters branch from 9961272 to 8c991c8 on March 24, 2022 09:06
@nicolas-grekas (Member)

compatibility with previously populated pools is preserved

by 1. fetching old-style tags while fetching items, and 2. having metadata[tags] set to an array when retrieving newly persisted items (vs. unset for older ones)

In what way and where do you plan to use HSET?

metadata is about expiry, ctime and tags. I think we could drop the expiry and replace it with a call to EXPIRETIME; then we can put ctime+item+tags in one HSET, with special keys for ctime+item so as not to mess up with the tags that would also be in the hash.
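As a sketch of what that layout might look like (field names and the helper are invented, this is not code from the PR), the per-item commands could be shaped like:

```python
# All field names here are hypothetical; this only illustrates the shape
# of the commands floated above.

def build_hash_commands(key, ctime, serialized_value, tag_versions, expires_at):
    # Reserved field names start with "\x00" so they cannot collide with tags.
    fields = ["\x00ctime", str(ctime), "\x00value", serialized_value]
    for tag, version in tag_versions.items():
        fields += [tag, version]
    return [
        ("HSET", key, *fields),              # ctime + value + tags in one hash
        ("EXPIREAT", key, str(expires_at)),  # expiry moves to the key's TTL
        # ("EXPIRETIME", key) would read the expiry back (Redis >= 7.0)
    ]

cmds = build_hash_commands("pool:foo", 1633079229, "data",
                           {"tag1": "v1"}, 1633082829)
```

Storing everything under one key also removes the split-key eviction hazards discussed earlier in the thread.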

The only reason for this is better performance when retrieving small items (a few KBs).

Thanks for the details, I didn't benchmark in such detail. I still would like to not duplicate the logic, so what I did instead is remove the generator: getItems now returns a simple array. The generator was not useful anyway since we buffer all items internally. This should reclaim some perf.

@nicolas-grekas (Member) commented Mar 24, 2022

When versions of non-existing tags are requested for item commits or for validation of retrieved items, adapter creates tags and assigns a new random version to them.

Note that this is not true anymore: we don't persist the tags when retrieving items. Does it make sense? I'd like to have your thoughts about that.

@sbelyshkin (Contributor Author)

When versions of non-existing tags are requested for item commits or for validation of retrieved items, adapter creates tags and assigns a new random version to them.

Note that this is not true anymore: we don't persist the tags when retrieving items.

Thank you for bringing this to my attention. Very likely this solution will increase the number of overwrites of tags and hence of item computations. I experimented around this earlier, and now I need to dive into the new code and benchmark it again. I'll do it this weekend.

@nicolas-grekas force-pushed the ephemeral-tag-aware-adapters branch 4 times, most recently from e25b80c to b0e3c9e on March 28, 2022 09:31
@nicolas-grekas force-pushed the ephemeral-tag-aware-adapters branch from b0e3c9e to 6a90ca0 on March 28, 2022 09:57
@nicolas-grekas (Member)

ctime is now packed as a 14-bit floating-point number. This leaves 2 bits free: one is used to tell whether tags are set or not, the other could be used for future needs.
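For illustration only, here is one possible 14-bit minifloat encoding (4-bit exponent, 10-bit mantissa); the PR's actual bit layout is not reproduced here:

```python
# Illustration only: one possible 14-bit minifloat (4-bit exponent,
# 10-bit mantissa) for a non-negative integer such as a coarse ctime.
# This is NOT the actual bit layout chosen in the PR.

def pack14(n):
    e = 0
    while n >= 1 << 10:  # normalize the mantissa into 10 bits
        n >>= 1          # low-order bits are lost, as with any float
        e += 1
    if e >= 1 << 4:
        raise OverflowError("value too large for a 4-bit exponent")
    return (e << 10) | n  # fits in 14 bits; 2 bits of a 16-bit slot stay free

def unpack14(p):
    return (p & 0x3FF) << (p >> 10)

assert unpack14(pack14(1000)) == 1000  # small values round-trip exactly
```

Larger values round-trip only approximately, with an absolute error bounded by 2 to the power of the exponent, which is the usual floating-point trade-off of range for precision.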

I'm going to merge this PR because I think it's ready. If any changes should be made, a follow up PR is very welcome @sbelyshkin

@nicolas-grekas (Member)

Thank you @sbelyshkin.

@sbelyshkin (Contributor Author) commented Mar 28, 2022

I'm a bit late but very happy to see the result :)

I would only add a warning about the declared backward compatibility: it is meant to prevent a cold start after an upgrade rather than to allow simultaneous use of the new and old versions of the adapter, since old adapters reject the new format (the new wrapper class is unknown to them) and recompute items. So, running them simultaneously will put extra load on the old instances.

I hope we'll have the possibility in the future to discontinue the old two-key format, for less I/O and better performance.

@stof (Member) commented Mar 28, 2022

@sbelyshkin the solution is indeed backward compatible but not forward compatible (which is a lot harder to achieve)

@fabpot mentioned this pull request Apr 15, 2022
nicolas-grekas added a commit that referenced this pull request Apr 25, 2022
This PR was merged into the 6.1 branch.

Discussion
----------

[Cache] Optimize caching of tags

| Q             | A
| ------------- | ---
| Branch?       | 6.1
| Bug fix?      | no
| New feature?  | no
| Deprecations? | no
| Tickets       | -
| License       | MIT
| Doc PR        | -

It's the follow-up to #42997.

1. Forcing the adapter to fetch tags on item commits makes the algorithm straightforward and a bit more optimized.
2. Caching tag versions only when they are retrieved from or persisted to the pool ensures that any tagged item will be rejected when there are no related tags in the pool.
3. Using FIFO instead of LRU for known tag versions allows using all cached tags until expiration while still preventing a memory leak.


Commits
-------

68f309b [Cache] Optimize caching of tags
nicolas-grekas added a commit that referenced this pull request Feb 8, 2024

This PR was merged into the 6.4 branch.

Discussion
----------

[Cache] Fix BC layer with pre-6.1 cache items

| Q             | A
| ------------- | ---
| Branch?       | 6.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Issues        | Fix #53825
| License       | MIT

Issue introduced in #42997

Commits
-------

9372d95 [Cache] Fix BC layer with pre-6.1 cache items