make iterator usage safer after intermediate commits #14346

jsteemann · 2021-06-08T21:13:45Z

Scope & Purpose

When a transaction triggers an intermediate commit, this may affect rocksdb::Iterator objects used by the transaction. These may become silently invalidated if they point to nodes in the committed transaction's WriteBatchWithIndex's index (i.e. the SkipList).
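The hazard can be illustrated with a toy model. The sketch below is not ArangoDB or RocksDB code: `ToyWriteBatch`, `ToyIterator` and the generation counter are hypothetical names, and `std::map` stands in for the skip list index. It shows one possible way for a cursor to notice that the index it was positioned in has been discarded by an intermediate commit, instead of silently using a stale position.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a write batch with an in-memory index.
class ToyWriteBatch {
 public:
  void put(std::string const& key, std::string const& value) {
    _index[key] = value;
  }

  // Models an intermediate commit: the index is cleared, so every position
  // into the old index becomes stale. The generation counter records this.
  void intermediateCommit() {
    _index.clear();
    ++_generation;
  }

  std::uint64_t generation() const { return _generation; }
  std::map<std::string, std::string> const& index() const { return _index; }

 private:
  std::map<std::string, std::string> _index;
  std::uint64_t _generation = 0;
};

// Hypothetical cursor that stores its current key (not a raw node pointer)
// together with the generation in which it was positioned.
class ToyIterator {
 public:
  explicit ToyIterator(ToyWriteBatch const& batch)
      : _batch(batch), _generation(batch.generation()) {}

  // Positions the cursor on the first key >= the given key.
  void seek(std::string const& key) {
    _generation = _batch.generation();
    auto it = _batch.index().lower_bound(key);
    if (it == _batch.index().end()) {
      _currentKey.reset();
    } else {
      _currentKey = it->first;
    }
  }

  // True if an intermediate commit happened since the last seek.
  bool stale() const { return _generation != _batch.generation(); }

  // The cursor may be used only if it is positioned and not stale.
  bool valid() const { return _currentKey.has_value() && !stale(); }

  // Must only be called while valid() returns true.
  std::string const& key() const { return *_currentKey; }

 private:
  ToyWriteBatch const& _batch;
  std::uint64_t _generation;
  std::optional<std::string> _currentKey;
};
```

A caller would check `valid()` before every access and call `seek()` again once `stale()` reports a commit. Whether the actual fix uses a comparable mechanism is not stated in this description.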

The PR also removes superfluous out-of-bounds checks for read-only transactions, which may even become slightly faster as a result.

We should add many more tests for these changes and the resulting behavior. In our normal tests, intermediate commits do not happen frequently, and often they have no observable effect on the iterators.

💩 Bugfix (requires CHANGELOG entry)
🍕 New feature (requires CHANGELOG entry, feature documentation and release notes)
🔥 Performance improvement
🔨 Refactoring/simplification
📖 CHANGELOG entry made

Backports:

Backports required for: 3.8

Testing & Verification

This change is a trivial rework / code cleanup without any test coverage.
The behavior in this PR was manually tested
This change is already covered by existing tests, such as shell_server_aql, shell_client.
This PR adds tests that were used to verify all changes:
- Added new C++ Unit tests
- Added new integration tests (e.g. in shell_server / shell_server_aql)
- Added new resilience tests (only if the feature is impacted by failovers)
There are tests in an external testing repository:
I ensured this code runs with ASan / TSan or other static verification tools

arangod/Aql/DocumentProducingNode.cpp

arangod/Aql/IndexExecutor.cpp

arangod/Aql/Query.cpp

arangod/RocksDBEngine/Methods/RocksDBTrxMethods.cpp

arangod/RocksDBEngine/RocksDBEdgeIndex.cpp

arangod/RocksDBEngine/Methods/RocksDBTrxMethods.h

arangod/RocksDBEngine/Methods/RocksDBTrxMethods.cpp

neunhoef

Intermediate commit of my comments.

CHANGELOG

arangod/Aql/DocumentProducingNode.cpp

arangod/Aql/IndexExecutor.cpp

neunhoef · 2021-08-27T14:17:11Z

arangod/Aql/MaterializeExecutor.cpp

    _readDocumentContext._outputRow = &output;
    written = collection->getPhysical()->read(
-        &_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback).ok();
+        &_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback, ReadOwnWrites::no).ok();


Why is this hardcoded to ReadOwnWrites::no? At the very least, an explanation is needed.

neunhoef · 2021-08-27T14:25:48Z

arangod/Graph/Cache/RefactoredTraverserCache.cpp

        }
        return true;
-      }).ok();
+      }, ReadOwnWrites::no).ok();


I am not sure why this is hard-wired to no here. Please add a comment.

arangod/RocksDBEngine/Methods/RocksDBReadOnlyBaseMethods.cpp

neunhoef · 2021-08-27T15:00:17Z

arangod/RocksDBEngine/Methods/RocksDBTrxMethods.cpp

+  } else {
+    TRI_ASSERT(_ownsReadWriteBatch == false);
+    TRI_ASSERT(_hasActiveModificationQuery.load() == false);
+    if (_hasActiveModificationQuery.load(std::memory_order_relaxed)) {


Why is this code needed if the assertion is correct? Is it only by coincidence that we do not have a test which tries this? If so, we should add such a test.

I would guess that release builds should return an error instead?

We currently have no tests that perform concurrent operations on the same transaction. I agree that we need such a test, but I would prefer to add it in a separate PR.
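The pattern discussed in this thread, an assertion for maintainer builds followed by a runtime check for release builds, can be sketched as follows. Everything here is hypothetical: `TOY_ASSERT`, `TOY_MAINTAINER_MODE`, `ToyResult` and `ToyTrxState` are illustration-only names, and the error return is the reviewer's suggestion rather than what the PR necessarily does.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical maintainer-mode assertion: active only when
// TOY_MAINTAINER_MODE is defined, compiled out otherwise.
#ifdef TOY_MAINTAINER_MODE
#define TOY_ASSERT(expr) assert(expr)
#else
#define TOY_ASSERT(expr) ((void)0)
#endif

enum class ToyResult { ok, conflictingOperation };

class ToyTrxState {
 public:
  void setActiveModificationQuery(bool active) {
    _hasActiveModificationQuery.store(active);
  }

  // Maintainer builds stop at the assertion if the invariant is violated.
  // Release builds skip the assertion and report an error instead.
  ToyResult beginOperation() const {
    TOY_ASSERT(!_hasActiveModificationQuery.load());
    if (_hasActiveModificationQuery.load(std::memory_order_relaxed)) {
      return ToyResult::conflictingOperation;
    }
    return ToyResult::ok;
  }

 private:
  std::atomic<bool> _hasActiveModificationQuery{false};
};
```

With `TOY_MAINTAINER_MODE` undefined, `beginOperation()` returns `ToyResult::conflictingOperation` while a modification query is flagged as active. With it defined, the same call aborts at the assertion.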

neunhoef

LGTM. In a few places it might be sensible to add more comments, if only so that in half a year we still remember why a certain ReadOwnWrites::no or yes is there. I think the method works and improves our situation. Tests are sufficient.
One area where it would be worthwhile to add a few tests is AQL queries which modify the collection they run over, to make sure that the new behaviour of not seeing their own writes works. An example would be:

FOR doc IN coll
  INSERT {value : doc.value + 1} INTO coll

and then with some index scans, or a graph traversal or the like. But we can also add this later, if time is scarce now (as it always is).
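The query above can be modelled with a toy transaction. This is a sketch under stated assumptions, not the storage engine code: `ToyTransaction`, `snapshot` and `insertIncremented` are hypothetical, and only the `ReadOwnWrites` name with its `yes` and `no` values is taken from the diffs in this PR (its definition here is an assumption). The toy copies the visible state eagerly, whereas a real scan would iterate lazily.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Name and values mirror the flag used in this PR; the definition is assumed.
enum class ReadOwnWrites : bool { no, yes };

// Hypothetical toy transaction: committed documents plus the transaction's
// own, not yet committed writes. Keys are document ids, values are `value`.
class ToyTransaction {
 public:
  explicit ToyTransaction(std::map<int, int> committed)
      : _committed(std::move(committed)) {
    _nextKey = _committed.empty() ? 0 : _committed.rbegin()->first + 1;
  }

  // Buffers an insert under a fresh key; it is not part of the committed
  // state.
  void insert(int value) { _ownWrites[_nextKey++] = value; }

  // Returns the documents a read is allowed to see.
  std::map<int, int> snapshot(ReadOwnWrites readOwnWrites) const {
    std::map<int, int> visible = _committed;
    if (readOwnWrites == ReadOwnWrites::yes) {
      visible.insert(_ownWrites.begin(), _ownWrites.end());
    }
    return visible;
  }

 private:
  std::map<int, int> _committed;
  std::map<int, int> _ownWrites;
  int _nextKey = 0;
};

// Models: FOR doc IN coll INSERT {value: doc.value + 1} INTO coll
// The scan uses ReadOwnWrites::no, so it never sees the documents it inserts
// and stops after the original ones.
std::size_t insertIncremented(ToyTransaction& trx) {
  std::size_t inserted = 0;
  std::map<int, int> const docs = trx.snapshot(ReadOwnWrites::no);
  for (auto const& [key, value] : docs) {
    (void)key;  // only the value is needed for the new document
    trx.insert(value + 1);
    ++inserted;
  }
  return inserted;
}
```

For a collection with two documents, `insertIncremented` performs exactly two inserts. A later read with `ReadOwnWrites::yes` sees four documents, while a read with `ReadOwnWrites::no` still sees two.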

In any case: Well done! This was good and tedious cleanup work and you have done a good job of explaining it to us.

arangod/Transaction/Methods.cpp

neunhoef · 2021-08-27T15:46:24Z

arangod/RocksDBEngine/RocksDBVPackIndex.cpp

        s = mthds->GetForUpdate(_cf, key.string(), &existing); 
      } else {
-        s = mthds->Get(_cf, key.string(), &existing); 
+        s = mthds->Get(_cf, key.string(), &existing, ReadOwnWrites::yes);


Why ReadOwnWrites::yes here? The same question applies to the two other places in this file. Maybe add a comment.

cpjulia

LGTM

arangod/Aql/AqlTransaction.cpp

arangod/Aql/DocumentProducingNode.h

arangod/Aql/EnumerateCollectionExecutor.cpp

arangod/Aql/EnumerateCollectionExecutor.h

cpjulia · 2021-08-29T22:25:12Z

arangod/Aql/MaterializeExecutor.cpp

    _readDocumentContext._outputRow = &output;
    written = collection->getPhysical()->read(
-        &_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback).ok();
+        &_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback, ReadOwnWrites::no).ok();


arangod/Aql/IndexExecutor.h

cpjulia · 2021-08-29T22:28:23Z

arangod/Aql/ModificationNodes.cpp

  OperationOptions options =
      ModificationExecutorHelpers::convertOptions(_options, _outVariableNew, _outVariableOld);
+  // We must not disable indexing for UPSERTs because the subquery might rely on a non-unique secondary index
+  options.canDisableIndexing = false;


Perhaps this could be encapsulated in a setter, or even be initialized in the constructor?

cpjulia · 2021-08-29T22:38:13Z

arangod/Graph/Cache/RefactoredTraverserCache.cpp

        }
        return true;
-      }).ok();
+      }, ReadOwnWrites::no).ok();


cpjulia · 2021-08-29T22:45:14Z

arangod/RocksDBEngine/Methods/RocksDBTrxMethods.cpp

+    // Even though the assertion is only evaluated in maintainer mode, it must at
+    // least compile. But since checkIntermediateCommits is only defined in maintainer
+    // mode, we have to wrap this assert in another ifdef.
+    TRI_ASSERT(!opts.checkIntermediateCommits || !hasIntermediateCommitsEnabled());


Wouldn't it be a && instead of ||?

No, we either want intermediate commits to be disabled, or we want the check to be disabled.
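The answer can be checked with a small truth table. In the sketch below, `assertionHolds` and `stricterVariant` are hypothetical helpers that mirror the asserted expression and the `&&` alternative with plain booleans. By De Morgan's law, `!a || !b` equals `!(a && b)`, so the assertion fails only when the check and intermediate commits are both enabled.

```cpp
#include <cassert>

// Hypothetical helper mirroring the asserted expression from the diff.
constexpr bool assertionHolds(bool checkIntermediateCommits,
                              bool intermediateCommitsEnabled) {
  return !checkIntermediateCommits || !intermediateCommitsEnabled;
}

// The && variant asked about in the review, for comparison.
constexpr bool stricterVariant(bool checkIntermediateCommits,
                               bool intermediateCommitsEnabled) {
  return !checkIntermediateCommits && !intermediateCommitsEnabled;
}

static_assert(assertionHolds(false, false));
static_assert(assertionHolds(false, true));  // check disabled
static_assert(assertionHolds(true, false));  // intermediate commits disabled
static_assert(!assertionHolds(true, true));  // the only forbidden combination

// With &&, a single enabled flag would already trip the assertion.
static_assert(!stricterVariant(true, false));
static_assert(!stricterVariant(false, true));
```

The `static_assert` lines without a message require C++17 or later.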

jsteemann

LGTM!

cpjulia

LGTM

mpoeter · 2021-08-30T15:47:52Z

@maierlars @neunhoef I have added several comments, perhaps you could take another look if anything still needs clarification.

neunhoef

LGTM

jsteemann added 3 commits June 8, 2021 21:23

make iterator usage safer after intermediate commits

b0d7f00

more adjustments

b61528e

more cleanup

d79df00

jsteemann added this to the devel milestone Jun 8, 2021

jsteemann requested review from mpoeter and neunhoef June 8, 2021 21:13

jsteemann mentioned this pull request Jun 8, 2021

make iterator usage safer after intermediate commits #14347

Closed

15 tasks

jsteemann and others added 7 commits June 8, 2021 23:17

added CHANGELOG entry

4ef5aa0

updated CHANGELOG

c727070

Merge branch 'devel' of github.com:arangodb/arangodb into bug-fix/saf…

670150d

…e-iterators-after-intermediate-commits

revert most changes

b35fb33

Merge branch 'devel' of github.com:arangodb/arangodb into bug-fix/saf…

4ffe59b

…e-iterators-after-intermediate-commits

Merge branch 'devel' of github.com:arangodb/arangodb into bug-fix/saf…

174d1b1

…e-iterators-after-intermediate-commits

fix issues with invalid iterator states

0eb8ee8

jsteemann marked this pull request as ready for review July 7, 2021 09:53

jsteemann added 2 commits July 7, 2021 12:18

Merge branch 'devel' into bug-fix/safe-iterators-after-intermediate-c…

aafea55

…ommits

simplify PR

777d07f

jsteemann marked this pull request as draft July 7, 2021 18:40

jsteemann added 9 WIP DO_NOT_MERGE labels Jul 28, 2021

mpoeter added 10 commits August 19, 2021 17:29

Merge remote-tracking branch 'origin' into bug-fix/safe-iterators-aft…

9a48e3c

…er-intermediate-commits

Remove code duplication introduced by merge.

a0f1596

Merge remote-tracking branch 'origin/devel' into bug-fix/safe-iterato…

f3f0dde

…rs-after-intermediate-commits

Extend test suite

201a018

Introduce ReadOwnWrites flag to control what iterators/read operation…

18f0420

…s see when reading from the storage engine.

Introduce dontDisableIndexing operation option for UPSERT inserts.

316ba3c

Adapt unit tests to changed interface.

f92f765

Fix: iteratorMustCheckBounds must consider readOwnWrites flag.

3bbc8ee

Fix tests

2e07cf3

Fix read-own-write semantic for streaming transactions.

9138b6c