8000 make iterator usage safer after intermediate commits by jsteemann · Pull Request #14346 · arangodb/arangodb · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@jsteemann
Copy link
Contributor
@jsteemann jsteemann commented Jun 8, 2021

Scope & Purpose

When a transaction triggers an intermediate commit, this may affect rocksdb::Iterator objects used by the transaction. These may become silently invalidated if they point to nodes in the committed transaction's WriteBatchWithIndex's index (i.e. the SkipList).

The PR also removes superfluous out-of-bounds checks for read-only transactions, which could even see a small speedup due to the changes.

We should add a lot more tests for these changes and the behavior. In our normal tests, intermediate commits are not happening frequently, and often they have no observable effect on the iterators.

  • 💩 Bugfix (requires CHANGELOG entry)
  • 🍕 New feature (requires CHANGELOG entry, feature documentation and release notes)
  • 🔥 Performance improvement
  • 🔨 Refactoring/simplification
  • 📖 CHANGELOG entry made

Backports:

  • Backports required for: 3.8

Testing & Verification

  • This change is a trivial rework / code cleanup without any test coverage.
  • The behavior in this PR was manually tested
  • This change is already covered by existing tests, such as shell_server_aql, shell_client.
  • This PR adds tests that were used to verify all changes:
    • Added new C++ Unit tests
    • Added new integration tests (e.g. in shell_server / shell_server_aql)
    • Added new resilience tests (only if the feature is impacted by failovers)
  • There are tests in an external testing repository:
  • I ensured this code runs with ASan / TSan or other static verification tools

@jsteemann jsteemann added this to the devel milestone Jun 8, 2021
@jsteemann jsteemann requested review from mpoeter and neunhoef June 8, 2021 21:13
@jsteemann jsteemann marked this pull request as ready for review July 7, 2021 09:53
@jsteemann jsteemann marked this pull request as draft July 7, 2021 18:40
Copy link
Member
@neunhoef neunhoef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intermediate commit of my comments.

_readDocumentContext._outputRow = &output;
written = collection->getPhysical()->read(
&_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback).ok();
&_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback, ReadOwnWrites::no).ok();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this hardcoded to ReadOwnWrites::no. At the very least an explanation for this is needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

}
return true;
}).ok();
}, ReadOwnWrites::no).ok();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why this is hard-wired to no here. Please add a comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

} else {
TRI_ASSERT(_ownsReadWriteBatch == false);
TRI_ASSERT(_hasActiveModificationQuery.load() == false);
if (_hasActiveModificationQuery.load(std::memory_order_relaxed)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this code if the assertion is correct? Is it only by coincidence that we do not have a test which tries this? In this case we should add such a test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would guess that release builds should return an error instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently have no tests that perform concurrent operations on the same transaction. I agree that we need such a test, but I would prefer to add it in a separate PR.

Copy link
Member
@neunhoef neunhoef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think in a few places it might be sensible to add a few more comments, if only so that we remember ourselves in half a year, why a certain ReadOwnWrites::no or yes is there. I think the method works and improves our situation. Tests are sufficient.
One area where it would be worthwhile to add a few tests is some AQL queries which modify the collection they run over, to make sure that the new behaviour of not seeing the own writes works. An example would be:

FOR doc IN coll
  INSERT {value : doc.value + 1} INTO coll

and then with some index scans, or a graph traversal or the like. But we can also add this later, if time is scarce now (as it always is).

In any case: Well done! This was good and tedious cleanup work and you have done a good job of explaining it to us.

s = mthds->GetForUpdate(_cf, key.string(), &existing);
} else {
s = mthds->Get(_cf, key.string(), &existing);
s = mthds->Get(_cf, key.string(), &existing, ReadOwnWrites::yes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why yes here? Same for the two other places in this file. Maybe add a comment.

Copy link
Contributor
@cpjulia cpjulia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

_readDocumentContext._outputRow = &output;
written = collection->getPhysical()->read(
&_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback).ok();
&_trx, LocalDocumentId(input.getValue(docRegId).slice().getUInt()), callback, ReadOwnWrites::no).ok();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

OperationOptions options =
ModificationExecutorHelpers::convertOptions(_options, _outVariableNew, _outVariableOld);
// We must not disable indexing for UPSERTs because the subquery might rely on a non-unique secondary index
options.canDisableIndexing = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps this could be encapsulated in a setter, or even be initialized in the constructor?

}
return true;
}).ok();
}, ReadOwnWrites::no).ok();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

// Even though the assertion is only evaluated in maintainer mode, it must at
// least compile. But since checkIntermediateCommits is only defined in maintainer
// mode, we have to wrap this assert in another ifdef.
TRI_ASSERT(!opts.checkIntermediateCommits || !hasIntermediateCommitsEnabled());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it be a &&?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we either want intermediate commits to be disabled, or we want the check to be disabled.

Copy link
Contributor Author
@jsteemann jsteemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor
@cpjulia cpjulia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mpoeter
Copy link
Contributor
mpoeter commented Aug 30, 2021

@maierlars @neunhoef I have added several comments, perhaps you could take another look if anything still needs clarification.

Copy link
Member
@neunhoef neunhoef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jsteemann jsteemann merged commit 7055864 into devel Aug 31, 2021
@jsteemann jsteemann deleted the bug-fix/safe-iterators-after-intermediate-commits branch August 31, 2021 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants

0