[go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RDF and Rel Group Entries in WAL Recovery #3912

Merged
merged 15 commits into from
Jul 25, 2024

Conversation

yiyun-sj
Copy link
Collaborator

No description provided.

@yiyun-sj yiyun-sj requested a review from ray6080 July 23, 2024 17:28
src/catalog/catalog_set.cpp Outdated Show resolved Hide resolved
Copy link

Benchmark Result

Master commit hash: 091f67166caab85088499b5943d552992237a4b8
Branch commit hash: 371683e2b6a46f9cf05a1edb53c77756c85efced

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 673.10 674.14 -1.04 (-0.15%)
aggregation q28 11831.22 12175.60 -344.38 (-2.83%)
filter q14 153.00 154.42 -1.43 (-0.92%)
filter q15 154.85 153.18 1.66 (1.09%)
filter q16 324.14 323.45 0.68 (0.21%)
filter q17 477.80 486.20 -8.40 (-1.73%)
filter q18 1950.23 1959.01 -8.77 (-0.45%)
fixed_size_expr_evaluator q07 593.76 593.11 0.65 (0.11%)
fixed_size_expr_evaluator q08 816.48 817.61 -1.13 (-0.14%)
fixed_size_expr_evaluator q09 817.03 812.88 4.15 (0.51%)
fixed_size_expr_evaluator q10 271.50 268.46 3.04 (1.13%)
fixed_size_expr_evaluator q11 266.44 261.53 4.91 (1.88%)
fixed_size_expr_evaluator q12 262.63 263.16 -0.53 (-0.20%)
fixed_size_expr_evaluator q13 1492.60 1495.76 -3.16 (-0.21%)
fixed_size_seq_scan q23 149.83 147.41 2.42 (1.64%)
join q31 48.65 48.46 0.19 (0.39%)
ldbc_snb_ic q35 3723.06 3993.58 -270.51 (-6.77%)
ldbc_snb_ic q36 137.66 123.65 14.01 (11.33%)
ldbc_snb_is q32 9.44 10.20 -0.77 (-7.51%)
ldbc_snb_is q33 91.00 92.70 -1.69 (-1.82%)
ldbc_snb_is q34 84.47 95.66 -11.19 (-11.69%)
multi-rel multi-rel-large-scan 3314.74 3183.95 130.79 (4.11%)
multi-rel multi-rel-lookup 43.87 35.06 8.81 (25.12%)
multi-rel multi-rel-small-scan 56.55 75.01 -18.46 (-24.61%)
order_by q25 161.09 157.14 3.96 (2.52%)
order_by q26 469.34 466.35 2.99 (0.64%)
order_by q27 1429.46 1444.33 -14.87 (-1.03%)
scan_after_filter q01 202.12 201.67 0.44 (0.22%)
scan_after_filter q02 187.89 188.65 -0.76 (-0.40%)
shortest_path_ldbc100 q39 165.65 165.94 -0.29 (-0.17%)
var_size_expr_evaluator q03 2091.12 2099.81 -8.69 (-0.41%)
var_size_expr_evaluator q04 2298.29 2249.15 49.15 (2.19%)
var_size_expr_evaluator q05 2644.67 2637.72 6.94 (0.26%)
var_size_expr_evaluator q06 1407.05 1403.54 3.51 (0.25%)
var_size_seq_scan q19 1495.56 1493.47 2.09 (0.14%)
var_size_seq_scan q20 3086.84 3131.91 -45.07 (-1.44%)
var_size_seq_scan q21 2387.51 2428.59 -41.07 (-1.69%)
var_size_seq_scan q22 133.96 135.05 -1.09 (-0.81%)

@yiyun-sj yiyun-sj marked this pull request as ready for review July 23, 2024 20:30
Copy link
Contributor
@ray6080 ray6080 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

#include <utility>

#include "binder/binder.h"
#include "catalog/catalog.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double check if this include catalog/catalog.h is necessary.

@@ -51,11 +54,12 @@ CatalogEntry* CatalogSet::getEntryNoLock(Transaction* transaction, const std::st
}

void CatalogSet::createEntry(Transaction* transaction, std::unique_ptr<CatalogEntry> entry) {
std::lock_guard lck{mtx};
createEntryNoLock(transaction, std::move(entry));
std::unique_lock lck{mtx};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See if you can reuse UniqLock under common/uniq_lock.h.

}

void CatalogSet::createEntryNoLock(Transaction* transaction, std::unique_ptr<CatalogEntry> entry) {
void CatalogSet::createEntryNoLock(Transaction* transaction, std::unique_ptr<CatalogEntry> entry,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this back to createEntry. I think passing lock as param in is quite evident itself.

@@ -75,6 +79,7 @@ void CatalogSet::createEntryNoLock(Transaction* transaction, std::unique_ptr<Cat
entries.emplace(entry->getName(), std::move(dummyEntry));
auto entryPtr = entry.get();
emplaceNoLock(std::move(entry));
lck.unlock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not unlock here. Instead you can move the part of push undo buffer out of this function.
It's not a good practice to unlock a passed in lock. Always better to lock and unlock at the same level.

@@ -142,7 +147,7 @@ void CatalogSet::dropEntryNoLock(Transaction* transaction, const std::string& na
}

void CatalogSet::alterEntry(Transaction* transaction, const binder::BoundAlterInfo& alterInfo) {
std::lock_guard lck{mtx};
std::unique_lock lck{mtx};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here on UniqLock.

@@ -54,12 +54,14 @@ class Deserializer {
void deserializeVector(std::vector<T>& values) {
uint64_t vectorSize;
deserializeValue(vectorSize);
values.resize(vectorSize);
for (auto& value : values) {
values.reserve(vectorSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change this function?

Don't do reserve and emplace_back. This assumes the input values is empty before deserializing. but it can be violated. either change to resize or at least assert that values is empty before deserializing.

@@ -36,6 +37,7 @@ class WAL {

~WAL();

void logCreateTableEntryRecord(binder::BoundCreateTableInfo tableInfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here to explain logCreateTableEntryRecord and logCreateCatalogEntryRecord, as their names are not evident enough.

Alternatively you can rename them to logCreateTableEntryRecord and logCreateNonTableEntryRecord, which is not perfect but somewhat makes sense.

@@ -65,10 +70,16 @@ void Transaction::pushCatalogEntry(CatalogSet& catalogSet, CatalogEntry& catalog
case CatalogEntryType::TYPE_ENTRY: {
KU_ASSERT(
catalogEntry.getType() == CatalogEntryType::DUMMY_ENTRY && catalogEntry.isDeleted());
if (newCatalogEntry->hasParent()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for sequence? I'm not following sequence for now. Maybe you can just walk me through this line of change tomorrow.

@ray6080
Copy link
Contributor
ray6080 commented Jul 24, 2024

Can u unskip and double check tests related to dataset/rdf/rdf_variant?

@@ -50,12 +50,24 @@ CatalogEntry* CatalogSet::getEntryNoLock(Transaction* transaction, const std::st
return entry;
}

static void LogEntryForTrx(Transaction* transaction, CatalogSet& set, CatalogEntry& entry) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogEntryForTrx -> logEntryForTrx.

Function names always stay as the first letter lower case regardless of static or non-static

@yiyun-sj yiyun-sj force-pushed the wal-rdf-rel-group-entry-fix branch from 2e33352 to 784f973 Compare July 24, 2024 21:19
Copy link

Benchmark Result

Master commit hash: 8e80d3e4934a5ee304fff88ee13810ce59395cd3
Branch commit hash: e2cda1e96c9ea583fb41fcdb19882870a04c652d

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 667.92 689.16 -21.24 (-3.08%)
aggregation q28 11108.43 11433.05 -324.62 (-2.84%)
copy node-Comment 66782.84 71246.25 -4463.41 (-6.26%)
copy node-Forum 5096.72 5078.71 18.01 (0.35%)
copy node-Organisation 1449.82 1676.89 -227.07 (-13.54%)
copy node-Person 2655.13 2486.89 168.24 (6.77%)
copy node-Place 1319.30 1485.43 -166.13 (-11.18%)
copy node-Post 25130.18 27990.57 -2860.39 (-10.22%)
copy node-Tag 1530.04 1492.53 37.51 (2.51%)
copy node-Tagclass 858.87 1461.78 -602.91 (-41.24%)
copy rel-comment-hasCreator 102291.81 106345.39 -4053.58 (-3.81%)
copy rel-comment-hasTag 226513.77 178990.98 47522.79 (26.55%)
copy rel-comment-isLocatedIn 152669.72 147715.40 4954.32 (3.35%)
copy rel-containerOf 29816.42 35437.40 -5620.98 (-15.86%)
copy rel-forum-hasTag 12344.76 15438.50 -3093.74 (-20.04%)
copy rel-hasInterest 8272.29 8302.77 -30.48 (-0.37%)
copy rel-hasMember 122557.32 101447.43 21109.89 (20.81%)
copy rel-hasModerator 4162.04 4663.85 -501.81 (-10.76%)
copy rel-hasType 6829.25 6772.47 56.78 (0.84%)
copy rel-isPartOf 7181.84 6976.39 205.45 (2.94%)
copy rel-isSubclassOf 6378.50 6382.04 -3.54 (-0.06%)
copy rel-knows 20177.41 26715.98 -6538.57 (-24.47%)
copy rel-likes-comment 167848.89 168607.90 -759.01 (-0.45%)
copy rel-likes-post 56303.92 65167.90 -8863.98 (-13.60%)
copy rel-organisation-isLocatedIn 5822.92 5918.14 -95.22 (-1.61%)
copy rel-person-isLocatedIn 5352.09 5441.29 -89.20 (-1.64%)
copy rel-post-hasCreator 35012.72 37542.11 -2529.39 (-6.74%)
copy rel-post-hasTag 51732.14 51993.45 -261.31 (-0.50%)
copy rel-post-isLocatedIn 40826.31 40769.51 56.80 (0.14%)
copy rel-replyOf-comment 101033.20 101451.79 -418.59 (-0.41%)
copy rel-replyOf-post 76241.00 64363.88 11877.12 (18.45%)
copy rel-studyAt 12231.18 10822.92 1408.26 (13.01%)
copy rel-workAt 11414.51 11629.66 -215.15 (-1.85%)
filter q14 144.37 156.34 -11.97 (-7.66%)
filter q15 154.64 151.88 2.76 (1.82%)
filter q16 320.91 322.66 -1.75 (-0.54%)
filter q17 465.55 475.03 -9.48 (-2.00%)
filter q18 1885.45 1950.55 -65.10 (-3.34%)
fixed_size_expr_evaluator q07 581.30 589.99 -8.70 (-1.47%)
fixed_size_expr_evaluator q08 807.74 815.70 -7.96 (-0.98%)
fixed_size_expr_evaluator q09 806.26 812.37 -6.11 (-0.75%)
fixed_size_expr_evaluator q10 258.39 268.54 -10.15 (-3.78%)
fixed_size_expr_evaluator q11 253.07 262.39 -9.32 (-3.55%)
fixed_size_expr_evaluator q12 253.07 261.37 -8.30 (-3.18%)
fixed_size_expr_evaluator q13 1494.08 1495.56 -1.48 (-0.10%)
fixed_size_seq_scan q23 134.99 149.02 -14.03 (-9.42%)
join q31 46.31 50.28 -3.97 (-7.90%)
ldbc_snb_ic q35 3701.25 3729.56 -28.30 (-0.76%)
ldbc_snb_ic q36 129.99 129.77 0.21 (0.16%)
ldbc_snb_is q32 10.44 10.18 0.26 (2.57%)
ldbc_snb_is q33 96.26 97.28 -1.02 (-1.05%)
ldbc_snb_is q34 92.68 95.39 -2.71 (-2.84%)
multi-rel multi-rel-large-scan 3684.34 3175.36 508.99 (16.03%)
multi-rel multi-rel-lookup 41.11 44.99 -3.88 (-8.62%)
multi-rel multi-rel-small-scan 66.53 71.62 -5.09 (-7.11%)
order_by q25 149.92 155.96 -6.04 (-3.87%)
order_by q26 460.50 466.51 -6.01 (-1.29%)
order_by q27 1406.37 1446.85 -40.48 (-2.80%)
scan_after_filter q01 190.28 200.23 -9.95 (-4.97%)
scan_after_filter q02 178.34 186.58 -8.23 (-4.41%)
shortest_path_ldbc100 q39 90.94 154.72 -63.79 (-41.23%)
var_size_expr_evaluator q03 2062.76 2103.64 -40.88 (-1.94%)
var_size_expr_evaluator q04 2188.28 2279.55 -91.27 (-4.00%)
var_size_expr_evaluator q05 2692.66 2643.44 49.22 (1.86%)
var_size_expr_evaluator q06 1333.52 1396.89 -63.37 (-4.54%)
var_size_seq_scan q19 1472.41 1486.03 -13.63 (-0.92%)
var_size_seq_scan q20 3151.98 3122.87 29.11 (0.93%)
var_size_seq_scan q21 2410.87 2432.89 -22.02 (-0.91%)
var_size_seq_scan q22 132.39 133.64 -1.25 (-0.93%)

@ray6080 ray6080 force-pushed the wal-rdf-rel-group-entry-fix branch from e1e0b48 to 65d3ce0 Compare July 25, 2024 04:19
Copy link

Benchmark Result

Master commit hash: b59cd52b6d28d3ec6fb9b77c7b979145e7926268
Branch commit hash: 7cd3114d8844a3869b2f776debe588d26b9dadc3

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 666.43 671.71 -5.28 (-0.79%)
aggregation q28 11614.56 12072.17 -457.62 (-3.79%)
copy node-Comment 63318.43 N/A N/A
copy node-Forum 5142.30 N/A N/A
copy node-Organisation 1548.88 N/A N/A
copy node-Person 2683.61 N/A N/A
copy node-Place 1444.37 N/A N/A
copy node-Post 24877.33 N/A N/A
copy node-Tag 1624.53 N/A N/A
copy node-Tagclass 986.90 N/A N/A
copy rel-comment-hasCreator 100975.19 N/A N/A
copy rel-comment-hasTag 189954.82 N/A N/A
copy rel-comment-isLocatedIn 147334.08 N/A N/A
copy rel-containerOf 33982.89 N/A N/A
copy rel-forum-hasTag 12451.83 N/A N/A
copy rel-hasInterest 8291.89 N/A N/A
copy rel-hasMember 121706.82 N/A N/A
copy rel-hasModerator 4965.10 N/A N/A
copy rel-hasType 6796.33 N/A N/A
copy rel-isPartOf 7024.78 N/A N/A
copy rel-isSubclassOf 6411.29 N/A N/A
copy rel-knows 20151.53 N/A N/A
copy rel-likes-comment 124268.12 N/A N/A
copy rel-likes-post 67160.43 N/A N/A
copy rel-organisation-isLocatedIn 5759.61 N/A N/A
copy rel-person-isLocatedIn 5296.29 N/A N/A
copy rel-post-hasCreator 37107.18 N/A N/A
copy rel-post-hasTag 51445.75 N/A N/A
copy rel-post-isLocatedIn 45743.67 N/A N/A
copy rel-replyOf-comment 100181.53 N/A N/A
copy rel-replyOf-post 76701.86 N/A N/A
copy rel-studyAt 11226.38 N/A N/A
copy rel-workAt 12209.12 N/A N/A
filter q14 144.79 151.85 -7.06 (-4.65%)
filter q15 145.96 156.11 -10.15 (-6.50%)
filter q16 321.82 326.91 -5.08 (-1.55%)
filter q17 468.68 472.99 -4.31 (-0.91%)
filter q18 1884.10 1972.22 -88.12 (-4.47%)
fixed_size_expr_evaluator q07 582.48 593.26 -10.78 (-1.82%)
fixed_size_expr_evaluator q08 804.89 813.08 -8.19 (-1.01%)
fixed_size_expr_evaluator q09 802.60 813.91 -11.30 (-1.39%)
fixed_size_expr_evaluator q10 256.82 270.55 -13.73 (-5.08%)
fixed_size_expr_evaluator q11 250.72 262.53 -11.81 (-4.50%)
fixed_size_expr_evaluator q12 250.78 263.72 -12.94 (-4.91%)
fixed_size_expr_evaluator q13 1490.44 1489.31 1.13 (0.08%)
fixed_size_seq_scan q23 136.45 145.59 -9.14 (-6.28%)
join q31 44.05 51.77 -7.72 (-14.92%)
ldbc_snb_ic q35 3751.30 3754.42 -3.13 (-0.08%)
ldbc_snb_ic q36 127.25 131.48 -4.22 (-3.21%)
ldbc_snb_is q32 9.71 10.43 -0.73 (-6.97%)
ldbc_snb_is q33 95.41 98.39 -2.97 (-3.02%)
ldbc_snb_is q34 98.47 90.23 8.25 (9.14%)
multi-rel multi-rel-large-scan 3297.19 3373.48 -76.28 (-2.26%)
multi-rel multi-rel-lookup 54.27 46.06 8.20 (17.81%)
multi-rel multi-rel-small-scan 78.74 61.03 17.71 (29.02%)
order_by q25 148.45 154.00 -5.55 (-3.61%)
order_by q26 460.60 462.80 -2.20 (-0.48%)
order_by q27 1402.02 1444.29 -42.27 (-2.93%)
scan_after_filter q01 196.90 200.33 -3.43 (-1.71%)
scan_after_filter q02 184.77 191.85 -7.07 (-3.69%)
shortest_path_ldbc100 q39 56.77 150.83 -94.05 (-62.36%)
var_size_expr_evaluator q03 2085.72 2071.88 13.85 (0.67%)
var_size_expr_evaluator q04 2221.68 2278.34 -56.66 (-2.49%)
var_size_expr_evaluator q05 2683.65 2651.37 32.28 (1.22%)
var_size_expr_evaluator q06 1332.49 1352.93 -20.44 (-1.51%)
var_size_seq_scan q19 1469.71 1500.24 -30.53 (-2.03%)
var_size_seq_scan q20 3175.48 3163.14 12.34 (0.39%)
var_size_seq_scan q21 2408.92 2450.90 -41.98 (-1.71%)
var_size_seq_scan q22 132.27 134.43 -2.16 (-1.61%)

@ray6080 ray6080 merged commit 6ab2b52 into master Jul 25, 2024
27 of 29 checks passed
@ray6080 ray6080 deleted the wal-rdf-rel-group-entry-fix branch July 25, 2024 06:03
acquamarin pushed a commit that referenced this pull request Jul 25, 2024
* initial reimpl

* Run clang-format

* add tests and fix locking

* Run clang-format

* address comments

* Run clang-format

* fix small mistake

* bump versions

* change name casing

* fix logging

* export symbol for extension

* change default constructor

* make change for dllexport

* fix sequence log to wal during recovery

* bump version again

---------

Co-authored-by: CI Bot <yiyun-sj@users.noreply.github.com>
Co-authored-by: Guodong Jin <guod.jin@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants