enh: [6690002267] Optimize virtual table query with plenty of columns. #34341
base: 3.0
Pull request overview
This pull request optimizes virtual table queries with many columns by changing the dataBlockId type from int16_t to int64_t and implementing several performance improvements.
Changes:
- Changed `dataBlockId` from `int16_t` to `int64_t` throughout the codebase to support more data blocks
- Replaced O(n²) algorithms with O(n) hash-based lookups for column name validation and table reference lookups
- Replaced indexed loops with FOREACH macros for better performance with large node lists
- Removed the 2000 reference table limit for virtual tables
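
To illustrate the second bullet, here is a minimal sketch of how a quadratic duplicate-name check can be replaced by a single pass over a hash set. It is not the code from this pull request: `hash_name` and `has_duplicate_name` are hypothetical names, and the FNV-1a hash with linear probing is an illustrative choice rather than the hash table the codebase actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a string hash (illustrative choice, not taken from the PR). */
static uint64_t hash_name(const char *s) {
  uint64_t h = 1469598103934665603ULL;
  while (*s) {
    h ^= (unsigned char)*s++;
    h *= 1099511628211ULL;
  }
  return h;
}

/*
 * Hypothetical helper: returns 1 if any two of the n names are equal,
 * 0 if all are distinct, and -1 if the temporary table cannot be allocated.
 * Each name is probed and inserted once, so the expected cost is O(n)
 * instead of the O(n^2) of comparing every pair.
 */
int has_duplicate_name(const char *const *names, size_t n) {
  assert(names != NULL || n == 0);

  /* Power-of-two capacity that keeps the load factor at or below 0.5. */
  size_t cap = 16;
  while (cap < 2 * n) cap *= 2;

  const char **slots = calloc(cap, sizeof *slots);
  if (slots == NULL) return -1;

  int found = 0;
  for (size_t i = 0; i < n && !found; ++i) {
    size_t pos = (size_t)hash_name(names[i]) & (cap - 1);
    /* Linear probing: walk until an empty slot or an equal name is hit. */
    while (slots[pos] != NULL) {
      if (strcmp(slots[pos], names[i]) == 0) {
        found = 1;
        break;
      }
      pos = (pos + 1) & (cap - 1);
    }
    if (!found) slots[pos] = names[i];
  }

  free(slots);
  return found;
}
```

Calling `has_duplicate_name` on `{"ts", "c1", "ts"}` reports a duplicate, while `{"ts", "c1", "c2"}` does not. The table stores only pointers to the caller's strings, so the names must stay alive for the duration of the call.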
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| test/cases/05-VirtualTables/test_vtable_max_column_num.py | Updated test to create 33,000 tables instead of 3,000; changed the large column reference tests to expect successful execution instead of errors |
| source/libs/scheduler/src/schJob.c | Refactored indexed loops to FOREACH macros for processing subplans and tasks |
| source/libs/scalar/test/scalar/scalarTests.cpp | Updated function signature to accept int64_t dataBlockId |
| source/libs/scalar/src/scalar.c | Updated error message format strings to handle int64_t dataBlockId |
| source/libs/scalar/src/filter.c | Updated debug output format strings for int64_t dataBlockId |
| source/libs/qcom/src/queryUtil.c | Replaced O(n²) nested loop for duplicate column name checking with O(n) hash-based approach |
| source/libs/planner/src/planUtil.c | Removed trailing newline |
| source/libs/planner/src/planSpliter.c | Refactored error handling in exchange node creation for better maintainability |
| source/libs/planner/src/planPhysiCreater.c | Changed dataBlockId type to int64_t; added merged hash optimization for multi-block slot ID resolution |
| source/libs/planner/src/planOptimizer.c | Improved control flow and error handling in scan path optimization |
| source/libs/planner/src/planLogicCreater.c | Replaced O(n) linear search with O(1) hash lookup for reference table nodes |
| source/libs/planner/inc/planInt.h | Changed nextDataBlockId from int16_t to int64_t |
| source/libs/parser/src/parTranslater.c | Added hash-based colRef lookup; refactored createColumnsByTable for better maintainability |
| source/libs/nodes/src/nodesMsgFuncs.c | Updated serialization/deserialization for int64_t dataBlockId |
| source/libs/nodes/src/nodesCodeFuncs.c | Updated JSON encoding/decoding for int64_t dataBlockId |
| source/libs/index/test/index_executor_tests.cpp | Updated test function signature for int64_t dataBlockId |
| source/libs/executor/test/queryPlanTests.cpp | Updated test code for int64_t dataBlockId |
| source/libs/executor/src/virtualtablescanoperator.c | Changed tagBlockId to int64_t; updated buffer size calculation; changed loop variable type |
| source/libs/executor/src/tsort.c | Changed return type of tsortGetBlockId from uint64_t to int64_t |
| source/libs/executor/src/operator.c | Updated function signatures for int64_t; removed trailing newline |
| source/libs/executor/src/mergeoperator.c | Changed srcBlkIds array element type to int64_t |
| source/libs/executor/src/hashjoinoperator.c | Changed blkId parameter type to int64_t |
| source/libs/executor/src/executil.c | Refactored loop to use FOREACH; improved validation logic |
| source/libs/executor/inc/tsort.h | Changed return type of tsortGetBlockId to int64_t |
| source/libs/executor/inc/operator.h | Changed resultDataBlockId and function return types to int64_t |
| source/libs/executor/inc/mergejoin.h | Changed blkId field type to int64_t |
| source/libs/executor/inc/hashjoin.h | Changed blkId field type to int64_t |
| source/libs/command/src/explain.c | Refactored indexed loops to FOREACH macros |
| source/dnode/vnode/src/meta/metaTable.c | Removed 2000 reference table limit validation |
| source/dnode/vnode/src/bse/bseTable.h | Changed blkIdx from int32_t to int64_t |
| source/client/src/clientHb.c | Removed verbose session metric logging |
| include/libs/nodes/querynodes.h | Changed dataBlockId fields to int64_t in ColumnNode and TargetNode |
| include/libs/nodes/plannodes.h | Changed dataBlockId field to int64_t in SDataBlockDescNode |
| include/common/tcommon.h | Changed blockId field to int64_t in SBlockID |
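
The type change that runs through this table can be motivated with a small sketch. It assumes, without confirmation from the summary, that block IDs can grow to roughly the 33,000 scale used in the updated test; `block_id_fits_int16`, `narrow_block_id`, and `format_block_id` are hypothetical helpers, not functions from the codebase. The last one shows the kind of format-string update that printing an `int64_t` portably requires.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* True if id can be stored in an int16_t without loss (range -32768..32767). */
bool block_id_fits_int16(int64_t id) {
  return id >= INT16_MIN && id <= INT16_MAX;
}

/* Hypothetical narrowing helper: only valid while the ID stays in range. */
int16_t narrow_block_id(int64_t id) {
  assert(block_id_fits_int16(id));
  return (int16_t)id;
}

/*
 * Formats a 64-bit block ID. PRId64 expands to the correct conversion
 * specifier for int64_t on the target platform, which plain "%d" does not
 * guarantee. Returns what snprintf returns: the number of characters that
 * would be written, excluding the terminating NUL.
 */
int format_block_id(char *buf, size_t len, int64_t id) {
  return snprintf(buf, len, "dataBlockId:%" PRId64, id);
}
```

Under that assumption, an ID of 32,767 still fits in `int16_t`, while 33,000 does not, which is consistent with the widening of the type across the headers and serialization code listed above.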
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.