-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft] rust-agent-qbg #2751
base: main
Are you sure you want to change the base?
[Draft] rust-agent-qbg #2751
Conversation
📝 Walkthrough📝 WalkthroughWalkthroughThe changes in this pull request introduce a new library, Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
[CHATOPS:HELP] ChatOps commands.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 15
🧹 Outside diff range and nitpick comments (10)
rust/libs/algorithm/Cargo.toml (1)
25-25
: Documentation needed for QBG integrationSince this is a draft PR introducing a new algorithm component, please provide:
- Description of what QBG is and its purpose
- Integration requirements and dependencies
- Any performance implications or considerations
Consider adding these details to:
- PR description
- README in the QBG module
- Code documentation
rust/libs/algorithms/qbg/src/input.h (2)
32-39
: Add parameter names toset_qbg_construction_parameters
for clarityIncluding parameter names in the function declaration enhances readability and helps developers understand the purpose of each parameter.
49-65
: Include parameter names inset_qbg_build_parameters
functionProviding parameter names in the function declaration improves code comprehension and maintainability.
rust/libs/algorithms/qbg/src/lib.rs (5)
59-62
: Inconsistent Integer Types inset_hierarchical_clustering_init_mode
The function
set_hierarchical_clustering_init_mode
usesi16
for the parameterhierarchical_clustering_init_mode
, whereas similar parameters in other functions usei32
. Consider using consistent integer types to avoid potential issues.
63-65
: Type Mismatch inset_number_of_second_objects
The parameter
number_of_second_objects
inset_number_of_second_objects
is of typeu32
, while other similar parameters useusize
. For consistency and to prevent potential bugs, consider usingusize
here as well.
109-130
: Use Meaningful Values in Test Property SettersIn the test
test_qbg
, many property setters are assigned the value1
. Using more realistic and diverse values can improve the effectiveness of the test by covering a wider range of scenarios.
184-185
: Simplify Return Statements in Test FunctionsThe lines
return Ok(());
can be simplified to justOk(())
for brevity and idiomatic Rust code.Also applies to: 216-217
84-85
: Clarify Parameter Naming and Types innew_index
andnew_prebuilt_index
The
new_index
function takes aPin<&mut Property>
as the second parameter namedp
, whilenew_prebuilt_index
takes abool
namedp
. This overloading ofp
with different types may cause confusion. Consider renaming the boolean parameter to something more descriptive, likeprebuilt
, for clarity.rust/libs/algorithms/qbg/src/input.cpp (2)
315-328
: Missing Error Handling in Factory FunctionsThe factory functions
new_index
andnew_prebuilt_index
do not handle exceptions that may be thrown during object creation, potentially leading to unhandled exceptions.Consider using exception handling to capture and report any errors during object creation:
std::unique_ptr<Index> new_index(const rust::String &path, Property &p) { try { return std::make_unique<Index>(path, p); } catch (const std::exception &e) { // Handle exception, possibly log the error std::cerr << "Error creating Index: " << e.what() << std::endl; return nullptr; } }
227-238
: Use ofconst char*
Instead ofstd::string
inopen_index
Passing
cpath.c_str()
may lead to issues ifcpath
goes out of scope beforec_str()
is used.Since
cpath
is a local variable, it's safe here, but consider passingstd::string
directly if the underlying function accepts it, to reduce unnecessary conversions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
rust/Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (7)
rust/Cargo.toml
(1 hunks)rust/libs/algorithm/Cargo.toml
(1 hunks)rust/libs/algorithms/qbg/Cargo.toml
(1 hunks)rust/libs/algorithms/qbg/build.rs
(1 hunks)rust/libs/algorithms/qbg/src/input.cpp
(1 hunks)rust/libs/algorithms/qbg/src/input.h
(1 hunks)rust/libs/algorithms/qbg/src/lib.rs
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- rust/libs/algorithms/qbg/Cargo.toml
- rust/libs/algorithms/qbg/build.rs
🔇 Additional comments (4)
rust/Cargo.toml (2)
24-24
: LGTM! The workspace member addition follows project conventions.
The new qbg algorithm library is properly placed alongside other algorithm implementations (ngt, faiss).
24-24
: Verify qbg library setup
Let's verify that the qbg library directory exists and has the required Cargo.toml configuration.
✅ Verification successful
QBG library setup is properly configured
The verification shows that the qbg library is correctly set up with:
- Directory structure present at
rust/libs/algorithms/qbg/
- Required files:
Cargo.toml
,build.rs
, andsrc/
directory - Proper Cargo.toml configuration with:
- Package metadata
- Required dependencies (anyhow, cxx)
- Build dependencies (cxx-build, miette)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify qbg library structure and configuration
# Check if the directory exists
if [ ! -d "rust/libs/algorithms/qbg" ]; then
echo "Error: qbg directory does not exist at expected path"
exit 1
fi
# Check for Cargo.toml and essential files
echo "Checking qbg library structure:"
ls -la rust/libs/algorithms/qbg/
# Display Cargo.toml contents if it exists
if [ -f "rust/libs/algorithms/qbg/Cargo.toml" ]; then
echo -e "\nQBG Cargo.toml contents:"
cat rust/libs/algorithms/qbg/Cargo.toml
else
echo "Error: Cargo.toml not found in qbg directory"
fi
Length of output: 1461
rust/libs/algorithm/Cargo.toml (1)
25-25
: Verify QBG integration path and dependencies
The dependency path looks consistent with other algorithm dependencies, but let's verify the integration.
✅ Verification successful
QBG dependency path and integration verified
The QBG integration appears to be properly structured:
- QBG directory exists at the correct path (
rust/libs/algorithms/qbg
) - The library implements a C++ bridge for QBG index construction and search
- Proper license headers and version metadata are present
- No direct usage in algorithm lib yet, which is expected for a draft PR
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify QBG integration
# Check if qbg directory exists
echo "Checking qbg directory..."
if [ -d "rust/libs/algorithms/qbg" ]; then
echo "✓ QBG directory exists"
else
echo "✗ QBG directory not found"
fi
# Check for other potential integration points
echo -e "\nChecking for other QBG references..."
rg -l "qbg" --type rust
Length of output: 336
Script:
#!/bin/bash
# Let's check the content of lib.rs to understand the integration points
echo "=== Content of QBG lib.rs ==="
cat rust/libs/algorithms/qbg/src/lib.rs
echo -e "\n=== Checking Cargo.toml for metadata and description ==="
cat rust/libs/algorithms/qbg/Cargo.toml
echo -e "\n=== Checking for any usage in algorithm lib ==="
rg "qbg" rust/libs/algorithm/src/ -A 2 -B 2
Length of output: 9834
rust/libs/algorithms/qbg/src/input.cpp (1)
273-273
: Incorrect Use of v.length()
in append
The use of v.length()
may be incorrect if v
does not have a length()
method or if it does not return the expected size.
Verify that v.length()
correctly returns the number of elements in the slice. If not, use v.size()
or adjust accordingly.
Run the following script to check for the correct method to retrieve the size of rust::Slice
:
Ensure that the correct method is consistently used across the codebase.
✅ Verification successful
Let's search for actual usage patterns of rust::Slice in the codebase to understand how it's being used and what methods are available. I've also included a search for the specific vector construction pattern to see the surrounding context.
Use of v.length()
is correct for rust::Slice
Looking at the codebase usage, v.length()
is the correct method for rust::Slice
. This is evidenced by its usage in qbg/src/input.cpp
and supported by the fact that rust::Slice
is a Rust-C++ bridge type where the Rust convention for getting collection size is length()
. The code correctly constructs a vector from the slice's iterators and uses length()
to get its size.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for the correct method to get size from rust::Slice in the codebase.
# Search for usage of rust::Slice methods
rg --type-add 'cpp' --files-with-matches 'rust::Slice' | xargs grep -E 'rust::Slice.*\.(length\(\)|size\(\))' -A 2
Length of output: 172
Script:
#!/bin/bash
# Search for rust::Slice usage patterns in cpp files
rg "rust::Slice" -A 3 -g "*.cpp"
# Also search for the specific line context
rg "std::vector<float>\(.*\.begin\(\),.*\.end\(\)\)" -g "*.cpp" -A 2
Length of output: 2993
d6e423f
to
e70ea8d
Compare
Deploying vald with Cloudflare Pages
|
e70ea8d
to
6de8a6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (2)
rust/libs/algorithms/qbg/src/input.cpp (1)
31-34
: Add const correctness to getter methods.The getter methods should be marked as const since they don't modify the object's state.
-QBGConstructionParameters *Property::get_qbg_construction_parameters() +QBGConstructionParameters *Property::get_qbg_construction_parameters() const { return qbg_construction_parameters; } -QBGBuildParameters *Property::get_qbg_build_parameters() +QBGBuildParameters *Property::get_qbg_build_parameters() const { return qbg_build_parameters; }Also applies to: 95-98
rust/libs/algorithms/qbg/src/lib.rs (1)
362-443
: Add error case testingThe test only covers the happy path. Add tests for error cases:
- Invalid dimensions
- Empty vectors
- Out of bounds K values
- Invalid file paths
Add error case tests:
#[test] fn test_ffi_qbg_errors() -> Result<()> { let path = "invalid/path".to_string(); let mut p = Property::new(); p.init_qbg_construction_parameters(); // Test invalid path assert!(Index::new(&path, &mut p).is_err()); // Test invalid dimensions p.set_dimension(0); assert!(Index::new(&"index".to_string(), &mut p).is_err()); Ok(()) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
rust/Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (7)
rust/Cargo.toml
(1 hunks)rust/libs/algorithm/Cargo.toml
(1 hunks)rust/libs/algorithms/qbg/Cargo.toml
(1 hunks)rust/libs/algorithms/qbg/build.rs
(1 hunks)rust/libs/algorithms/qbg/src/input.cpp
(1 hunks)rust/libs/algorithms/qbg/src/input.h
(1 hunks)rust/libs/algorithms/qbg/src/lib.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- rust/Cargo.toml
- rust/libs/algorithm/Cargo.toml
- rust/libs/algorithms/qbg/Cargo.toml
- rust/libs/algorithms/qbg/build.rs
- rust/libs/algorithms/qbg/src/input.h
🔇 Additional comments (2)
rust/libs/algorithms/qbg/src/input.cpp (2)
230-233
: Uncomment and implement the destructor.
The destructor is currently commented out, which could lead to resource leaks. The index should be properly closed when the object is destroyed.
335-348
: LGTM! Well-implemented factory functions.
The factory functions make good use of modern C++ practices with std::unique_ptr for automatic memory management.
void Property::set_extended_dimension(rust::usize extended_dimension) | ||
{ | ||
qbg_construction_parameters->extended_dimension = extended_dimension; | ||
} | ||
|
||
void Property::set_dimension(rust::usize dimension) | ||
{ | ||
qbg_construction_parameters->dimension = dimension; | ||
} | ||
|
||
void Property::set_number_of_subvectors(rust::usize number_of_subvectors) | ||
{ | ||
qbg_construction_parameters->number_of_subvectors = number_of_subvectors; | ||
} | ||
|
||
void Property::set_number_of_blobs(rust::usize number_of_blobs) | ||
{ | ||
qbg_construction_parameters->number_of_blobs = number_of_blobs; | ||
} | ||
|
||
void Property::set_internal_data_type(rust::i32 internal_data_type) | ||
{ | ||
qbg_construction_parameters->internal_data_type = internal_data_type; | ||
} | ||
|
||
void Property::set_data_type(rust::i32 data_type) | ||
{ | ||
qbg_construction_parameters->data_type = data_type; | ||
} | ||
|
||
void Property::set_distance_type(rust::i32 distance_type) | ||
{ | ||
qbg_construction_parameters->distance_type = distance_type; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add parameter validation in setter methods.
The setter methods for construction parameters should validate their input values to ensure they are within acceptable ranges. For example, dimension and number_of_subvectors should be positive values.
Example implementation for one setter:
void Property::set_dimension(rust::usize dimension)
{
+ if (dimension == 0) {
+ throw std::invalid_argument("dimension must be positive");
+ }
qbg_construction_parameters->dimension = dimension;
}
Committable suggestion skipped: line range outside the PR's diff.
void Index::search( | ||
rust::Slice<const rust::f32> v, | ||
rust::usize k, | ||
rust::i32 *ids, | ||
rust::f32 *distances) | ||
{ | ||
QBGQuery query; | ||
qbg_initialize_query(&query); | ||
std::vector<float> vec(v.begin(), v.end()); | ||
query.query = vec.data(); | ||
|
||
NGTError err = ngt_create_error_object(); | ||
NGTObjectDistances results = ngt_create_empty_results(err); | ||
bool ok = qbg_search_index(index, query, results, err); | ||
if (!ok) | ||
{ | ||
std::cerr << "Error: " << __func__ << std::endl; | ||
std::cerr << ngt_get_error_string(err) << std::endl; | ||
qbg_destroy_results(results); | ||
ngt_destroy_error_object(err); | ||
return; | ||
} | ||
|
||
size_t rsize = qbg_get_result_size(results, err); | ||
size_t limit = std::min(k, rsize); | ||
for (size_t i = 0; i < limit; i++) | ||
{ | ||
NGTObjectDistance obj = qbg_get_result(results, i, err); | ||
ids[i] = obj.id; | ||
distances[i] = obj.distance; | ||
} | ||
|
||
qbg_destroy_results(results); | ||
ngt_destroy_error_object(err); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider thread safety in search method.
The search method uses shared resources (index
) without synchronization. If this method is called from multiple threads, it could lead to race conditions.
Consider adding mutex protection:
+#include <mutex>
+
class Index {
+private:
+ std::mutex search_mutex;
+
public:
void search(
rust::Slice<const rust::f32> v,
rust::usize k,
rust::i32 *ids,
rust::f32 *distances)
{
+ std::lock_guard<std::mutex> lock(search_mutex);
// ... rest of the method
}
};
Committable suggestion skipped: line range outside the PR's diff.
rust::i32 Index::append(rust::Slice<const rust::f32> v) | ||
{ | ||
NGTError err = ngt_create_error_object(); | ||
std::vector<float> vec(v.begin(), v.end()); | ||
unsigned int id = qbg_append_object(index, vec.data(), v.length(), err); | ||
if (id == 0) | ||
{ | ||
std::cerr << ngt_get_error_string(err) << std::endl; | ||
ngt_destroy_error_object(err); | ||
return 0; | ||
} | ||
ngt_destroy_error_object(err); | ||
return id; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve error handling in append method.
The method returns 0 on error, which could be confused with a valid ID. Consider using exceptions or a more explicit error handling mechanism.
-rust::i32 Index::append(rust::Slice<const rust::f32> v)
+rust::i32 Index::append(rust::Slice<const rust::f32> v)
{
NGTError err = ngt_create_error_object();
std::vector<float> vec(v.begin(), v.end());
unsigned int id = qbg_append_object(index, vec.data(), v.length(), err);
if (id == 0)
{
- std::cerr << ngt_get_error_string(err) << std::endl;
+ std::string error_msg = ngt_get_error_string(err);
ngt_destroy_error_object(err);
- return 0;
+ throw std::runtime_error("Failed to append object: " + error_msg);
}
ngt_destroy_error_object(err);
return id;
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
rust::i32 Index::append(rust::Slice<const rust::f32> v) | |
{ | |
NGTError err = ngt_create_error_object(); | |
std::vector<float> vec(v.begin(), v.end()); | |
unsigned int id = qbg_append_object(index, vec.data(), v.length(), err); | |
if (id == 0) | |
{ | |
std::cerr << ngt_get_error_string(err) << std::endl; | |
ngt_destroy_error_object(err); | |
return 0; | |
} | |
ngt_destroy_error_object(err); | |
return id; | |
} | |
rust::i32 Index::append(rust::Slice<const rust::f32> v) | |
{ | |
NGTError err = ngt_create_error_object(); | |
std::vector<float> vec(v.begin(), v.end()); | |
unsigned int id = qbg_append_object(index, vec.data(), v.length(), err); | |
if (id == 0) | |
{ | |
std::string error_msg = ngt_get_error_string(err); | |
ngt_destroy_error_object(err); | |
throw std::runtime_error("Failed to append object: " + error_msg); | |
} | |
ngt_destroy_error_object(err); | |
return id; | |
} |
pub fn open_index(&mut self, path: &String, prebuilt: bool) { | ||
self.inner.pin_mut().open_index(path, prebuilt); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for index operations
Methods like open_index
, build_index
, and save_index
don't return Results, making it impossible to handle failures gracefully.
Add consistent error handling:
- pub fn open_index(&mut self, path: &String, prebuilt: bool) {
- self.inner.pin_mut().open_index(path, prebuilt);
+ pub fn open_index(&mut self, path: &String, prebuilt: bool) -> Result<(), cxx::Exception> {
+ self.inner.pin_mut().open_index(path, prebuilt)?;
+ Ok(())
}
- pub fn build_index(&mut self, path: &String, p: &mut property::Property) {
- self.inner.pin_mut().build_index(path, p.get_property());
+ pub fn build_index(&mut self, path: &String, p: &mut property::Property) -> Result<(), cxx::Exception> {
+ self.inner.pin_mut().build_index(path, p.get_property())?;
+ Ok(())
}
Also applies to: 334-336, 338-340
unsafe fn search( | ||
self: Pin<&mut Index>, | ||
v: &[f32], | ||
k: usize, | ||
ids: *mut i32, | ||
distances: *mut f32, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add safety documentation for unsafe search function
The unsafe search function uses raw pointers without clear documentation about:
- Required buffer sizes for
ids
anddistances
- Alignment requirements
- Ownership and lifetime guarantees
- Valid parameter ranges
Add comprehensive safety documentation:
+ /// # Safety
+ ///
+ /// Caller must ensure:
+ /// - `ids` points to a buffer of size >= k
+ /// - `distances` points to a buffer of size >= k
+ /// - Both buffers are properly aligned
+ /// - Buffers remain valid for the duration of the call
+ /// - k > 0 and k <= number of indexed vectors
unsafe fn search(
self: Pin<&mut Index>,
v: &[f32],
k: usize,
ids: *mut i32,
distances: *mut f32,
);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
unsafe fn search( | |
self: Pin<&mut Index>, | |
v: &[f32], | |
k: usize, | |
ids: *mut i32, | |
distances: *mut f32, | |
); | |
/// # Safety | |
/// | |
/// Caller must ensure: | |
/// - `ids` points to a buffer of size >= k | |
/// - `distances` points to a buffer of size >= k | |
/// - Both buffers are properly aligned | |
/// - Buffers remain valid for the duration of the call | |
/// - k > 0 and k <= number of indexed vectors | |
unsafe fn search( | |
self: Pin<&mut Index>, | |
v: &[f32], | |
k: usize, | |
ids: *mut i32, | |
distances: *mut f32, | |
); |
pub fn set_qbg_construction_parameters( | ||
&mut self, | ||
extended_dimension: usize, | ||
dimension: usize, | ||
number_of_subvectors: usize, | ||
number_of_blobs: usize, | ||
internal_data_type: i32, | ||
data_type: i32, | ||
distance_type: i32, | ||
) { | ||
self.inner | ||
.pin_mut() | ||
.set_extended_dimension(extended_dimension); | ||
self.inner.pin_mut().set_dimension(dimension); | ||
self.inner | ||
.pin_mut() | ||
.set_number_of_subvectors(number_of_subvectors); | ||
self.inner.pin_mut().set_number_of_blobs(number_of_blobs); | ||
self.inner | ||
.pin_mut() | ||
.set_internal_data_type(internal_data_type); | ||
self.inner.pin_mut().set_data_type(data_type); | ||
self.inner.pin_mut().set_distance_type(distance_type); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add input validation for construction parameters
The set_qbg_construction_parameters
method accepts parameters without validation. Consider adding checks for:
- Non-zero dimensions
- Valid relationships between dimensions and subvectors
- Valid enum values for data types and distance types
Add validation:
pub fn set_qbg_construction_parameters(
&mut self,
extended_dimension: usize,
dimension: usize,
number_of_subvectors: usize,
number_of_blobs: usize,
internal_data_type: i32,
data_type: i32,
distance_type: i32,
) {
+ if dimension == 0 {
+ panic!("dimension must be greater than 0");
+ }
+ if number_of_subvectors == 0 || number_of_subvectors > dimension {
+ panic!("invalid number_of_subvectors");
+ }
+ // Add more validation as needed
self.inner
.pin_mut()
.set_extended_dimension(extended_dimension);
Committable suggestion skipped: line range outside the PR's diff.
rust/libs/algorithms/qbg/src/lib.rs
Outdated
// Search | ||
let mut ids: Vec<i32> = vec![0; K]; | ||
let mut distances: Vec<f32> = vec![0.0; K]; | ||
let mut vec = vec![0.0; DIMENSION]; | ||
for i in 0..DIMENSION { | ||
vec[i] = i as f32 | ||
} | ||
unsafe { | ||
index.pin_mut().search( | ||
vec.as_slice(), | ||
K, | ||
&mut ids[0] as *mut i32, | ||
&mut distances[0] as *mut f32, | ||
) | ||
}; | ||
println!("ids:\n{:?}", ids); | ||
println!("distances:\n{:?}", distances); | ||
index.pin_mut().close_index(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance search result validation in tests
The test prints search results without verifying their correctness. Add assertions to validate:
- Result ordering (by distance)
- Distance values are non-negative and properly ordered
- IDs are within valid range
Add validation:
unsafe {
index.pin_mut().search(
vec.as_slice(),
K,
&mut ids[0] as *mut i32,
&mut distances[0] as *mut f32,
)
};
- println!("ids:\n{:?}", ids);
- println!("distances:\n{:?}", distances);
+ // Validate results
+ assert!(ids.iter().all(|&id| id >= 0 && id < 100), "Invalid IDs in results");
+ assert!(distances.windows(2).all(|w| w[0] <= w[1]), "Distances not properly ordered");
+ assert!(distances.iter().all(|&d| d >= 0.0), "Negative distances found");
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Search | |
let mut ids: Vec<i32> = vec![0; K]; | |
let mut distances: Vec<f32> = vec![0.0; K]; | |
let mut vec = vec![0.0; DIMENSION]; | |
for i in 0..DIMENSION { | |
vec[i] = i as f32 | |
} | |
unsafe { | |
index.pin_mut().search( | |
vec.as_slice(), | |
K, | |
&mut ids[0] as *mut i32, | |
&mut distances[0] as *mut f32, | |
) | |
}; | |
println!("ids:\n{:?}", ids); | |
println!("distances:\n{:?}", distances); | |
index.pin_mut().close_index(); | |
// Search | |
let mut ids: Vec<i32> = vec![0; K]; | |
let mut distances: Vec<f32> = vec![0.0; K]; | |
let mut vec = vec![0.0; DIMENSION]; | |
for i in 0..DIMENSION { | |
vec[i] = i as f32 | |
} | |
unsafe { | |
index.pin_mut().search( | |
vec.as_slice(), | |
K, | |
&mut ids[0] as *mut i32, | |
&mut distances[0] as *mut f32, | |
) | |
}; | |
// Validate results | |
assert!(ids.iter().all(|&id| id >= 0 && id < 100), "Invalid IDs in results"); | |
assert!(distances.windows(2).all(|w| w[0] <= w[1]), "Distances not properly ordered"); | |
assert!(distances.iter().all(|&d| d >= 0.0), "Negative distances found"); | |
index.pin_mut().close_index(); |
6de8a6a
to
6e5aa96
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
rust/Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (7)
rust/Cargo.toml
(1 hunks)rust/libs/algorithm/Cargo.toml
(1 hunks)rust/libs/algorithms/qbg/Cargo.toml
(1 hunks)rust/libs/algorithms/qbg/build.rs
(1 hunks)rust/libs/algorithms/qbg/src/input.cpp
(1 hunks)rust/libs/algorithms/qbg/src/input.h
(1 hunks)rust/libs/algorithms/qbg/src/lib.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- rust/Cargo.toml
- rust/libs/algorithm/Cargo.toml
- rust/libs/algorithms/qbg/Cargo.toml
- rust/libs/algorithms/qbg/build.rs
- rust/libs/algorithms/qbg/src/input.h
- rust/libs/algorithms/qbg/src/lib.rs
🔇 Additional comments (1)
rust/libs/algorithms/qbg/src/input.cpp (1)
335-348
: LGTM! Well-designed factory functions.
The factory functions are well-implemented:
- Use of
std::unique_ptr
ensures proper memory management - Clear and concise interface for object creation
- Follow modern C++ practices
void Property::set_qbg_construction_parameters( | ||
rust::usize extended_dimension, | ||
rust::usize dimension, | ||
rust::usize number_of_subvectors, | ||
rust::usize number_of_blobs, | ||
rust::i32 internal_data_type, | ||
rust::i32 data_type, | ||
rust::i32 distance_type) | ||
{ | ||
qbg_initialize_construction_parameters(qbg_construction_parameters); | ||
qbg_construction_parameters->extended_dimension = extended_dimension; | ||
qbg_construction_parameters->dimension = dimension; | ||
qbg_construction_parameters->number_of_subvectors = number_of_subvectors; | ||
qbg_construction_parameters->number_of_blobs = number_of_blobs; | ||
qbg_construction_parameters->internal_data_type = internal_data_type; | ||
qbg_construction_parameters->data_type = data_type; | ||
qbg_construction_parameters->distance_type = distance_type; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add comprehensive parameter validation in bulk parameter setters.
The bulk parameter setters set_qbg_construction_parameters
and set_qbg_build_parameters
should validate their input parameters before setting them. This helps catch invalid configurations early.
Example implementation for set_qbg_construction_parameters
:
void Property::set_qbg_construction_parameters(
rust::usize extended_dimension,
rust::usize dimension,
rust::usize number_of_subvectors,
rust::usize number_of_blobs,
rust::i32 internal_data_type,
rust::i32 data_type,
rust::i32 distance_type)
{
+ if (dimension == 0) {
+ throw std::invalid_argument("dimension must be positive");
+ }
+ if (number_of_subvectors == 0) {
+ throw std::invalid_argument("number_of_subvectors must be positive");
+ }
+ if (number_of_blobs == 0) {
+ throw std::invalid_argument("number_of_blobs must be positive");
+ }
+ if (extended_dimension < dimension) {
+ throw std::invalid_argument("extended_dimension must be >= dimension");
+ }
qbg_initialize_construction_parameters(qbg_construction_parameters);
// ... rest of the method
}
Also applies to: 105-136
Index::~Index() | ||
{ | ||
// close_index(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uncomment and implement the destructor.
The destructor is currently commented out, which will lead to resource leaks. The index needs to be properly closed when the object is destroyed.
Apply this fix:
Index::~Index()
{
- // close_index();
+ close_index();
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Index::~Index() | |
{ | |
// close_index(); | |
} | |
Index::~Index() | |
{ | |
close_index(); | |
} |
Description
Related Issue
Versions
Checklist
Special notes for your reviewer
Summary by CodeRabbit