feat: support ast-grep-wasm#2484
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new wasm crate Changes
Sequence Diagram(s)sequenceDiagram
participant Client as JavaScript Client
participant WASM as ast-grep-wasm (WASM)
participant TS as web-tree-sitter Runtime
participant Parser as Language Parser
participant Source as Source Code
Client->>WASM: initialize_tree_sitter()
WASM->>TS: init runtime
Client->>WASM: register_dynamic_language(map{name->WasmLangInfo})
WASM->>Parser: load language WASM (path/bytes)
Parser-->>WASM: Language ready
Client->>WASM: parse(lang, src)
WASM->>Parser: create parser instance
Parser->>TS: parse source
TS-->>WASM: Tree
WASM->>WASM: wrap Tree -> WasmDoc / SgRoot
WASM-->>Client: SgRoot
Client->>WASM: node.matches(pattern) / node.replace(...) / commit_edits(...)
WASM->>WASM: compile pattern / compute edits / apply edits
WASM-->>Client: match/result/updated source
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2484 +/- ##
==========================================
- Coverage 88.14% 84.41% -3.74%
==========================================
Files 108 113 +5
Lines 17876 18667 +791
==========================================
Hits 15757 15757
- Misses 2119 2910 +791 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.
In `@crates/wasm/__test__/index.spec.mjs`:
- Around line 277-312: The tests "parse multiple languages simultaneously" and
"kind works for multiple languages" call wasm.setupParser('python', ...) which
mutates global parser state; make the intent explicit and avoid interleaving by
marking these tests serial. Update the two test declarations (the functions
named parse multiple languages simultaneously and kind works for multiple
languages) to use test.serial instead of test, leaving the calls to setupParser,
parse, and kind unchanged so the Python WASM parser is initialized
deterministically.
In `@crates/wasm/Cargo.toml`:
- Line 31: The Cargo.toml currently pins the unmaintained allocator crate
`wee_alloc`; remove the `wee_alloc = "0.4.5"` dependency (or replace it with an
actively maintained alternative such as `lol_alloc` or `alloc_cat`) and update
any code that references it (e.g., remove or change the `#[global_allocator]`
usage that points to `wee_alloc` in your crate source such as lib.rs or
main.rs). If switching, add the chosen allocator dependency to Cargo.toml,
follow that allocator's initialization instructions (feature flags or
`#[global_allocator]` setup), and run cargo build/test to ensure no references
to `wee_alloc` remain. Ensure the default `dlmalloc` behavior is acceptable if
you opt to simply remove `wee_alloc` and verify wasm build/test results.
In `@crates/wasm/package.json`:
- Around line 16-18: The repository is missing automated publishing and
top-level entry fields for the `@ast-grep/wasm` package: update the release
workflow (the release-npm job in .github/workflows/release.yml) to include a
step that publishes `@ast-grep/wasm` (publishing the contents of crates/wasm/pkg)
or explicitly document/manual-publish the package; and add entry point fields to
crates/wasm/package.json (e.g., main/module/exports pointing to the generated
pkg/ package.json or index.js) so consumers get correct entry resolution when
installing via yarn add `@ast-grep/wasm`; ensure the change references the
wasm-pack output in pkg/ and aligns publishConfig already present in the
top-level manifest.
In `@crates/wasm/src/doc.rs`:
- Around line 66-84: accept_edit builds a modified copy called input, computes
positions, but never stores the modified text back into self.inner so subsequent
do_edit reads stale source; update accept_edit to apply the splice to self.inner
(or assign input back to self.inner) before computing new_end_position and
returning the ts::Edit so self.inner, the value used by do_edit, reflects the
edit; adjust the order so pos_for_char_offset calls use the updated self.inner
and keep the ts::Edit::new call the same.
- Around line 66-69: The variable names in accept_edit (start_byte,
old_end_byte, new_end_byte) are misleading because Wrapper stores data as
Vec<char> and edit.position/lengths are character offsets; rename them to
reflect char indices (e.g., start_char, old_end_char, new_end_char) and update
every use within fn accept_edit to the new identifiers and any related comments
to avoid implying byte offsets. Ensure any derived calculations (like using
edit.position + edit.deleted_length) stay identical but use the new names so the
semantics remain character-based throughout the function.
In `@crates/wasm/src/sg_node.rs`:
- Around line 379-400: commit_edits currently underflows when an edit.start_pos
< node range start and silently skips overlapping edits; fix by validating each
WasmEdit before applying: in commit_edits, compute offset =
self.inner.range().start and for each edit use checked_sub on start_pos and
end_pos (or explicitly compare start_pos/end_pos against offset and range end)
and return a JsError if an edit is out of bounds or if start_pos > end_pos, and
also return an error when an edit overlaps the previous applied edit instead of
silently continuing; update the logic around pos/start to use the validated
usize positions and keep the sorting by start_pos to preserve order.
- Around line 180-218: The five methods (matches, inside, has, precedes,
follows) duplicate the same three-arm MatcherType dispatch after calling
parse_matcher; replace the repeated match with a single helper (macro or generic
closure) that takes self, the JsValue, and the target method name, and then
calls parse_matcher once and dispatches Pattern/Kind/RuleCore to
self.inner.<method>. Implement the helper (e.g., macro_rules! dispatch_matcher)
and update each of matches, inside, has, precedes, follows to a one-line call to
the helper referencing parse_matcher, MatcherType variants, and inner so the
behavior is unchanged but boilerplate removed.
- Around line 94-108: parse_matcher currently accepts non-integer or
out-of-range f64 values for kind IDs (m.as_f64() cast to u16), which can produce
silent incorrect behavior; update parse_matcher to validate the numeric matcher
by checking that m.as_f64() is an integer and within 0..=65535 before casting
and calling KindMatcher::from_id, and return a meaningful JsError (with context)
when the value is non-integer or out of range; reference the parse_matcher
function, the m.as_f64() branch, KindMatcher::from_id, and the MatcherType::Kind
constructor when making this change.
In `@crates/wasm/src/ts_types.rs`:
- Around line 83-100: Add a brief comment in Edit::new above the six
Reflect::set(...).unwrap() calls explaining that these unwraps are safe: the
code operates on a freshly created Object::new() with fixed string property
names and JsValue-convertible primitives/Points so Reflect::set cannot fail in
this context; reference the use of Reflect::set, Object::new(), and the unwrap()
calls to make the justification clear for future readers.
- Around line 167-183: The error path currently uses unchecked castors which can
silently mangle unexpected JS errors; in both Language::load_bytes and
Language::load_path, replace the .map_err(JsCast::unchecked_into) with a closure
that attempts a safe/dynamic conversion of the rejection JsValue into a
LanguageError (e.g., try dyn_into/dyn_ref), and if that fails fall back to
converting to a generic JsError/JsValue wrapper so you preserve the original
diagnostic; make the change in the JsFuture::from(...).await.chain so both
functions share the same safe fallback logic.
In `@crates/wasm/src/wasm_lang.rs`:
- Around line 97-101: The unsafe impls of Send and Sync for TsParser (wrapping
ts::Parser) are unconditional and unsafe outside single-threaded WASM; gate
these impls with #[cfg(target_arch = "wasm32")] so they only apply when
compiling to wasm32. Locate the unsafe impl blocks for Send and Sync on TsParser
and add the cfg attribute (or equivalent cfg_if) so non-wasm targets do not get
these impls, keeping the TsParser struct and its Clone impl unchanged.
- Around line 129-135: get_ts_language currently uses expect_throw and will
panic if the parser/language isn't loaded, causing an uncatchable throw when
Language trait methods (kind_to_id, field_to_id) or their JS-facing helpers
(kind(), pattern()) are called before setupParser; add a clear doc comment on
get_ts_language (or on the impl of the Language trait) stating the precondition
that setupParser must be called prior to invoking these methods and update the
expect_throw messages to explicitly mention that setupParser was not called so
callers can see the required initialization step.
- Around line 209-226: pre_process_pattern always allocates a new String even
when query has no '$'; add an early-return guard at the top that checks if query
contains '$' (e.g. if !query.contains('$') { return Cow::Borrowed(query); }) so
you only allocate and build ret when replacements are actually needed; keep the
rest of the function (variables dollar_count, ret, loop, final sigil handling,
and Cow::Owned conversion) unchanged.
In `@crates/wasm/tests/setup.js`:
- Line 1: Remove the unused import of path: delete the const path =
require("path"); line (or replace it with a used helper if you intended to use
path) so that the variable path is not declared but unused; search for the
symbol "path" in the file to confirm no usage before removing.
🧹 Nitpick comments (11)
🤖 Fix all nitpicks with AI agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/__test__/index.spec.mjs`: - Around line 277-312: The tests "parse multiple languages simultaneously" and "kind works for multiple languages" call wasm.setupParser('python', ...) which mutates global parser state; make the intent explicit and avoid interleaving by marking these tests serial. Update the two test declarations (the functions named parse multiple languages simultaneously and kind works for multiple languages) to use test.serial instead of test, leaving the calls to setupParser, parse, and kind unchanged so the Python WASM parser is initialized deterministically. In `@crates/wasm/Cargo.toml`: - Line 31: The Cargo.toml currently pins the unmaintained allocator crate `wee_alloc`; remove the `wee_alloc = "0.4.5"` dependency (or replace it with an actively maintained alternative such as `lol_alloc` or `alloc_cat`) and update any code that references it (e.g., remove or change the `#[global_allocator]` usage that points to `wee_alloc` in your crate source such as lib.rs or main.rs). If switching, add the chosen allocator dependency to Cargo.toml, follow that allocator's initialization instructions (feature flags or `#[global_allocator]` setup), and run cargo build/test to ensure no references to `wee_alloc` remain. Ensure the default `dlmalloc` behavior is acceptable if you opt to simply remove `wee_alloc` and verify wasm build/test results. In `@crates/wasm/src/doc.rs`: - Around line 66-69: The variable names in accept_edit (start_byte, old_end_byte, new_end_byte) are misleading because Wrapper stores data as Vec<char> and edit.position/lengths are character offsets; rename them to reflect char indices (e.g., start_char, old_end_char, new_end_char) and update every use within fn accept_edit to the new identifiers and any related comments to avoid implying byte offsets. Ensure any derived calculations (like using edit.position + edit.deleted_length) stay identical but use the new names so the semantics remain character-based throughout the function. In `@crates/wasm/src/sg_node.rs`: - Around line 180-218: The five methods (matches, inside, has, precedes, follows) duplicate the same three-arm MatcherType dispatch after calling parse_matcher; replace the repeated match with a single helper (macro or generic closure) that takes self, the JsValue, and the target method name, and then calls parse_matcher once and dispatches Pattern/Kind/RuleCore to self.inner.<method>. Implement the helper (e.g., macro_rules! dispatch_matcher) and update each of matches, inside, has, precedes, follows to a one-line call to the helper referencing parse_matcher, MatcherType variants, and inner so the behavior is unchanged but boilerplate removed. - Around line 94-108: parse_matcher currently accepts non-integer or out-of-range f64 values for kind IDs (m.as_f64() cast to u16), which can produce silent incorrect behavior; update parse_matcher to validate the numeric matcher by checking that m.as_f64() is an integer and within 0..=65535 before casting and calling KindMatcher::from_id, and return a meaningful JsError (with context) when the value is non-integer or out of range; reference the parse_matcher function, the m.as_f64() branch, KindMatcher::from_id, and the MatcherType::Kind constructor when making this change. In `@crates/wasm/src/ts_types.rs`: - Around line 83-100: Add a brief comment in Edit::new above the six Reflect::set(...).unwrap() calls explaining that these unwraps are safe: the code operates on a freshly created Object::new() with fixed string property names and JsValue-convertible primitives/Points so Reflect::set cannot fail in this context; reference the use of Reflect::set, Object::new(), and the unwrap() calls to make the justification clear for future readers. - Around line 167-183: The error path currently uses unchecked castors which can silently mangle unexpected JS errors; in both Language::load_bytes and Language::load_path, replace the .map_err(JsCast::unchecked_into) with a closure that attempts a safe/dynamic conversion of the rejection JsValue into a LanguageError (e.g., try dyn_into/dyn_ref), and if that fails fall back to converting to a generic JsError/JsValue wrapper so you preserve the original diagnostic; make the change in the JsFuture::from(...).await.chain so both functions share the same safe fallback logic. In `@crates/wasm/src/wasm_lang.rs`: - Around line 97-101: The unsafe impls of Send and Sync for TsParser (wrapping ts::Parser) are unconditional and unsafe outside single-threaded WASM; gate these impls with #[cfg(target_arch = "wasm32")] so they only apply when compiling to wasm32. Locate the unsafe impl blocks for Send and Sync on TsParser and add the cfg attribute (or equivalent cfg_if) so non-wasm targets do not get these impls, keeping the TsParser struct and its Clone impl unchanged. - Around line 129-135: get_ts_language currently uses expect_throw and will panic if the parser/language isn't loaded, causing an uncatchable throw when Language trait methods (kind_to_id, field_to_id) or their JS-facing helpers (kind(), pattern()) are called before setupParser; add a clear doc comment on get_ts_language (or on the impl of the Language trait) stating the precondition that setupParser must be called prior to invoking these methods and update the expect_throw messages to explicitly mention that setupParser was not called so callers can see the required initialization step. - Around line 209-226: pre_process_pattern always allocates a new String even when query has no '$'; add an early-return guard at the top that checks if query contains '$' (e.g. if !query.contains('$') { return Cow::Borrowed(query); }) so you only allocate and build ret when replacements are actually needed; keep the rest of the function (variables dollar_count, ret, loop, final sigil handling, and Cow::Owned conversion) unchanged. In `@crates/wasm/tests/setup.js`: - Line 1: Remove the unused import of path: delete the const path = require("path"); line (or replace it with a used helper if you intended to use path) so that the variable path is not declared but unused; search for the symbol "path" in the file to confirm no usage before removing.crates/wasm/__test__/index.spec.mjs (1)
277-312: Multi-language tests set up Python parser inline — potential test ordering concern.These async tests call
setupParser('python', ...)which mutates global state. If ava runs tests concurrently (even in the same thread), a non-multi-language test could theoretically interleave with the Python setup. In practice this is safe since WASM is single-threaded and the setup is idempotent, but you could consider usingtest.serialfor these multi-language tests to make the intent explicit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/__test__/index.spec.mjs` around lines 277 - 312, The tests "parse multiple languages simultaneously" and "kind works for multiple languages" call wasm.setupParser('python', ...) which mutates global parser state; make the intent explicit and avoid interleaving by marking these tests serial. Update the two test declarations (the functions named parse multiple languages simultaneously and kind works for multiple languages) to use test.serial instead of test, leaving the calls to setupParser, parse, and kind unchanged so the Python WASM parser is initialized deterministically.crates/wasm/src/ts_types.rs (2)
83-100:Reflect::setpanics are acceptable here but could be documented.All six
unwrap()calls onReflect::setoperate on a freshly createdObject::new(), so they cannot fail in practice. A brief comment noting why these are safe would help future readers, but this is purely optional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/ts_types.rs` around lines 83 - 100, Add a brief comment in Edit::new above the six Reflect::set(...).unwrap() calls explaining that these unwraps are safe: the code operates on a freshly created Object::new() with fixed string property names and JsValue-convertible primitives/Points so Reflect::set cannot fail in this context; reference the use of Reflect::set, Object::new(), and the unwrap() calls to make the justification clear for future readers.
167-183:unchecked_intoon error path could mask unexpected JS errors.
load_bytesandload_pathuse.map_err(JsCast::unchecked_into)to cast the rejection value toLanguageError. If the JSLanguage.load()rejects with a different error type (e.g., a networkTypeError), this unchecked cast would produce a garbageLanguageErrorwrapper without any visible diagnostic.Consider falling back to a generic
JsErrorconversion if the cast is not valid, or at least document the assumption thatLanguage.load()always rejects with aLanguageError.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/ts_types.rs` around lines 167 - 183, The error path currently uses unchecked castors which can silently mangle unexpected JS errors; in both Language::load_bytes and Language::load_path, replace the .map_err(JsCast::unchecked_into) with a closure that attempts a safe/dynamic conversion of the rejection JsValue into a LanguageError (e.g., try dyn_into/dyn_ref), and if that fails fall back to converting to a generic JsError/JsValue wrapper so you preserve the original diagnostic; make the change in the JsFuture::from(...).await.chain so both functions share the same safe fallback logic.crates/wasm/tests/setup.js (1)
1-1: Unusedpathimport.
pathis imported but never used in this file.🧹 Proposed fix
-const path = require("path"); - -exports.parserPath = function (lang) { +exports.parserPath = function (lang) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/tests/setup.js` at line 1, Remove the unused import of path: delete the const path = require("path"); line (or replace it with a used helper if you intended to use path) so that the variable path is not declared but unused; search for the symbol "path" in the file to confirm no usage before removing.crates/wasm/Cargo.toml (1)
31-31:wee_allocis unmaintained and has known bugs that should be addressed.
wee_allocwas archived on Aug 25, 2025 and has an RustSec unmaintained advisory (RUSTSEC-2022-0054). The crate has an unfixed unbounded memory leak that can cause resource exhaustion. The defaultdlmallocallocator used bywasm32-unknown-unknownis reliable and adequate for most use cases. Remove the custom allocator, or switch to an actively maintained alternative likelol_allocoralloc_cat.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/Cargo.toml` at line 31, The Cargo.toml currently pins the unmaintained allocator crate `wee_alloc`; remove the `wee_alloc = "0.4.5"` dependency (or replace it with an actively maintained alternative such as `lol_alloc` or `alloc_cat`) and update any code that references it (e.g., remove or change the `#[global_allocator]` usage that points to `wee_alloc` in your crate source such as lib.rs or main.rs). If switching, add the chosen allocator dependency to Cargo.toml, follow that allocator's initialization instructions (feature flags or `#[global_allocator]` setup), and run cargo build/test to ensure no references to `wee_alloc` remain. Ensure the default `dlmalloc` behavior is acceptable if you opt to simply remove `wee_alloc` and verify wasm build/test results.crates/wasm/src/doc.rs (1)
66-69: Misleading variable names: these are character offsets, not byte offsets.
WrapperusesVec<char>, soedit.positionand lengths are character indices. Naming themstart_byte/old_end_byte/new_end_byteis confusing and could mislead future maintainers into thinking they're byte-based.Suggested rename
fn accept_edit(&mut self, edit: &Edit<Self>) -> ts::Edit { - let start_byte = edit.position; - let old_end_byte = edit.position + edit.deleted_length; - let new_end_byte = edit.position + edit.inserted_text.len(); + let start_char = edit.position; + let old_end_char = edit.position + edit.deleted_length; + let new_end_char = edit.position + edit.inserted_text.len();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/doc.rs` around lines 66 - 69, The variable names in accept_edit (start_byte, old_end_byte, new_end_byte) are misleading because Wrapper stores data as Vec<char> and edit.position/lengths are character offsets; rename them to reflect char indices (e.g., start_char, old_end_char, new_end_char) and update every use within fn accept_edit to the new identifiers and any related comments to avoid implying byte offsets. Ensure any derived calculations (like using edit.position + edit.deleted_length) stay identical but use the new names so the semantics remain character-based throughout the function.crates/wasm/src/wasm_lang.rs (3)
97-101: UnsafeSend+Syncimpls rely on WASM's single-threaded execution model.The safety of these impls depends on
ts::Parser(a JS-bound object) never being accessed from multiple threads. This holds onwasm32(single-threaded), but the impls are unconditional — they'd also apply if the crate were compiled for a multi-threaded non-WASM target. Consider gating these with#[cfg(target_arch = "wasm32")]to avoid unsoundness if the struct is ever reused outside WASM.Suggested fix
#[derive(Clone)] struct TsParser(ts::Parser); +#[cfg(target_arch = "wasm32")] unsafe impl Send for TsParser {} +#[cfg(target_arch = "wasm32")] unsafe impl Sync for TsParser {}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 97 - 101, The unsafe impls of Send and Sync for TsParser (wrapping ts::Parser) are unconditional and unsafe outside single-threaded WASM; gate these impls with #[cfg(target_arch = "wasm32")] so they only apply when compiling to wasm32. Locate the unsafe impl blocks for Send and Sync on TsParser and add the cfg attribute (or equivalent cfg_if) so non-wasm targets do not get these impls, keeping the TsParser struct and its Clone impl unchanged.
129-135:get_ts_languagepanics (viaexpect_throw) if the language isn't loaded.This is called from the
Languagetrait methodskind_to_idandfield_to_id(lines 198–206), which don't returnResult. If a caller invokeskind()orpattern()beforesetupParser, the JS side gets an uncatchable throw. This seems intentional given the trait signature constraint, but worth a doc comment clarifying the precondition.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 129 - 135, get_ts_language currently uses expect_throw and will panic if the parser/language isn't loaded, causing an uncatchable throw when Language trait methods (kind_to_id, field_to_id) or their JS-facing helpers (kind(), pattern()) are called before setupParser; add a clear doc comment on get_ts_language (or on the impl of the Language trait) stating the precondition that setupParser must be called prior to invoking these methods and update the expect_throw messages to explicitly mention that setupParser was not called so callers can see the required initialization step.
209-226:pre_process_patternalways allocates even when the input has no$characters.When the query contains no
$, the loop copies every character intoretand returnsCow::Owned. A quick early-return for the common no-op case would avoid the allocation.Suggested optimization
fn pre_process_pattern(expando: char, query: &str) -> Cow<'_, str> { + if !query.contains('$') { + return Cow::Borrowed(query); + } let mut ret = Vec::with_capacity(query.len()); ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 209 - 226, pre_process_pattern always allocates a new String even when query has no '$'; add an early-return guard at the top that checks if query contains '$' (e.g. if !query.contains('$') { return Cow::Borrowed(query); }) so you only allocate and build ret when replacements are actually needed; keep the rest of the function (variables dollar_count, ret, loop, final sigil handling, and Cow::Owned conversion) unchanged.crates/wasm/src/sg_node.rs (2)
180-218: Consider a helper to reduce the repetitive 3-armMatcherTypedispatch.
matches,inside,has,precedes, andfollowsall share the identical match-on-MatcherTypestructure. A macro or a closure-based helper could reduce this boilerplate.Example macro approach
macro_rules! dispatch_matcher { ($self:expr, $m:expr, $method:ident) => {{ Ok(match $self.parse_matcher($m)? { MatcherType::Pattern(p) => $self.inner.$method(p), MatcherType::Kind(k) => $self.inner.$method(k), MatcherType::RuleCore(r) => $self.inner.$method(r), }) }}; }Then each method becomes:
pub fn matches(&self, m: JsValue) -> Result<bool, JsError> { dispatch_matcher!(self, m, matches) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/sg_node.rs` around lines 180 - 218, The five methods (matches, inside, has, precedes, follows) duplicate the same three-arm MatcherType dispatch after calling parse_matcher; replace the repeated match with a single helper (macro or generic closure) that takes self, the JsValue, and the target method name, and then calls parse_matcher once and dispatches Pattern/Kind/RuleCore to self.inner.<method>. Implement the helper (e.g., macro_rules! dispatch_matcher) and update each of matches, inside, has, precedes, follows to a one-line call to the helper referencing parse_matcher, MatcherType variants, and inner so the behavior is unchanged but boilerplate removed.
94-108:parse_matcheracceptsf64for kind IDs — worth a brief doc note on the JS side.Line 101 casts
m.as_f64()tou16. If JS passes a non-integer or a value outside0..=65535, the behavior is silently wrong. This is a minor edge-case concern since thekind()API inlib.rsreturnsu16, but there's no runtime validation here.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/sg_node.rs` around lines 94 - 108, parse_matcher currently accepts non-integer or out-of-range f64 values for kind IDs (m.as_f64() cast to u16), which can produce silent incorrect behavior; update parse_matcher to validate the numeric matcher by checking that m.as_f64() is an integer and within 0..=65535 before casting and calling KindMatcher::from_id, and return a meaningful JsError (with context) when the value is non-integer or out of range; reference the parse_matcher function, the m.as_f64() branch, KindMatcher::from_id, and the MatcherType::Kind constructor when making this change.
There was a problem hiding this comment.
🧹 Nitpick comments (6)
crates/wasm/src/wasm_lang.rs (3)
196-213:pre_process_patternalways allocates, defeatingCow's purpose.The function returns
Cow::Ownedunconditionally, even for inputs with no$characters where the output would be identical to the input. An early return ofCow::Borrowed(query)when the input contains no$would avoid the allocation in the common case for non-pattern strings.Possible early-return optimization
fn pre_process_pattern(expando: char, query: &str) -> Cow<'_, str> { + if expando == '$' || !query.contains('$') { + return Cow::Borrowed(query); + } let mut ret = Vec::with_capacity(query.len());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 196 - 213, The function pre_process_pattern always constructs an owned String even when no '$' are present; add an early check for the presence of '$' in query and return Cow::Borrowed(query) immediately when none are found to avoid allocation. Keep the existing logic (using expando, dollar_count, ret, and final sigil handling) unchanged for the case where a '$' exists so behavior is identical; reference the pre_process_pattern function, the query parameter, and the Cow return type when making this change.
92-96:unsafe impl Send/Syncneeds a// SAFETYcomment.The
TsParserwraps a JS object (ts::Parser) which is inherently!Send + !Syncin wasm-bindgen. This is pragmatically safe because WASM is single-threaded, but the unsafety should be documented for future maintainers (and to satisfy Clippy'sundocumented_unsafe_blockslint if enabled).Add safety comments
#[derive(Clone)] struct TsParser(ts::Parser); +// SAFETY: WASM is single-threaded; these impls are required to store +// TsParser inside a `Mutex<Vec<Inner>>` used as a global registry. +// If WASM threading (SharedArrayBuffer) is ever adopted, this must be revisited. unsafe impl Send for TsParser {} unsafe impl Sync for TsParser {}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 92 - 96, The unsafe impls for Send and Sync on TsParser (which wraps the wasm-bindgen JS type ts::Parser) need a SAFETY comment: add a // SAFETY: ... comment directly above or next to the unsafe impl Send/Sync explaining that ts::Parser is !Send/!Sync in wasm-bindgen but marking TsParser as Send/Sync is safe because WebAssembly in this build is single-threaded (no cross-thread access to JS values) and no interior mutability or non-atomic shared state is exposed; mention any assumptions (no future threaded wasm usage or wasm-bindgen changes) so future maintainers know why the unsafety is justified.
148-154:get_ts_languagepanics on missing language instead of returningResult.Both
expect_throwcalls will abort the WASM module with a thrown JS exception if a language isn't loaded or a parser has no language set. This is forced by theLanguagetrait methods (kind_to_id,field_to_id) not returningResult, so this is acceptable. Just noting that callers ofkind_to_id/field_to_idon an invalidWasmLang(e.g., one deserialized from stale data after a re-registration that changed indices) will get an opaque panic rather than a recoverable error.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 148 - 154, get_ts_language currently aborts via expect_throw when no parser or language is set; change it to return Result<ts::Language, JsValue> (or a suitable error type) and propagate failures instead of panicking: replace the expect_throw calls on self.get_parser() and .language() with early-return Err(...) that includes a descriptive JS error, and update callers to handle the Result; reference get_ts_language, get_parser, .language(), and the expect_throw uses when making the change.crates/wasm/src/lib.rs (2)
57-68: Doc comment is misleading —pattern()doesn't compile anything.The doc says "Compile a string to ast-grep Pattern config" but the function just wraps the pattern string in a
WasmConfigstruct and serializes it toJsValue. No parsing or compilation occurs. Consider updating the doc to accurately describe what it does, e.g., "Create a pattern config object from a pattern string."Suggested doc update
-/// Compile a string to ast-grep Pattern config. +/// Create an ast-grep rule config from a pattern string.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/lib.rs` around lines 57 - 68, The doc comment for the exported function `pattern` is incorrect: it claims to "Compile a string to ast-grep Pattern config" but `pattern` only constructs a `WasmConfig` (wrapping `pattern_str` into the `rule` field) and serializes it via `serde_wasm_bindgen::to_value`. Update the doc comment for the `pattern` function to accurately describe its behavior (e.g., "Create a pattern config object from a pattern string" or similar), mentioning that it returns a serialized `WasmConfig` `JsValue` rather than performing compilation; reference the `pattern` function and the `WasmConfig` struct in the comment to make intent clear.
17-18: Remove or replacewee_alloc— it has been unmaintained and archived since August 2025.The
wee_alloccrate is no longer maintained (GitHub repo archived) and was already flagged by RustSec (Advisory GHSA-rc23-xxgq-x27g, Sep 2022) with known memory leaks and no patched versions. The Rust WASM working group itself was archived in 2024. For most WASM modules, the defaultdlmallocallocator bundled with Rust's wasm32 toolchain is a safe and proven choice. If code size is a constraint, actively maintained alternatives liketalccan be evaluated.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/lib.rs` around lines 17 - 18, The global allocator declaration using #[global_allocator] static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT; must be removed or replaced because wee_alloc is unmaintained; delete that static and the attribute and remove the wee_alloc dependency from Cargo.toml to fall back to the default wasm allocator (dlmalloc), or replace it by a maintained alternative (e.g., add and initialize talc) if you need a smaller allocator—refer to the symbols #[global_allocator], ALLOC, and wee_alloc::WeeAlloc when making the change.crates/wasm/__test__/index.spec.mjs (1)
279-317: Python registration is duplicated across two tests.Both
'parse multiple languages simultaneously'(line 283) and'kind works for multiple languages'(line 310) independently callregisterDynamicLanguagefor Python. This works because the API updates existing languages, but it makes test execution order-dependent in a subtle way: if these tests ran in isolation, they'd each need the registration, but when run together the second call is redundant. Consider extracting Python registration into thetest.beforehook or a shared helper to make the intent clearer.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/__test__/index.spec.mjs` around lines 279 - 317, The Python language registration is duplicated in the tests 'parse multiple languages simultaneously' and 'kind works for multiple languages'; move the call to wasm.registerDynamicLanguage (and its pythonPath resolution/expandoChar) into a shared setup so tests don't rely on order—either add a test.before hook that resolves pythonPath and calls wasm.registerDynamicLanguage({ python: { libraryPath: pythonPath, expandoChar: 'µ' } }) or create a helper function (e.g., registerPythonLanguage) and call it from both tests to centralize registration and remove the duplicate lines inside those tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/wasm/__test__/index.spec.mjs`:
- Around line 279-317: The Python language registration is duplicated in the
tests 'parse multiple languages simultaneously' and 'kind works for multiple
languages'; move the call to wasm.registerDynamicLanguage (and its pythonPath
resolution/expandoChar) into a shared setup so tests don't rely on order—either
add a test.before hook that resolves pythonPath and calls
wasm.registerDynamicLanguage({ python: { libraryPath: pythonPath, expandoChar:
'µ' } }) or create a helper function (e.g., registerPythonLanguage) and call it
from both tests to centralize registration and remove the duplicate lines inside
those tests.
In `@crates/wasm/src/lib.rs`:
- Around line 57-68: The doc comment for the exported function `pattern` is
incorrect: it claims to "Compile a string to ast-grep Pattern config" but
`pattern` only constructs a `WasmConfig` (wrapping `pattern_str` into the `rule`
field) and serializes it via `serde_wasm_bindgen::to_value`. Update the doc
comment for the `pattern` function to accurately describe its behavior (e.g.,
"Create a pattern config object from a pattern string" or similar), mentioning
that it returns a serialized `WasmConfig` `JsValue` rather than performing
compilation; reference the `pattern` function and the `WasmConfig` struct in the
comment to make intent clear.
- Around line 17-18: The global allocator declaration using #[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT; must be removed
or replaced because wee_alloc is unmaintained; delete that static and the
attribute and remove the wee_alloc dependency from Cargo.toml to fall back to
the default wasm allocator (dlmalloc), or replace it by a maintained alternative
(e.g., add and initialize talc) if you need a smaller allocator—refer to the
symbols #[global_allocator], ALLOC, and wee_alloc::WeeAlloc when making the
change.
In `@crates/wasm/src/wasm_lang.rs`:
- Around line 196-213: The function pre_process_pattern always constructs an
owned String even when no '$' are present; add an early check for the presence
of '$' in query and return Cow::Borrowed(query) immediately when none are found
to avoid allocation. Keep the existing logic (using expando, dollar_count, ret,
and final sigil handling) unchanged for the case where a '$' exists so behavior
is identical; reference the pre_process_pattern function, the query parameter,
and the Cow return type when making this change.
- Around line 92-96: The unsafe impls for Send and Sync on TsParser (which wraps
the wasm-bindgen JS type ts::Parser) need a SAFETY comment: add a // SAFETY: ...
comment directly above or next to the unsafe impl Send/Sync explaining that
ts::Parser is !Send/!Sync in wasm-bindgen but marking TsParser as Send/Sync is
safe because WebAssembly in this build is single-threaded (no cross-thread
access to JS values) and no interior mutability or non-atomic shared state is
exposed; mention any assumptions (no future threaded wasm usage or wasm-bindgen
changes) so future maintainers know why the unsafety is justified.
- Around line 148-154: get_ts_language currently aborts via expect_throw when no
parser or language is set; change it to return Result<ts::Language, JsValue> (or
a suitable error type) and propagate failures instead of panicking: replace the
expect_throw calls on self.get_parser() and .language() with early-return
Err(...) that includes a descriptive JS error, and update callers to handle the
Result; reference get_ts_language, get_parser, .language(), and the expect_throw
uses when making the change.
d3185dc to
3392cb5
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (7)
crates/wasm/tests/web.rs (1)
286-296: Same note as JS tests — the pre-sort on line 292 is redundant sincecommit_editssorts internally.Not a bug, just a minor readability nit for test code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/tests/web.rs` around lines 286 - 296, The test_multiple_fixes function contains a redundant pre-sort of the fixes vector before calling commit_edits, because commit_edits already sorts edits internally; remove the fixes.sort_by(|a, b| b.start_pos.cmp(&a.start_pos)) call (or replace it with a brief comment noting commit_edits handles ordering) so the test relies on commit_edits’ internal sorting and improves readability.crates/wasm/src/wasm_lang.rs (2)
92-96:unsafe impl Send/Sync for TsParser— safe only in single-threaded WASM.This is sound today because
wasm32targets are single-threaded, but it becomes unsound ifwasm32with shared memory / threads is ever targeted. A brief// SAFETYcomment documenting this single-thread assumption would help.Suggested comment
#[derive(Clone)] struct TsParser(ts::Parser); +// SAFETY: TsParser wraps a JS object. This is safe because wasm32 is +// single-threaded. Must be revisited if wasm-threads support is added. unsafe impl Send for TsParser {} unsafe impl Sync for TsParser {}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 92 - 96, Add a SAFETY comment above the unsafe impls for TsParser documenting that the Send/Sync assertions rely on the current single-threaded WebAssembly environment (wasm32 without shared memory/threads) and are only sound under that assumption; reference the TsParser wrapper and inner ts::Parser, state that if wasm shared memory/threads are enabled this would be unsound and the impls must be removed or guarded (e.g., behind a cfg) — keep the comment concise and add guidance for future maintainers to revisit when targeting threaded wasm.
196-213:pre_process_patternalways allocates even when input has no$characters.The function returns
Cow::Ownedunconditionally. For patterns without meta-variables, an early return ofCow::Borrowed(query)would avoid the allocation.Proposed optimization
fn pre_process_pattern(expando: char, query: &str) -> Cow<'_, str> { + if expando == '$' || !query.contains('$') { + return Cow::Borrowed(query); + } let mut ret = Vec::with_capacity(query.len()); // ... rest unchanged }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` around lines 196 - 213, pre_process_pattern currently always builds and returns Cow::Owned causing needless allocation for strings without '$'; change it to detect whether the input contains any '$' (e.g., using query.contains('$') or track dollar_count during a first quick scan) and return Cow::Borrowed(query) early when there are no meta characters to replace; keep the existing transform logic (using expando, dollar_count, ret, etc.) and only construct and return Cow::Owned(ret.into_iter().collect()) when a replacement actually occurred.crates/wasm/src/sg_node.rs (2)
49-60: Unsafe lifetime extension via raw pointer — fragile but sound under stated invariants.The safety argument (WasmDoc's
Nodewraps a JS GC-managed object, so the'staticborrow doesn't actually alias mutable Rust memory, andRckeepsAstGrepalive) is reasonable for this WASM-only target. However, this invariant is implicit and easy to break ifWasmDocor itsNodetype ever changes to hold Rust-side borrows.Consider adding a module-level
// SAFETYdoc comment or#[doc(hidden)]note explaining this design decision so future contributors know the constraint.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/sg_node.rs` around lines 49 - 60, The unsafe lifetime extension in SgNode::root() relies on the invariant that WasmDoc::Node wraps a JS GC-managed object and that Rc< AstGrep<WasmDoc> > keeps the AST alive; add a clear module-level SAFETY doc comment (or a #[doc(hidden)] note) near the Wasm-specific code explaining this invariant and its constraints so future maintainers know why the raw pointer cast (in root()), the use of NodeMatch<'static, WasmDoc>, and the assumption about WasmDoc::Node are sound and what changes would break it (e.g., if WasmDoc::Node ever contains Rust borrows).
180-218: Repetitive 3-arm match dispatch across matcher methods.
matches,inside,has,precedes, andfollowsall share the exact same match-on-MatcherType-then-delegate pattern. A small helper could reduce this boilerplate:Example helper
impl SgNode { fn with_matcher<F, R>(&self, m: JsValue, f: F) -> Result<R, JsError> where F: FnOnce(&NodeMatch<'static, WasmDoc>, &MatcherType) -> R, { let matcher = self.parse_matcher(m)?; Ok(f(&self.inner, &matcher)) } } // Then MatcherType implements Matcher trait or use a macro to dispatch🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/sg_node.rs` around lines 180 - 218, Refactor the repeated 3-arm match in SgNode::matches, inside, has, precedes, and follows by adding a helper (e.g., SgNode::with_matcher) that calls self.parse_matcher(m) once and then dispatches to a provided closure; have the helper return Result<R, JsError> and accept a closure that takes (&self.inner, &MatcherType) (or similar) so each of matches/inside/has/precedes/follows can call with_matcher(m, |inner, mt| match mt { MatcherType::Pattern(p) => inner.matches(p), MatcherType::Kind(k) => inner.matches(k), MatcherType::RuleCore(r) => inner.matches(r) }) (adjusting the delegated method name per caller), thereby eliminating duplicated match blocks while keeping parse_matcher, MatcherType, and inner as the dispatch points.crates/wasm/__test__/index.spec.mjs (1)
93-100: Pre-sort is redundant —commitEditssorts internally.Line 97 sorts
fixesdescending bystart_pos, butcommit_editsinsg_node.rs(line 382) re-sorts ascending. The JS-side sort has no effect. It's not harmful, but it could mislead readers into thinking external sort order matters.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/__test__/index.spec.mjs` around lines 93 - 100, The JS test "multiple fixes with unicode" performs an unnecessary sort on the fixes array: remove the explicit fixes.sort((a, b) => b.start_pos - a.start_pos) since commitEdits already re-sorts edits; just build fixes via matches.map(...) and pass them directly to sg.root().commitEdits(fixes) (refer to the test name "multiple fixes with unicode", the variable fixes, and the commitEdits method).crates/wasm/src/lib.rs (1)
106-130:Reflect::setcalls unwrap unconditionally — safe in practice but fragile.All
Reflect::setcalls indump_pattern_to_jsuse.unwrap(). While these won't fail on a plainObject, the pattern could mask errors if the function is ever extended. Since this already returnsResult<JsValue, JsError>, propagating the error would be more robust.Example for one call
- js_sys::Reflect::set(&obj, &"isMetaVar".into(), &JsValue::from_bool(node.is_meta_var)).unwrap(); + js_sys::Reflect::set(&obj, &"isMetaVar".into(), &JsValue::from_bool(node.is_meta_var)) + .map_err(|_| JsError::new("Failed to set isMetaVar"))?;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/lib.rs` around lines 106 - 130, The Reflect::set calls in dump_pattern_to_js currently call .unwrap(), which should be replaced to propagate errors: replace each .unwrap() on js_sys::Reflect::set(...) with ? after mapping the JS error into a JsError (e.g. .map_err(|e| JsError::from(e))?), so each Reflect::set(...) call becomes something like js_sys::Reflect::set(...).map_err(JsError::from)?; ensure the function dump_pattern_to_js retains its Result<JsValue, JsError> signature and propagate any Reflect failures instead of panicking.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/wasm/README.md`:
- Around line 17-30: The README uses a non-existent setupParser API; update the
example to import and call registerDynamicLanguage instead: replace the import
of setupParser with registerDynamicLanguage (alongside initializeTreeSitter and
parse), and change the setupParser call to a registerDynamicLanguage invocation
that passes a language-to-config map as expected by the registerDynamicLanguage
function (matching the shape used in __test__/index.spec.mjs and
crates/wasm/src/lib.rs); keep the rest of the example (initializeTreeSitter,
parse, usage of parse and node.getMatch) unchanged.
In `@crates/wasm/src/lib.rs`:
- Around line 18-19: The crate currently forces a global allocator via the
#[global_allocator] attribute and the static ALLOC: wee_alloc::WeeAlloc =
wee_alloc::WeeAlloc::INIT; which pins an unmaintained, unsafe allocator; remove
the #[global_allocator] attribute and the static ALLOC declaration (and any
use/import of wee_alloc) from lib.rs so the build falls back to the default
dlmalloc allocator for wasm32 targets (search for ALLOC, wee_alloc::WeeAlloc,
and the #[global_allocator] line to locate the code to delete).
---
Duplicate comments:
In `@crates/wasm/package.json`:
- Around line 1-44: The package manifest in package.json is valid for the
wasm-pack project and requires no changes; keep the existing scripts
(scripts.build, scripts.build:nodejs, scripts.build:bundler, scripts.test), AVA
config, peerDependencies (web-tree-sitter), devDependencies, and publishConfig
as-is—no code edits are necessary.
In `@crates/wasm/src/sg_node.rs`:
- Around line 379-400: commit_edits can panic due to unsigned subtraction when
diff.start_pos or diff.end_pos is less than offset; before computing pos =
diff.start_pos as usize - offset and before computing start = diff.end_pos as
usize - offset, validate each WasmEdit's start_pos and end_pos against offset
and the node range (and ensure end >= start and positions do not exceed
old_content.len()); skip or return an Err for edits that are out-of-range, or
clamp them safely, and use comparisons (e.g., if diff.start_pos as usize <
offset) rather than direct subtraction to avoid underflow while keeping the rest
of the new_content assembly (references: commit_edits, WasmEdit, start_pos,
end_pos, offset, old_content, new_content, Wrapper::decode_str).
---
Nitpick comments:
In `@crates/wasm/__test__/index.spec.mjs`:
- Around line 93-100: The JS test "multiple fixes with unicode" performs an
unnecessary sort on the fixes array: remove the explicit fixes.sort((a, b) =>
b.start_pos - a.start_pos) since commitEdits already re-sorts edits; just build
fixes via matches.map(...) and pass them directly to
sg.root().commitEdits(fixes) (refer to the test name "multiple fixes with
unicode", the variable fixes, and the commitEdits method).
In `@crates/wasm/src/lib.rs`:
- Around line 106-130: The Reflect::set calls in dump_pattern_to_js currently
call .unwrap(), which should be replaced to propagate errors: replace each
.unwrap() on js_sys::Reflect::set(...) with ? after mapping the JS error into a
JsError (e.g. .map_err(|e| JsError::from(e))?), so each Reflect::set(...) call
becomes something like js_sys::Reflect::set(...).map_err(JsError::from)?; ensure
the function dump_pattern_to_js retains its Result<JsValue, JsError> signature
and propagate any Reflect failures instead of panicking.
In `@crates/wasm/src/sg_node.rs`:
- Around line 49-60: The unsafe lifetime extension in SgNode::root() relies on
the invariant that WasmDoc::Node wraps a JS GC-managed object and that Rc<
AstGrep<WasmDoc> > keeps the AST alive; add a clear module-level SAFETY doc
comment (or a #[doc(hidden)] note) near the Wasm-specific code explaining this
invariant and its constraints so future maintainers know why the raw pointer
cast (in root()), the use of NodeMatch<'static, WasmDoc>, and the assumption
about WasmDoc::Node are sound and what changes would break it (e.g., if
WasmDoc::Node ever contains Rust borrows).
- Around line 180-218: Refactor the repeated 3-arm match in SgNode::matches,
inside, has, precedes, and follows by adding a helper (e.g.,
SgNode::with_matcher) that calls self.parse_matcher(m) once and then dispatches
to a provided closure; have the helper return Result<R, JsError> and accept a
closure that takes (&self.inner, &MatcherType) (or similar) so each of
matches/inside/has/precedes/follows can call with_matcher(m, |inner, mt| match
mt { MatcherType::Pattern(p) => inner.matches(p), MatcherType::Kind(k) =>
inner.matches(k), MatcherType::RuleCore(r) => inner.matches(r) }) (adjusting the
delegated method name per caller), thereby eliminating duplicated match blocks
while keeping parse_matcher, MatcherType, and inner as the dispatch points.
In `@crates/wasm/src/wasm_lang.rs`:
- Around line 92-96: Add a SAFETY comment above the unsafe impls for TsParser
documenting that the Send/Sync assertions rely on the current single-threaded
WebAssembly environment (wasm32 without shared memory/threads) and are only
sound under that assumption; reference the TsParser wrapper and inner
ts::Parser, state that if wasm shared memory/threads are enabled this would be
unsound and the impls must be removed or guarded (e.g., behind a cfg) — keep the
comment concise and add guidance for future maintainers to revisit when
targeting threaded wasm.
- Around line 196-213: pre_process_pattern currently always builds and returns
Cow::Owned causing needless allocation for strings without '$'; change it to
detect whether the input contains any '$' (e.g., using query.contains('$') or
track dollar_count during a first quick scan) and return Cow::Borrowed(query)
early when there are no meta characters to replace; keep the existing transform
logic (using expando, dollar_count, ret, etc.) and only construct and return
Cow::Owned(ret.into_iter().collect()) when a replacement actually occurred.
In `@crates/wasm/tests/web.rs`:
- Around line 286-296: The test_multiple_fixes function contains a redundant
pre-sort of the fixes vector before calling commit_edits, because commit_edits
already sorts edits internally; remove the fixes.sort_by(|a, b|
b.start_pos.cmp(&a.start_pos)) call (or replace it with a brief comment noting
commit_edits handles ordering) so the test relies on commit_edits’ internal
sorting and improves readability.
3392cb5 to
023636b
Compare
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (2)
crates/wasm/tests/setup.js (1)
1-1: Remove unusedpathimport.
pathis never referenced in this file.♻️ Proposed fix
-const path = require("path"); - exports.parserPath = function (lang) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/tests/setup.js` at line 1, The file imports the unused symbol "path" via const path = require("path"); — remove that unused import line from the top of crates/wasm/tests/setup.js (delete the const path = require("path"); statement) and verify there are no remaining references to "path" in setup.js; run the linter/test to confirm no unused-import warnings remain.crates/wasm/src/wasm_lang.rs (1)
229-229:LanguageErrordisplay arm uses{:?}instead of.message(), producing an unhelpful debug string.Both
ParserErrorandLanguageErrorextendjs_sys::Error(via#[wasm_bindgen(extends = Error)]ints_types.rs), so.message()is available on both via theDerefchain.♻️ Proposed fix
-SgWasmError::LanguageError(err) => write!(f, "Language error: {:?}", err), +SgWasmError::LanguageError(err) => write!(f, "Language error: {}", err.message()),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/wasm_lang.rs` at line 229, The LanguageError match arm prints a debug representation; update the SgWasmError::LanguageError arm to use the JS error message instead of "{:?}". Specifically, replace write!(f, "Language error: {:?}", err) in the SgWasmError Display implementation with write!(f, "Language error: {}", err.message()) so the human-readable JS error message is shown (you can fallback to a debug form if needed).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/wasm/__test__/index.spec.mjs`:
- Around line 228-234: The test reveals inconsistent sibling traversal names:
the Rust-exported method next_node() and the wasm-bound prev() (originally
prev_node()) should be symmetric; update the wasm bindings so both methods use
the same naming convention (choose either next_node()/prev_node() or
next()/prev()) by adjusting the wasm_bindgen export attribute (e.g., change the
prev_node() binding to #[wasm_bindgen(js_name = prev_node)] or change the Rust
next_node() export to #[wasm_bindgen(js_name = next)]), update the corresponding
function names/attributes in the module that defines next_node()/prev_node() and
any references in tests (index.spec.mjs) to match the chosen pair, and run the
tests to confirm the API is consistent.
In `@crates/wasm/Cargo.toml`:
- Line 31: Remove the unmaintained wee_alloc usage: delete the dependency entry
"wee_alloc = \"0.4.5\"" from Cargo.toml and remove the corresponding allocator
declaration in the crate (the #[global_allocator] attribute and its static
instance that references wee_alloc, e.g., any extern crate/use wee_alloc and
static ALLOC: wee_alloc::WeeAlloc = ... in src/lib.rs) so the crate falls back
to the default Rust allocator.
In `@crates/wasm/src/doc.rs`:
- Around line 205-224: The iterator in field_children ignores the boolean result
of cursor.goto_first_child(), so when a node has no children the cursor remains
on the parent and may yield the parent itself; fix field_children by capturing
the result of cursor.goto_first_child() and treating a false return as “no
children” (set done = true or return None from the iterator), ensuring you only
read current_field_id()/current_node() after a successful goto_first_child();
keep using the same cursor methods (goto_first_child, current_field_id,
goto_next_sibling, current_node) and the Node wrapper so the iterator never
emits the parent when there are no children.
In `@crates/wasm/src/sg_node.rs`:
- Around line 12-14: The doc comments are incorrect: WasmEdit.start_pos and
end_pos and Pos.index are populated from self.inner.range() (tree-sitter byte
offsets), not character offsets; update the field docs for WasmEdit.start_pos,
WasmEdit.end_pos and Pos.index to state "byte offset" (UTF-8 byte index) instead
of "character offset" so callers know these are byte indices and not
UTF-16/character indices when using the values from self.inner.range().
- Around line 100-101: The current branch that converts a JSON number
(m.as_f64()) into a KindMatcher using n as u16 silently produces wrong IDs for
non-integer or out-of-range floats; instead validate that n is a finite,
non-negative integer within 0..=u16::MAX (e.g., check n.is_finite(), n >= 0.0, n
<= u16::MAX as f64, and n.fract() == 0.0) before calling KindMatcher::from_id,
and if validation fails return a JsError with a clear message; update the
conversion in the MatcherType::Kind branch that currently does
KindMatcher::from_id(n as u16) to perform these checks and only cast to u16
after validation.
In `@crates/wasm/src/ts_types.rs`:
- Around line 475-510: Implement Drop impls for the WASM-backed types that
allocate Emscripten memory so their underlying heap allocations are freed: add
impl Drop for Tree that calls self.delete() in drop, impl Drop for Parser that
calls Parser::delete(self) (or the instance method) in drop, and impl Drop for
TreeCursor that calls its delete() in drop; ensure these use the existing
delete() instance methods (matching Parser::__new(), Tree::copy()/delete(), and
TreeCursor::walk()/delete()) so clones (which call Tree::copy()) still own
distinct heap objects and are properly freed when dropped.
- Around line 33-47: The init() function has a TOCTOU race: multiple concurrent
callers can observe TREE_SITTER_INITIALIZED false and both call Parser::init();
fix by making the flag optimistic/atomic: set TREE_SITTER_INITIALIZED to true
(via TREE_SITTER_INITIALIZED.with(|cell| cell.replace(true))) before awaiting
Parser::init(), then call JsFuture::from(Parser::init()).await.lift_error(); if
that await returns an error, reset the flag back to false inside the same init()
(so future calls can retry) and return the error; alternatively implement a
three-state enum (Uninitialized/Initializing/Initialized) and have concurrent
callers wait or return early based on the state — update init(),
TREE_SITTER_INITIALIZED usage, and error-path handling accordingly.
---
Duplicate comments:
In `@crates/wasm/package.json`:
- Around line 16-18: The package `@ast-grep/wasm` is not included in the release
workflow and lacks entry-point fields; update the release process and package
metadata by either (A) adding a publish step in .github/workflows/release.yml
that publishes crates/wasm (ensuring npm publish uses the generated pkg
directory), or (B) documenting a manual publish workflow in the repo README.
Also add appropriate entry-point fields (main/module/exports) to the top-level
crates/wasm/package.json or explicitly state that consumers should use the
wasm-pack-generated pkg/package.json (and ensure "files": ["pkg"] remains), so
npm consumers resolve the correct entry point for `@ast-grep/wasm`.
In `@crates/wasm/src/doc.rs`:
- Around line 66-84: The accept_edit function currently mutates a local Vec<u8>
named input and never updates the struct, so subsequent do_edit (which reads
self.source.inner) re-parses the stale content; change accept_edit (the method
named accept_edit) to apply the splice to self.inner (or assign the modified
input back into self.inner) so the edited bytes are persisted on the struct
before returning the ts::Edit (ensure you update the same Vec<u8> used by
do_edit/self.source.inner rather than only a local copy).
In `@crates/wasm/src/sg_node.rs`:
- Around line 379-400: The commit_edits function can underflow and slice
out-of-bounds; validate each WasmEdit's start_pos and end_pos against offset and
old_content.len() before using them: in commit_edits, convert start_pos and
end_pos to checked values (e.g. ensure start_pos >= offset and end_pos >=
start_pos, then compute start_idx = start_pos - offset and end_idx = end_pos -
offset using checked subtraction) and return a JsError if any check fails; also
detect and return an error for overlapping edits (e.g. ensure start_idx >=
previous end_idx) instead of silently skipping, and only then perform the
slicing/extend operations on old_content/new_content and update start to
end_idx.
---
Nitpick comments:
In `@crates/wasm/src/wasm_lang.rs`:
- Line 229: The LanguageError match arm prints a debug representation; update
the SgWasmError::LanguageError arm to use the JS error message instead of
"{:?}". Specifically, replace write!(f, "Language error: {:?}", err) in the
SgWasmError Display implementation with write!(f, "Language error: {}",
err.message()) so the human-readable JS error message is shown (you can fallback
to a debug form if needed).
In `@crates/wasm/tests/setup.js`:
- Line 1: The file imports the unused symbol "path" via const path =
require("path"); — remove that unused import line from the top of
crates/wasm/tests/setup.js (delete the const path = require("path"); statement)
and verify there are no remaining references to "path" in setup.js; run the
linter/test to confirm no unused-import warnings remain.
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
crates/wasm/src/sg_node.rs (1)
50-61:root()can be simplified to use thecast_matchhelper consistently.The raw-pointer dance (
Rc::as_ptr→*const T→&'static T) is unnecessary; the same lifetime extension can be done via the existingcast_matchtransmute helper, matching every other method in the file.♻️ Proposed simplification
- pub fn root(&self) -> SgNode { - // SAFETY: WasmDoc's Node type wraps a JS SyntaxNode (GC-managed, Clone). - // It does not actually borrow from the Rust tree. The Rc keeps the - // AstGrep alive as long as any SgNode references it. - let root_ref: &'static AstGrep<WasmDoc> = - unsafe { &*(Rc::as_ptr(&self.inner) as *const AstGrep<WasmDoc>) }; - let node_match: NodeMatch<'static, WasmDoc> = root_ref.root().into(); - SgNode { - _root: self.inner.clone(), - inner: node_match, - } - } + pub fn root(&self) -> SgNode { + // SAFETY: WasmDoc's Node wraps a JS GC-managed SyntaxNode; no Rust + // lifetime constraint is actually enforced. _root keeps AstGrep alive. + let nm: NodeMatch<'_, WasmDoc> = self.inner.root().into(); + SgNode { + _root: self.inner.clone(), + inner: unsafe { SgNode::cast_match(nm) }, + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/wasm/src/sg_node.rs` around lines 50 - 61, The root() method is doing an unnecessary raw-pointer transmute to get a &'static AstGrep<WasmDoc> and NodeMatch; replace that pointer dance with the existing cast_match helper used elsewhere: call cast_match on self.inner (or on the result of self.inner.root()) to extend lifetimes safely and produce a NodeMatch<'static, WasmDoc>, then construct and return SgNode with _root: self.inner.clone() and inner: that cast match; this keeps behavior identical while removing the manual Rc::as_ptr/unsafe pointer conversion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/wasm/Cargo.toml`:
- Line 2: The crate's library name is wrong for tests: add a [lib] section that
sets name = "ast_grep_wasm" so the Rust crate is exported as ast_grep_wasm
(matching references in tests like ast_grep_wasm::) while leaving the package
name = "wasm" for npm packaging; update Cargo.toml by adding a [lib] block
containing name = "ast_grep_wasm".
In `@crates/wasm/package.json`:
- Line 37: Update the tree-sitter-python dependency to match the runtime ABI by
changing the version specifier for "tree-sitter-python" in package.json from
^0.23.0 to ^0.25.0 so it aligns with "tree-sitter-javascript" and
"web-tree-sitter"; after updating, run npm install (or yarn) and rebuild any
WASM/grammar artifacts to ensure the grammar WASM binaries are compiled against
the same tree-sitter ABI as web-tree-sitter.
In `@crates/wasm/README.md`:
- Around line 132-134: Update the filename() docs to remove the incorrect
reference to the non-existent findInFiles API and clarify behavior: state that
Document.filename() returns the file path when the document was created/loaded
from a file (e.g., via parse of file input or the WASM package's loader) and
returns "anonymous" when the document was created via parse from a string;
adjust the text around filename(), parse, and any mention of findInFiles to
avoid implying that findInFiles exists in the WASM package.
In `@crates/wasm/scripts/patch-pkg.mjs`:
- Line 5: The current assignment to pkgPath uses new URL('../pkg/package.json',
import.meta.url).pathname which yields an invalid Windows path; import
fileURLToPath from 'url' and set pkgPath = fileURLToPath(new
URL('../pkg/package.json', import.meta.url)) instead, adding the import {
fileURLToPath } from 'url' at the top of the module and leaving the rest of the
script (reads/writes using pkgPath) unchanged.
---
Duplicate comments:
In `@crates/wasm/__test__/index.spec.mjs`:
- Around line 228-234: The test mixes asymmetric sibling traversal names
(calling next_node() but prev()), so make them consistent: update the test to
use matching API names (e.g., call b.prev_node() to mirror a.next_node()), or
alternatively change a.next_node() to a.next() if the public API is
next()/prev(); locate the calls in the test around parse('javascript', ...),
sg.root().find(...) and replace the inconsistent method so both sibling
traversals use the same method naming convention (next_node()/prev_node() or
next()/prev()).
In `@crates/wasm/Cargo.toml`:
- Line 31: The Cargo.toml currently depends on the unmaintained "wee_alloc"
crate (the line with wee_alloc = "0.4.5"); remove that dependency from
Cargo.toml and any corresponding global allocator usage (search for wee_alloc
and #[global_allocator] in the wasm crate, e.g. in lib.rs or main.rs) and either
rely on the default allocator or replace it with a maintained alternative
allocator (choose and add the alternative crate and update its initialization if
you opt to replace). Ensure Cargo.toml no longer lists wee_alloc and that any
code referencing the wee_alloc symbol is removed or migrated to the chosen
allocator.
In `@crates/wasm/package.json`:
- Around line 15-17: The package.json currently only lists "files": ["pkg"] and
is missing the top-level entry points and an npm release workflow; update
package.json (symbols: "main", "module", "types", and "exports") to point to the
built outputs in pkg (e.g., pkg/index.cjs, pkg/index.esm.js, and pkg/index.d.ts)
so consumers can resolve the package, and add a GitHub Actions workflow (e.g.,
.github/workflows/release.yml) that builds the crate, runs tests, and publishes
to npm using the official actions (setup-node + npm publish) with proper auth
and semantic-version tagging to automate releases.
In `@crates/wasm/src/doc.rs`:
- Around line 205-224: The iterator in field_children ignores the boolean result
of cursor.goto_first_child(), so on leaf nodes it proceeds incorrectly; update
field_children to check the return of cursor.goto_first_child() and immediately
return an empty iterator when it returns false, then proceed with the existing
from_fn logic (using cursor.goto_next_sibling(), cursor.current_field_id(), and
cursor.current_node()) — ensure you don't set done = field_id.is_none() before
confirming there is a first child and initialize the iteration only after a
successful goto_first_child().
- Around line 109-119: The current WasmDoc::try_new builds Wrapper { inner:
src.chars().collect() } which turns the source into char indices and breaks
byte/UTF-8 index alignment for non-BMP characters (e.g. emoji), causing a latent
index-space mismatch with parser.parse_with_string. Fix by preserving the
original byte/UTF-8 representation instead of chars: collect src.into_bytes()
(or src.as_bytes().to_vec()) into Wrapper.inner and adjust Wrapper.inner's type
to Vec<u8> (and any code that reads it) so indices passed between WasmDoc,
Wrapper, and parser.parse_with_string remain consistent.
- Around line 66-84: accept_edit clones self.inner into a local input, applies
input.splice(...), but never writes the modified input back to self.inner so
subsequent do_edit sees stale source; update accept_edit to persist the mutated
buffer by assigning the modified input back into self.inner (converting input to
the original self.inner type if needed) before returning the ts::Edit so the
in-memory source matches the returned edit.
In `@crates/wasm/src/lib.rs`:
- Around line 18-19: The code sets a global allocator using the unmaintained
wee_alloc (the #[global_allocator] attribute and the static ALLOC =
wee_alloc::WeeAlloc::INIT), which should be removed or replaced; remove the
#[global_allocator] line and the static ALLOC declaration and instead either
rely on the default allocator for the target or replace it with a maintained
allocator (e.g., a current wasm-friendly crate) by declaring its allocator
static in place of ALLOC—search for #[global_allocator] and the static ALLOC
symbol to update or delete.
In `@crates/wasm/src/sg_node.rs`:
- Around line 12-18: Update the misleading doc comments to state these are byte
offsets (byte indices) not character/code-point offsets: change the comments on
struct WasmEdit fields start_pos and end_pos and on Pos.index (which is
populated from self.inner.range()) to explicitly say "byte offset" or "byte
index" and mention they come from tree-sitter byte indices (self.inner.range())
so readers won't assume character/code-point offsets.
- Around line 395-407: In commit_edits, you're still subtracting offset from
diff.end_pos without validation — mirror the start_pos fix: validate both
diff.start_pos and diff.end_pos against offset and old_content bounds before
using them; ensure diff.end_pos >= offset and (diff.end_pos - offset) <=
old_content.len() (and similarly for start_pos) and return an Err or skip the
edit if checks fail to avoid underflow and panics when computing start =
diff.end_pos as usize - offset and when slicing old_content[start..]. Use the
same error/skip behavior you implemented for start_pos to keep handling
consistent with Wrapper::decode_str and the final old_content slice.
- Around line 107-108: The code path that converts a JSON number to a
KindMatcher uses m.as_f64() and then casts with n as u16 in the
MatcherType::Kind branch (calling KindMatcher::from_id), which silently
truncates out-of-range or non-integer floats; change this to validate that
m.as_f64() yields an integral value within 0..=u16::MAX (e.g., check n.fract()
== 0.0 and n >= 0.0 && n <= u16::MAX as f64) and return an Err or a clear
parse/match error for invalid values instead of performing a direct cast, then
pass the safely converted u16 into KindMatcher::from_id.
---
Nitpick comments:
In `@crates/wasm/src/sg_node.rs`:
- Around line 50-61: The root() method is doing an unnecessary raw-pointer
transmute to get a &'static AstGrep<WasmDoc> and NodeMatch; replace that pointer
dance with the existing cast_match helper used elsewhere: call cast_match on
self.inner (or on the result of self.inner.root()) to extend lifetimes safely
and produce a NodeMatch<'static, WasmDoc>, then construct and return SgNode with
_root: self.inner.clone() and inner: that cast match; this keeps behavior
identical while removing the manual Rc::as_ptr/unsafe pointer conversion.
868f268 to
57d533b
Compare
|
Thanks for the wasm support !! Longing to use ast-grep in the browser for a long time! |
|
The CI build pipeline does not seem to build the wasm assets, i.e. they do not show up in github's Releases > assets page. -- Also your PR mentions newly added docs: however there seems to be none yet on the official page which would be great of discoverability !
|
|
@MentalGear release is hosted on npm manually. The release is experimental and performance is bad. use at your own discretion. https://www.npmjs.com/package/@ast-grep/wasm?activeTab=readme |
Fix #2450
Summary by CodeRabbit
New Features
Documentation
Tests
Chores