Add Prism.node_for(Method|UnboundMethod|Proc) to get a Prism node for a callable #3808

eregon · 2025-12-14T14:12:19Z

See https://bugs.ruby-lang.org/issues/21005
and https://bugs.ruby-lang.org/issues/20999

tompng · 2025-12-14T15:27:08Z

Looks like Proc generated by for in syntax is missing

lib/prism/parse_result.rb

eregon · 2025-12-14T17:56:41Z

The 2.7 failure is quite annoying: https://github.com/ruby/prism/actions/runs/20211767571/job/58018650394?pr=3808
It can't parse def inline_method = 42. We can't wrap that in eval otherwise Prism.node_for won't work on it.
I suppose we could use a separate file, but it feels messy.
I tried to repro locally but the main Gemfile doesn't even work in 2.7, so it seems difficult to repro & test locally.
I wish we could just drop 2.7, it has this partial kwargs separation which makes it pretty messy in general.

Earlopain · 2025-12-14T18:27:08Z

I wish we could just drop 2.7, it has this partial kwargs separation which makes it pretty messy in general.

This is something we should probably talk about sometime in the future. It was added for the convenience of rubocop as far as I'm aware and I appreciate that this happened. But I do not want to keep these long EOL versions supported forever.

For now I would just put this in a separate file like you said. You shoudl be able to use the version-specific gemfile with BUNDLE_GEMFILE=gemfiles/2.7/Gemfile, that works for me at least.

eregon · 2025-12-14T19:55:26Z

Looks like Proc generated by for in syntax is missing

@tompng Do you mean:

$ bin/parse -e 'for e in foo; body; end'
@ ProgramNode (location: (1,0)-(1,23))
├── flags: ∅
├── locals: [:e]
└── statements:
    @ StatementsNode (location: (1,0)-(1,23))
    ├── flags: ∅
    └── body: (length: 1)
        └── @ ForNode (location: (1,0)-(1,23))
            ├── flags: newline
            ├── index:
            │   @ LocalVariableTargetNode (location: (1,4)-(1,5))
            │   ├── flags: ∅
            │   ├── name: :e
            │   └── depth: 0
            ├── collection:
            │   @ CallNode (location: (1,9)-(1,12))
            │   ├── flags: variable_call, ignore_visibility
            │   ├── receiver: ∅
            │   ├── call_operator_loc: ∅
            │   ├── name: :foo
            │   ├── message_loc: (1,9)-(1,12) = "foo"
            │   ├── opening_loc: ∅
            │   ├── arguments: ∅
            │   ├── closing_loc: ∅
            │   ├── equal_loc: ∅
            │   └── block: ∅
            ├── statements:
            │   @ StatementsNode (location: (1,14)-(1,18))
            │   ├── flags: ∅
            │   └── body: (length: 1)
            │       └── @ CallNode (location: (1,14)-(1,18))
            │           ├── flags: newline, variable_call, ignore_visibility
            │           ├── receiver: ∅
            │           ├── call_operator_loc: ∅
            │           ├── name: :body
            │           ├── message_loc: (1,14)-(1,18) = "body"
            │           ├── opening_loc: ∅
            │           ├── arguments: ∅
            │           ├── closing_loc: ∅
            │           ├── equal_loc: ∅
            │           └── block: ∅
            ├── for_keyword_loc: (1,0)-(1,3) = "for"
            ├── in_keyword_loc: (1,6)-(1,8) = "in"
            ├── do_keyword_loc: ∅
            └── end_keyword_loc: (1,20)-(1,23) = "end"

does not have a BlockNode?
I suppose we could return the ForNode#statements or probably better the ForNode itself.

eregon · 2025-12-14T20:06:18Z

or probably better the ForNode itself.

Implemented.
@tompng Thank you for the reminder, I didn't think of the block given to for.

… a callable * See https://bugs.ruby-lang.org/issues/21005 and https://bugs.ruby-lang.org/issues/20999

lib/prism/parse_result.rb

eregon · 2025-12-14T20:50:09Z

Looking at things like node.compact_child_nodes.each maybe we should have node.each_child_node? It would avoid the Array allocation and since there is no point to yield nil there is no need to name it node.each_compact_child_node.
Definitely something for another PR, but @kddnewton WDYT?

8000

* By comparing byte offsets which folds 3 branches into 1. * Also avoids allocation of Location objects.

eregon · 2025-12-15T10:41:04Z

I found that {Method,UnboundMethod,Proc}#source_location returns columns in bytes and not in characters.
I think that's a bug and filed https://bugs.ruby-lang.org/issues/21783 but nevertheless it's not a problem for this PR.

EDIT: It also means line_and_character_column_to_byte_offset isn't used in this PR anymore but I think it's a good addition anyway, and it will become used again when/if that CRuby bug is fixed.

eregon · 2025-12-15T10:53:55Z

lib/prism/parse_result.rb

+      when DefNode
+        node.start_offset == start_offset && node.end_offset == end_offset
+      when LambdaNode
+        # Proc#source_location returns start_column 2 for `-> { ... }` (just after the `->`)


Filed https://bugs.ruby-lang.org/issues/21784 for this

eregon · 2025-12-15T11:05:22Z

lib/prism/parse_result.rb

+    end_offset = parse_result.source.line_to_byte_offset(end_line) + end_column
+    start_byte_column = start_offset - parse_result.source.line_start(start_offset)
+
+    found = root.tunnel(start_line, start_byte_column).reverse.find do |node|


rfind would be nice here but of course it's only RUby 4+.
I guess we could do a polyfill but for this case I doubt it matters, most of the time is not spent here but in creating the Ruby objects for the AST, there are typically only a few nodes returned by tunnel.

eregon · 2025-12-15T11:17:24Z

Re the test-all failure, I'll fix it, makes it clear test-unit assert_raise is quite problematic: test-unit/test-unit#347

…offsets

kddnewton · 2025-12-15T13:37:37Z

lib/prism/parse_result.rb

+    end
+    root = parse_result.value
+    # CRuby currently returns the source_location columns in bytes and not characters
+    start_offset = parse_result.source.line_to_byte_offset(start_line) + start_column


I'm not understanding what you're doing here exactly. We already have line and column, and it looks like you're calculating offset, and then using that to recompute the column? The column is already in bytes, and that's what tunnel expects, so you should just be able to use that directly.

I compute start_offset & end_offset so we can have fast comparisons below (2 comparisons vs 4) and without creating Location objects for most cases.

Right, since CRuby currently return the column in byte I can remove start_byte_column, will push a commit to do that.

Done, I amended the last commit to simplify this part.

BTW, it might be easier to review commit by commit, that would show the purpose of start_offset better.

Okay last thing is I don't think we need the line_and_character_column_to_byte_offset API now.

Yeah, it's what I mentioned in #3808 (comment).
I'll remove it in its own commit, so if we need it again it'll be just a git revert away.

…characters

…te_offset

Earlopain · 2025-12-17T10:27:06Z

For reference, this won't be available (at least in Ruby 4.0) unfortunately https://bugs.ruby-lang.org/issues/21783#note-8

kddnewton · 2025-12-18T15:49:09Z

@eregon we've marked this as blocked until and unless the source location thing gets resolved. Another option is requesting an official node_id API. Is that something you would consider implementing in TruffleRuby?

Fix docs of opening_loc/closing_loc of BlockNode

f781416

eregon force-pushed the node_for branch from 28e3a3d to 6961042 Compare December 14, 2025 14:28

kddnewton requested changes Dec 14, 2025

View reviewed changes

lib/prism/parse_result.rb Show resolved Hide resolved

lib/prism/parse_result.rb Outdated Show resolved Hide resolved

eregon force-pushed the node_for branch from 6961042 to 7f5a25c Compare December 14, 2025 17:11

Earlopain reviewed Dec 14, 2025

View reviewed changes

lib/prism/parse_result.rb Outdated Show resolved Hide resolved

eregon force-pushed the node_for branch 2 times, most recently from dd02f75 to 5d510f5 Compare December 14, 2025 17:53

eregon force-pushed the node_for branch from fcece03 to ea6438a Compare December 14, 2025 18:07

eregon force-pushed the node_for branch 2 times, most recently from 24bdf87 to 3c0b91c Compare December 14, 2025 19:45

eregon added 5 commits December 14, 2025 21:08

Add Prism.node_for(Method|UnboundMethod|Proc) to get a Prism node for…

e12969b

… a callable * See https://bugs.ruby-lang.org/issues/21005 and https://bugs.ruby-lang.org/issues/20999

Stub source_location for older Rubies in NodeForTest

37481bf

We should use version: "current" in Prism.node_for

fb7ce50

Move out the inline method definition for Ruby 2.7

bc29248

Make rake typecheck:steep pass

2884d11

eregon force-pushed the node_for branch from 142c2f1 to 85c94ae Compare December 14, 2025 20:10

eregon requested a review from kddnewton December 14, 2025 20:43

eregon commented Dec 14, 2025

View reviewed changes

lib/prism/parse_result.rb Outdated Show resolved Hide resolved

eregon commented Dec 14, 2025

View reviewed changes

lib/prism/parse_result.rb Outdated Show resolved Hide resolved

eregon force-pushed the node_for branch from b1d85c2 to bca7512 Compare December 14, 2025 20:55

Handle for _ in _ procs in Prism.node_for

130def8

eregon force-pushed the node_for branch from bca7512 to e8310b1 Compare December 14, 2025 20:57

Use Prism::Node#tunnel to optimize search in Prism.node_for

fb92f80

eregon force-pushed the node_for branch from e8310b1 to 9c6cc97 Compare December 14, 2025 21:01

Simplify and optimize Prism::Node#tunnel

f8c80e6

* By comparing byte offsets which folds 3 branches into 1. * Also avoids allocation of Location objects.

eregon force-pushed the node_for branch 2 times, most recently from 966dda3 to eafdef7 Compare December 15, 2025 10:33

eregon commented Dec 15, 2025

View reviewed changes

Add Prism::Source#line_to_byte_offset and replace direct accesses to …

389fea8

…offsets

eregon force-pushed the node_for branch from eafdef7 to 4920b5c Compare December 15, 2025 11:20

kddnewton reviewed Dec 15, 2025

View reviewed changes

CRuby currently returns the source_location columns in bytes and not …

44578e7

…characters

eregon force-pushed the node_for branch from 4920b5c to 44578e7 Compare December 15, 2025 13:51

Remove currently-unused Prism::Source#line_and_character_column_to_by…

9c7550c

…te_offset

Earlopain added the blocked label Dec 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add Prism.node_for(Method|UnboundMethod|Proc) to get a Prism node for a callable #3808

Add Prism.node_for(Method|UnboundMethod|Proc) to get a Prism node for a callable #3808

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Add Prism.node_for(Method|UnboundMethod|Proc) to get a Prism node for a callable #3808

Are you sure you want to change the base?

Add Prism.node_for(Method|UnboundMethod|Proc) to get a Prism node for a callable #3808

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants