-
Notifications
You must be signed in to change notification settings - Fork 440
[SwiftLexicalLookup][GSoC] Add initial name lookup functionality #2719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
case identifier(String, SyntaxProtocol) | ||
/// Declaration associated with the name. | ||
/// Could be class, struct, actor, protocol, function and more | ||
case declaration(String, DeclSyntaxProtocol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the declarations with names conform to the NamedDeclSyntax
protocol. Please replace the (String, DeclSyntaxProtocl)
with NamedDeclSyntax
rather than duplicating the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it now to use NamedDeclSyntax
. I also introduced IdentifiableSyntax
protocol to introduce similar sort of logic to the rest of the relevant syntax nodes.
} | ||
|
||
/// Extracts name introduced by `closureShorthandParameter`. | ||
private static func handle(closureShorthandParameter: ClosureShorthandParameterSyntax) -> [LookupName] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you haven't covered the whole language yet, but closure captures are another thing to consider in the design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, I forgot about this case. Apart from closure captures, I'll try to also include guard
and file scope with this PR. With those changes, it should be pretty straight forward to introduce those and then we could move on to more complex scopes with the next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added closure capture, if case
support, guard
and file scope. I couldn't find a case where ClosureCaptureSyntax
expression
wouldn't be a DeclReferenceExprSyntax
hence the force unwrap. Do you think we can make this assumption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed on Monday, the desired behavior for the file scope can be adjusted by passing the right config to the lookup
method. By introducing LookupConfig
protocol we could also specify more configuration options that could add more specific behavior to name lookup in the future (something like: setting some default shadowing rules, filtering specific compiler result flags etc.). Also there's currently one quirk in the guard
scope implementation, even though it responds to all name lookup queries correctly, it assigns the declarations to enclosing scope. If you think we shouldn't merge it like this, I can put it fixed in the next PR instead. This one is already much larger than I anticipated.
/// declaration, followed by the first function name, and then the second function name, | ||
/// in this exact order. The constant declaration within the function body is omitted | ||
/// due to the ordering rules that prioritize visibility within the function body. | ||
@_spi(Experimental) public func lookup(for name: String) -> [LookupName] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should proactively unify the queries for "give me all of the entities visible at this point" (which is used for code completion, typo correction, etc.) and "give me all of the entities matching at this point" (which is used for resolving names). That could be as simple as making the name
parameter optional and, when it's non-nil
, filtering out any names that don't match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the name parameter optional and added support for looking up all names accessible at a given node. I made one test case that exercises this type of lookup. I'm only curious about the desired ordering.
class foo {
let 1️⃣a = 0
let 2️⃣b = 0
3️⃣func foo() {
let 4️⃣a = 0
let 5️⃣c = 0
6️⃣x
}
}
Like in this snippet, when running the lookup for all names at position 6️⃣, should we return the result as [(4️⃣, 5️⃣), (1️⃣, 2️⃣, 3️⃣)]
or [(5️⃣, 4️⃣), (3️⃣, 2️⃣, 1️⃣)]
(i.e. in the order of introduction or accessibility from the node used for lookup)? I think the first option would make more sense as we wouldn't have to account for cases like an additional declaration in a member block under the scope we perform lookup in and it's also the way name lookup works now, but I'm also curious of your take on this one.
/// declaration, followed by the first function name, and then the second function name, | ||
/// in this exact order. The constant declaration within the function body is omitted | ||
/// due to the ordering rules that prioritize visibility within the function body. | ||
@_spi(Experimental) public func lookup(for name: String) -> [LookupName] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally and unrelatedly, I think we're going to need more structure in the result than a flat array [LookupName]
. The current structure does not allow a client to distinguish between "there are two things that match f
in the same scope" and "there are two things that match f
but the first one shadows the second one", which is important for clients. I suggest returning an array of "interesting" scopes, where an interesting scope contains the names the found in that scope, and perhaps other directions like "look into the instance members of ". For now, we could just have an array of scopes, where each scope has the LookupNames
found in it, so we can handle shadowing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reaso 8000 n will be displayed to describe this comment to others. Learn more.
I introduced a new enum LookupResult
that, as for now, has one case fromScope
that holds ScopeSyntax
and [LookupName]
associated with that scope. With the lookup
function result type now being [LookupResult]
, I think this could be a nice structure to work with. We could easily add new cases for the result that could encapsulate more information and e.g. act as compiler flags or provide suggestions for shadowing. What do you think about it?
) -> [LookupName] { | ||
introducedNames | ||
.filter { introducedName in | ||
(!positionSensitive || introducedName.isBefore(syntax)) && introducedName.refersTo(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This "is before" check isn't going to work when we start modeling the precise location where a variable in a pattern is introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it now to isAccessible
which is used in every filtering condition and accounts for the precise location of introduction set by the parent scope. Do you think this could work in the long run?
…raction in `LookupName`.
…dd a way to customize lookup behavior through `LookupConfig`/
|
||
extension ClosureCaptureSyntax: IdentifiableSyntax { | ||
@_spi(Experimental) public var identifier: TokenSyntax { | ||
expression.as(DeclReferenceExprSyntax.self)!.baseName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't going to handle all of the possible kinds of captures. For example, here are some captures that aren't a DeclReferenceExprSyntax
:
{ [x=y+1, weak self] in print(x) }
I think the best result here would involve reworking the syntax tree and parser to provide more structure for the parsed closure captures, so there's a single node describing the captured name that we can make IdentifiableSyntax
.
import SwiftSyntax | ||
|
||
/// Syntax node that can be refered to with an identifier. | ||
public protocol IdentifiableSyntax: SyntaxProtocol { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like the kind of thing that we should sink down into SwiftSyntax itself. It can be a refactoring we apply later, when we're more sure about the design.
case .identifier(let syntax, _): | ||
syntax.identifier.text | ||
case .declaration(let syntax, _): | ||
syntax.name.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be using Identifier
here to properly work with back-ticked names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. I changed it to use Identifier
for name comparison. That also means, the current implementation doesn't work for implicit names like self
as it's token kind is not identifier
. I removed one test case that exercised self
capture in closures, but will add it back once we properly model implicit names (hopefully in the next PR).
/// declaration, followed by the first function name, and then the second function name, | ||
/// in this exact order. The constant declaration within the function body is omitted | ||
/// due to the ordering rules that prioritize visibility within the function body. | ||
@_spi(Experimental) public func lookup(for name: String?, with configArr: [LookupConfig] = []) -> [LookupResult] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the array of LookupConfig
is more abstract than we need. Any kind of configuration is going to need custom logic within the lookup
function. I recommend replacing this array (and the LookupConfig
protocol) with a simple struct with configuration options, e.g.,
struct LookupConfig {
var fileScopeHandling: FileScopeHandlingConfig = (whatever the default is)
}
Over time, we can expand it if we need to. For example, we could add an optional build configuration when we're ready to deal with #if
s:
var buildConfiguration: (any BuildConfiguration)? = nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I introduced LookupConfig
struct to simplify configuration handling. Thank you for pointing out using build configuration. Do you think we could also use it now to provide default file scope behavior? Or should we leave it for later when we have more configurations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to use it now. We'll benefit from having it threaded through the APIs that need it.
case identifier(IdentifiableSyntax, accessibleAfter: AbsolutePosition?) | ||
/// Declaration associated with the name. | ||
/// Could be class, struct, actor, protocol, function and more | ||
case declaration(NamedDeclSyntax, accessibleAfter: AbsolutePosition?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect that we'll need another case here eventually for "implicit" names, such as the error
introduced in
do {
try something()
} catch { // the catch clause introduces the name `error`.
}
This can, of course, come later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. I'll introduce implicit names with the next PR.
|
||
extension ClosureCaptureSyntax: IdentifiableSyntax { | ||
@_spi(Experimental) public var identifier: TokenSyntax { | ||
expression.as(DeclReferenceExprSyntax.self)!.baseName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work for all closure captures. For example:
{ [x=y+1, weak self] in ... }
I suspect the right answer is to rework the syntax tree so that the syntax structure always provides something that we can make an IdentifiableSyntax without the !
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this doesn't have to happen now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for now, I left a comment next to the force unwrap so we don't forget about it. Do you have already an idea how could we alter the syntax tree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we want different syntactic forms for captures like x
, x=<expr>
, and weak x
, where the x
in every case is just the syntax for the capture name. It will make it a lot easier to identify the names introduced by captures.
import SwiftSyntax | ||
|
||
/// Syntax node that can be refered to with an identifier. | ||
public protocol IdentifiableSyntax: SyntaxProtocol { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's iterate with this protocol some more. We might want to sink it down into SwiftSyntax and make the NamedDeclSyntax protocol inherit from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this doesn't have to happen now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll keep adding the protocol to more syntax nodes like FunctionParameterSyntax
and GenericParameterSyntax
in the coming PRs. Also, would it be a good idea to soon fix the problem with ClosureCaptureSyntax
so we can complete implementation of IdentifiableSyntax
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix with ClosureCaptureSyntax
should be a separate PR from this one entirely, since it will involve changing the syntax tree. I don't specifically mind whether it goes before or after this PR; we can fix it up either way. (But we shouldn't sink this protocol down into SwiftSyntax until we fix the issue)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, I think I'm happy with this PR as the starting point, and then we can refine it in subsequence PRs to improve the modeling of closures and to introduce implicit names.
@swift-ci please test |
@swift-ci please test |
@swift-ci please test Windows |
Sources/SwiftLexicalLookup/Configurations/FileScopeNameIntroductionStrategy.swift
Show resolved
Hide resolved
/// in this exact order. The constant declaration within the function body is omitted | ||
/// due to the ordering rules that prioritize visibility within the function body. | ||
@_spi(Experimental) public func lookup( | ||
for name: String?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these API’s take String
as a parameter or Identifier
? I think it really depends on how we envision this API to be used. If we mostly expect it to be used to look up names of eg. DeclReferenceExprSyntax
, then I’d prefer to use Identifier
because it means that callers can’t get handling of escaping using backticks wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could maybe be a bit tricky to also support implicit name lookup like self
, newValue
etc. when using Identifier
. That’s why I actually had to remove self
capture from this PR as implicit name handling is now part of #2748. I might’ve missed something, so I’d love to hear your take on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think newValue
can be easily represented as an Identifier
. We just need to add an initializer that creates a new Identifier
from a String
.
I was briefly suspicious of self
because I thought that the keyword self
and the identifier self
could refer to different things. But looks like they are considered identical
struct Foo {
func test() {
let `self` = 1
print(self)
print(`self`)
}
}
Foo().test()
// Prints
// 1
// 1
// I might have expected
// Foo()
// 1
The following deviation of this might be a good test case as well
struct Foo {
func test() {
print(self)
let `self` = 1
print(`self`)
}
}
Another advantage of using Identifier
would be that, if we decide to introduce some semantics where eg. the self
keyword refers to something different than the self
identifier, we might be able to add that to the Identifier
type. If we use String
as the basis, our hands are tied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a separate branch with Identifier-based lookup. I'm not entirely sure this is the most optimal solution, as it's a naive one, so I'd love to hear your feedback.
I added a new StringIdentifier
that strips backticks and added it as an optional property to Identifier
. I also made optional the original raw
and arena
properties. I'm mainly concerned about having the old properties as optional.
public struct StringIdentifier: Equatable, Hashable, Sendable {
public let name: String
fileprivate init(_ string: String) {
let backtick = "`"
if string.count > 2 && string.hasPrefix(backtick) && string.hasSuffix(backtick) {
let startIndex = string.index(after: string.startIndex)
let endIndex = string.index(before: string.endIndex)
self.name = String(string[startIndex..<endIndex])
} else {
self.name = string
}
}
}
I also added the test cases you've mentioned, that now work as expected due to the new Identifier-based name comparison.
As it's an internal change in SwiftSyntax
module, I'd like to first ask for your feedback before adding it into the next PR. Can you perhaps think of a better way to handle this?
assertLexicalScopeQuery( | ||
source: source, | ||
methodUnderTest: { marker, argument in< 10000 /span> | ||
let result = argument.lookup(for: useNilAsTheParameter ? nil : argument.text, with: config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be argument.identifier.name
so that we handle escaped names correctly? Also seem my comment on whether the lookup
API should be based on String
or Identifier
.
let result = argument.lookup(for: useNilAsTheParameter ? nil : argument.text, with: config) | |
let result = argument.lookup(for: useNilAsTheParameter ? nil : argument.text, with: config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wouldn’t work for implicit name references like self
, as their identifiers are nil
. In this case, those are just internals of a test method so I think it should be fine to just use argument.text
. I think, as for now at least, we should left out using token.identifier.name
over token.text
up to the client and keep using String
in name lookup for the sake of simplicity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the result of this will be whether the lookup API should be Identifier
-based or String
-based. So let’s continue that one.
@MAJKFL I've merged, but please look at Alex's comments above to address in a follow-up PR. |
Thanks for merging! Once I create a new PR with the suggested changes, I'll respond to the comments and link the fixes. |
This PR introduces
ScopeSyntax
protocol and some initial scope implementations.let (a, b) = (0, 0)
and inline chaininglet a = 0, b = 0
.LookupName
, an enum that distinguishes identifiers (extracted from variable declarations and/or closure/function parameters) from declarations (struct, class, actor, function etc.). We could possibly also add more cases that would serve as flags for the compiler later on.for
andwhile
loops and closures.assertLexicalScopeQuery
test harness.ScopeSyntax
protocol to build hierarchy and uses position of the node used for lookup to determine availability of certain declarations (like in code block orif let
).