librustc_lexer: Refactor the module by popzxc · Pull Request #66015 · rust-lang/rust · GitHub
librustc_lexer: Refactor the module #66015

Merged: 9 commits, Nov 6, 2019
librustc_lexer: Simplify "lifetime_or_char" method
popzxc committed Nov 4, 2019
commit ecd26739d45837ee21fe0e2941f957086fbf6a47
67 changes: 37 additions & 30 deletions src/librustc_lexer/src/lib.rs
@@ -498,41 +498,48 @@ impl Cursor<'_> {

     fn lifetime_or_char(&mut self) -> TokenKind {
         debug_assert!(self.prev() == '\'');
-        let mut starts_with_number = false;

-        // Check if the first symbol after '\'' is a valid identifier
-        // character or a number (not a digit followed by '\'').
-        if (is_id_start(self.nth_char(0))
-            || self.nth_char(0).is_digit(10) && {
-                starts_with_number = true;
-                true
-            })
-            && self.nth_char(1) != '\''
-        {
-            self.bump();
-
-            // Skip the identifier.
-            while is_id_continue(self.nth_char(0)) {
-                self.bump();
-            }
+        let can_be_a_lifetime = if self.second() == '\'' {
+            // It's surely not a lifetime.
+            false
+        } else {
+            // If the first symbol is valid for identifier, it can be a lifetime.
+            // Also check if it's a number for a better error reporting (so '0 will
+            // be reported as invalid lifetime and not as unterminated char literal).
+            is_id_start(self.first()) || self.first().is_digit(10)
+        };

-            return if self.nth_char(0) == '\'' {
-                self.bump();
-                let kind = Char { terminated: true };
-                Literal { kind, suffix_start: self.len_consumed() }
-            } else {
-                Lifetime { starts_with_number }
-            };
+        if !can_be_a_lifetime {
+            let terminated = self.single_quoted_string();
+            let suffix_start = self.len_consumed();
+            if terminated {
+                self.eat_literal_suffix();
+            }
+            let kind = Char { terminated };
+            return Literal { kind, suffix_start };
         }

-        // This is not a lifetime (checked above), parse a char literal.
-        let terminated = self.single_quoted_string();
-        let suffix_start = self.len_consumed();
-        if terminated {
-            self.eat_literal_suffix();
+        // Either a lifetime or a character literal with
+        // length greater than 1.
+
+        let starts_with_number = self.first().is_digit(10);
+
+        // Skip the literal contents.
+        // First symbol can be a number (which isn't a valid identifier start),
+        // so skip it without any checks.
+        self.bump();
+        self.eat_while(is_id_continue);
+
+        // Check if after skipping literal contents we've met a closing
+        // single quote (which means that user attempted to create a
+        // string with single quotes).
+        if self.first() == '\'' {
+            self.bump();
+            let kind = Char { terminated: true };
Contributor Author

By the way, I'm not sure why we consume the literal suffix above but do not consume it here.
As a result, non-single-character single-quoted literals with suffixes produce different errors depending on the first symbol:
Playground 1 / Playground 2

That's pretty esoteric, I know, but nevertheless it seems a bit inconsistent to me.

Contributor Author

(I'm not sure that can even be called a bug, since the code in the example is completely invalid.)

Member

I think if we detect an error in the char literal, it's better to recover the next token as an identifier rather than treat it as a suffix.

+            return Literal { kind, suffix_start: self.len_consumed() };
         }
-        let kind = Char { terminated };
-        return Literal { kind, suffix_start };
+
+        return Lifetime { starts_with_number };
     }

     fn single_quoted_string(&mut self) -> bool {
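
To make the control flow of the refactored method and the suffix question raised in the review thread easier to follow, here is a minimal standalone sketch. It is not the librustc_lexer implementation: the `Cursor` type, the ASCII-only `is_id_start` / `is_id_continue`, the simplified `single_quoted_string`, the `CharLiteral` variant, and the `lex_quote` helper are all assumptions made for illustration. Only the body of `lifetime_or_char` follows the new side of the diff. The input `'+b'u8` is a hypothetical example, not one of the playground links.

```rust
/// Simplified token kinds (assumed shape; the real lexer nests `Char` in `Literal`).
#[derive(Debug, PartialEq)]
enum TokenKind {
    Lifetime { starts_with_number: bool },
    CharLiteral { terminated: bool, suffix_start: usize },
}

const EOF_CHAR: char = '\0';

// ASCII-only approximations of the identifier predicates (assumption).
fn is_id_start(c: char) -> bool { c.is_ascii_alphabetic() || c == '_' }
fn is_id_continue(c: char) -> bool { c.is_ascii_alphanumeric() || c == '_' }

/// Hypothetical minimal cursor over the input.
struct Cursor<'a> {
    initial_len: usize,
    chars: std::str::Chars<'a>,
}

impl<'a> Cursor<'a> {
    fn new(input: &'a str) -> Self {
        Cursor { initial_len: input.len(), chars: input.chars() }
    }
    fn first(&self) -> char { self.chars.clone().next().unwrap_or(EOF_CHAR) }
    fn second(&self) -> char { self.chars.clone().nth(1).unwrap_or(EOF_CHAR) }
    fn bump(&mut self) -> Option<char> { self.chars.next() }
    fn len_consumed(&self) -> usize { self.initial_len - self.chars.as_str().len() }
    fn eat_while(&mut self, pred: impl Fn(char) -> bool) {
        while !self.chars.as_str().is_empty() && pred(self.first()) {
            self.bump();
        }
    }

    /// Simplified: scan to the closing quote, giving up at newline or end of input.
    fn single_quoted_string(&mut self) -> bool {
        loop {
            match self.first() {
                '\'' => { self.bump(); return true; }
                '\n' | EOF_CHAR => return false,
                // An escape counts as one character, so skip two symbols.
                '\\' => { self.bump(); self.bump(); }
                _ => { self.bump(); }
            }
        }
    }

    /// Simplified: a suffix is just an identifier directly after the literal.
    fn eat_literal_suffix(&mut self) {
        if is_id_start(self.first()) {
            self.bump();
            self.eat_while(is_id_continue);
        }
    }

    /// Mirrors the new side of the diff; the opening quote is already consumed.
    fn lifetime_or_char(&mut self) -> TokenKind {
        let can_be_a_lifetime = if self.second() == '\'' {
            false
        } else {
            is_id_start(self.first()) || self.first().is_digit(10)
        };
        if !can_be_a_lifetime {
            let terminated = self.single_quoted_string();
            let suffix_start = self.len_consumed();
            if terminated {
                self.eat_literal_suffix();
            }
            return TokenKind::CharLiteral { terminated, suffix_start };
        }
        let starts_with_number = self.first().is_digit(10);
        self.bump();
        self.eat_while(is_id_continue);
        if self.first() == '\'' {
            self.bump();
            // Note: no suffix is consumed on this path.
            let suffix_start = self.len_consumed();
            return TokenKind::CharLiteral { terminated: true, suffix_start };
        }
        TokenKind::Lifetime { starts_with_number }
    }
}

/// Lexes one token that starts with a single quote.
/// Returns the token kind and the number of bytes consumed.
fn lex_quote(src: &str) -> (TokenKind, usize) {
    let mut cursor = Cursor::new(src);
    assert_eq!(cursor.bump(), Some('\''));
    let kind = cursor.lifetime_or_char();
    (kind, cursor.len_consumed())
}
```

Under these assumptions, `lex_quote("'ab'u8")` stops right after the closing quote (4 bytes consumed), while `lex_quote("'+b'u8")` also consumes the `u8` suffix (6 bytes). That is the kind of first-symbol-dependent difference the review comment describes.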