8000 Base stat/expr checking on the type system in IRChecker by gzm0 · Pull Request #4439 · scala-js/scala-js · GitHub
[go: up one dir, main page]

Skip to content

Base stat/expr checking on the type system in IRChecker #4439

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 7, 2021

Conversation

gzm0
Copy link
Contributor
@gzm0 gzm0 commented Feb 25, 2021

This allows us to remove some duplicate code for trees that can be in both statement or expression position.

@gzm0 gzm0 requested a review from sjrd February 25, 2021 07:29
@gzm0
Copy link
Contributor Author
gzm0 commented Feb 25, 2021

Note that typecheckExpect at this point essentially encodes the notion that AnyType <:< NoType. We might go one step further and put this into Types.isSubtype itself.


val expectedType =
if (tpe == NoType) AnyType
else tpe
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit suspicious. I guess the deal here is that return must take an expression. I wonder if we should require UndefType instead (in a different PR, that would be a non-backward compatible change).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's weird that a Return node demands an expression, even when the result of the Labeled block is NoType.

For what it's worth, in the theory it was the Return node that had a specific exception, making void (NoType) invalid:
image
(note the S ≠ void)

Perhaps we could align the code of the IR checker with the typing rules a bit better by moving this change to the checking of Return.

If we have to change something in the future, I would rather relax the restriction on Return nodes, and make them accept "expressions" of type NoType if the type of the Labeled is NoType. That would be much nicer than requiring UndefType I believe. Plus, it would be backward compatible.


case Block(trees) =>
if (trees.isEmpty)
reportError("illegal empty block")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is unnecessary. A Block will fail to be created with no trees as it cannot determine it's type.

sjrd
sjrd previously requested changes Feb 25, 2021
Copy link
Member
@sjrd sjrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is much nicer than what we had before. :)

@@ -1213,11 +1154,21 @@ private final class IRChecker(unit: LinkingUnit, logger: Logger) {
typecheckExpect(value, env, ctpe)
}

case _ =>
reportError(i"Invalid expression tree")
case _: JSSuperConstructorCall | _: RecordSelect | _: RecordValue | _: Transient =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
case _: JSSuperConstructorCall | _: RecordSelect | _: RecordValue | _: Transient =>
case _:JSSuperConstructorCall | _:RecordSelect | _:RecordValue | _:Transient =>


val expectedType =
if (tpe == NoType) AnyType
else tpe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's weird that a Return node demands an expression, even when the result of the Labeled block is NoType.

For what it's worth, in the theory it was the Return node that had a specific exception, making void (NoType) invalid:
image
(note the S ≠ void)

Perhaps we could align the code of the IR checker with the typing rules a bit better by moving this change to the checking of Return.

If we have to change something in the future, I would rather relax the restriction on Return nodes, and make them accept "expressions" of type NoType if the type of the Labeled is NoType. That would be much nicer than requiring UndefType I believe. Plus, it would be backward compatible.

if (tpe == NoType) AnyType
else tpe

typecheck(body, env.withLabeledReturnType(label.name, expectedType))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
typecheck(body, env.withLabeledReturnType(label.name, expectedType))
typecheckExpect(body, env.withLabeledReturnType(label.name, expectedType), tpe)

This is important, as the type of the body must also be a subtype of the type of the Labeled block.

@sjrd
Copy link
Member
sjrd commented Feb 25, 2021

Note that typecheckExpect at this point essentially encodes the notion that AnyType <:< NoType. We might go one step further and put this into Types.isSubtype itself.

It's worth investigating, at least.

@gzm0
Copy link
Contributor Author
gzm0 commented Feb 25, 2021

Hmmm... this might expose some genuine failures. In fact, in this version the IR checker is much stricter: It expects intermediate trees to be properly typed as well. Consider:

Block(
  If(<cond>, IntLiteral(1), Skip()),
  IntLiteral(2)
)

In the previous version of the IR checker, the If node may have any (!) type, even something completely unrelated. This new version will fail (correctly so IMO), if the If subtree isn't properly typed; independent of its usage.

For example this failure:

scala-js/javalib/src/main/scala/java/util/regex/GroupStartMapper.scala(447:9:If): any expected but <notype> found for tree of type org.scalajs.ir.Trees$Block

hints at that being the problem.

The relevant if block:

8000
if (baseNode ne null) { // null if we just completed an alternative
(pattern.charAt(pIndex): @switch) match {
case '+' | '*' | '?' =>
val startIndex = pIndex
if (pattern.charAt(startIndex + 1) == '?') // non-greedy mark
pIndex += 2
else
pIndex += 1
val repeater = pattern.substring(startIndex, pIndex)
sequence.push(new RepeatedNode(baseNode, repeater))
case '{' =>
// parse until end of occurrence
val startIndex = pIndex
pIndex = pattern.indexOf("}", startIndex + 1) + 1
if (pattern.charAt(pIndex) == '?') // non-greedy mark
pIndex += 1
val repeater = pattern.substring(startIndex, pIndex)
sequence.push(new RepeatedNode(baseNode, repeater))
case _ =>
val sequenceLen = sequence.length
if (sequenceLen != 0 && baseNode.isInstanceOf[LeafRegexNode] &&
sequence(sequenceLen - 1).isInstanceOf[LeafRegexNode]) {
val fused = new LeafRegexNode(
sequence(sequenceLen - 1).asInstanceOf[LeafRegexNode].regex +
baseNode.asInstanceOf[LeafRegexNode].regex)
sequence(sequenceLen - 1) = fused
} else {
sequence.push(baseNode)
}
}
}
}

if ((baseNode !== null)) {
  val x1$2{x1}: char = this.java.util.regex.GroupStartMapper$Parser::pattern.charAt;I;C(this.java.util.regex.GroupStartMapper$Parser::pIndex);
  match (((int)x1$2)) {
    case 43 | 42 | 63:
      {
        val startIndex$3{startIndex}: int = this.java.util.regex.GroupStartMapper$Parser::pIndex;
        if ((((int)this.java.util.regex.GroupStartMapper$Parser::pattern.charAt;I;C((startIndex$3 +[int] 1))) ==[int] ((int)'?'))) {
          this.java.util.regex.GroupStartMapper$Parser::pIndex = (this.java.util.regex.GroupStartMapper$Parser::pIndex +[int] 2)
        } else {
          this.java.util.regex.GroupStartMapper$Parser::pIndex = (this.java.util.regex.GroupStartMapper$Parser::pIndex +[int] 1)
        };
        val repeater: java.lang.String = this.java.util.regex.GroupStartMapper$Parser::pattern.substring;I;I;Ljava.lang.String(startIndex$3, this.java.util.regex.GroupStartMapper$Parser::pIndex);
        sequence["push"](new java.util.regex.GroupStartMapper$RepeatedNode().<init>;Ljava.util.regex.GroupStartMapper$Node;Ljava.lang.String;V(baseNode, repeater)).asInstanceOf[int]
      };
    case 123:
      {
        val startIndex$4{startIndex}: int = this.java.util.regex.GroupStartMapper$Parser::pIndex;
        this.java.util.regex.GroupStartMapper$Parser::pIndex = (this.java.util.regex.GroupStartMapper$Parser::pattern.indexOf;Ljava.lang.String;I;I("}", (startIndex$4 +[int] 1)) +[int] 1);
        if ((((int)this.java.util.regex.GroupStartMapper$Parser::pattern.charAt;I;C(this.java.util.regex.GroupStartMapper$Parser::pIndex)) ==[int] ((int)'?'))) {
          this.java.util.regex.GroupStartMapper$Parser::pIndex = (this.java.util.regex.GroupStartMapper$Parser::pIndex +[int] 1)
        };
        val repeater$2{repeater}: java.lang.String = this.java.util.regex.GroupStartMapper$Parser::pattern.substring;I;I;Ljava.lang.String(startIndex$4, this.java.util.regex.GroupStartMapper$Parser::pIndex);
        sequence["push"](new java.util.regex.GroupStartMapper$RepeatedNode().<init>;Ljava.util.regex.GroupStartMapper$Node;Ljava.lang.String;V(baseNode, repeater$2)).asInstanceOf[int]
      };
    default:
      {
        val sequenceLen: int = sequence["length"].asInstanceOf[int];
        if ((sequenceLen !=[int] 0) && baseNode.isInstanceOf[java.util.regex.GroupStartMapper$LeafRegexNode] && sequence[(sequenceLen -[int] 1)].isInstanceOf[java.util.regex.GroupStartMapper$LeafRegexNode]) {
          val fused: java.util.regex.GroupStartMapper$LeafRegexNode = new java.util.regex.GroupStartMapper$LeafRegexNode().<init>;Ljava.lang.String;V((sequence[(sequenceLen -[int] 1)].asInstanceOf[java.util.regex.GroupStartMapper$LeafRegexNode].regex;Ljava.lang.String() +[string] baseNode.asInstanceOf[java.util.regex.GroupStartMapper$LeafRegexNode].regex;Ljava.lang.String()));
          sequence[(sequenceLen -[int] 1)] = fused;
          (void 0)
        } else {
          sequence["push"](baseNode).asInstanceOf[int]
        }
      };
  }
} else {
  (void 0)
}

is in statement position, so the inner block may indeed be a statement. However, the If might end-up being typed as any, hence the failure. This is only a hypothesis at this point. I couldn't reproduce this with a smaller example.

@sjrd
Copy link
Member
sjrd commented Feb 25, 2021

Ouch, that would be bad. It seems like a plausible explanation to me, though. I wouldn't be surprised if scalac left If nodes typed as Object if one of the branches is () but the other isn't; and I wouldn't be surprised if we didn't do anything about it.

@gzm0
Copy link
Contributor Author
gzm0 commented Feb 26, 2021

Yes, adjusting printers to print the type, confirms this:

if [any] ((baseNode !== null)) {

Further, it is interesting to observe that the else branch actually contains (void 0) rather than a Skip() (and hence not being printed).

@gzm0
Copy link
Contributor Author
gzm0 commented May 30, 2021

I'm coming back to this, after fixing the underlying problems. We'll probably need to put a deser hack in place, that changes the type of the following trees to NoType, if they are in statement position:

  • If
  • TryCatch
  • Match

We have two ways to do this:

  • When we read each individual tree => nicely local, but wiring to get stat/expr pos or not is non-local, and duplicates the transformer logic. Further, we must not use it for non-hack deser logic.
  • Post read, using a Transformer => Transformer keeps track of stat/expr pos for us, but now we need to intercept the reading sites (in readClassDef / readMemberDef).

@sjrd
Copy link
Member
sjrd commented May 30, 2021

That seems like a good analysis. Overall I think I prefer the second option. It seems to be less intrusive to the design of the deserializer. It is also probably more efficient for reading newer IR (version 1.6+).

@sjrd
Copy link
Member
sjrd commented Jun 4, 2021

@gzm0 I updated the PR with:

  • Rebase, and fix conflicts
  • The new IR checker comes first, with a deser hack
  • It is followed by the compiler fix, conditioning the deser hack
  • Also apply the fix to TryCatch nodes, to make run/interpolationMultiline2.scala pass

@sjrd sjrd dismissed their stale review June 4, 2021 12:54

Addressed.

@@ -1213,6 +1213,33 @@ object Serializers {
implicit val pos = readPosition()
val tag = readByte()

def bodyHack15(body: Tree, isStat: Boolean): Tree = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to apply this treatment also to jsSuperClass?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jsSuperclass is always a VarRef, when created by existing compilers, so it's not necessary.

(I'm pretty sure the names would be similarly safe, but I couldn't convince myself quickly enough.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Probably warrants a comment?

override def transform(tree: Tree, isStat: Boolean): Tree = {
val newTree = super.transform(tree, isStat)
if (isStat && newTree.tpe != NoType) {
super.transform(newTree, isStat) match {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, why are we transforming the tree twice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, that's a leftover from a version before a refactoring. This should just be newTree match

@sjrd
Copy link
Member
sjrd commented Jun 4, 2021

Updated.

@gzm0 gzm0 linked an issue Jun 5, 2021 that may be closed by this pull request
Copy link
Contributor Author
@gzm0 gzm0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, modulo a comment RE why jsSuperClass is not hacked.

gzm0 added 3 commits June 7, 2021 11:50
Instead of distinguishing statements and expressions, we typecheck
everything in the same way. Statements are just regular trees whose
type happens to be NoType.

We use this opportunity to re-order the match statements in the same
order as the trees appear in Trees.scala.

This change uncovered issues in the way we emit If, Match and
TryCatch nodes in the compiler. Some of those nodes, when in
statement position, were typed as something else than NoType, but
had children typed as NoType. The previous IR checker did not flag
those cases as errors, whereas the new one does. We introduce a
deserialization hack to fix the type of If, Match and TryCatch
nodes in statement position to always be NoType.
…osition

In statement position, they must always be typed as NoType, since
their children might be typed as such.

We condition the deserialization hack introduced in the parent
commit so that it only applies to old code.
gzm0 added 2 commits June 7, 2021 11:50
This corresponds to the formal IR specification and solidifies the
fact that any expression is allowed in statement position.

The optimizer is only using the subtyping checks for expressions. We
assert this in the code.
This will move errors to the containing trees, rather than the checked
trees itself. This makes sense: Whether or not a tree is in expression
position is a property of the usage, not the tree itself.
@sjrd sjrd merged commit bdfe1a0 into scala-js:master Jun 7, 2021
@gzm0 gzm0 deleted the use-type branch June 12, 2021 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trees in statement position may have incorrect types
2 participants
0