10000 Fix #1171: Split: Sometimes remove leading empty match by gzm0 · Pull Request #1198 · scala-js/scala-js · GitHub
[go: up one dir, main page]

Skip to content

Fix #1171: Split: Sometimes remove leading empty match #1198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 28, 2014

Conversation

gzm0
Copy link
Contributor
@gzm0 gzm0 commented Oct 27, 2014

The conditions were determined experimentally since the JavaDoc is
note precise enough.

@gzm0
Copy link
Contributor Author
gzm0 commented Oct 27, 2014

Review by @sjrd

@scala-jenkins
Copy link

Test FAILed.
Refer to this link for build results: https://scala-webapps.epfl.ch/jenkins/job/scalajs-pr/781/


val result = new js.Array[String](0)
val result = mutable.Buffer.empty[String]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change? I would keep new js.Array, here (potentially without length argument, as then it will become simply []).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly removed it so we can drop the loop that copies to a scala.Array. But I can revert to js.Array if you prefer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well now you can use .toArray on a js.Array. You don't need the loop.
And mutable.Buffer.empty creates a js.Array under the hood anyway, nowadays, so your argument is void.

@sjrd
Copy link
Member
sjrd commented Oct 27, 2014

That's all.

Quite an "experiment-driven development" ^^

result += inputStr.substring(prevEnd)

// Remove a leading empty element iff there is no other place the
// regex matches (than at the beginning)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive my lack of tact here but there's no way this is the correct logic. It should be:

  1. r = input
  2. if r matches, store left-match in result-array, set r to right-match
  3. loop until r empty or no match.
  4. store r in result-array if non-empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the splitting logic, but post-processing (and it actually is wrong, but for another reason). The splitting logic is in the while-loop just above and is almost as you describe (we actually have to discard the matching string, rather than adding it to the result).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The post-processing logic is wrong, because it should only remove the leading empty element if the match that produced it is zero-length. Consider this:

"abc".split("a") // -> Array("", "bc")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, by left-match I meant stuff left of the current match (since last match) while, yes, I agree the match itself is gobbled.

@japgolly
Copy link
Contributor

Now that this patch shows me where to make changes and where to add tests, I could build on it if you like? I'm always using regex, I think I could fix & improve it.

Different question, do we expect regex in Scala.js output to behave the same as JVM regex? Because if so there are a bunch of another regex things I've noticed don't work...

The conditions were determined experimentally since the JavaDoc is
note precise enough.
@gzm0
Copy link
Contributor Author
gzm0 commented Oct 28, 2014

Updated

@gzm0
Copy link
Contributor Author
gzm0 commented Oct 28, 2014

@japgolly About your general question about regex: Of course the ideal situation would be that Regex in Scala.js behaves exactly like on the JVM. However, since Scala.js uses JavaScript's regexes to implement the JVM regexes, there are inherent semantic differences.

It would probably be exaggerated to package a regex engine written in JavaScript with Scala.js just to have the same semantics (although these things exist).

On the other hand, if a semantic alignment with the JVM is easy (in terms of implementation on Scala.js' side), that is definitely desirable (and PRs & tests are very welcome).

@scala-jenkins
Copy link

Test FAILed.
Refer to this link for build results: https://scala-webapps.epfl.ch/jenkins/job/scalajs-pr/790/

@gzm0
Copy link
Contributor Author
gzm0 commented Oct 28, 2014

retest this please

@sjrd
Copy link
Member
sjrd commented Oct 28, 2014

LGTM, even though I fear we're not done yet with this method ^^

@scala-jenkins
Copy link

Test FAILed.
Refer to this link for build results: https://scala-webapps.epfl.ch/jenkins/job/scalajs-pr/791/

@gzm0
Copy link
Contributor Author
gzm0 commented Oct 28, 2014

retest this please

@scala-jenkins
Copy link

Test PASSed.
Refer to this link for build results: https://scala-webapps.epfl.ch/jenkins/job/scalajs-pr/793/

sjrd added a commit that referenced this pull request Oct 28, 2014
Fix #1171: Split: Sometimes remove leading empty match
@sjrd sjrd merged commit 5205704 into scala-js:master Oct 28, 2014
@gzm0 gzm0 deleted the fix-regex branch October 28, 2014 19:21
@japgolly
Copy link
Contributor

@gzm0 Gotcha. It's the dependence on the JS regex that makes things difficult (as I'm sure you know). Hold on, let me move this conversation elsewhere...

@japgolly japgolly mentioned this pull request Oct 28, 2014
@scala-jenkins
Copy link

Test PASSed.
Refer to this link for build results: https://scala-webapps.epfl.ch/jenkins/job/scalajs-pr/795/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0