Fix escaping of <py-script> · Issue #1764 · pyscript/pyscript · GitHub

Fix escaping of <py-script> #1764


Closed
antocuni opened this issue Sep 28, 2023 · 2 comments

@antocuni
Contributor
antocuni commented Sep 28, 2023

This is a sub issue of #1762 and it's related to the following test:

    @skip_worker("NEXT: something very weird happens here")
    def test_escaping_of_angle_brackets(self):
        """
        Check that script tags escape angle brackets
        """
        self.pyscript_run(
            """
            <script type="py">import js; js.console.log("A", 1<2, 1>2)</script>
            <script type="py">import js; js.console.log("B <div></div>")</script>
            <py-script>import js; js.console.log("C", 1<2, 1>2)</py-script>
            <py-script>import js; js.console.log("D <div></div>")</py-script>
            """
        )
        # in workers the order of execution is not guaranteed, better to play
        # safe
        lines = sorted(self.console.log.lines[-4:])
        assert lines == [
            "A true false",
            "B <div></div>",
            "C true false",
            "D <div></div>",
        ]

Something is very weird when it comes to <py-script> parsing. If you try to run this code:

    <py-script>import js; js.console.log("C", 1<2, 1>2)</py-script>
    <py-script>import js; js.console.log("D <div></div>")</py-script>

    <py-script>
        import js
        js.console.log("E", 1<2, 1>2);
        js.console.log("F <div></div>")
    </py-script>

With PyScript classic, you get the following output, as expected:

C true false
D <div></div>
E true false
F <div></div>

With 2023.09.1RC1, something very weird happens: the "C" and "D" lines are parsed correctly, but the "E" line causes trouble:
[image: screenshot of the resulting output]

You can see it in action here (and you can also see that by using 2023.05.1 it works):
https://pyscript.com/@antocuni/py-script-escaping/latest

I know that <py-script> parsing is fragile and bad things can happen, but if pyscript classic was able to deal with it correctly, then pyscript next should do the same.
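To illustrate why this kind of parsing is fragile, here is a minimal sketch using Python's standard `html.parser` as a stand-in for an HTML parser. It is only an analogy: the `TextOnly` class and `text_content` helper are hypothetical names, and a browser's handling of `<py-script>` content may differ in detail. The point is that anything that looks like a tag inside the Python source is treated as markup, so reading back only the text loses it.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only text nodes, loosely similar to reading textContent."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for plain text between tags; tags themselves are skipped.
        self.chunks.append(data)


def text_content(markup: str) -> str:
    """Return the concatenated text of `markup`, dropping anything parsed as a tag."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


# Python source as it might appear inside a <py-script> tag (hypothetical input).
source = 'import js; js.console.log("D <div></div>")'
recovered = text_content(source)
print(recovered)
```

The `<div></div>` inside the string literal is consumed as markup, so the recovered text is no longer the Python code that was written.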

For reference, this is the code that was responsible for decoding the content inside <py-script> tags in classic:

    export function htmlDecode(input: string): string | null {
        const doc = new DOMParser().parseFromString(ltrim(escape(input)), 'text/html');
        return doc.documentElement.textContent;
    }
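The escape-then-decode idea in that function can be sketched in Python with the standard `html` module. This is an illustration under assumptions, not the classic implementation: it assumes the `escape` helper turned angle brackets into entities, it uses `html.unescape` in place of the `DOMParser` and `textContent` step, and it ignores `ltrim`. The function name `decode_like_classic` is hypothetical.

```python
import html


def decode_like_classic(source: str) -> str:
    """Escape markup-significant characters, then decode the entities again.

    Assumption: this mirrors the escape -> parse -> textContent round trip,
    with html.unescape standing in for the HTML parser.
    """
    # After escaping, no "<" or ">" is left for a parser to mistake for a tag.
    escaped = html.escape(source, quote=False)
    assert "<" not in escaped and ">" not in escaped
    # Decoding the entities restores the original characters.
    return html.unescape(escaped)


# The multi-line case from the issue (indentation removed for brevity).
source = (
    "import js\n"
    'js.console.log("E", 1<2, 1>2);\n'
    'js.console.log("F <div></div>")\n'
)
decoded = decode_like_classic(source)
print(decoded == source)
```

Because the angle brackets are hidden from the parser before it runs, the round trip returns the Python source unchanged, including `1<2, 1>2` and the `<div></div>` literal.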

@WebReflection
Contributor

This was already fixed, and it still works in this smoke test: https://github.com/pyscript/pyscript/blob/main/pyscript.core/test/html-decode.html

But I dug further and noticed that we have issues only when these angle-bracket shenanigans are repeated within the same tag:

      <py-script>
        # works
        import js
        # js.console.log("E", 1<2, 1>2)
        js.console.log("F <div></div>")
      </py-script>
      <py-script>
        # works
        import js
        js.console.log("E", 1<2, 1>2)
      </py-script>
      <py-script>
        # fail
        import js
        js.console.log("E", 1<2, 1>2)
        # js.console.log("F <div></div>")
      </py-script>

So yes, this is a bug but it's a weird one.

@WebReflection
Contributor

P.S. just to provide some background:

"For reference, this is the code which was responsible to decode the content inside <py-script> tags in classic:"

No, that was not it: both escape and ltrim were slow and ugly and, most importantly, likely not necessary. But surely this use case was somehow fixed ... we just need to figure out what makes the conversion go bananas now. The underlying code does pretty much the same as it used to, just using DOM primitives (less home-made code to maintain).

We need to figure that out before reverting the current code, imho.
