-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed language detection to support parsing of HTML fragments #121
Conversation
This looks good (simple enough of an update), but can you add a test case? Specifically I'm wondering if this still parses language from the HTML fragments or skips it. |
@gRegorLove I tried running the present tests but 2 of them fail for me (both having to do with fetching remote documents from Barnaby's and Aaron's sites, so I guess it's a problem on my side). I don't know if there's a test included that checks for the language, but if so, it should still work. The new release Aaron just pushed out broke my code (micrometa) because of the mandatory |
@gRegorLove Re-read your comment. Well, there should be no difference in language handling after adding the check for |
@gRegorLove I now added a simple test for parsing the language of an HTML fragment without enclosing |
Awesome, thanks! I think I misunderstood the update initially. Having another test is good though. :) |
I tried running the test case without the change you submitted, but it doesn't fail the test. Can you provide example HTML that causes this error before the fix? |
Got it, thanks! |
Interesting! I just realized that the problem hits you only in an edge case: The parser allows to pass in a |
The current way of language detection breaks when parsing an HTML fragment only (instead of a complete HTML document). The parser expects an
<html>
element to be present and will run intootherwise. Making sure the parent node is a
DOMElement
before recursing fixes the problem.