editorial changes · awesome-python/html5lib-python@7568d31 · GitHub

Commit 7568d31

editorial changes
--HG-- extra : convert_revision : svn%3Aacbfec75-9323-0410-a652-858a13e371e0/trunk%40914
1 parent 8f6e2e7 commit 7568d31

2 files changed: +7 -4 lines changed

README

Lines changed: 3 additions & 2 deletions
@@ -1,6 +1,6 @@
 html5lib is a pure-python library for parsing HTML. It is designed to
-conform to the Web Applications 1.0 specification, which has
-formalized the error handling algorithms of popular web browsers.
+conform to the HTML 5 specification, which has formalized the error handling
+algorithms of popular web browsers.

 = Installation =

@@ -36,3 +36,4 @@ http://code.google.com/p/html5lib/issues/list
 Contributions to code or documenation are actively encouraged. Submit
 patches to the issue tracker or discuss changes on irc in the #whatwg
 channel on freenode.net
+

src/html5lib/tokenizer.py

Lines changed: 4 additions & 2 deletions
@@ -297,11 +297,15 @@ def emitCurrentToken(self):

     def dataState(self):
         data = self.stream.char()
+
+        # Keep a charbuffer to handle the escapeFlag
         if self.contentModelFlag in\
           (contentModelFlags["CDATA"], contentModelFlags["RCDATA"]):
             if len(self.lastFourChars) == 4:
                 self.lastFourChars.pop(0)
             self.lastFourChars.append(data)
+
+        # The rest of the logic
         if data == "&" and self.contentModelFlag in\
           (contentModelFlags["PCDATA"], contentModelFlags["RCDATA"]) and not\
           self.escapeFlag:
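
The character buffer kept in the hunk above can be sketched in isolation. This is a minimal illustration, not the html5lib implementation. It assumes the escape flag is switched on when the last four characters read are "<!--" and switched off when a ">" completes "-->", as the HTML 5 draft of that period described. `EscapeTracker` and `feed` are hypothetical names introduced only for this sketch.

```python
from collections import deque


class EscapeTracker:
    """Hypothetical helper: tracks an escape flag over a character stream."""

    def __init__(self):
        # deque(maxlen=4) discards the oldest character automatically,
        # standing in for the manual pop(0)/append pair in the diff.
        self.last_four = deque(maxlen=4)
        self.escape_flag = False

    def feed(self, char):
        """Record one character and return the resulting escape flag."""
        self.last_four.append(char)
        recent = "".join(self.last_four)
        if not self.escape_flag and recent == "<!--":
            # Assumed rule: "<!--" turns escaping on.
            self.escape_flag = True
        elif self.escape_flag and recent.endswith("-->"):
            # Assumed rule: "-->" turns escaping off again.
            self.escape_flag = False
        return self.escape_flag


tracker = EscapeTracker()
flags = [tracker.feed(c) for c in "a<!--b-->c"]
```

Here `flags` stays false until the second "-" of "<!--" arrives, stays true through "b--", and drops back to false on ">". A bounded deque is a design choice of this sketch; the code in the commit uses a plain list.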
@@ -328,8 +332,6 @@ def dataState(self):
             # Directly after emitting a token you switch back to the "data
             # state". At that point spaceCharacters are important so they are
             # emitted separately.
-            # XXX need to check if we don't need a special "spaces" flag on
-            # characters.
             self.tokenQueue.append({"type": "SpaceCharacters", "data":
               data + self.stream.charsUntil(spaceCharacters, True)})
         else:
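
The comment in the hunk above says space characters are emitted as their own token. A rough sketch of that grouping follows, written against a plain string rather than the tokenizer's stream. It assumes the `True` argument to `charsUntil` means "keep reading while the character is in the set". The `SPACE_CHARACTERS` set, the `split_space_runs` function, and the "Characters" type used for non-space runs are assumptions made for illustration and are not taken from the diff.

```python
# Assumed set of space characters, for illustration only.
SPACE_CHARACTERS = frozenset("\t\n\x0c\r ")


def split_space_runs(text):
    """Return (type, data) pairs, one per run of space or non-space characters."""
    tokens = []
    i = 0
    while i < len(text):
        is_space = text[i] in SPACE_CHARACTERS
        j = i + 1
        # Extend the run while the following characters are of the same kind.
        while j < len(text) and (text[j] in SPACE_CHARACTERS) == is_space:
            j += 1
        kind = "SpaceCharacters" if is_space else "Characters"
        tokens.append((kind, text[i:j]))
        i = j
    return tokens


tokens = split_space_runs("a  b\n")
```

With this input, the two spaces after "a" form a single "SpaceCharacters" pair and the trailing newline forms another, so whitespace never shares a token with other text.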
