2 files changed: 7 additions, 4 deletions
 html5lib is a pure-python library for parsing HTML. It is designed to
-conform to the Web Applications 1.0 specification, which has
-formalized the error handling algorithms of popular web browsers.
+conform to the HTML 5 specification, which has formalized the error handling
+algorithms of popular web browsers.
 
 = Installation =
 
@@ -36,3 +36,4 @@ http://code.google.com/p/html5lib/issues/list
 Contributions to code or documenation are actively encouraged. Submit
 patches to the issue tracker or discuss changes on irc in the #whatwg
 channel on freenode.net
+
@@ -297,11 +297,15 @@ def emitCurrentToken(self):
 
     def dataState(self):
         data = self.stream.char()
+
+        # Keep a charbuffer to handle the escapeFlag
         if self.contentModelFlag in \
           (contentModelFlags["CDATA"], contentModelFlags["RCDATA"]):
             if len(self.lastFourChars) == 4:
                 self.lastFourChars.pop(0)
             self.lastFourChars.append(data)
+
+        # The rest of the logic
         if data == "&" and self.contentModelFlag in \
           (contentModelFlags["PCDATA"], contentModelFlags["RCDATA"]) and not \
           self.escapeFlag:
@@ -328,8 +332,6 @@ def dataState(self):
             # Directly after emitting a token you switch back to the "data
             # state". At that point spaceCharacters are important so they are
             # emitted separately.
-            # XXX need to check if we don't need a special "spaces" flag on
-            # characters.
             self.tokenQueue.append({"type": "SpaceCharacters", "data":
               data + self.stream.charsUntil(spaceCharacters, True)})
         else:
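
The "charbuffer" comment added in dataState refers to a rolling window of the last four characters read. The sketch below shows one way such a window could drive an escape flag. It is an illustration only, not the html5lib implementation: the EscapeTracker class, its feed method, and the "<!--" / "-->" toggle rules are assumptions, since the diff shows only the buffer update and not how escapeFlag is set.

```python
from collections import deque


class EscapeTracker:
    """Hypothetical helper, not part of html5lib."""

    def __init__(self):
        # deque(maxlen=4) drops the oldest character automatically,
        # replacing the explicit pop(0)/append pair in the diff.
        self.last_four = deque(maxlen=4)
        self.escape_flag = False

    def feed(self, char):
        """Record one character and return the updated escape flag."""
        self.last_four.append(char)
        recent = "".join(self.last_four)
        # Assumed rule: "<!--" switches escaping on, "-->" switches it off.
        if not self.escape_flag and recent == "<!--":
            self.escape_flag = True
        elif self.escape_flag and recent.endswith("-->"):
            self.escape_flag = False
        return self.escape_flag


tracker = EscapeTracker()
flags = [tracker.feed(c) for c in "a<!--b-->c"]
```

Here flags records the state of the flag after each character. A bounded deque keeps the window update constant-time, whereas pop(0) on a list shifts the remaining elements; for a four-element buffer the difference is negligible, so either form works.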