removing empty paragraphs is not very useful, and can break some (stu… · escape-char/python-readability@2b6a2d3 · GitHub

Commit 2b6a2d3

Author: gfxmonk
removing empty paragraphs is not very useful, and can break some (stupid) websites
1 parent 1d862a0 commit 2b6a2d3

1 file changed (+0, -5 lines)

readability/readability.py

Lines changed: 0 additions & 5 deletions
@@ -231,11 +231,6 @@ def sanitize(self, node, candidates):
         for elem in self.tags(node, "form", "iframe"):
             elem.extract()
 
-        # remove empty <p> tags
-        for elem in node.findAll("p"):
-            if not (elem.string or elem.contents):
-                elem.extract()
-
         # Conditionally clean <table>s, <ul>s, and <div>s
         for el in self.tags(node, "table", "ul", "div"):
             weight = self.class_weight(el)
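
For readers who want to see what the deleted loop did, here is a minimal sketch of the same idea. It is not the project's code: it swaps BeautifulSoup for the standard-library `xml.etree.ElementTree` so it runs without dependencies, and the helper names `is_empty_paragraph` and `remove_empty_paragraphs` as well as the sample markup are hypothetical. The empty `<p id="anchor">` is only an assumed illustration of markup a page might still rely on; the commit message does not say which sites broke or why.

```python
import xml.etree.ElementTree as ET


def is_empty_paragraph(elem):
    """Rough analogue of `not (elem.string or elem.contents)`:
    a <p> with no text and no child elements."""
    return elem.tag == "p" and not (elem.text or len(elem))


def remove_empty_paragraphs(root):
    """Detach every empty <p> under root and return how many were removed."""
    # Collect (parent, child) pairs first so the tree is not mutated
    # while it is being iterated.
    doomed = [
        (parent, child)
        for parent in root.iter()
        for child in list(parent)
        if is_empty_paragraph(child)
    ]
    for parent, child in doomed:
        parent.remove(child)
    return len(doomed)


# Hypothetical sample: one text paragraph, one truly empty paragraph,
# one paragraph holding only an image, and one empty paragraph with an id.
html = '<div><p>Intro</p><p></p><p><img src="a.png"/></p><p id="anchor"></p></div>'
root = ET.fromstring(html)

removed = remove_empty_paragraphs(root)
remaining = [ET.tostring(p, encoding="unicode") for p in root.iter("p")]
```

In this sample `removed` is 2: the bare `<p></p>` and the `<p id="anchor"></p>` are both dropped, while the image-only paragraph survives because it has a child element. After this commit, `sanitize` no longer performs this pass at all, so such paragraphs are left in place.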
