Bump to version 0.2 · EvaSDK/python-readability@61715dc · GitHub

Commit 61715dc

Bump to version 0.2
1 parent 21906f1 commit 61715dc

3 files changed, with 8 additions and 3 deletions

README

Lines changed: 4 additions & 1 deletion
@@ -13,14 +13,17 @@ Based on:
 - Ruby port by starrhorne and iterationlabs
 - Python port by gfxmonk ( https://github.com/gfxmonk/python-readability , based on BeautifulSoup )
 - Decruft effort to move to lxml ( http://www.minvolai.com/blog/decruft-arc90s-readability-in-python/ )
+- "BR to P" fix from readability.js which improves quality for smaller texts.
+- Github users contributions.

 Usage:

+    from readability.readability import Document
     import urllib
     html = urllib.urlopen(url).read()
     readable_article = Document(html).summary()
     readable_title = Document(html).short_title()

 Command-line usage:

-    python -m readability.readability -u http://yoursite.com/yourpage
+    python -m readability.readability -u http://pypi.python.org/pypi/readability-lxml
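
Note on the usage snippet: it targets Python 2, where `urllib.urlopen` exists, and it builds `Document(html)` twice. A minimal sketch of the same flow for Python 3 follows, assuming the `Document` class keeps the `summary()` and `short_title()` methods shown in the README. The helper names `fetch_html` and `summarize`, and the `opener` and `document_cls` parameters, are hypothetical and exist only to keep the example self-contained.

```python
from urllib.request import urlopen


def fetch_html(url, opener=urlopen):
    """Download a page and return its raw bytes.

    Python 3 counterpart of the README's urllib.urlopen(url).read().
    `opener` is a hypothetical hook so the function can be exercised offline.
    """
    with opener(url) as response:
        return response.read()


def summarize(html, document_cls):
    """Build a single document object and reuse it for both calls.

    `document_cls` is expected to behave like readability's Document:
    constructed from HTML, exposing summary() and short_title().
    """
    doc = document_cls(html)
    return doc.summary(), doc.short_title()
```

With the package installed, this would be called as `summarize(fetch_html(url), Document)` after `from readability.readability import Document`.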

readability/readability.py

Lines changed: 3 additions & 1 deletion
@@ -120,7 +120,9 @@ def summary(self):
                         continue
                     else:
                         logging.debug("Ruthless and lenient parsing did not work. Returning raw html")
-                        article = self.html.find('body') or self.html
+                        article = self.html.find('body')
+                        if article is None:
+                            article = self.html

                 cleaned_article = self.sanitize(article, candidates)
                 of_acceptable_length = len(cleaned_article or '') >= (self.options['retry_length'] or self.RETRY_LENGTH)
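
Note on the `find('body')` change: the commit message does not state a motivation, but one plausible reason is that lxml elements are falsy when they have no child elements, so `find('body') or self.html` discards a `<body>` that exists but contains only text. The sketch below illustrates the difference with a hypothetical `StubElement` class that mimics only that one trait (truthiness derived from child count). It is not lxml's API, and the two `pick_*` helpers are illustrative names.

```python
class StubElement:
    """Hypothetical stand-in for an element whose truthiness follows its child count."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)

    def __len__(self):
        # No __bool__ is defined, so Python falls back to __len__ for truth tests.
        return len(self.children)

    def find(self, tag):
        # Return the first direct child with a matching tag, or None.
        for child in self.children:
            if child.tag == tag:
                return child
        return None


def pick_with_or(root):
    # Old form: a childless body is falsy, so the whole document is returned.
    return root.find('body') or root


def pick_with_none_check(root):
    # New form: fall back to the root only when no body was found at all.
    article = root.find('body')
    if article is None:
        article = root
    return article


empty_body = StubElement('body')
root = StubElement('html', [empty_body])
```

Here `pick_with_or(root)` returns the `html` stand-in while `pick_with_none_check(root)` returns the childless `body`, which is the behaviour the explicit `is None` test preserves.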

setup.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@

 setup(
     name="readability-lxml",
-    version="0.1dev",
+    version="0.2",
     author="Yuri Baburov",
     author_email="burchik@gmail.com",
     description="python port of arc90's readability bookmarklet",
