Added code to check declared encodings first · Pull Request #48 · buriy/python-readability · GitHub

Merged
merged 1 commit into from May 18, 2014
Conversation

@ghost commented May 13, 2014

Added code to check declared encodings first, taken from kennethreitz/requests/utils.py. I also added some superset encodings for cases I have found in Chinese pages that chardet or the character declarations mishandle. For example, gb2312 is often declared even though the text contains characters it does not support; in practice the computer system uses gb18030, which is a superset, to encode and decode.
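For readers who want to see the idea in isolation, here is a minimal sketch of "declared encodings first, with superset substitution". It is not the code merged in this PR: the regular expressions, the `SUPERSETS` table, and the function names `declared_encodings` and `decode_html` are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration; the merged code may differ.
META_CHARSET_RE = re.compile(rb'<meta[^>]*?charset=["\']?([\w-]+)', re.I)
XML_ENCODING_RE = re.compile(rb'^\s*<\?xml[^>]*?encoding=["\']([\w-]+)', re.I)

# Illustrative table: a narrower declared encoding maps to a superset.
SUPERSETS = {"gb2312": "gb18030", "gbk": "gb18030"}


def declared_encodings(raw):
    """Return encodings declared in the markup, widened to supersets."""
    names = []
    for pattern in (META_CHARSET_RE, XML_ENCODING_RE):
        for match in pattern.findall(raw):
            name = match.decode("ascii").lower()
            name = SUPERSETS.get(name, name)
            if name not in names:
                names.append(name)
    return names


def decode_html(raw, fallback="utf-8"):
    """Try each declared encoding first, then fall back leniently."""
    for name in declared_encodings(raw):
        try:
            return raw.decode(name)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown or wrong declaration, try the next one
    return raw.decode(fallback, errors="replace")
```

With this sketch, a page that declares gb2312 is decoded as gb18030, so characters outside gb2312 still come through. A real caller would probably run a detector such as chardet before the final fallback.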

There's no networking code in the main functionality of this project, but anyone using it should check the encoding declared in the HTTP response headers. Maybe a simple wrapper can be added in the future to handle the HTTP request.
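As a rough sketch of the header check, assuming the caller already has the raw `Content-Type` value from whatever HTTP client they use: the helper name `charset_from_content_type` is hypothetical and nothing like it is part of this project. It only relies on the standard library's `email.message.Message` to parse the header parameters.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lower-cased charset parameter, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()
```

A caller could try the returned name before looking at declarations inside the document, for example `charset_from_content_type("text/html; charset=GB2312")`, and skip it when the result is `None`.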

buriy added a commit that referenced this pull request May 18, 2014
Added code to check declared encodings first
@buriy merged commit 2fab5ff into buriy:master May 18, 2014