Document classification example: One should use 'latin-1' encoding · Issue #8229 · scikit-learn/scikit-learn · GitHub

Document classification example: One should use 'latin-1' encoding #8229

Closed

There seems to be a bug in the text document classification example in the documentation: http://scikit-learn.org/stable/auto_examples/text/mlcomp_sparse_document_classification.html#sphx-glr-auto-examples-text-mlcomp-sparse-document-classification-py

The example opens the files as utf-8, which leads to a bug. I solved the issue by changing "open(f)" to "open(f, encoding='latin1')".
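
For illustration, here is a minimal sketch of the difference, assuming Python 3. It is not the code from the example itself: the helper `read_text`, the temporary file, and the sample string are all hypothetical and exist only to show how a byte that is valid in latin-1 can fail to decode as utf-8.

```python
import os
import tempfile


def read_text(path, encoding="latin-1"):
    """Read a text file with an explicit encoding (hypothetical helper)."""
    with open(path, encoding=encoding) as f:
        return f.read()


# Write a small latin-1 encoded file; 'é' is stored as the single byte 0xE9.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write("caf\u00e9".encode("latin-1"))

# Decoding the same bytes as utf-8 fails, because a lone 0xE9 byte
# is not a complete utf-8 sequence.
try:
    read_text(demo_path, encoding="utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Decoding as latin-1 succeeds, since every byte value maps to a character.
latin1_text = read_text(demo_path)

os.remove(demo_path)
```

Note that latin-1 never raises a decoding error, so it will also silently accept files that are really in another encoding; it is a workaround that fits only if the data is in fact latin-1.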
