Fix doc for `regexp_span_tokenize` to explicitly use `regexp_span_tokenize` · ExplodingCabbage/nltk@62f7647 · GitHub

Commit 62f7647

Fix doc for regexp_span_tokenize to explicitly use regexp_span_tokenize
1 parent c14c15a commit 62f7647

File tree

1 file changed: +2 −2 lines changed

nltk/tokenize/util.py

Lines changed: 2 additions & 2 deletions
@@ -45,10 +45,10 @@ def regexp_span_tokenize(s, regexp):
     Return the offsets of the tokens in *s*, as a sequence of ``(start, end)``
     tuples, by splitting the string at each successive match of *regexp*.
 
-        >>> from nltk.tokenize import WhitespaceTokenizer
+        >>> from nltk.tokenize.util import regexp_span_tokenize
         >>> s = '''Good muffins cost $3.88\nin New York.  Please buy me
         ... two of them.\n\nThanks.'''
-        >>> list(WhitespaceTokenizer().span_tokenize(s))
+        >>> list(regexp_span_tokenize(s, r'\s'))
         [(0, 4), (5, 12), (13, 17), (18, 23), (24, 26), (27, 30), (31, 36),
         (38, 44), (45, 48), (49, 51), (52, 55), (56, 58), (59, 64), (66, 73)]
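For context, here is a minimal, self-contained sketch of what a regexp span tokenizer with the contract described in this docstring does. It is an illustration written for this page, not necessarily NLTK's verbatim implementation of regexp_span_tokenize: it scans the string with re.finditer, treats each match of the separator pattern as a gap, and yields the (start, end) offsets of the text between gaps, skipping the empty spans that adjacent separators would otherwise produce.

import re

def span_tokenize_sketch(s, regexp):
    # Yield (start, end) offsets of the tokens in s, splitting the string
    # at each successive match of regexp. A sketch of the documented
    # behavior, not NLTK's exact source.
    left = 0
    for m in re.finditer(regexp, s):
        right, next_left = m.span()
        if right != left:  # skip empty spans between adjacent separators
            yield (left, right)
        left = next_left
    if left != len(s):  # emit the trailing token after the last separator
        yield (left, len(s))

s = 'Good muffins cost $3.88\nin New York.  Please buy me\ntwo of them.\n\nThanks.'
print(list(span_tokenize_sketch(s, r'\s')))
# Prints the same offsets as the doctest above, ending with (66, 73).

Note that both the old and new doctest calls produce identical output (the expected-output lines are unchanged context in the diff, since WhitespaceTokenizer splits on whitespace too); the point of the commit is simply that the docstring for regexp_span_tokenize now demonstrates regexp_span_tokenize itself rather than a different class.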

0 commit comments