@@ -10,41 +10,101 @@ This module implements regular expression operations. Regular expression
syntax supported is a subset of the CPython ``re`` module syntax (and is
in fact a subset of POSIX extended regular expressions).

- Supported operators are:
+ Supported operators and special sequences are:

- ``'.'``
+ ``.``
Match any character.

- ``'[...]'``
+ ``[...]``
Match a set of characters. Individual characters and ranges are supported,
including negated sets (e.g. ``[^a-c]``).

- ``'^'``
+ ``^``
+ Match the start of the string.

- ``'$'``
+ ``$``
+ Match the end of the string.

- ``'?'``
+ ``?``
+ Match zero or one occurrence of the previous sub-pattern.

- ``'*'``
+ ``*``
+ Match zero or more occurrences of the previous sub-pattern.

- ``'+'``
+ ``+``
+ Match one or more occurrences of the previous sub-pattern.

- ``'??'``
+ ``??``
+ Non-greedy version of ``?``: match zero or one, with a preference
+ for zero.

- ``'*?'``
+ ``*?``
+ Non-greedy version of ``*``: match zero or more, with a preference
+ for the shortest match.

- ``'+?'``
+ ``+?``
+ Non-greedy version of ``+``: match one or more, with a preference
+ for the shortest match.

- ``'|'``
+ ``|``
+ Match either the left-hand side or the right-hand side sub-pattern of
+ this operator.

- ``'(...)'``
+ ``(...)``
Grouping. Each group is capturing (the substring it captures can be accessed
with the `match.group()` method).

- **NOT SUPPORTED**: Counted repetitions (``{m,n}``), more advanced assertions
- (``\b``, ``\B``), named groups (``(?P<name>...)``), non-capturing groups
- (``(?:...)``), etc.
+ ``\d``
+ Matches a digit. Equivalent to ``[0-9]``.

+ ``\D``
+ Matches a non-digit. Equivalent to ``[^0-9]``.
+
+ ``\s``
+ Matches whitespace. Equivalent to ``[ \t-\r]``.
+
+ ``\S``
+ Matches non-whitespace. Equivalent to ``[^ \t-\r]``.
+
+ ``\w``
+ Matches "word characters" (ASCII only). Equivalent to ``[A-Za-z0-9_]``.
+
+ ``\W``
+ Matches non-"word characters" (ASCII only). Equivalent to ``[^A-Za-z0-9_]``.
+
+ ``\``
+ Escape character. Any other character following the backslash, except
+ for those listed above, is taken literally. For example, ``\*`` is
+ equivalent to a literal ``*`` (not treated as the ``*`` operator).
+ Note that ``\r``, ``\n``, etc. are not handled specially, and are
+ equivalent to the literal letters ``r``, ``n``, etc. Because of this,
+ using raw Python strings (``r""``) for regular expressions is not
+ recommended. For example, ``r"\r\n"`` used as a regular
+ expression is equivalent to ``"rn"``. To match a CR character followed
+ by an LF, use ``"\r\n"``.
+
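The operators and special sequences listed above can be illustrated with a short sketch. It contrasts greedy and non-greedy repetition and uses alternation, a capturing group, anchors and ``\d``. The ``try``/``except`` import is an assumption made so that the snippet also runs under CPython's ``re``; the sample strings and variable names are invented for illustration. Backslashes are doubled in ordinary string literals instead of using raw strings.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption, for running on a desktop)

text = "<a><b>"

# Greedy: "+" takes as much as possible, so the match runs to the last ">".
greedy = re.match("<.+>", text).group(0)

# Non-greedy: "+?" prefers the shortest match, stopping at the first ">".
lazy = re.match("<.+?>", text).group(0)

# Alternation inside a capturing group, followed by one or more digits.
m = re.match("(cat|dog)\\d+$", "dog42")
animal = m.group(1)

# "^" and "$" anchor the pattern to the start and end of the string,
# so a trailing letter prevents the match.
no_match = re.match("^\\d+$", "12a")

print(greedy, lazy, animal, no_match)
```

Here ``greedy`` is the whole string, ``lazy`` is only ``<a>``, and ``no_match`` is ``None``.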
+ **NOT SUPPORTED**:
+
+ * counted repetitions (``{m,n}``)
+ * named groups (``(?P<name>...)``)
+ * non-capturing groups (``(?:...)``)
+ * more advanced assertions (``\b``, ``\B``)
+ * special character escapes like ``\r``, ``\n`` - use Python's own escaping
+ instead
+ * etc.
+
+ Example::
+
+ import ure
+
+ # As ure doesn't support escapes itself, use of r"" strings is not
+ # recommended.
+ regex = ure.compile("[\r\n]")
+
+ regex.split("line1\rline2\nline3\r\n")
+
+ # Result:
+ # ['line1', 'line2', 'line3', '', '']

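Since counted repetitions are not supported, a pattern such as ``\d{2,4}`` has to be spelled out by hand. One possible workaround is sketched below with a hypothetical helper, ``repeat()``, which is not part of ``ure``: it expands the count into required copies of a sub-pattern followed by optional ones. The sketch assumes the sub-pattern is a single token (a character, a class or an escape such as ``\d``), because non-capturing groups are not available either. The import fallback is also an assumption, so that the snippet runs under CPython.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption)


def repeat(atom, m, n):
    # Hypothetical helper: emulate atom{m,n} with m required copies of
    # 'atom' followed by (n - m) optional copies. 'atom' must be a
    # single-token sub-pattern such as "\\d" or "[a-f]".
    return atom * m + (atom + "?") * (n - m)


pattern = repeat("\\d", 2, 4)
regex = re.compile("^" + pattern + "$")

# Strings with 1, 2, 4 and 5 digits respectively.
results = [bool(regex.match(s)) for s in ("7", "42", "2024", "12345")]

print(pattern, results)
```

Only the two- and four-digit strings match. For longer counts the expanded pattern grows linearly, which may matter on memory-constrained ports.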
Functions
---------
@@ -64,6 +124,22 @@ Functions
string for first position which matches regex (which still may be
0 if regex is anchored).

+ .. function:: sub(regex_str, replace, string, count=0, flags=0)
+
+ Compile *regex_str* and search for it in *string*, replacing all matches
+ with *replace*, and returning the new string.
+
+ *replace* can be a string or a function. If it is a string then escape
+ sequences of the form ``\<number>`` and ``\g<number>`` can be used to
+ expand to the corresponding group (or an empty string for unmatched groups).
+ If *replace* is a function then it must take a single argument (the match)
+ and should return a replacement string.
+
+ If *count* is specified and non-zero then substitution will stop after
+ this many substitutions are made. The *flags* argument is ignored.
+
+ Note: availability of this function depends on the MicroPython port.
+
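As a rough illustration of ``sub()``, the sketch below shows a string replacement that uses the ``\<number>`` and ``\g<number>`` forms, and a function replacement. The input strings and the ``double()`` function are made up for the example. The import fallback is an assumption so that the snippet also runs under CPython, and on ports without ``sub()`` it will not work.

```python
try:
    import ure as re  # MicroPython (sub() availability depends on the port)
except ImportError:
    import re  # CPython fallback (assumption)

# String replacement: "\\1" and "\\2" expand to the captured groups.
swapped = re.sub("(\\w+)=(\\w+)", "\\2=\\1", "a=1 b=2")

# The "\\g<number>" form refers to the same groups.
tagged = re.sub("(\\d+)", "<\\g<1>>", "id 7")


# Function replacement: called with each match object, returns a string.
def double(match):
    return str(int(match.group(0)) * 2)


doubled = re.sub("\\d+", double, "3 apples, 10 pears")

print(swapped)
print(tagged)
print(doubled)
```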
.. data:: DEBUG

Flag value, display debug information about compiled expression.
@@ -79,8 +155,10 @@ Compiled regular expression. Instances of this class are created using

.. method:: regex.match(string)
regex.search(string)
+ regex.sub(replace, string, count=0, flags=0)

- Similar to the module-level functions :meth:`match` and :meth:`search`.
+ Similar to the module-level functions :meth:`match`, :meth:`search`
+ and :meth:`sub`.
Using methods is (much) more efficient if the same regex is applied to
multiple strings.
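
The point about reusing a compiled regex can be sketched as follows: the pattern is compiled once and the resulting object is applied to several strings. The ``key=value`` input lines and the ``settings`` dictionary are assumptions for the example, as is the import fallback for CPython.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption)

# Compile once, then reuse the compiled object for every input string.
regex = re.compile("(\\w+)=(\\d+)")

lines = ["speed=42", "name=abc", "retries=3"]
settings = {}
for line in lines:
    m = regex.match(line)
    if m:  # match() returns None when the pattern does not match
        settings[m.group(1)] = int(m.group(2))

# search() scans the string instead of matching only at the start.
found = regex.search("set speed=42 now").group(0)

print(settings, found)
```

The line ``name=abc`` is skipped because its value is not numeric.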
@@ -93,9 +171,31 @@ Compiled regular expression. Instances of this class are created using
Match objects
-------------

- Match objects as returned by `match()` and `search()` methods.
+ Match objects as returned by `match()` and `search()` methods, and passed
+ to the replacement function in `sub()`.

.. method:: match.group([index])

Return the matching (sub)string. *index* is 0 for the entire match,
1 and above for each capturing group. Only numeric groups are supported.
+
+ .. method:: match.groups()
+
+ Return a tuple containing all the substrings of the groups of the match.
+
+ Note: availability of this method depends on the MicroPython port.
+
+ .. method:: match.start([index])
+ match.end([index])
+
+ Return the index in the original string of the start or end of the
+ matched substring. *index* defaults to the entire match; otherwise it
+ selects a group.
+
+ Note: availability of these methods depends on the MicroPython port.
+
+ .. method:: match.span([index])
+
+ Returns the 2-tuple ``(match.start(index), match.end(index))``.
+
+ Note: availability of this method depends on the MicroPython port.
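
The match object methods described above can be tried out with a small sketch. The sample text and pattern are invented for illustration, and the import fallback is an assumption so that the snippet runs under CPython. On MicroPython ports that lack ``groups()``, ``start()``, ``end()`` or ``span()``, only the ``group()`` calls will work.

```python
try:
    import ure as re  # MicroPython
except ImportError:
    import re  # CPython fallback (assumption)

text = "mail: alice@example now"
m = re.search("(\\w+)@(\\w+)", text)

whole = m.group(0)               # entire match
user = m.group(1)                # first capturing group
parts = m.groups()               # all groups as a tuple
start, end = m.start(), m.end()  # bounds of the entire match
user_span = m.span(1)            # (start, end) of group 1

print(whole, user, parts, start, end, user_span)
```

Slicing the original string with the returned indices, ``text[start:end]``, gives back the same substring as ``m.group(0)``.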