An Introduction to Regular Expressions

Thomas Nield
An Introduction to Regular Expressions
by Thomas Nield
Copyright © 2019 O’Reilly Media. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North,
Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales
promotional use. Online editions are also available for most titles
(http://oreilly.com). For more information, contact our corporate/institutional
sales department: 800-998-9938 or corporate@oreilly.com.

Editors: Nicole Tache and Jessica Haberman

Interior Designer: David Futato

Cover Designer: Karen Montgomery

Illustrator: Rebecca Demarest

May 2019: First Edition

Revision History for the First Edition
2019-05-13: First Release
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. An
Introduction to Regular Expressions, the cover image, and related trade dress
are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent
the publisher’s views. While the publisher and the author have used good
faith efforts to ensure that the information and instructions contained in this
work are accurate, the publisher and the author disclaim all responsibility for
errors or omissions, including without limitation responsibility for damages
resulting from the use of or reliance on this work. Use of the information and
instructions contained in this work is at your own risk. If any code samples or
other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to
ensure that your use thereof complies with such licenses and/or rights.
978-1-492-08255-2
Chapter 1. An Introduction to Regular Expressions

Many data science, analyst, and technology professionals have encountered
regular expressions at some point. This esoteric, miniature language is used
for matching complex text patterns, and looks mysterious and intimidating at
first. However, regular expressions (also called “regex”) are a powerful tool
that requires only a small time investment to learn. They are supported
almost everywhere there is data. Several analytical and
technology platforms support them, including SQL, Python, R, Alteryx,
Tableau, LibreOffice, Java, Scala, .NET, and Go. Major text editors and
IDEs like Atom Editor, Notepad++, Emacs, Vim, IntelliJ IDEA, and
PyCharm also support searching files with regular expressions.
The ubiquity of regular expressions reflects how broadly useful they are,
and, perhaps surprisingly, they do not have a steep learning curve. If you frequently
find yourself manually scanning documents or parsing substrings just to
identify text patterns, you might want to give them a look. Especially in data
science and data engineering, they can assist in a wide spectrum of tasks,
from wrangling data to qualifying and categorizing it.
In this report, I will cover enough regular expression features to make them
useful for a great majority of tasks you may encounter.

Setting Up
You can test the examples I am about to walk through in a number of
places. I recommend using Regular Expressions 101 (regex101.com), a free web-based
application for testing a regular expression against text inputs. As we go through
these examples, type the regular expression pattern in the “Regular
Expression” field, and a sample text in the “Test String” field. You will then
immediately see in the right panel whether a full or partial match succeeded,
as well as a broken-down explanation of what your regex is doing (see Figure
1-1).

Figure 1-1. The regex101.com site is a helpful tool to test regular expressions against text inputs.

For Python, you can also import and use the native re package as shown
below. The fullmatch() function will accept a regex pattern and an input
string to test against. It will return a match object if a full match exists.

import re

# fullmatch() succeeds only if the entire string matches the pattern
result = re.fullmatch(pattern="[A-Z]{2}", string="TX")

if result:
    print("match")
else:
    print("Doesn't match")

Now that you are set up, we will walk through all the major functionalities
offered by regular expressions.

Literals and Special Characters

A regular expression matches a broad or specific text pattern, and is strictly
read left-to-right. It is input as a text string itself, and will compile into a mini
program built specifically to identify that pattern. That pattern can be used to
match, search, substring, or split text.
Most characters, including alphabetic and numeric characters, have no special
functionality and literally represent those characters. For instance, a regex of
TX will only match the string TX.

REGEX: TX
INPUT: TX
MATCH: true

REGEX: TX
INPUT: AZ
MATCH: false

However, a small subset of characters have special functionalities we will
learn throughout this article. These characters include the following:
[\^$.|?*+()
If you want to treat these characters as literals, you need to precede them with
an escape \. To create a literal regex that matches $180, we need to escape
that dollar sign so it matches a dollar sign. Otherwise it will treat it as an
“end-of-line” character, which we will learn about later.
REGEX: \$180
INPUT: $180
MATCH: true
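
To see the effect of escaping outside of a regex tester, here is a minimal Python sketch. The variable names are my own and purely illustrative. It compares the hand-escaped pattern from the text with the output of re.escape(), a helper in Python's re package that escapes special characters for you, and it also shows what happens when the escape is left off.

```python
import re

literal_text = "$180"

# re.escape() backslash-escapes regex metacharacters in a string,
# so the resulting pattern matches that exact text.
escaped_pattern = re.escape(literal_text)

# The hand-written pattern from the text, as a raw string so that
# Python itself leaves the backslash alone.
manual_pattern = r"\$180"

escaped_ok = re.fullmatch(escaped_pattern, literal_text) is not None
manual_ok = re.fullmatch(manual_pattern, literal_text) is not None

# Without the escape, $ acts as an end-of-line anchor, so the pattern
# cannot match the literal text "$180".
unescaped_ok = re.fullmatch("$180", literal_text) is not None

print(escaped_ok, manual_ok, unescaped_ok)
```

Writing patterns as raw strings (the r prefix) is a common habit in Python, since it keeps the backslashes intact for the regex engine.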

Conversely, putting a \ on certain letters will yield a special character. One of
the most common is \s, which will match any whitespace.

REGEX: Lorem\sIpsum
INPUT: Lorem Ipsum
MATCH: true

Character Ranges
For a given position in a string, we can restrict the match to a set of
allowed characters. To match a string containing a character of 0, 1, or 3
followed by an F, X, or B, we can declare a regular expression with character
ranges inside square brackets [].

REGEX: [013][FXB]
INPUT: 1X
MATCH: true

REGEX: [013][FXB]
INPUT: 1Z
MATCH: false

You can also define a consecutive span of letters or numbers by putting a -
between them. We can qualify a character that is any digit from 1
through 4 followed by any character that is A through Z.

REGEX: [1-4][A-Z]
INPUT: 1X
MATCH: true

REGEX: [1-4][A-Z]
INPUT: 51
MATCH: false

You can also qualify multiple ranges on a single character. For instance, we
can qualify the first character in a two-character string to be an uppercase
letter, a lowercase letter, or a number.

REGEX: [A-Za-z0-9][0-9]
INPUT: i5
MATCH: true

REGEX: [A-Za-z0-9][0-9]
INPUT: 1X
MATCH: false

To negate characters, meaning you want anything but the specified
characters, start your character range with a caret ^. For example, we can
qualify any character that is not an uppercase vowel:

REGEX: [^AEIOU]
INPUT: X
MATCH: true

REGEX: [^AEIOU]
INPUT: E
MATCH: false

If you want a literal dash - character to be part of the character range, declare
it first in the range.

REGEX: [-0-9][0-9]
INPUT: -9
MATCH: true

REGEX: [-0-9][0-9]
INPUT: 99
MATCH: true
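
The character range examples in this section can be checked in one pass with a small Python script. This is only a sketch: full_match() and the checks list are hypothetical helpers of my own, built on re.fullmatch() from the setup section. I have added one extra case to show that a negated range admits any character not listed, including digits.

```python
import re

def full_match(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# (pattern, input, expected result) triples taken from this section,
# plus one extra negation case using a digit.
checks = [
    ("[013][FXB]", "1X", True),
    ("[013][FXB]", "1Z", False),
    ("[1-4][A-Z]", "1X", True),
    ("[1-4][A-Z]", "51", False),
    ("[A-Za-z0-9][0-9]", "i5", True),
    ("[A-Za-z0-9][0-9]", "1X", False),
    ("[^AEIOU]", "X", True),
    ("[^AEIOU]", "E", False),
    ("[^AEIOU]", "7", True),
    ("[-0-9][0-9]", "-9", True),
    ("[-0-9][0-9]", "99", True),
]

# Collect any case where the engine disagrees with the expected result.
mismatches = [(p, t) for p, t, expected in checks if full_match(p, t) != expected]

print("all checks passed" if not mismatches else mismatches)
```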

Anchors
Sometimes you will want to qualify the start ^ and end $ of a line or string.
This can be handy if you are searching a document and want to qualify the
start or end of a line as part of your regular expression. You can use this
regular expression to match all numbers that start a line in a document as
shown here:

^[0-9]

Figure 1-2. Using Atom Editor to search for numbers that start a line.

Conversely, an end-of-line $ can be used to qualify the end of a line. Below is
a regular expression that will match numbers that are the last character on a
line.

[0-9]$

Depending on your environment, using both the start-of-line ^ and end-of-line $
together can be helpful to force a full match and ignore partial ones.
This is because qualifying the start ^ and end $ of a string forces everything
between them to be the only contents allowed in the input.

REGEX: [0-9][0-9]
INPUT: 1432
MATCH: true

REGEX: ^[0-9][0-9]$
INPUT: 1432
MATCH: false
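
In Python, the difference between a partial and a full match shows up in which function you call. The sketch below (with variable names of my own choosing) uses re.search() for partial matching, re.fullmatch() for full matching, and the re.MULTILINE flag so that ^ applies to every line of a small sample document rather than only to the start of the string.

```python
import re

two_digits = "[0-9][0-9]"

# re.search() accepts a partial match anywhere in the input.
partial = re.search(two_digits, "1432")

# Anchoring both ends makes re.search() reject the longer input.
anchored = re.search("^[0-9][0-9]$", "1432")

# re.fullmatch() gives the same answer without explicit anchors.
full = re.fullmatch(two_digits, "1432")

# With re.MULTILINE, ^ matches at the start of every line rather than
# only at the start of the whole string.
document = "3 apples\nno digits here\n42 pears"
line_starts = re.findall("^[0-9]", document, flags=re.MULTILINE)

print(partial.group(), anchored, full, line_starts)
```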

Quantifiers
A critical feature of regular expressions is quantifiers, which repeat the
preceding clause of a regular expression.

Fixed Repetitions
For instance, it is a bit redundant to express [A-Z] three times to match three
uppercase letters.

REGEX: [A-Z][A-Z][A-Z]
INPUT: YCA
MATCH: true

Instead, we can follow the [A-Z] with a quantifier {3} to specify repeating
that character range three times, as in [A-Z]{3}. This accomplishes the same
task as [A-Z][A-Z][A-Z], but more succinctly expresses it as three
repetitions.

REGEX: [A-Z]{3}
INPUT: YCA
MATCH: true

We can use the regular expression below to match a 10-digit phone number
with dashes.
REGEX: [0-9]{3}-[0-9]{3}-[0-9]{4}
INPUT: 470-127-7501
MATCH: true

REGEX: [0-9]{3}-[0-9]{3}-[0-9]{4}
INPUT: 75663-2372
MATCH: false
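
As a quick check that the {3} quantifier really is equivalent to writing the range three times, here is a small Python sketch. The sample strings and variable names are my own; the patterns are the ones from this section.

```python
import re

# Both patterns describe "exactly three uppercase letters".
long_form = re.compile("[A-Z][A-Z][A-Z]")
short_form = re.compile("[A-Z]{3}")

samples = ["YCA", "YC", "YCAB", "yca"]
forms_agree = all(
    bool(long_form.fullmatch(s)) == bool(short_form.fullmatch(s))
    for s in samples
)

# The dashed 10-digit phone number pattern from the text.
phone = re.compile("[0-9]{3}-[0-9]{3}-[0-9]{4}")
phone_results = {
    s: bool(phone.fullmatch(s)) for s in ["470-127-7501", "75663-2372"]
}

print(forms_agree, phone_results)
```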

Min and Max Repetitions

You can also express a minimum and maximum number of allowable
repetitions. [A-Z]{2,3} will require a minimum of 2 repetitions but a
maximum of 3.

REGEX: [A-Z]{2,3}
INPUT: YCA
MATCH: true

REGEX: [A-Z]{2,3}
INPUT: AZ
MATCH: true

Leaving the second argument empty while keeping the comma removes the
maximum, so only a minimum is specified. Below, we
have a regex that will match on a minimum of two alphanumeric characters.

REGEX: [A-Za-z0-9]{2,}
INPUT: YZ1
MATCH: true

REGEX: [A-Za-z0-9]{2,}
INPUT: YZSDjhfhSBH2342SDFSDFsdfw123412
MATCH: true
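
One way to get a feel for minimum and maximum repetitions is to probe a pattern with inputs of increasing length. The accepted_lengths() function below is a hypothetical helper I made up for this purpose. It reports which run lengths of the letter A fully match a given pattern.

```python
import re

def accepted_lengths(pattern: str, max_length: int = 6) -> list:
    """List the run lengths of the letter A that fully match pattern."""
    return [
        n for n in range(1, max_length + 1)
        if re.fullmatch(pattern, "A" * n)
    ]

# A minimum of 2 and a maximum of 3 repetitions.
bounded = accepted_lengths("[A-Z]{2,3}")

# A minimum of 2 and no maximum.
open_ended = accepted_lengths("[A-Za-z0-9]{2,}")

print(bounded)
print(open_ended)
```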

0 or 1 Repetition (a.k.a., Optional)

There are a couple of shorthand symbols for common quantifiers. For
instance, a question mark ? is the same as {0,1}, which makes that part of the
regex optional. If you wanted two uppercase alphabetic characters to
optionally be preceded by a digit, you can do so like this:

REGEX: [0-9]?[A-Z]{2}
INPUT: BC
MATCH: true

REGEX: [0-9]?[A-Z]{2}
INPUT: 3BC
MATCH: true

As you start combining different operations, a regular expression can start to
look overwhelming. But the secret is to read a regex left-to-right. Looking
at the case above, you can interpret it as, “I’m looking for a number that is
optional, followed by an uppercase alphabetic character repeated two times.”
Taking our earlier phone number example, we can now make the dashes
optional as shown here:

REGEX: [0-9]{3}-?[0-9]{3}-?[0-9]{4}
INPUT: 470-127-7501
MATCH: true

REGEX: [0-9]{3}-?[0-9]{3}-?[0-9]{4}
INPUT: 4701277501
MATCH: true
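
Optional dashes are handy when the same value arrives in several formats. The following Python sketch uses candidate strings and variable names of my own. It accepts phone numbers with or without dashes and then strips the dashes so every accepted value ends up in a single form. The normalization step is my own addition, not something the regex does for you.

```python
import re

phone = re.compile("[0-9]{3}-?[0-9]{3}-?[0-9]{4}")

candidates = ["470-127-7501", "4701277501", "470-1277501", "470--127-7501"]

# Each -? allows zero or one dash, so a doubled dash is rejected.
accepted = [c for c in candidates if phone.fullmatch(c)]

# Once a candidate is accepted, removing the dashes gives one form.
normalized = sorted({c.replace("-", "") for c in accepted})

print(accepted)
print(normalized)
```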

1 or More Repetitions
A + is a shorthand for {1,}, which requires a minimum of 1 repetition, but
will capture any number of repetitions after that.

REGEX: [XYZ]+
INPUT: Z
MATCH: true

REGEX: [XYZ]+
INPUT: XYZZZYZXXX
MATCH: true

REGEX: [XYZ]+[0-9]+
INPUT: XYZZZYZXXX2374676128963453452990
MATCH: true

0 or More Repetitions
A * is a shorthand for {0,}, which makes whatever it is quantifying
completely optional, but will capture as many repetitions as it can if they do
exist.

REGEX: [0-4]+[XYZ]*
INPUT: 34
MATCH: true

REGEX: [0-4]+[XYZ]*
INPUT: 34YYXZZ
MATCH: true
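
The difference between + and * is easiest to see on an empty string, and both are “greedy,” meaning they take the longest run they can. Here is a minimal Python sketch of both behaviors, with inputs and variable names of my own choosing.

```python
import re

# + needs at least one repetition, while * is satisfied by none at all.
plus_on_empty = re.fullmatch("[XYZ]+", "") is not None
star_on_empty = re.fullmatch("[XYZ]*", "") is not None

# Both quantifiers are greedy: they take the longest run available.
greedy_run = re.search("[XYZ]+", "abcXYZZZYZXXX123").group()

# A required block of digits followed by an optional block of letters.
pattern = "[0-3]+[XYZ]*"
results = {
    text: re.fullmatch(pattern, text) is not None
    for text in ["32", "32YYXZZ", "YY"]
}

print(plus_on_empty, star_on_empty, greedy_run, results)
```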

Wildcards
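A dot . is a wildcard for any character, making it the broadest operator you
can use. It will match not just alphabetic or numeric characters, but also
whitespace, punctuation, and any other symbols. In most engines, the one
exception is the newline, which the dot matches only when a “dot-all” option
(such as Python’s re.DOTALL flag) is enabled.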

REGEX: ...
INPUT: B/C
MATCH: true

REGEX: .{3}
INPUT: B/C
MATCH: true

REGEX: H.{3}O
INPUT: HELLO
MATCH: true

A common operation you may see is .*, which allows 0 or more repetitions of
any character. This is often used to match any text, making it function as an
“everything” wildcard. This can be helpful when using regular expressions as
qualifiers, and if you do not want that parameter to restrict anything just
make it a .*.

REGEX: .*
INPUT: AsdfSJDFJSVdsfBLKJXCasdBNVJWB$TJ$@#ASDFSD@
MATCH: true

REGEX: .*
INPUT: Alpha
MATCH: true
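
One detail worth testing for yourself is how the dot treats newlines. In Python's re package, the dot skips newline characters unless the re.DOTALL flag is passed. Other environments may behave differently, so treat this sketch (with sample text of my own) as specific to Python.

```python
import re

three_any = re.fullmatch("...", "B/C") is not None
hello = re.fullmatch("H.{3}O", "HELLO") is not None

two_lines = "line one\nline two"

# By default the dot stops at the newline, so the full match fails.
default_dot = re.fullmatch(".*", two_lines) is not None

# With re.DOTALL the dot also matches the newline character.
dotall_dot = re.fullmatch(".*", two_lines, flags=re.DOTALL) is not None

# Because * allows zero repetitions, .* also matches an empty string.
empty_ok = re.fullmatch(".*", "") is not None

print(three_any, hello, default_dot, dotall_dot, empty_ok)
```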

Grouping
It can be helpful to group up parts of a regular expression in parentheses,
often to use a quantifier on that whole group. For instance, if you want to
qualify an uppercase letter followed by three numeric digits, but want to
repeat that whole operation with a quantifier, you can do so like this:

REGEX: ([A-Z][0-9]{3})+
INPUT: A563
MATCH: true

REGEX: ([A-Z][0-9]{3})+
INPUT: A563X264
MATCH: true

REGEX: ([A-Z][0-9]{3}-?)+
INPUT: A563-X264-C578
MATCH: true

If we wanted to identify phone numbers (with optional dashes -), but make
the area code (the first three digits) optional, we can do so like this:

REGEX: ([0-9]{3}-?)?[0-9]{3}-?[0-9]{4}
INPUT: 470-127-7501
MATCH: true

REGEX: ([0-9]{3}-?)?[0-9]{3}-?[0-9]{4}
INPUT: 127-7501
MATCH: true
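
Groups do more than scope a quantifier. In Python, each pair of parentheses also captures the text it matched, and a repeated group keeps only its final repetition. The sketch below uses variable names and an extra invalid input of my own to illustrate this, along with the optional area code pattern.

```python
import re

codes = re.compile("([A-Z][0-9]{3}-?)+")
match = codes.fullmatch("A563-X264-C578")

# A repeated group keeps only the text of its final repetition.
last_group = match.group(1)

# To list every code, search for the inner pattern on its own.
all_codes = re.findall("[A-Z][0-9]{3}", "A563-X264-C578")

# Phone number with an optional area code and optional dashes.
phone = re.compile("([0-9]{3}-?)?[0-9]{3}-?[0-9]{4}")
phone_results = {
    text: phone.fullmatch(text) is not None
    for text in ["470-127-7501", "127-7501", "4701277501", "27-7501"]
}

print(last_group, all_codes, phone_results)
```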

Alternation
Alternation is expressed with a | and essentially operates as an “OR”. It
alternates two or more valid patterns where at least one of those patterns must
match in that position.
For instance, if we want to capture 5-digit U.S. ZIP codes that end in “35” or
“75,” we can tail a repeated numeric range with a (35|75). We must group
that in parentheses so the | does not mangle the 35 with the repeated numeric
range.

REGEX: [0-9]{3}(35|75)
INPUT: 75035
MATCH: true

REGEX: [0-9]{3}(35|75)
INPUT: 75062
MATCH: false

Sometimes an alternator is used simply to qualify a set of literal values. For
instance, if I want to only match ALPHA, BETA, and GAMMA, I can use an
alternator to achieve this.

REGEX: ALPHA|BETA|GAMMA
INPUT: BETA
MATCH: true

REGEX: ALPHA|BETA|GAMMA
INPUT: DELTA
MATCH: false
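
To see why the parentheses matter, it helps to compare the grouped ZIP code pattern with an ungrouped version. Without the group, the | splits the entire pattern into two alternatives: three digits followed by 35, or just 75 on its own. This is a small Python sketch with test inputs of my own.

```python
import re

grouped = "[0-9]{3}(35|75)"
ungrouped = "[0-9]{3}35|75"  # reads as: ([0-9]{3}35) OR (75)

inputs = ["75035", "12375", "75"]

grouped_results = {t: re.fullmatch(grouped, t) is not None for t in inputs}
ungrouped_results = {t: re.fullmatch(ungrouped, t) is not None for t in inputs}

# A plain alternation works as a whitelist of literal values.
greek = "ALPHA|BETA|GAMMA"
greek_results = {
    t: re.fullmatch(greek, t) is not None for t in ["BETA", "DELTA"]
}

print(grouped_results)
print(ungrouped_results)
print(greek_results)
```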

Prefixes and Suffixes

Especially when you are scanning documents, it can be helpful to qualify
something that precedes or follows your targeted text without capturing it.
Prefixes and suffixes (commonly called lookbehinds and lookaheads) allow this,
and can be leveraged with (?<=regex) and (?=regex) respectively, where “regex”
is the pattern for the head or tail you want to qualify but not include.
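For instance, if I want to extract numbers that are preceded by an uppercase
letter, but I don’t want to include that letter, I can use a prefix like this.
Note that many engines, including Python’s re package, require the pattern
inside a prefix to have a fixed width, so a single [A-Z] is used here rather
than [A-Z]+.

REGEX: (?<=[A-Z])[0-9]+
INPUT: ALPHA12
MATCH: 12

REGEX: (?<=[A-Z])[0-9]+
INPUT: 167
MATCH: false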

A suffix works similarly, but matches a tail without including that tail.

REGEX: [0-9]+(?=[A-Z]+)
INPUT: 12ALPHA
MATCH: 12

REGEX: [0-9]+(?=[A-Z]+)
INPUT: 167
MATCH: false
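
Prefixes and suffixes are most useful when extracting text from a larger document. The Python sketch below runs re.findall() over a sample sentence of my own. It uses a single-letter prefix because Python's re package rejects prefixes of variable width, and the last few lines probe that behavior rather than assume it.

```python
import re

text = "ALPHA12 and 167 and 34BETA"

# Digits directly preceded by an uppercase letter (prefix not included).
after_letter = re.findall("(?<=[A-Z])[0-9]+", text)

# Digits directly followed by an uppercase letter (suffix not included).
before_letter = re.findall("[0-9]+(?=[A-Z])", text)

# Probe whether this engine accepts a variable-width prefix.
try:
    re.compile("(?<=[A-Z]+)[0-9]+")
    variable_width_supported = True
except re.error:
    variable_width_supported = False

print(after_letter, before_letter, variable_width_supported)
```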

Conclusions
It is important to remember that a regular expression often only needs to be
as specific as your data requires, depending on how predictable that
data is. Qualifying a number with [0-9.]+ will work to match an IP
address such as 172.18.83.200. But keep in mind it will also match
237476231.345342342334.23423756756856234, which is definitely not an
IP address. If you do not know your data well, you should probably err on
the side of being more specific, as demonstrated in this Stack Overflow question.
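
To make the trade-off concrete, here is a Python sketch comparing the loose pattern with a somewhat stricter one. The stricter pattern is my own illustration, not a definitive IP address validator. It only enforces four dot-separated groups of one to three digits, and it still accepts out-of-range values such as 999.999.999.999.

```python
import re

loose = "[0-9.]+"

# Hypothetical stricter pattern: four groups of 1 to 3 digits separated
# by literal dots. It does not check that each group is at most 255.
stricter = r"([0-9]{1,3}\.){3}[0-9]{1,3}"

real_ip = "172.18.83.200"
junk = "237476231.345342342334.23423756756856234"
out_of_range = "999.999.999.999"

loose_results = {t: re.fullmatch(loose, t) is not None for t in [real_ip, junk]}
stricter_results = {
    t: re.fullmatch(stricter, t) is not None
    for t in [real_ip, junk, out_of_range]
}

print(loose_results)
print(stricter_results)
```

How much further to tighten the pattern depends on how messy your data is. If every value is already known to be an IP address, the loose version may be all you need.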
Regular expressions may seem niche, but they can rise up heroically to the
most unexpected tasks in your day-to-day work. Hopefully this article has
helped you feel more comfortable with regular expressions and find them
useful. They can assist in data munging, qualification, categorization, and
parsing as well as document editing.
