5.10 Regular Expressions

20210920

Regular expressions, or regex for short, are widely used in command line tools and in programs.
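
As a small illustration of use within a program, the sketch below filters lines much as a command line tool such as grep would. It assumes Python and its standard re module. The log lines and the pattern are invented for the example.

```python
import re

# Hypothetical log lines, invented for this example.
lines = [
    "2021-09-20 INFO started",
    "2021-09-20 ERROR disk full",
    "2021-09-21 WARN low memory",
]

# Compile the pattern once and reuse it for every line.
# It matches any line containing ERROR or WARN.
pattern = re.compile(r"ERROR|WARN")

# Keep only the lines where the pattern is found somewhere in the line.
matching = [line for line in lines if pattern.search(line)]

for line in matching:
    print(line)
```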

Visit https://regex101.com for an interactive tool to build and test regular expressions.

Pattern   Explanation
.         anything, generally except newline (\n)
^         start of string or line
$         end of string or line
+         1 or more of previous pattern
*         0 or more of previous pattern
?         optional previous pattern (0 or 1)
{n}       exactly n of previous pattern
{n,}      n or more of previous pattern
{n,m}     n to m of previous pattern
\A        start of string
\b        word boundary
\B        not word boundary
\d        digit [0-9]
\D        not digit [^0-9]
\n        newline
\s        whitespace [ \t\r\n\v\f]
\S        not whitespace [^ \t\r\n\v\f]
\t        tab
\w        word character [A-Za-z0-9_]
\W        not word character [^A-Za-z0-9_]
\Z        end of string
(…)       indexed group
(a|b)     a or b
(?:…)     group not indexed
[abc]     single character matching a, b or c
[^abc]    single character other than a, b or c
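
Several of the patterns in the table can be tried out in a short script. The following is a minimal sketch that assumes Python's standard re module, chosen here only for illustration. The sample strings are made up, and the results shown in comments are what Python's re module returns for them.

```python
import re

# Made-up sample text spanning two lines.
text = "Order 66 shipped on 2021-09-20.\nOrder 7 pending."

# \d with + : one or more digits.
numbers = re.findall(r"\d+", text)
# ['66', '2021', '09', '20', '7']

# {n} : exactly n of the previous pattern, here a date like 2021-09-20.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)
# ['2021-09-20']

# ^ matches at the start of each line only when re.MULTILINE is set.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
# ['Order', 'Order']

# \A matches only at the very start of the string, whatever the flags.
string_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)
# ['Order']

# ? : the previous pattern is optional.
colours = re.findall(r"colou?r", "color colour")
# ['color', 'colour']

# \b : word boundary, so "cat" inside "concat" is not matched.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")
# ['cat', 'cat']

# (?:...) groups without indexing; (...) is captured as group 1.
match = re.search(r"(?:Order) (\d+)", text)
order_id = match.group(1)
# '66'

# [abc] and [^abc] : a single character in, or not in, the set.
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")
# ['e', 'e'] and ['r', 'g', 'x']

# \s+ : split on runs of whitespace (spaces, tabs, newlines).
fields = re.split(r"\s+", "a  b\tc\nd")
# ['a', 'b', 'c', 'd']
```

Other regex flavours, such as those used by grep or sed, may support only a subset of these patterns or treat some of them differently.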


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0