René Nyffenegger's collection of things on the web

Regular Expressions

.... Yet to be finished ....

Elements in regular expressions

y

A character that has no special meaning (for example y) matches itself.

.

A dot matches any single character except a newline.
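
As a small illustration of the dot, here is a sketch using Python's re module (the choice of Python and the sample string are assumptions made for this example, not something the page prescribes):

```python
import re

# "a.c" needs exactly one character between a and c.
# The third candidate has a newline in the middle, which the dot skips.
dot_matches = re.findall(r"a.c", "abc a-c a\nc")
print(dot_matches)  # -> ['abc', 'a-c']
```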

*

A star matches 0 or more repetitions of the previous character, grouped re or character class.
0 repetitions means that the pattern also matches when the previous element is absent.

+

Matches 1 or more repetitions of the previous character, grouped re or character class.
So + (written \+ in Vim) is like *, except that * also matches when the previous element doesn't occur at all.

?

Matches 0 or 1 repetitions of the previous character, grouped re or character class.
0 repetitions means that the pattern also matches when the previous element is absent.
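
The difference between *, + and ? is easiest to see side by side. The following is a minimal sketch in Python's re module; the word list and variable names are made up for the example:

```python
import re

words = ["ac", "abc", "abbbc"]

# b* : zero or more b's, so even "ac" (no b at all) matches.
star = [w for w in words if re.fullmatch(r"ab*c", w)]

# b+ : at least one b is required, so "ac" is rejected.
plus = [w for w in words if re.fullmatch(r"ab+c", w)]

# b? : at most one b is allowed, so "abbbc" is rejected.
optional = [w for w in words if re.fullmatch(r"ab?c", w)]

print(star)      # -> ['ac', 'abc', 'abbbc']
print(plus)      # -> ['abc', 'abbbc']
print(optional)  # -> ['ac', 'abc']
```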

{n}

Matches n repetitions of the previous character, grouped re or character class.

{n,m}

Matches between n and m repetitions of the previous character, grouped re or character class.

{,m}

Matches between 0 and m repetitions of the previous character, grouped re or character class.

{n,}

Matches at least n repetitions of the previous character, grouped re or character class.
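
The four counted forms can be compared with a short sketch in Python's re module, which accepts all of them (other flavors may not accept {,m}). The helper name matching and the sample strings are hypothetical:

```python
import re

samples = ["", "a", "aa", "aaa", "aaaa"]

def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

exactly_two = matching(r"a{2}")      # {n}
two_to_three = matching(r"a{2,3}")   # {n,m}
up_to_two = matching(r"a{,2}")       # {,m}: lower bound defaults to 0
at_least_three = matching(r"a{3,}")  # {n,}: no upper bound

print(exactly_two)     # -> ['aa']
print(two_to_three)    # -> ['aa', 'aaa']
print(up_to_two)       # -> ['', 'a', 'aa']
print(at_least_three)  # -> ['aaa', 'aaaa']
```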

(regular-expressions)

A regular expression group.
Groups let the quantifiers *, +, ? and {...} (written \+, \? and \{...\} in Vim) apply to a whole subexpression instead of a single character.
Back references refer to regular expression groups.
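
To show what grouping changes, here is a sketch in Python's re module (the test string is an assumption chosen for the example):

```python
import re

# Without a group, + applies only to the character right before it ("b"),
# so "ab+" means: one a followed by one or more b's.
no_group = re.fullmatch(r"ab+", "ababab")

# With a group, + applies to the whole subexpression "ab".
grouped = re.fullmatch(r"(ab)+", "ababab")

print(no_group)          # -> None
print(grouped.group(0))  # -> ababab
```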

\s

Matches a whitespace character: space or tabulator. In Perl-style flavors \s also matches newline; in Vim it matches only space and tabulator.

\S

Opposite of \s.

\d

Matches a digit. Equivalent to [0123456789] or [0-9].

\D

Opposite of \d. Equivalent to [^0-9].

\x

Matches a hex digit, equivalent to [0-9A-Fa-f].

\X

Opposite of \x.

\o

Matches an octal digit, equivalent to [0-7].

\O

Opposite of \o.

\w

Matches a word character, equivalent to [0-9A-Za-z_].

\W

Opposite of \w.
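
The shorthand classes \s, \d and \w can be tried out with a sketch like the one below. It assumes Python's re module, where the re.ASCII flag restricts \d and \w to the ASCII ranges; note that Python's \w also accepts the underscore. The sample text is invented for the example:

```python
import re

text = "id_42 = 0x2A;\tok"

# \d+ : runs of digits
digits = re.findall(r"\d+", text, flags=re.ASCII)

# \w+ : runs of word characters (letters, digits, underscore)
words = re.findall(r"\w+", text, flags=re.ASCII)

# \s : single whitespace characters (two spaces and one tab here)
spaces = re.findall(r"\s", text)

print(digits)  # -> ['42', '0', '2']
print(words)   # -> ['id_42', '0x2A', 'ok']
print(spaces)  # -> [' ', ' ', '\t']
```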

\h

Matches a head-of-word character, equivalent to [A-Za-z_] (Vim).

\H

Opposite of \h.

\a

Matches an alphabetic character, equivalent to [A-Za-z].

\A

Opposite of \a.

\l

Lowercase character, equivalent to [a-z].

\L

Opposite of \l.

\u

Uppercase letter, equivalent to [A-Z].

\U

Opposite of \u.
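
Python's re module does not provide \x, \o, \a, \l or \u as character classes, so the sketch below falls back on the bracket equivalents given above. The EQUIVALENTS table and the chars_matching helper are hypothetical names introduced only for this illustration:

```python
import re

# Hypothetical lookup table: shorthand letter -> equivalent bracket expression.
EQUIVALENTS = {
    "x": "[0-9A-Fa-f]",  # hex digit
    "o": "[0-7]",        # octal digit
    "a": "[A-Za-z]",     # alphabetic character
    "l": "[a-z]",        # lowercase character
    "u": "[A-Z]",        # uppercase character
}

def chars_matching(shorthand, text):
    """Collect the characters of text that the class would match."""
    return "".join(re.findall(EQUIVALENTS[shorthand], text))

sample = "Ox1F9z"
print(chars_matching("x", sample))  # -> 1F9
print(chars_matching("o", sample))  # -> 1
print(chars_matching("a", sample))  # -> OxFz
print(chars_matching("l", sample))  # -> xz
print(chars_matching("u", sample))  # -> OF
```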

^

Matches the start of a line.

$

Matches the end of a line.
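
How ^ and $ behave on multi-line input depends on the tool. As a sketch, in Python's re module they match at every line boundary only when the re.MULTILINE flag is given; without it, ^ matches only at the start of the whole string. The sample text is an assumption:

```python
import re

text = "first line\nsecond line"

# With MULTILINE, ^ and $ match at the start and end of every line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

# Without the flag, ^ matches only at the very start of the string.
string_start = re.findall(r"^\w+", text)

print(line_starts)   # -> ['first', 'second']
print(line_ends)     # -> ['line', 'line']
print(string_start)  # -> ['first']
```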

[abcd]

Matches one character that is between [ and ]. So [abcd] matches either a, b, c or d.

[^abcd]

Matches one character that is not between [ and ]. So [^abcd] matches neither a nor b nor c nor d. But it matches A, e, ...
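
A character class and its negation split the input between them, which the following Python sketch shows (the sample string is made up):

```python
import re

text = "abcdeA"

# [abcd] picks out the listed characters ...
inside = re.findall(r"[abcd]", text)

# ... and [^abcd] picks out everything else, including e and uppercase A.
outside = re.findall(r"[^abcd]", text)

print(inside)   # -> ['a', 'b', 'c', 'd']
print(outside)  # -> ['e', 'A']
```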

regular-expression-1|regular-expression-2

Matches either regular-expression-1 or regular-expression-2.
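
A point worth illustrating is that | splits the whole expression unless a group limits it. The sketch below uses Python's re module, where | is written without a backslash; the sample strings are assumptions:

```python
import re

# Either alternative may match.
pets = re.findall(r"cat|dog", "cat, dog, cow")

# Inside a group, | chooses between "a" and "e" only.
grouped = re.fullmatch(r"gr(a|e)y", "grey")

# Without the group, the alternatives are "gra" and "ey",
# and neither of them covers the whole string "grey".
ungrouped = re.fullmatch(r"gra|ey", "grey")

print(pets)              # -> ['cat', 'dog']
print(grouped.group(0))  # -> grey
print(ungrouped)         # -> None
```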

[[:META:]]

META can be:
  • digit
    Digits 0 through 9
  • upper
    Uppercase characters in the alphabet (A,B,C..Z)
  • lower
    Lowercase characters in the alphabet (a,b,c..z)
  • alpha
    [:upper:] + [:lower:]
  • alnum
    [:digit:] + [:alpha:]
  • blank
    Space and tabulator
  • punct
    Punctuation characters such as . , " ' ? ! ; :
  • print
    Printable characters
  • space
    Whitespace characters
  • cntrl
    Control characters
A class can be combined with other characters inside the brackets, like so:
[[:digit:]a-f]
That matches a single lowercase hexadecimal digit (0 through 9 or a through f).
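
Python's re module does not understand the [:META:] notation, so the sketch below translates it into plain ranges before compiling. The POSIX_CLASSES table and the expand_posix helper are hypothetical, cover only a few ASCII classes, and are meant as an illustration of how [[:digit:]a-f] expands rather than as a complete implementation:

```python
import re

# Hypothetical table of ASCII equivalents for a few of the classes above.
POSIX_CLASSES = {
    "digit": "0-9",
    "upper": "A-Z",
    "lower": "a-z",
    "alpha": "A-Za-z",
    "alnum": "0-9A-Za-z",
    "blank": r" \t",
}

def expand_posix(pattern):
    """Replace each [:name:] with the corresponding plain character range."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)

hex_lower = expand_posix("[[:digit:]a-f]")
found = re.findall(hex_lower + "+", "ff 12 xyz 9a")

print(hex_lower)  # -> [0-9a-f]
print(found)      # -> ['ff', '12', '9a']
```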

\n

Here n stands for a number (\1, \2, ...). It is a back reference: it matches the same text that the nth regular expression group matched.
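
A common use of back references is finding doubled words. The following is a minimal sketch in Python's re module; the sentence is invented, and \b (a word boundary) is an extra element not covered on this page:

```python
import re

sentence = "this is is a test test ok"

# (\w+) captures a word; \1 then has to match exactly the same text again.
doubled = re.findall(r"\b(\w+) \1\b", sentence)

# The same back reference works in the replacement to keep a single copy.
cleaned = re.sub(r"\b(\w+) \1\b", r"\1", sentence)

print(doubled)  # -> ['is', 'test']
print(cleaned)  # -> this is a test ok
```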

Links