René Nyffenegger's collection of things on the web

Regular Expressions

.... Yet to be finished ....

Elements in regular expressions


A character that has no special meaning (for example y) matches itself.


A dot (.) matches any single character except a newline.


A star (*) matches 0 or more repetitions of the previous character, grouped regular expression or character class.
0 repetitions means that the previous element may be absent entirely.
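A minimal sketch of the dot and the star, using Python's re module purely for illustration (Python is an assumption here, not something this page prescribes; it writes . and * the same way). The sample strings are made up.

```python
import re

# '.' matches any single character except a newline.
dot_candidates = ["cat", "c t", "c\nt", "ct"]
dot_hits = [s for s in dot_candidates if re.fullmatch(r"c.t", s)]

# '*' allows zero or more repetitions of the preceding element (here: b).
star_candidates = ["ac", "abc", "abbbc", "adc"]
star_hits = [s for s in star_candidates if re.fullmatch(r"ab*c", s)]

print(dot_hits)   # ['cat', 'c t']
print(star_hits)  # ['ac', 'abc', 'abbbc']
```

Note that "ac" is accepted by ab*c: the b is simply absent.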


\+: Matches 1 or more repetitions of the previous character, grouped regular expression or character class.
So \+ is like *, except that * also matches if the previous element does not occur at all.
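The difference between "one or more" and "zero or more" can be sketched as follows. This assumes Python's re module, which writes the operator as + without the backslash; the sample strings are invented for the example.

```python
import re

samples = ["ac", "abc", "abbc"]

# Python spells "one or more" as + (the notation on this page is \+).
one_or_more = [s for s in samples if re.fullmatch(r"ab+c", s)]
# '*' additionally accepts the case where b does not occur at all.
zero_or_more = [s for s in samples if re.fullmatch(r"ab*c", s)]

print(one_or_more)   # ['abc', 'abbc']
print(zero_or_more)  # ['ac', 'abc', 'abbc']
```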


\?: Matches 0 or 1 repetitions of the previous character, grouped regular expression or character class.
0 repetitions means that the previous element may be absent.
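A small sketch of an optional element, again assuming Python's re module (which writes the operator as ? rather than \?). The word list is made up.

```python
import re

# The u is optional: it may occur once or not at all, but not twice.
optional_u = re.compile(r"colou?r")

words = ["color", "colour", "colouur"]
matched = [w for w in words if optional_u.fullmatch(w)]

print(matched)  # ['color', 'colour']
```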


\{n\}: Matches exactly n repetitions of the previous character, grouped regular expression or character class.


\{n,m\}: Matches between n and m repetitions of the previous character, grouped regular expression or character class.


\{,m\}: Matches up to m repetitions (that is, 0 to m) of the previous character, grouped regular expression or character class.


\{n,\}: Matches at least n repetitions of the previous character, grouped regular expression or character class.
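The four counted forms can be compared side by side. This is a sketch under the assumption that Python's re module is used, where the braces are written without backslashes ({3}, {2,4}, {,2}, {4,}); the helper lengths_matched is hypothetical and exists only for this illustration.

```python
import re

def lengths_matched(pattern, max_len=6):
    """Return every k in 0..max_len for which 'a' * k fully matches pattern."""
    return [k for k in range(max_len + 1) if re.fullmatch(pattern, "a" * k)]

exactly_3 = lengths_matched(r"a{3}")      # exactly n
between_2_4 = lengths_matched(r"a{2,4}")  # between n and m
up_to_2 = lengths_matched(r"a{,2}")       # up to m, including zero
at_least_4 = lengths_matched(r"a{4,}")    # at least n

print(exactly_3)    # [3]
print(between_2_4)  # [2, 3, 4]
print(up_to_2)      # [0, 1, 2]
print(at_least_4)   # [4, 5, 6]
```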


\(...\): A regular expression group.
Groups are used to apply *, \+, \? and the \{...\} elements to more than one character at once.
Back references refer to regular expression groups.
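To illustrate why grouping matters for repetition, here is a minimal sketch assuming Python's re module, where a group is written ( ) without backslashes. The test string is made up.

```python
import re

# Without a group, + applies only to the last character:
# one a followed by one or more b.
ungrouped = re.fullmatch(r"ab+", "ababab")

# With a group, + applies to the whole sequence "ab".
grouped = re.fullmatch(r"(ab)+", "ababab")

print(ungrouped is None)    # True
print(grouped is not None)  # True
```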


\s: A whitespace character (space or tabulator). In Vim, \s does not match a newline; in many other flavors (Perl, Python) it does.


\S: Opposite of \s.


\d: Matches a digit. Equivalent to [0123456789] or [0-9].


\D: Opposite of \d. Equivalent to [^0-9].
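A short sketch of the whitespace and digit shorthands, assuming Python's re module. Be aware that in Python \s also matches newlines and \d can match non-ASCII digits in str patterns; the sample text is invented and plain ASCII, so that difference does not show here.

```python
import re

text = "order 66 shipped\ton 2024-05-01"

# \d+ finds runs of digits.
digit_runs = re.findall(r"\d+", text)

# \s+ splits on runs of whitespace (here a space or a tab).
fields = re.split(r"\s+", text)

# \D+ only accepts strings that contain no digit at all.
no_digits = re.fullmatch(r"\D+", "abc") is not None

print(digit_runs)  # ['66', '2024', '05', '01']
print(fields)      # ['order', '66', 'shipped', 'on', '2024-05-01']
print(no_digits)   # True
```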


\x: Matches a hex digit, equivalent to [0-9A-Fa-f].


\X: Opposite of \x.


\o: Matches an octal digit, equivalent to [0-7].


\O: Opposite of \o.
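Not every flavor has shorthands for hex and octal digits. Python's re module, for instance, has none, so the bracket equivalents given above are used instead. A minimal sketch; the names HEX_DIGIT, OCTAL_DIGIT, is_hex and is_octal are hypothetical and chosen for this example only.

```python
import re

HEX_DIGIT = r"[0-9A-Fa-f]"  # bracket equivalent of a hex digit
OCTAL_DIGIT = r"[0-7]"      # bracket equivalent of an octal digit

def is_hex(s):
    """True if s consists of one or more hex digits."""
    return re.fullmatch(HEX_DIGIT + "+", s) is not None

def is_octal(s):
    """True if s consists of one or more octal digits."""
    return re.fullmatch(OCTAL_DIGIT + "+", s) is not None

print(is_hex("DEADbeef"), is_hex("xyz"))  # True False
print(is_octal("755"), is_octal("789"))   # True False
```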


\w: Matches a word character, equivalent to [0-9A-Za-z_] (the underscore is included).


\W: Opposite of \w.
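A quick sketch of what counts as a word character, assuming Python's re module. The re.ASCII flag is passed so that \w is limited to letters, digits and the underscore; without it Python also accepts non-ASCII letters. The sample text is made up.

```python
import re

text = "snake_case, kebab-case & x2"

# The underscore is a word character; the hyphen, comma and ampersand are not.
words = re.findall(r"\w+", text, flags=re.ASCII)

print(words)  # ['snake_case', 'kebab', 'case', 'x2']
```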


\h: Head of word character, equivalent to [A-Za-z_] (Vim; other flavors use \h differently or not at all).


\H: Opposite of \h.
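One typical use of a "head of word" character is matching identifiers that must not start with a digit. The sketch below assumes Python's re module, which has no \h, so the bracket form [A-Za-z_] stands in for it; the name IDENTIFIER and the sample names are hypothetical.

```python
import re

# A head-of-word character followed by any number of word characters.
IDENTIFIER = re.compile(r"[A-Za-z_][0-9A-Za-z_]*")

names = ["_tmp", "x1", "1x", "total_sum"]
valid = [n for n in names if IDENTIFIER.fullmatch(n)]

print(valid)  # ['_tmp', 'x1', 'total_sum']
```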


\a: Matches an alphabetic character, equivalent to [A-Za-z].


\A: Opposite of \a.


\l: Matches a lowercase letter, equivalent to [a-z].


\L: Opposite of \l.


\u: Matches an uppercase letter, equivalent to [A-Z].


\U: Opposite of \u.
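The letter classes combine naturally, for example to recognise a capitalised word (one uppercase letter followed by lowercase letters). A minimal sketch assuming Python's re module, which lacks these shorthands, so the bracket equivalents [A-Z] and [a-z] are used; the word list is invented.

```python
import re

# One uppercase letter, then one or more lowercase letters.
capitalised = re.compile(r"[A-Z][a-z]+")

words = ["Vim", "vim", "VIM", "Vi1"]
hits = [w for w in words if capitalised.fullmatch(w)]

print(hits)  # ['Vim']
```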


^: Matches the start of a line.


$: Matches the end of a line.
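A sketch of the two anchors, assuming Python's re module. In Python, ^ and $ only match at every line start and end when the re.MULTILINE flag is given; otherwise they anchor to the whole string. The three-line sample text is made up.

```python
import re

text = "alpha beta\nbeta gamma\ngamma alpha"

# First word of each line.
starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
# Last word of each line.
ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

print(starts)  # ['alpha', 'beta', 'gamma']
print(ends)    # ['beta', 'gamma', 'alpha']
```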


[...]: Matches one character that is listed between [ and ]. So [abcd] matches either a, b, c or d.


[^...]: Matches one character that is not listed between [^ and ]. So [^abcd] matches neither a, b, c nor d, but it does match A, e, ...
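The two bracket forms are complements of each other, which a short sketch can show. Python's re module is assumed (it uses the same bracket syntax); the sample word is arbitrary.

```python
import re

word = "regular"

# Characters that are in the set.
vowels = re.findall(r"[aeiou]", word)
# Characters that are not in the set.
non_vowels = re.findall(r"[^aeiou]", word)

print(vowels)      # ['e', 'u', 'a']
print(non_vowels)  # ['r', 'g', 'l', 'r']
```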


\|: Matches either regular-expression-1 or regular-expression-2, written as regular-expression-1\|regular-expression-2.
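A minimal sketch of alternation, assuming Python's re module, which writes the operator as | without the backslash. The sample text is made up.

```python
import re

# Either "cat" or "dog".
animal = re.compile(r"cat|dog")

found = animal.findall("cat, dog, cow, dogma")

print(found)  # ['cat', 'dog', 'dog']
```

The last hit comes from "dogma": without anchors, the pattern matches anywhere inside a longer word.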


[:META:]: A named character class, used between [ and ]. META can be:
  • digit
    Digits 0 through 9
  • upper
    Uppercase characters in the alphabet (A, B, C ... Z)
  • lower
    Lowercase characters in the alphabet (a, b, c ... z)
  • alpha
    [:upper:] + [:lower:]
  • alnum
    [:digit:] + [:alpha:]
  • blank
    Space and tabulator
  • punct
    Punctuation characters such as . , " ' ? ! ; :
  • print
    Printable characters
  • space
    Whitespace characters
  • cntrl
    Control characters
A named class can be combined with other characters inside the same brackets, for example [[:digit:]A-Fa-f].
That would match the hexadecimal digits used to represent numbers.
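Python's re module does not understand named classes such as [:digit:], so the sketch below expands them into plain ASCII ranges before compiling. Everything here is an assumption made for illustration: the table POSIX_CLASSES covers only a few of the names listed above, the helper expand_posix is hypothetical, and the pattern [[:digit:]A-Fa-f] is one plausible way to write the hexadecimal combination mentioned above.

```python
import re

# Hypothetical, ASCII-only expansions for a few named classes.
POSIX_CLASSES = {
    "digit": "0-9",
    "upper": "A-Z",
    "lower": "a-z",
    "alpha": "A-Za-z",
    "alnum": "0-9A-Za-z",
    "blank": r" \t",
}

def expand_posix(pattern):
    """Replace each [:name:] in pattern with its ASCII range."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)

# A named class combined with extra ranges: the hexadecimal digits.
hex_pattern = expand_posix("[[:digit:]A-Fa-f]+")

print(hex_pattern)                                     # [0-9A-Fa-f]+
print(re.fullmatch(hex_pattern, "1aF9") is not None)   # True
print(re.fullmatch(hex_pattern, "xyz") is not None)    # False
```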


\n: n is a number (1 to 9). Matches the same text that the nth regular expression group matched (a back reference).
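A back reference repeats the text a group actually matched, not the group's pattern. A minimal sketch assuming Python's re module, which also writes back references as \1, \2 and so on; the sample strings are invented.

```python
import re

# Group 1 matches "ab" or "cd"; \1 must then repeat exactly that text.
pair = re.compile(r"(ab|cd)-\1")

same = pair.fullmatch("ab-ab")
different = pair.fullmatch("ab-cd")

# Finding a doubled word: a word, a space, and the same word again.
doubled = re.search(r"(\w+) \1", "hello hello world")

print(same is not None)   # True
print(different is None)  # True
print(doubled.group(1))   # hello
```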
