Friday 13 November 2009

Regular Expressions 101 Part Two

In the previous post in this series we looked at a simple example of a regular expression:



^\d*$



which restricts input to zero or more digits and nothing else.



The table below outlines a few slightly more complex variations of this regular expression and in which instances they would be valid.

"yes" means that particular number of digits would be valid input; "-" means it would not.

Quantity of numerical characters   +       *       {2}       {2,3}
Regular Expression                 ^\d+$   ^\d*$   ^\d{2}$   ^\d{2,3}$
0                                  -       yes     -         -
1                                  yes     yes     -         -
2                                  yes     yes     yes       yes
3                                  yes     yes     -         yes
4                                  yes     yes     -         -


So to break this down: "+" means the "pattern" it follows (here \d, a single
digit) must appear one or more times; "*" means the pattern may appear any
number of times, including not at all; "{2}" means there must be exactly 2
instances of the pattern; "{2,3}" means there must be either 2 or 3 instances
of the pattern.
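
If you want to check the table for yourself, here is a minimal sketch in Python (my choice of language for illustration; the patterns behave the same way in most regex flavours). The helper names matches and table_rows are made up for this example, and the digit "7" is just an arbitrary test character.

```python
import re

# The four patterns from the table above.
PATTERNS = [r"^\d+$", r"^\d*$", r"^\d{2}$", r"^\d{2,3}$"]


def matches(pattern: str, text: str) -> bool:
    """Return True if text satisfies the anchored pattern."""
    return re.match(pattern, text) is not None


def table_rows(max_digits: int = 4):
    """Build one row per digit count: (count, [result for each pattern])."""
    rows = []
    for n in range(max_digits + 1):
        text = "7" * n  # a string made of n digits
        rows.append((n, [matches(p, text) for p in PATTERNS]))
    return rows


for n, results in table_rows():
    print(n, ["yes" if ok else "-" for ok in results])
```

Each printed row should line up with the corresponding row of the table.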

Here's a list of some characters found in regular expressions and what they mean:


  • . - any single character (in most flavours, anything except a newline)

  • ? - optional i.e. the preceding item may appear once or not at all

  • + - one or more

  • * - none or any quantity
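
To see these four characters in action, here is a small Python sketch. The helper whole_match and the sample patterns are invented for illustration only; re.fullmatch is used so that the entire input has to match, roughly as if each pattern were wrapped in ^ and $.

```python
import re


def whole_match(pattern: str, text: str) -> bool:
    """True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, input) pairs chosen purely for illustration.
examples = [
    (r"a.c", "abc"),          # "." stands in for any one character
    (r"a.c", "ac"),           # ...but there must be exactly one
    (r"colou?r", "color"),    # "?" lets the "u" be absent
    (r"colou?r", "colouur"),  # ...but does not let it repeat
    (r"ab+", "a"),            # "+" needs at least one "b"
    (r"ab+", "abbb"),
    (r"ab*", "a"),            # "*" is happy with no "b" at all
    (r"ab*", "abbb"),
]

results = [whole_match(p, t) for p, t in examples]
for (p, t), ok in zip(examples, results):
    print(f"{p!r} vs {t!r}: {'match' if ok else 'no match'}")
```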

Regular Expressions and Grouping

You can also specify groups of characters to be matched by a regular expression. For example:

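([A-Z][a-z]) matches a capital letter followed by a lower case letter, and the parentheses treat that pair as a single group. On its own it can match anywhere in the input; to insist that the input starts with the pair, anchor it as ^([A-Z][a-z]).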
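
Here is a short Python sketch of that group, with and without a leading ^ anchor. The helper first_pair and the sample strings are assumptions made up for this example.

```python
import re

unanchored = re.compile(r"([A-Z][a-z])")
anchored = re.compile(r"^([A-Z][a-z])")


def first_pair(regex, text):
    """Return the text captured by the group, or None if nothing matches."""
    m = regex.search(text)
    return m.group(1) if m else None


print(first_pair(unanchored, "hello World"))  # the pair is found mid-string
print(first_pair(anchored, "hello World"))    # no match: input starts with "h"
print(first_pair(anchored, "World"))          # the pair is at the very start
```

The parentheses are what make group(1) available: whatever the bracketed part matched can be pulled out afterwards.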

Clear as mud eh ;)
