What are Regular Expressions?
A regular expression is a pattern that describes a certain amount of text. It is also known as a regex or regexp. Regular expressions can be used to validate user input such as email addresses, to find a specific string in a text, and so on. They can help reduce the number of lines of code in your program. Languages such as Java, Perl, PHP, C and C++ support regex, either natively or through libraries. A few text editors support regex as well. I'm using the online tool mentioned below:
Online Tool : http://regexr.com/
Some characters have a special meaning in regex. If we want to match those characters as literal text, we have to put a backslash ‘\’ in front of them to remove that special meaning.
There are also special character sequences for putting non-printable characters in your regular expression, such as:
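Here is a minimal sketch of escaping in Python's re module. The sample text and the variable names are my own, chosen only for illustration.

```python
import re

text = "Price: 1+1=2 (approx.)"

# Unescaped, + means "one or more of the preceding token", so the
# pattern 1+1 looks for "11" and finds nothing in this text.
unescaped = re.search(r"1+1", text)

# With a backslash, + loses its special meaning and matches a literal plus.
literal = re.search(r"1\+1", text)
print(literal.group())  # 1+1

# re.escape adds the backslashes for us when the text comes from elsewhere.
escaped_pattern = re.escape("1+1=2")
from_escape = re.search(escaped_pattern, text)
print(from_escape.group())  # 1+1=2
```
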
- tab : \t (ASCII 0x09)
- carriage return : \r (ASCII 0x0D)
- line feed : \n (ASCII 0x0A)
- bell : \a (ASCII 0x07)
- escape : \e (ASCII 0x1B)
- form feed : \f (ASCII 0x0C)
- vertical tab : \v (ASCII 0x0B)
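A small sketch of matching non-printable characters, assuming Python's re module. The sample strings are made up. Note that Python's re does not accept \e, so the hexadecimal form \x1b is used for the escape character here.

```python
import re

line = "name\tvalue\r\n"

# \t and the hexadecimal escape \x09 both match the tab character.
tab_short = re.search(r"\t", line)
tab_hex = re.search(r"\x09", line)
print(tab_short.span(), tab_hex.span())  # (4, 5) (4, 5)

# Carriage return followed by line feed at the end of the line.
crlf = re.search(r"\r\n", line)

# Python's re has no \e shorthand, so escape is written as \x1b.
esc = re.search(r"\x1b", "\x1b[0m")
```
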
Regular Expression Engines
There are two types of regex engines:
- Regex-directed engines
- Text-directed engines
To find out which type an engine is, apply the regular expression regex|regex not to the string regex not. If the match is only regex, the engine is a regex-directed engine. If the match is regex not, it is a text-directed engine.
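The test can be tried in a couple of lines. This sketch uses Python's re module, which behaves as a regex-directed engine here.

```python
import re

# Apply the regex "regex|regex not" to the string "regex not".
result = re.search(r"regex|regex not", "regex not")

# A regex-directed engine takes the first alternative that works.
print(result.group())  # regex
```
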
Regex-directed engines always return the leftmost match. Let's see how one finds a match. Take the string He captured a catfish for his cat. and the regular expression cat. The engine first takes the first character of the regex, c, and compares it with the first character of the string, H. No match. It then tries the second character of the string, and so on. The fourth character of the string matches the first character of the regex, so the engine moves on to the second character of the regex, a, and compares it with the fifth character of the string, which matches too. The engine then takes the third character of the regex, t, and compares it with the sixth character of the string, p. This fails. So the engine starts again with the first character of the regex at the fifth character of the string. At the fifteenth character it succeeds: cat matches the cat in catfish. The engine returns this result immediately, without looking further for any better match.
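The walk-through above can be sketched in Python. The helper first_literal_match is a hypothetical, heavily simplified stand-in for an engine: it only handles literal patterns and simply tries each start position from left to right.

```python
import re

text = "He captured a catfish for his cat."

def first_literal_match(pattern, subject):
    """Try every start position, left to right, and stop at the first hit."""
    for start in range(len(subject)):
        if subject.startswith(pattern, start):
            return start
    return -1

# Python counts from 0, so index 14 is the fifteenth character.
manual_start = first_literal_match("cat", text)
engine_start = re.search(r"cat", text).start()
print(manual_start, engine_start)  # 14 14

# The standalone word "cat" is only found if we ask for all matches.
all_starts = [m.start() for m in re.finditer(r"cat", text)]
print(all_starts)  # [14, 30]
```
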
- gr[ae]y matches both gray and grey. Inside the square brackets [ ] we can put letters, numbers and special characters, which are tried one by one. The set matches exactly one character, not all of them at once. The order of the characters inside the set does not matter.
- [0-9] uses a hyphen to match a range of characters.
- We can also use more than one set in the same regex.
- ^ placed right after the opening bracket, as in [^0-9], matches any character except the ones given after the ^ symbol. Spaces, tabs, newlines and so on are also matched in this case.
- \w and \d are shorthand for [a-zA-Z0-9_] and [0-9].
- \s matches whitespace: [\ \t] (space or tab) and, in most regex flavors, line breaks as well.
- \D \W \S are the negated versions of \d \w \s.
- + repeats the preceding token one or more times.
- ^ matches at the start of a line.
- $ matches at the end of a line.
- We can combine these to validate that an input is a number, for example with ^[0-9]+$.
- Leading and trailing whitespace can be matched with a regex such as ^\s+|\s+$.
- \b matches at a word boundary.
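The points in this list can be tried out with a short Python sketch. The sample strings, the helper names is_number and trim, and the patterns ^[0-9]+$ and ^\s+|\s+$ are my own assumptions for illustration, not the only way to write these checks.

```python
import re

# A set matches exactly one of the listed characters.
colours = re.findall(r"gr[ae]y", "gray grey griy")   # ['gray', 'grey']

# A range, and a negated set (which also matches the space).
digits = re.findall(r"[0-9]", "a1b22")               # ['1', '2', '2']
non_digits = re.findall(r"[^0-9]", "a1 b2")          # ['a', ' ', 'b']

# Shorthand classes combined with +, which repeats one or more times.
words = re.findall(r"\w+", "foo_1 bar")              # ['foo_1', 'bar']
numbers = re.findall(r"\d+", "a1b22")                # ['1', '22']
chunks = re.findall(r"\S+", "a b\tc")                # ['a', 'b', 'c']

def is_number(value):
    """Hypothetical validator: ^ and $ force the digits to fill the input.
    In Python, $ also matches just before a final newline."""
    return re.search(r"^[0-9]+$", value) is not None

def trim(value):
    """Hypothetical helper: remove leading and trailing whitespace."""
    return re.sub(r"^\s+|\s+$", "", value)

# \b keeps "cat" from matching inside "catfish" or "concat".
whole_words = re.findall(r"\bcat\b", "cat catfish concat cat.")

print(is_number("123"), is_number("12a"))  # True False
print(repr(trim("  hi there \t")))         # 'hi there'
print(whole_words)                         # ['cat', 'cat']
```
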
I hope you now have a basic idea of regular expressions. See you soon with another post on regular expressions. Thank you!