Regular Expressions

What are Regular Expressions?

A regular expression is a pattern that describes a set of strings. It is also known as a regex or regexp. Regular expressions can be used to validate user input such as email addresses, to find a specific string in a text, and so on, and they can help reduce the number of lines of code in your program. Languages such as Java, Perl, PHP, C, and C++ support regex, either natively or through libraries. Some text editors support regex as well. I’m using the online tool mentioned below:

Online Tool : http://regexr.com/

Special Characters

Some characters have a special meaning in regex. If we want to match one of them as a literal character, we have to put a backslash ‘\’ in front of it to remove that special meaning.

  1. [
  2. ^
  3. (
  4. )
  5. \
  6. $
  7. .
  8. |
  9. ?
  10. *
  11. +
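
To see escaping in action, here is a small sketch using Python's re module. Python and the sample strings are my own choice for illustration; any regex flavor that uses backslash escapes should behave similarly.

```python
import re

sample = "1.5 and 1x5"  # made-up sample text

# Unescaped: "." is a special character and matches any single character.
loose = re.findall(r"1.5", sample)

# Escaped: "\." matches only a literal dot.
strict = re.findall(r"1\.5", sample)

# re.escape() adds the needed backslashes to a literal string for us.
escaped_pattern = re.escape("1+1=2")

print(loose)   # ['1.5', '1x5']
print(strict)  # ['1.5']
print(re.fullmatch(escaped_pattern, "1+1=2") is not None)  # True
```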

Non-printable Characters

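There are also special character sequences for putting non-printable characters into your regular expression. The hexadecimal code of each character is listed too; most flavors let you write it inside the pattern as \xHH (for example \x09), and support for \a and \e varies between flavors.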

  • tab : \t or 0x09
  • carriage return : \r or 0x0D
  • line feed : \n or 0x0A
  • bell : \a or 0x07
  • escape : \e or 0x1B
  • form feed : \f or 0x0C
  • vertical tab : \v or 0x0B
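
As a quick check, the following Python sketch (Python is my assumption here, and the sample text is made up) shows that the escape sequence and the hexadecimal form find the same character.

```python
import re

text = "name\tvalue\r\n"  # made-up sample containing a tab and a CR/LF pair

# \t and \x09 both describe the tab character.
tab_by_escape = re.search(r"\t", text)
tab_by_hex = re.search(r"\x09", text)
print(tab_by_escape.start(), tab_by_hex.start())  # 4 4

# \r\n matches a carriage return followed by a line feed.
has_crlf = re.search(r"\r\n", text) is not None
print(has_crlf)  # True
```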

Regular Expression Engines

There are two types of regex engines,

  1. Regex directed engines
  2. Text directed engines

To find out which type an engine is, apply the regular expression regex|regex not to the string regex not. If the match is only regex, the engine is a regex-directed engine. If the match is regex not, it is a text-directed engine.

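The same test can be run in code. This is a minimal sketch with Python's re module, which I picked only for illustration; it behaves as a regex-directed engine here.

```python
import re

# Apply the regex "regex|regex not" to the string "regex not".
match = re.search(r"regex|regex not", "regex not")

# A regex-directed engine stops at the first alternative that succeeds,
# even though the second alternative would match more text.
print(match.group())  # regex
```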

Regex-directed engines always return the leftmost match. Let's see how the engine finds it. Take the string He captured a catfish for his cat. and the regular expression cat. The engine first takes the first character of the regex, c, and compares it with the first character of the string, H. That fails, so it tries the second character of the string, and so on. The fourth character of the string, c, matches the first character of the regex, so the engine moves on to the second character of the regex, a, and compares it with the fifth character of the string, which matches too. The engine then compares the third character of the regex, t, with the sixth character of the string, p, and that fails. So the engine starts over with the first character of the regex at the fifth character of the string. When it reaches the fifteenth character, the attempt succeeds: cat matches the start of catfish. The engine returns this result immediately, without looking any further for another, possibly better, match.

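The walkthrough above can be reproduced with a short Python sketch (Python is my assumption; the string and regex are the ones from the paragraph).

```python
import re

text = "He captured a catfish for his cat."
match = re.search(r"cat", text)

# Python counts from 0, so index 14 is the fifteenth character.
print(match.start())  # 14

# The engine reports the leftmost match (inside "catfish") first,
# although the standalone word "cat" appears later in the string.
all_matches = re.findall(r"cat", text)
print(len(all_matches))  # 2
```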

Examples

 

  • gr[ae]y matches both gray and grey. Inside [] we can list letters, digits, and special characters. The set matches exactly one character from that list, not all of them together. The order of the characters inside the brackets does not matter.

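Here is a hedged Python sketch of the same pattern. The test string is made up for illustration.

```python
import re

text = "gray grey graey gry"  # made-up sample

# Exactly one of "a" or "e" must appear between "gr" and "y".
matches = re.findall(r"gr[ae]y", text)
print(matches)  # ['gray', 'grey']

# The order inside the brackets makes no difference.
swapped = re.findall(r"gr[ea]y", text)
print(swapped)  # ['gray', 'grey']
```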

  • [0-9] uses a hyphen to match a range of characters, here any single digit.


  • We can also use more than one set in a pattern, for example [0-9][A-Z], or combine several ranges in one set, such as [0-9a-f].

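Ranges and multiple sets can be tried out with a sketch like this one. It uses Python, and the sample text and patterns are my own examples, not the ones from the original screenshots.

```python
import re

text = "7B x9f"  # made-up sample

# A single range: any one digit.
digits = re.findall(r"[0-9]", text)
print(digits)  # ['7', '9']

# Two sets in a row: a digit followed by an upper-case letter.
digit_then_upper = re.findall(r"[0-9][A-Z]", text)
print(digit_then_upper)  # ['7B']

# Several ranges in one set: any lower-case hexadecimal digit.
hex_digits = re.findall(r"[0-9a-f]", text)
print(hex_digits)  # ['7', '9', 'f']
```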

  • A ^ placed right after the opening [ negates the set: it matches any character except the ones listed after the ^ symbol. Spaces, tabs, newlines, and so on also count as matches in this case.

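A small Python sketch of a negated set follows. The sample text is made up and deliberately contains a space and a tab.

```python
import re

text = "a1 b2\tc3"  # made-up sample with a space and a tab

# [^0-9] matches every character that is not a digit,
# including the space and the tab.
non_digits = re.findall(r"[^0-9]", text)
print(non_digits)  # ['a', ' ', 'b', '\t', 'c']
```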

  • \w and \d are shorthand for [a-zA-Z0-9_] and [0-9].


  • \s matches a whitespace character: a space or a tab and, in most flavors, line breaks such as \r and \n as well.


  • \D \W \S are the negated versions of \d \w  \s.

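The shorthand classes and their negated versions can be compared side by side. This is a sketch in Python with made-up sample text. Note that in Python 3, \w and \d on normal strings are Unicode-aware and so match more than the ASCII sets shown above unless the re.ASCII flag is used.

```python
import re

text = "a_1 !"  # made-up sample

digits = re.findall(r"\d", text)
word_chars = re.findall(r"\w", text)
spaces = re.findall(r"\s", text)
print(digits)      # ['1']
print(word_chars)  # ['a', '_', '1']
print(spaces)      # [' ']

# The upper-case versions match everything the lower-case ones do not.
non_digits = re.findall(r"\D", text)
non_word = re.findall(r"\W", text)
non_space = re.findall(r"\S", text)
print(non_digits)  # ['a', '_', ' ', '!']
print(non_word)    # [' ', '!']
print(non_space)   # ['a', '_', '1', '!']

# In Python, \s also matches line breaks, not only space and tab.
whitespace = re.findall(r"\s", "a b\tc\nd")
print(whitespace)  # [' ', '\t', '\n']
```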

  • + repeats the preceding token one or more times. For example, [0-9]+ matches a whole run of digits instead of a single digit.

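The effect of + is easiest to see by running the same set with and without it. This is a Python sketch on made-up sample text.

```python
import re

text = "a1 bb22 ccc333"  # made-up sample

# Without +, every digit is a separate match.
single = re.findall(r"[0-9]", text)
print(single)    # ['1', '2', '2', '3', '3', '3']

# With +, each run of one or more digits is one match.
repeated = re.findall(r"[0-9]+", text)
print(repeated)  # ['1', '22', '333']
```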

  • Outside a set, ^ matches at the start of the text, or at the start of every line when the multiline option is on.


  • $ matches at the end of the text, or at the end of every line when the multiline option is on.

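Here is a sketch of both anchors in Python. The sample text is made up. In Python's re module, the per-line behaviour is switched on with the re.MULTILINE flag; other tools may enable it differently or by default.

```python
import re

text = "one cat\ncat two\nthree"  # made-up three-line sample

# Without the flag, ^ and $ only anchor to the whole text.
start_default = re.findall(r"^cat", text)
end_default = re.findall(r"cat$", text)
print(start_default, end_default)  # [] []

# With re.MULTILINE, they anchor to each line.
start_multi = re.findall(r"^cat", text, re.MULTILINE)
end_multi = re.findall(r"cat$", text, re.MULTILINE)
print(start_multi, end_multi)      # ['cat'] ['cat']
```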

  • We can combine ^ and $ with a set to validate that an input is a number, for example ^[0-9]+$.

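A possible way to validate numeric input is sketched below in Python. The pattern ^[0-9]+$ and the helper name is_number are my own illustration, not necessarily the regex from the original screenshot.

```python
import re

def is_number(value: str) -> bool:
    """Return True if value consists only of the digits 0-9 (hypothetical helper)."""
    # ^ and $ force the digits to span the whole input.
    # Note: in Python, $ also matches just before a final newline;
    # re.fullmatch(r"[0-9]+", value) is stricter if that matters.
    return re.search(r"^[0-9]+$", value) is not None

print(is_number("2016"))   # True
print(is_number("20a16"))  # False
print(is_number(""))       # False
```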

  • Leading and trailing white space can be matched with a regex such as ^\s+|\s+$.

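One way to use such a pattern is to strip the white space from both ends of a string. This Python sketch assumes the pattern ^\s+|\s+$, which may differ from the one in the original screenshot.

```python
import re

raw = "  \thello world \n"  # made-up sample with white space on both ends

# Replace leading (^\s+) or trailing (\s+$) white space with nothing.
trimmed = re.sub(r"^\s+|\s+$", "", raw)
print(repr(trimmed))  # 'hello world'
```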

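  • \b matches at a word boundary, that is, the position between a word character (\w) and a non-word character, or at the start or end of the text.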

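To illustrate word boundaries, this Python sketch reuses the catfish sentence from the engine walkthrough. The pattern \bcat\b is my own example.

```python
import re

text = "He captured a catfish for his cat."

# Plain "cat" also matches inside "catfish".
plain = re.findall(r"cat", text)
print(plain)  # ['cat', 'cat']

# With \b on both sides, only the standalone word matches.
whole_word = re.search(r"\bcat\b", text)
print(whole_word.start())  # 30
```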

I hope you now have a basic idea of regular expressions. See you soon with another post on regular expressions. Thank you!

 
