10+ regular expressions for efficient web development

by Jean-Baptiste Jung

Regular expressions are a very useful tool for validating, searching, and matching text patterns. In this article, I have compiled more than 10 regular expressions that can be used in most languages with little or no modification and that should come in handy in everyday web development.

Validate a URL

Is a particular URL valid? The following regexp will let you know.

/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \?=.-]*)*\/?$/

Source: http://snipplr.com/view/19502/validate-a-url/
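As a rough sketch, here is how the pattern could be used from Python. The function name looks_like_url and the case-insensitive flag are assumptions made for illustration; they are not part of the original snippet.

```python
import re

# Pattern copied from above. re.IGNORECASE is an assumption: the snippet
# does not say whether upper-case host names should be accepted.
URL_RE = re.compile(
    r"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \?=.-]*)*\/?$",
    re.IGNORECASE,
)


def looks_like_url(candidate: str) -> bool:
    """Return True when the string matches the URL pattern."""
    return URL_RE.match(candidate) is not None


for sample in ("http://www.example.com/blog/", "example.com", "ftp://example.com"):
    print(sample, looks_like_url(sample))
```

The first two samples are accepted. The third is rejected because the pattern only knows about the http and https schemes.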

Validate US phone number

This regexp will verify that a US phone number is valid.

/^(\+\d)*\s*(\(\d{3}\)\s*)*\d{3}(-{0,1}|\s{0,1})\d{2}(-{0,1}|\s{0,1})\d{2}$/

Source: http://snippets.dzone.com/posts/show/597

Test if a password is strong

Weak passwords are one of the quickest ways to get hacked. The following regexp will make sure that:

  • Passwords will contain at least (1) upper case letter
  • Passwords will contain at least (1) lower case letter
  • Passwords will contain at least (1) number or special character
  • Passwords will contain at least (8) characters in length
  • Password maximum length should not be arbitrarily limited

(?=^.{8,}$)((?=.*\d)|(?=.*\W+))(?![.\n])(?=.*[A-Z])(?=.*[a-z]).*$

Source: http://imar.spaanjaars.com/QuickDocId.aspx?quickdoc=297
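The rules listed above map onto the lookaheads in the pattern, one rule per lookahead. A minimal Python sketch (the name is_strong is hypothetical) makes it easy to try a few candidates:

```python
import re

# Pattern copied from above; each lookahead enforces one rule:
#   (?=^.{8,}$)            at least 8 characters
#   ((?=.*\d)|(?=.*\W+))   at least one digit or one non-word character
#   (?=.*[A-Z])            at least one upper case letter
#   (?=.*[a-z])            at least one lower case letter
STRONG_RE = re.compile(
    r"(?=^.{8,}$)((?=.*\d)|(?=.*\W+))(?![.\n])(?=.*[A-Z])(?=.*[a-z]).*$"
)


def is_strong(password: str) -> bool:
    """Return True when the password satisfies every rule in the pattern."""
    return STRONG_RE.match(password) is not None


for candidate in ("Passw0rd!", "Correct-Horse", "password", "Sh0rt", "NoDigitsHere"):
    print(candidate, is_strong(candidate))
```

Keep in mind that the underscore counts as a word character, so it does not satisfy the "special character" rule on its own.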

Get code within <?php and ?>

If for some reason you need to grab all the code contained within the <?php and ?> tags, this regexp will do the job:

<\?[php]*([^\?>]*)\?>

Source: http://snipplr.com/view/12845/get-all-the-php-code-between/
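Here is a small Python sketch of the pattern in action. Because the capture group excludes both ? and >, it finds nothing when the PHP code itself contains one of those characters. The second pattern is a hypothetical variant, not taken from the source, that uses a lazy match instead.

```python
import re

# Pattern copied from above.
PHP_BLOCK_RE = re.compile(r"<\?[php]*([^\?>]*)\?>")

# Hypothetical variant (an assumption, not from the original source):
# a lazy match with DOTALL, so '>' and '?' may appear inside the code.
PHP_BLOCK_LAZY_RE = re.compile(r"<\?(?:php)?(.*?)\?>", re.DOTALL)

simple = "<p>Hi</p><?php echo 'a'; ?><?php echo 'b'; ?>"
tricky = "<?php if ($a > 1) { echo 1; } ?>"

print(PHP_BLOCK_RE.findall(simple))       # [" echo 'a'; ", " echo 'b'; "]
print(PHP_BLOCK_RE.findall(tricky))       # []
print(PHP_BLOCK_LAZY_RE.findall(tricky))  # [' if ($a > 1) { echo 1; } ']
```

The lazy variant has its own limit: a ?> inside a PHP string literal would end the match early, so neither pattern replaces a real tokenizer.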

Match tel: URLs

In a recent post, I showed you how you can use iPhone special link prefixes to automatically call someone.
This regular expression will match those tel: URLs.

^tel:((?:\+[\d().-]*\d[\d().-]*|[0-9A-F*#().-]*[0-9A-F*#][0-9A-F*#().-]*(?:;[a-z\d-]+(?:=(?:[a-z\d\[\]\/:&+$_!~*'().-]|%[\dA-F]{2})+)?)*;phone-context=(?:\+[\d().-]*\d[\d().-]*|(?:[a-z0-9]\.|[a-z0-9][a-z0-9-]*[a-z0-9]\.)*(?:[a-z]|[a-z][a-z0-9-]*[a-z0-9])))(?:;[a-z\d-]+(?:=(?:[a-z\d\[\]\/:&+$_!~*'().-]|%[\dA-F]{2})+)?)*(?:,(?:\+[\d().-]*\d[\d().-]*|[0-9A-F*#().-]*[0-9A-F*#][0-9A-F*#().-]*(?:;[a-z\d-]+(?:=(?:[a-z\d\[\]\/:&+$_!~*'().-]|%[\dA-F]{2})+)?)*;phone-context=\+[\d().-]*\d[\d().-]*)(?:;[a-z\d-]+(?:=(?:[a-z\d\[\]\/:&+$_!~*'().-]|%[\dA-F]{2})+)?)*)*)$

Source: http://tools.ietf.org/html/rfc3966#section-3

Validate US zip code

When building a registration form, it is common to ask for the user’s zip code. As forms are often boring, there’s a strong chance that the user will enter false data. This regular expression will make sure they entered a correctly formatted American zip code.

^[0-9]{5}(-[0-9]{4})?$

Source: http://reusablecode.blogspot.com/2008/08/isvalidzipcode.html
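A short Python sketch, with the hypothetical helper name is_zip. One detail worth knowing: in Python, $ also matches just before a trailing newline, so the sketch uses fullmatch to be strict. That strictness is an assumption about what a form handler would want.

```python
import re

# Pattern copied from above: five digits, optionally followed by -NNNN.
ZIP_RE = re.compile(r"^[0-9]{5}(-[0-9]{4})?$")


def is_zip(value: str) -> bool:
    """Return True for 12345 or 12345-6789, and nothing else."""
    # fullmatch requires the pattern to consume the entire string,
    # which rules out a stray trailing newline.
    return ZIP_RE.fullmatch(value) is not None


for sample in ("90210", "90210-1234", "9021", "90210-12"):
    print(sample, is_zip(sample))
```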

Validate Canadian postal code

This regexp is very similar to the previous one, but it will match Canadian postal codes instead.

^[ABCEGHJ-NPRSTVXY]{1}[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[ ]?[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[0-9]{1}$

Source: http://reusablecode.blogspot.com/2008/08/isvalidpostalcode.html
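The pattern only accepts upper case letters, so user input usually needs a little cleanup first. The sketch below assumes that trimming whitespace and upper-casing is acceptable for your form; the helper name is_postal_code is hypothetical.

```python
import re

# Pattern copied from above.
POSTAL_RE = re.compile(
    r"^[ABCEGHJ-NPRSTVXY]{1}[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}"
    r"[ ]?[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[0-9]{1}$"
)


def is_postal_code(value: str) -> bool:
    """Normalize the input, then test it against the postal code pattern."""
    normalized = value.strip().upper()  # assumption: "k1a 0b1" should pass
    return POSTAL_RE.match(normalized) is not None


for sample in ("K1A 0B1", "k1a0b1", "D1A 0B1", "W1A 0B1"):
    print(sample, is_postal_code(sample))
```

The last two samples are rejected because D and W are not in the set of allowed first letters.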

Grab unclosed img tags

As you probably know, the XHTML standard requires all tags to be properly closed. This regular expression will search for unclosed img tags. It could easily be modified to grab any other unclosed HTML tags.

<img([^>]+)(\s*[^\/])>

Source: http://snipplr.com/view/6632/grab-any-unclosed-xhtml-img-tags/
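As an illustration, the Python sketch below lists the unclosed img tags in a string and then closes them with a substitution. The replacement string is an assumption about the desired output, not something taken from the source.

```python
import re

# Pattern copied from above.
UNCLOSED_IMG_RE = re.compile(r"<img([^>]+)(\s*[^\/])>")

html = '<img src="a.png"> <img src="b.png" />'

# Only the first tag is reported; the second is already self-closed.
print([m.group(0) for m in UNCLOSED_IMG_RE.finditer(html)])
# ['<img src="a.png">']

# Re-emit both captured groups and append the closing slash.
fixed = UNCLOSED_IMG_RE.sub(r"<img\1\2 />", html)
print(fixed)
# <img src="a.png" /> <img src="b.png" />
```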

Find all CSS attributes

This regexp will find CSS attributes, such as background:red; or padding-left:25px;.

\s([a-zA-Z-]+)\s*[:]{1}\s*([a-zA-Z0-9\s.#]+)[;]{1}

Source: http://snipplr.com/view/17903/find-css-attributes/
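A minimal Python sketch of pulling property and value pairs out of a stylesheet. The pattern used here is an adaptation and not necessarily the original author's: it uses two plain capturing groups and allows zero or more whitespace characters before the colon.

```python
import re

# Adapted pattern (an assumption): group 1 is the property name,
# group 2 is the value, and whitespace around the colon is optional.
CSS_DECL_RE = re.compile(r"\s([a-zA-Z-]+)\s*:\s*([a-zA-Z0-9\s.#]+);")

css = "p { background:red; padding-left:25px; }"

# findall returns one (property, value) tuple per declaration.
declarations = CSS_DECL_RE.findall(css)
print(declarations)
# [('background', 'red'), ('padding-left', '25px')]
```

Values that contain other characters, such as url(...) or percentages, fall outside the value character class and are not picked up.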

Validate an IBAN

I recently worked on a banking application and this one was definitely a life-saver. It will verify that a given string has the format of an IBAN.

[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{4}[0-9]{7}([a-zA-Z0-9]?){0,16}

Source: http://snipplr.com/view/15322/iban-regex-all-ibans/
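The regexp only looks at the overall shape. An IBAN also carries two check digits, which are verified with a mod 97 calculation. The sketch below combines both steps in Python; the function name iban_is_valid is hypothetical, and stripping spaces before matching is an assumption about how the input arrives.

```python
import re

# Pattern copied from above; it only checks the overall shape.
IBAN_RE = re.compile(
    r"[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{4}[0-9]{7}([a-zA-Z0-9]?){0,16}"
)


def iban_is_valid(iban: str) -> bool:
    """Check the shape with the regexp, then the mod 97 checksum."""
    compact = iban.replace(" ", "").upper()
    # The pattern has no anchors, so fullmatch is used to cover the whole string.
    if IBAN_RE.fullmatch(compact) is None:
        return False
    # Move the country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # Replace letters with numbers: A=10, B=11, ..., Z=35.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    # A valid IBAN leaves a remainder of 1.
    return int(digits) % 97 == 1


print(iban_is_valid("GB82 WEST 1234 5698 7654 32"))  # True
print(iban_is_valid("GB82 WEST 1234 5698 7654 33"))  # False
```

The second example has the right shape, so the regexp alone would accept it; only the checksum catches the changed digit.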

Validate a BIC code

Another one that is very useful for any banking application or website: this regexp will validate a BIC code.

([a-zA-Z]{4}[a-zA-Z]{2}[a-zA-Z0-9]{2}([a-zA-Z0-9]{3})?)

Source: http://snipplr.com/view/15320/bic-bank-identifier-code-regex/

If you’re interested in regular expressions, make sure you have read our “15 PHP regular expression for developers” post.

Comments (24)

  1. Ethan Gardner said:

    One I use a lot is ="[^"]*["] to get attribute values if I’m trying to clean up HTML. For example, you could use style="[^"]*["] to detect any element with an inline style and replace both the inline style attribute and the attribute value with the find/replace function in your IDE.

  2. Michael said:

    The URL expression incorrectly rejects a number of valid URLs. For example, it rejects a URL where you use the IP address instead of a hostname. It rejects a URL that specifies a port number. It rejects any URL that is not http or https.

    There is an RFC that defines what makes a valid URL. I recommend using that as a specification to guide the development of your regular expression.

    Based on that, I have to wonder about the accuracy of the other regular expressions in the list. I would rate this list as Not Recommended.

  3. Jonathan Allen said:

    Ugh. Those don’t even begin to consider validity, they just determine if the pattern is remotely plausible. If your intent is to give meaningful feedback to the user you need to consider the checksum.

  4. Jonathan Allen said:

    @Jenna Molby

    It is impossible to validate an email address using regular expressions alone. Fortunately you should be able to find a real email validation function in most platforms. Unfortunately many of them are also broken, just less so.

  5. jasha said:

    What do you guys mean when you say it is a problem to validate emails with regex? I guess all websites do that. Can you be more specific?

  6. Jason S said:

    Regexes are great, and I use them a lot, but sometimes a special-purpose parser is best. For example, in Python there’s the urlparse module for URLs, which I think is more convenient than a regex (it also lets you access each part of the url as a named attribute, e.g. parsed_url.scheme, parsed_url.path).

    Long ago, I came across what was claimed to be an RFC-compliant regex for e-mail addresses. It was hundreds and hundreds of characters long. Can’t find it anymore.
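For reference, here is a minimal sketch of the parser approach described in this comment. In Python 3 the urlparse function lives in urllib.parse. The helper is_http_url is a hypothetical name, and the rule it applies (http or https scheme plus a non-empty host) is an assumption about what "valid" should mean.

```python
from urllib.parse import urlparse


def is_http_url(candidate: str) -> bool:
    """Accept only http(s) URLs that have a host component."""
    parsed = urlparse(candidate)
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)


# Each part of the URL is available as a named attribute.
parsed_url = urlparse("http://192.168.0.1:8080/index.html?page=2")
print(parsed_url.scheme)    # http
print(parsed_url.hostname)  # 192.168.0.1
print(parsed_url.port)      # 8080
print(parsed_url.path)      # /index.html
print(parsed_url.query)     # page=2

print(is_http_url("http://192.168.0.1:8080/index.html?page=2"))  # True
print(is_http_url("ftp://example.com"))                          # False
```

IP addresses and port numbers are handled by the parser without any extra work.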

  7. Hayden said:

    Wow! I’m not particular with any web development techniques yet. Thanks for posting it here. I now have another idea on how=)

    • EskiMag said:

      The URL validation regexp was falling into an infinite loop and causing 100% CPU load. I think a better version would be: /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\?&=.-]*)\/?$/

  8. Hendrik said:

    Maybe I am wrong, but doesn’t the URL-validating code fail to validate UTF-8 URLs (especially with umlauts and such)? So in my opinion “\w” would be a better choice than “\da-z”.

  9. Danny van Kooten said:

    Nice list, thanks!

    @Rory, try this regex to see if an email address has the right format (name@domain.extension).

    $regex = '/([a-z0-9_.-]+)@([a-z0-9.-]+){2,255}.([a-z]+){2,10}/i';

  10. Mickaël Wolff said:

    Regexps are not well suited to URL or e-mail validation. In PHP, there are better ways to check for a correct URL or e-mail: http://php.net/manual/en/function.filter-var.php
