Email validation with regular expressions

Introduction

Nowadays almost every website has some kind of HTML form. The best known are user registration forms, information request forms and so on. Since a form is only useful if the visitor submits valid information, the site developer should take care of data validity.

The form validation process can be divided into 2 categories.

  • Client side validation
  • Server side validation

Client side validation is mostly implemented in JavaScript. The advantage is that the check needs no network round trip, so it can be faster. The disadvantage is that the visitor can disable JavaScript in the browser and in that case can submit invalid data.

That’s the point where server side validation becomes important. Server side scripts cannot be influenced by the visitor, so you know that they will work in the same manner for all visitors.
The best solution is to apply both types of validation.


Validation techniques

From this point on I will focus only on the server side validation of an email address. In PHP you receive the form field values in the $_POST or in the $_GET array, depending on the method of the form. You must get the actual value to be checked from one of these arrays. Once you have the value in a variable you can run some validation routines on it.
Again, there are two main ways to check an email string:
  • String manipulation routines
  • Regular expressions
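
As a rough sketch of the first step, the helper below pulls the field out of the request array before any validation runs. The function name getSubmittedEmail and the field name email are assumptions made for illustration; use whatever name your own form defines.

```php
<?php
// Hypothetical helper: fetch one field from a request array such as
// $_POST or $_GET. Returns null when the field is missing or is not a
// plain string (for example when it was submitted as an array).
function getSubmittedEmail(array $request, string $field = 'email'): ?string
{
    if (!isset($request[$field]) || !is_string($request[$field])) {
        return null;
    }
    // Surrounding whitespace is almost always a typing accident.
    return trim($request[$field]);
}

// Typical call inside a form handler:
$email = getSubmittedEmail($_POST);
```

Returning null for a missing field lets the caller tell "nothing was submitted" apart from "something invalid was submitted".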


What to check

To write a validation routine that works well, you first need to clarify what is allowed and what is not. Let’s see what an email string should look like: test.user@demo.com
Now analyze this string a little. You should check the following points:

  • The string must contain one and only one ‘@’ character.
  • There must be at least one character before the ‘@’.
  • After the ‘@’ there must be a valid domain format with at least one ‘.’
  • The email cannot contain any invalid characters.
  • The total length of an email should be at least 6 characters (a@b.us).
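
To make the checklist concrete, here is a minimal sketch that applies each point with plain string functions. The function name passesBasicEmailChecks and the exact set of allowed characters are assumptions chosen for this example, not a complete definition of a valid address.

```php
<?php
// Hypothetical checker built only from string functions.
// Each block corresponds to one point of the checklist.
function passesBasicEmailChecks(string $email): bool
{
    // Total length must be at least 6 characters (a@b.us).
    if (strlen($email) < 6) {
        return false;
    }
    // One and only one '@' character.
    if (substr_count($email, '@') !== 1) {
        return false;
    }
    // At least one character before the '@'.
    $atPos = strpos($email, '@');
    if ($atPos === 0) {
        return false;
    }
    // The domain needs a '.' that is neither its first nor its last character.
    $domain = substr($email, $atPos + 1);
    $dotPos = strpos($domain, '.');
    if ($dotPos === false || $dotPos === 0 || substr($domain, -1) === '.') {
        return false;
    }
    // Assumed whitelist: letters, digits, '_', '-', '.' and '@'.
    $allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-.@';
    return strspn($email, $allowed) === strlen($email);
}
```

Even this short version needs five separate conditions, and it still accepts oddities such as two consecutive dots.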




Regular expressions

If you try to create a checker function with string manipulation routines such as strlen, strpos, … then it will result in a quite complicated if-else structure.
Another solution is to use regular expressions. In this case you define a pattern that the relevant string must match. For example: the pattern is '^Test' and the subject is 'Test string'. This will pass, as the pattern matches every string which begins with the substring 'Test'.
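
The '^Test' example can be tried directly in PHP. This sketch assumes the preg_match function, which expects the pattern to be wrapped in delimiters (a pair of slashes here).

```php
<?php
// preg_match() returns 1 on a match, 0 on no match and false on error.
$pattern = '/^Test/';

var_dump(preg_match($pattern, 'Test string') === 1);   // bool(true)
var_dump(preg_match($pattern, 'A Test string') === 1); // bool(false)
```

The second subject fails because '^' ties the match to the very first character of the string.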

Quick syntax overview

^ start of string
$ end of string
? zero or one of the preceding character
* zero or more of the preceding character
+ one or more of the preceding character
. any single character

[a-z] letters a-z inclusive in lower case
[A-Z] letters A-Z inclusive in upper case
[0-9] digits 0-9 inclusive
[^0-9] any single character that is not a digit 0-9
{2} exactly 2 of the preceding character
{2,} 2 or more of the preceding character
{2,4} 2-4 of the preceding character
(a|b) a OR b
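
A few of these building blocks in action, as a small sketch. The helper name matches and the sample strings are made up for illustration; preg_match is assumed as the matching function.

```php
<?php
// Hypothetical helper: true when $subject matches $pattern.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/^colou?r$/', 'color'));     // '?'    : the 'u' is optional
var_dump(matches('/^[0-9]+$/', '2024'));       // '+'    : one or more digits
var_dump(matches('/^[a-z]{2,4}$/', 'abcde'));  // {2,4}  : five letters is one too many
var_dump(matches('/^(cat|dog)$/', 'dog'));     // (a|b)  : either alternative
var_dump(matches('/^[^0-9]*$/', 'no digits')); // [^0-9] : anything except digits
```

Only the third call fails, because the subject is longer than the quantifier allows.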



The email pattern

The pattern for the email validation is a little bit more complicated.

^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$


Oops, it is really quite complex. I will try to explain it. Let’s divide the pattern into smaller parts. The first part runs up to the ‘@’ character.

^[_a-z0-9-]+(\.[_a-z0-9-]+)*


Explanation:

-- ^ means that we start the check from the first character of the string.
-- [_a-z0-9-]+ : There must be at least 1 character that is a letter a-z, a digit 0-9, ‘_’ or ‘-’.
-- (\.[_a-z0-9-]+)* : The first character group is followed by 0 or more character groups, each of which begins with a ‘.’
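
To see the first part at work on its own, the sketch below runs it against a few sample strings. The trailing '$' and the i modifier are added here only so that this fragment can be tested in isolation; the sample strings are invented.

```php
<?php
// First part of the email pattern (everything before the '@'),
// closed with '$' so that the whole sample has to match.
$localPart = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*$/i';

$samples = ['test.user', 'test_user-1', '.user', 'test..user', 'user.'];
foreach ($samples as $sample) {
    $ok = preg_match($localPart, $sample) === 1;
    echo str_pad($sample, 12), $ok ? 'accepted' : 'rejected', PHP_EOL;
}
```

The first two samples are accepted. The other three are rejected because every ‘.’ must sit between two non-empty character groups.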

Now try to interpret the second part:

@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$


Explanation:

-- The ‘@’ means that after the first part this character is mandatory exactly once.
-- [a-z0-9-]+ : As before, but without the ‘_’.
-- (\.[a-z0-9-]+)* : As before, but without the ‘_’.
-- (\.[a-z]{2,3})$ : This means that the email string must end with a 2 or 3 letter long substring, preceded by a mandatory ‘.’
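
The second part can be tried the same way. In this sketch the leading '^' and the i modifier are additions for the isolated test, and the sample domains are invented.

```php
<?php
// Second part of the email pattern (the '@' and the domain),
// anchored with '^' so that it can be tested without a first part.
$domainPart = '/^@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/i';

$samples = ['@demo.com', '@mail.demo.co.uk', '@demo', '@demo.c', '@demo_site.com', '@demo.info'];
foreach ($samples as $sample) {
    $ok = preg_match($domainPart, $sample) === 1;
    echo str_pad($sample, 18), $ok ? 'accepted' : 'rejected', PHP_EOL;
}
```

Only the first two samples are accepted. Note the last one: because of {2,3}, a four letter ending such as .info is rejected, so you may want to widen that range for real use.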

Implement in PHP

PHP has several built-in functions that support regular expressions. Here we use one of them, preg_match, together with the i modifier so that the match is case insensitive. (The older eregi function did the same job, but it was deprecated in PHP 5.3 and removed in PHP 7.0.)

Function : preg_match - perform a regular expression match
Usage : int|false preg_match ( string pattern, string subject [, array &matches [, int flags [, int offset]]]);

The following function tests whether an email string is valid. Later you can just call this function. It returns true if the email is valid and false otherwise. Let’s see the code:

<?php

// This function tests whether the email address is valid

function isValidEmail($email){
   // The pattern is wrapped in "/" delimiters; the trailing "i" makes it case insensitive
   $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/i';

   // preg_match returns 1 on a match, 0 on no match and false on error
   return preg_match($pattern, $email) === 1;
}
?>

It's quite easy, isn't it?
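
Before wiring the function into a form, it can help to run it against a handful of addresses from the command line. The sketch below repeats the function so that it runs on its own, and it assumes preg_match with the i modifier (eregi is not available in current PHP versions). The sample addresses are invented.

```php
<?php
// Same pattern as in the article, written for preg_match().
function isValidEmail(string $email): bool
{
    $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/i';
    return preg_match($pattern, $email) === 1;
}

$samples = [
    'test.user@demo.com',
    'Test.User@Demo.COM',   // upper case is fine thanks to the i modifier
    'a@b.us',
    'user@demo',            // no top-level part
    'user@@demo.com',       // two '@' characters
    '@demo.com',            // nothing before the '@'
];
foreach ($samples as $sample) {
    echo str_pad($sample, 22), isValidEmail($sample) ? 'valid' : 'invalid', PHP_EOL;
}
```

The first three addresses are reported as valid and the last three as invalid.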
Now we have the validation code. Let’s make a small test form to see it in action:

<?php

// This function tests whether the email address is valid

function isValidEmail($email){
   // preg_match with the "i" modifier performs the case insensitive match
   // (the older eregi function was removed in PHP 7.0)
   $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/i';

   return preg_match($pattern, $email) === 1;
}
?>

<html>
   <body>
      <form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post" name="emailForm">
        Email: <input type="text" name="email"><br/>
        <input type="submit" name="submitemail">
      </form>

      <?php
         if (isset($_POST['submitemail'], $_POST['email']) && is_string($_POST['email']))
         {
            // Escape the submitted value before echoing it back into the page
            $email = htmlspecialchars($_POST['email']);

            if (isValidEmail($_POST['email'])){
                echo "The email: ".$email." is valid! ";
            }
            else{
                echo "The email: ".$email." is invalid! ";
            }
         }
      ?>
   </body>
</html>
