Email Validation to RFC Specification with PHP

I finally came up with a script that properly checks whether an email address is valid against the relevant RFC specifications.

This includes:

Quoted pairs
(special characters that are not normally allowed, but are escaped with a backslash)
Example:
this\@email@thenetgen.com
Quoted pairs are technically valid. Even though they are obsolete, if someone has an email like the one above, this validator will let them use it!
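
Here is a minimal sketch of how a quoted pair can be recognised on its own. The helper name IsPlainOrQuotedPairs and the two character lists are assumptions made for illustration; they are not taken from the validator further down.

```php
<?php
// Hypothetical helper: true if $local consists only of plain characters
// and quoted pairs (a backslash followed by one special character).
function IsPlainOrQuotedPairs($local)
{
    $plain      = '[a-zA-Z0-9!#%&=_`~\/\'^$*+?{|}.\-]'; // assumed unquoted set
    $quotedPair = '\\\\[\\\\ ()<>\[\]:;@,."]';           // backslash + one special character
    return preg_match('/^(?:' . $plain . '|' . $quotedPair . ')+$/', $local) === 1;
}

var_dump(IsPlainOrQuotedPairs('this\@email')); // true: the @ is escaped
var_dump(IsPlainOrQuotedPairs('this@email'));  // false: a bare @ is not allowed
```

Note that inside a single-quoted PHP string, \@ stays a literal backslash followed by @, which is exactly what the pattern expects.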

Quoted special characters
(any special characters that are not normally allowed, but are between quotes, like a space)
Example:
this+"is @ test: wow;"+em\@ail@the-net-gen8.com
This is also allowed, as long as the “weird” characters are within double quotes. Note that a valid email will escape any double quote that is to be taken literally.

Example 1: this\"quote@the-net-gen8.ca is valid because the double quote is a “quoted pair”.
Example 2: this"quote \"within @ weird "@the-net-gen8.ca is also valid. The second quote does not end the quoted string, since it is escaped with a backslash.
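
A quoted string on its own can be matched roughly as follows. This is only a sketch: the helper name IsQuotedString is hypothetical, and limiting the contents to printable ASCII (char codes 32 to 126) is my assumption.

```php
<?php
// Hypothetical helper: true if $s is one complete quoted string, i.e.
// printable ASCII between double quotes where a literal " or \ is
// preceded by a backslash.
function IsQuotedString($s)
{
    // [\x20\x21\x23-\x5B\x5D-\x7E] = printable ASCII except " and \
    $pattern = '/^"(?:[\x20\x21\x23-\x5B\x5D-\x7E]|\\\\[\x20-\x7E])*"$/';
    return preg_match($pattern, $s) === 1;
}

var_dump(IsQuotedString('"is @ test: wow;"'));         // true
var_dump(IsQuotedString('"quote \"within @ weird "')); // true: \" does not end the string
var_dump(IsQuotedString('"unescaped " quote"'));       // false: the middle quote ends the string early
```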

The only thing it does not cover is IP addresses in place of domain names (usually used by spammers), such as email@[123.234.212.112]. Call me lazy or biased, but adding this to the regular expression pattern would be a breeze. I already did the hard work for you.
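
If you do want to accept address literals, one possible approach is to test the domain part separately. The sketch below is not part of my validator: the helper name IsIpLiteralDomain is hypothetical, and it only handles IPv4 literals in square brackets.

```php
<?php
// Hypothetical helper: true if $domain is an IPv4 address literal such as
// [123.234.212.112]. IPv6 literals are deliberately left out of this sketch.
function IsIpLiteralDomain($domain)
{
    // The literal must be digits and dots wrapped in square brackets
    if (!preg_match('/^\[([0-9.]+)\]$/', $domain, $m)) {
        return false;
    }
    // Let PHP's filter extension check the octet ranges
    return filter_var($m[1], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(IsIpLiteralDomain('[123.234.212.112]')); // true
var_dump(IsIpLiteralDomain('[999.1.1.1]'));       // false: octet out of range
var_dump(IsIpLiteralDomain('thenetgen.com'));     // false: not a literal
```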

Notice:

This is copyright © 2009 Frank Forte

Reproduction without written permission is prohibited.

I assume NO liability for the use of this code, even if it is through my own negligence. It is meant for informational purposes only, and if you want to use this code, you must have written permission from Frank Forte first.

Please contact me if you want to use any content from TheNetGen.com. I would be more than happy to let you use it and provide support, but only if you get my written permission first.

Validating Email according to RFC specification

  1. An e-mail address consists of a “local part” and a “domain” separated by an at sign (@) (RFC 2822 3.4.1). The domain is further split into “labels” joined by dot separators (which makes subdomains hard to tell apart), so we will split the pattern into two parts: $local and $domain.
  2. Matching local part
  3. Local allowed characters: any 7-bit ASCII character, excluding control characters (char codes 32 to 126). Of these, the following may appear without quoting: a-zA-Z0-9!#%&/=_`~^$*+?{|}'- (a list of all 7-bit ASCII characters can be found here: http://www.tutorialspoint.com/html/html_ascii_codes.htm)
  4. The local part may consist of a quoted string, in other words, anything within double quotes ("), including spaces (RFC 2822 3.2.5). The following special characters *must* be in quotes: SPACE ()<>[]:;@,.\ (the dot is the one exception: it may appear unquoted between other characters, but not at the start, at the end, or twice in a row).
  5. Quoted pairs ("escaped" special characters such as \@) are valid components of a local part, though an obsolete form from RFC 822 (RFC 2822 4.4).
  6. The maximum length of a local part is 64 characters (RFC 2821 4.5.3.1).
  7. A domain consists of labels separated by dot separators (RFC 1035 2.3.1).
  8. Domain labels can contain alphanumeric characters and the hyphen (-). They must start with an alphabetic character and end with an alphanumeric character (RFC 1035 2.3.1): first character [a-zA-Z], middle characters [a-zA-Z0-9-], last character [a-zA-Z0-9].
  9. The maximum length of a label is 63 characters (RFC 1035 2.3.1).
  10. The maximum length of a domain is 255 characters (RFC 2821 4.5.3.1).
  11. The domain must be fully qualified and resolvable to a type A or type MX DNS record (RFC 2821 3.6).
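
The label rules listed above can be checked one label at a time. The following is a sketch under my own assumptions: the helper name HasValidLabels is hypothetical, and requiring at least two labels is my reading of “fully qualified”.

```php
<?php
// Hypothetical helper: true if every label in $domain is 1 to 63 characters,
// starts with a letter, ends with a letter or digit, and the whole domain
// is at most 255 characters with at least two labels.
function HasValidLabels($domain)
{
    if (strlen($domain) > 255) {
        return false;
    }
    $parts = explode('.', $domain);
    if (count($parts) < 2) {
        return false; // assumption: a single label is not fully qualified
    }
    // One letter, optionally followed by up to 61 middle characters
    // and one final letter or digit (63 characters in total at most)
    $label = '/^[a-zA-Z](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$/';
    foreach ($parts as $part) {
        if (!preg_match($label, $part)) {
            return false;
        }
    }
    return true;
}

var_dump(HasValidLabels('the-net-gen8.com')); // true
var_dump(HasValidLabels('-bad.com'));         // false: label starts with a hyphen
var_dump(HasValidLabels('a..b'));             // false: empty label
```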

function CheckEmail($email)
{
    // Local characters that do not need to be quoted:
    // a-zA-Z0-9!#%&/=_`~^$*+?{|}'- and the dot (dot placement is checked below)
    $locChars = '\.a-zA-Z0-9!#%&=_`~\/\'\^\$\*\+\?\{\|\}\-';

    // Quoted pair (obsolete but valid): a backslash followed by one of
    // SPACE ( ) < > [ ] : ; @ , . " or another backslash
    $locQuotedPair = '\\\\[\\\\ ()<>\[\]:;@,."]';

    // Quoted string: printable ASCII between double quotes; a literal
    // double quote or backslash inside it must be escaped with a backslash
    $locQuotedString = '"(?:[\x20\x21\x23-\x5B\x5D-\x7E]|\\\\[\x20-\x7E])*"';

    // Local pattern: any mix of plain characters, quoted pairs and quoted strings
    $local = '(?:[' . $locChars . ']|' . $locQuotedPair . '|' . $locQuotedString . ')+';

    // Domain characters
    $domainChars = '\.a-zA-Z0-9\-';

    // Domain pattern
    $domain = '[' . $domainChars . ']+'; // intended minimum is a.a, but a dot is not enforced here

    // Full pattern: $matches[1] is the local part, $matches[2] is the domain
    $pattern = '/^(' . $local . ')@(' . $domain . ')$/';

    // Validation: if this statement is true, the email is not valid.
    if (!preg_match($pattern, $email, $matches)        // does not exactly match allowed characters
        || strlen($matches[1]) > 64                    // local part too long
        || strlen($matches[2]) > 255                   // domain part too long
        || strpos($matches[1], '..') !== false         // local part has two consecutive dots
        || preg_match('/^\./', $matches[1])            // local part starts with dot
        || preg_match('/\.$/', $matches[1])            // local part ends with dot
        || !preg_match('/^[a-zA-Z]/', $matches[2])     // domain part does not start with letter
        || !preg_match('/[a-zA-Z0-9]$/', $matches[2])  // domain part does not end with letter or number
    ) {
        return FALSE;
    }
    return TRUE;
}
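
The function above only looks at the text of the address; it does not perform the DNS lookup that the last rule asks for. A minimal sketch of that step, assuming PHP's built-in checkdnsrr() is available on your platform, could look like this. The helper name DomainResolves and its injectable $lookup parameter are hypothetical.

```php
<?php
// Hypothetical helper: true if $domain has an MX or an A record.
// $lookup defaults to PHP's checkdnsrr(); it can be swapped for a stub
// so the logic can be exercised without network access.
function DomainResolves($domain, $lookup = 'checkdnsrr')
{
    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}

// Usage with a stub resolver that only knows one A record
$stub = function ($domain, $type) {
    return $domain === 'example.test' && $type === 'A';
};
var_dump(DomainResolves('example.test', $stub)); // true
var_dump(DomainResolves('missing.test', $stub)); // false
```

Called with a single argument, DomainResolves() performs a real DNS query, so expect it to be slow and to depend on the network.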
