Tech Tip: Use Match Regex to validate an email address
PRODUCT: 4D | VERSION: 12 | PLATFORM: Mac & Win
Published On: December 17, 2010

The Match regex command can be used to easily test if a string is a valid email address.

The regular expression below matches 99% of the email addresses in use today. All the email addresses it matches can be handled by 99% of the email software in use today.

C_TEXT($pattern_T;$address_T)

$pattern_T:="(?i)^([A-Z0-9._%+-]+)@(?:[A-Z0-9_-]+\\.)+([A-Z]{2,4})$(.*)"

$address_T:="user@example.com"  // placeholder: substitute the address to validate
If (Match regex($pattern_T;$address_T))

    // ... Do something ...

End if


The image below breaks down the regex pattern and explains each part. To modify the pattern so that it accepts an email address in the remaining 1%, you need to understand what each part does.
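
In text form, the same pattern can be assembled piece by piece. The sketch below builds the identical string with a comment on each part. The comments reflect standard regular-expression syntax and are not taken from the image.

```4d
C_TEXT($pattern_T)

$pattern_T:="(?i)"  // case-insensitive matching, so A-Z also covers a-z
$pattern_T:=$pattern_T+"^"  // start of the string
$pattern_T:=$pattern_T+"([A-Z0-9._%+-]+)"  // group 1: user name, one or more letters, digits or . _ % + -
$pattern_T:=$pattern_T+"@"  // literal at sign
$pattern_T:=$pattern_T+"(?:[A-Z0-9_-]+\\.)+"  // one or more domain labels, each followed by a dot (non-capturing)
$pattern_T:=$pattern_T+"([A-Z]{2,4})"  // group 2: top-level domain, 2 to 4 letters
$pattern_T:=$pattern_T+"$"  // end of the string
$pattern_T:=$pattern_T+"(.*)"  // group 3: placed after the end anchor, so it can only capture an empty string
```

The doubled backslash is needed because the backslash is also the escape character inside a 4D string literal. The regex engine receives a single "\." and treats it as a literal dot.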



This pattern even allows for email addresses on servers in a subdomain. An email address like "John+Jane.Doe@server.department.company.co.uk" will test as valid, while mistakes such as "john@4d..com" will be rejected.
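
The two addresses above can be checked directly. The following is a minimal sketch, assuming the extended syntax of Match regex in which longint arrays receive the positions and lengths of the captured groups. The variable names are illustrative only.

```4d
C_TEXT($pattern_T;$address_T;$user_T;$tld_T)
ARRAY LONGINT($pos_aL;0)
ARRAY LONGINT($len_aL;0)

$pattern_T:="(?i)^([A-Z0-9._%+-]+)@(?:[A-Z0-9_-]+\\.)+([A-Z]{2,4})$(.*)"

$address_T:="John+Jane.Doe@server.department.company.co.uk"
If (Match regex($pattern_T;$address_T;1;$pos_aL;$len_aL))
    // Element 0 describes the whole match; the following elements describe the capture groups
    $user_T:=Substring($address_T;$pos_aL{1};$len_aL{1})  // "John+Jane.Doe"
    $tld_T:=Substring($address_T;$pos_aL{2};$len_aL{2})  // "uk"
End if

$address_T:="john@4d..com"
If (Not(Match regex($pattern_T;$address_T)))
    // Rejected: the empty label between the two dots cannot satisfy [A-Z0-9_-]+
End if
```

Capturing the user name and the top-level domain separately is optional. The two-parameter form shown at the top of this tip is enough when only a yes/no answer is needed.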

There are plenty of email addresses that this regex doesn't match. The most common examples are addresses on the ".museum" and ".travel" top-level domains, which are longer than the 4 letters the regex allows for the top-level domain. This trade-off is acceptable because the number of people using ".museum" or ".travel" email addresses is extremely low.

To include ".museum" and ".travel", you could use ...([A-Z]{2,6})$(.*). However, this also accepts an email address such as "john@mail.office", where John presumably just forgot to type the ".com". If you don't mind having to update the regex each time a new top-level domain is created, you could incorporate the pattern snippet below:

...(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$(.*)

which could be used to allow any two-letter country-code top-level domain and only specific generic top-level domains. This list should not be considered complete. At the time of this writing there are at least twenty generic top-level domains.
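
To compare the three top-level-domain variants side by side, the sketch below builds each pattern from a shared prefix and tests them against two sample addresses. The address "curator@example.museum", the shortened domain list, and all variable names are illustrative assumptions. Keeping the generic domains in a text array is just one possible way to make the list easier to update.

```4d
C_TEXT($head_T;$strict_T;$relaxed_T;$listed_T;$gtlds_T)
C_LONGINT($i_L)
C_BOOLEAN($r1_B;$r2_B;$r3_B;$r4_B;$r5_B)
ARRAY TEXT($gtld_aT;0)

// Part shared by all three variants: user name, "@", and the dotted domain labels
$head_T:="(?i)^([A-Z0-9._%+-]+)@(?:[A-Z0-9_-]+\\.)+"

$strict_T:=$head_T+"([A-Z]{2,4})$(.*)"  // original: 2 to 4 letters
$relaxed_T:=$head_T+"([A-Z]{2,6})$(.*)"  // also admits 5- and 6-letter domains

// Allow-list variant: any two-letter code, plus the generic domains named here
// (deliberately shortened; extend the array as needed)
APPEND TO ARRAY($gtld_aT;"com")
APPEND TO ARRAY($gtld_aT;"org")
APPEND TO ARRAY($gtld_aT;"net")
APPEND TO ARRAY($gtld_aT;"museum")

$gtlds_T:=""
For ($i_L;1;Size of array($gtld_aT))
    $gtlds_T:=$gtlds_T+"|"+$gtld_aT{$i_L}
End for
$listed_T:=$head_T+"(?:[A-Z]{2}"+$gtlds_T+")$(.*)"

$r1_B:=Match regex($strict_T;"curator@example.museum")  // False: "museum" has 6 letters
$r2_B:=Match regex($relaxed_T;"curator@example.museum")  // True
$r3_B:=Match regex($relaxed_T;"john@mail.office")  // True: the unwanted side effect
$r4_B:=Match regex($listed_T;"curator@example.museum")  // True: "museum" is in the list
$r5_B:=Match regex($listed_T;"john@mail.office")  // False: "office" is not in the list
```

The allow-list variant trades maintenance effort for stricter checking: an address on a domain missing from the array is rejected until the array is updated.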