Tech Tip: Documenting regex patterns with comments
PRODUCT: 4D | VERSION: 11 | PLATFORM: Mac & Win
Published On: May 14, 2010
Regular expression (regex) pattern strings can become long, obscure, and hard to maintain, especially for developers who later maintain an unfamiliar application and for those new to regex. Take, for example, the 4D Match regex pattern string below, which validates a North American phone number (area code/exchange/number) without an extension.
$Regx_T:="^(1(-|\\.|\\s)?)?((\\d{3})|(\\(\\d{3}\\)))(-|\\.|\\s)?(\\d{3})(-|\\.|\\s)?(\\d{4})$"
As shown, this is not the easiest pattern string to understand. If you instead build the pattern from its logical parts and comment each part, as shown below, it becomes easier for later developers to understand and maintain.
C_TEXT($Regx_T)
$Regx_T:="^(1(-|\\.|\\s)?)?" `# Pattern begins with optional '1', '1-', '1.' or '1 '
$Regx_T:=$Regx_T+"((\\d{3})" `# area code without parentheses
$Regx_T:=$Regx_T+"|" `# OR
$Regx_T:=$Regx_T+"(\\(\\d{3}\\)))" `# area code with parentheses
$Regx_T:=$Regx_T+"(-|\\.|\\s)?" `# optionally followed by '-' or '.' or space
$Regx_T:=$Regx_T+"(\\d{3})" `# 3 digits of the exchange
$Regx_T:=$Regx_T+"(-|\\.|\\s)?" `# optionally followed by '-' or '.' or space
$Regx_T:=$Regx_T+"(\\d{4})$" `# Pattern ends with last 4 digits
To test the pattern string, the array below holds some valid and invalid phone numbers. The loop tests each one against the pattern built above and displays an alert stating whether it is valid.
ARRAY TEXT($Numbers_aT;5)
$Numbers_aT{1}:="123 555 6789"
$Numbers_aT{2}:="1-(123)-555-6789"
$Numbers_aT{3}:="(123-555-6789"
$Numbers_aT{4}:="(123).555.6789"
$Numbers_aT{5}:="123 55 6789"
For ($Ndx;1;Size of array($Numbers_aT))
   If (Match regex($Regx_T;$Numbers_aT{$Ndx}))
      ALERT($Numbers_aT{$Ndx}+" is valid.")
   Else
      ALERT($Numbers_aT{$Ndx}+" is invalid!")
   End if
End for
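Elements 1, 2, and 4 are valid; elements 3 and 5 are not. Element 3 is missing its closing parenthesis, and element 5 has only two digits in the exchange.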
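One way to keep the commented pattern in a single place is to wrap it in a project method, so callers never see the regex itself. The following is only a minimal sketch: the method name PhoneIsValid is hypothetical, and the separator dot is escaped as \\. on the assumption that a literal '.' is intended rather than "any character".

```4d
  ` Project method: PhoneIsValid (hypothetical name, not part of 4D)
  ` $1: text to check
  ` $0: True when $1 matches the phone number pattern
C_TEXT($1;$Regx_T)
C_BOOLEAN($0)

$Regx_T:="^(1(-|\\.|\\s)?)?"  `# optional '1', alone or followed by '-', '.' or space
$Regx_T:=$Regx_T+"((\\d{3})|(\\(\\d{3}\\)))"  `# area code, with or without parentheses
$Regx_T:=$Regx_T+"(-|\\.|\\s)?"  `# optionally followed by '-' or '.' or space
$Regx_T:=$Regx_T+"(\\d{3})"  `# 3 digits of the exchange
$Regx_T:=$Regx_T+"(-|\\.|\\s)?"  `# optionally followed by '-' or '.' or space
$Regx_T:=$Regx_T+"(\\d{4})$"  `# pattern ends with last 4 digits

$0:=Match regex($Regx_T;$1)
```

A caller would then write something like If (PhoneIsValid($Numbers_aT{$Ndx})) in place of the direct Match regex call, and any later change to the pattern and its comments happens in one method.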