Tech Tip: Regular expression to search for URL address
PRODUCT: 4D | VERSION: 11.6 | PLATFORM: Mac & Win
Published On: May 21, 2010

This tech tip demonstrates how to extract all URL addresses from a block of text.

ARRAY LONGINT(posFound_a;0)
ARRAY LONGINT(lengthFound_a;0)
ARRAY TEXT(URL_a;0)
C_LONGINT($start)
C_TEXT(stringNew;pattern;myString)
C_BOOLEAN($found)

$start:=1
$found:=False

pattern:="(http|https|ftp)" ` Scheme: http, https, or ftp
pattern:=pattern + "\\://" ` The :// separator after the scheme
pattern:=pattern + "[a-zA-Z0-9\\-\\.]+" ` First part of the domain (host name labels)
pattern:=pattern + "\\.[a-zA-Z]{2,4}" ` Second part of the domain: a dot and 2 to 4 letters
pattern:=pattern + "(:[a-zA-Z0-9]*)?/?" ` Optional port number and optional slash
pattern:=pattern + "([a-zA-Z0-9\\-\\._?\\,'/\\+%\\$#\\=~\\:\\&])*" ` Characters allowed in the rest of the URL
pattern:=pattern + "[^\\.\\,\\)\\(\\s\\']" ` Last character: not a period, comma, parenthesis, whitespace, or apostrophe

myString:="This is url1 http://www.4d.com and this is url 2"
myString:=myString + " http://forums.4d.fr and http://slashdot.org and another URL"
myString:=myString + " http://doc.4d.com/4D-Language-Reference-11.6/Printing/"
myString:=myString + "Subtotal.301-205861.en.html"

Repeat
  $found:=Match regex(pattern;myString;$start;posFound_a;lengthFound_a)
  If ($found)
    ` Element 0 of each array describes the whole match
    stringNew:=Substring(myString;posFound_a{0};lengthFound_a{0})
    APPEND TO ARRAY(URL_a;stringNew)
    ` Resume the search just after the URL that was found
    $start:=posFound_a{0}+lengthFound_a{0}
  End if
Until (Not($found))

After this method runs, the URL_a array contains every URL address found in myString. The pattern works with most common URL addresses.

To validate a single URL address instead of searching for URLs inside text, anchor the pattern: begin it with "^" and end it with "$".
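
A minimal sketch of the validation case, assuming the pattern variable built earlier in this tip is still available. The names $validatePattern, $candidate, and $isValid, as well as the sample input, are illustrative and not part of the original tip.

```4d
C_TEXT($validatePattern;$candidate)
C_BOOLEAN($isValid)

` Anchor the search pattern so that it must cover the whole string
$validatePattern:="^"+pattern+"$"

$candidate:="http://www.4d.com" ` Hypothetical input to check
$isValid:=Match regex($validatePattern;$candidate;1)

If ($isValid)
  ALERT("The text is a single URL address.")
Else
  ALERT("The text is not a single URL address.")
End if
```

With the anchors in place, a string that contains extra text before or after the URL should be rejected, whereas the unanchored pattern would still find the URL inside it.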