Tech Tip: Generating Outliers from data sets
PRODUCT: 4D | VERSION: 14.0 | PLATFORM: Mac & Win
Published On: October 16, 2014

Below is a sample method to generate an array of outliers for a data set of numbers:

  //$1-> Pointer to input array
  //$2-> Pointer to output array
  //$3-> Optional Boolean Parameter
  //...Omitted Shows All Outliers
  //...False Only Shows Non-Extreme Outliers
  //...True Only Shows Extreme Outliers

//Declare Variables and Parameters

C_POINTER($1;$2)
C_BOOLEAN($3;$out)
C_LONGINT($size;$q1;$q2;$q3;$counter)
C_REAL($IQR;$loOut;$hiOut;$exLoOut;$exHiOut)
ARRAY LONGINT($arrayIn;0)
ARRAY LONGINT($arrayRes;0)


If ((Count parameters=2)|(Count parameters=3))
   //Organize data set
 COPY ARRAY($1->;$arrayIn)
 SORT ARRAY($arrayIn)

   //Calculate quartile positions ($q2 is the median position; the Real results are converted to Longint on assignment)
 $size:=Size of array($arrayIn)
 $q1:=$size/4
 $q2:=$size/2
 $q3:=$q1+$q2
 $IQR:=$arrayIn{$q3}-$arrayIn{$q1}

   //Calculate Inner Fences for data set
 $loOut:=$arrayIn{$q1}-(1.5*$IQR)
 $hiOut:=$arrayIn{$q3}+(1.5*$IQR)

   //Check parameters to see which Outliers to return
   //Then run for loops to check which values to return

 If (Count parameters=2)
  For ($counter;1;$size)
   If (($arrayIn{$counter}<$loOut)|($arrayIn{$counter}>$hiOut))
    APPEND TO ARRAY($arrayRes;$arrayIn{$counter})
   End if
  End for
 End if

 If (Count parameters=3)
  $out:=$3

    //Calculate Outer Fences for data set
  $exLoOut:=$arrayIn{$q1}-(3*$IQR)
  $exHiOut:=$arrayIn{$q3}+(3*$IQR)

  If ($out=False)
   For ($counter;1;$size)
    If ((($arrayIn{$counter}<$loOut)&($arrayIn{$counter}>=$exLoOut))|(($arrayIn{$counter}>$hiOut)&($arrayIn{$counter}<=$exHiOut)))
     APPEND TO ARRAY($arrayRes;$arrayIn{$counter})
    End if
   End for

  Else
   For ($counter;1;$size)
    If (($arrayIn{$counter}<$exLoOut)|($arrayIn{$counter}>$exHiOut))
     APPEND TO ARRAY($arrayRes;$arrayIn{$counter})
    End if
   End for
  End if

 End if

  //Return array of desired outliers
 COPY ARRAY($arrayRes;$2->)

End if


With the method saved as Array_Outliers, it can be called as shown below:

//Declare Variables
ARRAY LONGINT($array;0)
ARRAY LONGINT($arrayRes1;0)
ARRAY LONGINT($arrayRes2;0)
ARRAY LONGINT($arrayRes3;0)


//Sample Data Set
//...Extreme Low Outlier

APPEND TO ARRAY($array;-20)
//...Low Outlier
APPEND TO ARRAY($array;2)
//...Data within Typical Range
APPEND TO ARRAY($array;21)
APPEND TO ARRAY($array;22)
APPEND TO ARRAY($array;24)
APPEND TO ARRAY($array;25)
APPEND TO ARRAY($array;28)
APPEND TO ARRAY($array;35)
APPEND TO ARRAY($array;23)
APPEND TO ARRAY($array;24)
APPEND TO ARRAY($array;25)
APPEND TO ARRAY($array;29)
APPEND TO ARRAY($array;33)
//...High Outlier
APPEND TO ARRAY($array;50)
//...Extreme High Outlier
APPEND TO ARRAY($array;100)

Array_Outliers(->$array;->$arrayRes1)
Array_Outliers(->$array;->$arrayRes2;False)
Array_Outliers(->$array;->$arrayRes3;True)


When executed, the code above produces the following:
-$arrayRes1 will contain {-20, 2, 50, 100}
-$arrayRes2 will contain {2, 50}
-$arrayRes3 will contain {-20, 100}
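
As a check on these results, the fences can be worked out by hand. This assumes the quartile positions resolve to 4 and 12 for the 15 sorted values (15/4 and 15/2 rounded to the nearest whole number when stored in the Longint variables), which is what the listed results imply. It is a sketch of the arithmetic, not output captured from 4D.

```latex
\begin{align*}
\text{sorted data} &= \{-20,\ 2,\ 21,\ 22,\ 23,\ 24,\ 24,\ 25,\ 25,\ 28,\ 29,\ 33,\ 35,\ 50,\ 100\} \\
Q_1 &= 22, \qquad Q_3 = 33, \qquad \mathrm{IQR} = Q_3 - Q_1 = 11 \\
\text{inner fences:}\quad & Q_1 - 1.5\,\mathrm{IQR} = 5.5, \qquad Q_3 + 1.5\,\mathrm{IQR} = 49.5 \\
\text{outer fences:}\quad & Q_1 - 3\,\mathrm{IQR} = -11, \qquad Q_3 + 3\,\mathrm{IQR} = 66
\end{align*}
```

Under that assumption, 2 and 50 lie between an inner and an outer fence (non-extreme outliers), while -20 and 100 lie beyond the outer fences (extreme outliers).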

Locating outliers helps flag unusual or possibly erroneous values before a data set is analyzed. It can also be helpful to return the quartile values, inner fences, and outer fences themselves, since they show how far each outlier lies from the typical range.
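
The quartile values and fences mentioned above could be returned by a small companion method. The following is a minimal sketch, not part of the original tip: the method name Array_Fences and the layout of its output array are assumptions made for illustration, and it reuses the same quartile positions as Array_Outliers.

```4d
  //Array_Fences: hypothetical companion method (name and output layout are assumptions)
  //$1-> Pointer to input array (Longint array)
  //$2-> Pointer to output array (Real array)
  //...Elements returned, in order:
  //...Q1, Q3, IQR, inner low fence, inner high fence, outer low fence, outer high fence

C_POINTER($1;$2)
C_LONGINT($size;$q1;$q2;$q3)
C_REAL($IQR)
ARRAY LONGINT($arrayIn;0)
ARRAY REAL($arrayRes;0)

If (Count parameters=2)
   //Sort a local copy so the caller's array is left untouched
 COPY ARRAY($1->;$arrayIn)
 SORT ARRAY($arrayIn)

   //Same quartile positions as Array_Outliers
 $size:=Size of array($arrayIn)
 $q1:=$size/4
 $q2:=$size/2
 $q3:=$q1+$q2
 $IQR:=$arrayIn{$q3}-$arrayIn{$q1}

   //Quartile values and interquartile range
 APPEND TO ARRAY($arrayRes;$arrayIn{$q1})
 APPEND TO ARRAY($arrayRes;$arrayIn{$q3})
 APPEND TO ARRAY($arrayRes;$IQR)

   //Inner fences (1.5 * IQR beyond the quartiles)
 APPEND TO ARRAY($arrayRes;$arrayIn{$q1}-(1.5*$IQR))
 APPEND TO ARRAY($arrayRes;$arrayIn{$q3}+(1.5*$IQR))

   //Outer fences (3 * IQR beyond the quartiles)
 APPEND TO ARRAY($arrayRes;$arrayIn{$q1}-(3*$IQR))
 APPEND TO ARRAY($arrayRes;$arrayIn{$q3}+(3*$IQR))

   //Return the seven values
 COPY ARRAY($arrayRes;$2->)
End if
```

Assuming the sketch is saved as Array_Fences, it would be called in the same way as Array_Outliers: declare a Real array, for example ARRAY REAL($fences;0), then call Array_Fences(->$array;->$fences).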