Quick links:
 
Flags
 
Verbs
 
Functions
 
Glossary
 
Release docs
Statistics examples¶
Computing interquartile ranges¶
For one or more specified field names, simply compute p25 and p75, then write the IQR as the difference of p75 and p25:
mlr --oxtab stats1 -f x -a p25,p75 \
    then put '$x_iqr = $x_p75 - $x_p25' \
    data/medium 
x_p25 0.24667037823231752 x_p75 0.7481860062358446 x_iqr 0.5015156280035271
For wildcarded field names, first compute p25 and p75, then loop over field names with p25 in them:
mlr --oxtab stats1 --fr '[i-z]' -a p25,p75 \
    then put 'for (k,v in $*) {
      if (k =~ "(.*)_p25") {
        $["\1_iqr"] = $["\1_p75"] - $["\1_p25"]
      }
    }' \
    data/medium 
i_p25 2501 i_p75 7501 x_p25 0.24667037823231752 x_p75 0.7481860062358446 y_p25 0.25213670524015686 y_p75 0.7640028449996572 i_iqr 5000 x_iqr 0.5015156280035271 y_iqr 0.5118661397595003
Computing weighted means¶
This might be more elegantly implemented as an option within the stats1 verb. Meanwhile, it's expressible within the DSL:
mlr --from data/medium put -q '
  # Using the y field for weighting in this example
  weight = $y;
  # Using the a field for weighted aggregation in this example
  @sumwx[$a] += weight * $i;
  @sumw[$a] += weight;
  @sumx[$a] += $i;
  @sumn[$a] += 1;
  end {
    map wmean = {};
    map mean  = {};
    for (a in @sumwx) {
      wmean[a] = @sumwx[a] / @sumw[a]
    }
    for (a in @sumx) {
      mean[a] = @sumx[a] / @sumn[a]
    }
    #emit wmean, "a";
    #emit mean, "a";
    emit (wmean, mean), "a";
  }'
a=pan,wmean=4979.563722208067,mean=5028.259010091302 a=eks,wmean=4890.3815931472145,mean=4956.2900763358775 a=wye,wmean=4946.987746229947,mean=4920.001017293998 a=zee,wmean=5164.719684856538,mean=5123.092330239375 a=hat,wmean=4925.533162478552,mean=4967.743946419371