Quick examples
Column select:
% mlr --csv cut -f hostname,uptime mydata.csv
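To see what `cut -f` does without mlr installed, here is a rough awk sketch on a hypothetical two-row `mydata.csv` (mlr additionally handles CSV quoting and preserves header semantics for you):

```shell
printf 'hostname,region,uptime\nweb01,us-east,12345\n' |
awk -F, 'NR==1{for(i=1;i<=NF;i++){if($i=="hostname")h=i;if($i=="uptime")u=i}}
         {print $h","$u}'
# hostname,uptime
# web01,12345
```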
Add new columns as functions of other columns:
% mlr --nidx put '$sum = $7 < 0.0 ? 3.5 : $7 + 2.1*$8' *.dat
Row filter:
% mlr --csv filter '$status != "down" && $upsec >= 10000' *.csv
Apply column labels and pretty-print:
% grep -v '^#' /etc/group | mlr --ifs : --nidx --opprint label group,pass,gid,member then sort -f group
Join multiple data sources on key columns:
% mlr join -j account_id -f accounts.dat then group-by account_name balances.dat
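The key-column join resembles coreutils `join`, sketched here on hypothetical pre-sorted files (unlike `join`, mlr joins on named fields and does not require sorted input):

```shell
printf 'acct1 alice\nacct2 bob\n' > accounts.sorted
printf 'acct1 100\nacct2 250\n'  > balances.sorted
join accounts.sorted balances.sorted
# acct1 alice 100
# acct2 bob 250
```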
Multiple formats including JSON:
% mlr --json put '$attr = sub($attr, "([0-9]+)_([0-9]+)_.*", "\1:\2")' data/*.json
Aggregate per-column statistics:
% mlr stats1 -a min,mean,max,p10,p50,p90 -f flag,u,v data/*
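As a rough idea of what `stats1` computes, this awk sketch does the min/mean/max part for a single hypothetical column (mlr adds percentiles, multiple columns, and format handling):

```shell
printf 'u\n3\n1\n2\n' |
awk 'NR==1{next}
     {s+=$1; n++; if(n==1||$1<min)min=$1; if(n==1||$1>max)max=$1}
     END{printf "min=%s mean=%s max=%s\n", min, s/n, max}'
# min=1 mean=2 max=3
```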
Linear regression:
% mlr stats2 -a linreg-pca -f u,v -g shape data/*
Aggregate custom per-column statistics:
% mlr put -q '@sum[$a][$b] += $x; end {emit @sum, "a", "b"}' data/*
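The two-level grouped sum `@sum[$a][$b] += $x` behaves like this awk sketch over hypothetical CSV data (mlr's `emit` additionally unflattens the keys back into named `a` and `b` columns):

```shell
printf 'a,b,x\npan,1,10\npan,1,5\neks,2,7\n' |
awk -F, 'NR>1{s[$1","$2]+=$3} END{for(k in s) print k"="s[k]}' | sort
# eks,2=7
# pan,1=15
```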
Iterate over data using DSL expressions:
% mlr --from estimates.tbl put '
  for (k,v in $*) {
    if (is_numeric(v) && k =~ "^[t-z].*$") {
      $sum += v; $count += 1
    }
  }
  $mean = $sum / $count # no assignment if count unset
'
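The per-record loop above can be sketched in awk on a hypothetical header+CSV input, where field names `t` and `u` match `^[t-z]` but `a` does not:

```shell
printf 'a,t,u\n9,2,4\n' |
awk -F, 'NR==1{split($0,name); print; next}
         {s=0; c=0
          for(i=1;i<=NF;i++) if(name[i] ~ /^[t-z]/){s+=$i; c++}
          print $0 (c ? ",sum="s",count="c",mean="s/c : "")}'
# a,t,u
# 9,2,4,sum=6,count=2,mean=3
```

Like the mlr version, no derived fields are emitted when no column name matches (the `c ? … : ""` guard).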
Run DSL expressions from a script file:
% mlr --from infile.dat put -f analyze.mlr
Split/reduce output to multiple filenames:
% mlr --from infile.dat put 'tee > "./taps/data-".$a."-".$b, $*'
Compressed I/O:
% mlr --from infile.dat put 'tee | "gzip > ./taps/data-".$a."-".$b.".gz", $*'
Interoperate with other data-processing tools using standard pipes:
% mlr --from infile.dat put -q '@v=$*; dump | "jq .[]"'
Tap/trace:
% mlr --from infile.dat put '(NR % 1000 == 0) { print > stderr, "Checkpoint ".NR}'