Parsing log-file output

This, of course, depends heavily on what’s in your log files. As an example, suppose you have log-file lines such as

    2015-10-08 08:29:09,445 INFO com.company.path.to.ClassName @ [sometext] various/sorts/of data {& punctuation} hits=1 status=0 time=2.378

Use grep to select the lines of interest and sed to strip everything up to and including the closing brace, which leaves space-separated key=value pairs that Miller can read directly:

    grep 'various/sorts' *.log | sed 's/.*} //' | mlr --fs space --repifs --oxtab stats1 -a min,p10,p50,p90,max -f time -g status

Doing arithmetic on fields with currency symbols

    $ cat sample.csv
    EventOccurred,EventType,Description,Status,PaymentType,NameonAccount,TransactionNumber,Amount
    10/1/2015,Charged Back,Reason: Authorization Revoked By Customer,Disputed,Checking,John,1,$230.36
    10/1/2015,Charged Back,Reason: Authorization Revoked By Customer,Disputed,Checking,Fred,2,$32.25
    10/1/2015,Charged Back,Reason: Customer Advises Not Authorized,Disputed,Checking,Bob,3,$39.02
    10/1/2015,Charged Back,Reason: Authorization Revoked By Customer,Disputed,Checking,Alice,4,$57.54
    10/1/2015,Charged Back,Reason: Authorization Revoked By Customer,Disputed,Checking,Jungle,5,$230.36
    10/1/2015,Charged Back,Reason: Payment Stopped,Disputed,Checking,Joe,6,$281.96
    10/2/2015,Charged Back,Reason: Customer Advises Not Authorized,Disputed,Checking,Joseph,7,$188.19
    10/2/2015,Charged Back,Reason: Customer Advises Not Authorized,Disputed,Checking,Joseph,8,$188.19
    10/2/2015,Charged Back,Reason: Payment Stopped,Disputed,Checking,Anthony,9,$250.00

    $ mlr --icsv --opprint cat sample.csv
    EventOccurred EventType    Description                               Status   PaymentType NameonAccount TransactionNumber Amount
    10/1/2015     Charged Back Reason: Authorization Revoked By Customer Disputed Checking    John          1                 $230.36
    10/1/2015     Charged Back Reason: Authorization Revoked By Customer Disputed Checking    Fred          2                 $32.25
    10/1/2015     Charged Back Reason: Customer Advises Not Authorized   Disputed Checking    Bob           3                 $39.02
    10/1/2015     Charged Back Reason: Authorization Revoked By Customer Disputed Checking    Alice         4                 $57.54
    10/1/2015     Charged Back Reason: Authorization Revoked By Customer Disputed Checking    Jungle        5                 $230.36
    10/1/2015     Charged Back Reason: Payment Stopped                   Disputed Checking    Joe           6                 $281.96
    10/2/2015     Charged Back Reason: Customer Advises Not Authorized   Disputed Checking    Joseph        7                 $188.19
    10/2/2015     Charged Back Reason: Customer Advises Not Authorized   Disputed Checking    Joseph        8                 $188.19
    10/2/2015     Charged Back Reason: Payment Stopped                   Disputed Checking    Anthony       9                 $250.00

Strip the leading dollar sign with sub so that the Amount field can be summed as a number:

    $ mlr --csv put '$Amount = sub(string($Amount), "\$", "")' then stats1 -a sum -f Amount sample.csv
    Amount_sum
    1497.870000

Adding --ofmt '%.2lf' rounds the output to two decimal places:

    $ mlr --csv --ofmt '%.2lf' put '$Amount = sub(string($Amount), "\$", "")' then stats1 -a sum -f Amount sample.csv
    Amount_sum
    1497.87

Program timing

This admittedly artificial example uses Miller’s time and stats functions to introspectively gather some information about Miller’s own runtime. The delta function computes the difference between successive timestamps.

    $ ruby -e '10000.times{|i|puts "i=#{i+1}"}' > lines.txt

    $ head -n 5 lines.txt
    i=1
    i=2
    i=3
    i=4
    i=5

    mlr --ofmt '%.9le' --opprint put '$t=systime()' then step -a delta -f t lines.txt | head -n 7
    i t                 t_delta
    1 1430603027.018016 1.430603027e+09
    2 1430603027.018043 2.694129944e-05
    3 1430603027.018048 5.006790161e-06
    4 1430603027.018052 4.053115845e-06
    5 1430603027.018055 2.861022949e-06
    6 1430603027.018058 3.099441528e-06

The first record has no predecessor, so its t_delta is not a real interval; in the output above it equals the timestamp itself. The next command therefore drops that record with filter '$i>1' before computing statistics:

    mlr --ofmt '%.9le' --oxtab \
      put '$t=systime()' then \
      step -a delta -f t then \
      filter '$i>1' then \
      stats1 -a min,mean,max -f t_delta \
      lines.txt
    t_delta_min  2.861022949e-06
    t_delta_mean 4.077508505e-06
    t_delta_max  5.388259888e-05
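
To make the arithmetic behind the timing statistics concrete, here is a minimal Python sketch of successive differences followed by min, mean and max. It is not Miller's implementation: the function name delta_stats is hypothetical, it assumes at least two timestamps, and it simply never produces a delta for the first record, which has the same effect as filtering on $i>1.

```python
from statistics import mean


def delta_stats(timestamps):
    """Return min, mean, and max of the gaps between successive timestamps.

    Hypothetical helper for illustration only; assumes len(timestamps) >= 2.
    """
    # Pair each timestamp with its predecessor. The first timestamp has no
    # predecessor, so it contributes no delta.
    deltas = [curr - prev for prev, curr in zip(timestamps, timestamps[1:])]
    return {
        "t_delta_min": min(deltas),
        "t_delta_mean": mean(deltas),
        "t_delta_max": max(deltas),
    }


if __name__ == "__main__":
    import time

    # Sample the wall clock repeatedly in a loop, loosely analogous to
    # evaluating systime() once per record.
    samples = [time.time() for _ in range(10000)]
    for name, value in delta_stats(samples).items():
        print(f"{name} {value:.9e}")
```

The absolute numbers will differ from those Miller reports, since this measures Python's loop overhead rather than Miller's per-record processing time; only the shape of the computation is comparable.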