• About Miller • File formats • Miller features in the context of the Unix toolkit • Record-heterogeneity • Reference • Data examples • FAQ • Internationalization • Compiling, portability, dependencies, and testing • Performance • Why C? • Why call it Miller? • How original is Miller? • Things to do • Documents by release • Contact information • GitHub repo |
• DKVP: Key-value pairs • NIDX: Index-numbered (toolkit style) • CSV/TSV/etc. • PPRINT: Pretty-printed tabular • XTAB: Vertical tabular Examples$ mlr --usage-data-format-examples DKVP: delimited key-value pairs (Miller default format) +---------------------+ | apple=1,bat=2,cog=3 | Record 1: "apple" => "1", "bat" => "2", "cog" => "3" | dish=7,egg=8,flint | Record 2: "dish" => "7", "egg" => "8", "3" => "flint" +---------------------+ NIDX: implicitly numerically indexed (Unix-toolkit style) +---------------------+ | the quick brown | Record 1: "1" => "the", "2" => "quick", "3" => "brown" | fox jumped | Record 2: "1" => "fox", "2" => "jumped" +---------------------+ CSV/CSV-lite: comma-separated values with separate header line +---------------------+ | apple,bat,cog | | 1,2,3 | Record 1: "apple => "1", "bat" => "2", "cog" => "3" | 4,5,6 | Record 2: "apple" => "4", "bat" => "5", "cog" => "6" +---------------------+ PPRINT: pretty-printed tabular +---------------------+ | apple bat cog | | 1 2 3 | Record 1: "apple => "1", "bat" => "2", "cog" => "3" | 4 5 6 | Record 2: "apple" => "4", "bat" => "5", "cog" => "6" +---------------------+ XTAB: pretty-printed transposed tabular +---------------------+ | apple 1 | Record 1: "apple" => "1", "bat" => "2", "cog" => "3" | bat 2 | | cog 3 | | | | dish 7 | Record 2: "dish" => "7", "egg" => "8" | egg 8 | +---------------------+ DKVP: Key-value pairsMiller’s default file format is DKVP, for delimited key-value pairs. Example:$ mlr cat data/small a=pan,b=pan,i=1,x=0.3467901443380824,y=0.7268028627434533 a=eks,b=pan,i=2,x=0.7586799647899636,y=0.5221511083334797 a=wye,b=wye,i=3,x=0.20460330576630303,y=0.33831852551664776 a=eks,b=wye,i=4,x=0.38139939387114097,y=0.13418874328430463 a=wye,b=pan,i=5,x=0.5732889198020006,y=0.8636244699032729 puts "host=#{hostname},seconds=#{t2-t1},message=#{msg}" puts mymap.collect{|k,v| "#{k}=#{v}"}.join(',') echo "type=3,user=$USER,date=$date\n"; logger.log("type=3,user=$USER,date=$date\n"); resource=/path/to/file,loadsec=0.45,ok=true record_count=100, resource=/path/to/file resource=/some/other/path,loadsec=0.97,ok=false NIDX: Index-numbered (toolkit style)With --inidx --ifs ' ' --repifs, Miller splits lines on whitespace and assigns integer field names starting with 1. This recapitulates Unix-toolkit behavior. Example with index-numbered output:
CSV/TSV/etc.When mlr is invoked with the --csv or --csvlite option, key names are found on the first record and values are taken from subsequent records. This includes the case of CSV-formatted files. See Record-heterogeneity for how Miller handles changes of field names within a single data stream. Miller has record separator RS and field separator FS, just as awk does. For TSV, use --fs tab; to convert TSV to CSV, use --ifs tab --ofs comma, etc. (See also Reference.) Miller’s --csv flag supports RFC-4180 CSV ( https://tools.ietf.org/html/rfc4180). This includes CRLF line-terminators by default, regardless of platform. You can use mlr --csv --rs lf for native Un*x (LF-terminated) CSV files. The RFC says, somewhat briefly, that “there may be a header line”. Miller’s --implicit-csv-header option allows you to read CSV data which lacks a header line, applying column labels 1, 2, 3, etc. for you. You may also use Miller’s label to replace those numerical column names with labels of your choosing.PPRINT: Pretty-printed tabularMiller’s pretty-print format is like CSV, but column-aligned. For example, compare
XTAB: Vertical tabularThis is perhaps most useful for looking a very wide and/or multi-column data which causes line-wraps on the screen (but see also https://github.com/twosigma/ngrid for an entirely different, very powerful option). Namely:
|