Maps

Miller data types are listed on the Data types page; here we focus specifically on maps.

On the whole, maps are as in most other programming languages. However, following the Principle of Least Surprise and aiming to reduce keystroking for Miller's most-used streaming-record-processing model, there are a few differences as noted below.

Types of maps

Map literals are written in curly braces, with string keys and with any Miller data type (including other maps, or arrays) as values. Integers may also be given as keys, although they'll be stored as strings.

mlr -n put '
  end {
    x = {"a": 1, "b": {"x": 2, "y": [3,4,5]}, 99: true};
    dump x;
    print x[99];
    print x["99"];
  }
'
{
  "a": 1,
  "b": {
    "x": 2,
    "y": [3, 4, 5]
  },
  "99": true
}
true
true

As with arrays and argument-lists, trailing commas are supported:

mlr -n put '
  end {
    x = {
      "a" : 1,
      "b" : 2,
      "c" : 3,
    };
    print x;
  }
'
{
  "a": 1,
  "b": 2,
  "c": 3
}

The current record, accessible using $*, is a map.

mlr --csv --from example.csv head -n 2 then put -q '
  dump $*;
  print "Color is", $*["color"];
'
{
  "color": "yellow",
  "shape": "triangle",
  "flag": "true",
  "k": 1,
  "index": 11,
  "quantity": 43.6498,
  "rate": 9.8870
}
Color is yellow
{
  "color": "red",
  "shape": "square",
  "flag": "true",
  "k": 2,
  "index": 15,
  "quantity": 79.2778,
  "rate": 0.0130
}
Color is red

The collection of all out-of-stream variables, @*, is a map.

mlr --csv --from example.csv put -q '
  begin {
    @last_rates = {};
  }
  @last_rates[$shape] = $rate;
  @last_color = $color;
  end {
    dump @*;
  }
'
{
  "last_rates": {
    "triangle": 5.8240,
    "square": 8.2430,
    "circle": 8.3350
  },
  "last_color": "purple"
}

Also note that several built-in functions operate on maps and/or return maps.

Insertion order is preserved

Miller maps preserve insertion order. So if you write @m["y"]=7 and then @m["x"]=3, any loop over the map @m will give you the keys "y" and "x" in that order.

String keys, with conversion from/to integer

All Miller map keys are strings. If a map is indexed with an integer, whether for read or write (i.e. on either the right-hand side or left-hand side of an assignment), the integer is converted to a string. So @m[3] is the same as @m["3"]. The reason for this is to support situations like retaining all records, where it's important to let people write @records[NR] = $*.

Auto-create

Indexing any as-yet-unassigned local variable or out-of-stream variable results in auto-create of that variable as a map variable:

mlr --csv --from example.csv put -q '
  # You can do this but you do not need to:
  # begin { @last_rates = {} }
  @last_rates[$shape] = $rate;
  end {
    dump @last_rates;
  }
'
{
  "triangle": 5.8240,
  "square": 8.2430,
  "circle": 8.3350
}

This also means that auto-create results in maps, not arrays, even if keys are integers. If you want to auto-extend an array, initialize it explicitly to [].

mlr --csv --from example.csv head -n 4 then put -q '
  begin {
    @my_array = [];
  }
  @my_array[NR] = $quantity;
  @my_map[NR] = $rate;
  end {
    dump
  }
'
{
  "my_array": [43.6498, 79.2778, 13.8103, 77.5542],
  "my_map": {
    "1": 9.8870,
    "2": 0.0130,
    "3": 2.9010,
    "4": 7.4670
  }
}

Auto-deepen

Similarly, maps are auto-deepened: you can write @m["a"]["b"]["c"]=3 without first setting @m["a"]={} and @m["a"]["b"]={}. The reason for this is data aggregation: for example, if you want to compute keyed sums, you can do that with a minimum of keystrokes.

mlr --icsv --opprint --from example.csv put -q '
  @rate_sum[$color][$shape] += $rate;
  end {
    emit @rate_sum, "color", "shape";
  }
'
color  shape    rate_sum
yellow triangle 9.8870
yellow circle   12.572000000000001
red    square   17.011
red    circle   2.9010
purple triangle 14.415
purple square   8.2430

Looping

See single-variable for-loops and key-value for-loops.

Map-valued fields in CSV files

See the flatten/unflatten page.