# DimensionalData

DimensionalData.jl provides tools and abstractions for working with datasets that have named dimensions, and optionally a lookup index. It's a pluggable, generalised version of AxisArrays.jl with a cleaner syntax, and additional functionality found in NamedDims.jl. It has similar goals to Python's xarray, and is primarily written for use with spatial data in GeoData.jl.

Broadcasting and most Base methods maintain and sync dimension context.

DimensionalData.jl also implements:

- comprehensive plot recipes for Plots.jl.
- a Tables.jl interface with `DimTable`.
- multi-layered `DimStack`s that can be indexed together, and have base methods applied to all layers.
- the Adapt.jl interface for use on GPUs, even as GPU kernel arguments.
- traits for handling a wide range of spatial data types accurately.

## Dimensions

Dimensions are wrapper types. They hold the lookup index, details about the
grid, and other metadata. They are also used to index into the array.
`X`, `Y`, `Z` and `Ti` are the exported defaults. A generalised `Dim` type is available
to use arbitrary symbols to name dimensions. Custom dimension types can also be defined
using the `@dim` macro.
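
As a minimal sketch of the two naming options above: the `Lon` and `Lat` types here are hypothetical names made up for illustration, and the example assumes the abstract supertypes `XDim` and `YDim` are exported by the installed version of the package.

```julia
using DimensionalData

# Hypothetical custom dimensions, defined at top level.
# Assumption: XDim and YDim are the exported abstract supertypes of X and Y.
@dim Lon XDim "Longitude"
@dim Lat YDim "Latitude"

# Custom dimensions wrap a lookup index just like X and Y do.
A = DimArray(rand(3, 4), (Lon(10:10:30), Lat(1:4)))
row = A[Lat(2), Lon(1:2)]   # order of the wrappers does not matter

# The generalised Dim type names a dimension with an arbitrary symbol.
B = DimArray(rand(2, 3), (Dim{:site}(1:2), Dim{:species}([:a, :b, :c])))
col = B[Dim{:species}(2)]   # Int inside the wrapper is a regular index
```

Defining a type with `@dim` is mostly useful when a dimension should behave like `X` or `Y` elsewhere (for example in plots), while `Dim{:name}` avoids defining any new type.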

Dimensions can be used to construct arrays in `rand`, `ones`, `zeros` and `fill` with
either a range for a lookup index or a number for the dimension length:

```
julia> using DimensionalData
julia> A = rand(X(1:40), Y(50))
40×50 DimArray{Float64,2} with dimensions:
X: 1:40 (Sampled - Ordered Regular Points)
Y
0.929006 0.116946 0.750017 … 0.172604 0.678835 0.495294
0.0550038 0.100739 0.427026 0.778067 0.309657 0.831754
⋮ ⋱
0.647768 0.965682 0.049315 0.220338 0.0326206 0.36705
0.851769 0.164914 0.555637 0.771508 0.964596 0.30265
```

We can also use dim wrappers for indexing, so that the dimension order in the underlying array does not need to be known:

```
julia> A[Y(1), X(1:10)]
10-element DimArray{Float64,1} with dimensions:
X: 1:10 (Sampled - Ordered Regular Points)
and reference dimensions: Y(1)
0.929006
0.0550038
0.641773
⋮
0.846251
0.506362
0.0492866
```

And this has no runtime cost:

```
julia> using BenchmarkTools
julia> A = ones(X(3), Y(3))
3×3 DimArray{Float64,2} with dimensions: X, Y
1.0 1.0 1.0
1.0 1.0 1.0
1.0 1.0 1.0
julia> @btime $A[X(1), Y(2)]
1.077 ns (0 allocations: 0 bytes)
1.0
julia> @btime parent($A)[1, 2]
1.078 ns (0 allocations: 0 bytes)
1.0
```

Dims can be used for indexing and views without knowing dimension order:

```
julia> A = rand(X(40), Y(50))
40×50 DimArray{Float64,2} with dimensions: X, Y
0.377696 0.105445 0.543156 … 0.844973 0.163758 0.849367
⋮ ⋱
0.431454 0.108927 0.137541 0.531587 0.592512 0.598927
julia> A[Y=3]
40-element DimArray{Float64,1} with dimensions: X
and reference dimensions: Y(3)
0.543156
⋮
0.137541
julia> view(A, Y(), X(1:5))
5×50 DimArray{Float64,2} with dimensions: X, Y
0.377696 0.105445 0.543156 … 0.844973 0.163758 0.849367
⋮ ⋱
0.875279 0.133032 0.925045 0.156768 0.736917 0.444683
```

And for specifying dimension number in all `Base` and `Statistics` functions that have a `dims` argument:

```
julia> using Statistics
julia> A = rand(X(3), Y(4), Ti(5));
julia> mean(A; dims=Ti)
3×4×1 DimArray{Float64,3} with dimensions: X, Y, Ti (Time)
[:, :, 1]
0.168058 0.52353 0.563065 0.347025
0.472786 0.395884 0.307846 0.518926
0.365028 0.381367 0.423553 0.369339
```

You can also use symbols to create `Dim{X}` dimensions,
although we can't use the `rand` method directly with Symbols,
and instead use the regular `DimArray` constructor:

```
julia> A = DimArray(rand(10, 20, 30), (:a, :b, :c));
julia> A[a=2:5, c=9]
4×20 DimArray{Float64,2} with dimensions: Dim{:a}, Dim{:b}
and reference dimensions: Dim{:c}(9)
0.134354 0.581673 0.422615 … 0.410222 0.687915 0.753441
0.573664 0.547341 0.835962 0.0353398 0.794341 0.490831
0.166643 0.133217 0.879084 0.695685 0.956644 0.698638
0.325034 0.147461 0.149673 0.560843 0.889962 0.75733
```

## Selectors

Selectors find indices in the lookup index for each dimension:

- `At(x)`: get the index exactly matching the passed in value(s).
- `Near(x)`: get the closest index to the passed in value(s).
- `Where(f::Function)`: filter the array axis by a function of the dimension index values.
- `Between(a, b)`: get all indices between two values, excluding the high value.
- `Contains(x)`: get indices where the value x falls within the interval, excluding the upper value. Only used for `Sampled` `Intervals`; for `Points`, use `At`.

(`Between` and `Contains` exclude the upper boundary so that adjacent selections
never contain the same index.)

Selectors can be used in `getindex`, `setindex!` and `view` to select indices
matching the passed in value(s).

We can use selectors inside dim wrappers:

```
julia> using Dates
julia> timespan = DateTime(2001,1):Month(1):DateTime(2001,12)
DateTime("2001-01-01T00:00:00"):Month(1):DateTime("2001-12-01T00:00:00")
julia> A = DimArray(rand(12,10), (Ti(timespan), X(10:10:100)))
12×10 DimArray{Float64,2} with dimensions:
Ti (Time): DateTime("2001-01-01T00:00:00"):Month(1):DateTime("2001-12-01T00:00:00") (Sampled - Ordered Regular Points)
X: 10:10:100 (Sampled - Ordered Regular Points)
0.14106 0.476176 0.311356 0.454908 … 0.464364 0.973193 0.535004
⋮ ⋱
0.522759 0.390414 0.797637 0.686718 0.901123 0.704603 0.0740788
julia> A[X(Near(35)), Ti(At(DateTime(2001,5)))]
0.3133109280208961
```
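
The example above only reads a value. As a hedged sketch of selectors in `setindex!` and `view` as well, the following uses a small array made up for illustration and assumes that `At`, `Near` and `Where` behave as described in the list above:

```julia
using DimensionalData

# A small array with a numeric lookup on X and a categorical lookup on Y.
A = DimArray(zeros(5, 3), (X(10:10:50), Y([:a, :b, :c])))

# setindex! with selectors: write to the cell where X == 20 and Y == :b.
A[X(At(20)), Y(At(:b))] = 1.0

# view with selectors: all X values >= 30 in column :c, sharing memory with A.
v = view(A, X(Where(x -> x >= 30)), Y(At(:c)))
fill!(v, 2.0)

# getindex with Near: 27 is closest to the lookup value 30.
nearest = A[X(Near(27)), Y(At(:c))]
```

Because `v` is a view, filling it modifies `A` itself, so the final lookup with `Near` reads one of the values written through the view.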

Without dim wrappers selectors must be in the right order:

```
julia> using Unitful
julia> A = rand(X((1:10:100)u"m"), Ti((1:5:100)u"s"));
julia> A[Between(10.5u"m", 50.5u"m"), Near(23u"s")]
4-element DimArray{Float64,1} with dimensions:
X: (11:10:41) m (Sampled - Ordered Regular Points)
and reference dimensions:
Ti(21 s) (Time): 21 s (Sampled - Ordered Regular Points)
0.584028
⋮
0.716715
```

For values other than `Int`/`AbstractArray`/`Colon` (which are set aside for
regular indexing) the `At` selector is assumed, and can be dropped completely:

```
julia> A = rand(X([:a, :b, :c]), Y([25.6, 25.7, 25.8]));
julia> A[:b, 25.8]
0.61839141062599
```

### Compile-time selectors

Using all `Val` indexes (only recommended for small arrays)
you can index with named dimensions `At` arbitrary values with no
runtime cost:

```
julia> A = rand(X(Val((:a, :b, :c))), Y(Val((5.0, 6.0, 7.0))))
3×3 DimArray{Float64,2} with dimensions:
X: Val{(:a, :b, :c)}() (Categorical - Unordered)
Y: Val{(5.0, 6.0, 7.0)}() (Categorical - Unordered)
0.5808 0.835037 0.528461
0.8924 0.431394 0.506915
0.66386 0.955305 0.774132
julia> @btime $A[:c, 6.0]
2.777 ns (0 allocations: 0 bytes)
0.9553052910459472
julia> @btime $A[Val(:c), Val(6.0)]
1.288 ns (0 allocations: 0 bytes)
0.9553052910459472
```

## Methods where dims can be used containing indices or Selectors

`getindex`, `setindex!`, `view`

## Methods where dims, dim types, or `Symbol`s can be used to indicate the array dimension

- `size`, `axes`, `firstindex`, `lastindex`
- `cat`, `reverse`, `dropdims`
- `reduce`, `mapreduce`
- `sum`, `prod`, `maximum`, `minimum`, `mean`, `median`, `extrema`, `std`, `var`, `cor`, `cov`
- `permutedims`, `adjoint`, `transpose`, `Transpose`
- `mapslices`, `eachslice`

## Methods where dims can be used to construct `DimArray`s

- `fill`, `ones`, `zeros`, `rand`

## Warnings

Indexing with unordered or reverse order arrays has undefined behaviour.
It will trash the dimension index, break `searchsorted` and nothing will make
sense any more. So do it at your own risk. However, indexing with sorted vectors
of Int can be useful, so it's allowed. But it will still do strange things
to your interval sizes if the dimension span is `Irregular`.

## Alternate Packages

There are a lot of similar Julia packages in this space. AxisArrays.jl, NamedDims.jl and NamedArrays.jl are registered alternatives that each cover some of the functionality provided by DimensionalData.jl. DimensionalData.jl should be able to replicate most of their syntax and functionality.

AxisKeys.jl and AbstractIndices.jl are some other interesting developments. For more detail on why there are so many similar options and where things are headed, read this thread.