Fast Fourier transforms of MPI-distributed Julia arrays

This package provides multidimensional FFTs and related transforms on MPI-distributed Julia arrays via the PencilArrays package.

The name of this package comes from the decomposition of 3D domains along two out of three dimensions, sometimes called pencil decomposition. This is illustrated by the figure below, where each coloured block is managed by a different MPI process. Typically, one wants to compute FFTs of a scalar or vector field along the three spatial dimensions. With a pencil decomposition, a 3D FFT is performed one dimension at a time: each process applies a serial FFT along the direction that is not decomposed, and is therefore fully local. Global data transpositions are then needed to switch from one pencil configuration to another, so that FFTs can be performed along the remaining dimensions.

[Figure: Pencil decomposition of 3D domains]
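
The claim that a 3D FFT can be computed one dimension at a time can be checked on a small array. The following is a minimal serial sketch, not a description of how PencilFFTs is implemented: it uses a naive O(N²) DFT written in plain Julia (the helper names `dft`, `dft_along` and `dft3` are made up for this illustration) and verifies that three successive 1D transforms match the 3D DFT evaluated directly from its definition.

```julia
# Naive O(N^2) 1D discrete Fourier transform (for illustration only).
function dft(x::AbstractVector)
    N = length(x)
    [sum(x[n + 1] * cis(-2π * k * n / N) for n in 0:N-1) for k in 0:N-1]
end

# Apply the 1D transform to every slice of `A` along dimension `dim`.
dft_along(A::AbstractArray, dim::Integer) = mapslices(dft, complex(A); dims = dim)

# Reference: 3D DFT evaluated directly from its definition.
function dft3(A::AbstractArray{<:Any,3})
    Nx, Ny, Nz = size(A)
    [sum(A[x + 1, y + 1, z + 1] *
         cis(-2π * (kx * x / Nx + ky * y / Ny + kz * z / Nz))
         for x in 0:Nx-1, y in 0:Ny-1, z in 0:Nz-1)
     for kx in 0:Nx-1, ky in 0:Ny-1, kz in 0:Nz-1]
end

# Small real-valued test field.
A = [sin(i) + cos(2j) * k for i in 1:3, j in 1:4, k in 1:5]

# Transform along dimension 1, then 2, then 3.
B = dft_along(dft_along(dft_along(A, 1), 2), 3)

@assert B ≈ dft3(A)
```

In the distributed setting, each of the three passes would act on whichever dimension is currently local to a process, with a global transposition between passes.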


Features

  • distributed N-dimensional FFTs of MPI-distributed Julia arrays, using the PencilArrays package;

  • FFTs and related transforms (e.g. DCTs / Chebyshev transforms) may be arbitrarily combined along different dimensions;

  • in-place and out-of-place transforms;

  • high scalability up to (at least) tens of thousands of MPI processes.


Installation

PencilFFTs can be installed using the Julia package manager:

julia> ] add PencilFFTs

Quick start

The following example shows how to apply a 3D FFT to real data distributed over 12 MPI processes arranged on a 3 × 4 grid (the same distribution as in the figure above).

using MPI
using PencilFFTs
using Random

MPI.Init()  # MPI must be initialised before using the communicator

dims = (16, 32, 64)  # input data dimensions
transform = Transforms.RFFT()  # apply a 3D real-to-complex FFT

# Distribute 12 processes on a 3 × 4 grid.
comm = MPI.COMM_WORLD  # we assume MPI.Comm_size(comm) == 12
proc_dims = (3, 4)

# Create plan
plan = PencilFFTPlan(dims, transform, proc_dims, comm)

# Allocate and initialise input data, and apply transform.
u = allocate_input(plan)
rand!(u)  # fill the local part of the input with random values
uF = plan * u

# Apply backwards transform. Note that the result is normalised.
v = plan \ uF
@assert u ≈ v

For more details see the tutorial.
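
To get a feel for what the 3 × 4 process grid means for the 16 × 32 × 64 array above, the sketch below computes which index ranges each process would hold. It assumes that the first dimension is kept local and that dimensions 2 and 3 are split into balanced contiguous blocks. `block_range` and `local_ranges` are hypothetical helpers written for this illustration; the actual partitioning is decided by PencilArrays and may differ.

```julia
# Range of block `p` (1-based) when 1:N is split into P balanced contiguous blocks.
# Assumption: this is one reasonable splitting rule, chosen here for illustration.
block_range(N, P, p) = ((N * (p - 1)) ÷ P + 1):((N * p) ÷ P)

dims = (16, 32, 64)   # global data dimensions, as in the example above
proc_dims = (3, 4)    # 3 × 4 process grid

# Assumption: dimension 1 is local; dimensions 2 and 3 are decomposed.
function local_ranges(dims, proc_dims, coords)
    (1:dims[1],
     block_range(dims[2], proc_dims[1], coords[1]),
     block_range(dims[3], proc_dims[2], coords[2]))
end

# Print the index ranges owned by each of the 12 processes.
for p1 in 1:proc_dims[1], p2 in 1:proc_dims[2]
    println((p1, p2), " => ", local_ranges(dims, proc_dims, (p1, p2)))
end
```

Since 32 is not divisible by 3, the blocks along the second dimension have unequal sizes under this rule (10, 11 and 11 elements), while the third dimension splits evenly into four blocks of 16.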


Performance

The performance of PencilFFTs is comparable to that of widely adopted MPI-based FFT libraries implemented in lower-level languages. As seen below, with its default settings, PencilFFTs generally outperforms the Fortran P3DFFT library.

[Figure: Strong scaling of PencilFFTs]

See the benchmarks section of the docs for details.