First Read

Step-by-step: how to read openPMD data? We are using the examples files from openPMD-example-datasets (example-3d.tar.gz).

Include / Import

After successful installation, you can start using openPMD-api as follows:

C++17

#include <openPMD/openPMD.hpp>

// example: data handling & print
#include <vector>   // std::vector
#include <iostream> // std::cout
#include <memory>   // std::shared_ptr

namespace io = openPMD;

Python

import openpmd_api as io

# example: data handling
import numpy as np

Open

Open an existing openPMD series in data<N>.h5. Further file formats than .h5 (HDF5) are supported: .bp (ADIOS2) or .json (JSON).

C++17

auto series = io::Series(
    "data_%T.h5",
    io::Access::READ_ONLY);

Python

series = io.Series(
    "data_%T.h5",
    io.Access.read_only)

Tip

Replace the file ending .h5 with a wildcard .%E to let openPMD autodetect the ending from the file system. Use the wildcard %T to match filename encoded iterations.

Tip

More detailed options can be passed via JSON or TOML as a further constructor parameter. Try {"defer_iteration_parsing": true} to speed up the first access. (Remember to explicitly it.open() iterations in that case.)

Iteration

Grouping by an arbitrary, positive integer number <N> in a series. Let’s take the iteration 100:

C++17

auto i = series.iterations[100];

Python

i = series.iterations[100]

Attributes

openPMD defines a kernel of meta attributes and can always be extended with more. Let’s see what we’ve got:

C++17

std::cout << "openPMD version: "
    << series.openPMD() << "\n";

if( series.containsAttribute("author") )
    std::cout << "Author: "
        << series.author() << "\n";

Python

print("openPMD version: ",
      series.openPMD)

if series.contains_attribute("author"):
    print("Author: ",
          series.author)

Record

An openPMD record can be either structured (mesh) or unstructured (particles). Let’s read an electric field:

C++17

// record
auto E = i.meshes["E"];

// record components
auto E_x = E["x"];

Python

# record
E = i.meshes["E"]

# record components
E_x = E["x"]

Tip

You can check via i.meshes.contains("E") (C++) or "E" in i.meshes (Python) if an entry exists.

Units

Even without understanding the name “E” we can check the dimensionality of a record to understand its purpose.

C++17

// unit system agnostic dimension
auto E_unitDim = E.unitDimension();

// ...
// io::UnitDimension::M

// conversion to SI
double x_unit = E_x.unitSI();

Python

# unit system agnostic dimension
E_unitDim = E.unit_dimension

# ...
# io.Unit_Dimension.M

# conversion to SI
x_unit = E_x.unit_SI

Note

This example is not yet written :-)

In the future, units are automatically converted to a selected unit system (not yet implemented). For now, please multiply your read data (x_data) with x_unit to covert to SI, otherwise the raw, potentially awkwardly scaled data is taken.

Register Chunk

We can load record components partially and in parallel or at once. Reading small data one by one is is a performance killer for I/O. Therefore, we register all data to be loaded first and then flush it in collectively.

C++17

// alternatively, pass pre-allocated
std::shared_ptr< double > x_data =
    E_x.loadChunk< double >();

Python

# returns an allocated but
# invalid numpy array
x_data = E_x.load_chunk()

Attention

After registering a data chunk such as x_data for loading, it MUST NOT be modified or deleted until the flush() step is performed! You must not yet access x_data !

One can also request to load a slice of data:

C++17

Extent extent = E_x.getExtent();
extent.at(2) = 1;
std::shared_ptr< double > x_slice_data =
    E_x.loadChunk< double >(
        io::Offset{0, 0, 4}, extent);

Python

# we support slice syntax, too
x_slice_data = E_x[:, :, 4]

Don’t forget that we still need to flush().

Flush Chunk

We now flush the registered data chunks and fill them with actual data from the I/O backend. Flushing several chunks at once allows to increase I/O performance significantly. Only after that, the variables x_data and x_slice_data can be read, manipulated and/or deleted.

C++17

series.flush();

Python

series.flush()

Data

We can now work with the newly loaded data in x_data (or x_slice_data):

C++17

auto extent = E_x.getExtent();

std::cout << "First values in E_x "
        "of shape: ";
for( auto const& dim : extent )
    std::cout << dim << ", ";
std::cout << "\n";

for( size_t col = 0;
     col < extent[1] && col < 5;
     ++col )
    std::cout << x_data.get()[col]
              << ", ";
std::cout << "\n";

Python

extent = E_x.shape

print(
    "First values in E_x "
    "of shape: ",
    extent)


print(x_data[0, 0, :5])

Close

Finally, the Series is closed when its destructor is called. Make sure to have flush() ed all data loads at this point, otherwise it will be called once more implicitly.

C++17

series.close()

Python

series.close()