RAPIDS
The Python bindings of openPMD-api enable easy loading into the GPU-accelerated RAPIDS.ai datascience & AI/ML ecosystem.
How to Install
Follow the official documentation to install RAPIDS.
# preparation
conda update -n base conda
conda install -n base conda-libmamba-solver
conda config --set solver libmamba
# install
conda create -n rapids -c rapidsai -c conda-forge -c nvidia rapids python cudatoolkit openpmd-api pandas
conda activate rapids
Dataframes
The central Python API call to convert to openPMD particles to a cuDF dataframe is the ParticleSpecies.to_df
method.
import openpmd_api as io
import cudf
s = io.Series("samples/git-sample/data%T.h5", io.Access.read_only)
electrons = s.iterations[400].particles["electrons"]
cdf = cudf.from_pandas(electrons.to_df())
type(cdf) # cudf.DataFrame
print(cdf)
# note: no series.flush() needed
One can also combine all iterations in a single dataframe like this:
cdf = s.to_cudf("electrons")
# like before but with a new column "iteration" and all particles
print(cdf)
openPMD as SQL Database
Once converted to a dataframe, one can query and process openPMD data also with SQL syntax as provided by many databases.
A project that provides such syntax is for instance BlazingSQL (see the BlazingSQL install documentation).
import openpmd_api as io
from blazingsql import BlazingContext
s = io.Series("samples/git-sample/data%T.h5", io.Access.read_only)
electrons = s.iterations[400].particles["electrons"]
bc = BlazingContext(enable_progress_bar=True)
bc.create_table('electrons', electrons.to_df())
# all properties for electrons > 3e11 kg*m/s
bc.sql('SELECT * FROM electrons WHERE momentum_z > 3e11')
# selected properties
bc.sql('SELECT momentum_x, momentum_y, momentum_z, weighting FROM electrons WHERE momentum_z > 3e11')
Example
A detailed example script for particle and field analysis is documented under as 11_particle_dataframe.py
in our examples.