PAGANI & m-CUBES: An Effortless Migration from CUDA to SYCL for Numerical Integration
March 7, 2024
Story at a Glance
- Old Dominion University and Fermi National Laboratory migrated CUDA* code
  for the PAGANI and m-CUBES parallel algorithms to SYCL using the
  Intel® DPC++ Compatibility Tool.
- The SYCL code achieved 90–95% of the performance of CUDA-optimized code
  on NVIDIA* V100.
What Are PAGANI and m-CUBES?
Many applications in fields such as physics depend on numerical integration,
including beam dynamics simulation and parameter estimation in
cosmological models. Low-dimensional integrals (one or two
variables) can often be computed efficiently, even for oscillatory or sharply
peaked integrands. Medium- to high-dimensional integrals, however, typically
require highly parallel computing architectures.
PAGANI is a deterministic, quadrature-based algorithm designed
for massively parallel architectures. It computes integrals using weighted
summations of an integrand function f evaluated at d-dimensional
points x_i, each associated with a weight w_i.
PAGANI applies quadrature rules over progressively smaller regions of the
integration space, refining its estimate until a user-defined error threshold
is reached.
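The refinement loop described above can be illustrated with a minimal, sequential sketch. It is not PAGANI's implementation: it assumes a one-dimensional domain, a 3-point Gauss-Legendre rule in place of PAGANI's multi-dimensional rules, and a simple error estimate taken as the difference between a region's estimate and the sum over its two halves. The names `Region`, `gauss3`, and `integrate` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// One sub-region [a, b]; a 1-D stand-in for a d-dimensional box (assumption).
struct Region { double a, b; };

struct Result { double estimate; double error; std::size_t regions; };

// Weighted sum  sum_i w_i * f(x_i)  with the 3-point Gauss-Legendre rule on [a, b].
double gauss3(const std::function<double(double)>& f, const Region& r) {
    static const double node[3] = {-0.7745966692414834, 0.0, 0.7745966692414834};
    static const double weight[3] = {5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0};
    const double half = 0.5 * (r.b - r.a);
    const double mid = 0.5 * (r.a + r.b);
    double sum = 0.0;
    for (int i = 0; i < 3; ++i) sum += weight[i] * f(mid + half * node[i]);
    return half * sum;
}

// Split regions until each one meets its share of the error budget `tol`.
Result integrate(const std::function<double(double)>& f, double a, double b,
                 double tol, int max_iters = 30) {
    assert(a < b && tol > 0.0 && max_iters > 0);
    std::vector<Region> active{{a, b}};
    double estimate = 0.0, error = 0.0;
    std::size_t evaluated = 0;
    for (int it = 0; it < max_iters && !active.empty(); ++it) {
        const bool last = (it + 1 == max_iters);
        std::vector<Region> next;
        for (const Region& r : active) {
            const double m = 0.5 * (r.a + r.b);
            const double coarse = gauss3(f, r);
            const double fine = gauss3(f, {r.a, m}) + gauss3(f, {m, r.b});
            const double err = std::fabs(fine - coarse);
            ++evaluated;
            if (last || err <= tol * (r.b - r.a) / (b - a)) {
                estimate += fine;  // region is finished: keep its contribution
                error += err;
            } else {
                next.push_back({r.a, m});  // region is too inaccurate: split it
                next.push_back({m, r.b});
            }
        }
        active.swap(next);
    }
    return {estimate, error, evaluated};
}
```

Calling `integrate([](double x) { return std::sin(x); }, 0.0, 3.141592653589793, 1e-9)` approximates the integral of sin over [0, π], whose exact value is 2. In a parallel version, the inner loop over active regions is the part that would run on the accelerator, since each region is evaluated independently.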
m-CUBES is a probabilistic Monte Carlo algorithm based on the
VEGAS integrator. It estimates integrals by random sampling across the
integration space and uses the standard deviation of the samples to determine
error bounds. The algorithm employs importance and stratified sampling and
partitions the domain into multiple sub-cubes, each sampled independently.
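The sub-cube sampling and the standard-deviation-based error bound can be sketched as follows. This is a simplified illustration, not the m-CUBES code: it assumes the unit hypercube as the domain, uses uniform sampling inside each sub-cube, and omits the VEGAS-style importance-sampling grid entirely. The names `McResult` and `stratified_mc`, the fixed seed, and the parameter names are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

struct McResult { double estimate; double std_error; };

// Stratified Monte Carlo over [0,1]^dim: each axis is cut into `cuts` intervals,
// giving cuts^dim sub-cubes that are sampled independently.
McResult stratified_mc(const std::function<double(const std::vector<double>&)>& f,
                       std::size_t dim, std::size_t cuts,
                       std::size_t samples_per_cube, unsigned seed = 42) {
    assert(dim > 0 && cuts > 0 && samples_per_cube > 1);
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    const double side = 1.0 / static_cast<double>(cuts);
    const double volume = std::pow(side, static_cast<double>(dim));
    std::size_t n_cubes = 1;
    for (std::size_t d = 0; d < dim; ++d) n_cubes *= cuts;

    const double n = static_cast<double>(samples_per_cube);
    std::vector<double> x(dim);
    double estimate = 0.0, variance = 0.0;

    for (std::size_t c = 0; c < n_cubes; ++c) {
        double sum = 0.0, sum_sq = 0.0;
        for (std::size_t s = 0; s < samples_per_cube; ++s) {
            // Decode the cube index into per-axis offsets, then jitter inside the cube.
            std::size_t idx = c;
            for (std::size_t d = 0; d < dim; ++d) {
                x[d] = (static_cast<double>(idx % cuts) + unit(rng)) * side;
                idx /= cuts;
            }
            const double v = f(x);
            sum += v;
            sum_sq += v * v;
        }
        const double mean = sum / n;
        const double sample_var = std::max((sum_sq - n * mean * mean) / (n - 1.0), 0.0);
        estimate += volume * mean;                   // this cube's share of the integral
        variance += volume * volume * sample_var / n;  // variance of that share
    }
    return {estimate, std::sqrt(variance)};
}
```

For example, integrating f(x) = x[0] + x[1] over the unit square with `stratified_mc(f, 2, 8, 16)` should give an estimate close to the exact value 1, with `std_error` as the one-sigma error bound. Because each sub-cube's samples are independent of every other sub-cube's, the outer loop is the natural unit of parallel work.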
Challenge of Vendor Lock-In
Without universally adopted standards, portability across computing platforms
traditionally required maintaining multiple code bases. CUDA, while popular,
is proprietary and limited to NVIDIA GPUs.
To address this, we developed SYCL implementations of PAGANI and m-CUBES,
migrating from CUDA to achieve both performance and portability across
heterogeneous architectures, including GPUs and multi-CPU systems.
CUDA to SYCL Implementation for PAGANI and m-CUBES
The Intel® oneAPI Toolkits provide compilers, libraries, and migration tools
such as the Intel® DPC++ Compatibility Tool, which automates much of the
CUDA-to-SYCL porting process.
PAGANI and m-CUBES offload most computation to accelerators using iterative
workflows. The Compatibility Tool successfully converted nearly all Thrust
library calls to SYCL equivalents.


One exception was the minimax method, which triggered a
compatibility warning. In this case, a non-trivially copyable object caused a
SYCL compilation error when captured in a lambda. Refactoring the code to pass
individual pointers resolved the issue.
Note: regions->leftcoord cannot be directly used
inside the lambda expression, as this results in an illegal runtime memory
access.
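The pointer-hoisting refactoring can be sketched in plain C++, with an ordinary loop standing in for a SYCL kernel launch. Everything except the member name `leftcoord` is an assumption made for illustration: the `Regions` layout, the `parallel_for_stub` helper, and the `shift_left_coords` function are hypothetical and are not taken from the PAGANI source.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the regions container. Owning a std::vector makes
// it non-trivially copyable, so it cannot be captured by value in a kernel.
struct Regions {
    std::vector<double> storage;
    double* leftcoord;
    std::size_t size;
    explicit Regions(std::size_t n)
        : storage(n, 0.0), leftcoord(storage.data()), size(n) {}
};
static_assert(!std::is_trivially_copyable<Regions>::value,
              "Regions is not trivially copyable");

// Sequential stand-in for a SYCL parallel_for: the kernel object is passed by
// value, just as a SYCL kernel lambda is copied to the device.
template <typename Kernel>
void parallel_for_stub(std::size_t n, Kernel kernel) {
    for (std::size_t i = 0; i < n; ++i) kernel(i);
}

void shift_left_coords(Regions* regions, double delta) {
    // Hoist the members into locals first. The lambda then captures a raw
    // pointer and a scalar by value, both of which are trivially copyable.
    double* left = regions->leftcoord;
    const std::size_t n = regions->size;

    // Writing regions->leftcoord[i] inside the lambda would instead capture
    // the host pointer `regions`; dereferencing it on a device is the illegal
    // memory access described in the note above.
    parallel_for_stub(n, [=](std::size_t i) { left[i] += delta; });
}
```

On the host this sketch runs either way, so it only shows the shape of the fix. In real SYCL code, `left` would also need to point to memory the device can access, for example a USM device allocation.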


Performance Evaluation
To evaluate performance, we tested the primary kernels of PAGANI and m-CUBES
using multiple integrands commonly employed in numerical integration
benchmarks, including trigonometric, polynomial, and exponential functions.