PAGANI & m-CUBES: An Effortless Migration from CUDA to SYCL for Numerical Integration

March 7, 2024

What Are PAGANI and m-CUBES?

Numerical integration is required in many fields. In physics, for example, it
appears in beam dynamics simulation and in parameter estimation for
cosmological models. Low-dimensional integrals (one or two variables) can often
be computed efficiently, even for oscillatory or sharply peaked integrands.
Medium- to high-dimensional integrals, however, are far more expensive and
typically require highly parallel computing architectures.

PAGANI is a deterministic, quadrature-based algorithm designed
for massively parallel architectures. It computes an integral as a weighted
sum of the integrand f evaluated at d-dimensional points x_i, each
associated with a weight w_i. PAGANI applies quadrature rules over
progressively smaller regions of the integration space, refining its estimate
until a user-defined error threshold is reached.
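
The refinement loop described above can be illustrated with a small host-side sketch. It is not PAGANI itself: it works in one dimension, uses Simpson's rule as a stand-in for PAGANI's multi-dimensional quadrature rules, and runs serially. The names `Region`, `QuadResult`, `simpson`, and `integrate` are hypothetical, as is the choice to give each region a share of the tolerance proportional to its width. What it keeps is the shape of the idea: every active region is evaluated in each iteration, regions whose error estimate is small enough are finalized, and the rest are split.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

struct Region { double a, b; };  // hypothetical 1D region [a, b]
struct QuadResult { double estimate; double error; int iterations; };

// Simpson's rule: a weighted sum of f at three points with
// weights (b - a) / 6 * {1, 4, 1}.
double simpson(const std::function<double(double)>& f, double a, double b) {
    const double m = 0.5 * (a + b);
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b));
}

// Breadth-first refinement: all active regions are processed per iteration.
QuadResult integrate(const std::function<double(double)>& f,
                     double a, double b, double tol, int max_iter = 20) {
    assert(b > a && tol > 0.0 && max_iter > 0);
    std::vector<Region> active;
    active.push_back({a, b});
    double done_sum = 0.0, done_err = 0.0;  // finalized regions
    double open_sum = 0.0, open_err = 0.0;  // regions still being refined
    for (int it = 1; it <= max_iter; ++it) {
        std::vector<Region> next;
        open_sum = 0.0;
        open_err = 0.0;
        for (const Region& r : active) {
            const double m = 0.5 * (r.a + r.b);
            const double coarse = simpson(f, r.a, r.b);
            const double fine = simpson(f, r.a, m) + simpson(f, m, r.b);
            // Error estimate: difference between the two resolutions.
            const double err = std::fabs(fine - coarse);
            if (err <= tol * (r.b - r.a) / (b - a)) {
                done_sum += fine;   // accurate enough, finalize
                done_err += err;
            } else {
                open_sum += fine;   // keep a provisional contribution
                open_err += err;
                next.push_back({r.a, m});  // split for the next iteration
                next.push_back({m, r.b});
            }
        }
        if (next.empty()) return {done_sum, done_err, it};
        active = std::move(next);
    }
    // Iteration budget exhausted: report the best available estimate.
    return {done_sum + open_sum, done_err + open_err, max_iter};
}
```

Calling `integrate` with a lambda for sin(x) on [0, pi] and a tolerance of 1e-8 returns an estimate close to 2. In a parallel setting, the inner loop over regions is the part that maps naturally to one work-item or work-group per region.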

m-CUBES is a probabilistic Monte Carlo algorithm based on the
VEGAS integrator. It estimates integrals by random sampling across the
integration space and uses the standard deviation of the samples to determine
error bounds. The algorithm employs importance and stratified sampling and
partitions the domain into multiple sub-cubes, each sampled independently.
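
As a rough illustration of the sub-cube idea, the sketch below performs plain stratified Monte Carlo over the unit hypercube and derives an error estimate from the per-cube sample variance. It is a simplification under stated assumptions: it omits the importance sampling that VEGAS-style integrators use, it runs serially on the host, and the names `McResult` and `stratified_mc`, the fixed seed, and the uniform grid of `cuts` intervals per axis are all hypothetical choices made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

struct McResult { double estimate; double std_error; };

// Stratified Monte Carlo on [0, 1]^dim with cuts^dim equal sub-cubes.
McResult stratified_mc(
    const std::function<double(const std::vector<double>&)>& f,
    std::size_t dim, std::size_t cuts, std::size_t samples_per_cube,
    unsigned seed = 42) {
    assert(dim >= 1 && cuts >= 1 && samples_per_cube >= 2);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    std::size_t num_cubes = 1;
    for (std::size_t d = 0; d < dim; ++d) num_cubes *= cuts;
    const double side = 1.0 / static_cast<double>(cuts);
    const double volume = std::pow(side, static_cast<double>(dim));
    const double n = static_cast<double>(samples_per_cube);

    double total = 0.0, variance = 0.0;
    std::vector<double> x(dim);
    for (std::size_t cube = 0; cube < num_cubes; ++cube) {
        double sum = 0.0, sum_sq = 0.0;
        for (std::size_t s = 0; s < samples_per_cube; ++s) {
            // Decode the cube index into one grid coordinate per axis,
            // then draw a uniform point inside that sub-cube.
            std::size_t idx = cube;
            for (std::size_t d = 0; d < dim; ++d) {
                const std::size_t k = idx % cuts;
                idx /= cuts;
                x[d] = (static_cast<double>(k) + unit(rng)) * side;
            }
            const double fx = f(x);
            sum += fx;
            sum_sq += fx * fx;
        }
        const double mean = sum / n;
        // Unbiased sample variance, clamped against rounding below zero.
        const double sample_var =
            std::max(0.0, (sum_sq - n * mean * mean) / (n - 1.0));
        total += volume * mean;                        // cube contribution
        variance += volume * volume * sample_var / n;  // variance of that term
    }
    return {total, std::sqrt(variance)};
}
```

For instance, integrating f(x, y) = x * y with `dim = 2`, `cuts = 8`, and 100 samples per cube yields an estimate near the exact value 0.25. Because each sub-cube is sampled independently, the outer loop is the natural unit of parallel work.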

Challenge of Vendor Lock-In

Without universally adopted standards, portability across computing platforms
traditionally required maintaining multiple code bases. CUDA, while popular,
is proprietary and limited to NVIDIA GPUs.

To address this, we developed SYCL implementations of PAGANI and m-CUBES,
migrating from CUDA to achieve both performance and portability across
heterogeneous architectures, including GPUs and multi-CPU systems.

CUDA to SYCL Implementation for PAGANI and m-CUBES

The Intel® oneAPI Toolkits provide compilers, libraries, and migration tools
such as the Intel® DPC++ Compatibility Tool, which automates much of the
CUDA-to-SYCL porting process.

PAGANI and m-CUBES offload most computation to accelerators using iterative
workflows. The Compatibility Tool successfully converted nearly all Thrust
library calls to SYCL equivalents.

Figure 1. PAGANI flowchart
Figure 2. m-CUBES flowchart

One exception was the minimax method, which triggered a
compatibility warning. In this case, a non-trivially copyable object caused a
SYCL compilation error when captured in a lambda. Refactoring the code to pass
individual pointers resolved the issue.

Note: regions->leftcoord cannot be used directly inside the lambda
expression; doing so results in an illegal memory access at run time.
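
The refactoring pattern can be sketched in standard C++ without a SYCL toolchain. Everything here is an assumption made for illustration: the `Regions` struct is a hypothetical stand-in for the real container (only the member name `leftcoord` comes from the text above), `length` and `right_edges` are invented, and `host_parallel_for` is a serial mock of a kernel launch. In an actual SYCL port, the loop would be a `parallel_for` submitted to a `sycl::queue`, and the pointers would refer to device-accessible memory (for example, allocated with `sycl::malloc_device`).

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the region container. The user-provided
// destructor makes it non-trivially copyable, so capturing the object by
// value in a kernel lambda would be rejected by a SYCL compiler.
struct Regions {
    double* leftcoord;
    double* length;
    std::size_t size;
    explicit Regions(std::size_t n)
        : leftcoord(new double[n]()), length(new double[n]()), size(n) {}
    ~Regions() { delete[] leftcoord; delete[] length; }
    Regions(const Regions&) = delete;
    Regions& operator=(const Regions&) = delete;
};

static_assert(!std::is_trivially_copyable<Regions>::value,
              "the container itself is not safe to capture by value");
static_assert(std::is_trivially_copyable<const double*>::value,
              "raw pointers are safe to capture by value");

// Serial mock of a kernel launch: calls kernel(i) for each index.
template <typename Kernel>
void host_parallel_for(std::size_t n, Kernel kernel) {
    for (std::size_t i = 0; i < n; ++i) kernel(i);
}

void right_edges(const Regions* regions, double* out) {
    assert(regions != nullptr && out != nullptr);
    // Copy the individual pointers into locals BEFORE building the lambda.
    // Writing regions->leftcoord inside the lambda would capture the host
    // pointer `regions` and dereference it in the kernel, which is the
    // illegal memory access described in the note above.
    const double* left = regions->leftcoord;
    const double* len = regions->length;
    host_parallel_for(regions->size, [=](std::size_t i) {
        out[i] = left[i] + len[i];  // only raw pointers are captured
    });
}
```

The design choice is to keep ownership on the host side and hand the kernel nothing but plain pointers and scalars, which sidesteps both the compile-time copyability check and the run-time dereference of a host-only object.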

Figure 4. Kernel call not properly converted by the Intel DPC++ Compatibility Tool
Figure 5. Functional SYCL code adapted from the CUDA implementation

Performance Evaluation

To evaluate performance, we tested the primary kernels of PAGANI and m-CUBES
using multiple integrands commonly employed in numerical integration
benchmarks, including trigonometric, polynomial, and exponential functions.