JIT Field Access
Runtime field selection for ray tracing operators without instantiating all field combinations.
The Problem
Operators like ProjectionOperator select fields at runtime based on user configuration, but some fields (e.g., VelocityX/Y/Z) have specialized implementations requiring compile-time field indices. Instantiating a kernel per field leads to 30+ compilations, huge binaries, and long compile times.
Solution Architecture
flowchart TD
classDef input fill:#6a1b9a,color:#fff,stroke:#4a148c
classDef decision fill:#1565c0,color:#fff,stroke:#0d47a1
classDef jit fill:#2e7d32,color:#fff,stroke:#1b5e20
classDef fallback fill:#ef6c00,color:#fff,stroke:#e65100
classDef result fill:#5c6bc0,color:#fff,stroke:#3949ab
runtime[/"Runtime field_index"/]:::input
runtime --> check{Backend?}:::decision
check -->|"AdaptiveCpp + generic"| dynamic[DynamicFieldAccessor]:::jit
check -->|"Other backends"| host[Host-side dispatch]:::fallback
dynamic --> jit[JIT replaces placeholder]:::jit
jit --> kernel_jit[Single kernel]:::result
host --> trait{Specialized?}:::fallback
trait -->|Yes| kernel_spec[Specialized kernel]:::result
trait -->|No| kernel_arr[Array-access kernel]:::result
Backend Implementations
DynamicFieldAccessor (AdaptiveCpp)
With --acpp-targets=generic, uses ACPP_EXT_DYNAMIC_FUNCTIONS to replace placeholder functions at JIT time, giving zero dispatch overhead after compilation.
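// Placeholder function; its definition is supplied at JIT time.
template <typename TDataset>
SYCL_EXTERNAL FP dynamic_get_field_value(const TDataset& dataset, const Photon& photon);
// At kernel launch: bind the placeholder to the implementation for the selected field.
config.define(&dynamic_get_field_value<Dataset>, &get_field_value_impl<Dataset, VelocityZ>);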
Host-Side Dispatch (Universal)
Fallback for Intel SYCL and for AdaptiveCpp builds without the generic target. Dispatch happens once on the host before kernel launch, not inside kernels:
// Host-side: select kernel once, then launch
dispatch_field_kernel<TDataset>(
    quantity,
    [&]<FieldIndex FI>() { kernel_specialized<FI>(...); },  // VelocityX/Y/Z
    [&]() { kernel_array(quantity, ...); }                  // Temperature, Density, etc.
);
Inside the kernels there is no runtime dispatch:
// Specialized kernel - compile-time field
const FP value = dataset.get_entry<Quantity>(photon);
// Array kernel - runtime index, direct memory access
const FP value = dataset.data.get_array_entry(field_index, cell);
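The host-side dispatch described above can be sketched in plain C++ without any SYCL dependency. This is a minimal illustration, not the project's implementation: `ToyDataset`, `FieldTag`, `sample_field`, the enum values, and the velocity formula are all hypothetical, and `FP` is assumed to be `double`. Unlike the template lambda shown earlier, this sketch passes the field as a tag argument to a generic lambda, which keeps it compilable as C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

using FP = double;  // assumption: the document does not define FP

// Hypothetical field list; the project's real FieldIndex is not shown here.
enum class FieldIndex : std::size_t { Temperature, Density, VelocityX, VelocityY, VelocityZ };

// Carries a field index as a type so a generic lambda can recover it at compile time.
template <FieldIndex FI>
using FieldTag = std::integral_constant<FieldIndex, FI>;

// One switch on the host per launch; the chosen callable contains no further branching.
template <typename SpecializedFn, typename ArrayFn>
FP dispatch_field_kernel(FieldIndex quantity, SpecializedFn&& specialized, ArrayFn&& array) {
    switch (quantity) {
        case FieldIndex::VelocityX: return specialized(FieldTag<FieldIndex::VelocityX>{});
        case FieldIndex::VelocityY: return specialized(FieldTag<FieldIndex::VelocityY>{});
        case FieldIndex::VelocityZ: return specialized(FieldTag<FieldIndex::VelocityZ>{});
        default:                    return array();
    }
}

// Toy dataset: scalar fields live in arrays, velocity components come from a formula.
struct ToyDataset {
    std::vector<std::vector<FP>> arrays;  // arrays[field][cell], scalar fields only
    FP speed = 3.0;

    // Specialized path: the field is a compile-time constant.
    template <FieldIndex FI>
    FP get_entry(std::size_t cell) const {
        constexpr std::size_t axis =
            static_cast<std::size_t>(FI) - static_cast<std::size_t>(FieldIndex::VelocityX);
        static_assert(axis < 3, "only velocity components are specialized");
        return speed * static_cast<FP>(axis + 1) + static_cast<FP>(cell);  // made-up formula
    }

    // Array path: the field is a runtime index into stored data.
    FP get_array_entry(FieldIndex field, std::size_t cell) const {
        const auto index = static_cast<std::size_t>(field);
        assert(index < arrays.size() && cell < arrays[index].size());
        return arrays[index][cell];
    }
};

// Stand-in for "select kernel once, then launch".
inline FP sample_field(const ToyDataset& dataset, FieldIndex quantity, std::size_t cell) {
    return dispatch_field_kernel(
        quantity,
        [&](auto tag) { return dataset.get_entry<decltype(tag)::value>(cell); },
        [&]() { return dataset.get_array_entry(quantity, cell); });
}
```

Calling `sample_field(dataset, FieldIndex::VelocityY, cell)` takes the specialized branch, while `FieldIndex::Temperature` falls through to the array branch. In both cases the only runtime branch is the single `switch` executed on the host.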
Trait Detection
template <typename Dataset, FieldIndex Field>
struct has_specialized_get_entry : std::false_type {};
template <>
struct has_specialized_get_entry<SphericalShellDataset, VelocityX> : std::true_type {};
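As a rough sketch of how such a trait could be queried, the example below selects a kernel path at compile time. The `_v` helper, `KernelPath`, `select_kernel_path`, `OtherDataset`, and the enum values are assumptions made for illustration; only the trait itself comes from the text above.

```cpp
#include <type_traits>

// Hypothetical stand-ins: the real FieldIndex and dataset types are defined elsewhere.
enum class FieldIndex { Temperature, Density, VelocityX, VelocityY, VelocityZ };
struct SphericalShellDataset {};
struct OtherDataset {};  // hypothetical dataset with no specialized fields

// Primary template: by default a (dataset, field) pair has no specialized accessor.
template <typename Dataset, FieldIndex Field>
struct has_specialized_get_entry : std::false_type {};

// Opt-in for one specific pair.
template <>
struct has_specialized_get_entry<SphericalShellDataset, FieldIndex::VelocityX> : std::true_type {};

// Convenience variable template (an addition of this sketch).
template <typename Dataset, FieldIndex Field>
inline constexpr bool has_specialized_get_entry_v =
    has_specialized_get_entry<Dataset, Field>::value;

enum class KernelPath { Specialized, Array };

// Compile-time choice between the two kernel flavours.
template <typename Dataset, FieldIndex Field>
constexpr KernelPath select_kernel_path() {
    if constexpr (has_specialized_get_entry_v<Dataset, Field>) {
        return KernelPath::Specialized;
    } else {
        return KernelPath::Array;
    }
}
```

Because the primary template derives from `std::false_type`, any pair that has not been registered quietly resolves to the array path.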
Performance Comparison
| Approach | Compile Time | Kernel Instantiations | Runtime Overhead |
|---|---|---|---|
| Naive compile-time | Long | 30+ | None |
| Host-side dispatch | Fast | ~4 (3 specialized + 1 array) | None |
| DynamicFieldAccessor | Fastest | 1 | None (after JIT) |
Adding New Specialized Fields
- Implement the specialized accessor in the dataset header.
- Register the trait in DatasetInterface.h.
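The two steps above might look roughly like this for a new `VelocityY` specialization. Everything here is a hedged sketch: the `Photon` layout, the `radial_speed` member, the radial-outflow formula, and the shape of `get_entry` are assumptions, since the actual dataset header is not shown in this document.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

using FP = double;  // assumption: the document does not define FP
enum class FieldIndex { Temperature, Density, VelocityX, VelocityY, VelocityZ };  // hypothetical
struct Photon { FP x, y, z; };  // hypothetical minimal photon: position only

// Trait primary template, as it would appear in DatasetInterface.h.
template <typename Dataset, FieldIndex Field>
struct has_specialized_get_entry : std::false_type {};

// Hypothetical dataset with an analytic, purely radial velocity field.
struct SphericalShellDataset {
    FP radial_speed;

    template <FieldIndex Field>
    FP get_entry(const Photon& photon) const;  // defined per field via specialization
};

// Step 1 (dataset header): implement the specialized accessor for the new field.
template <>
inline FP SphericalShellDataset::get_entry<FieldIndex::VelocityY>(const Photon& photon) const {
    const FP r = std::sqrt(photon.x * photon.x + photon.y * photon.y + photon.z * photon.z);
    assert(r > FP(0) && "radial direction is undefined at the origin");
    return radial_speed * photon.y / r;  // y-component of a radial outflow
}

// Step 2 (DatasetInterface.h): register the trait so dispatch picks the specialized kernel.
template <>
struct has_specialized_get_entry<SphericalShellDataset, FieldIndex::VelocityY> : std::true_type {};
```

If step 2 is skipped in a design like this one, the trait stays `false`, so dispatch would presumably keep using the array path and the new accessor would never be called.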
Files Reference
| File | Purpose |
|---|---|
| src/jit/BackendCapabilities.h | Backend detection |
| src/jit/DynamicFieldAccessor.h | AdaptiveCpp JIT |
| src/jit/SmartFieldAccessor.h | Host-side dispatch helper |
| src/DatasetInterface.h | Trait definitions |
| tests/tests_smart_field_accessor.cpp | Unit tests |