JIT Field Access

Runtime field selection for ray tracing operators without instantiating all field combinations.

The Problem

Operators such as ProjectionOperator select fields at runtime from user configuration, but some fields (e.g., VelocityX/Y/Z) have specialized implementations that require the field index at compile time. Instantiating one kernel per field would mean 30+ kernel instantiations, with correspondingly large binaries and long compile times.

Solution Architecture

flowchart TD
    classDef input fill:#6a1b9a,color:#fff,stroke:#4a148c
    classDef decision fill:#1565c0,color:#fff,stroke:#0d47a1
    classDef jit fill:#2e7d32,color:#fff,stroke:#1b5e20
    classDef fallback fill:#ef6c00,color:#fff,stroke:#e65100
    classDef result fill:#5c6bc0,color:#fff,stroke:#3949ab

    runtime[/"Runtime field_index"/]:::input
    runtime --> check{Backend?}:::decision

    check -->|"AdaptiveCpp + generic"| dynamic[DynamicFieldAccessor]:::jit
    check -->|"Other backends"| host[Host-side dispatch]:::fallback

    dynamic --> jit[JIT replaces placeholder]:::jit
    jit --> kernel_jit[Single kernel]:::result

    host --> trait{Specialized?}:::fallback
    trait -->|Yes| kernel_spec[Specialized kernel]:::result
    trait -->|No| kernel_arr[Array-access kernel]:::result
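
The decision flow in the chart can be summarized as a small pure function. This is only an illustrative sketch: the Backend and KernelPath enums and the select_kernel_path function are hypothetical names that do not come from the code base, and the real backend check most likely happens at compile time in BackendCapabilities.h.

```cpp
// Hypothetical names throughout; this mirrors the flowchart only.
enum class Backend { AdaptiveCppGeneric, AdaptiveCppOther, IntelSycl };
enum class KernelPath { JitSingleKernel, SpecializedKernel, ArrayAccessKernel };

// With AdaptiveCpp and the generic target, a single JIT-patched kernel is used.
// Every other backend falls back to host-side dispatch, which picks a
// specialized kernel when the field has one and the array-access kernel otherwise.
constexpr KernelPath select_kernel_path(Backend backend, bool field_is_specialized) {
    if (backend == Backend::AdaptiveCppGeneric) {
        return KernelPath::JitSingleKernel;
    }
    return field_is_specialized ? KernelPath::SpecializedKernel
                                : KernelPath::ArrayAccessKernel;
}
```

Note that on the JIT path the specialization flag plays no role in kernel selection, because the placeholder is bound to the right implementation regardless of field.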

Backend Implementations

DynamicFieldAccessor (AdaptiveCpp)

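When compiled with --acpp-targets=generic, DynamicFieldAccessor uses the ACPP_EXT_DYNAMIC_FUNCTIONS extension to replace a placeholder function with the selected implementation at JIT time, so there is no dispatch overhead once the kernel has been JIT-compiled.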

template <typename TDataset>
SYCL_EXTERNAL FP dynamic_get_field_value(const TDataset& dataset, const Photon& photon);

// At kernel launch, bind the placeholder to the implementation for the selected field:
config.define(&dynamic_get_field_value<Dataset>, &get_field_value_impl<Dataset, VelocityZ>);

Host-Side Dispatch (Universal)

Fallback for Intel SYCL and for AdaptiveCpp builds that do not use the generic target. Dispatch happens on the host before kernel launch, not inside kernels:

// Host-side: select kernel once, then launch
dispatch_field_kernel<TDataset>(
    quantity,
    [&]<FieldIndex FI>() { kernel_specialized<FI>(...); },  // VelocityX/Y/Z
    [&]() { kernel_array(quantity, ...); }                   // Temperature, Density, etc.
);
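
One way such a dispatcher could be written is sketched below. It is not the implementation in SmartFieldAccessor.h, whose contents are not shown here: the field list, ToyDataset, dispatch_one, and selected_path are hypothetical, and the lambdas record which path ran instead of launching kernels. The sketch assumes C++20 for the templated lambda, matching the call style above.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical field list; the real FieldIndex definition is not shown here.
enum FieldIndex { Temperature, Density, VelocityX, VelocityY, VelocityZ };

struct ToyDataset {};  // hypothetical stand-in for a dataset type

// Same shape as the trait in "Trait Detection": false unless registered.
template <typename Dataset, FieldIndex Field>
struct has_specialized_get_entry : std::false_type {};
template <>
struct has_specialized_get_entry<ToyDataset, VelocityX> : std::true_type {};
template <>
struct has_specialized_get_entry<ToyDataset, VelocityY> : std::true_type {};
template <>
struct has_specialized_get_entry<ToyDataset, VelocityZ> : std::true_type {};

// Lift one runtime value to a template argument, consulting the trait.
template <typename TDataset, FieldIndex FI, typename Specialized, typename Array>
void dispatch_one([[maybe_unused]] Specialized& specialized,
                  [[maybe_unused]] Array& array) {
    if constexpr (has_specialized_get_entry<TDataset, FI>::value) {
        specialized.template operator()<FI>();  // field known at compile time
    } else {
        array();  // runtime-indexed fallback
    }
}

// Assumed implementation: only candidate fields get a switch case, so the
// specialized callable is instantiated at most once per case.
template <typename TDataset, typename Specialized, typename Array>
void dispatch_field_kernel(FieldIndex quantity, Specialized specialized, Array array) {
    switch (quantity) {
        case VelocityX: dispatch_one<TDataset, VelocityX>(specialized, array); break;
        case VelocityY: dispatch_one<TDataset, VelocityY>(specialized, array); break;
        case VelocityZ: dispatch_one<TDataset, VelocityZ>(specialized, array); break;
        default: array(); break;
    }
}

// Records which path was taken instead of launching a real kernel.
std::string selected_path(FieldIndex quantity) {
    std::string path;
    dispatch_field_kernel<ToyDataset>(
        quantity,
        [&]<FieldIndex FI>() { path = "specialized:" + std::to_string(static_cast<int>(FI)); },
        [&]() { path = "array:" + std::to_string(static_cast<int>(quantity)); });
    assert(!path.empty());  // exactly one branch must have run
    return path;
}
```

In this sketch, selected_path(VelocityZ) takes the specialized branch and selected_path(Density) takes the array branch. The switch runs once on the host, so the cost of choosing a kernel is paid per launch rather than per photon.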

Inside the kernels there is no runtime dispatch:

// Specialized kernel - compile-time field
const FP value = dataset.get_entry<Quantity>(photon);

// Array kernel - runtime index, direct memory access
const FP value = dataset.data.get_array_entry(field_index, cell);

Trait Detection

template <typename Dataset, FieldIndex Field>
struct has_specialized_get_entry : std::false_type {};

template <>
struct has_specialized_get_entry<SphericalShellDataset, VelocityX> : std::true_type {};
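
As a rough illustration of how the trait can be queried, the sketch below adds a _v helper and a compile-time selection with if constexpr. The helper has_specialized_get_entry_v, the access_path function, OtherDataset, and the reduced field list are assumptions made for this example and may not exist in DatasetInterface.h.

```cpp
#include <string_view>
#include <type_traits>

enum FieldIndex { Temperature, VelocityX };  // hypothetical subset of fields
struct SphericalShellDataset {};             // stand-in for the real dataset
struct OtherDataset {};                      // hypothetical second dataset

// Primary template: no specialized accessor unless registered.
template <typename Dataset, FieldIndex Field>
struct has_specialized_get_entry : std::false_type {};

template <>
struct has_specialized_get_entry<SphericalShellDataset, VelocityX> : std::true_type {};

// Hypothetical convenience helper in the style of the standard *_v traits.
template <typename Dataset, FieldIndex Field>
inline constexpr bool has_specialized_get_entry_v =
    has_specialized_get_entry<Dataset, Field>::value;

// Registration is per (dataset, field) pair; everything else stays false.
static_assert(has_specialized_get_entry_v<SphericalShellDataset, VelocityX>);
static_assert(!has_specialized_get_entry_v<SphericalShellDataset, Temperature>);
static_assert(!has_specialized_get_entry_v<OtherDataset, VelocityX>);

// The branch is resolved at compile time, so only one path is instantiated.
template <typename Dataset, FieldIndex Field>
constexpr std::string_view access_path() {
    if constexpr (has_specialized_get_entry_v<Dataset, Field>) {
        return "specialized";
    } else {
        return "array";
    }
}
```

Because the primary template defaults to false, a field that is never registered falls through to the array path without any extra code.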

Performance Comparison

| Approach | Compile Time | Kernel Instantiations | Runtime Overhead |
|---|---|---|---|
| Naive compile-time | Long | 30+ | None |
| Host-side dispatch | Fast | ~4 (3 specialized + 1 array) | None |
| DynamicFieldAccessor | Fastest | 1 | None (after JIT) |

Adding New Specialized Fields

  1. Implement in dataset header:

    template <>
    FP SphericalShellDataset::get_entry_impl<NewField>(const Photon& photon) const {
        return computed_value;
    }
    

  2. Register trait in DatasetInterface.h:

    template <>
    struct has_specialized_get_entry<SphericalShellDataset, NewField> : std::true_type {};
    

Files Reference

| File | Purpose |
|---|---|
| src/jit/BackendCapabilities.h | Backend detection |
| src/jit/DynamicFieldAccessor.h | AdaptiveCpp JIT |
| src/jit/SmartFieldAccessor.h | Host-side dispatch helper |
| src/DatasetInterface.h | Trait definitions |
| tests/tests_smart_field_accessor.cpp | Unit tests |