~kem@1.0 device

The Kernel Execution Machine device



About

The kernel-em NIF (kernel execution machine, the ~kem@1.0 device) is a HyperBEAM Rust device built on top of wgpu that offers a general-purpose GPU compute execution machine for .wgsl functions (shaders, kernels).

Because wgpu is a cross-platform GPU API, HyperBEAM node operators can add the KEM device to offer a compute platform for KEM functions. And since KEM functions can be called from within an ao process through ao.resolve (via the kem@1.0 device), they can run as GPU compute sidecars alongside ao processes.

This device is experimental and in the PoC stage.

KEM Technical Architecture

KEM function source code is deployed on Arweave (example, double integer: btSvNclyu2me_zGh4X9ULVRZqwze9l2DpkcVHcLw9Eg), and the source code TXID is used as the KEM function ID.

fn execute_kernel(
    kernel_id: String,
    input_data: rustler::Binary,
    output_size_hint: u64,
) -> NifResult<Vec<u8>> {
    // fetch the .wgsl kernel source from Arweave by its TXID
    let kernel_src = retrieve_kernel_src(&kernel_id).unwrap();
    // initialize the GPU executor (wgpu device/queue) synchronously
    let kem = pollster::block_on(KernelExecutor::new());
    // run the kernel against the input bytes and return the raw output bytes
    let result = kem.execute_kernel_default(&kernel_src, input_data.as_slice(), Some(output_size_hint));
    Ok(result)
}

A KEM function execution takes 3 parameters: function ID, binary input data, and output size hint ratio (e.g., 2 means the output is expected to be no more than 2x the size of the input).
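To illustrate the hint's semantics, here is a minimal Rust sketch; the helper name is hypothetical and not part of the device API:

```rust
/// Hypothetical helper illustrating the output size hint: the hint is a
/// ratio, so the executor allocates at most `input_len * hint` bytes
/// for the kernel's output buffer.
fn max_output_bytes(input_len: usize, size_hint: u64) -> usize {
    input_len * size_hint as usize
}

fn main() {
    // a 1 KiB input with a hint of 2 allows up to 2 KiB of output
    assert_eq!(max_output_bytes(1024, 2), 2048);
    println!("{}", max_output_bytes(1024, 2));
}
```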

The KEM takes the input, retrieves the kernel source code from Arweave, executes the GPU instructions against the given input on the HyperBEAM node operator's hardware, and returns the resulting bytes.
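The retrieval step amounts to resolving the function ID (an Arweave TXID) against a gateway. A hedged sketch, assuming the arweave.net gateway and a hypothetical helper name, not the device's actual implementation:

```rust
/// Hypothetical sketch: resolve a KEM function ID (an Arweave TXID)
/// to a gateway URL from which the .wgsl source can be fetched.
fn kernel_src_url(kernel_id: &str) -> String {
    // any Arweave gateway works; arweave.net is assumed here
    format!("https://arweave.net/{}", kernel_id)
}

fn main() {
    let url = kernel_src_url("btSvNclyu2me_zGh4X9ULVRZqwze9l2DpkcVHcLw9Eg");
    println!("{}", url);
}
```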

On Writing Kernel Functions

Since the kernel execution machine (KEM) is designed around byte-oriented I/O and a standardized shader entrypoint, a kernel function must name its entrypoint main, declare it as a @compute shader, and take its input and output as bytes. Here is a skeleton example:

// SPDX-License-Identifier: GPL-3.0

// input as u32 array
@group(0) @binding(0)
var<storage, read> input_bytes: array<u32>;

// output as u32 array
@group(0) @binding(1)
var<storage, read_write> output_bytes: array<u32>;

// a workgroup of 256 threads
@compute @workgroup_size(256)
// main compute kernel entry point
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    // kernel logic goes here; index the storage buffers with global_id.x
}
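Because the storage buffers are declared as array&lt;u32&gt;, raw bytes are exchanged as 32-bit words. A host-side sketch of this packing, assuming little-endian words zero-padded to a multiple of 4 bytes (helper names are illustrative, not the device's API):

```rust
/// Pack raw bytes into the u32 words a WGSL `array<u32>` buffer expects,
/// zero-padding the tail so the length is a multiple of 4.
fn pack_bytes_to_u32(bytes: &[u8]) -> Vec<u32> {
    bytes
        .chunks(4)
        .map(|chunk| {
            let mut word = [0u8; 4];
            word[..chunk.len()].copy_from_slice(chunk);
            u32::from_le_bytes(word)
        })
        .collect()
}

/// Unpack u32 words back into bytes (inverse of the above, minus padding).
fn unpack_u32_to_bytes(words: &[u32]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

fn main() {
    let packed = pack_bytes_to_u32(&[1, 2, 3, 4, 5]);
    // [1,2,3,4] -> 0x04030201 (little-endian), [5,0,0,0] -> 5
    assert_eq!(packed, vec![0x04030201, 0x00000005]);
    assert_eq!(unpack_u32_to_bytes(&packed)[..5], [1u8, 2, 3, 4, 5]);
}
```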

Uniform Parameters

Uniform parameters have been introduced as well, allowing you to pass configuration data and constants to your compute shaders. Uniforms are read-only data that remains constant across all invocations of the shader.

Here is an example of a skeleton function with uniform parameters:

// SPDX-License-Identifier: GPL-3.0

// input as u32 array
@group(0) @binding(0)
var<storage, read> input_bytes: array<u32>;

// output as u32 array
@group(0) @binding(1)
var<storage, read_write> output_bytes: array<u32>;

// uniform parameters for configuration
@group(0) @binding(2)
var<uniform> params: vec2<u32>; // example: param1, param2

// a workgroup of 256 threads
@compute @workgroup_size(256)
// main compute kernel entry point
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    // Access uniform parameters
    let param1 = i32(params.x);
    let param2 = i32(params.y);
    
    // your kernel logic here
}
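On the host side, the vec2&lt;u32&gt; uniform at @binding(2) corresponds to an 8-byte buffer. A sketch of serializing the two parameters, assuming WGSL's little-endian vec2&lt;u32&gt; layout (the helper name is illustrative, not part of the device API):

```rust
/// Hypothetical sketch: serialize two u32 parameters into the 8 bytes
/// backing a WGSL `var<uniform> params: vec2<u32>` binding.
fn pack_params(param1: u32, param2: u32) -> [u8; 8] {
    let mut buf = [0u8; 8];
    buf[..4].copy_from_slice(&param1.to_le_bytes());
    buf[4..].copy_from_slice(&param2.to_le_bytes());
    buf
}

fn main() {
    // 7 -> [7,0,0,0], 256 -> [0,1,0,0] in little-endian
    let buf = pack_params(7, 256);
    assert_eq!(buf, [7, 0, 0, 0, 0, 1, 0, 0]);
}
```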

Example: Image Glitcher

Using the image glitcher kernel function (source code): the original image is glitched via the kernel function and minted as an AO NFT on Bazar:
https://bazar.arweave.net/#/asset/0z8MNwaRpkXhEgIxUv8ESNhtHxVGNfFkmGkoPtu0amY

References

  • wgpu
  • example kernel source (double integer): btSvNclyu2me_zGh4X9ULVRZqwze9l2DpkcVHcLw9Eg
  • device source code: native/kernel_em_nif
  • hb device interface: dev_kem.erl
  • nif tests: kem_nif_test.erl
  • ao process example: kem-device.lua
  • Technical Architecture Diagram