Going parallel with reusable resources

In the previous section, we went over the basics of creating compute shaders and managing GPU memory. In this section, we will write a compute shader that runs more than one thread in parallel and look at what that changes.

So far, most of our shaders have looked like this:

import tgpu from 'typegpu';

const root = await tgpu.init();

const program = root.createGuardedComputePipeline(() => {
  'use gpu';
  // Run some code.
});

program.dispatchThreads();

The createGuardedComputePipeline() method is a simple way to create a function that runs on the GPU. Calling dispatchThreads() without arguments dispatches one logical GPU thread. That is useful for the smallest examples, but most compute workloads are interesting because they run many copies of the same program in parallel.

From one thread to many

To run more than one thread, add a positional argument to the shader function and pass a thread count to dispatchThreads(). In a one-dimensional dispatch, that argument is the thread index.

Let’s create a program that increments an entire array of values instead of incrementing a single value:

import tgpu, { d } from 'typegpu';

const root = await tgpu.init();

const valuesMutable = root.createMutable(d.arrayOf(d.u32, 16));

const program = root.createGuardedComputePipeline((x) => {
  'use gpu';
  valuesMutable.$[x]++;
});

export async function execute() {
  const threadCount = Math.floor(Math.random() * 16) + 1;
  program.dispatchThreads(threadCount);

  return {
    threadCount,
    values: await valuesMutable.read(),
  };
}

[Interactive preview: Values (not run)]

Each time you run the example, it chooses a random thread count between 1 and 16. If it dispatches 6 threads, thread 0 increments array element 0, thread 1 increments array element 1, and so on, up to thread 5. The preview highlights the part of the array touched by the latest dispatch.

Dispatch size and bounds

The dispatchThreads() method controls how many logical GPU threads will run the shader function. If we dispatch 16 threads for a 16-element array, we increment the whole array. If we dispatch 4 threads, only the first 4 elements are incremented because x takes values from 0 to the passed count minus 1.
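The relationship between the dispatch count and the touched elements can be sketched on the CPU. The `dispatchThreadsOnCpu` helper below is hypothetical and not part of TypeGPU; it only mimics the behavior described above by calling the kernel once for each index from 0 to the count minus 1. On a real GPU these calls would run in parallel, in no guaranteed order.

```typescript
// CPU-side sketch; dispatchThreadsOnCpu is a hypothetical helper, not TypeGPU API.
function dispatchThreadsOnCpu(count: number, kernel: (x: number) => void): void {
  for (let x = 0; x < count; x++) {
    kernel(x);
  }
}

// Stand-in for the 16-element GPU buffer.
const simulatedValues: number[] = Array.from({ length: 16 }, () => 0);

const increment = (x: number): void => {
  simulatedValues[x]++;
};

// Dispatching 4 threads touches only elements 0..3.
dispatchThreadsOnCpu(4, increment);
const afterPartialDispatch = [...simulatedValues];

// Dispatching 16 threads touches the whole array.
dispatchThreadsOnCpu(16, increment);
const afterFullDispatch = [...simulatedValues];
```

After the first call only the first four entries are 1. After the second call those four are 2 and the remaining twelve are 1.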

We still have to be careful not to read or write outside the bounds of the array. WebGPU prevents those accesses from reaching memory outside the binding, but that does not make them useful. The shader may still continue with meaningless data, or a write you expected may not happen.

If the dispatch size might be larger than the buffer length, guard the array access inside the shader:

const program = root.createGuardedComputePipeline((x) => {
  'use gpu';
  if (x >= valuesMutable.$.length) {
    return;
  }

  valuesMutable.$[x]++;
});

A useful mental model

For the increment example above, you can think of the shader as something close to this JavaScript loop:

const values = Array.from({ length: 16 }, () => 0);

for (let x = 0; x < values.length; x++) {
  values[x]++;
}

This is a useful analogy for understanding how the thread index maps to data. But it is only an analogy. The shader invocations run in parallel, and their execution order is not guaranteed.

When one thread reads data that another thread may be writing, the loop analogy breaks down:

const program = root.createGuardedComputePipeline((x) => {
  'use gpu';
  if (x >= valuesMutable.$.length || x === 0) {
    return;
  }

  const sum = valuesMutable.$[x - 1] + valuesMutable.$[x];
  valuesMutable.$[x] = sum;
});

The result is nondeterministic because each thread may read valuesMutable.$[x - 1] before or after the neighboring thread writes to it. When shader invocations need to coordinate, the algorithm has to account for that explicitly.
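To make the order dependence concrete, here is a small CPU-side simulation. It is only a sketch: the helper names are hypothetical, nothing here is TypeGPU API, and a real GPU interleaves threads in ways a sequential loop cannot fully reproduce. It runs the same neighbor-sum logic with two different thread orders. It then shows one common way to remove the dependence, which is to read from one array and write to another. That is an illustration of the general idea, not necessarily the approach this guide takes.

```typescript
// CPU-side simulation; these helpers are hypothetical and not part of TypeGPU.
type Kernel = (x: number, input: readonly number[], output: number[]) => void;

// Adds the left neighbor to the current cell, skipping index 0.
const neighborSum: Kernel = (x, input, output) => {
  if (x >= output.length || x === 0) {
    return;
  }
  output[x] = input[x - 1] + input[x];
};

// In-place: every thread reads and writes the same array.
function runInPlace(order: number[], initial: number[]): number[] {
  const shared = [...initial];
  for (const x of order) {
    neighborSum(x, shared, shared);
  }
  return shared;
}

// Double-buffered: threads read an untouched input and write to a separate output.
function runDoubleBuffered(order: number[], initial: number[]): number[] {
  const output = [...initial];
  for (const x of order) {
    neighborSum(x, initial, output);
  }
  return output;
}

const initialValues = [1, 1, 1, 1];
const ascendingOrder = [0, 1, 2, 3];
const descendingOrder = [3, 2, 1, 0];

const inPlaceAscending = runInPlace(ascendingOrder, initialValues); // [1, 2, 3, 4]
const inPlaceDescending = runInPlace(descendingOrder, initialValues); // [1, 2, 2, 2]
const bufferedAscending = runDoubleBuffered(ascendingOrder, initialValues); // [1, 2, 2, 2]
const bufferedDescending = runDoubleBuffered(descendingOrder, initialValues); // [1, 2, 2, 2]
```

The in-place version gives different results for the two orders, while the double-buffered version gives the same result for both, because no thread ever reads a cell that another thread writes.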

Multi-dimensional compute shaders

When using createGuardedComputePipeline(), we can pass a function that takes up to three arguments: x, y, and z. These are the thread indices along the x, y, and z axes, respectively.

At a high level, a 2D compute shader can often be reduced to a 1D compute shader by flattening the coordinates into one index. The same is true for 3D work. That translation is sometimes useful, but it can make the shader harder to read because the first thing each thread has to do is recover the coordinates it actually cares about.
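As a rough illustration of that flattening, the sketch below converts between a 2D coordinate and a 1D index in plain TypeScript. The `flatten` and `unflatten` helpers and the grid size are assumptions made for this example, and the row-major layout (x varies fastest) is only one possible convention.

```typescript
// Hypothetical helpers for illustration; not part of TypeGPU.
const GRID_WIDTH = 8;
const GRID_HEIGHT = 6;

// Flatten a 2D coordinate into a 1D index (row-major: x varies fastest).
function flatten(x: number, y: number): number {
  return y * GRID_WIDTH + x;
}

// Recover the 2D coordinate from a 1D index.
function unflatten(index: number): { x: number; y: number } {
  return { x: index % GRID_WIDTH, y: Math.floor(index / GRID_WIDTH) };
}

// A flattened "shader" has to start by recovering the coordinates it cares about.
const flatGrid: number[] = Array.from({ length: GRID_WIDTH * GRID_HEIGHT }, () => 0);
for (let index = 0; index < flatGrid.length; index++) {
  const { x, y } = unflatten(index);
  // Encode the coordinate into the cell so the mapping is easy to inspect.
  flatGrid[index] = y * 10 + x;
}
```

With these sizes, the coordinate (3, 2) maps to index 19, and `unflatten(19)` gives back `{ x: 3, y: 2 }`.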

Let’s take a look at a simple 2D compute shader that increments a randomly sized rectangle inside a grid:

import tgpu, { d } from 'typegpu';

const root = await tgpu.init();

const WIDTH = 8;
const HEIGHT = 6;

const valuesMutable = root.createMutable(
  d.arrayOf(d.arrayOf(d.u32, HEIGHT), WIDTH),
);

const program = root.createGuardedComputePipeline((x, y) => {
  'use gpu';
  valuesMutable.$[x][y]++;
});

export async function execute() {
  const dispatchWidth = Math.floor(Math.random() * WIDTH) + 1;
  const dispatchHeight = Math.floor(Math.random() * HEIGHT) + 1;
  program.dispatchThreads(dispatchWidth, dispatchHeight);

  return {
    dispatchWidth,
    dispatchHeight,
    values: await valuesMutable.read(),
  };
}

[Interactive preview: Grid (not run)]

This time, the shader callback receives two indices. The x argument indexes the outer array, and the y argument indexes the inner array that x selected, so valuesMutable.$[x][y] points at one cell in the grid.

The dispatch uses the same shape, only now we provide two counts instead of one:

const dispatchWidth = Math.floor(Math.random() * WIDTH) + 1;
const dispatchHeight = Math.floor(Math.random() * HEIGHT) + 1;

program.dispatchThreads(dispatchWidth, dispatchHeight);

This asks TypeGPU to run dispatchWidth * dispatchHeight logical threads, but each thread still receives its own coordinate pair. For a 4 x 2 dispatch, x ranges from 0 to 3, while y ranges from 0 to 1. The preview highlights the rectangle touched by the latest dispatch.

This is the main reason to use a 2D dispatch for grid-shaped data: the shader can stay in the same coordinate system as the problem. The same pattern extends to 3D by adding a third callback argument and passing a third number to dispatchThreads().
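The 4 x 2 dispatch described above can be simulated on the CPU to see which cells it touches. This is a sketch only: `dispatchThreads2dOnCpu` is a hypothetical helper, not TypeGPU API, and the nested loops stand in for threads that a GPU would run in parallel.

```typescript
// CPU-side sketch; dispatchThreads2dOnCpu is a hypothetical helper, not TypeGPU API.
const WIDTH = 8;
const HEIGHT = 6;

// cells[x][y], mirroring the nested schema: WIDTH outer entries of HEIGHT cells each.
const cells: number[][] = Array.from({ length: WIDTH }, () =>
  Array.from({ length: HEIGHT }, () => 0),
);

function dispatchThreads2dOnCpu(
  countX: number,
  countY: number,
  kernel: (x: number, y: number) => void,
): void {
  // One kernel call per coordinate pair.
  for (let x = 0; x < countX; x++) {
    for (let y = 0; y < countY; y++) {
      kernel(x, y);
    }
  }
}

// x ranges from 0 to 3 and y ranges from 0 to 1.
dispatchThreads2dOnCpu(4, 2, (x, y) => {
  cells[x][y]++;
});

// Count how many cells the dispatch touched.
const touchedCount = cells.reduce(
  (total, column) => total + column.filter((value) => value > 0).length,
  0,
);
```

Here `touchedCount` is 8, which matches the 4 * 2 logical threads, and every touched cell lies inside the rectangle with x below 4 and y below 2.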

Swapping resources

So far, we have used createMutable() to create buffers that shader code can read and write directly. This is very convenient for simple shaders and requires no additional setup. We refer to this path as Fixed Resources: the shader references one particular GPU resource and TypeGPU binds it for us.

Sometimes, though, we want to swap which buffer is used by the shader for a particular dispatch. Later, the same idea will apply to other resource types too. For that, TypeGPU offers Laid Out Resources.

Laid out resources split the setup into two pieces:

A bind group layout describes the shape of the resources the shader expects.
A bind group provides the concrete GPU resources that satisfy that layout.

The layout is created first, without providing any actual buffers. For buffers, each layout entry uses the same schema we would use to create the buffer. If the entry is a storage buffer, it can also include an access mode:

import tgpu, { d } from 'typegpu';

const bindGroupLayout = tgpu.bindGroupLayout({
  counter: { storage: d.u32, access: 'mutable' },
  values: { storage: d.arrayOf(d.u32, 16) },
});

Each key in the bind group layout maps to a single resource. For the buffer resources we have explored so far, the equivalent layout entries look like this:

Fixed Resources shorthand	Laid Out Resources entry
`root.createUniform(d.u32)`	`{ uniform: d.u32 }`
`root.createReadonly(d.u32)`	`{ storage: d.u32 }`
`root.createMutable(d.u32)`	`{ storage: d.u32, access: 'mutable' }`

Storage entries are readonly by default, so { storage: d.u32 } and { storage: d.u32, access: 'readonly' } describe the same shader access.
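The default can be pictured with a small model in plain TypeScript. This is only a sketch of the rule stated above, not how TypeGPU implements it: StorageEntrySketch and effectiveAccess are hypothetical names, and the schema is reduced to a string for brevity.

```typescript
// Simplified model of a storage layout entry; not TypeGPU's real types.
type Access = 'readonly' | 'mutable';

interface StorageEntrySketch {
  storage: string; // stand-in for a schema such as d.u32
  access?: Access;
}

// An omitted access mode falls back to 'readonly'.
function effectiveAccess(entry: StorageEntrySketch): Access {
  return entry.access ?? 'readonly';
}

const implicit = effectiveAccess({ storage: 'u32' });
const explicit = effectiveAccess({ storage: 'u32', access: 'readonly' });
const mutable = effectiveAccess({ storage: 'u32', access: 'mutable' });

console.log(implicit === explicit); // true
console.log(mutable); // mutable
```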

To provide the resources, create a bind group with root.createBindGroup(layout, entries):

const counterBuffer = root.createBuffer(d.u32).$usage('storage');
const valuesBuffer = root.createBuffer(d.arrayOf(d.u32, 16)).$usage('storage');

const bindGroup = root.createBindGroup(bindGroupLayout, {
  counter: counterBuffer,
  values: valuesBuffer,
});

Because bind group entries are typed, providing a resource with the wrong schema or usage is a TypeScript error.

const wrongBuffer = root.createBuffer(d.f32).$usage('storage');
const invalidBindGroup = root.createBindGroup(bindGroupLayout, {
  // Error ts(2322): Type 'TgpuBuffer<F32> & StorageFlag' is not assignable to type
  // 'GPUBuffer | (TgpuBuffer<U32 | Atomic<U32> | DecoratedLocation<U32>> & StorageFlag)'.
  counter: wrongBuffer,
  values: valuesBuffer,
});

Here is that resource swap in a small particle simulation. The shader stays the same, while the selected bind group decides which particle buffer is advanced:

import tgpu, { d, std } from 'typegpu';

const root = await tgpu.init();

const PARTICLE_COUNT = 96;

const Particle = d.struct({
  position: d.vec2f,
  velocity: d.vec2f,
});
const ParticleArray = d.arrayOf(Particle, PARTICLE_COUNT);

const particleLayout = tgpu.bindGroupLayout({
  particles: { storage: ParticleArray, access: 'mutable' },
});

declare const initialStates: { position: d.v2f; velocity: d.v2f }[][];

const particleBuffers = initialStates.map((state) =>
  root.createBuffer(ParticleArray, state).$usage('storage'),
);

const bindGroups = particleBuffers.map((particles) =>
  root.createBindGroup(particleLayout, { particles }),
);

const simulate = root.createGuardedComputePipeline((i) => {
  'use gpu';
  const particle = particleLayout.$.particles[i];
  particleLayout.$.particles[i].position = std.fract(
    particle.position + particle.velocity,
  );
});

declare function render(selected: number): void;

export function run(selected: number) {
  const bindGroup = bindGroups[selected]!;

  simulate.with(bindGroup).dispatchThreads(PARTICLE_COUNT);
  render(selected);
}

The visible snippet hides the rendering code behind render(...) because render pipelines and SDF drawing have not been introduced yet. The full example below includes that part.
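The API documentation for dispatchThreads says that the expected thread count is sent as a uniform and "guarded" by a bounds check. The following is a CPU-side sketch of that idea in plain TypeScript, not TypeGPU code. The guardedDispatch helper and the WORKGROUP_SIZE value of 64 are assumptions made for illustration, and the loop runs sequentially where a GPU would run the threads in parallel.

```typescript
// CPU-side model of a guarded dispatch. This is not TypeGPU code.
// WORKGROUP_SIZE is an assumed value chosen for illustration only.
const WORKGROUP_SIZE = 64;

// Hypothetical helper: launches whole workgroups and skips surplus threads.
function guardedDispatch(threadCount: number, body: (i: number) => void): number {
  // Work is launched in whole workgroups, so the count is rounded up.
  const workgroupCount = Math.ceil(threadCount / WORKGROUP_SIZE);
  const launched = workgroupCount * WORKGROUP_SIZE;

  for (let id = 0; id < launched; id++) {
    // The guard: threads past the requested count do no work.
    if (id >= threadCount) {
      continue;
    }
    body(id);
  }
  return launched;
}

const PARTICLE_COUNT = 96;
const visited: number[] = [];
const launched = guardedDispatch(PARTICLE_COUNT, (i) => {
  visited.push(i);
});

console.log(launched); // 128
console.log(visited.length); // 96
```

Under these assumptions, 96 particles need two workgroups, so 128 threads start and the last 32 are stopped by the guard before they can index past the end of the particle array.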

Full example code

import { sdBox2d } from '@typegpu/sdf';
import tgpu, { common, d, std } from 'typegpu';
import { createExampleRoot } from './runnable/index.ts';

const PARTICLE_COUNT = 96;
const PARTICLE_HALF_SIZE = 0.0065;

const Particle = d.struct({
  position: d.vec2f,
  velocity: d.vec2f,
});
const ParticleArray = d.arrayOf(Particle, PARTICLE_COUNT);

export type ParticleValue = {
  position: d.v2f;
  velocity: d.v2f;
};

export type BufferIndex = 0 | 1 | 2;

export const BUFFER_COLORS = [
  d.vec3f(0.4, 0.91, 0.98),
  d.vec3f(0.98, 0.66, 0.83),
  d.vec3f(0.75, 0.95, 0.39),
] as const;

function createOrbitState(): ParticleValue[] {
  return Array.from({ length: PARTICLE_COUNT }, (_, i) => {
    const t = (i / PARTICLE_COUNT) * Math.PI * 2;
    const radius = 0.28 + Math.sin(i * 0.7) * 0.045;
    const tangent = t + Math.PI / 2;

    return {
      position: d.vec2f(0.5 + Math.cos(t) * radius, 0.5 + Math.sin(t) * radius),
      velocity: d.vec2f(Math.cos(tangent) * 0.0065, Math.sin(tangent) * 0.0065),
    };
  });
}

function createDriftState(): ParticleValue[] {
  const columns = 12;
  const rows = PARTICLE_COUNT / columns;

  return Array.from({ length: PARTICLE_COUNT }, (_, i) => {
    const x = i % columns;
    const y = Math.floor(i / columns);
    const offset = Math.sin(i * 1.37) * 0.012;

    return {
      position: d.vec2f((x + 0.5) / columns, (y + 0.5) / rows + offset),
      velocity: d.vec2f(0.009, 0.004),
    };
  });
}

function createBurstState(): ParticleValue[] {
  return Array.from({ length: PARTICLE_COUNT }, (_, i) => {
    const t = (i / PARTICLE_COUNT) * Math.PI * 2;
    const ring = (i % 12) / 12;
    const radius = 0.03 + ring * 0.12;
    const speed = 0.004 + ring * 0.007;

    return {
      position: d.vec2f(0.5 + Math.cos(t) * radius, 0.5 + Math.sin(t) * radius),
      velocity: d.vec2f(Math.cos(t) * speed, Math.sin(t) * speed),
    };
  });
}

const INITIAL_STATES = [createOrbitState(), createDriftState(), createBurstState()] as const;

export async function createBindGroupProgram(canvas: HTMLCanvasElement) {
  const root = await createExampleRoot();
  const context = root.configureContext({ canvas, alphaMode: 'premultiplied' });

  const computeLayout = tgpu.bindGroupLayout({
    particles: { storage: ParticleArray, access: 'mutable' },
  });
  const displayLayout = tgpu.bindGroupLayout({
    particles: { storage: ParticleArray },
  });

  const particleBuffers = INITIAL_STATES.map((state) =>
    root.createBuffer(ParticleArray, state).$usage('storage'),
  );
  const computeBindGroups = particleBuffers.map((particles) =>
    root.createBindGroup(computeLayout, { particles }),
  );
  const displayBindGroups = particleBuffers.map((particles) =>
    root.createBindGroup(displayLayout, { particles }),
  );

  const particleColor = root.createUniform(d.vec3f, BUFFER_COLORS[0]);

  const simulate = root.createGuardedComputePipeline((i) => {
    'use gpu';
    const particle = computeLayout.$.particles[i];
    computeLayout.$.particles[i].position = std.fract(particle.position + particle.velocity);
  });

  const render = root.createRenderPipeline({
    vertex: common.fullScreenTriangle,
    fragment: ({ uv }) => {
      'use gpu';
      const background = d.vec3f(0.09, 0.08, 0.15);
      const gridColor = d.vec3f(0.19, 0.18, 0.29);
      const gridUv = std.fract(uv * 8);
      const verticalGrid = std.min(gridUv.x, 1 - gridUv.x);
      const horizontalGrid = std.min(gridUv.y, 1 - gridUv.y);
      const gridDistance = std.min(verticalGrid, horizontalGrid);
      const gridMask = 1 - std.smoothstep(0.003, 0.007, gridDistance);

      let particleDistance = d.f32(1);
      for (const particle of displayLayout.$.particles) {
        const offset = std.abs(uv - particle.position);
        const wrappedOffset = std.min(offset, 1 - offset);
        particleDistance = std.min(
          particleDistance,
          sdBox2d(wrappedOffset, d.vec2f(PARTICLE_HALF_SIZE)),
        );
      }

      const particleMask = 1 - std.smoothstep(0, std.fwidth(particleDistance), particleDistance);
      const withGrid = std.mix(background, gridColor, gridMask);
      const withParticles = std.mix(withGrid, particleColor.$, particleMask);

      return d.vec4f(withParticles, 1);
    },
  });

  function resizeCanvasToDisplaySize() {
    const pixelRatio = Math.min(window.devicePixelRatio || 1, 2);
    const width = Math.min(1024, Math.max(1, Math.round(canvas.clientWidth * pixelRatio)));
    const height = Math.min(1024, Math.max(1, Math.round(canvas.clientHeight * pixelRatio)));

    if (canvas.width !== width || canvas.height !== height) {
      canvas.width = width;
      canvas.height = height;
    }
  }

  function draw(bufferIndex: BufferIndex) {
    const displayBindGroup = displayBindGroups[bufferIndex];
    if (!displayBindGroup) {
      return;
    }
    resizeCanvasToDisplaySize();
    particleColor.write(BUFFER_COLORS[bufferIndex]);
    render.with(displayBindGroup).withColorAttachment({ view: context }).draw(3);
  }

  return {
    computeBindGroups,
    draw,
    particleCount: PARTICLE_COUNT,
    root,
    simulate,
  };
}
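The simulate step in the example above adds the velocity to the position and passes the sum through std.fract, which keeps every particle inside the unit square by wrapping it around to the opposite edge. Here is a CPU-side sketch of one step in plain TypeScript, assuming std.fract behaves like the WGSL fract builtin (x - floor(x) per component). The Vec2 and Particle types and the step function are illustrative names, and the velocities are far larger than in the example so that the arithmetic stays exact.

```typescript
// CPU-side model of one simulation step. This is not TypeGPU code.
type Vec2 = { x: number; y: number };
type Particle = { position: Vec2; velocity: Vec2 };

// fract(v) = v - floor(v); the result is always in [0, 1).
const fract = (v: number): number => v - Math.floor(v);

// Moves the particle by its velocity and wraps it back into the unit square.
function step(p: Particle): Particle {
  return {
    position: {
      x: fract(p.position.x + p.velocity.x),
      y: fract(p.position.y + p.velocity.y),
    },
    velocity: p.velocity,
  };
}

// One particle leaving through the right edge, one through the left edge.
const rightExit = step({ position: { x: 0.75, y: 0.5 }, velocity: { x: 0.5, y: 0 } });
const leftExit = step({ position: { x: 0.25, y: 0.5 }, velocity: { x: -0.5, y: 0 } });

console.log(rightExit.position.x); // 0.25
console.log(leftExit.position.x); // 0.75
```

Because floor rounds toward negative infinity, the wrap works in both directions: 1.25 becomes 0.25 and -0.25 becomes 0.75.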

Now the same simulation step can be written in either style. The Fixed Resources version is bound to one particular set of resources, while the Laid Out Resources version keeps the pipeline reusable and supplies the resources when dispatching.

Laid Out Resources

import tgpu, { d } from 'typegpu';

const root = await tgpu.init();

const Particle = d.struct({
  position: d.vec3f,
  velocity: d.vec3f,
});

const deltaTimeBuffer = root.createBuffer(d.f32).$usage('uniform');
const particlesBuffer = root.createBuffer(d.arrayOf(Particle, 512)).$usage('storage');

const simulationLayout = tgpu.bindGroupLayout({
  deltaTime: { uniform: d.f32 },
  particles: { storage: d.arrayOf(Particle), access: 'mutable' },
});

const bindGroup = root.createBindGroup(simulationLayout, {
  deltaTime: deltaTimeBuffer,
  particles: particlesBuffer,
});

const simulate = root.createGuardedComputePipeline((x) => {
  'use gpu';
  const particle = simulationLayout.$.particles[x];
  simulationLayout.$.particles[x].position =
    particle.position + particle.velocity * simulationLayout.$.deltaTime;
});

simulate.with(bindGroup).dispatchThreads(512);

Fixed Resources

import 
const tgpu: {
    const: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/constant/tgpuConstant").constant;
    fn: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/function/tgpuFn").fn;
    comptime: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/function/comptime").comptime;
    resolve: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/resolve/tgpuResolve").resolve;
    resolveWithContext: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/resolve/tgpuResolve").resolveWithContext;
    init: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/root/init").init;
    initFromDevice: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/root/init").initFromDevice;
    slot: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/slot/slot").slot;
    lazy: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/slot/lazy").lazy;
    ... 10 more ...;
    '~unstable': typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/tgpuUnstable");
}@module ― typegpu
tgpu, { 
import d
d } from 'typegpu';

const 
const root: TgpuRoot
root = await 
const tgpu: {
    const: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/constant/tgpuConstant").constant;
    fn: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/function/tgpuFn").fn;
    comptime: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/function/comptime").comptime;
    resolve: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/resolve/tgpuResolve").resolve;
    resolveWithContext: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/resolve/tgpuResolve").resolveWithContext;
    init: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/root/init").init;
    initFromDevice: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/root/init").initFromDevice;
    slot: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/slot/slot").slot;
    lazy: typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/core/slot/lazy").lazy;
    ... 10 more ...;
    '~unstable': typeof import("/home/runner/work/TypeGPU/TypeGPU/packages/typegpu/src/tgpuUnstable");
}@module ― typegpu
tgpu.
init: (options?: InitOptions) => Promise<TgpuRoot>
Requests a new GPU device and creates a root around it.
If a specific device should be used instead, use
@see ― initFromDevice. *
@example 
When given no options, the function will ask the browser for a suitable GPU device.
const root = await tgpu.init();
@example 
If there are specific options that should be used when requesting a device, you can pass those in.
const adapterOptions: GPURequestAdapterOptions = ...;
const deviceDescriptor: GPUDeviceDescriptor = ...;
const root = await tgpu.init({ adapter: adapterOptions, device: deviceDescriptor });
init();

const 
const Particle: d.WgslStruct<{
    position: d.Vec3f;
    velocity: d.Vec3f;
}>
Particle = 
import d
d.
struct<{
    position: d.Vec3f;
    velocity: d.Vec3f;
}>(props: {
    position: d.Vec3f;
    velocity: d.Vec3f;
}): d.WgslStruct<{
    position: d.Vec3f;
    velocity: d.Vec3f;
}>
export struct
Creates a struct schema that can be used to construct GPU buffers.
Ensures proper alignment and padding of properties (as opposed to a d.unstruct schema).
The order of members matches the passed in properties object.
@example const CircleStruct = d.struct({ radius: d.f32, pos: d.vec3f });
@param ― props Record with string keys and TgpuData values,
each entry describing one struct member.
struct({
  
position: d.Vec3f
position: 
import d
d.
const vec3f: d.Vec3f
export vec3f
Schema representing vec3f - a vector with 3 elements of type f32.
Also a constructor function for this vector value.
@example const vector = d.vec3f(); // (0.0, 0.0, 0.0)
const vector = d.vec3f(1); // (1.0, 1.0, 1.0)
const vector = d.vec3f(1, 2, 3.5); // (1.0, 2.0, 3.5)
@example const buffer = root.createBuffer(d.vec3f, d.vec3f(0, 1, 2)); // buffer holding a d.vec3f value, with an initial value of vec3f(0, 1, 2);
vec3f,
  
velocity: d.Vec3f
velocity: 
import d
d.
const vec3f: d.Vec3f
export vec3f
Schema representing vec3f - a vector with 3 elements of type f32.
Also a constructor function for this vector value.
@example const vector = d.vec3f(); // (0.0, 0.0, 0.0)
const vector = d.vec3f(1); // (1.0, 1.0, 1.0)
const vector = d.vec3f(1, 2, 3.5); // (1.0, 2.0, 3.5)
@example const buffer = root.createBuffer(d.vec3f, d.vec3f(0, 1, 2)); // buffer holding a d.vec3f value, with an initial value of vec3f(0, 1, 2);
vec3f,
});

const deltaTimeUniform = root.createUniform(d.f32);
const particlesMutable = root.createMutable(d.arrayOf(Particle, 512));

const simulate = root.createGuardedComputePipeline((x) => {
  'use gpu';
  const particle = particlesMutable.$[x];
  particlesMutable.$[x].position =
    particle.position + particle.velocity * deltaTimeUniform.$;
});

simulate.dispatchThreads(512);

The tradeoff is a little more setup, but the shader is no longer bound to one concrete resource. It can be separated cleanly from runtime resource setup and reused with different bind groups.

Bind groups are immutable, so binding a different resource means creating a new bind group. If you only use a limited set of resource combinations, you can create those bind groups once, then cache and reuse them.
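
One way to cache bind groups is to memoize their creation per resource. The sketch below is only an illustration and does not call TypeGPU: `cachedBindGroups`, `bindGroupFor`, and the stand-in buffer objects are hypothetical names, and the `create` callback stands in for whatever builds a bind group in your code.

```typescript
// Hypothetical helper (not part of TypeGPU): memoizes one bind group per
// resource object. `create` is an assumed factory that you would replace
// with your own bind group creation code.
function cachedBindGroups<Resource extends object, BindGroup>(
  create: (resource: Resource) => BindGroup,
): (resource: Resource) => BindGroup {
  // WeakMap holds its keys weakly, so the cache does not keep
  // otherwise unused resources alive.
  const cache = new WeakMap<Resource, BindGroup>();

  return (resource) => {
    const hit = cache.get(resource);
    if (hit !== undefined) {
      return hit; // same resource as before: reuse the bind group
    }
    const bindGroup = create(resource);
    cache.set(resource, bindGroup);
    return bindGroup;
  };
}

// Stand-in objects so the sketch runs without a GPU.
const bufferA = { label: 'particles A' };
const bufferB = { label: 'particles B' };

let created = 0;
const bindGroupFor = cachedBindGroups((buffer: { label: string }) => {
  created += 1; // count how many bind groups are actually built
  return { boundTo: buffer.label };
});

bindGroupFor(bufferA);
bindGroupFor(bufferB);
bindGroupFor(bufferA); // served from the cache, nothing new is created

console.log(created); // 2
```

Keying the cache on the resource object itself avoids inventing string identifiers. If a bind group depends on several resources at once, a composite key (for example nested maps) would be needed instead.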

Summary

You now know how to dispatch many GPU threads, map them to 1D and 2D data, and keep shader accesses in bounds. You have also seen when fixed resources are enough and when bind groups are useful for reusing the same shader with different buffers.