
GPU Module Overview (Legacy)

This document is being replaced by GPU Module Overview. It is kept until it has been checked whether any information here should be merged into that page.

Drawing pipeline

This section gives an overview of the drawing pipeline of the GPU module.


Textures

Textures are used to hold pixel data. A texture can be 1, 2 or 3 dimensional, a cubemap, or an array of 2D textures/cubemaps. The internal storage of a texture (how the pixels are stored in memory on the GPU) is set when creating the texture.

/* Create an empty texture with HD resolution where pixels are stored as half floats. */
GPUTexture *texture = GPU_texture_create_2d("MyTexture", 1920, 1080, 1,0, GPU_RGBA16F, NULL);
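
To get a feel for what the internal storage format means for memory use, the footprint of the texture above can be estimated by hand. The sketch below is plain C and does not use the GPU module. The helper names are hypothetical, and it assumes tightly packed pixels, a single mip level, and no driver padding.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the GPU module: estimate the size of a
 * tightly packed 2D texture with a single mip level. */
static size_t texture_size_bytes(size_t width,
                                 size_t height,
                                 size_t channels,
                                 size_t bytes_per_channel)
{
  return width * height * channels * bytes_per_channel;
}

/* RGBA16F stores 4 channels of 2 bytes (one half float) each. */
static size_t hd_rgba16f_size_bytes(void)
{
  return texture_size_bytes(1920, 1080, 4, 2);
}
```

Under these assumptions the HD half float texture takes 16,588,800 bytes, twice as much as the same texture with 8 bits per channel.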

Frame buffer

A frame buffer is a group of textures you can render onto. These textures are arranged in a fixed set of slots. The first slot is reserved for a depth/stencil buffer. The other slots can be filled with regular textures, cube maps, or layer textures.

GPU_framebuffer_ensure_config is used to create/update a framebuffer.

GPUFramebuffer *fb = NULL;
GPU_framebuffer_ensure_config(&fb, {
  GPU_ATTACHMENT_NONE, // Slot reserved for depth/stencil buffer.
  GPU_ATTACHMENT_TEXTURE(texture), // First color slot, using the texture created above.
});
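
The slot rule described above (the first slot is for depth/stencil only, the remaining slots are for color attachments) can be modeled in a few lines of plain C. This is only an illustrative sketch. `FrameBufferModel`, `fb_model_set_slot` and the slot count of 8 are assumptions made for the example and are not part of the GPU module.

```c
#include <stdbool.h>
#include <stddef.h>

#define FB_MAX_SLOTS 8 /* Assumed slot count, for illustration only. */

typedef enum { ATTACH_NONE, ATTACH_DEPTH_STENCIL, ATTACH_COLOR } AttachKind;

/* Hypothetical model of a frame buffer: a fixed array of attachment slots. */
typedef struct {
  AttachKind slots[FB_MAX_SLOTS];
} FrameBufferModel;

/* Slot 0 accepts only a depth/stencil attachment, all other slots accept
 * only color attachments. Any slot may be left empty. */
static bool fb_model_set_slot(FrameBufferModel *fb, size_t slot, AttachKind kind)
{
  if (slot >= FB_MAX_SLOTS) {
    return false;
  }
  if (kind != ATTACH_NONE) {
    const bool is_depth_slot = (slot == 0);
    const bool is_depth_kind = (kind == ATTACH_DEPTH_STENCIL);
    if (is_depth_slot != is_depth_kind) {
      return false;
    }
  }
  fb->slots[slot] = kind;
  return true;
}
```

In this model, leaving slot 0 empty and putting a color attachment in slot 1 corresponds to a configuration that starts with GPU_ATTACHMENT_NONE.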

Shader program

A GPUShader is a program that runs on the GPU. The program can have several stages depending on its usage. When rendering geometry it needs at least a vertex and a fragment stage, and it can optionally have a geometry stage. Using geometry stages is not recommended, as Apple does not support them.

The stages execute in a fixed order. When drawing geometry, the vertex stage runs first, then the geometry stage (when available), and then the fragment stage. The logic of these stages is provided as GLSL code.

GPUShader *sh_depth = GPU_shader_create_from_arrays({
  .vert = (const char *[]) {my_vert_glsl_code, NULL},
  .frag = (const char *[]) {my_frag_glsl_code, NULL},
});

This creates a GPUShader, loads and compiles the vertex and fragment stages, and links them into a program that can be used on the GPU. It also generates a GPUShaderInterface that handles the lookup of input parameters (attributes, uniforms, uniform buffers, textures and shader storage buffer objects).

Cross Compilation

Our target is to cross compile GLSL to OpenGL 3/4 and Vulkan. Shaders that need to be cross compiled should be created using GPUShaderCreateInfo.

This mechanism was introduced in Blender 3.1, and we are currently in the process of migrating all internal shaders to it. We expect the migration to be completed in Blender 3.2.

See GLSL Cross Compilation for more details.


Geometry

Geometry is defined by a GPUPrimType, one index buffer (IBO) and one or more vertex buffers (VBOs). The GPUPrimType defines how the index buffer should be interpreted.

Indices inside the index buffer define the order in which elements are read from the vertex buffer(s). A vertex buffer is a table where each row contains the data of one element. When multiple vertex buffers are used, they are treated as different columns of the same table. This matches how GL backends organize geometry on GPUs.
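
The table view can be made concrete with a small CPU-side sketch. It is not GPU module code. The types and the function `assemble_vertices` are hypothetical and only mimic how an index selects the same row from every vertex buffer "column".

```c
#include <stddef.h>

typedef struct { float x, y; } Vec2;
typedef struct { float r, g, b, a; } Color;

/* One assembled row: the same index applied to every column. */
typedef struct {
  Vec2 pos;
  Color color;
} AssembledVertex;

/* For each index, read that row from both columns (two separate vertex
 * buffers in this model) and store the combined row in `out`.
 * `out` must have room for `index_len` elements. */
static void assemble_vertices(const unsigned int *indices,
                              size_t index_len,
                              const Vec2 *positions,
                              const Color *colors,
                              AssembledVertex *out)
{
  for (size_t i = 0; i < index_len; i++) {
    const unsigned int row = indices[i];
    out[i].pos = positions[row];
    out[i].color = colors[row];
  }
}
```

With four rows and the indices 0, 1, 2, 2, 1, 3 this yields six assembled vertices, which a triangle primitive type would interpret as two triangles sharing an edge.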

Index buffers

Index buffers can be created by using a GPUIndexBufBuilder.

GPUIndexBufBuilder ibuf;
/* Construct a builder for an index buffer that holds 6 triangles.
 * The vertex buffer it indexes into has 12 elements. */
GPU_indexbuf_init(&ibuf, GPU_PRIM_TRIS, 6, 12);

GPU_indexbuf_add_tri_verts(&ibuf, 0, 1, 2);
GPU_indexbuf_add_tri_verts(&ibuf, 2, 1, 3);
GPU_indexbuf_add_tri_verts(&ibuf, 4, 5, 6);
GPU_indexbuf_add_tri_verts(&ibuf, 6, 5, 7);
GPU_indexbuf_add_tri_verts(&ibuf, 8, 9, 10);
GPU_indexbuf_add_tri_verts(&ibuf, 10, 9, 11);

GPUIndexBuf *ibo = GPU_indexbuf_build(&ibuf);
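
To illustrate the bookkeeping behind such a builder, here is a plain C model. It assumes that the builder is sized in triangles (3 indices each) and that every index must refer to an existing vertex. `IndexBuilderModel` and its functions are hypothetical, and the real builder may handle these cases differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_INDICES 64 /* Arbitrary capacity for this sketch. */

/* Hypothetical model of a triangle index buffer builder. */
typedef struct {
  unsigned int data[MODEL_MAX_INDICES];
  size_t index_len;        /* Indices written so far. */
  size_t max_index_len;    /* Capacity: 3 indices per triangle. */
  unsigned int vertex_len; /* Number of rows in the vertex buffer(s). */
} IndexBuilderModel;

static bool model_init_tris(IndexBuilderModel *b, size_t tri_len, unsigned int vertex_len)
{
  if (tri_len > MODEL_MAX_INDICES / 3) {
    return false;
  }
  b->index_len = 0;
  b->max_index_len = tri_len * 3;
  b->vertex_len = vertex_len;
  return true;
}

/* Append one triangle. Fails when the builder is full or when an index
 * points outside the vertex buffer. */
static bool model_add_tri(IndexBuilderModel *b,
                          unsigned int v0,
                          unsigned int v1,
                          unsigned int v2)
{
  if (b->index_len + 3 > b->max_index_len) {
    return false;
  }
  if (v0 >= b->vertex_len || v1 >= b->vertex_len || v2 >= b->vertex_len) {
    return false;
  }
  b->data[b->index_len++] = v0;
  b->data[b->index_len++] = v1;
  b->data[b->index_len++] = v2;
  return true;
}
```

In this model, the six triangles of the example above fill the builder with 18 indices, and a seventh triangle is rejected.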

Vertex buffer

Vertex buffers contain the vertex data. The attributes inside a vertex buffer should match the attributes of the shader. Before a buffer can be created, its format should be defined.

static GPUVertFormat format = {0};
GPU_vertformat_attr_add(&format, "pos", GPU_COMP_F32, 2, GPU_FETCH_FLOAT);

Create a vertex buffer with this format and allocate the elements.

GPUVertBuf *vbo = GPU_vertbuf_create_with_format(&format);
GPU_vertbuf_data_alloc(vbo, 12);
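
As a rough guide to how much memory such an allocation needs, the row size (stride) follows from the format. The sketch below is plain C with hypothetical names (`AttrModel`, `format_stride`, `buffer_size`). It assumes attributes are tightly packed without padding, which the actual GPU module is not guaranteed to do.

```c
#include <stddef.h>

/* Hypothetical description of one attribute in a vertex format. */
typedef struct {
  size_t comp_size; /* Bytes per component, e.g. 4 for a 32 bit float. */
  size_t comp_len;  /* Components per attribute, e.g. 2 for a 2D position. */
} AttrModel;

/* Size in bytes of one row when all attributes are tightly packed. */
static size_t format_stride(const AttrModel *attrs, size_t attr_len)
{
  size_t stride = 0;
  for (size_t i = 0; i < attr_len; i++) {
    stride += attrs[i].comp_size * attrs[i].comp_len;
  }
  return stride;
}

/* Total size in bytes of a buffer holding `vertex_len` rows. */
static size_t buffer_size(const AttrModel *attrs, size_t attr_len, size_t vertex_len)
{
  return format_stride(attrs, attr_len) * vertex_len;
}
```

For the format above (a single "pos" attribute with two 32 bit floats) the stride is 8 bytes, so 12 elements take 96 bytes under these assumptions.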

Fill the buffer with the data.

for (int i = 0; i < 12; i++) {
  /* Attribute 0 is "pos", the first attribute added to the format. */
  GPU_vertbuf_attr_set(vbo, 0, i, positions[i]);
}


Batches

Use GPUBatches to draw geometry. A GPUBatch combines the geometry with a shader and its parameters and has functions to perform a draw call. To perform a draw call, take the following steps.

  1. Construct its geometry.
  2. Construct a GPUBatch with the geometry.
  3. Attach a GPUShader to the GPUBatch with the GPU_batch_set_shader function, or attach a built-in shader using the GPU_batch_program* functions.
  4. Set the parameters of the GPUShader using the GPU_batch_uniform*/GPU_batch_texture_bind functions.
  5. Call one of the GPU_batch_draw* functions.

This draws the geometry onto the active frame buffer using the shader and the loaded parameters.


A GPUTexture can be used as a render target or as a shader input, but not as both within the same draw call.

Immediate mode and built in shaders

To ease development of drawing code for panels/UI buttons, the GPU module provides an immediate mode. This is a wrapper on top of what is explained above, in a style closer to legacy OpenGL.

Blender provides built-in shaders. These are widely used to draw the user interface. A shader can be activated by calling immBindBuiltinProgram.


This shader program needs a vertex buffer with a pos attribute, and a color can be set as uniform.

GPUVertFormat *format = immVertexFormat();
uint pos = GPU_vertformat_attr_add(format, "pos", GPU_COMP_F32, 2, GPU_FETCH_FLOAT);

/* Set the color uniform of the shader. */
immUniformColor4f(0.0f, 0.5f, 0.0f, 1.0f);

Fill the vertex buffer with the starting and ending position of the line to draw.

/* Begin a line primitive with 2 vertices (the start point and the end point of the line). */
immBegin(GPU_PRIM_LINES, 2);
immVertex2f(pos, 0.0, 100.0);
immVertex2f(pos, 100.0, 0.0);

Calling immEnd submits the data so that it is drawn on the GPU.


Use GPUBatches directly in cases where performance matters. Immediate mode buffers aren't cached, which can lead to poor performance.

Compute pipeline

Besides drawing geometry onto a texture, you can also use the GPU module for computational tasks. Currently the compute pipeline should only be used after checking that the platform can handle compute tasks. A CPU fallback should always be implemented for platforms that don't support compute. We expect that in 2022 all platforms will support compute capabilities.

GPU_compute_shader_support can be called to check if the platform supports compute tasks.

A compute task is a variant of a GPUShader that only has a compute stage.

GPUShader *shader = GPU_shader_create_compute(compute_glsl, nullptr, nullptr, "gpu_shader_compute_2d");

To activate the program the shader should be bound to the GPU device.


After binding, the parameters can be loaded with the GPU_texture_(image_)bind and GPU_shader_uniform* functions.

The compute task is started with the GPU_compute_dispatch function. There are several examples of how to use the compute pipeline in source/blender/gpu/tests/.

GPU_compute_dispatch(shader, 100, 100, 1);
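
The three numbers passed to the dispatch call are work group counts per axis, as in OpenGL's glDispatchCompute, so the call above launches 100 x 100 x 1 groups. When each group covers a tile of pixels, the counts are usually derived by rounding up. The sketch below shows that arithmetic in plain C. `divide_ceil_u`, `GroupCount` and `groups_for_image` are hypothetical names, and the local group size of 16 x 16 used in the note is an assumption rather than something taken from a Blender shader.

```c
/* Integer division that rounds up. Assumes b > 0 and that a + b - 1 does
 * not overflow. */
static unsigned int divide_ceil_u(unsigned int a, unsigned int b)
{
  return (a + b - 1) / b;
}

typedef struct {
  unsigned int x, y, z;
} GroupCount;

/* Number of work groups needed to cover a 2D image when each group
 * processes a tile of local_size_x by local_size_y pixels. */
static GroupCount groups_for_image(unsigned int width,
                                   unsigned int height,
                                   unsigned int local_size_x,
                                   unsigned int local_size_y)
{
  GroupCount groups;
  groups.x = divide_ceil_u(width, local_size_x);
  groups.y = divide_ceil_u(height, local_size_y);
  groups.z = 1;
  return groups;
}
```

For a 1920 x 1080 image and 16 x 16 groups this gives 120 x 68 x 1 groups. The last row of groups extends past the image, so the shader has to skip invocations that fall outside its bounds.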


Debugging

Debugging on GPUs can be difficult as you cannot step through your code with a debugger. Tools like RenderDoc help to detect what a call actually does on the GPU by recording the state before and after each call.

When Blender is started with the --debug-gpu parameter, it adds more context to ease debugging.