
Cycles: Allow PTX targets for CUDA kernel build.

Authored by Stefan Werner (swerner) on Oct 1 2019, 12:42 PM.



This is intended primarily for developers on Windows:
CUDA architectures of type compute_xx are now supported as kernel build targets. This allows for quicker builds,
at the expense of the CUDA driver running ptxas the first time a kernel is loaded.
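
To illustrate the build-side distinction described above, here is a minimal sketch of how an architecture name could be mapped to nvcc output flags. The helper name `nvcc_output_args` is hypothetical and is not taken from the actual patch; the only assumptions about nvcc are its `-arch`, `--cubin` and `--ptx` options.

```python
import re
from typing import List


def nvcc_output_args(arch: str) -> List[str]:
    """Return nvcc flags for one architecture name (hypothetical helper).

    sm_xx      -> real architecture, nvcc emits a cubin (machine code)
    compute_xx -> virtual architecture, nvcc emits PTX and stops there
    """
    match = re.fullmatch(r"(sm|compute)_(\d+)", arch)
    if match is None:
        raise ValueError(f"unrecognized CUDA architecture: {arch!r}")
    # Only the prefix decides the output format; the number is passed through.
    output_format = "cubin" if match.group(1) == "sm" else "ptx"
    return [f"-arch={arch}", f"--{output_format}"]


if __name__ == "__main__":
    for name in ("sm_75", "compute_75"):
        print(name, "->", " ".join(nvcc_output_args(name)))
```

With a compute_xx target nvcc stops after generating PTX, so the ptxas step is deferred to the driver at load time, which is where the quicker build comes from.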

This can be held for 2.82.

Diff Detail

rB Blender

Event Timeline

Seems reasonable to me, just one comment.


These two cases are identical as far as I can see.

  • Merge branch 'cuda_ptx' of into cuda_ptx
  • Fixed a typo.

Do you have more reasoning on why? And why would this benefit Windows devs more than other devs?

Also, whose time are we saving?

On a developer workstation we're moving a build-time cost to a run-time cost, so effectively there is no time saved, except I'm now waiting twice. (Assuming a dev will/should only build for the GPU he has.)

If we were to ship this to end users (and only ship with compute_* kernels), we would exchange a one-time buildbot cost for a cost on every user's workstation, wasting power and time on a global scale, which is not a very socially responsible thing to do either.

The only time this saves time is if a dev has all kernels enabled but only runs a single architecture, which should be a relatively rare use case?

I have nothing against this landing, seems fine, I'm just trying to grasp what problem we are solving.

The main helpful thing for me is that this is an option I can always have enabled, whereas WITH_CYCLES_CUDA_BINARIES I quite often enable/disable to save on compile time.

With this option I can catch CUDA build errors early as I'm refactoring code, instead of later on when running Blender.

I don't understand why "sm" means "cubin" and "compute" means "ptx", is that a standard thing?

Would be nice if this could be less hidden, but there's probably only a handful of developers who would use this anyway.

These are the terms NVIDIA uses to distinguish between the high-level virtual architecture (ptx) and actual machine code (sm).
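
As a rough illustration of why the virtual/real distinction matters at load time, the sketch below encodes the general compatibility rule as I understand it from NVIDIA's documentation: PTX for compute_XY can be JIT-compiled for any device of compute capability X.Y or higher, while a cubin for sm_XY only runs on devices of the same major version with an equal or higher minor version. The function name `can_run` is hypothetical, and the sketch ignores exceptions such as architecture-specific targets.

```python
import re
from typing import Tuple


def can_run(target: str, device_cc: Tuple[int, int]) -> bool:
    """Check whether a kernel built for `target` is loadable on a device.

    `device_cc` is the device compute capability as (major, minor).
    Hypothetical helper; it encodes only the general rule, not every exception.
    """
    match = re.fullmatch(r"(sm|compute)_(\d+)(\d)", target)
    if match is None:
        raise ValueError(f"unrecognized CUDA architecture: {target!r}")
    kind = match.group(1)
    major, minor = int(match.group(2)), int(match.group(3))
    if kind == "compute":
        # PTX is forward compatible: the driver compiles it for newer GPUs.
        return device_cc >= (major, minor)
    # A cubin is tied to its major version, with minor at or below the device's.
    return device_cc[0] == major and device_cc[1] >= minor


if __name__ == "__main__":
    for target in ("compute_30", "sm_30", "sm_70", "sm_75"):
        print(target, "on a 7.5 device:", can_run(target, (7, 5)))
```

Under this rule a single compute_xx kernel can cover a range of GPUs that would otherwise each need their own sm_xx cubin, at the price of the driver compiling it on first load.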

Ok, that's fine then.

This is a bit hard to discover, but I can document it on

This revision is now accepted and ready to land. Oct 1 2019, 3:46 PM
  • Cleanup: Fix naming of a functio - Cycles: Added "compute_xx" as architecture options to CUDA kernels.

I clearly don't know how to use phabricator.

  • Merge branch 'master' into cuda_ptx