Blender Cycles "Launch failed in CUDA queue copy from device (integrator_intersect_shadow)" #93046

Closed
opened 2021-11-12 20:44:10 +01:00 by Imme1986 · 27 comments

System Information
Operating system: Windows 10
Graphics card: RTX 3070Ti

Blender Version
3.0 Beta and 2.93.4

Short description of error

![Blender Error 2.jpg](https://archive.blender.org/developer/F11797302/Blender_Error_2.jpg) ![Blender error 1.jpg](https://archive.blender.org/developer/F11797301/Blender_error_1.jpg)

The viewport render (OptiX) shows errors like `Launch failed in CUDA queue copy from device (integrator_intersect_shadow)` or closes Blender instantly. This happens at random within a few minutes. I think it's the Principled Volume in Experimental in combination with displacement...

Note from @Alaska: Simplifying this scene is difficult. As the scene gets simpler, it gets harder to reproduce the issue. I'm not sure what's causing this issue. I have left the area with trees, the volumetric world material and the car headlights because removing any of these makes it basically impossible for me to reproduce this issue.

Another thing to note is that the errors seem to occur only on Windows. On Linux, rather than producing errors, the entire GUI freezes for a bit, continues for a bit and freezes again, repeating until the render stops.

Exact steps for others to reproduce the error
Note: `Ufo über Wald, straße (all files).blend` appears to be the best file to test with. `#93046 - Simplified File.blend` does appear to work, at least in the case of @Alaska, but it seems to behave differently from the original file on the original reporter's system.

  1. Download one of the files below.
  2. Enter rendered viewport mode with GPU Compute and OptiX enabled.
  3. Navigate around the scene, stop for a bit, navigate some more, and repeat until the error occurs. You might need to increase the viewport sample count.

[Ufo über Wald, straße (all files).blend](https://archive.blender.org/developer/F11798200/Ufo_über_Wald__straße__all_files_.blend)
[#93046 - Simplified File.blend](https://archive.blender.org/developer/F11799484/T93046_-_Simplified_File.blend)

A Google Drive link to both of these files can be found here. This link is supplied as it may be faster to download them from there than from this site: https://drive.google.com/drive/folders/156XaW2bqIcwOD3VZl32zUdpeC7KrAgbl?usp=sharing
Author

Added subscriber: @Immsen

Member

Added subscriber: @Alaska

Member

Changed status from 'Needs Triage' to: 'Needs User Info'

Member

Having exact steps to reproduce, or a file we can investigate, would be really helpful. Are you able to provide those?

It's also entirely possible you're experiencing issues with outdated drivers. Are you able to upgrade to the latest driver and see if that helps? https://www.nvidia.com/Download/index.aspx
Author

[Ufo über Wald, straße (all files).blend](https://archive.blender.org/developer/F11798200/Ufo_über_Wald__straße__all_files_.blend)

Yes, I have the latest driver. No problem, here is the file. Mainly with 3.0 Beta...
Member

I personally could not reproduce the errors you're seeing. However, that could be down to me using Linux and different GPU drivers from yours.
I did notice something odd while testing this scene, though. While rendering in the viewport, the entire GUI of my system freezes. Not just Blender, but everything GUI related, including other apps and my desktop environment. This might be the same issue you're experiencing, but manifesting in some other form.
Note: This issue does NOT happen when using CUDA, and it doesn't happen with OptiX if the volumetrics are removed. It also doesn't happen if certain parts of the scene are removed. Final renders also don't appear to show this issue.

Here's a video demonstrating the freezing. Whenever the video freezes, so did my system's GUI.
[Freezing during viewport rendering.mp4](https://archive.blender.org/developer/F11798647/Freezing_during_viewport_rendering.mp4)

System Information
Operating system: Linux-5.14.0-4-amd64-x86_64-with-glibc2.32 64 Bits
Graphics card: NVIDIA GeForce RTX 3090/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 495.44

Blender Versions:
3.1.0 Alpha, branch: master, commit date: 2021-11-13 03:05, hash: ab9ec193c3
3.0.0 Beta, branch: master, commit date: 2021-11-05 22:17, hash: 7a5b8cb202

@Immsen, if possible, are you able to reduce the complexity of the file down to just the objects required to reproduce the issue? Also, what are the exact steps to reproduce the issue? Enable viewport rendering with OptiX and just do things until the error occurs?
Alaska changed title from blender launch failed in Cuda queue copy from device (integrator intersect schadow) to Blender Cycles "Launch failed in CUDA queue copy from device (integrator_intersect_shadow)" 2021-11-13 08:44:35 +01:00
Member

Changed status from 'Needs User Info' to: 'Confirmed'

Member

I have simplified the scene and added some extra information to the task.
I am also marking as confirmed.

System Information
Operating system: Windows-10-10.0.22000-SP0 64 Bits (Windows 11)
Graphics card: NVIDIA GeForce RTX 3090/PCIe/SSE2 NVIDIA Corporation 4.5.0 NVIDIA 496.49
Blender version tested: Blender 3.0 Master bd734cc4419a

Author

I have tested the simple file. Now it's not showing the text (Launch failed in CUDA queue....), but the screen turns black for a second again and again, and then Blender crashes with a big black screen. See the video below.
And yes, without the Principled Volume the error doesn't happen...

[Film 4.mp4](https://archive.blender.org/developer/F11801154/Film_4.mp4)
Member

A quick question. In your report you list both Blender 2.93.4 and Blender 3.0 there. Is that because you can reproduce the issue in both versions? Or is the issue only reproducible in Blender 3.0?

I'm personally unable to reproduce any issues with Blender 2.93.4 on Linux.

Author

I'm not sure. The first time, the project crashed in 2.93 too, in a short test. Today only in 3.0.
Member

Since it seems to be significantly more common in Blender 3.0, I will take this as a regression and mark the priority as "High".


Added subscriber: @Sergey


I can see unexpected hiccups during rendering. Coincidentally, they happen while the intersect shadow kernel is scheduled.
Sergey Sharybin self-assigned this 2021-11-16 14:14:58 +01:00

Forgot to assign to self. Doing deeper investigation right now.

Contributor

Added subscriber: @Raimund58


Added subscriber: @pmoursnv


There seems to be an issue in OptiX where a specific combination of starting point and ray length causes a slowdown. Here is the repro case I've ended up with:

[P2600: (An Untitled Masterwork)](https://archive.blender.org/developer/P2600.txt)

```
diff --git a/intern/cycles/kernel/bvh/bvh.h b/intern/cycles/kernel/bvh/bvh.h
index 0e083812355..682f020bed1 100644
--- a/intern/cycles/kernel/bvh/bvh.h
+++ b/intern/cycles/kernel/bvh/bvh.h
@@ -383,11 +383,22 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
     ray_mask = 0xFF;
   }
 
+  /* About -9454910.00000000 68784048.00000000 136430800.00000000 */
+  const float3 P = make_float3(
+      __uint_as_float(3406841150), __uint_as_float(1283666422), __uint_as_float(1291983949));
+
+  /* About 0.06176370 -0.44932911 -0.89122874 */
+  const float3 D = make_float3(
+      __uint_as_float(1031601136), __uint_as_float(3202748023), __uint_as_float(3211011985));
+
+  /* About 153081664.000000 */
+  const float t = __uint_as_float(1293024628);
+
   optixTrace(scene_intersect_valid(ray) ? kernel_data.bvh.scene : 0,
-             ray->P,
-             ray->D,
+             P,
+             D,
              0.0f,
-             ray->t,
+             t,
              ray->time,
              ray_mask,
              /* Need to always call into __anyhit__kernel_optix_shadow_all_hit. */
@@ -402,6 +413,8 @@ ccl_device_intersect bool scene_intersect_shadow_all(KernelGlobals kg,
              p4,
              p5);
 
+  return false;
+
   *num_recorded_hits = uint16_unpack_from_uint_0(p2);
   *throughput = __uint_as_float(p1);
 
```

With this patch applied, rendering any scene with transparent shadows becomes extremely slow. For example, I rendered the `classroom.blend` scene on OptiX with the patch applied, and each sample took a very long time (the point being that the slowdown does not depend on the scene).

I've used a uint-as-float trick to have enough precision when assigning values. Readable floating-point values are given in the comments.
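
For readers who want to check the constants outside the kernel, here is a minimal host-side sketch of the uint-as-float trick. It assumes a platform with 32-bit IEEE-754 floats; `uint_as_float`, `float_as_uint` and `print_repro_values` are illustrative names and are not part of Cycles (in the patch, the CUDA intrinsic `__uint_as_float` does this job).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Host-side stand-in for the CUDA intrinsic __uint_as_float: reinterpret the
// 32 bits of an unsigned integer as a single-precision float.
inline float uint_as_float(std::uint32_t bits)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Inverse operation: recover the exact bit pattern of a float.
inline std::uint32_t float_as_uint(float value)
{
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Decode the ray origin components and the ray length used in the patch.
inline void print_repro_values()
{
  const std::uint32_t patterns[] = {3406841150u, 1283666422u, 1291983949u, 1293024628u};
  for (std::uint32_t bits : patterns) {
    const float value = uint_as_float(bits);
    // The round trip is lossless, which is the point of storing bit patterns.
    assert(float_as_uint(value) == bits);
    std::printf("%10lu -> %.8f\n", static_cast<unsigned long>(bits), value);
  }
}
```

Storing the bit pattern rather than a decimal literal avoids any dependence on how a printed value rounds back to a float, so the repro uses exactly the values captured from the failing render.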

Lowering `t` to 100 makes rendering super fast again. (The render does not contain any shadows, but this is expected due to the `return false;`, which I added so that the render result is predictable regardless of the values passed to `optixTrace`.)

Such huge values come from equiangular volume sampling, which happens over an infinite ray (a ray with a length of `FLT_MAX`). In the case of the .blend file from this report we might be able to tweak some termination/power heuristics and avoid shading points so far away, but it is not possible to avoid big numbers in all cases (i.e., there could be a cloud far away in a sparse atmosphere).
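
To illustrate why an infinite ray can produce such distances, here is a minimal sketch of equiangular sampling in its commonly used textbook form for a point light. It is an assumption that this matches the spirit of what Cycles does; the function name, the parameter names and the use of `double` are illustrative only and are not taken from the Cycles source.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Equiangular sample of a distance along a ray segment [t_min, t_max].
//   delta: ray parameter of the point on the ray closest to the light
//   dist:  distance from that closest point to the light (must be > 0)
//   xi:    uniform random number in [0, 1)
// The sample is uniform in the angle subtended at the light, not in distance.
inline double equiangular_sample(double t_min, double t_max, double delta, double dist, double xi)
{
  assert(dist > 0.0 && t_max >= t_min);
  const double theta_a = std::atan((t_min - delta) / dist);
  const double theta_b = std::atan((t_max - delta) / dist);
  const double theta = theta_a + xi * (theta_b - theta_a);
  return delta + dist * std::tan(theta);
}
```

With `t_max` set to `FLT_MAX`, the upper angle is effectively 90 degrees, so random numbers close to 1 map to distances that grow without bound. With a finite `t_max`, the sampled distance can never exceed it.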

@pmoursnv, can you help investigate the issue from your side? Is it some numeric stability issue which leads to `inf` values, causing OptiX to traverse all the nodes?
Member

The problem here is the distance the ray travels before it can hit something. Essentially, to avoid precision issues, more nodes along the ray path are currently checked the further away the ray starts. In this case the ray starts so far away from any object in the scene that a majority of the nodes in the scene are traversed, as you suspected, which takes a long time (primarily limited by the bandwidth required to transfer all that acceleration structure memory to the chip, rather than just a couple of nodes as usual).

There are ways to get around this problem for now:

  • Calculate a bounding box around the entire scene and snap rays that start outside it to the surface of the bounding box, so that a ray is less likely to travel a long way before it hits anything.
  • Or split up the ray into smaller segments (limit `t` to some reasonable value and, if nothing was hit, trace a new ray that starts at the end of the last one, and so on, until the original `t` is reached).
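
The second option can be sketched on the host as follows. This is only an illustration under stated assumptions: `Vec3`, `TraceFn` and `trace_segmented` are hypothetical names, the callback stands in for a real trace call such as `optixTrace`, and the segment offsets are computed in `double` from an index so that the loop still advances when the segment length is smaller than the float spacing at large distances.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

struct Vec3 {
  float x, y, z;
};

// Hypothetical stand-in for one trace call: returns true if anything is hit
// along origin + s * dir for s in [0, length].
using TraceFn = std::function<bool(const Vec3 &origin, const Vec3 &dir, float length)>;

// Trace a ray of length t as consecutive segments of at most max_segment.
// Stops at the first segment that reports a hit; num_segments receives the
// number of trace calls that were issued.
inline bool trace_segmented(const Vec3 &P, const Vec3 &D, float t, float max_segment,
                            const TraceFn &trace, long *num_segments)
{
  assert(t >= 0.0f && max_segment > 0.0f);
  const double total = t;
  const double step = max_segment;
  const long count = static_cast<long>(std::ceil(total / step));
  *num_segments = 0;
  for (long i = 0; i < count; ++i) {
    const double start = static_cast<double>(i) * step;
    const double length = std::min(step, total - start);
    const Vec3 origin = {static_cast<float>(P.x + D.x * start),
                         static_cast<float>(P.y + D.y * start),
                         static_cast<float>(P.z + D.z * start)};
    ++*num_segments;
    if (trace(origin, D, static_cast<float>(length))) {
      return true;
    }
  }
  return false;
}
```

For the ray length from the repro case (153081664), segments of one million units need 154 trace calls when nothing is hit, while segments of 100 units would need 1,530,817, so the choice of segment length dominates the cost of this approach.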

Added subscriber: @brecht


For the issue with this particular .blend file, I think we should not be doing uniform sampling for distant lights + world volumes. For a finite sized volume it makes sense, but over such a long distance there is no point, anything so far away is not going to contribute.

There are other cases where such positions happen. Even with distance sampling, if the density is very low you can still end up arbitrarily far away. I'm not sure if that's common enough to add a check for this. I don't remember a report like this before, though this is probably the kind of thing that people might not report even if they do notice it. Hard to say if it's worth fixing.

I don't think splitting up the ray is really going to be faster. Wouldn't that be a lot of segments for a distance this large?

Clipping the ray to the scene bounding box seems like the most practical solution. We'd need to be careful to ensure the recorded intersection distances are adjusted accordingly, if we do it isolated in the scene intersection function. Doing it on the ray in the integrator state may also be possible, but then we need to be careful to set sd->ray_length and mis_ray_t to the appropriate values.
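
As a rough illustration of the clipping idea, here is a minimal slab-test sketch that clips a ray segment to an axis-aligned scene bounding box and keeps the offset needed to convert hit distances back to the original ray. `ClippedRay` and `clip_ray_to_bounds` are hypothetical names, `double` is used for simplicity, and none of the integrator state mentioned above (`sd->ray_length`, `mis_ray_t`) is modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

struct ClippedRay {
  bool valid;        // false if the ray segment never overlaps the box
  double origin[3];  // new start point, on or inside the box
  double length;     // remaining length to trace from the new origin
  double offset;     // add to a hit distance to get the original ray distance
};

// Clip the segment P + s * D, s in [0, t], against the box [bmin, bmax].
inline ClippedRay clip_ray_to_bounds(const double P[3], const double D[3], double t,
                                     const double bmin[3], const double bmax[3])
{
  assert(t >= 0.0);
  ClippedRay out = {false, {P[0], P[1], P[2]}, 0.0, 0.0};
  double t_start = 0.0;
  double t_end = t;
  for (int i = 0; i < 3; ++i) {
    if (D[i] == 0.0) {
      // Ray is parallel to this slab: it either stays inside or always misses.
      if (P[i] < bmin[i] || P[i] > bmax[i]) {
        return out;
      }
      continue;
    }
    double t0 = (bmin[i] - P[i]) / D[i];
    double t1 = (bmax[i] - P[i]) / D[i];
    if (t0 > t1) {
      std::swap(t0, t1);
    }
    t_start = std::max(t_start, t0);
    t_end = std::min(t_end, t1);
    if (t_start > t_end) {
      return out;
    }
  }
  out.valid = true;
  for (int i = 0; i < 3; ++i) {
    out.origin[i] = P[i] + D[i] * t_start;
  }
  out.length = t_end - t_start;
  out.offset = t_start;
  return out;
}
```

A hit found at distance `s` along the clipped ray corresponds to distance `s + offset` along the original ray, which is the adjustment of recorded intersection distances that the clipping approach requires.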


This issue was referenced by blender/cycles@341b77e558ce9f6d97d6cbd8917067a303e06df6

This issue was referenced by 1b686c60b5a7f7f7604d7ba5012aa5afa15f0d07

Changed status from 'Confirmed' to: 'Resolved'


Added subscriber: @omerati54


Hi!

After using Blender for the last year, I have always had these OptiX and CUDA issues. I have an Intel 10700KF and an RTX 3070.

I saw someone point out here that it's hard to tell whether this is a scene-specific problem or a common issue, but I have been having this kind of problem for the last half a year,
along with a lot of other OptiX and CUDA issues. Almost every render I try to make has this issue, which causes me to keep my work in my drawer, though I love Blender.

I'll be the happiest man alive if this issue gets figured out!
Contributor

@omerati54 If you still encounter this issue, please open a new bug report. If you haven't updated Blender yet, you should do it before opening a new bug report.

Reference: blender/blender#93046