CUDA error: Unknown error in cuCtxSynchronize() : does not recover #36986

Closed
opened 2013-10-07 16:58:46 +02:00 by Max Christian · 10 comments

I could reproduce this bug on Windows XP 32-bit and on 64-bit Linux with a GeForce GTX 580 card. Many others have reported this bug, but none of the reports I found involved animations. I decided to open this report because I believe my observations go deeper than those in the existing reports, so I cannot really say whether they belong together.

The bug: while rendering an animation with CUDA, a cuCtxSynchronize error occurs at some point and the following frames are no longer rendered. It isn't always the same frame, but it seems to happen after roughly the same number of rendered frames. Additionally, I noticed that glossy surfaces started to glow purple without any additional geometry entering the scene. In the console I found "CUDA error: Launch failed in cuMemFree(cuda_device_ptr(mem.device_pointer))", which makes me think that there is a memory leak that affects the following frames of the animation. Interesting: after canceling the animation render and starting it again, it doesn't work and shows the same message immediately, but the preview renderer works fine, AND after using it I am even able to continue rendering the animation! It seems to me as if all the code to reinitialize the CUDA engine is there, but it should be executed before rendering any frame.

  • Sadly, I was not allowed to redistribute the textures I used.
  • But I included the resulting video; I restarted once.
Author

Changed status to: 'Open'

#38517 was marked as duplicate of this issue

This file has Cache BVH and Persistent Images enabled. Does disabling either of those avoid the problem (I suspect especially Persistent Images)?

Which exact Blender version are you testing this with?

Max, please get back to us on this.
Author

I re-rendered with Cache BVH and Persistent Images disabled, but the error appeared again. I then played around with the sampling, and it seems I can avoid the cuCtxSynchronize error by using fewer than 10 samples, although the purple glow still fades in and suddenly disappears as before. How else can individual frames be linked?
Author

I think I found the issue! I had connected a `Glass BSDF` output to the `color` input of a `Transparent BSDF` node by accident. I found the message

```
Cycles shader graph connect: can only connect closure to closure (bsdf.BSDF to transparent.Color).
```

by using my new script, which starts Blender, renders, and quits for each individual frame. I thought that might help, because I had more luck with individual frames than with rendering animations. Here is the script:

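`batch-render.bat`:

```
@echo off
rem Render every frame in its own Blender process and log the output to <frame>.TXT.
for /L %%N IN (1, 1, 1000) DO call :render %%N
goto :eof

:render
echo rendering Frame %1...
x:\blender-2.69-windows32\blender.exe -b zahnraeder-apply-mods.blend -f %1 > %1.TXT
rem "IF ERRORLEVEL 1" is true for any exit code of 1 or higher.
IF ERRORLEVEL 1 (
  echo "error while rendering, retrying..."
  goto render
)
goto :eof
```

Note: this batch file creates a text file for each individual frame (see the `> %1.TXT` redirect), which can be used to review failures later on. The retry lives in a subroutine because a `goto` inside a `for` block aborts the loop, and the check is `IF ERRORLEVEL 1` because `IF NOT ERRORLEVEL 0` is never true for a non-negative exit code. I haven't seen the "error while rendering" message since then. That's why I am pretty sure it had to do with the message above.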
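Since every frame gets its own log, the logs can be searched afterwards for the two messages quoted in this report. The following is only a minimal sketch, assuming a bash shell is available (for example on the Linux machine mentioned above); the function name `frames_with_message` is made up for illustration and is not part of Blender.

```shell
#!/bin/bash
# Print the names of the log files that contain a given message.
# Hypothetical helper: first argument is the text to look for,
# the remaining arguments are the per-frame log files.
frames_with_message() {
  local pattern="$1"
  shift
  # -l prints only the names of matching files,
  # -F treats the pattern as a fixed string instead of a regex.
  grep -l -F -- "$pattern" "$@"
}

# Example calls:
#   frames_with_message "Cycles shader graph connect" *.TXT
#   frames_with_message "CUDA error" *.TXT
```

The function exits with a non-zero status when no log matches, so it can also be used in an `if` to check whether any frame hit the error.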

Author

Failed again. I found out that the violet shine is introduced by ffmpeg when it is used with PNG images from Blender. But the CUDA error appeared again after I tweaked some render settings. Sadly, my script doesn't work, because Blender does not return a failure status as ERRORLEVEL / return value when the CUDA error occurs. I will keep trying and keep you updated.

`batch-render.sh`:

```
#!/bin/bash

for i in {1..1000}; do
  NUMBER="$i"
  FILENAME="$NUMBER.txt"
  # Seed the log with "error" so the loop below runs at least once.
  echo "error" > "$FILENAME"
  # Retry while the log contains "error"; this also matches "CUDA error" lines in Blender's output.
  while grep --quiet "error" "$FILENAME"
  do
    echo "rendering Frame $NUMBER, Debuggingfile: $FILENAME..."
    /home/max/Desktop/blender-2.69-linux-glibc211-x86_64/blender -b zahnraeder-apply-mods.blend -f "$NUMBER" > "$FILENAME" 2>&1
    if [[ $? == 0 ]]; then
      echo "SUCCESS" >> "$FILENAME"
    else
      echo "error" >> "$FILENAME"
    fi
  done
done
```
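A loop like this retries forever if a frame can never be rendered. A possible variation, shown here only as a sketch, caps the number of attempts and looks specifically for "CUDA error" in the log in addition to checking the exit code. The names `render_frame`, `BLENDER`, `BLEND_FILE` and `MAX_TRIES` and their default values are placeholders chosen for this example, not values from this report.

```shell
#!/bin/bash
# Sketch: retry a frame a limited number of times.
# BLENDER, BLEND_FILE and MAX_TRIES are placeholders; adjust them to your setup.
BLENDER="${BLENDER:-blender}"
BLEND_FILE="${BLEND_FILE:-scene.blend}"
MAX_TRIES="${MAX_TRIES:-3}"

render_frame() {
  local frame="$1" log="$1.txt" try
  for ((try = 1; try <= MAX_TRIES; try++)); do
    echo "rendering frame $frame (attempt $try), log: $log"
    # Success needs a zero exit code AND no "CUDA error" line in the log,
    # because the exit code alone did not reveal the CUDA failure.
    if "$BLENDER" -b "$BLEND_FILE" -f "$frame" > "$log" 2>&1 \
        && ! grep --quiet "CUDA error" "$log"; then
      return 0
    fi
  done
  return 1
}

# Example:
#   for i in {1..1000}; do render_frame "$i" || echo "frame $i failed"; done
```

Frames that still fail after `MAX_TRIES` attempts are reported instead of blocking the rest of the animation, and their logs remain available for review.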

Added subscriber: @ViktorForgacs

◀ Merged tasks: #38517.
Author

Finally I succeeded in rendering the scene!

I could fix the error by debugging my node setup. Sadly, I cannot tell exactly what helped, but:

  • I found a color node that was connected to a wrong input (to a shader, if I remember correctly). My best guess is that this was the source of the error.
  • I found a shader output that was connected to an incompatible input ('volume' or similar).
  • I replaced some Mix Shader nodes where one input was missing.

After fixing these errors, making some optimizations in the geometry where I had a 'chink of light', AND restarting my Xorg server, I was able to render all 1000 images without any error. It seems as if the Xorg server initialization somehow resets the video card's memory.
Reference: blender/blender#36986