GPU rendering hangs system on 2.7 not on 2.69 #39559

Closed
opened 2014-04-02 10:39:28 +02:00 by Maverick · 32 comments

System Information
win7 64 bit gtx titan i7

Blender Version
Broken: blender-2.70-6aa75d3-win64
Worked: 2.69 win64 official

Short description of error
This can be observed on heavy scenes (instanced particles, grass, heavy geometry).
On the specified git build it is impossible to use the operating system during Cycles GPU rendering.
It is almost impossible even to stop Blender from the Task Manager.

However using 2.69 there is no problem on the same file.

Exact steps for others to reproduce the error
Try to reproduce with a large complex file.

Author

Changed status to: 'Open'

Author

Added subscriber: @dragostanasie

Author

Note: this is not related to the TdrDelay registry hack, as it works fine with previous Blender versions (e.g. 2.69).


Added subscriber: @ThomasDinges


The only real solution here is to use 2 GPUs: 1 dedicated to display, 1 for rendering.

We have no way of "limiting" the GPU load atm.

Author

No problem when running with just one GPU on 2.69.

2.70 freezes both blender and the entire system.


Added subscriber: @brecht


Added subscriber: @marcog


Unfortunately I experienced much the same, even worse.

My specs: Win7 64 bit - Nvidia 570 1.3 GB - single card - latest drivers from the past months or so; I don't remember exactly, because my system updated to 335.23 yesterday.

I haven't submitted a bug because the error is totally random here and I can't reproduce it. It happens even on simple scenes which take a few tens of MB of memory, so I'd exclude running out of memory.

In the worst case it froze without even rendering: once during modeling, and another time while simulating smoke in the viewport to test recent new features. So I really fear that the dual GPU solution might not solve this, because I'm not entirely sure it's rendering related.

The error I get is a black screen, like when Windows reports a runtime error or a lost connection with the display driver. But now, instead of Blender closing, the screen stays black and I have to force a reboot, with more risk for the system.

Last note: I had these problems with 2.69.x too, with daily builds: the closer to 2.70, the more freezes I got. Sorry, I have no precise data.

Couldn't it be related to the new feature of blocking the UI introduced some time ago? https://developer.blender.org/D142
Or perhaps it is related to Nvidia driver or Microsoft updates?

Author

Could this be the reason for the system freeze: https://developer.blender.org/rB1d016758330b ?

I don't have a build from before the 6th of March to test, though.

Member

Added subscriber: @MartijnBerger

Member

"Could this be the reason of system freeze https://developer.blender.org/rB1d016758330b ?

I don't have a build before 6 of March to test though."

Yes, it could, but there are many other candidates. The large changes to Cycles that @brecht made with his commit wave of the 29th and 30th of March are a much more likely candidate to me. I might have a build from the 28th, or I'll build one for you to test.
Author

I have a build from the 7th of March and it has the same problem: it locks down the entire OS.

Retested 2.69 again though and it works smoothly.
Not only is the operating system responsive but even Blender itself is responsive (you can interact with the UI, the render window updates nicely).

Member

@dragostanasie you are aware that that change is NOT in 2.70 official, right?

So if 2.70 official is broken too, it is not that change.

Author

Git hash number does not help much in identifying these.

So let me summarize this again:

  • 2.69 official works without problems.
  • 7th of March build (blender-2.70-571f184-win64) fails
  • 31st of March build (blender-2.70-6aa75d3-win64) fails
Author

NVIDIA Driver is 335.23 (3/10/2014)

This might be relevant.

Member

maverick, can you try 2.70 official? If that works, you might have found the problem.

Author

Yes, 2.70 win64 official works as well as 2.69 official.

Member

So 1d01675833 is the most likely candidate then.
I'll look into making that behaviour optional. It provides great benefits in terms of CPU usage for cases where you have more than 1 GPU.

Author

Thanks juicyfruit !
I'm glad you found the issue, I was afraid it was due to recent nvidia driver update.

Will wait for the commit.

Author

Is it possible to disable the Streams and Async feature when only one GPU is present, until a more elegant solution is found?
On single-GPU machines you cannot use the CPU for rendering at the same time, because you can no longer stop Blender and need to reset the machine.
Thanks.

Member

One thing we could do is make this user selectable, but that is not a good solution. The other option is to revert to busy waiting when using just one GPU, but that is also bad, as some people with a dedicated GPU use just one GPU for rendering. Detecting that you want to render on a GPU that you also use for display is kind of hard.

@ThomasDinges what do you think is the best solution?

The problem comes from the fact that the async kernel queues enough tasks that the GPU only has to report back once a second, and then only for a very brief period. For me this yields good performance even in a 1 card scenario, but the computer is unusable during rendering for anything other than viewing the progress.
In the old situation only 1 kernel gets queued and then the computer busy-waits for it, literally maxing out 1 core at 100% per card to get an answer. This yields a more interactive computer if you have both multiple cores and a decent GPU, at the expense of burning some extra watts of CPU on nothing. For a 4 GPU system this is 4 cores at 100% doing nothing.
I do not think it is much work to make this a toggle, but I am not very thrilled by the concept of adding a button / setting for it.
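
For illustration only, here is a rough back-of-the-envelope model of the trade-off described in this comment. The 10 ms kernel time, the queue depth of 100 and the function name are made up for the example, not measured from Cycles. The model simply assumes that the display cannot get GPU time until everything already queued has finished.

```python
def worst_case_display_stall_ms(kernel_ms: float, queued_kernels: int) -> float:
    """Worst-case time the display waits for the GPU, assuming (hypothetically)
    that it cannot be served until every queued kernel has finished."""
    if kernel_ms < 0 or queued_kernels < 1:
        raise ValueError("kernel_ms must be >= 0 and queued_kernels >= 1")
    return kernel_ms * queued_kernels


# Old behaviour in this model: one kernel in flight at a time.
single = worst_case_display_stall_ms(kernel_ms=10.0, queued_kernels=1)
# Async behaviour in this model: enough kernels queued to fill about a second.
batched = worst_case_display_stall_ms(kernel_ms=10.0, queued_kernels=100)

print(f"1 kernel queued: display may wait up to {single:.0f} ms")
print(f"100 kernels queued: display may wait up to {batched:.0f} ms")
```

Under these assumptions, the deeper queue trades desktop responsiveness for fewer host round trips, which matches the symptom reported above on single-GPU machines.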

Member

Added a trivial patch to fall back to old style busy waiting
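
As a toy illustration of the two waiting styles discussed in this thread (this is not the patch itself), the sketch below replaces the GPU with a Python thread that sleeps and then signals completion. All names are hypothetical. The busy-wait variant spins on a flag and so keeps a CPU core occupied, while the blocking variant parks the host thread until it is woken.

```python
import threading
import time


def run_fake_kernel(done: threading.Event, seconds: float) -> None:
    """Stand-in for a GPU kernel: sleeps, then signals completion."""
    time.sleep(seconds)
    done.set()


def busy_wait(seconds: float) -> int:
    """Poll in a tight loop until the kernel is done; returns the poll count."""
    done = threading.Event()
    worker = threading.Thread(target=run_fake_kernel, args=(done, seconds))
    worker.start()
    polls = 0
    while not done.is_set():  # spins, keeping one CPU core busy
        polls += 1
    worker.join()
    return polls


def blocking_wait(seconds: float) -> int:
    """Sleep until the kernel signals completion; no polling loop is run."""
    done = threading.Event()
    worker = threading.Thread(target=run_fake_kernel, args=(done, seconds))
    worker.start()
    done.wait()  # the host thread is parked until set() is called
    worker.join()
    return 0


busy_polls = busy_wait(0.05)
blocked_polls = blocking_wait(0.05)
print(f"busy wait: {busy_polls} polls, blocking wait: {blocked_polls} polls")
```

The poll count in the busy-wait case is a rough proxy for the CPU time burned while waiting, which is the cost mentioned earlier in the thread (one core per card).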

Member

We might be able to test if there is an OpenGL context on the device by trying to create a CUDA-GL shared resource.
It might be a non-intrusive way of figuring out whether a device is also a display device, and then using busy waiting for that one device. This could give us the best of both worlds.

Member

I tried using cuGLGetDevices to get the CUDA device associated with the context, but to do that we would need a (non-hack) way to get the active OpenGL context into device_cuda at the time of device creation. For me, a first test with a hack seems to give back my one CUDA device. I'm not sure how to compare them properly, but that could provide a way of having Blender use busy waiting on devices that it is also using to draw.
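
A minimal sketch of the selection policy proposed here, assuming the list of display devices is already known. In this example `display_devices` is plain input standing in for whatever a query such as cuGLGetDevices would report; the function name, the mode strings and the integer device ids are hypothetical and do not come from Cycles.

```python
from typing import Dict, Iterable

BUSY_WAIT = "busy-wait"
BLOCKING = "blocking"


def choose_wait_modes(render_devices: Iterable[int],
                      display_devices: Iterable[int]) -> Dict[int, str]:
    """Map each render device id to a wait mode.

    Devices that also drive the display get busy waiting (keeps the desktop
    responsive); render-only devices get the blocking wait (saves CPU).
    """
    display = set(display_devices)
    return {dev: (BUSY_WAIT if dev in display else BLOCKING)
            for dev in render_devices}


# Hypothetical setup: devices 0 and 1 render, device 0 also drives the display.
modes = choose_wait_modes(render_devices=[0, 1], display_devices=[0])
print(modes)
```

How to compare the devices reliably is left open, as it is in the comment above; the sketch assumes the ids are directly comparable.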


This issue was referenced by blender/blender-addons-contrib@18da79f471


This issue was referenced by 18da79f471

rich33584 commented 2014-04-22 15:51:23 +02:00 (Migrated from localhost:3001)

Added subscriber: @rich33584

rich33584 commented 2014-04-22 15:51:23 +02:00 (Migrated from localhost:3001)

I am having a similar issue, except in CPU rendering.
The tiles will start to render and then hang. They will usually hang one at a time until all tiles stop rendering. They seem to hang at different sample counts. Sometimes the tiles will be transparent with nothing rendered at all. In preview render it will hang after just a few samples.
It only has happened to me in heavy particle/instanced scenes. Max mem usage on my current project is 3.6 gigs.
To me it seems to happen after a certain percentage of RAM is used. I can eliminate the grass and the rest of the scene will render fine. I can go the other way and eliminate the trees, and the grass will render fine.
Files seem to render fine in 2.69. This scene is nothing compared to what I have rendered in the past on my machine.
I have abandoned 2 projects already because of this and I don't want to abandon another...
Windows 7
I7 processor
8 gigs of RAM
CPU rendering.


@rich33584: it sounds like you are seeing a different issue; this is a bug report specifically about GPU rendering.

Please report a new bug and try to attach a .blend file that reproduces the issue and I'll look into it. I haven't heard about an issue similar to what you mentioned before, so probably there is some specific setting, geometry or shader node setup that causes this problem, and it's too difficult for us to guess what that is exactly without a .blend file.


Changed status from 'Open' to: 'Resolved'

Brecht Van Lommel self-assigned this 2014-04-22 16:03:09 +02:00

Regarding this bug, as far as I know it is fixed now in {18da79f471f4}.

If not, please tell me and I'll reopen this report.

Reference: blender/blender#39559