- User Since: Feb 21 2019, 1:26 PM
Wed, Apr 28
I understand what it does, but I'm not familiar with macOS, so I can't rate the solution. Semantically it looks good to me. So if it works, I see no problem with committing this.
Mon, Apr 26
Wed, Apr 21
@Brecht Van Lommel (brecht) @Clément Foucault (fclem) Still haven't reproduced it exactly. But based on the stack trace and exception info it looks like the problem is a broken VBO binding. The driver is trying to interpret a vertex attribute offset (that Blender set up via "glVertexAttribPointer" previously) as a CPU pointer, rather than a VBO offset, which happens if no valid VBO is bound (as defined in the OpenGL spec).
I can force pretty much the same crash to happen if I replace glBindBuffer(GL_ARRAY_BUFFER, vbo_id_); with glBindBuffer(GL_ARRAY_BUFFER, 0xdeadbeef); in source\blender\gpu\opengl\gl_vertex_buffer.cc. This would indicate that at some point, for some unknown reason, a VBO ID breaks in Blender, leading to the crash everybody above is experiencing. Maybe Blender tries to render a deleted GLVertBuf object (and therefore accesses bogus data and binds a broken VBO ID)?
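To illustrate the spec behavior being described, here is a small Python model (hypothetical classes, not actual driver or Blender code): with a valid buffer bound, the attribute "pointer" is an offset into that buffer; with no valid buffer bound, the same integer falls back to being treated as a raw CPU address, which the driver then dereferences at draw time.

```python
# Hypothetical model of the OpenGL rule: the glVertexAttribPointer offset is
# only a VBO offset while a valid buffer is bound; otherwise legacy
# client-side-array semantics apply and it is dereferenced as a CPU pointer.
class GLState:
    def __init__(self):
        self.buffers = {1: [10.0, 20.0, 30.0]}  # one valid VBO, id 1
        self.bound = 0
        self.attrib_offset = 0

    def bind_buffer(self, buf_id):
        # Binding a bogus id leaves no usable buffer bound.
        self.bound = buf_id if buf_id in self.buffers else 0

    def vertex_attrib_pointer(self, offset):
        self.attrib_offset = offset

    def fetch_attrib(self):
        if self.bound:
            return self.buffers[self.bound][self.attrib_offset]
        # No VBO bound: the "offset" is dereferenced as a raw address.
        raise RuntimeError("EXCEPTION_ACCESS_VIOLATION")

gl = GLState()
gl.bind_buffer(1)
gl.vertex_attrib_pointer(2)
ok = gl.fetch_attrib()            # valid binding: reads from the VBO

gl.bind_buffer(0xDEADBEEF)        # broken VBO id, as in the experiment above
try:
    gl.fetch_attrib()
    crashed = False
except RuntimeError:
    crashed = True
```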
Tue, Apr 20
Replaced "BLENDER_CMAKE_SYSTEM_PROCESSOR" with a "BLENDER_PLATFORM_ARM" CMake variable and simplified some checks
Tue, Apr 13
@Hans Goudey (HooglyBoogly) Thanks! Rebased.
Mon, Apr 12
Mar 30 2021
Mar 11 2021
Mar 3 2021
Fixed an incorrect parameter to some device_only_memory instances after the previous change.
Implemented the allow_host_memory_fallback parameter (I agree, this is nicer) and fixed the high peak memory usage during OptiX acceleration structure building by limiting the actual build to a single thread at a time (using a mutex lock in build_optix_bvh).
This solved the problem in my tests, while still keeping the rest of the bottom-level BVH build running in parallel, which is noticeably faster in some scenes compared to just running it serialized (presumably because of the curve conversion loops).
With this, peak memory usage did not exceed the memory usage during rendering in the splash screen scene, so I was still able to render it successfully on a smaller GPU where it failed before. At the same time, loading speed did not regress perceptibly.
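The scheme above can be sketched roughly as follows (hypothetical names and a Python model, not the actual Cycles code): the per-geometry preparation work stays parallel, while a mutex limits the memory-heavy acceleration-structure build itself to one thread at a time.

```python
import threading

# Sketch of the approach described above: preparation (e.g. curve
# conversion) runs in parallel across threads, but the memory-hungry
# device build is serialized with a mutex so peak memory stays bounded.
build_lock = threading.Lock()
results = []
results_lock = threading.Lock()

def build_bvh(geometry_name):
    # Parallel phase: cheap on memory, benefits from threading.
    prepared = f"converted:{geometry_name}"
    # Serialized phase: only one acceleration structure build at a time.
    with build_lock:
        built = f"bvh({prepared})"
    with results_lock:
        results.append(built)

threads = [threading.Thread(target=build_bvh, args=(name,))
           for name in ("mesh", "hair", "volume")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```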
Mar 2 2021
Feb 25 2021
Feb 23 2021
Feb 22 2021
Added separator between render and viewport denoising settings and rebased on master
Feb 19 2021
How about this (explicitly stating where the other settings are)?:
Feb 17 2021
This is happening because the lookup into the NanoVDB data structure is not clamped, whereas dense texture lookups are (see EXTENSION_CLIP, which is the default for images, including volumes). And since either linear or cubic interpolation is always enabled (you can't select closest interpolation in Cycles, which is rather odd and should probably be addressed at some point too), you get a result interpolated with the neighboring voxels. So I actually think the current behavior with NanoVDB is more correct?
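A minimal 1D illustration of why the boundary behavior differs (a hypothetical helper, not the Cycles sampling code; the treatment of EXTENSION_CLIP as reading zero outside the image is my reading of its semantics): an interpolated lookup always pulls in the neighboring sample, so what that out-of-range neighbor reads as changes the result near the edge.

```python
import math

def sample_linear(voxels, x, outside):
    # 1D linear interpolation; `outside` supplies the value used for
    # out-of-range neighbor indices.
    i = math.floor(x)
    t = x - i
    def fetch(j):
        return voxels[j] if 0 <= j < len(voxels) else outside(j)
    return (1 - t) * fetch(i) + t * fetch(i + 1)

voxels = [1.0, 1.0, 1.0]
# Clipped dense lookup: out-of-range samples read as 0.0, so the result
# fades out across the last voxel.
clipped = sample_linear(voxels, 2.5, outside=lambda j: 0.0)    # -> 0.5
# Unclamped grid lookup: the neighbor is whatever the grid holds there
# (here a hypothetical background value of 1.0), so no fade occurs.
unclamped = sample_linear(voxels, 2.5, outside=lambda j: 1.0)  # -> 1.0
```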
Based on my understanding of the problem this makes sense and the code looks correct. Have not tested it yet though.
Feb 16 2021
Feb 12 2021
Feb 8 2021
This is still being worked on; it turned out to be a more complex problem, so no fix is available yet. I still have this on my radar though, and will update when that changes.
Jan 29 2021
Jan 28 2021
@Brecht Van Lommel (brecht) I don't see requested_features.use_background_light changing when toggling portal. The problem appears to be that kernels are reloaded before the background light is enabled again (which happens in LightManager::test_enabled_lights, which is called after the kernel reload):
Jan 27 2021
Yeah, things get funky with OptiX when intersecting objects with huge scaling. There's another report on this here too: T81566.
To solve this we could introduce some logic that detects and skips self-intersections when they happen (e.g. by keeping track of the last hit and checking if the new hit point is behind that based on the ray direction) or something along those lines.
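A hedged sketch of that "remember the last hit" idea (hypothetical data layout, not actual Cycles/OptiX code): after a hit, the next traversal rejects candidate hits on the same primitive that are not strictly in front of the previous hit distance along the ray.

```python
# Hypothetical sketch: skip self-intersections by remembering the last hit
# and rejecting re-hits of the same primitive at (almost) the same distance.
EPSILON = 1e-4  # tolerance; fixed offsets break down under huge object
                # scaling, which is why tracking the last hit helps here

def closest_hit(candidates, last_hit_t=None, last_prim=None):
    """Return the nearest (prim, t) candidate that is not a self-hit."""
    best = None
    for prim, t in candidates:
        if (last_prim is not None and prim == last_prim
                and last_hit_t is not None and t <= last_hit_t + EPSILON):
            continue  # same primitive, essentially the same point: skip
        if best is None or t < best[1]:
            best = (prim, t)
    return best

# The ray previously hit primitive 7 at t = 2.0; floating-point error makes
# the continuation ray "hit" it again at nearly the same distance.
candidates = [(7, 2.00001), (3, 5.0)]
hit = closest_hit(candidates, last_hit_t=2.0, last_prim=7)
# -> (3, 5.0): the spurious re-hit of primitive 7 is skipped
```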
Looks good to me. My main concern with locking directly inside generic_free would have been the double lock when called from tex_free, but you addressed that by unlocking beforehand, so everything looks functional.
I'll do some more testing to be sure and commit later. Thank you for this!
Jan 25 2021
Jan 22 2021
This is expected behavior: On the GPU there is a limit of 64 transparent hits supported for a single ray before it reaches a light source (see SHADOW_STACK_MAX_HITS in the Cycles source code). On the CPU more memory can be allocated dynamically to extend that limit, and with CUDA there is a slow fallback implementation using a loop. In OptiX that fallback implementation is disabled for better general performance, so you can't go beyond 64. We could enable the fallback there as well, but that would come at a performance cost even when no more than 64 hits are encountered, so I decided against it for now.
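The loop fallback mentioned above can be sketched like this (a hypothetical Python model, not the CUDA kernel): transparent hits are processed in chunks of at most SHADOW_STACK_MAX_HITS per pass, looping until every surface between the shading point and the light has been accounted for.

```python
SHADOW_STACK_MAX_HITS = 64  # fixed-size per-ray hit stack on the GPU

def shadow_attenuation(surfaces):
    # Sketch of the loop fallback: record at most SHADOW_STACK_MAX_HITS
    # transparent hits per pass, and keep looping until all surfaces
    # between the shading point and the light have been processed.
    attenuation = 1.0
    offset = 0
    while offset < len(surfaces):
        # One "trace" fills the fixed-size stack with the next chunk.
        chunk = surfaces[offset:offset + SHADOW_STACK_MAX_HITS]
        for alpha in chunk:
            attenuation *= (1.0 - alpha)  # each transparent surface filters
        offset += len(chunk)
    return attenuation

# 100 surfaces, each 1% opaque: more than one 64-hit stack can hold, so
# without the loop fallback the shadow ray would terminate early.
result = shadow_attenuation([0.01] * 100)
```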
Jan 21 2021
Jan 20 2021
Yeah. I was finally able to reproduce this now, and it's crashing because the CPU is trying to write to a GPU address. The tile stealing code steals the tile buffer for the GPU and moves the data there, but the next time around the tile may be re-used on the CPU again (which only happens in progressive mode; otherwise tiles are deleted when done). The buffer then still resides on the GPU and is never moved back, so it is inaccessible to the CPU.
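A minimal model of the failure mode (hypothetical classes, not the Cycles tile code): the tile buffer lives on exactly one device at a time, so a CPU write has to move the data back first; one possible fix along those lines is sketched in the write path.

```python
class TileBuffer:
    # Hypothetical model: a tile buffer resides on one device at a time.
    def __init__(self):
        self.device = "CPU"
        self.data = [0.0] * 4

    def move_to(self, device):
        # (real code would copy the allocation between devices here)
        self.device = device

    def write(self, device, i, value):
        if self.device != device:
            # One possible fix: move the buffer back before the CPU
            # touches it, instead of dereferencing a GPU address.
            self.move_to(device)
        self.data[i] = value

buf = TileBuffer()
buf.move_to("GPU")        # tile stealing moved the buffer to the GPU
buf.write("CPU", 0, 1.0)  # progressive mode re-uses the tile on the CPU
```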
Jan 19 2021
@Max (maxim_d33) It's storing some matrices as double because the OpenVDB ones also use double precision (this is done for the sake of preserving data, since it should be possible to do OpenVDB->NanoVDB->OpenVDB without data loss). But there are copies of the same matrices stored as float too, so the library user can decide which ones to use. In the case of Cycles the single-precision variants are used, so we really only need the double fields for a correct data layout, but that could be achieved with some other placeholder type as well.
The problem is that volumes will still be converted to the NanoVDB structure (in cycles/render/image_vdb.cpp), even though the OpenCL kernels cannot interpret the data when compiled without WITH_NANOVDB (and therefore do not render volumes at all).
So to disable NanoVDB selectively, we'd have to somehow pass the information that NanoVDB is not supported up to the image loader.
Jan 18 2021
This is a bug introduced with the tile stealing implementation (rB517ff40b124bc9d1324ccf7561a59ac51bf86602). Since the OptiX denoiser runs on the GPU, the tile stealing code erroneously steals CPU tiles and moves them to the OptiX device. But in this configuration the OptiX device was only set up for denoising, not rendering, so it crashes. Will look for a fix.
Jan 14 2021
Yeah, it's on upstream's TODO list. It would probably be nicer to wait for that to happen instead of doing it with a patch, so I'll put some pressure on it.
Jan 13 2021
Restored some changes from Diff 32724 that were overwritten in Diff 32726 by accident
Fixed rendering with OptiX (added new point cloud intersection program, updated SBT + SBT offsets and fixed any-hit programs to handle point cloud primitives correctly)
Jan 12 2021
@Brecht Van Lommel (brecht) On getting the intersection program correct:
Jan 11 2021
Yeah, I think it is better to go with 7.1 for now for increased compatibility with older drivers. It will indeed also activate support for the built-in curve primitive, but that's still disabled by default and only enabled via a debug option, so nothing changes for the normal user.
There is currently no 7.2-specific feature implemented (apart from enabling validation mode when building debug kernels), so we wouldn't gain much by updating all the way. I've played around with the new specialization feature, but couldn't really get useful performance benefits for Cycles out of it, so I didn't add it. Maybe that changes at some point.
Jan 10 2021
Jan 8 2021
@Brecht Van Lommel (brecht) Raising the instance limit to 134'217'727 requires building with the OptiX 7.1 SDK or higher; otherwise it remains capped at 8'388'607. It looks like the buildbot is currently still building with the OptiX 7.0 SDK, so it could be worth updating that now to support this.
Jan 7 2021
This is happening because the current OptiX implementation in Blender only supports up to 8'388'607 instances, but this scene is using 10'012'160, which is why things fall apart for those instances that go beyond the limit.
It is technically possible to raise that limit to 134'217'727 now though, which solves the problem in this scene. I'll commit that later to fix it.
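For reference, the two limits quoted above are exactly 23-bit and 27-bit index ranges (an observation about the numbers themselves; the underlying bit allocation is an OptiX implementation detail). A quick check, with the instance count from the reported scene:

```python
# The limits above are exactly (2^23 - 1) and (2^27 - 1).
MAX_INSTANCES_BEFORE = (1 << 23) - 1  # 8'388'607
MAX_INSTANCES_AFTER = (1 << 27) - 1   # 134'217'727

scene_instances = 10_012_160  # instance count in the reported scene

fits_before = scene_instances <= MAX_INSTANCES_BEFORE  # False: overflows
fits_after = scene_instances <= MAX_INSTANCES_AFTER    # True: fits
```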
Jan 6 2021
This fixed all occurrences of changing objects/geometry in the viewport causing a crash when rendering with either CPU or OptiX, and this was one of those cases, so yes, it should be fully fixed.
Jan 5 2021
Should be fixed now!
Sounds good. I think I'll just go with a Geometry::tag_bvh_update routine for now and we can then later see if it can be simplified with the update flags when that gets merged.