- User Since
- Oct 17 2010, 10:55 AM (401 w, 1 d)
Jan 4 2018
Jan 3 2018
Jul 30 2017
Jun 26 2016
Nov 10 2015
Oct 31 2015
Oct 12 2015
Jun 8 2015
May 18 2015
Fixed in rB88acb3c5
May 16 2015
Hi! I would like to download Blender 2.74(***), but the link downloads 2.74 instead; the same happens for all other versions. Please check this problem. I downloaded it from here: http://www.blender.org/features/past-releases/
May 15 2015
Reproduced this problem without OCIO and added an inline comment with a possible fix here: rB4fc318811
May 13 2015
Superseded by a completely different, more complex approach. I am still experimenting, but the new patch has nothing in common with this one, so I am closing this one.
May 11 2015
May 10 2015
Based on the fact that no rotation/shear transformations are used with the rastertocamera transform, and that the perspective coefficient is never zero (how could it be?).
May 5 2015
Fix if() formatting and add pad3.
May 4 2015
@Thomas Dinges (dingto), it seems to me that the current implementation is affected by an off-by-one mistake somewhere around here: https://developer.blender.org/diffusion/B/browse/master/intern/cycles/render/blackbody.cpp$96 . I had already noticed that our current implementation gives different colors from OSL, and my approximation is based on the unbiased function.
For CUDA (7.0) the difference is smaller: 00:52.86 (lut) and 00:51.91 (approx).
Quick test on CPU with plane gave 25.45s (lut) vs 20.26s (approx). Input socket is obviously connected.
Difference between original (non-lut) function and approximation:
May 1 2015
I've checked rB5ad79b83afae again with NVidia and it doesn't compile on my computer anymore, even with ADV shading disabled. The compile process looks similar to nvcc: clBuildProgram works for 1 minute or so, then eats up all available RAM and crashes. I guess this is related to the report by @firstname.lastname@example.org (varunsundar08).
Apr 8 2015
Apr 7 2015
Common diff in PTX (same for SASS), 92 replacements:
< mov.f32 %f6781, 0f00000000;
< max.ftz.f32 %f6782, %f1500, %f6781;
< mov.f32 %f6783, 0f3F800000;
< min.ftz.f32 %f10110, %f6782, %f6783;
---
> cvt.ftz.sat.f32.f32 %f10029, %f1500;
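The two hexadecimal constants in the `<` lines are IEEE 754 single-precision bit patterns for 0.0 and 1.0, so those four instructions amount to a clamp to [0, 1], which the single `>` line expresses as one saturating convert. A minimal Python sketch of that reading follows; the helper names `ptx_float` and `clamp01` are illustrative only and do not come from the Cycles sources.

```python
import struct

def ptx_float(literal):
    """Decode a PTX single-precision literal such as '0f3F800000'."""
    assert literal.startswith("0f") and len(literal) == 10
    # The eight hex digits are the big-endian IEEE 754 bit pattern.
    return struct.unpack(">f", bytes.fromhex(literal[2:]))[0]

def clamp01(x):
    """max followed by min, mirroring the mov/max/mov/min sequence."""
    lo = ptx_float("0f00000000")  # 0.0
    hi = ptx_float("0f3F800000")  # 1.0
    return min(max(x, lo), hi)
```

For example, `clamp01(-0.5)` gives `0.0` and `clamp01(3.0)` gives `1.0`, which is what a saturating conversion produces for the same inputs.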
Apr 6 2015
Apr 1 2015
I've tested this patch (the original one from D1200) on Ubuntu 14.10 x86-64 and NVidia GTX 690 with MikePan BMW scene.
Mar 30 2015
Mar 27 2015
Mar 26 2015
Procedural checker texture with Position node.
Mar 25 2015
Sorry, I don't understand your proposal.
Mar 23 2015
Mar 22 2015
Animation works too:
Added sergey as reviewer.
Mar 10 2015
Mar 7 2015
Committed an actual fix. This problem comes from Embree: both the nmadd and nmsub functions in avxf and ssef are incorrect there.
Dec 24 2014
Dec 1 2014
Oct 13 2014
This patch affects BTS repo, adding @Bastien Montagne (mont29).
Aug 28 2014
Aug 9 2014
Aug 5 2014
Tables for common benchmark files:
Jul 28 2014
Jul 3 2014
Jun 5 2014
May 15 2014
For future reference: this procedural texture could be used for recreating stars. Unfortunately, it is not so easy to use, but this approach is more flexible and can be applied in Cycles too.
May 12 2014
May 5 2014
Apr 27 2014
Apr 22 2014
Apr 12 2014
I forgot to mention that this assert is visible only with debug builds. Asan crashlog:
Mar 30 2014
jrp, are you sure that 2 iterations of Halley's method are faster than 3 iterations of the Newton-Raphson method? A single iteration of Halley's method has 6*, 4+ and 1/, while Newton-Raphson has only 4*, 1+ and 1/. Halley's method is also less robust because it calculates approx^5, so the working domain will be smaller.
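The comment does not name the function being approximated, so the sketch below uses the cube root purely as an assumed stand-in to show how 3 Newton-Raphson steps compare with 2 Halley steps from the same starting guess. The function names, the input 2.0 and the starting guess 1.0 are hypothetical, and the operation counts for this example differ from the ones quoted above.

```python
def newton_cbrt(a, x, iterations=3):
    """Newton-Raphson on f(x) = x**3 - a (quadratic convergence)."""
    for _ in range(iterations):
        x = (2.0 * x + a / (x * x)) / 3.0
    return x

def halley_cbrt(a, x, iterations=2):
    """Halley's method on f(x) = x**3 - a (cubic convergence)."""
    for _ in range(iterations):
        x3 = x * x * x
        x = x * (x3 + 2.0 * a) / (2.0 * x3 + a)
    return x

# Compare both against the library result for the same input and guess.
exact = 2.0 ** (1.0 / 3.0)
newton_err = abs(newton_cbrt(2.0, 1.0) - exact)
halley_err = abs(halley_cbrt(2.0, 1.0) - exact)
```

Both errors end up small here, so the choice between the two comes down to the cost per iteration and the usable input range, which is the point the comment raises.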
Mar 25 2014
Mar 24 2014
Cycles provides multiple entry points (a.k.a. kernel functions in OpenCL or global functions in CUDA) for different CPUs and platforms. These entry points are located in files:
- kernel.cl (+ kernel_compat_opencl.h)
- kernel.cu (+ kernel_compat_cuda.h)
- kernel_avx.cpp (+ kernel_compat_cpu.h)
- kernel_sse2.cpp (+ kernel_compat_cpu.h)
- kernel_sse3.cpp (+ kernel_compat_cpu.h)
- kernel_sse41.cpp (+ kernel_compat_cpu.h)
Mar 23 2014
- The slowdown from enabling KERNEL_SSE occurs mostly because compilers (except for icc) are unable to reassociate and simplify math expressions written with intrinsics. However, this does not prevent us from adding sse_float4 or sse_float3 classes with overloaded operators. That way we could control where the compiler uses intrinsics and where it tries to optimize automatically with autovectorization.
Mar 22 2014
Confirming bisect by @Martijn Berger (juicyfruit).
Mar 21 2014
Mar 6 2014
Mar 5 2014
CMake compiles after adding this extra patch:
Feb 25 2014
Feb 17 2014
Feb 14 2014
Feb 12 2014
Also, both renders have almost no noise, so I guess this modification has some kind of clamping.
I also can't reproduce this problem. I don't see any connection with SSE optimizations. Front headlights are just simple mesh objects with no textures.
Feb 11 2014
Closed by commit rBbd44dcb63229.