Hi,
These are the Image Reconstruction kernel modifications.
Included:
- a weight pass, used to weight other "marked" passes
- more filters: Mitchell, Sinc, Triangle
- as this is only the Cycles part, here is how I write the image from my side:
    BufferParams& params = rtile.buffers->params;
    int x = params.full_x - options.session->tile_manager.params.full_x;
    int y = params.full_y - options.session->tile_manager.params.full_y;
    int w = params.width;
    int h = params.height;
    RenderBuffers *buffers = rtile.buffers;

    /* copy data from device */
    if (!buffers->copy_from_device())
        return;

    float exposure = scene->film->exposure;
    std::vector<float> pixels(params.width*params.height * 4);
    std::vector<float> pixels_weight(params.width*params.height);
    buffers->get_pass_rect(PASS_WEIGHT, 1.f, rtile.sample, 1, &pixels_weight[0]);

    if (buffers->get_pass_rect(PASS_COMBINED, 1.f, rtile.sample, 4, &pixels[0])) {
        for (size_t i = 0; i < pixels_weight.size(); ++i) {
            if (pixels_weight[i] != 0.f) {
                float invWt = 1.f / pixels_weight[i];
                pixels[i * 4] *= invWt;
                pixels[i * 4 + 1] *= invWt;
                pixels[i * 4 + 2] *= invWt;
                pixels[i * 4 + 3] *= invWt;
                pixels[i * 4 + 3] = saturate(pixels[i * 4 + 3]);
            }
            else {
                pixels[i * 4] = 0.f;
                pixels[i * 4 + 1] = 0.f;
                pixels[i * 4 + 2] = 0.f;
                pixels[i * 4 + 3] = 1.f;
            }
        }
        update(&pixels[0], x, y, w, h); // function callback
    }
    // NOTE: I disabled scale inside get_pass_rect()
- my test case: background = true, filename.empty() = true, progressive_refine = false
- ImageReconstructionSampleBoundaries needs a tiny redesign: it is 33 variables, so possibly remove 1 or pad it
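For readers who do not have the patch at hand, the three added filters are usually defined along the lines below. This is only a sketch using textbook 1D forms, not the code from the patch: the names filter_triangle, filter_mitchell and filter_sinc, the radius parameter, the B = C = 1/3 defaults and the Lanczos window on the sinc are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Triangle (tent) filter: 1 at the centre, falling linearly to 0 at |x| == radius.
inline float filter_triangle(float x, float radius)
{
    assert(radius > 0.0f);
    return std::fmax(0.0f, 1.0f - std::fabs(x) / radius);
}

// Mitchell-Netravali cubic. B = C = 1/3 is the commonly recommended setting
// (an assumption here, the patch may use other values).
inline float filter_mitchell(float x, float radius,
                             float B = 1.0f / 3.0f, float C = 1.0f / 3.0f)
{
    assert(radius > 0.0f);
    const float t = std::fabs(2.0f * x / radius); // map the support to [0, 2]
    if (t >= 2.0f)
        return 0.0f;
    if (t < 1.0f)
        return ((12.0f - 9.0f * B - 6.0f * C) * t * t * t +
                (-18.0f + 12.0f * B + 6.0f * C) * t * t +
                (6.0f - 2.0f * B)) / 6.0f;
    return ((-B - 6.0f * C) * t * t * t +
            (6.0f * B + 30.0f * C) * t * t +
            (-12.0f * B - 48.0f * C) * t +
            (8.0f * B + 24.0f * C)) / 6.0f;
}

// Normalised sinc: sin(pi x) / (pi x), with the removable singularity at 0 handled.
inline float sinc_pi(float x)
{
    const float px = 3.14159265358979f * x;
    return (std::fabs(px) < 1e-5f) ? 1.0f : std::sin(px) / px;
}

// Sinc filter with a Lanczos window, i.e. sinc(x) * sinc(x / radius) inside the support.
inline float filter_sinc(float x, float radius)
{
    assert(radius > 0.0f);
    const float ax = std::fabs(x);
    if (ax >= radius)
        return 0.0f;
    return sinc_pi(ax) * sinc_pi(ax / radius);
}
```

A 2D weight can be formed as the product of two 1D evaluations, for example filter_mitchell(dx, r) * filter_mitchell(dy, r). Mitchell and Sinc have negative lobes, which is one reason the combined pass has to be divided by the accumulated weight instead of by the sample count.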
Known issues that need a design decision (possibly Sergey can continue from this point):
- the tile data is private (tiles do not write to a global buffer), so there are artifacts at tile borders; for now I'm clamping at the edges
- - solution: the tile buffer can be a little larger. Old range: (sx -> sx + sw, sy -> sy + sh); new range: (sx - filter_width -> sx + sw + filter_width, sy - filter_width -> sy + sh + filter_width)
- and use this range inside the kernel call (sx, sy, sw, sh)
- in the write function (to Blender), there will be a global buffer that accumulates the data, as the extended tile ranges will overlap
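The proposed solution could be sketched roughly like this. It is not part of the patch: TileRect, padded_tile and AccumBuffer are hypothetical names, and clipping the padded range to the image bounds is my assumption. Each tile is grown by filter_width on every side, and the write side adds the unnormalised colour and weight sums of every tile into one image-sized buffer before dividing by the weight.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct TileRect { int x, y, w, h; };  // hypothetical: (sx, sy, sw, sh)

// Grow a tile by `pad` pixels on every side and clip it to the image.
inline TileRect padded_tile(const TileRect &t, int pad, int image_w, int image_h)
{
    assert(pad >= 0);
    const int x0 = std::max(t.x - pad, 0);
    const int y0 = std::max(t.y - pad, 0);
    const int x1 = std::min(t.x + t.w + pad, image_w);
    const int y1 = std::min(t.y + t.h + pad, image_h);
    return TileRect{x0, y0, x1 - x0, y1 - y0};
}

// Image-sized buffer that sums the contributions of overlapping padded tiles.
struct AccumBuffer {
    int w, h;
    std::vector<float> rgba;    // weighted colour sums, 4 floats per pixel
    std::vector<float> weight;  // filter weight sums, 1 float per pixel

    AccumBuffer(int w_, int h_)
        : w(w_), h(h_),
          rgba(std::size_t(w_) * h_ * 4, 0.0f),
          weight(std::size_t(w_) * h_, 0.0f) {}

    // Add one padded tile; tile_rgba and tile_weight are row-major over r.w x r.h.
    void add_tile(const TileRect &r, const float *tile_rgba, const float *tile_weight)
    {
        for (int y = 0; y < r.h; ++y) {
            for (int x = 0; x < r.w; ++x) {
                const std::size_t src = std::size_t(y) * r.w + x;
                const std::size_t dst = std::size_t(r.y + y) * w + (r.x + x);
                for (int c = 0; c < 4; ++c)
                    rgba[dst * 4 + c] += tile_rgba[src * 4 + c];
                weight[dst] += tile_weight[src];
            }
        }
    }

    // Normalise one pixel; zero weight falls back to opaque black, as in my write code.
    void resolve(int x, int y, float out[4]) const
    {
        const std::size_t i = std::size_t(y) * w + x;
        if (weight[i] != 0.0f) {
            const float inv = 1.0f / weight[i];
            for (int c = 0; c < 4; ++c)
                out[c] = rgba[i * 4 + c] * inv;
        }
        else {
            out[0] = out[1] = out[2] = 0.0f;
            out[3] = 1.0f;
        }
    }
};
```

Since the division by weight happens only after all overlapping tiles have been added, a sample near a tile border contributes to the neighbouring tile's pixels too, which should remove the need for edge clamping. The costs are one extra full-image buffer and some locking if tiles finish concurrently.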
I tested on CPU and CUDA (3.5) on Windows 8 with an i7 3930K and a GTX 780. It should work fine on OpenCL, but that is not tested.
cheers,
Mohamed Sakr