Cycles Image Reconstruction kernel

Description

Hi,

these are the Image Reconstruction kernel modifications.

included:

  • a weight pass, used to weight other "marked" passes
  • more filters: Mitchell, Sinc, Triangle
  • as this is only the Cycles part, here is how I write the image from my side:
	BufferParams& params = rtile.buffers->params;
	int x = params.full_x - options.session->tile_manager.params.full_x;
	int y = params.full_y - options.session->tile_manager.params.full_y;
	int w = params.width;
	int h = params.height;

	RenderBuffers *buffers = rtile.buffers;

	/* copy data from device */
	if (!buffers->copy_from_device())
		return;

	float exposure = scene->film->exposure;
	std::vector<float> pixels(params.width*params.height * 4);
	std::vector<float> pixels_weight(params.width*params.height);

	buffers->get_pass_rect(PASS_WEIGHT, 1.f, rtile.sample, 1, &pixels_weight[0]);
	if (buffers->get_pass_rect(PASS_COMBINED, 1.f, rtile.sample, 4, &pixels[0]))
	{
		for (size_t i = 0; i < pixels_weight.size(); ++i)
		{
			if (pixels_weight[i] != 0.f)
			{
				float invWt = 1.f / pixels_weight[i];
				pixels[i * 4] *= invWt;
				pixels[i * 4 + 1] *= invWt;
				pixels[i * 4 + 2] *= invWt;
				pixels[i * 4 + 3] *= invWt;
				pixels[i * 4 + 3] = saturate(pixels[i * 4 + 3]);
			}
			else
			{
				pixels[i * 4] = 0.f;
				pixels[i * 4 + 1] = 0.f;
				pixels[i * 4 + 2] = 0.f;
				pixels[i * 4 + 3] = 1.f;
			}
		}
	update(&pixels[0], x, y, w, h); /* function callback */
	}

Note: I disabled the scale inside get_pass_rect().
  • my test case: background = true, filename.empty() = true, progressive_refine = false
  • ImageReconstructionSampleBoundaries needs a tiny redesign: it is 33 variables, possibly remove one or pad it
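A tiny sketch of the padding idea (the struct name matches the patch, but the member layout here is purely hypothetical): 33 floats are 132 bytes, which is not a multiple of 16, so GPU loads straddle alignment boundaries; padding to 36 floats restores 16-byte alignment.

```cpp
#include <cstddef>

/* Hypothetical layout: 33 floats = 132 bytes straddles 16-byte
 * alignment boundaries, so pad to 36 floats = 144 bytes. */
struct ImageReconstructionSampleBoundaries {
    float data[33];
    float pad[3]; /* explicit padding to a multiple of 16 bytes */
};

static_assert(sizeof(ImageReconstructionSampleBoundaries) % 16 == 0,
              "expected a 16-byte aligned size");
```

Removing one variable instead (32 floats = 128 bytes) would achieve the same alignment without wasting the padding bytes.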

Known issues that need a design decision (possibly Sergey can continue from this point):

  • as the tile data is kinda private (not writing to a global buffer), there are artifacts between tile borders (I'm clamping edges)
    • solution: the tile buffer can be a little larger (old range: (sx -> sx + sw, sy -> sy + sh), new range: (sx - filter_width -> sx + sw + filter_width, sy - filter_width -> sy + sh + filter_width))
    • and use this range inside the kernel call (sx, sy, sw, sh)
    • in the write function (to Blender), there will be a global buffer accumulating data, as the extended tile buffer ranges will be overlapping
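The padded range idea above could look roughly like this (a sketch with hypothetical names; Cycles' real tile structures differ). The buffer grows by filter_width on each side, clamped to the full frame, while the kernel still iterates the inner (sx, sy, sw, sh) range.

```cpp
// Hypothetical tile rectangle; not the actual Cycles BufferParams.
struct TileRect { int x, y, w, h; };

// Enlarge a tile by filter_width pixels on every side so the filter
// footprint of edge pixels stays inside the buffer, clamping against
// the full frame so we never address outside the image.
TileRect pad_tile(const TileRect &t, int filter_width,
                  int full_w, int full_h)
{
    TileRect p;
    p.x = t.x - filter_width;
    p.y = t.y - filter_width;
    p.w = t.w + 2 * filter_width;
    p.h = t.h + 2 * filter_width;
    if (p.x < 0) { p.w += p.x; p.x = 0; }
    if (p.y < 0) { p.h += p.y; p.y = 0; }
    if (p.x + p.w > full_w) p.w = full_w - p.x;
    if (p.y + p.h > full_h) p.h = full_h - p.y;
    return p;
}
```

Because neighboring padded tiles overlap, the write side then needs the accumulating global buffer described in the last bullet.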

I tested on CPU and CUDA (3.5), Windows 8, i7 3930K, GTX 780. Should work fine on OpenCL (not tested).

cheers,
Mohamed Sakr

Mohamed Sakr (msakr) set Type to Patch.

Oops, I missed and put it in BF Blender; should have chosen BF Cycles.

Forgot to add the diff.


sorry for this, I didn't know where to add the diff, my first time here.

Maybe an admin can recreate the whole task or update it.

  • also disabled Saturate inside get_pass_rect() for alpha; saturation is done after weight denormalizing

Could you please fix the patch ?

It is pretty much unreadable and very very hard to spot what you are changing.

sorry for being a noob, my first time making a diff.
I had many commits, squashed them into one, and did a diff from master.

what am I missing?

ok, I figured out how, thanks Martijn :)
here is the correct diff,

General code feedback:

  • Please follow the existing code style
  • Please do not add your name in the comments; if the patch is accepted, authorship of the code is easy to find via git blame

Regarding this particular change:

  • It's not really clear how to check it and what to expect. After applying this patch default cube scene is becoming more and more red with more samples
  • I'm not sure the Mitchell filter is used correctly here
  • Atomics are in fact quite bad. They stall all the pipelines, introducing really bad latency issues with lots of threads
  • Not even sure why you need to do it in the kernel itself
  • I'm not sure why it is getting red in your case (it is fine here); it is related to the write part on your side. As I changed some stuff in get_pass_rect(), some stuff inside the write_cb() and update_cb() functions should be adjusted accordingly
  • I used the Mitchell filter function from PBRT as-is
  • not sure if there is an alternative to atomics (without using CUDA dynamic parallelism)
  • why it is done in the kernel, let me compare (pseudocode):

a) done as a filter (compositing):

	pixel_value = sum(neighbor pixel_value * neighbor pixel_weight) / neighbor_pixel_count

the weight here is calculated from the pixel center

b) done inside the kernel:

	for (i in samples) {
		pixel_value  += sum(neighbor pixel_value * neighbor pixel_weight)
		pixel_weight += sum(neighbor pixel_weight)
	}
	pixel_value /= pixel_weight  // normalizing, if weight != 0

here it depends on the per-sample position (weight) and color; you can see the difference from the equations.

conclusion: the compositing filter works on pixel centers and averaged values (using sums), while the kernel filter works on float pixel positions and exact sample values.
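For reference, the PBRT Mitchell-Netravali 1D filter mentioned above looks like this (a sketch following PBRT's formulation; x is the sample offset normalized by the filter radius, so the support is |x| <= 1):

```cpp
#include <cmath>

// Mitchell-Netravali 1D filter, PBRT-style. B = C = 1/3 are the
// values recommended by Mitchell and Netravali. The function has a
// negative lobe around |x| ~ 0.75 and falls to zero at |x| = 1.
float mitchell_1d(float x, float B = 1.0f / 3.0f, float C = 1.0f / 3.0f)
{
    x = std::fabs(2.0f * x);
    if (x > 1.0f)
        return ((-B - 6.0f * C) * x * x * x + (6.0f * B + 30.0f * C) * x * x +
                (-12.0f * B - 48.0f * C) * x + (8.0f * B + 24.0f * C)) *
               (1.0f / 6.0f);
    else
        return ((12.0f - 9.0f * B - 6.0f * C) * x * x * x +
                (-18.0f + 12.0f * B + 6.0f * C) * x * x + (6.0f - 2.0f * B)) *
               (1.0f / 6.0f);
}
```

That negative lobe is the source of the ringing and negative-value concerns raised later in the thread.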

Just tell me what you want me to update; I will do the general code feedback. What else?

I see the filter result is not correct; the only place I suspect is the filter table itself.
In PBRT the filter table is just the function; in Cycles the filter table is built through a long process with MIS stuff.

I will create another filter table similar to PBRT's and re-test.

One note: the kernel filter enhances AA; the compositing filter doesn't affect AA that much.

I strongly suggest to not add any filters with negative weights to Cycles, like Mitchell or sinc. In my opinion they only add problems and I would not suggest anyone to use them in practice. The triangle filter is also not useful. Blackman–Harris could be good to add, but it's the only one I can think of.

With a path tracer like Cycles there's enough AA samples that you don't need to artificially sharpen the image, and deal with problems like ringing or clamping negative values.

Mohamed Sakr (msakr) added a comment. Edited Jul 24 2015, 4:04 AM

@Brecht Van Lommel (brecht), what do you think about the kernel filter vs the compositing filter? (I prefer the kernel filter, as stated above)
also, you have a good point about the negative weights of Mitchell
what about atomics? any idea to avoid them? The only two things I can think of are:
1- another buffer to write to (bad for memory, and bad as it is a global memory write), or
2- shared memory atomics (faster on Maxwell, slower on Kepler)
or anything else that you may suggest.

The "compositing filter" looks like a blur operation you could do in the compositor, there's no point doing that in Cycles.

My understanding is that this is an alternative sampling method to the filter importance sampling that we use, and the only way to do that correctly is to work on individual AA samples. Both this code and the existing method should converge to the same result. There are some pros and cons as discussed in the filter importance sampling paper, which converges better depends on the scene. Also interesting to note filter importance sampling is required for denoising as added in PRMan.

In any case, I think that before accepting this patch there should be evidence that it's actually significantly better for some example scenes, and the extra render time of padded tiles is worth it. Also because the implementation adds significant complexity to the code. I have a feeling that with the high number of AA samples typically used in Cycles scenes (and more and more in production scenes elsewhere), filter importance sampling might be pretty competitive.

I don't know what the best way is to implement this. If you "scatter" using atomics you will suffer from contention but the code stays relatively simple. The alternative is to split it into two stages, one where you write unfiltered samples into a buffer and then a second where you "gather" the samples and filter them, so that each thread can write to its own pixel without any contention. But the first thing for me would be to prove that this method is helpful and then worry about optimization.
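The two-stage "gather" formulation Brecht describes could be sketched like this (all names hypothetical; a real implementation would bin samples spatially rather than scan them all). Stage 1 stores raw samples with their subpixel positions; stage 2 lets each pixel gather the samples inside its filter footprint, so each thread writes only its own pixel and no atomics are needed.

```cpp
#include <cmath>
#include <vector>

// A raw, unfiltered sample: subpixel position plus its value
// (one float channel here for brevity).
struct RawSample { float x, y, value; };

// Stage 2: one call per pixel. Gathers every sample whose position
// falls inside this pixel's filter footprint and normalizes by the
// accumulated weight. A box filter is used for brevity; any filter
// (Mitchell, Gaussian, ...) would replace the constant weight.
float gather_pixel(const std::vector<RawSample> &samples,
                   int px, int py, float filter_radius)
{
    float accum = 0.0f, weight_sum = 0.0f;
    for (const RawSample &s : samples) {
        float dx = s.x - (px + 0.5f);
        float dy = s.y - (py + 0.5f);
        if (std::fabs(dx) > filter_radius || std::fabs(dy) > filter_radius)
            continue;
        float w = 1.0f; /* box weight; evaluate the real filter here */
        accum += w * s.value;
        weight_sum += w;
    }
    return weight_sum > 0.0f ? accum / weight_sum : 0.0f;
}
```

The trade-off versus the atomic "scatter" version is the extra memory for the raw sample buffer, which is why Brecht suggests proving the method's benefit before optimizing.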

Mohamed Sakr (msakr) added a comment. Edited Jul 24 2015, 9:10 AM

I fixed the problem now, added another table without MIS, and the results seem correct.
Here are 4 images: 16 spp and 64 spp, with and without the filter; the filter is Gaussian, width = 1.5


The render time difference is nothing!! (but as this is only for one pass, it needs testing with all passes active and using the filter, and possibly with the 2-pixel border extension to avoid bucket seams)

@Brecht Van Lommel (brecht), thanks for the reply and explanation; I don't have much time for the tracker these days..

@Mohamed Sakr (msakr),

I don't think the implementation totally matches the explanation (unless I'm missing something). Filtering is still happening purely inside the pass buffers, which don't have spatial information from the sample. So unless you're re-generating the filter kernel for each AA sample (which is going to be rather bad), I don't see how exactly you're exploiting spatial information?

By not doing the filter in the kernel I didn't mean doing it in post-pro. This should have been read as: don't do this in the path tracing kernel; it's better to have a dedicated kernel for this, which will avoid the use of atomics. Should be easy to do with the current formulation of the kernel, and shouldn't give much overhead even on CUDA with dynamic kernel launching.

As for the example images, if I decoded what's on them correctly (proper labeling would have helped..), then the filtered result is far too fuzzy, not really acceptable.


But now, I'm not convinced this is the way we should go at all. To me it seems that instead of doing all sorts of weird and wonderful stuff in Cycles itself, it should be done as post-pro, and Cycles should just provide the required information, like sample spatial information, variances and so on.

That being said, the simplest ever thing to do is to support FSAA in Cycles. It would improve the artistic pipeline, especially with hairy characters in the scene.

With some additional thought and generalization, it shouldn't be rocket science to get to a design where Cycles provides the required information and all actual filtering happens outside of the render kernel (in Blender it could happen in the compositor, but the actual filtering logic could still be implemented on the Cycles side).

@Sergey Sharybin (sergey), inside kernel_camera.h, camera_sample() is where the weights are generated (and that's why I couldn't move them to another kernel): each pixel gives 25 weights to neighboring pixels, so it doesn't depend only on a single weight and a single L, it depends on L and 25 weights per pixel.

this test was just a rough test to show the idea; a 1.5 filter width is quite large (0.75 should give great results)
about the image names: images named _64 are 64 spp, names with _filter use the filter, the other images are 16 spp

My understanding was that this is an alternative importance sampling method for pixel filters basically, which can be less noisy especially with low AA samples. FSAA would indeed be great and give some of the same advantages, though I guess for the case of e.g. 1024 AA samples you can't store them all and would still need to do some filtering immediately.

The example renders do not look like they would converge to the same result though? The ones with this patch look much blurrier.

True, but I think the trick to try is something in between real FSAA and the current approach, like storing several (8, 16, 32?) buffers with samples which are distributed close to each other. Roughly speaking, render at a higher resolution with fewer AA samples per pixel (probably some weighting tricks are needed as well). Then the compositor either deals with each individual buffer extracted from the bigger one, or just downsamples with a nice kernel.

This would also help denoising, by giving extra information about whether noise is caused by bad convergence or is just a high-frequency texture applied to the mesh. Or even help with importance map calculation for adaptive sampling, things like that.

And in any case, even if we apply the filter to each of those AA samples, I'm not really sure the patch will give a measurable improvement.
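The "render higher resolution, then downsample with a nice kernel" idea above can be sketched minimally (hypothetical helper; a 2x supersampled buffer reduced with a plain box kernel, where the "nice" version would use overlapping tent or Blackman-Harris footprints):

```cpp
#include <vector>

// Reduce a 2x supersampled single-channel buffer to final resolution.
// Each output pixel averages its 2x2 block of input samples; a nicer
// kernel would weight a wider, overlapping footprint instead.
std::vector<float> downsample_2x(const std::vector<float> &src,
                                 int src_w, int src_h)
{
    std::vector<float> dst((src_w / 2) * (src_h / 2));
    for (int y = 0; y < src_h / 2; y++) {
        for (int x = 0; x < src_w / 2; x++) {
            float sum = src[(2 * y) * src_w + 2 * x] +
                        src[(2 * y) * src_w + 2 * x + 1] +
                        src[(2 * y + 1) * src_w + 2 * x] +
                        src[(2 * y + 1) * src_w + 2 * x + 1];
            dst[y * (src_w / 2) + x] = 0.25f * sum;
        }
    }
    return dst;
}
```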

here is a test from PBRT2, with and without filter, Mitchell 1.5

What does "without filter" mean, a box filter? The difference between those two images is too big to be explained by just a different pixel filter, did these renders use the same number of samples?

The edges seem much sharper in the PBRT example. Either the number of samples is different there, or there is some rather extreme edge-preserving filtering going on (which is less likely, because of the high-frequency noise appearing in the filtered image).

Please make sure you're comparing apples to apples: the exact same scene, the exact same sampling/light path settings, just a different filter size, and no extra biasing settings.

Mohamed Sakr (msakr) added a comment. Edited Jul 24 2015, 6:34 PM

same number of samples (32 spp); the noisy one is using a very small filter, which effectively contributes nothing to neighboring pixels through the filter function, as the filter size in the noisy one is 0.4 and in the good one 1.5; both are using Mitchell

everything in that scene is exactly the same; the only difference is the filter radius (0.4 vs 1.5)

and by the way, this is the only reason why I started this patch, as I saw how effective it is in PBRT2

also, I have seen this filtering style in many renderers, like V-Ray and Arnold

Ah, but that's not a valid comparison. You're making the filter smaller than a pixel, but PBRT is still sampling the entire pixel, which means most of the samples fall outside of it (or at least have a very small weight). So you're effectively rendering with fewer samples and it gets more noisy.

In fact this is a case where the filter importance sampling as used by Cycles would render with a lot less noise than PBRT, though a filter size of 0.4 is not something you would use in practice.

I see, but what if we use both?
filter MIS reduces noise (variance),
but it doesn't address AA (you still see very sharp diagonal lines even if you are using 10000 spp);
here this filter can give you the interpolation between pixels (more blurry), so it enhances the AA look.

maybe use a Gaussian filter width of 2 for MIS, and a width of 1 for the kernel filter? That way the blur will only cover a 3x3 pixel area, which gives less blurry results, but more visually pleasing than zero inter-pixel color interpolation.

Well there's two things that should be considered separately. One is the importance sampling method which affects how quickly the image converges. If it looks visually pleasing or not does not matter, because if the implementation is correct it will look exactly the same with many AA samples.

The other is the choice of pixel filter which affects how it looks when converged. For that I think it's worth trying Blackman-Harris, which is like Gaussian but supposedly has less aliasing.
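The Blackman-Harris filter Brecht mentions could be evaluated as a pixel filter roughly like this (a sketch using the standard 4-term window coefficients; the helper name and the mapping of offset to window position are assumptions, not Cycles code):

```cpp
#include <cmath>

// 4-term Blackman-Harris window used as a pixel filter. x is the
// offset from the filter center, width the total filter width, so
// the support is |x| <= width / 2. The window peaks at 1 in the
// center and falls to ~6e-5 at the edges, and is non-negative.
float blackman_harris(float x, float width)
{
    if (std::fabs(x) > 0.5f * width)
        return 0.0f;
    const float a0 = 0.35875f, a1 = 0.48829f, a2 = 0.14128f, a3 = 0.01168f;
    const float pi = 3.14159265358979f;
    float u = 2.0f * pi * (x / width + 0.5f); /* map [-w/2, w/2] to [0, 2*pi] */
    return a0 - a1 * std::cos(u) + a2 * std::cos(2.0f * u) -
           a3 * std::cos(3.0f * u);
}
```

Because the weights are never negative, it avoids the ringing and clamping problems of Mitchell and sinc discussed above.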

so what shall I do now? (if the MIS sampling filter is better, then this task is useless)
I guess I'll remove the filter from inside the kernels, and just add a Blackman-Harris filter function to the code.

what I want to implement inside Cycles:
1- http://sglab.kaist.ac.kr/~bcmoon/ which has http://sglab.kaist.ac.kr/LinearPrediction/ (the first paper on the page right now)
I think this is the paper that you referenced that is used in PRMan.

I already checked the previous paper (Adaptive Rendering Based on Weighted Local Regression, "WLR") and actually improved it inside PBRT2. Moon et al. released the code for PBRT2, and it was really a good filter, but it was only usable for uniform noise distributions + direct diffuse.

so if you have an interior scene, where PT will be horrible, this filter will generate artifacts due to bad variance / lack of correct lighting.
I improved it to take into account more data, like shadows, specular, and some glossiness.

2- https://mediatech.aalto.fi/publications/graphics/GPT/
this one gives less variance with fewer samples, using the gradient domain, but the cost per sample is about 2.5x that of PT (it does the sample plus 4 other samples in -x, +x, -y, +y). The result is way better, so 1000 spp in GPT is better than 4000 spp in PT; the improvement, as stated in the paper, is about 1.6x.

3- http://jo.dreggn.org/home/, Johannes Hanika, Manifold Next Event Estimation (MNEE)
a decent paper which suits Cycles PT very well, as it will help to catch specular rays, so it will help a lot with all glass shaders, caustics, ...

4- https://disney-animation.s3.amazonaws.com/uploads/production/publication_asset/70/asset/Sorted_Deferred_Shading_For_Production_Path_Tracing.pdf
as the title says :) , and this type of work suits the GPU as it contains a lot of sorting and coherence, but it needs very tight texture pixel usage (like Ptex). I tested the main concept of this in PBRT2 and it gave some improvement, but only a small one due to the use of UV texturing: instead of sorting by Ptex index, I was sorting by UV coordinate.

5- Embree
I checked Cycles against Corona, Arnold, V-Ray, ....
in a complex scene, all of the others start in around 6~8 seconds; Cycles takes 18 seconds.

Such discussions are better happening outside of the patch review, because they're not really related to the code and will only confuse reviewers.


Some quick feedback; the rest I'd prefer to happen on the bf-cycles ML.

  1. Didn't check the paper in detail yet, but which exact filtering are you referring to? Since it's an adaptive sampling technique, it could only mean additional samples are not distributed in noisy areas. Or are you referring to artifacts in the importance map?
  2. Gradient domain rendering is indeed something to consider looking into. It will also be interesting to investigate which filters we can exploit the gradient input for.
  3. That's quite a specific case of participating media rendering; not sure it can be integrated in a really clean way. Maybe it could, though..
  4. Isn't that something which helps out-of-core rendering, by basically reducing re-loading of the texture cache and avoiding mutex locks? We don't do that kind of rendering or tile caching in Cycles, so it might not be really relevant.
  5. It's not really clear what exactly "starting in around 6-8 seconds" means. Starting to render? The preparation step in Cycles is not really as fast as it could be. It's also good to know which exact scene you've been rendering and how you set up your materials.

Maybe @Brecht Van Lommel (brecht) has some additional feedback to confirm/bust my understanding of the papers, but as said, prefer to switch to a ML discussion for such questions.

  1. I don't know if this is the paper that PRMan uses or if what they do has even been published anywhere. But certainly there is a lot of interesting research going on in this area, and this is a great paper.
  2. Also a very good paper. The reconstruction artifacts look pretty bad to me though. I guess that's a big question with all these fancy reconstruction methods: how to make them conservative enough to avoid bad artifacts, while still keeping the speedup. I think the feedback loop with adaptive sampling as in 1. is important for that.
  3. Great paper, but it's only for certain types of caustics, and it's only making rendering of some very difficult effects bearable, but still very slow. I think there are other rendering methods that will help Cycles users a lot more in typical scenes, so I wouldn't give this high priority. I imagine manifold exploration is quite tricky to implement for general scenes and difficult for users to understand when it fails.
  4. There are so many things that can be optimized in Cycles in terms of being able to handle big scenes; this paper seems like a lot of complexity for both users and developers. For example, it won't work well in interactive rendering. My opinion is still that the focus should be on making data structures more compact, doing good adaptive subdivision, and making use of the OIIO texture cache (works mostly for OSL, not currently supported in SVM due to lack of derivatives).

If all that has been done and it's still not enough, only then should something like this be tried. But note that nearly every studio is doing ok without a sorting/coherence system like this, there's not going to be many Cycles users that do things at Disney scale.

  5. Not sure what the starting refers to; do you mean those other renderers have a faster BVH build?

about 5., yep, they get something like a 2.5x speedup in BVH build.

about 1., Disney released the paper, and in PRMan they said they used the recent Disney paper, so I think this is it.

I think 1 and 2 can be mixed, as they both depend on the scene gradient data in general!

about 3., this is for luxury :D , maybe an option for the user; it helps a lot to catch details. Why did I mention it? Paper 1 is mainly used to significantly reduce samples, so instead of rendering 4k samples you can be OK with 150~200 samples; if these samples are catching details, they will solve a huge problem the paper suffers from (glossy/specular surfaces, caustic definition)

about 4., it is all about texture access and ray intersection coherence. With Cycles you are using many layers of textures with a complex node setup; in interior scenes Cycles is crawling (no irradiance, no light cache / photon map, ...), so to solve interior scenes (even using light portals) you will need more than 2000 spp, and even 1 and 2 won't help much with this (high variance, etc.). So the solution is to make texture access way faster. But this is a nightmare, as you said, for the developer: to do something like this, the CUDA kernel needs to be split similar to the OpenCL kernels, then do 2 sortings per ray bounce, one for intersections and one for BSDFs.

Embree is important here; it has kernels that improve hair intersection significantly on Xeon devices (which have wide vectors, suitable for GPU).

about @Sergey Sharybin (sergey)'s question on the scene used: it is just a product visualization scene, but the product has a few million triangles. Other renderers initialize the whole scene, including the BVH, in 6 seconds; Cycles takes 4 seconds for the scene data, then 12 seconds for the BVH creation (and I guess they don't use a Morton dynamic BVH here).

also, its build time is quite good. I know Cycles uses an old Embree kernel, but it needs an update (imagine a scene which renders in 30 minutes instead rendering in 10~20 minutes).

tell me where to discuss this stuff, as I don't know :)
in my opinion, of the papers above, 1 is the most significant, then 3, 5, 4, 2.

other than that, tell me what to do with the patch so I can finalize it.

Regarding this patch, I guess it's your job to convince the Cycles developers that this sufficiently reduces noise in important cases to be worth the added code complexity. If that is clear, then someone can do a more detailed review of the code. Personally I'm not totally convinced that this should be added, the choice to go with filter importance sampling was not by accident, but I never did a detailed comparison so maybe I was wrong.

Currently the patch is not functioning correctly, it's rendering too blurry, so it's impossible to judge the usefulness.

From a quick glance at the patch it seems that the filter importance sampling code is still active, so it may be doing double filtering? I don't understand the logic in the code well. The filter_table is an inverted CDF for importance sampling, but you don't need that for what you're trying to do. Rather I would expect uniform sampling inside a pixel, and filter weights computed from evaluating the filter function, without any CDFs or inversion.
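What Brecht describes here — uniform sampling inside the pixel and direct filter evaluation, with no CDFs or inversion — would look roughly like this (a sketch with hypothetical names; the real patch would work on Cycles pass buffers and, on the GPU, need atomics for the two `+=` lines):

```cpp
#include <cmath>
#include <vector>

// Splat one camera sample at subpixel position (sx, sy) with radiance
// L into value/weight buffers. The filter is evaluated directly at
// each neighboring pixel's offset from the sample; no CDF is needed.
template <typename Filter>
void splat_sample(std::vector<float> &value, std::vector<float> &weight,
                  int img_w, int img_h, float sx, float sy, float L,
                  float radius, Filter filter_eval)
{
    int x0 = (int)std::floor(sx - radius), x1 = (int)std::ceil(sx + radius);
    int y0 = (int)std::floor(sy - radius), y1 = (int)std::ceil(sy + radius);
    for (int y = y0; y < y1; y++) {
        for (int x = x0; x < x1; x++) {
            if (x < 0 || y < 0 || x >= img_w || y >= img_h)
                continue;
            float w = filter_eval(sx - (x + 0.5f), sy - (y + 0.5f));
            value[y * img_w + x] += w * L;   /* atomics needed on the GPU */
            weight[y * img_w + x] += w;
        }
    }
}
```

Final pixel values are then value / weight, exactly the normalization already done in the write-out code at the top of this task.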

@Brecht Van Lommel (brecht), yeah I know that; the above code is not final. The images here are from updated code without MIS and CDFs, using a different table.

this is a quick test, using a Mitchell filter of width = 1, 64 spp

Thanks, but did you verify that it converges to the same result with many AA samples? It still looks more blurry to me.

Mohamed Sakr (msakr) added a comment. Edited Jul 25 2015, 6:26 AM

not the exact same result (there is a difference because the filter contributes weights to neighboring pixels); this is considered a bias, but the difference is a smoother image.
let me explain the main difference:

consider an edge (like a hair) which is very thin, so it is seen by some pixels and not by others:

pixels: 0  1  0
        0  0  1
        0  0  1

with filter it will be:

pixels: 0.10 0.75 0.2
        0.05 0.25 0.9
        0.00 0.15 0.9

these values are just illustrative (they are totally wrong for sure); the point is that when you zoom in on the image in Photoshop, you won't see a very harsh transition.
the error gets smaller with larger images (full HD, 4K, ..)

here is a larger image (1600 x 1200), 1600 spp. I'm using 2 different Mitchell filter widths (MIS with width 2, kernel filter with width 1); 1 image with the kernel filter, 1 without.
the Hallway image uses the same concept, but only 64 spp; I did this to show how it works in general (specular artifacts are spreading). The good point: we can remove the specular component from the filter spread, so it affects only its own pixel, not the neighboring pixels.
you can check how the edges look in both examples: edges with the filter look much better, and noise is also lower in the filtered Hallway (regardless of the specular artifacts).

not the exact same result (there is a difference because the filter contributes weights to neighboring pixels); this is considered a bias, but the difference is a smoother image.

No, it should converge to exactly the same result because both methods get values from neighbouring pixels. The filter importance sampling will sample an area bigger than one pixel which is really the same thing, it just has different noise characteristics.

All this can be done unbiased, I don't think there's any inherent bias in using a pixel filter.

hmm, explain more; there is something here that I'm missing and don't understand.
I think it is biased because when the camera casts a ray at pixel (x, y), the contribution should go only to that pixel; with this filter, the contribution goes to all surrounding pixels too.

See figures 3, 4, 5 in the filter importance sampling paper: http://lgdv.cs.fau.de/get/785. I can't really explain it better than them.

We are trying to render an image with a particular pixel filter, which may be bigger than a pixel. Whether that's wrong or right is a matter of taste; it depends on what kind of camera you're trying to simulate, what kind of screen you will be displaying the image on, how soft you want things to look, etc.

But when a user says they want to use a pixel filter of some size, then we try to render that. So if the filter is bigger than a pixel we want to use samples from outside that pixel, that's not biased, it's what the user asked for.

When we talk about bias in rendering we usually talk about biased/unbiased estimators: https://en.wikipedia.org/wiki/Bias_of_an_estimator

In simple terms, we are doing unbiased rendering if we can take many renders with different seeds and average them, and then get the correct result. That works with unbiased path tracing, but for example not with irradiance caching. In this pixel filter case, both methods are unbiased if implemented correctly.
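In estimator terms, the two methods under discussion can be sketched like this (notation assumed for illustration: f is the pixel filter, L the radiance, x_i the sample positions):

```latex
% Target pixel value (filter-weighted, normalized):
I = \frac{\int f(x)\, L(x)\, \mathrm{d}x}{\int f(x)\, \mathrm{d}x}

% Splatting: x_i sampled uniformly, filter evaluated as a weight:
\hat{I}_{\mathrm{splat}} = \frac{\sum_i f(x_i)\, L(x_i)}{\sum_i f(x_i)}

% Filter importance sampling: x_i \sim p(x) = f(x) / \int f(x)\,\mathrm{d}x :
\hat{I}_{\mathrm{FIS}} = \frac{1}{N} \sum_i L(x_i)
```

Both estimators converge to the same I as the sample count grows; they differ only in variance, which is the "different noise characteristics" mentioned above. Note also that FIS needs f to be non-negative to serve as a sampling density, which connects to Brecht's earlier caution about Mitchell and sinc.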

Mohamed Sakr (msakr) added a comment. Edited Jul 25 2015, 2:32 PM

oh :D , for a long time "unbiased" in my mind meant anything that uses connections (PT, BDPT, MLT, ..) with brute force and no smoothing,
so irradiance caching, light cache, photon mapping, ... are the typical biased algorithms (they use interpolation and merging), as their results are just interpolations (SPPM is consistent though, but still biased).

so I thought that since we are averaging pixel contributions, this is a bias :)
BTW, I guess it is still a bias, as it could be considered a "merge at the camera point"

I checked the paper, thanks for the link!! I'm wondering where you check raster_x and raster_y inside kernel_camera.h's camera_sample(); from the paper it appears that samples can be generated anywhere (possibly outside the whole buffer).

I'm wondering where you check raster_x and raster_y inside kernel_camera.h's camera_sample(); from the paper it appears that samples can be generated anywhere (possibly outside the whole buffer).

Yes, samples can be generated outside the buffer. I'm not sure what you mean by "checking" raster_x and raster_y. They are computed by importance sampling the filter function and can be arbitrarily far outside the pixel, which works fine without any further checks.

ah, I see, this is the benefit of the MIS filter :) (writing to the current pixel)

so to finalize the code, I have to change the buffers a little (as described earlier, the buffer will be a little larger by a margin of pixels depending on the filter size). This will also be a benefit later, when implementing the image denoising algorithm from paper (1), as it has to access/write to neighboring pixels as well.

about the code complexity, I will try to keep it as straightforward as possible, possibly with some #define macros.
for now, the kernel behaves correctly. I will update and provide a diff soon.