Cycles: Implement Dwivedi guiding for path-traced subsurface scattering

Cycles has supported path-traced subsurface scattering for a while, but while it's more accurate than other approaches, the increase in noise makes it an expensive option. To improve this, this patch implements Dwivedi guiding, a technique that is based on zero-variance random walk theory from particle physics and helps to produce shorter random walks with more consistent throughput. The idea behind this is that in non-white materials, each scattering event inside the medium reduces the path throughput. Therefore, the darker the material is, the lower the contribution of paths that travel far from the origin is. In order to reduce variance, Dwivedi guiding uses modified direction and distance sampling functions that favor paths which go back towards the medium interface. By carefully selecting these sampling distributions, variance can be greatly reduced, and as a neat side effect shorter paths are produced, which speeds up the process. One limitation of just blindly applying this is that the guiding is derived from the assumption of a medium that covers an infinite half-space. Therefore, at corners or thin geometry where this does not hold, the algorithm might lead to fireflies. To avoid this, the implementation here uses MIS to combine the classic and guided sampling. Since each of those works on one of the three color channels, the final estimator combines six sampling techniques. This results in some unintuitive math, but I tried to structure it in a way that makes some sense. Another improvement is that in areas where the other side of the mesh is close (e.g. ears), the algorithm has a chance to switch to guiding towards the other side. This chance is based on how deep the random walk is inside the object, and once again MIS is applied to the decision, giving a total of nine techniques. Combining all this, the noise of path-traced subsurface scattering is reduced significantly. In my testing with the Rain character model and a simple lighting setup, the path-traced SSS is now actually less noisy than the Christensen-Burley approximation at same render time while of course still being significantly more realistic. Differential Revision: https://developer.blender.org/D9932
2021-01-14 20:55:52 +01:00 · 2021-01-14 20:55:52 +01:00 · 2f6d62bf88
parent 67c8d97db3
commit 2f6d62bf88
1 changed files with 205 additions and 41 deletions
--- a/intern/cycles/kernel/kernel_subsurface.h
+++ b/intern/cycles/kernel/kernel_subsurface.h
@ -336,38 +336,95 @@ ccl_device_noinline void subsurface_scatter_multi_setup(
 ccl_device void subsurface_random_walk_remap(const float A,
                                             const float d,
                                             float *sigma_t,
-                                             float *sigma_s)
+                                             float *alpha)
 {
  /* Compute attenuation and scattering coefficients from albedo. */
-  const float a = 1.0f - expf(A * (-5.09406f + A * (2.61188f - A * 4.31805f)));
+  *alpha = 1.0f - expf(A * (-5.09406f + A * (2.61188f - A * 4.31805f)));
  const float s = 1.9f - A + 3.5f * sqr(A - 0.8f);

  *sigma_t = 1.0f / fmaxf(d * s, 1e-16f);
-  *sigma_s = *sigma_t * a;
 }

 ccl_device void subsurface_random_walk_coefficients(const ShaderClosure *sc,
                                                    float3 *sigma_t,
-                                                    float3 *sigma_s,
+                                                    float3 *alpha,
                                                    float3 *weight)
 {
  const Bssrdf *bssrdf = (const Bssrdf *)sc;
  const float3 A = bssrdf->albedo;
  const float3 d = bssrdf->radius;
  float sigma_t_x, sigma_t_y, sigma_t_z;
-  float sigma_s_x, sigma_s_y, sigma_s_z;
+  float alpha_x, alpha_y, alpha_z;

-  subsurface_random_walk_remap(A.x, d.x, &sigma_t_x, &sigma_s_x);
-  subsurface_random_walk_remap(A.y, d.y, &sigma_t_y, &sigma_s_y);
-  subsurface_random_walk_remap(A.z, d.z, &sigma_t_z, &sigma_s_z);
+  subsurface_random_walk_remap(A.x, d.x, &sigma_t_x, &alpha_x);
+  subsurface_random_walk_remap(A.y, d.y, &sigma_t_y, &alpha_y);
+  subsurface_random_walk_remap(A.z, d.z, &sigma_t_z, &alpha_z);

  *sigma_t = make_float3(sigma_t_x, sigma_t_y, sigma_t_z);
-  *sigma_s = make_float3(sigma_s_x, sigma_s_y, sigma_s_z);
+  *alpha = make_float3(alpha_x, alpha_y, alpha_z);

  /* Closure mixing and Fresnel weights separate from albedo. */
  *weight = safe_divide_color(bssrdf->weight, A);
 }

+/* References for Dwivedi sampling:
+ *
+ * [1] "A Zero-variance-based Sampling Scheme for Monte Carlo Subsurface Scattering"
+ * by Jaroslav Křivánek and Eugene d'Eon (SIGGRAPH 2014)
+ * https://cgg.mff.cuni.cz/~jaroslav/papers/2014-zerovar/
+ *
+ * [2] "Improving the Dwivedi Sampling Scheme"
+ * by Johannes Meng, Johannes Hanika, and Carsten Dachsbacher (EGSR 2016)
+ * https://cg.ivd.kit.edu/1951.php
+ *
+ * [3] "Zero-Variance Theory for Efficient Subsurface Scattering"
+ * by Eugene d'Eon and Jaroslav Křivánek (SIGGRAPH 2020)
+ * https://iliyan.com/publications/RenderingCourse2020
+ */
+
+ccl_device_forceinline float eval_phase_dwivedi(float v, float phase_log, float cos_theta)
+{
+  /* Eq. 9 from [2] using precomputed log((v + 1) / (v - 1))*/
+  return 1.0f / ((v - cos_theta) * phase_log);
+}
+
+ccl_device_forceinline float sample_phase_dwivedi(float v, float phase_log, float rand)
+{
+  /* Based on Eq. 10 from [2]: v - (v + 1) * pow((v - 1) / (v + 1), rand)
+   * Since we're already precomputing phase_log = log((v + 1) / (v - 1)) for the evaluation,
+   * we can implement the power function like this. */
+  return v - (v + 1) * expf(-rand * phase_log);
+}
+
+ccl_device_forceinline float diffusion_length_dwivedi(float alpha)
+{
+  /* Eq. 67 from [3] */
+  return 1.0f / sqrtf(1.0f - powf(alpha, 2.44294f - 0.0215813f * alpha + 0.578637f / alpha));
+}
+
+ccl_device_forceinline float3 direction_from_cosine(float3 D, float cos_theta, float randv)
+{
+  float sin_theta = safe_sqrtf(1.0f - cos_theta * cos_theta);
+  float phi = M_2PI_F * randv;
+  float3 dir = make_float3(sin_theta * cosf(phi), sin_theta * sinf(phi), cos_theta);
+
+  float3 T, B;
+  make_orthonormals(D, &T, &B);
+  return dir.x * T + dir.y * B + dir.z * D;
+}
+
+ccl_device_forceinline float3 subsurface_random_walk_pdf(float3 sigma_t,
+                                                         float t,
+                                                         bool hit,
+                                                         float3 *transmittance)
+{
+  float3 T = volume_color_transmittance(sigma_t, t);
+  if (transmittance) {
+    *transmittance = T;
+  }
+  return hit ? T : sigma_t * T;
+}
+
 #ifdef __KERNEL_OPTIX__
 ccl_device_inline /* inline trace calls */
 #else
@ -390,10 +447,24 @@ ccl_device_noinline
    return 0;
  }

-  /* Convert subsurface to volume coefficients. */
-  float3 sigma_t, sigma_s;
+  /* Convert subsurface to volume coefficients.
+   * The single-scattering albedo is named alpha to avoid confusion with the surface albedo. */
+  float3 sigma_t, alpha;
  float3 throughput = make_float3(1.0f, 1.0f, 1.0f);
-  subsurface_random_walk_coefficients(sc, &sigma_t, &sigma_s, &throughput);
+  subsurface_random_walk_coefficients(sc, &sigma_t, &alpha, &throughput);
+  float3 sigma_s = sigma_t * alpha;
+
+  /* Theoretically it should be better to use the exact alpha for the channel we're sampling at
+   * each bounce, but in practise there doesn't seem to be a noticeable difference in exchange
+   * for making the code significantly more complex and slower (if direction sampling depends on
+   * the sampled channel, we need to compute its PDF per-channel and consider it for MIS later on).
+   *
+   * Since the strength of the guided sampling increases as alpha gets lower, using a value that
+   * is too low results in fireflies while one that's too high just gives a bit more noise.
+   * Therefore, the code here uses the highest of the three albedos to be safe. */
+  float diffusion_length = diffusion_length_dwivedi(max3(alpha));
+  /* Precompute term for phase sampling. */
+  float phase_log = logf((diffusion_length + 1) / (diffusion_length - 1));

  /* Setup ray. */
 #ifdef __SPLIT_KERNEL__
@ -414,68 +485,161 @@ ccl_device_noinline

  /* Random walk until we hit the surface again. */
  bool hit = false;
+  bool have_opposite_interface = false;
+  float opposite_distance = 0.0f;
+
+  /* Todo: Disable for alpha>0.999 or so? */
+  const float guided_fraction = 0.75f;

  for (int bounce = 0; bounce < BSSRDF_MAX_BOUNCES; bounce++) {
    /* Advance random number offset. */
    state->rng_offset += PRNG_BOUNCE_NUM;

-    if (bounce > 0) {
-      /* Sample scattering direction. */
-      const float anisotropy = 0.0f;
-      float scatter_u, scatter_v;
-      path_state_rng_2D(kg, state, PRNG_BSDF_U, &scatter_u, &scatter_v);
-      ray->D = henyey_greenstrein_sample(ray->D, anisotropy, scatter_u, scatter_v, NULL);
-    }
-
    /* Sample color channel, use MIS with balance heuristic. */
    float rphase = path_state_rng_1D(kg, state, PRNG_PHASE_CHANNEL);
-    float3 albedo = safe_divide_color(sigma_s, sigma_t);
    float3 channel_pdf;
-    int channel = kernel_volume_sample_channel(albedo, throughput, rphase, &channel_pdf);
-
-    /* Distance sampling. */
-    float rdist = path_state_rng_1D(kg, state, PRNG_SCATTER_DISTANCE);
+    int channel = kernel_volume_sample_channel(alpha, throughput, rphase, &channel_pdf);
    float sample_sigma_t = kernel_volume_channel_get(sigma_t, channel);
-    float t = -logf(1.0f - rdist) / sample_sigma_t;
+    float randt = path_state_rng_1D(kg, state, PRNG_SCATTER_DISTANCE);

-    ray->t = t;
+    /* We need the result of the raycast to compute the full guided PDF, so just remember the
+     * relevant terms to avoid recomputing them later. */
+    float backward_fraction = 0.0f;
+    float forward_pdf_factor = 0.0f;
+    float forward_stretching = 1.0f;
+    float backward_pdf_factor = 0.0f;
+    float backward_stretching = 1.0f;
+
+    /* For the initial ray, we already know the direction, so just do classic distance sampling. */
+    if (bounce > 0) {
+      /* Decide whether we should use guided or classic sampling. */
+      bool guided = (path_state_rng_1D(kg, state, PRNG_LIGHT_TERMINATE) < guided_fraction);
+
+      /* Determine if we want to sample away from the incoming interface.
+       * This only happens if we found a nearby opposite interface, and the probability for it
+       * depends on how close we are to it already.
+       * This probability term comes from the recorded presentation of [3]. */
+      bool guide_backward = false;
+      if (have_opposite_interface) {
+        /* Compute distance of the random walk between the tangent plane at the starting point
+         * and the assumed opposite interface (the parallel plane that contains the point we
+         * found in our ray query for the opposite side). */
+        float x = clamp(dot(ray->P - sd->P, -sd->N), 0.0f, opposite_distance);
+        backward_fraction = 1.0f / (1.0f + expf((opposite_distance - 2 * x) / diffusion_length));
+        guide_backward = path_state_rng_1D(kg, state, PRNG_TERMINATE) < backward_fraction;
+      }
+
+      /* Sample scattering direction. */
+      float scatter_u, scatter_v;
+      path_state_rng_2D(kg, state, PRNG_BSDF_U, &scatter_u, &scatter_v);
+      float cos_theta;
+      if (guided) {
+        cos_theta = sample_phase_dwivedi(diffusion_length, phase_log, scatter_u);
+        /* The backwards guiding distribution is just mirrored along sd->N, so swapping the
+         * sign here is enough to sample from that instead. */
+        if (guide_backward) {
+          cos_theta = -cos_theta;
+        }
+      }
+      else {
+        cos_theta = 2.0f * scatter_u - 1.0f;
+      }
+      ray->D = direction_from_cosine(sd->N, cos_theta, scatter_v);
+
+      /* Compute PDF factor caused by phase sampling (as the ratio of guided / classic).
+       * Since phase sampling is channel-independent, we can get away with applying a factor
+       * to the guided PDF, which implicitly means pulling out the classic PDF term and letting
+       * it cancel with an equivalent term in the numerator of the full estimator.
+       * For the backward PDF, we again reuse the same probability distribution with a sign swap.
+       */
+      forward_pdf_factor = 2.0f * eval_phase_dwivedi(diffusion_length, phase_log, cos_theta);
+      backward_pdf_factor = 2.0f * eval_phase_dwivedi(diffusion_length, phase_log, -cos_theta);
+
+      /* Prepare distance sampling.
+       * For the backwards case, this also needs the sign swapped since now directions against
+       * sd->N (and therefore with negative cos_theta) are preferred. */
+      forward_stretching = (1.0f - cos_theta / diffusion_length);
+      backward_stretching = (1.0f + cos_theta / diffusion_length);
+      if (guided) {
+        sample_sigma_t *= guide_backward ? backward_stretching : forward_stretching;
+      }
+    }
+
+    /* Sample direction along ray. */
+    float t = -logf(1.0f - randt) / sample_sigma_t;
+
+    /* On the first bounce, we use the raycast to check if the opposite side is nearby.
+     * If yes, we will later use backwards guided sampling in order to have a decent
+     * chance of connecting to it.
+     * Todo: Maybe use less than 10 times the mean free path? */
+    ray->t = (bounce == 0) ? max(t, 10.0f / (min3(sigma_t))) : t;
    scene_intersect_local(kg, ray, ss_isect, sd->object, NULL, 1);
    hit = (ss_isect->num_hits > 0);

    if (hit) {
 #ifdef __KERNEL_OPTIX__
      /* t is always in world space with OptiX. */
-      t = ss_isect->hits[0].t;
+      ray->t = ss_isect->hits[0].t;
 #else
      /* Compute world space distance to surface hit. */
      float3 D = ray->D;
      object_inverse_dir_transform(kg, sd, &D);
      D = normalize(D) * ss_isect->hits[0].t;
      object_dir_transform(kg, sd, &D);
-      t = len(D);
+      ray->t = len(D);
 #endif
    }

+    if (bounce == 0) {
+      /* Check if we hit the opposite side. */
+      if (hit) {
+        have_opposite_interface = true;
+        opposite_distance = dot(ray->P + ray->t * ray->D - sd->P, -sd->N);
+      }
+      /* Apart from the opposite side check, we were supposed to only trace up to distance t,
+       * so check if there would have been a hit in that case. */
+      hit = ray->t < t;
+    }
+
+    /* Use the distance to the exit point for the throughput update if we found one. */
+    if (hit) {
+      t = ray->t;
+    }
+
    /* Advance to new scatter location. */
    ray->P += t * ray->D;

-    /* Update throughput. */
-    float3 transmittance = volume_color_transmittance(sigma_t, t);
-    float pdf = dot(channel_pdf, (hit) ? transmittance : sigma_t * transmittance);
-    throughput *= ((hit) ? transmittance : sigma_s * transmittance) / pdf;
+    float3 transmittance;
+    float3 pdf = subsurface_random_walk_pdf(sigma_t, t, hit, &transmittance);
+    if (bounce > 0) {
+      /* Compute PDF just like we do for classic sampling, but with the stretched sigma_t. */
+      float3 guided_pdf = subsurface_random_walk_pdf(forward_stretching * sigma_t, t, hit, NULL);
+
+      if (have_opposite_interface) {
+        /* First step of MIS: Depending on geometry we might have two methods for guided
+         * sampling, so perform MIS between them. */
+        float3 back_pdf = subsurface_random_walk_pdf(backward_stretching * sigma_t, t, hit, NULL);
+        guided_pdf = mix(
+            guided_pdf * forward_pdf_factor, back_pdf * backward_pdf_factor, backward_fraction);
+      }
+      else {
+        /* Just include phase sampling factor otherwise. */
+        guided_pdf *= forward_pdf_factor;
+      }
+
+      /* Now we apply the MIS balance heuristic between the classic and guided sampling. */
+      pdf = mix(pdf, guided_pdf, guided_fraction);
+    }
+
+    /* Finally, we're applying MIS again to combine the three color channels.
+     * Altogether, the MIS computation combines up to nine different estimators:
+     * {classic, guided, backward_guided} x {r, g, b} */
+    throughput *= (hit ? transmittance : sigma_s * transmittance) / dot(channel_pdf, pdf);

    if (hit) {
      /* If we hit the surface, we are done. */
      break;
    }
-
-    /* Russian roulette. */
-    float terminate = path_state_rng_1D(kg, state, PRNG_TERMINATE);
-    float probability = min(max3(fabs(throughput)), 1.0f);
-    if (terminate >= probability) {
-      break;
-    }
-    throughput /= probability;
  }

  kernel_assert(isfinite_safe(throughput.x) && isfinite_safe(throughput.y) &&