Cycles volume: fast empty space optimization by generating a tight mesh around the volume.
ClosedPublic

Authored by Kévin Dietrich (kevindietrich) on Feb 4 2018, 1:01 AM.

Details

Summary

From the GSoC ideas page. I never thought about doing such an optimization, and I was so impressed with the idea that I had an urge to do it :)

This implements a basic marching cubes algorithm to generate a mesh around a volume given an isovalue.

There are two parameters to the algorithm that are not yet exposed in the UI: the isovalue and the cube size. There are no values that work in general, so it is best to let users pick whatever values fit their volumes best, as it is possible to get small gaps/holes in the generated mesh. The number of triangles generated is also proportional to the resolution of the volume and the cube size.
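To illustrate how the two parameters would be used, here is a minimal sketch of the cube classification step (the function and the sampling callback are illustrative only, not the patch's actual code):

#include <functional>

/* Sketch: classify one marching cube against the isovalue. `sample` returns the
 * volume value at a position and `cube_size` is the edge length of the cube
 * being marched; the resulting 8-bit mask would index the standard marching
 * cubes edge/triangle tables. */
static int classify_cube(const std::function<float (float, float, float)> &sample,
                         float x, float y, float z,
                         float cube_size, float isovalue)
{
    int mask = 0;
    for(int corner = 0; corner < 8; corner++) {
        const float cx = x + ((corner & 1) ? cube_size : 0.0f);
        const float cy = y + ((corner & 2) ? cube_size : 0.0f);
        const float cz = z + ((corner & 4) ? cube_size : 0.0f);
        if(sample(cx, cy, cz) > isovalue)
            mask |= (1 << corner);
    }
    return mask;  /* 0 or 255 means the cube is entirely outside or inside */
}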

For now, boundaries are not handled, which means that if the volume touches a side of its domain, there will be a huge hole in the mesh.

(The generated mesh can of course be rendered by using a diffuse shader on the domain.)

Diff Detail

Repository
rB Blender

This gives me a 2.2x speedup on this file:

Great to see the idea actually works!

I wasn't thinking about the marching cubes algorithm actually, but rather axis aligned quads between empty and non-empty voxels. And then you'd make that smarter by considering e.g. 8x8x8 cubes instead, and merging neighbouring quads and deduplicating vertices to reduce memory usage. For OpenVDB files that would then also naturally work by building around leaf nodes, without having to read the actual voxel data.
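Roughly, that could look like the sketch below, at single-voxel granularity (all names are made up for illustration; the 8x8x8 variant would run the same neighbour test on coarser cells):

#include <cstddef>
#include <vector>

/* Sketch: emit an axis-aligned quad wherever an occupied voxel borders an empty
 * one. `occupied` is a flat boolean grid of size width * height * depth. */
struct Quad { int x, y, z, face; };  /* face: 0..5 for +X, -X, +Y, -Y, +Z, -Z */

static void emit_boundary_quads(const std::vector<bool> &occupied,
                                int width, int height, int depth,
                                std::vector<Quad> &quads)
{
    auto at = [&](int x, int y, int z) -> bool {
        if(x < 0 || y < 0 || z < 0 || x >= width || y >= height || z >= depth)
            return false;  /* outside the domain counts as empty */
        return occupied[((size_t)z * height + y) * width + x];
    };

    for(int z = 0; z < depth; z++) {
        for(int y = 0; y < height; y++) {
            for(int x = 0; x < width; x++) {
                if(!at(x, y, z))
                    continue;
                /* One quad per face shared with an empty neighbour. */
                if(!at(x + 1, y, z)) quads.push_back({x, y, z, 0});
                if(!at(x - 1, y, z)) quads.push_back({x, y, z, 1});
                if(!at(x, y + 1, z)) quads.push_back({x, y, z, 2});
                if(!at(x, y - 1, z)) quads.push_back({x, y, z, 3});
                if(!at(x, y, z + 1)) quads.push_back({x, y, z, 4});
                if(!at(x, y, z - 1)) quads.push_back({x, y, z, 5});
            }
        }
    }
}

Merging coplanar neighbouring quads and deduplicating shared corner vertices, as described above, would then reduce memory usage further.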

Some other things to make this robust:

  • All used grids should be taken into account, not just density. For example there can be fire with zero smoke density.
  • The code for this should be somewhere in the mesh update code, using the voxel attributes of a mesh.
  • Cube size should be hardcoded; no need to bother users with this. Ideally bounding mesh memory usage could be something like < 10% of the grid itself, but we'll have to measure where the best trade-off is.
  • A threshold below which grid values are set to zero should ideally be applied by the smoke simulator, to save disk space when writing cache files. If that works well enough we don't need a renderer setting.
  • Two voxels of padding are needed for cubic interpolation, one for linear interpolation. If we support volume displacement in the future, an additional user-specified amount of padding would need to be supported as well.

> I wasn't thinking about the marching cubes algorithm actually, but rather axis aligned quads between empty and non-empty voxels. And then you'd make that smarter by considering e.g. 8x8x8 cubes instead, and merging neighbouring quads and deduplicating vertices to reduce memory usage. For OpenVDB files that would then also naturally work by building around leaf nodes, without having to read the actual voxel data.

This reminds me that I had a similar idea for OpenGL rendering of VDB grids. But instead of using leaf nodes, I used level 2 nodes: each level 2 node would be turned into a cube and then a 3D texture would be created based on the buffers of the leaf nodes inside it. I still have some of this code in my volume object branch.

We could mix this idea with that of the marching cubes algorithm. We march an 8x8x8 cube and see if any corners are in the volume. If one corner is in the volume, we create polygons for each of the three opposite cube faces; if two corners are in the volume, we create polygons for the two opposite cube faces; and if three or four corners are in the volume, we create a single polygon for the opposite cube face. Hopefully we shouldn't have to handle more cases: if more than four corners are in the volume, then the surrounding cubes will take care of creating vertices for the corners that are not in the volume. Then it would be just a matter of deduplicating vertices.

> • A threshold below which grid values are set to zero should ideally be applied by the smoke simulator, to save disk space when writing cache files. If that works well enough we don't need a renderer setting.

There is a hardcoded threshold for when we write .vdb files to decide between active and inactive voxels, but not for the Blender specific pointcache format. Maybe we should expose this threshold in the smoke simulation UI?

> • Two voxels of padding are needed for cubic interpolation, one for linear interpolation. If we support volume displacement in the future, an additional user-specified amount of padding would need to be supported as well.

Right. Padding should also take care of the boundary issue I currently have. However, the 8x8x8 cube idea won't work for OpenVDB grids without accessing the voxels, as padding may include areas that are outside the allocated leaf nodes, so we would still need to read the entire data, or perhaps just part of it (we could then free the data, and reload as needed during rendering).

> This reminds me that I had a similar idea for OpenGL rendering of VDB grids. But instead of using leaf nodes, I used level 2 nodes: each level 2 node would be turned into a cube and then a 3D texture would be created based on the buffers of the leaf nodes inside it. I still have some of this code in my volume object branch.

Right, that kind of sparse tiled texture storage would be good for both OpenGL and Cycles. In practice I think you'd need to pack tiles into a single texture, to avoid limits on the number of bound textures. And a dense grid with each cell pointing to one or no tile.
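Purely as an assumption about how such packed storage could be laid out (none of these names exist in Cycles), the lookup might be structured like this:

#include <cstddef>
#include <vector>

/* Sketch of sparse tiled storage: a dense, low-resolution index grid maps every
 * tile-sized cell either to a slot in a packed tile atlas or to "empty". */
struct SparseTiles {
    static const int TILE = 8;    /* tile resolution per axis */
    std::vector<int> index_grid;  /* one entry per cell, -1 = no tile */
    std::vector<float> atlas;     /* packed tiles, TILE^3 floats each */
    int cells_x, cells_y, cells_z;

    float sample(int x, int y, int z) const
    {
        const int cx = x / TILE, cy = y / TILE, cz = z / TILE;
        const int tile = index_grid[(cz * cells_y + cy) * cells_x + cx];
        if(tile < 0)
            return 0.0f;  /* empty space, no tile allocated */
        const int lx = x % TILE, ly = y % TILE, lz = z % TILE;
        return atlas[(size_t)tile * TILE * TILE * TILE + (lz * TILE + ly) * TILE + lx];
    }
};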

> We could mix this idea with that of the marching cubes algorithm. We march an 8x8x8 cube and see if any corners are in the volume. If one corner is in the volume, we create polygons for each of the three opposite cube faces; if two corners are in the volume, we create polygons for the two opposite cube faces; and if three or four corners are in the volume, we create a single polygon for the opposite cube face. Hopefully we shouldn't have to handle more cases: if more than four corners are in the volume, then the surrounding cubes will take care of creating vertices for the corners that are not in the volume. Then it would be just a matter of deduplicating vertices.

I think we'd still need to consider all 8x8x8 voxels in the cube to ensure we are not cutting out any detail? But then this technique would indeed give you tighter bounds. My only concern here is performance: we want bounding mesh construction to be really fast, but this may be entirely worth it if many samples are taken.

> There is a hardcoded threshold for when we write .vdb files to decide between active and inactive voxels, but not for the Blender specific pointcache format. Maybe we should expose this threshold in the smoke simulation UI?

Yes, I think so. This threshold could then be applied when exporting the grids to OpenGL or Cycles, or maybe earlier in the smoke simulation if there's a good place for it.
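For illustration, applying such a threshold is just a pass over the grid values (a trivial sketch, not tied to any existing function):

#include <cstddef>

/* Sketch: zero out values below the threshold so that nearly-empty voxels
 * become truly empty and can be skipped by the bounding mesh and the cache. */
static void apply_threshold(float *values, size_t num_values, float threshold)
{
    for(size_t i = 0; i < num_values; i++) {
        if(values[i] < threshold)
            values[i] = 0.0f;
    }
}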

> Right. Padding should also take care of the boundary issue I currently have. However, the 8x8x8 cube idea won't work for OpenVDB grids without accessing the voxels, as padding may include areas that are outside the allocated leaf nodes, so we would still need to read the entire data, or perhaps just part of it (we could then free the data, and reload as needed during rendering).

I don't think padding requires reading the voxels? For a small padding, you could just push the quads 1 or 2 voxels outwards in a way that they still form a watertight mesh. If you're padding more than the cube size you need to do some kind of dilation, but not for individual voxels.

I rewrote the algorithm to make use of 8x8x8 cubes, since the marching cubes idea with a cube size greater than one does not really work: for it to work, we would need to interpolate all the voxels around the corners of the cube within a distance equivalent to the size of the cube, which is not efficient.

The current algorithm effectively downsamples the volume to one that is 1/8th of the original resolution, such that each voxel in the downsampled version corresponds to an 8x8x8 voxel cube in the original volume. We do not actually downsample the volume; we just map the coordinates of the active voxels to their 1/8 counterparts (see code for details). It is basically a 3D map describing where there is data. And here is the result:

Overall the algorithm is quite fast, and it trivially supports looking up multiple grids (velocity, colors, etc.). I think such an approach would translate fairly easily to the creation of a tiled 3D image. It should also translate fairly well to OpenVDB, as all we would need is the minimum coordinate of each leaf node.
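A minimal sketch of that coordinate mapping (names are illustrative, not the actual patch code):

#include <cstddef>
#include <vector>

/* Sketch: mark, for every active voxel, the 8x8x8 cube it falls into. The
 * resulting byte grid is a low-resolution map of where there is data. */
static void build_index_map(const std::vector<float> &density,
                            size_t width, size_t height, size_t depth,
                            float threshold,
                            std::vector<char> &index_map,
                            size_t &cubes_x, size_t &cubes_y, size_t &cubes_z)
{
    cubes_x = (width + 7) / 8;
    cubes_y = (height + 7) / 8;
    cubes_z = (depth + 7) / 8;
    index_map.assign(cubes_x * cubes_y * cubes_z, 0);

    for(size_t z = 0; z < depth; z++) {
        for(size_t y = 0; y < height; y++) {
            for(size_t x = 0; x < width; x++) {
                if(density[(z * height + y) * width + x] > threshold) {
                    /* Map the voxel coordinate to its 1/8 counterpart. */
                    index_map[((z / 8) * cubes_y + (y / 8)) * cubes_x + (x / 8)] = 1;
                }
            }
        }
    }
}

In the OpenVDB case, the same map could be filled from the leaf-node origins instead of from individual voxels.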

The interior geometry is screwing up the render though:

Right now I am trying to make the algorithm somewhat smarter by not creating the interior geometry, which is trivial, but there is a bug: some faces are missing. I tried multiple volumes, and it seems that the faces from the first few nodes are missing, but I haven't found where the bug is.

And the volume somehow disappears from the render (I guess we need a completely closed mesh).

Also, my vertex deduplication scheme is no longer effective, so I have a bit of bug hunting to do. In the meantime, I am going to update the patch if you want to have a look at the new implementation. The way I am getting the image data is not nice, though.


And now that I think of it, do we actually need padding? The volume data is not truncated, it is just the mesh that is "truncated". So if you sample the volume at the first hit on the mesh, the volume data is still there. In any case, I think padding can simply be implemented by moving vertices, as a post-process, along their normals by a factor of sqrt(3) * voxel size * interpolation size, since their normals should be along the diagonals of the voxels.
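For what it's worth, a sketch of that post-process under the assumption of per-vertex normals (types and names are illustrative):

#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

/* Sketch: push every vertex outwards along its normalized normal. The sqrt(3)
 * factor compensates for corner normals pointing along the voxel diagonal, so
 * the per-axis displacement roughly equals the requested padding. */
static void pad_vertices(std::vector<Vec3> &vertices,
                         const std::vector<Vec3> &normals,
                         float voxel_size, int pad_voxels)
{
    const float offset = std::sqrt(3.0f) * voxel_size * (float)pad_voxels;
    for(size_t i = 0; i < vertices.size(); i++) {
        vertices[i].x += normals[i].x * offset;
        vertices[i].y += normals[i].y * offset;
        vertices[i].z += normals[i].z * offset;
    }
}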

Update patch as a checkpoint.

Great, this is how I imagined it working. Some code comments, even though this is work in progress and you might already be aware of some of them.

> And the volume somehow disappears from the render (I guess we need a completely closed mesh).

If the mesh is not completely closed you'd still see something, I think. Perhaps the normals should point in the opposite direction (flip the vertex indices in the triangle)?
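In code, that flip is a one-liner; a trivial, self-contained sketch (the triangle type is illustrative):

#include <algorithm>

struct Triangle { int v[3]; };

/* Swapping two vertex indices reverses the winding order, which flips the
 * direction of the geometric normal derived from it. */
static void flip_winding(Triangle &tri)
{
    std::swap(tri.v[1], tri.v[2]);
}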

> And now that I think of it, do we actually need padding? The volume data is not truncated, it is just the mesh that is "truncated". So if you sample the volume at the first hit on the mesh, the volume data is still there. In any case, I think padding can simply be implemented by moving vertices, as a post-process, along their normals by a factor of sqrt(3) * voxel size * interpolation size, since their normals should be along the diagonals of the voxels.

The padding is needed because in texture filtering we interpolate from the 2x2x2 or 3x3x3 voxels surrounding the position. The filter for a position that is just outside the grid can cover a voxel inside the grid. Often it's not particularly noticeable; it mainly happens with dense volumes that don't have a smooth falloff towards the boundary.

I think doing the padding as a post process is fine for padding less than 4 voxels, which is all we need right now for texture filtering.

For volume displacement (which is not supported yet) we will need more padding, and displacing the quads very far along the normal will create a mess. That could be solved by dilating the index_map grid. When we do this in the future I think the displacement along the normal will still be part of the solution as well. If you want to pad by 11 voxels for example, you'd get a tighter mesh by dilating the 8x8x8 cubes once, and then displacing the quads by the remaining 3 voxel padding along the normal.

intern/cycles/render/image.cpp
87

I think you could return the DeviceMemory pointer here. That will give you the data type and width/height/depth. The DeviceMemory is all that is passed to the device/kernel code, so the information in there should be sufficient.

intern/cycles/render/mesh.cpp
579–580

This should be clearing more. Basically all the mesh and subd topology data, as well as any mesh attributes except for the voxel types.

2002

This should loop over all voxel attributes:

foreach(Attribute& attr, mesh->attributes.attributes) {
    if(attr.type == ATTR_ELEMENT_VOXEL) {

intern/cycles/render/mesh_volume.cpp
27

Change to size_t x, size_t y, size_t z.

94–96

This should be rounded up so that resolutions that are not a multiple of 8 work:

const size_t x = divide_up(params->resolution.x, 8);
const size_t y = divide_up(params->resolution.y, 8);
const size_t z = divide_up(params->resolution.z, 8);

Also use size_t for the total size to support very large grids.

112–116

Change int to size_t for x, index_x, index, etc.

184

size_t voxel_index

334

Same here, this should also handle all voxel attributes in a more generic way.

373

We should get the resolution from the image manager for each voxel grid. Handling grids with different resolutions is more complicated, and we don't have to support that immediately, but we should at least cancel building the mesh instead of crashing in that case.

381

size_t voxel_index

460–462

In the end these stats should be printed with VLOG(1) <<, but while debugging it doesn't matter yet.

Kévin Dietrich (kevindietrich) marked 10 inline comments as done. Feb 24 2018, 4:34 PM

The missing quads issue was due to the mesh's shader array not being updated, so clearing it before creating the new mesh fixes the issue. (I guess they had an invalid shader or a transparent one?)

Since we are creating cubes, the algorithm now adds normals to the mesh as well; as they are simply perpendicular to the faces, it is trivial to add them.

Padding is done as a post-process as discussed; however, the mesh is a bit rounded:

I think this is because vertices whose normals are perpendicular to the faces are displaced a little too much, as we use the diagonal of a unit cube as a coefficient.

However, I have a new issue: if I do a final render, the mesh is scaled down:

In some other files, even in the viewport render, there is some matrix issue:

Maybe some matrix attribute is missing? Or I am not computing the vertex positions in the right space. Or both.

Also, in both cases the volume is not rendered. From a bit of debugging, it seems that this time there are some issues with clearing the mesh's attributes. Previously, the volume would not render because the normals were not pointing in the right direction, as you said.

It appears that if I delete non-voxel attributes from the attributes list, nothing is rendered apart from the plane and the emitter, whether I assign the face normals manually in the algorithm or let Cycles compute them.

However, if I do not clear any of the attributes, I get random corrupted renders:

With a debug build it asserts in the kernel "Non-finite sum in path_radiance_clamp_and_sum!".

I am going to update the patch so you can judge the handling/freeing of the non-voxel attributes.

Update patch as checkpoint.

Here are a couple of fixes:

  • Create the volume mesh before the object manager applies transforms, computes mesh bounds, etc.
  • Modify the attribute remove code so it doesn't make copies of Attributes internally, which caused image slot issues.
  • Some tweaks to the padding code.

This seems to solve the scaling problems and now renders volumes correctly for me in a simple test.

Fix for adaptive domains: determine the start point, cell size and resolution from the transform and images we already have.

Not intending to take over the patch, by the way; I was just curious to try it on some test files here. This one is 2.1x faster.

For the padding it may be better to loop over the quads and then displace the vertices along the quad face normal? We'd have to avoid displacing a vertex twice along the same direction though.

About padding, we can do it directly on the integer vertex coordinates like this: P624.

However, now I realize that just moving the vertex coordinates after the mesh has been generated doesn't always work. Specifically with this kind of topology:

----
|  |
-------
   |  |
   ----

Not sure what the best solution is for that.

Thanks for the fixes!

I did some renders to check on speed differences, and I am getting some artifacts. In the first set of renders, the shadow of the artifacts is visible below the torus. In these files I have between ~1.3x and ~1.6x speedups.

The artifacts seem to be related to the ray direction/angle of intersection, and they only seem to affect corners of the cubes.


As for the padding issue you raised, I think we are going about it the complicated way. I see two ways to approach padding more appropriately:

  1. We merge all grids into a boolean one, with 0s for non-active voxels and 1s for active voxels, then we dilate it by the padding size and use that boolean grid as input to the current algorithm. Since the current algorithm is already merging the grids into one, we can simply do it like this:
for(int px = x - pad_size; px < x + pad_size; ++px) {
    for(int py = y - pad_size; py < y + pad_size; ++py) {
        for(int pz = z - pad_size; pz < z + pad_size; ++pz) {
            builder.add_node(px, py, pz);
        }
    }
}

Although it's more expensive, it will nicely take care of the cases where we don't really need padding, e.g. when there are only voxels in less than half of the 8x8x8 cube, and it will be as tight as possible. However, for OpenVDB we can't do it without accessing the voxel data.

  2. We simply dilate index_map according to the padding size (once if the padding size is < 8, twice if < 16, etc.), which will give us more padding than needed (see the sketch below).

We can use the first idea for dense grids where we have no choice but to access voxel data, and the second for OpenVDB grids.
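A sketch of the second option, assuming the index_map is a flat byte grid as in the current patch (names are illustrative):

#include <cstddef>
#include <vector>

/* Sketch: one 6-neighbourhood dilation step of a flat byte grid. For a padding
 * of N voxels, run this divide_up(N, 8) times, since each index_map cell covers
 * an 8x8x8 block of voxels. */
static void dilate_index_map(std::vector<char> &grid, int nx, int ny, int nz)
{
    const std::vector<char> src = grid;
    auto at = [&](int x, int y, int z) -> char {
        if(x < 0 || y < 0 || z < 0 || x >= nx || y >= ny || z >= nz)
            return 0;
        return src[((size_t)z * ny + y) * nx + x];
    };

    for(int z = 0; z < nz; z++) {
        for(int y = 0; y < ny; y++) {
            for(int x = 0; x < nx; x++) {
                if(at(x, y, z))
                    continue;
                if(at(x - 1, y, z) || at(x + 1, y, z) ||
                   at(x, y - 1, z) || at(x, y + 1, z) ||
                   at(x, y, z - 1) || at(x, y, z + 1)) {
                    grid[((size_t)z * ny + y) * nx + x] = 1;
                }
            }
        }
    }
}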

For the artifacts, try increasing Light Paths > Transparency Max. It seems quite possible to exceed the default 8. We should probably increase the default a lot, or ignore it for volumes like this.

For padding, yes I think what you propose is ok.

  • Use a byte grid for the index_map.
  • Add padding when adding nodes.
  • Cleanup.

This is getting close to ready, I think; it's working well here.

Even with a high transparency max the faces are still showing up in the noise, before disappearing with more samples. I'm looking into path termination and Russian roulette changes to improve this.

intern/cycles/render/mesh_volume.cpp
91

Include util/util_vector.h and use vector instead of std::vector. We generally import STL datatypes into the Cycles namespace, and in this case also use a custom allocator.

179

Add something like static const int CUBE_SIZE = 8; to replace 8 everywhere.

192

Use vector<char> grid; and grid.resize(px * py * pz, 0); no manual memory management if we can avoid it.

412–417

We should not have these standard grids hardcoded. Instead we can make it generic and store it in a vector for fast access. Something like:

struct VoxelAttributeGrid {
    float *data;
    int channels;
};
vector<VoxelAttributeGrid> voxel_grids;

...

    VoxelAttributeGrid voxel_grid;
    voxel_grid.data = static_cast<float *>(image_memory->host_pointer);
    voxel_grid.channels = image_memory->data_elements;
    voxel_grids.push_back(voxel_grid);

  • Handle comments from review.

Looks good to me, thanks a lot!

This revision is now accepted and ready to land. Feb 28 2018, 11:36 PM

Thanks for this Kévin and Brecht.

I'm not sure if this is ready for non-developer tests. Anyway, here are some artifacts in this file, frame 150 in my test:

Patched on 4403ca8

@YAFU (YAFU), this is more than ready for testing; actually, it is ready to be committed to master. For the artifacts, you should try increasing the Light Paths > Transparency Max setting; I set it to 16 in my test files.

Artifacts should be fixed now after rB7f86afec9d67 (Cycles: don't count volume boundaries as transparent bounces), even without increasing transparency max.

Ah, cool!
Here with Max Transparency = 16:

Code looks good to me! Nice work!

intern/cycles/render/mesh_volume.cpp
397

If we keep this for master (for debugging), I suggest adding a comment to make it clear to everyone why this is here.

This revision was automatically updated to reflect the committed changes.