
Sanitize size handling in write code
Closed, Resolved · Public · Design

Description

Current handling of size (in bytes) in the writer code is rather messy, to say the least: different parts of the code use int, uint and size_t. BHead itself stores the size in an int...

While int should be enough in most cases (it allows chunks of at most 2GB), we are now hitting some rare issues; see e.g. T78529: Blend file corrupted during save caused by high Cubemap Size.

I think we should at the very least use size_t everywhere in functions, and assert/try to handle the issue when the actual size exceeds BHead's int capacity?
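
A minimal sketch of that idea, assuming a BHead-like header that stores its length as an int (the struct and helper names here are illustrative, not Blender's actual API): sizes stay size_t throughout and are only narrowed to int once we know they fit.

```
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Approximation of a BHead-style header: the length is a plain int,
 * so anything above INT_MAX (~2GB) cannot be represented. */
typedef struct BHeadSketch {
  int code;
  int len; /* chunk size in bytes */
  const void *old;
  int SDNAnr, nr;
} BHeadSketch;

/* Hypothetical write helper: take size_t, assert in debug builds and
 * refuse to write in release builds rather than wrapping around. */
static int writedata_sketch(BHeadSketch *bhead, const void *buf, size_t len)
{
  assert(len <= (size_t)INT_MAX);
  if (len > (size_t)INT_MAX) {
    return 0; /* better a missing chunk than a corrupted header */
  }
  bhead->len = (int)len;
  /* ... write the header and the payload here ... */
  (void)buf;
  return 1;
}
```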

Ultimately it might be nice to allow bigger chunks (using int64_t in BHead)? But I am not sure how we could handle that in a compatible way?

Event Timeline

Bastien Montagne (mont29) changed the task status from Needs Triage to Confirmed.Aug 5 2020, 6:01 PM
Bastien Montagne (mont29) created this task.
Bastien Montagne (mont29) changed the subtype of this task from "Report" to "Design".

Looked into this and there don't seem to be any elegant options, seeing as BHead.len is always used as the final length irrespective of the kind of data written.

Increase limit to 4GB

Firstly, we could make BHead.len unsigned. As long as new files don't use large allocations, they will still load in older Blender versions.

In the case they do, older Blender versions will see the length as len < 0 and stop loading the file.
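
A sketch of the two interpretations of the same 32 bits, under the assumption that only the signedness changes and the file layout stays identical (function names are illustrative):

```
#include <stdint.h>
#include <stdio.h>

/* New reader: treat the stored bits as unsigned, allowing up to ~4GB. */
static int64_t bhead_len_new_reader(int stored_len)
{
  return (int64_t)(uint32_t)stored_len;
}

/* Old reader: a >2GB chunk shows up as a negative length,
 * so loading can be aborted cleanly instead of reading garbage. */
static int bhead_len_old_reader(int stored_len)
{
  if (stored_len < 0) {
    fprintf(stderr, "Unsupported chunk size, stopping file read\n");
    return -1;
  }
  return stored_len;
}
```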

Mentioning this because, as far as I can see, all the alternatives are quite involved.


Encode Large Data in the Existing Format

There are some other tricks which could work, but they would most likely make the code messy/unmaintainable; if they could be done in a manageable way, we could use them in the case of over-2GB chunks being written.
This would need to be done in a way that older Blender versions can read back, skipping the large allocations *. If that can't be properly supported, we might as well just make all the BHead variables 64-bit and break forward compatibility for older Blender versions.

We could, for example, add a new DATA code used only for >2GB blocks; older Blender versions would skip it, while new versions could read/write the data split across multiple BHeads that each don't exceed INT_MAX (see the sketch below).
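
A sketch of that splitting on the write side (the chunk code, callback and function names are hypothetical, not Blender's API): anything larger than INT_MAX is emitted as several headers whose individual lengths all fit in an int, for a newer reader to reassemble.

```
#include <limits.h>
#include <stddef.h>

#define DATA_LARGE_CODE 0x4C444154 /* hypothetical code for >2GB data */

/* Callback standing in for whatever writes one header plus its payload. */
typedef void (*write_chunk_fn)(int code, const char *buf, int len);

static void write_large_data_sketch(const char *buf, size_t total,
                                    write_chunk_fn write_chunk)
{
  size_t offset = 0;
  while (offset < total) {
    size_t remaining = total - offset;
    int chunk = (remaining > (size_t)INT_MAX) ? INT_MAX : (int)remaining;
    write_chunk(DATA_LARGE_CODE, buf + offset, chunk);
    offset += (size_t)chunk;
  }
}
```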

I considered using some kind of tag to show that the BHead should be treated differently, in a way that causes it to be ignored. While there are a few options, BHead.nr could be used for this, as it's currently ignored on read, so it could hint at different behavior.

However, we would still need to skip the ignored data in older Blender versions, so unless we do something really strange (writing data after the ENDB chunk, for example), I don't think it's such a good option.


Other Limits

  • GZip read/write uses int arguments for sizes; handling >INT_MAX would require splitting the operation into multiple chunks (see the sketch after this list).
  • Packed files currently store their size as ints.
  • RNA doesn't currently support numbers larger than int (to access packed-file size, for example).
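
For the GZip point, a sketch assuming zlib's gzwrite(): its length argument is an unsigned int and its return value is an int, so a buffer larger than INT_MAX has to be streamed in capped chunks (the helper name is illustrative).

```
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <zlib.h>

static bool gzwrite_all(gzFile gz, const char *buf, size_t total)
{
  size_t offset = 0;
  while (offset < total) {
    size_t remaining = total - offset;
    /* Cap each call so the length and the return value both fit in int. */
    unsigned chunk = (remaining > (size_t)INT_MAX) ? (unsigned)INT_MAX
                                                   : (unsigned)remaining;
    int written = gzwrite(gz, buf + offset, chunk);
    if (written <= 0) {
      return false; /* write error */
    }
    offset += (size_t)written;
  }
  return true;
}
```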

Conclusions:

I'm not convinced it's practical to support >2GB BHeads without breaking forward compatibility.

Short term, we could make BHead.len unsigned, then update BHead for a major release (3.0 for example) to support 64-bit BHeads, with the ability to optionally save blend files for older versions.


* Skipping the large allocations might not be stable and could cause crashes on load; older Blender versions would need to allow for these struct members to be NULL.

Bastien Montagne (mont29) closed this task as Resolved.Sep 20 2020, 9:28 PM
Bastien Montagne (mont29) claimed this task.

Sanitized the code a bit in rB5ea1049e7520: Sanitize type 'size' parameters in our read/write file code. I think we'll have to go with that for now.