
Sanitize size handling in write code
Closed, Resolved · Public · Design

Description

Current handling of size (in bytes) in the writer code is rather messy, to say the least: different parts of the code use int, uint and size_t. BHead itself stores the size in an int...

While int should be enough in most cases (it allows chunks of at most 2GB), we are now hitting some rare issues; see e.g. T78529: Blend file corrupted during save caused by high Cubemap Size.

I think we should at the very least use size_t everywhere in functions, and assert/try to handle the issue when the actual size exceeds BHead's int capacity?
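
A minimal sketch of that idea, assuming a BHead-like header that stores its length as an int (the struct and helper names here are illustrative, not Blender's actual API): sizes stay size_t throughout and are only narrowed to int once we know they fit.

```
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Approximation of a BHead-style header: the length is a plain int,
 * so anything above INT_MAX (~2GB) cannot be represented. */
typedef struct BHeadSketch {
  int code;
  int len; /* chunk size in bytes */
  const void *old;
  int SDNAnr, nr;
} BHeadSketch;

/* Hypothetical write helper: take size_t, assert in debug builds and
 * refuse to write in release builds rather than wrapping around. */
static int writedata_sketch(BHeadSketch *bhead, const void *buf, size_t len)
{
  assert(len <= (size_t)INT_MAX);
  if (len > (size_t)INT_MAX) {
    return 0; /* better a missing chunk than a corrupted header */
  }
  bhead->len = (int)len;
  /* ... write the header and the payload here ... */
  (void)buf;
  return 1;
}
```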

Ultimately it might be nice to allow bigger chunks (using int64_t in BHead)? But I am not sure how we could handle that in a compatible way?

Event Timeline

Bastien Montagne (mont29) changed the task status from Needs Triage to Confirmed.Aug 5 2020, 6:01 PM
Bastien Montagne (mont29) created this task.
Bastien Montagne (mont29) changed the subtype of this task from "Report" to "Design".

Looked into this and there don't seem to be any elegant options, seeing as BHead.len is always used as the final length irrespective of the kind of data written.

Increase limit to 4GB

Firstly, we could make BHead.len unsigned. As long as new files don't use large allocations, they will still load in older Blender versions.

In the case they do, older Blender versions will see the length as len < 0 and stop loading the file.
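
A sketch of the two interpretations of the same 32 bits, under the assumption that only the signedness changes and the file layout stays identical (function names are illustrative):

```
#include <stdint.h>
#include <stdio.h>

/* New reader: treat the stored bits as unsigned, allowing up to ~4GB. */
static int64_t bhead_len_new_reader(int stored_len)
{
  return (int64_t)(uint32_t)stored_len;
}

/* Old reader: a >2GB chunk shows up as a negative length,
 * so loading can be aborted cleanly instead of reading garbage. */
static int bhead_len_old_reader(int stored_len)
{
  if (stored_len < 0) {
    fprintf(stderr, "Unsupported chunk size, stopping file read\n");
    return -1;
  }
  return stored_len;
}
```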

Mentioning this because, as far as I can see, all the alternatives are quite involved.


Encode Large Data in the Existing Format

There are some other tricks which could work, but they would most likely make the code messy/unmaintainable; if they could be done in a manageable way, we could use them in the case of over-2GB chunks being written.
This would need to be done in a way that older Blender versions can read back, skipping the large allocations *. If that can't be properly supported, we might as well just make all the BHead variables 64-bit and break forward compatibility for older Blender versions.

We could, for example, add a new DATA code used only for >2GB blocks; older Blender versions would skip it, while new versions could read/write the data split across multiple BHeads that each don't exceed INT_MAX (see the sketch below).
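
A sketch of that splitting on the write side (the chunk code, callback and function names are hypothetical, not Blender's API): anything larger than INT_MAX is emitted as several headers whose individual lengths all fit in an int, for a newer reader to reassemble.

```
#include <limits.h>
#include <stddef.h>

#define DATA_LARGE_CODE 0x4C444154 /* hypothetical code for >2GB data */

/* Callback standing in for whatever writes one header plus its payload. */
typedef void (*write_chunk_fn)(int code, const char *buf, int len);

static void write_large_data_sketch(const char *buf, size_t total,
                                    write_chunk_fn write_chunk)
{
  size_t offset = 0;
  while (offset < total) {
    size_t remaining = total - offset;
    int chunk = (remaining > (size_t)INT_MAX) ? INT_MAX : (int)remaining;
    write_chunk(DATA_LARGE_CODE, buf + offset, chunk);
    offset += (size_t)chunk;
  }
}
```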

I considered using some kind of tag to show that the BHead should be treated differently, in a way that causes it to be ignored. While there are a few options, BHead.nr could be used for this, as it's currently ignored on read, so it could hint at different behavior.

However, we would still need to skip the ignored data in older Blender versions, so unless we do something really strange (writing data after the ENDB chunk, for example), I don't think it's such a good option.


Other Limits

  • GZip read/write uses int arguments for sizes; handling >INT_MAX would require splitting the operation into multiple chunks (see the sketch after this list).
  • Packed files currently store their size as ints.
  • RNA doesn't currently support numbers larger than int (to access packed-file size, for example).
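
For the GZip point, a sketch assuming zlib's gzwrite(): its length argument is an unsigned int and its return value is an int, so a buffer larger than INT_MAX has to be streamed in capped chunks (the helper name is illustrative).

```
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <zlib.h>

static bool gzwrite_all(gzFile gz, const char *buf, size_t total)
{
  size_t offset = 0;
  while (offset < total) {
    size_t remaining = total - offset;
    /* Cap each call so the length and the return value both fit in int. */
    unsigned chunk = (remaining > (size_t)INT_MAX) ? (unsigned)INT_MAX
                                                   : (unsigned)remaining;
    int written = gzwrite(gz, buf + offset, chunk);
    if (written <= 0) {
      return false; /* write error */
    }
    offset += (size_t)written;
  }
  return true;
}
```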

Conclusions:

I'm not convinced it's practical to support >2GB BHeads without breaking forward compatibility.

Short term, we could make BHead.len unsigned, then update BHead for a major release (3.0 for example) to support 64-bit BHeads, with the ability to optionally save blend files for older versions.


* Skipping the large allocations might not be stable and could cause crashes on load; older Blender versions would need to allow for these struct members to be NULL.

Bastien Montagne (mont29) closed this task as Resolved.Sep 20 2020, 9:28 PM
Bastien Montagne (mont29) claimed this task.

Sanitized the code a bit in rB5ea1049e7520: Sanitize type 'size' parameters in our read/write file code. I think we'll have to go with that for now.