How to use & transition to std::filesystem? #90379

Closed
opened 2021-08-02 15:39:56 +02:00 by Julian Eisel · 21 comments
Member

Motivation

C++17 introduces the std::filesystem library which contains plenty of useful functionality. Operating system specific handling is abstracted away into a standard library, there are very useful and carefully designed types & functions to interact with. For example it introduces a proper path object type that is way more convenient and secure to use than the C-string manipulation we do in Blender currently.

It would be great to be able to use this as it would make file system interaction much easier, better tested/maintained, better designed (arguably ;) ), better documented and actually standardized (hence more familiar for devs coming from other C++ software). Sybren and I would like to use it ASAP for asset system related code.

Platform Compatibility

According to the [C++17 compiler support overview ]] we can't use std::filesystem with our current minimum requirements. We'd need to drop macOS 10.13 and 10.14 and possibly Visual Studio versions until 2017 15.7 (current minimum version [ https:*wiki.blender.org/wiki/Building_Blender#Compiler_Versions | only states 2017 ).

Notes:

  • We can't use XCode 11 until macOS 10.15. In 10.14 it can only be used by creating an Apple developer account and manually downloading it.
  • 10.14 is the last version of macOS to support 32 Bit. We can expect a decent amount of users to stick to this release for 32 bit games and applications.
  • The VFX Reference Platform draft for 2022 foresees macOS 10.15 as Minimum Deployment Target.

Conclusion: We may be able to update our requirements before too long, but as of now it's too early. In other words, we can't use std::filesystem yet.

std::filesystem vs. BLI functions

We have other options than using std::filesystem directly. But there's another discussion point:
Blender already has library functions for file system and path handling, all written in C in BLI. Concerns were raised about using std::filesystem while the BLI APIs also exist with entirely different implementations. Porting the API would probably take quite some effort & time, and could cause dangerous regressions.


Proposal

Short proposal:

  • Use the following, light-weight, std::filesystem compatible implementation of the API: https://github.com/gulrak/filesystem
  • Encourage usage of that for new C++ code.
  • Cover BLI file system & path APIs well with unit testing.
  • Gradually port BLI APIs to use it for internal logic.

We could change the BLI API designs a bit, e.g. rather than passing paths as character arrays, have an opaque path handle that uses ghc::filesystem::path internally. Path operations would have to be done through the API then (which may have to be extended a bit). Generally we should aim for getting the API as close as possible to std::filesystem.

### Motivation C++17 introduces the `std::filesystem` library which contains plenty of useful functionality. Operating system specific handling is abstracted away into a standard library, there are very useful and carefully designed types & functions to interact with. For example it introduces a proper `path` object type that is way more convenient and secure to use than the C-string manipulation we do in Blender currently. It would be great to be able to use this as it would make file system interaction much easier, better tested/maintained, better designed (arguably ;) ), better documented and actually standardized (hence more familiar for devs coming from other C++ software). Sybren and I would like to use it ASAP for asset system related code. ### Platform Compatibility According to the [C++17 compiler support overview ]] we can't use `std::filesystem` with our current minimum requirements. We'd need to drop macOS 10.13 and 10.14 and possibly Visual Studio versions until 2017 15.7 (current minimum version [[ https:*wiki.blender.org/wiki/Building_Blender#Compiler_Versions | only states 2017 ](https:*en.cppreference.com/w/cpp/compiler_support/17)). Notes: * We can't use XCode 11 until macOS 10.15. In 10.14 it can only be used by creating an Apple developer account and manually downloading it. * 10.14 is the last version of macOS to support 32 Bit. We can expect a decent amount of users to stick to this release for 32 bit games and applications. * The [VFX Reference Platform ](https://vfxplatform.com/) draft for 2022 foresees macOS 10.15 as Minimum Deployment Target. **Conclusion:** We may be able to update our requirements before too long, but as of now it's too early. In other words, we can't use `std::filesystem` yet. ### std::filesystem vs. BLI functions We have other options than using `std::filesystem` directly. But there's another discussion point: Blender already has library functions for file system and path handling, all written in C in BLI. Concerns were raised about using `std::filesystem` while the BLI APIs also exist with entirely different implementations. Porting the API would probably take quite some effort & time, and could cause dangerous regressions. --- ### Proposal Short proposal: * Use the following, light-weight, `std::filesystem` compatible implementation of the API: https://github.com/gulrak/filesystem * Encourage usage of that for new C++ code. * Cover BLI file system & path APIs well with unit testing. * Gradually port BLI APIs to use it for internal logic. We could change the BLI API designs a bit, e.g. rather than passing paths as character arrays, have an opaque path handle that uses `ghc::filesystem::path` internally. Path operations would have to be done through the API then (which may have to be extended a bit). Generally we should aim for getting the API as close as possible to `std::filesystem`.
Author
Member

Added subscriber: @JulianEisel

Added subscriber: @JulianEisel
Julian Eisel changed title from How to use & transition to `std::filesystem`? to How to use & transition to std::filesystem? 2021-08-02 15:41:00 +02:00

Added subscriber: @dr.sybren

Added subscriber: @dr.sybren

As an avid user of Python's pathlib module, I'm strongly in favour of this proposal. IMO we should move forward with this, unless there are concrete, technical issues that first needs resolving.

As an avid user of [Python's pathlib](https://docs.python.org/3/library/pathlib.html) module, I'm strongly in favour of this proposal. IMO we should move forward with this, unless there are concrete, technical issues that first needs resolving.
Member

Added subscriber: @HooglyBoogly

Added subscriber: @HooglyBoogly
Member

Added subscriber: @ankitm

Added subscriber: @ankitm
Member

Proposal 2

This has a high cost in terms of download/ space/ system configuration, but the pro is that we get std::filesystem right away!
Use LLVM toolchain on every mac dev station.
I have been using the same for over a year now and Blender compiles fine, runs fine, and there's leak sanitizer which Apple toolchain doesn't have for the same compiler version.
We'd need to ship some additional dylibs but that has become easier after D11997 :)
Also, some #defines would be needed before including filesystem header to turn off platform symbol availability checks.

Proposal 3

The argument against boost::filesystem was that third party libs shouldn't be used for core features.. but ghc seems fine for that.. how so?

## Proposal 2 This has a high cost in terms of download/ space/ system configuration, but the pro is that we get `std::filesystem` right away! Use LLVM toolchain on every mac dev station. I have been using the same for over a year now and Blender compiles fine, runs fine, and there's leak sanitizer which Apple toolchain doesn't have for the same compiler version. We'd need to ship some additional dylibs but that has become easier after [D11997](https://archive.blender.org/developer/D11997) :) Also, some `#define`s would be needed before including filesystem header to turn off platform symbol availability checks. ## Proposal 3 The argument against `boost::filesystem` was that third party libs shouldn't be used for core features.. but `ghc` seems fine for that.. how so?
Member

Added subscriber: @LazyDodo

Added subscriber: @LazyDodo
Member

In #90379#1200559, @ankitm wrote:

Proposal 3

The argument against boost::filesystem was that third party libs shouldn't be used for core features.. but ghc seems fine for that.. how so?

ghc is header only and small enough to host inside /extern, so we can rely on it being available, while putting boost in there be undesirable due to its size and somewhat strange buildsystem.

> In #90379#1200559, @ankitm wrote: > ## Proposal 3 > The argument against `boost::filesystem` was that third party libs shouldn't be used for core features.. but `ghc` seems fine for that.. how so? ghc is header only and small enough to host inside `/extern`, so we can rely on it being available, while putting boost in there be undesirable due to its size and somewhat strange buildsystem.
Author
Member

In #90379#1200559, @ankitm wrote:

Proposal 3

The argument against boost::filesystem was that third party libs shouldn't be used for core features.. but ghc seems fine for that.. how so?

I'm not sure if the argument is really that 3rd party libs shouldn't be used for core features. I think it's more about boost specifically, for which consensus seems to be that we want to move away from it. Also, making boost a hard requirement (if it's not shipped in extern/) is rather problematic.

> In #90379#1200559, @ankitm wrote: > ## Proposal 3 > The argument against `boost::filesystem` was that third party libs shouldn't be used for core features.. but `ghc` seems fine for that.. how so? I'm not sure if the argument is really that 3rd party libs shouldn't be used for core features. I think it's more about boost specifically, for which consensus seems to be that we want to move away from it. Also, making boost a hard requirement (if it's not shipped in `extern/`) is rather problematic.
Member

@JulianEisel only downside I see in ghc; it relies on system headers (ie Windows.h) so i'd use it very cautiously, ie using it in an implementation cpp of assetbrowser is fine, using its types inside a very common blenlib/kernel header would be problematic both in terms of Windows.h doing some wonky stuff (like defining mix/max macro's that'll collide with stl's versions of them) and longer build times.

@JulianEisel only downside I see in ghc; it relies on system headers (ie `Windows.h`) so i'd use it very cautiously, ie using it in an implementation cpp of assetbrowser is fine, using its types inside a very common blenlib/kernel header would be problematic both in terms of `Windows.h` doing some wonky stuff (like defining mix/max macro's that'll collide with stl's versions of them) and longer build times.
Member

Changed status from 'Needs Triage' to: 'Confirmed'

Changed status from 'Needs Triage' to: 'Confirmed'
Author
Member

The GHC readme says:

For Windows it needs Windows.h and it might be a good idea to define WIN32_LEAN_AND_MEAN or NOMINMAX prior to including filesystem.hpp or fs_std.hpp headers to reduce pollution of your global namespace and compile time. They are not defined by ghc::filesystem to allow combination with contexts where the full Windows.h is needed, e.g. for UI elements.

So I'd say we just create a wrapper header, BLI_filesystem.h or so and define WIN32_LEAN_AND_MEAN, NOMINMAX, NOGDI, NOCOMM, etc. in there. That also makes access easier, since you don't have to edit the local cmake file to include extern/ghc (or whatever it will be called), as long as BLI is included.

The GHC readme says: > For Windows it needs Windows.h and it might be a good idea to define WIN32_LEAN_AND_MEAN or NOMINMAX prior to including filesystem.hpp or fs_std.hpp headers to reduce pollution of your global namespace and compile time. They are not defined by ghc::filesystem to allow combination with contexts where the full Windows.h is needed, e.g. for UI elements. So I'd say we just create a wrapper header, `BLI_filesystem.h` or so and define `WIN32_LEAN_AND_MEAN`, `NOMINMAX`, `NOGDI`, `NOCOMM`, etc. in there. That also makes access easier, since you don't have to edit the local cmake file to include `extern/ghc` (or whatever it will be called), as long as BLI is included.
Author
Member

Just confirmed one suspicion: std::filesystem::path uses std::basic_string internally (at least in the ghc implementation). But that means every time we construct a path from a char *, there is a heap allocation (except when there is short string optimization of course, but paths are typically longer).

Not sure if this is a deal breaker for using std::filepath (or equivalent) internally for the BLI functions. As long as these functions are sparingly used in performance sensitive areas it should be fine - and I guess this applies to file system operations generally anyway.
There is still the option to have an opaque path handle to pass around in C, this would just require quite a bit more work.

Just confirmed one suspicion: `std::filesystem::path` uses `std::basic_string` internally (at least in the `ghc` implementation). But that means every time we construct a path from a `char *`, there is a heap allocation (except when there is short string optimization of course, but paths are typically longer). Not sure if this is a deal breaker for using `std::filepath` (or equivalent) internally for the BLI functions. As long as these functions are sparingly used in performance sensitive areas it should be fine - and I guess this applies to file system operations generally anyway. There is still the option to have an opaque path handle to pass around in C, this would just require quite a bit more work.
Member

In #90379#1201005, @JulianEisel wrote:
The GHC readme says:

For Windows it needs Windows.h and it might be a good idea to define WIN32_LEAN_AND_MEAN or NOMINMAX prior to including filesystem.hpp or fs_std.hpp headers to reduce pollution of your global namespace and compile time. They are not defined by ghc::filesystem to allow combination with contexts where the full Windows.h is needed, e.g. for UI elements.

So I'd say we just create a wrapper header, BLI_filesystem.h or so and define WIN32_LEAN_AND_MEAN, NOMINMAX, NOGDI, NOCOMM, etc. in there. That also makes access easier, since you don't have to edit the local cmake file to include extern/ghc (or whatever it will be called), as long as BLI is included.

That could work, but you'll have to undefine them afterwards otherwise it''ll cause issues when other code tries to make those same defines, the only way to do this with a minimum amount of warnings is the ugliness we have in bli_task_hh

> In #90379#1201005, @JulianEisel wrote: > The GHC readme says: >> For Windows it needs Windows.h and it might be a good idea to define WIN32_LEAN_AND_MEAN or NOMINMAX prior to including filesystem.hpp or fs_std.hpp headers to reduce pollution of your global namespace and compile time. They are not defined by ghc::filesystem to allow combination with contexts where the full Windows.h is needed, e.g. for UI elements. > So I'd say we just create a wrapper header, `BLI_filesystem.h` or so and define `WIN32_LEAN_AND_MEAN`, `NOMINMAX`, `NOGDI`, `NOCOMM`, etc. in there. That also makes access easier, since you don't have to edit the local cmake file to include `extern/ghc` (or whatever it will be called), as long as BLI is included. That could work, but you'll have to undefine them afterwards otherwise it''ll cause issues when other code tries to make those same defines, the only way to do this with a minimum amount of warnings is the ugliness we have in `bli_task_hh`
Author
Member

I did some initial work in D12117: Initial work for std::filesystem transition. Would appreciate some feedback there.

I did some initial work in [D12117: Initial work for std::filesystem transition](https://archive.blender.org/developer/D12117). Would appreciate some feedback there.
Author
Member

Using Gulrak/Filesystem seems pretty uncontroversial at this point.

Main question to move forward seems to be how to handle the BLI stuff: Is it fine to just introduce the filesystem library (as bli::filesystem) and use that for new C++ code? That is, planning to either transition the BLI APIs to use filesystem internally, or porting code using the API to C++ and bli::filesystem directly.

Using [Gulrak/Filesystem ](https://github.com/gulrak/filesystem) seems pretty uncontroversial at this point. Main question to move forward seems to be how to handle the BLI stuff: Is it fine to just introduce the filesystem library (as `bli::filesystem`) and use that for new C++ code? That is, planning to either transition the BLI APIs to use `filesystem` internally, or porting code using the API to C++ and `bli::filesystem` directly.
Member

I think we use the the blender:: namespace instead of bli::.

In my opinion it's easier to use the new API in one place at a time, rather than refactoring parts of the existing API. Easier to test and the transition would probably be less messy? Just my 2 cents though.

I think we use the the `blender::` namespace instead of `bli::`. In my opinion it's easier to use the new API in one place at a time, rather than refactoring parts of the existing API. Easier to test and the transition would probably be less messy? Just my 2 cents though.
Member
[D12210: Use LLVM toolchain to get std::filesystem](https://archive.blender.org/developer/D12210) NVM
Author
Member

Changed status from 'Confirmed' to: 'Archived'

Changed status from 'Confirmed' to: 'Archived'
Author
Member

Quick update here, basically things are difficult...
Sybren and I were using Gulrak/Filesystem for asset stuff, and had too many cross platform issues with character encoding to justify sticking to it. It's like, 9.5 times out of 10, using paths would result in a failing unicode test on some platform (usually Windows since we were on Linux) and it still wasn't clear how to use the API to avoid that. That's fine if stuff is unit tested carefully, but we can't rely on that*that// much.

It's unclear if this would get any better with std::filesystem. While Gulrak/Filesystem by default interprets everything as UTF-8 encoded, std::filesystem does not. Needs testing.
However, another issue that worries me a bit is, Blender has different path semantics than std::filesystem. Namely the // prefix we use would be interpreted differently and be another possible source of bugs. std::filesystem::path is careful in its rules on how to interpret strings as paths (e.g. to identify a root name like C:), in Blender things are all over the place. This just makes transitioning even harder.

So for the time being, I'd say let's not use std::filesystem. Some more experiments might be useful, but I'll put this task on ice for now.

Quick update here, basically things are difficult... Sybren and I were using [Gulrak/Filesystem ](https:*github.com/gulrak/filesystem) for asset stuff, and had too many cross platform issues with character encoding to justify sticking to it. It's like, 9.5 times out of 10, using paths would result in a failing unicode test on some platform (usually Windows since we were on Linux) and it still wasn't clear how to use the API to avoid that. That's fine if stuff is unit tested carefully, but we can't rely on that*that// much. It's unclear if this would get any better with `std::filesystem`. While Gulrak/Filesystem by default interprets everything as UTF-8 encoded, `std::filesystem` does not. Needs testing. However, another issue that worries me a bit is, Blender has different path semantics than `std::filesystem`. Namely the `//` prefix we use would be interpreted differently and be another possible source of bugs. `std::filesystem::path` is careful in its rules on how to interpret strings as paths (e.g. to identify a root name like `C:`), in Blender things are all over the place. This just makes transitioning even harder. So for the time being, I'd say let's not use `std::filesystem`. Some more experiments might be useful, but I'll put this task on ice for now.

I think it would be good to have a blender::filesystem then, which would behave the way it should behave. It can then be left as an implementation detail whether it just detects the special cases and delegates the rest to std::filesystem or implements everything itself.

I think it would be good to have a `blender::filesystem` then, which would behave the way it should behave. It can then be left as an implementation detail whether it just detects the special cases and delegates the rest to `std::filesystem` or implements everything itself.
Sign in to join this conversation.
No Label
Interest
Alembic
Interest
Animation & Rigging
Interest
Asset Browser
Interest
Asset Browser Project Overview
Interest
Audio
Interest
Automated Testing
Interest
Blender Asset Bundle
Interest
BlendFile
Interest
Collada
Interest
Compatibility
Interest
Compositing
Interest
Core
Interest
Cycles
Interest
Dependency Graph
Interest
Development Management
Interest
EEVEE
Interest
EEVEE & Viewport
Interest
Freestyle
Interest
Geometry Nodes
Interest
Grease Pencil
Interest
ID Management
Interest
Images & Movies
Interest
Import Export
Interest
Line Art
Interest
Masking
Interest
Metal
Interest
Modeling
Interest
Modifiers
Interest
Motion Tracking
Interest
Nodes & Physics
Interest
OpenGL
Interest
Overlay
Interest
Overrides
Interest
Performance
Interest
Physics
Interest
Pipeline, Assets & IO
Interest
Platforms, Builds & Tests
Interest
Python API
Interest
Render & Cycles
Interest
Render Pipeline
Interest
Sculpt, Paint & Texture
Interest
Text Editor
Interest
Translations
Interest
Triaging
Interest
Undo
Interest
USD
Interest
User Interface
Interest
UV Editing
Interest
VFX & Video
Interest
Video Sequencer
Interest
Virtual Reality
Interest
Vulkan
Interest
Wayland
Interest
Workbench
Interest: X11
Legacy
Blender 2.8 Project
Legacy
Milestone 1: Basic, Local Asset Browser
Legacy
OpenGL Error
Meta
Good First Issue
Meta
Papercut
Meta
Retrospective
Meta
Security
Module
Animation & Rigging
Module
Core
Module
Development Management
Module
EEVEE & Viewport
Module
Grease Pencil
Module
Modeling
Module
Nodes & Physics
Module
Pipeline, Assets & IO
Module
Platforms, Builds & Tests
Module
Python API
Module
Render & Cycles
Module
Sculpt, Paint & Texture
Module
Triaging
Module
User Interface
Module
VFX & Video
Platform
FreeBSD
Platform
Linux
Platform
macOS
Platform
Windows
Priority
High
Priority
Low
Priority
Normal
Priority
Unbreak Now!
Status
Archived
Status
Confirmed
Status
Duplicate
Status
Needs Info from Developers
Status
Needs Information from User
Status
Needs Triage
Status
Resolved
Type
Bug
Type
Design
Type
Known Issue
Type
Patch
Type
Report
Type
To Do
No Milestone
No project
No Assignees
5 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: blender/blender#90379
No description provided.