Page MenuHome

How to use & transition to std::filesystem?
Closed, ArchivedPublicTO DO

Description

Motivation

C++17 introduces the std::filesystem library which contains plenty of useful functionality. Operating system specific handling is abstracted away into a standard library, there are very useful and carefully designed types & functions to interact with. For example it introduces a proper path object type that is way more convenient and secure to use than the C-string manipulation we do in Blender currently.

It would be great to be able to use this as it would make file system interaction much easier, better tested/maintained, better designed (arguably ;) ), better documented and actually standardized (hence more familiar for devs coming from other C++ software). Sybren and I would like to use it ASAP for asset system related code.

Platform Compatibility

According to the C++17 compiler support overview we can't use std::filesystem with our current minimum requirements. We'd need to drop macOS 10.13 and 10.14 and possibly Visual Studio versions until 2017 15.7 (current minimum version only states 2017).

Notes:

  • We can't use XCode 11 until macOS 10.15. In 10.14 it can only be used by creating an Apple developer account and manually downloading it.
  • 10.14 is the last version of macOS to support 32 Bit. We can expect a decent amount of users to stick to this release for 32 bit games and applications.
  • The VFX Reference Platform draft for 2022 foresees macOS 10.15 as Minimum Deployment Target.

Conclusion: We may be able to update our requirements before too long, but as of now it's too early. In other words, we can't use std::filesystem yet.

std::filesystem vs. BLI functions

We have other options than using std::filesystem directly. But there's another discussion point:
Blender already has library functions for file system and path handling, all written in C in BLI. Concerns were raised about using std::filesystem while the BLI APIs also exist with entirely different implementations. Porting the API would probably take quite some effort & time, and could cause dangerous regressions.


Proposal

Short proposal:

  • Use the following, light-weight, std::filesystem compatible implementation of the API: https://github.com/gulrak/filesystem
  • Encourage usage of that for new C++ code.
  • Cover BLI file system & path APIs well with unit testing.
  • Gradually port BLI APIs to use it for internal logic.

We could change the BLI API designs a bit, e.g. rather than passing paths as character arrays, have an opaque path handle that uses ghc::filesystem::path internally. Path operations would have to be done through the API then (which may have to be extended a bit). Generally we should aim for getting the API as close as possible to std::filesystem.

Revisions and Commits

Event Timeline

Julian Eisel (Severin) renamed this task from How to use & transition to `std::filesystem`? to How to use & transition to std::filesystem?.Aug 2 2021, 3:39 PM
Julian Eisel (Severin) created this task.
Julian Eisel (Severin) updated the task description. (Show Details)
Julian Eisel (Severin) updated the task description. (Show Details)

As an avid user of Python's pathlib module, I'm strongly in favour of this proposal. IMO we should move forward with this, unless there are concrete, technical issues that first needs resolving.

Proposal 2

This has a high cost in terms of download/ space/ system configuration, but the pro is that we get std::filesystem right away!
Use LLVM toolchain on every mac dev station.
I have been using the same for over a year now and Blender compiles fine, runs fine, and there's leak sanitizer which Apple toolchain doesn't have for the same compiler version.
We'd need to ship some additional dylibs but that has become easier after D11997 :)
Also, some #defines would be needed before including filesystem header to turn off platform symbol availability checks.

Proposal 3

The argument against boost::filesystem was that third party libs shouldn't be used for core features.. but ghc seems fine for that.. how so?

Proposal 3

The argument against boost::filesystem was that third party libs shouldn't be used for core features.. but ghc seems fine for that.. how so?

ghc is header only and small enough to host inside /extern, so we can rely on it being available, while putting boost in there be undesirable due to its size and somewhat strange buildsystem.

Proposal 3

The argument against boost::filesystem was that third party libs shouldn't be used for core features.. but ghc seems fine for that.. how so?

I'm not sure if the argument is really that 3rd party libs shouldn't be used for core features. I think it's more about boost specifically, for which consensus seems to be that we want to move away from it. Also, making boost a hard requirement (if it's not shipped in extern/) is rather problematic.

@Julian Eisel (Severin) only downside I see in ghc; it relies on system headers (ie Windows.h) so i'd use it very cautiously, ie using it in an implementation cpp of assetbrowser is fine, using its types inside a very common blenlib/kernel header would be problematic both in terms of Windows.h doing some wonky stuff (like defining mix/max macro's that'll collide with stl's versions of them) and longer build times.

Ankit Meel (ankitm) changed the task status from Needs Triage to Confirmed.Aug 3 2021, 8:18 AM

The GHC readme says:

For Windows it needs Windows.h and it might be a good idea to define WIN32_LEAN_AND_MEAN or NOMINMAX prior to including filesystem.hpp or fs_std.hpp headers to reduce pollution of your global namespace and compile time. They are not defined by ghc::filesystem to allow combination with contexts where the full Windows.h is needed, e.g. for UI elements.

So I'd say we just create a wrapper header, BLI_filesystem.h or so and define WIN32_LEAN_AND_MEAN, NOMINMAX, NOGDI, NOCOMM, etc. in there. That also makes access easier, since you don't have to edit the local cmake file to include extern/ghc (or whatever it will be called), as long as BLI is included.

Just confirmed one suspicion: std::filesystem::path uses std::basic_string internally (at least in the ghc implementation). But that means every time we construct a path from a char *, there is a heap allocation (except when there is short string optimization of course, but paths are typically longer).

Not sure if this is a deal breaker for using std::filepath (or equivalent) internally for the BLI functions. As long as these functions are sparingly used in performance sensitive areas it should be fine - and I guess this applies to file system operations generally anyway.
There is still the option to have an opaque path handle to pass around in C, this would just require quite a bit more work.

The GHC readme says:

For Windows it needs Windows.h and it might be a good idea to define WIN32_LEAN_AND_MEAN or NOMINMAX prior to including filesystem.hpp or fs_std.hpp headers to reduce pollution of your global namespace and compile time. They are not defined by ghc::filesystem to allow combination with contexts where the full Windows.h is needed, e.g. for UI elements.

So I'd say we just create a wrapper header, BLI_filesystem.h or so and define WIN32_LEAN_AND_MEAN, NOMINMAX, NOGDI, NOCOMM, etc. in there. That also makes access easier, since you don't have to edit the local cmake file to include extern/ghc (or whatever it will be called), as long as BLI is included.

That could work, but you'll have to undefine them afterwards otherwise it''ll cause issues when other code tries to make those same defines, the only way to do this with a minimum amount of warnings is the ugliness we have in bli_task_hh

I did some initial work in D12117: Initial work for std::filesystem transition. Would appreciate some feedback there.

Bastien Montagne (mont29) changed the subtype of this task from "Report" to "To Do".

Using Gulrak/Filesystem seems pretty uncontroversial at this point.

Main question to move forward seems to be how to handle the BLI stuff: Is it fine to just introduce the filesystem library (as bli::filesystem) and use that for new C++ code? That is, planning to either transition the BLI APIs to use filesystem internally, or porting code using the API to C++ and bli::filesystem directly.

I think we use the the blender:: namespace instead of bli::.

In my opinion it's easier to use the new API in one place at a time, rather than refactoring parts of the existing API. Easier to test and the transition would probably be less messy? Just my 2 cents though.

Julian Eisel (Severin) closed this task as Archived.Jan 28 2022, 12:58 PM

Quick update here, basically things are difficult...
Sybren and I were using Gulrak/Filesystem for asset stuff, and had too many cross platform issues with character encoding to justify sticking to it. It's like, 9.5 times out of 10, using paths would result in a failing unicode test on some platform (usually Windows since we were on Linux) and it still wasn't clear how to use the API to avoid that. That's fine if stuff is unit tested carefully, but we can't rely on that that much.

It's unclear if this would get any better with std::filesystem. While Gulrak/Filesystem by default interprets everything as UTF-8 encoded, std::filesystem does not. Needs testing.
However, another issue that worries me a bit is, Blender has different path semantics than std::filesystem. Namely the // prefix we use would be interpreted differently and be another possible source of bugs. std::filesystem::path is careful in its rules on how to interpret strings as paths (e.g. to identify a root name like C:), in Blender things are all over the place. This just makes transitioning even harder.

So for the time being, I'd say let's not use std::filesystem. Some more experiments might be useful, but I'll put this task on ice for now.

I think it would be good to have a blender::filesystem then, which would behave the way it should behave. It can then be left as an implementation detail whether it just detects the special cases and delegates the rest to std::filesystem or implements everything itself.