Faster I/O for OBJ, PLY, STL: Design #68936

Dalai Felinto · 2019-08-20T22:48:35+02:00

Dalai Felinto commented

2019-08-20 22:48:35 +02:00

This task is to track the progress made on the Fast IO project & also for design discussions.

Proposal on wiki: https://wiki.blender.org/wiki/User:Ankitm/GSoC_2020_Proposal_IO_Perf
Weekly & Daily Reports: https://wiki.blender.org/wiki/User:Ankitm
Final report: https://wiki.blender.org/wiki/User:Ankitm/GSoC_2020/Final_report
Devtalk Thread for community Feedback: https://devtalk.blender.org/t/gsoc-2020-faster-io-for-obj-stl-ply-feedback/13528
Code: D8754, D8753, https://developer.blender.org/diffusion/B/browse/soc-2020-io-performance/source/blender/io/wavefront_obj/intern/

Student: @ankitm
Mentors: @dr.sybren @howardt

Exporter's Design: #68936#962546
Importer's Design: #68936#982751

Status Tracker:

- [x] OBJ Exporter:
  - [x] Setup UI, buttons, operators, and relevant functions to call
  - [x] Vertex, vertex normals, faces, texture coordinates
  - [x] Animation (multiple frames), progress logging in console
  - [x] Triangulate
  - [x] Transforms in axes, scale transform
  - [x] Curves as meshes
  - [x] Curves as NURBS
  - [x] Modifiers
  - [x] Material library
  - [x] Grouping

Evaluation 1: I hope to reach halfway in the OBJ importer by evaluation 1.

- [x] OBJ Importer (nearly the same as above):
  - [x] Vertex, vertex normals, faces, texture coordinates
  - [x] Material library
  - [x] Curves
  - [x] Modifiers, grouping
  - [x] Experiment with IO methods to see which one works the fastest
  - [x] Refactor
  - [x] Profile, Benchmark, Document

To be written in detail later:

- [ ] STL Exporter/Importer ASCII

Evaluation 2

- [ ] STL Exporter/Importer Binary
- [ ] Profile, Benchmark, Document
- [ ] PLY Exporter/Importer ASCII
- [ ] PLY Exporter/Importer Binary
- [ ] Profile, Benchmark, Document

Evaluation 3

Dalai Felinto commented

2019-08-20 22:48:35 +02:00

Added subscriber: @dfelinto

Janne Aliu commented

2019-08-29 23:51:16 +02:00

Added subscriber: @Jaydead

yoann commented

2019-10-11 11:03:20 +02:00

Added subscriber: @softyoda

Howard Trickey commented

2020-05-15 11:38:59 +02:00

Added subscriber: @howardt

Ankit Meel changed title from ~~Faster I/O~~ to Faster I/O for OBJ, PLY, STL

2020-05-15 20:06:06 +02:00

Ankit Meel self-assigned this 2020-05-15 20:06:06 +02:00

Ankit Meel commented

2020-05-15 20:06:06 +02:00

Added subscribers: @ankitm, @dr.sybren

dark999 commented

2020-05-16 00:04:21 +02:00

Added subscriber: @dark999

Ankit Meel changed title from ~~Faster I/O for OBJ, PLY, STL~~ to Faster I/O for OBJ, PLY, STL: Design

2020-05-28 15:18:47 +02:00

Ankit Meel commented

2020-05-28 15:18:47 +02:00

Removed subscriber: @dfelinto

Alexey commented

2020-05-30 02:29:48 +02:00

Added subscriber: @AlexeyPerminov

Chris Kohl commented

2020-06-01 02:46:47 +02:00

Added subscriber: @ckohl_art

Ankit Meel commented

2020-06-02 11:52:29 +02:00

An indicator of how data structures & writers will scale up was committed today: 485cc4330a
In obj_exporter.cc, the struct OBJ_data_to_export contains everything that needs to be written & nothing else. It is filled in that same file, & obj_file_handler.cc flushes it out.

Is that feasible? What can be improved?
I've used vectors a lot. I think calling reserve before the push_back loops would save some time there.
If we decide to do it like:
- fill some part of the struct
- write it to the file
- fill in more data, free the used part.
- write it to the file.
  With some effort, steps 2 & 3 can be made to work together (non-sequentially). Any input on that?
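To make the reserve point concrete: reserve does not replace push_back, it only pre-allocates so the following push_back calls never reallocate. A minimal sketch with std::vector; the Vertex struct and collect_vertices are hypothetical names for illustration, not code from the branch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one exported vertex.
struct Vertex {
  float x, y, z;
};

// Copies `count` vertices out of a flat xyz array.
std::vector<Vertex> collect_vertices(const float *xyz, std::size_t count)
{
  std::vector<Vertex> vertices;
  // One allocation up front; the push_back calls below then never reallocate.
  vertices.reserve(count);
  for (std::size_t i = 0; i < count; i++) {
    vertices.push_back({xyz[3 * i], xyz[3 * i + 1], xyz[3 * i + 2]});
  }
  return vertices;
}
```

This only helps when the element count is known before the loop, which should be the case for vertex and polygon counts of a mesh.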

name commented

2020-06-03 16:44:16 +02:00

Added subscriber: @name

Ankit Meel commented

2020-06-22 19:03:29 +02:00

https://docs.google.com/document/d/1sYSHF8g63F7zwTjJkoACESZvpQRhImlEPifilmqXaWU/
OBJ Exporter Design Document

The exporter’s working
The exporter tries to separate the file writer and the calculation of numbers/names as much as possible. The Writer contains the syntax and the conditionals which decide whether an element/property should be written or not. OBJMesh and OBJCurve are wrappers around Mesh and Curve which give the required data to the writers.

If multiple frame export is specified, only then the filename is edited to add the frame to the filename. The current frame is exported by default. From the ViewLayer, objects are filtered out based on export settings, and Object type. Only OB_MESH and OB_CURVE are supported.

An OBJWriter is instantiated which writes object geometry sequentially to the OBJ file. In most cases, OBJWriter queries the item for the total number of elements and provides a buffer that is filled by the iterand’s methods, without being concerned about how the writer uses it. Exceptions to this are smooth shading groups and UV indices.
OBJ indices which are one-based are handled by the writer, not by the OBJCurve/Mesh.
An .MTL file is also created in the beginning and every Object’s material is appended to the file when the corresponding OBJMesh is the iterand. IOW, the MTL file is opened and closed as many times as there are objects in the scene. MTLWriter uses the MaterialWrap class for getting material data.

After this all the Curves which should be exported in parametric form (export settings), are written. Their vertex coordinates, degree, parameters need no dynamic allocations. The calculations are done on the fly. OBJCurve class is used for this.

Data Structures

OBJWriter: It manages the

FILE*, output file.
index_offsets_ : all objects index into a flat list of vertices or normals. So these offsets keep track of how many vertices/ normals have been written already.
Methods like write_vertex_coords, write_nurbs_curve, update_index_offsets etc.
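A rough sketch of how the offset bookkeeping and the one-based conversion could sit in the writer. OffsetWriter and face_line are hypothetical names, and only a vertex offset is tracked here; the real writer tracks more than that and writes to a FILE* instead of returning strings.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified writer: only tracks a vertex offset.
class OffsetWriter {
 public:
  // Formats one face line from an object's zero-based, object-local vertex indices.
  std::string face_line(const std::vector<int> &local_indices) const
  {
    std::string line = "f";
    for (const int index : local_indices) {
      // Shift by the vertices already written, then add 1 because OBJ is one-based.
      line += " " + std::to_string(index + vertex_offset_ + 1);
    }
    return line;
  }

  // Call after all elements of one object have been written.
  void update_index_offsets(int vertices_written)
  {
    vertex_offset_ += vertices_written;
  }

 private:
  int vertex_offset_ = 0;
};
```

With this split, the mesh wrapper can keep handing out plain zero-based indices and never needs to know what was written before it.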

OBJMesh contains non-owning pointers to Object and Mesh, the axes transform as specified by the export settings, and two lists which have to be stored since they’re needed at different times by the writer: smooth groups and UV vertex indices. The methods of this class give vertex (UV) coordinates, edge indices, vertex normals, smooth group indices, polygon indices, object names, material names etc., to the caller. Due to the aforementioned dynamically allocated lists, the OBJMesh objects are freed right after they’re written to the file, instead of waiting for the default destructor of blender::Vector to free the memory.

OBJCurve is just like OBJMesh and has methods for vertex coordinates, curve degree, number of points, etc.

MTLMaterial:
It stores

material name,
Ns, Ka, Kd, Ks, Ke, Ni, d, illum, which correspond to specular exponent, metallic, diffuse color, specular, emission color, optical density (index of refraction), dissolve (opacity), and illumination model. See https://en.wikipedia.org/wiki/Wavefront_.obj_file#Basic_materials
map_Kd, map_Ks, map_Ke, map_d, map_refl, map_Ns, map_Bump. All of them have three parameters:
Path to a texture file
Scale
Translation
To avoid code duplication, a generic struct tex_map_Kx is used which has the three items listed above in addition to dest_node_id which contains the socket ID to which this texture map should be connected.
map_Bump_strength is an extra property for map_Bump (“Normal Map Strength”) in addition to the three listed above.
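As a sketch of the layout described above: the field names follow the list, but the types, the default values, and the has_texture helper are assumptions for illustration and may not match the branch.

```cpp
#include <array>
#include <string>

// One texture map entry. Shared by all map_* fields to avoid duplication.
struct tex_map_Kx {
  std::string image_path;
  std::array<float, 3> scale = {1.0f, 1.0f, 1.0f};        // Assumed default.
  std::array<float, 3> translation = {0.0f, 0.0f, 0.0f};  // Assumed default.
  // Identifier of the shader socket this map should be connected to.
  std::string dest_node_id;
};

struct MTLMaterial {
  std::string name;
  float Ns = 0.0f;
  std::array<float, 3> Ka = {0.0f, 0.0f, 0.0f};
  std::array<float, 3> Kd = {0.0f, 0.0f, 0.0f};
  std::array<float, 3> Ks = {0.0f, 0.0f, 0.0f};
  std::array<float, 3> Ke = {0.0f, 0.0f, 0.0f};
  float Ni = 0.0f;
  float d = 0.0f;
  int illum = 0;
  tex_map_Kx map_Kd, map_Ks, map_Ke, map_d, map_refl, map_Ns, map_Bump;
  float map_Bump_strength = 0.0f;
};

// Hypothetical helper: a map is only written or linked if it has an image.
bool has_texture(const tex_map_Kx &texture_map)
{
  return !texture_map.image_path.empty();
}
```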

MaterialWrap is used for extracting material data by traversing the Object’s shader nodetree. It fills up data in MTLMaterial containers (that are also used in the importer). For fast lookups of linked nodes and linked sockets, it uses nodes::NodetreeRef. If a nodetree is not present, values like ambient color, diffuse color, alpha, etc., are taken from the Material of the object. For images, ideally the following node structure is expected:

Mapping (location and scale) → Image Texture (filepath) → Normal bump (optional) (bump strength) → p-BSDF (colors, alpha, metallic etc) → Material output (optional).

Notes
While the export process is fairly straightforward, I’ll note some things.

New meshes are created in the following cases

if triangulation of polygons is enabled,
If the Object is a NURBS Curve and export parameters specify conversion to Mesh.

Smooth groups are calculated from sharp edges only if specified. Smooth flag is written in every combination of export parameters if a polygon is smooth shaded.

Normals: vertex normals are exported for a face only if it is shaded smooth. Otherwise, only face normals are written. If smooth groups are enabled, this remains the same. What changes is the smooth group, which becomes another number instead of the default “1”.
For example, with smooth groups disabled:

# vertex coordinates list
s off # not smooth shaded
f 1//1 2//1 3//1 4//1
s 1 # default smooth group
f 5//2 6//3 7//4 8//5
s off # not smooth shaded
f 9//6 10//6 11//6 12//6

With smooth groups enabled:

# vertex coordinates list
s off # not smooth shaded
f 1//1 2//1 3//1 4//1
s 4 # smooth group changed
f 5//2 6//3 7//4 8//5
s off # not smooth shaded
f 9//6 10//6 11//6 12//6

The way normals and normal indices (in face elements) are written requires that polygons are iterated in the same order both times. Thus sorting them, e.g. to separate smooth shaded and non-smooth shaded polygons and save a few lines, is not possible (without allocating more memory).

The same is true for UV vertex indices (Vector<Vector<uint>>) because this structure depends on the polygon index. So unless the original polygon index is stored, one shouldn’t reorder polygons and access UV indices using the new polygon indices.

The same is not true for vertices since MLoop->v remains correct even if vertices are written by say looping over MVert backward!

Vertex deform groups: Suppose a cube has all four vertices of only one face assigned to a deform group. Since each of the four adjacent faces shares vertices with that face (two vertices, in fact), they also get assigned the same group. Only the opposite face, which shares no vertex with the original face, has no deform group assigned. IOW, the group which owns the largest number of a polygon's vertices is the group we write to the file.

To denote the absence of any group, we take the same route as smooth groups: “g off”. Other writers may write “g default” or “g (null)” also.
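The "most vertices wins" rule can be sketched as below. This assumes each vertex carries at most one group index (-1 for none), which is a simplification, and dominant_group is a hypothetical name; ties go to the lowest group index here, which the text above does not specify.

```cpp
#include <map>
#include <vector>

// Returns the group that owns most of the polygon's vertices, or -1 if none of
// them is in a group. `vertex_groups[v]` is the group index of vertex v, or -1.
int dominant_group(const std::vector<int> &poly_vertices,
                   const std::vector<int> &vertex_groups)
{
  std::map<int, int> counts;
  for (const int v : poly_vertices) {
    const int group = vertex_groups[v];
    if (group >= 0) {
      counts[group]++;
    }
  }
  int best_group = -1;
  int best_count = 0;
  for (const auto &item : counts) {
    // Strictly greater, so the lowest group index wins a tie.
    if (item.second > best_count) {
      best_group = item.first;
      best_count = item.second;
    }
  }
  return best_group;
}
```

For the cube example, with vertices 0 to 3 of the top face in group 0, a side face {0, 1, 5, 4} still resolves to group 0 and the bottom face {4, 5, 6, 7} resolves to -1, which would be written as "g off".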

Only loose edges are written to the file. This is checked by the ME_LOOSEEDGE flag.

How do we add STL, PLY in this ?
Later on, OBJMesh can have a parent class which has the methods for the least common denominator all formats need. Vertex coords, object name are examples. Then individual formats will have a derived class that has format specific calculator functions: UV coords in OBJ, edge list in PLY.

Similarly, OBJWriter can have a superclass for opening files, modifying filenames according to file format and frame number etc. And format specific derived classes which have the required syntax.
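A possible shape for that hierarchy, as a sketch only. ExportMeshBase, OBJMeshSketch and PLYMeshSketch are hypothetical names, and the data is stored directly instead of being read from a Blender Mesh.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Float3 {
  float x, y, z;
};
struct Float2 {
  float u, v;
};

// Least common denominator: what every format needs.
class ExportMeshBase {
 public:
  ExportMeshBase(std::string name, std::vector<Float3> coords)
      : name_(std::move(name)), coords_(std::move(coords))
  {
  }
  virtual ~ExportMeshBase() = default;

  const std::string &object_name() const { return name_; }
  int tot_vertices() const { return static_cast<int>(coords_.size()); }
  const Float3 &vertex_coords(int index) const { return coords_[index]; }

 private:
  std::string name_;
  std::vector<Float3> coords_;
};

// OBJ-specific data: UV coordinates.
class OBJMeshSketch : public ExportMeshBase {
 public:
  using ExportMeshBase::ExportMeshBase;
  std::vector<Float2> uv_coords;
};

// PLY-specific data: edge list.
class PLYMeshSketch : public ExportMeshBase {
 public:
  using ExportMeshBase::ExportMeshBase;
  std::vector<std::pair<int, int>> edges;
};
```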

Ankit Meel commented

2020-07-21 11:49:43 +02:00

https://docs.google.com/document/d/17Uzl47OljjoKgaMbukiLHUVGQP220lPTmPS-atb65mw/
OBJ Importer design document

The importer’s working:
After receiving the import parameters from the operator, a parser is instantiated to read the whole file line by line and store the data in containers that we call “Geometry”. They are suited to storing the data in the OBJ format and are not Blender Objects. Mesh type and NURBS type Geometry is supported.
After storing the geometry, the material library used by the file is read and all the material definitions are stored in MTLMaterial containers. Since the order of materials may not match the objects in the OBJ file, a blender::Map is used for faster lookups, with keys being the material name.
From the Geometry and MTLMaterial, Blender Mesh or Curve Objects are created. The materials are added to the created Object. All the objects are added to a single import collection.
Total import time is printed in the console.
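The name-keyed material lookup can be sketched with std::unordered_map standing in for blender::Map. MaterialSketch and find_material are hypothetical names used only for this example.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical, trimmed-down material container.
struct MaterialSketch {
  std::string name;
  float Ns = 0.0f;
};

using MaterialMap = std::unordered_map<std::string, MaterialSketch>;

// Returns nullptr when the OBJ file names a material that the MTL file never defined.
const MaterialSketch *find_material(const MaterialMap &materials, const std::string &name)
{
  const auto it = materials.find(name);
  return it == materials.end() ? nullptr : &it->second;
}
```

Since lookups are by name, the order in which the MTL file defines materials does not need to match the order in which the OBJ file uses them.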

Data Structures:
Geometry stores the geometry data of an individual item that has not yet been converted into a Blender object. It stores:

Geometry name,
Geometry type: that helps the parser differentiate between Mesh and Curves. GEOM_MESH, GEOM_CURVE are supported so far.
Vertex indices: they index into the full list of vertex coordinates,
UV vertex indices: they index into the full list of UV vertex coordinates.
Normals are ignored and calculated based on smooth group flags.
MEdge list to store edges that do not belong to a polygon.
Face elements (FaceElem): this struct stores one face’s smooth shading boolean flag, deform (vertex) group name that this face belongs to, and a list of FaceCorners. A FaceCorner contains one vertex’s vertex index & UV vertex index.
NurbsElem : It keeps data of one NURBS spline: vertex indices that index into the global list of coordinates, parm values, & the group name to which this curve belongs (it can also serve as an object name).
tot_loops, tot_normals, tot_uv_vertices: utility numbers that make it easy, later on, to specify limits of for loops or size of memory blocks when the data is not in a contiguous array or may have duplicates.

GlobalVertices:
Originally, the vertex coordinates were stored in a Geometry instance and thus were not accessible to the other instances, which is a problem when an object in the OBJ file is initialized after the list of vertices is written. So this struct stores all vertex coordinates and UV vertex coordinates and is available to all Geometry instances. An instance stores its vertex coordinates as indices into this global list. If normals need to be added here in the future, they can be.

OBJParser: It opens/closes the OBJ file stream, and has a public method to read the file and store the contents in the given list of Geometry containers. This class is a friend of Geometry since it edits the whole struct itself; setters and getters for every field would only add code complexity.

VertexIndexOffset:
This class has a number that stores how many vertices belong to other objects so that the faces in upcoming objects can refer to their vertices locally, not in the global list. The correction brings a number from the global list to an Object's local vertex list for two items: MLoop.v and MEdge.v1/2.
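The correction itself is a subtraction. A minimal sketch, assuming each object's vertices are contiguous in the global list; VertexIndexOffsetSketch and its method names are hypothetical.

```cpp
// Hypothetical simplification of the offset bookkeeping.
class VertexIndexOffsetSketch {
 public:
  // Store how many vertices belong to the Geometry instances parsed so far.
  void set_index_offset(int total_vertices_before)
  {
    offset_ = total_vertices_before;
  }

  // Converts a one-based, file-global OBJ index to a zero-based index that is
  // local to the Geometry currently being filled.
  int local_from_obj_index(int obj_index) const
  {
    return obj_index - 1 - offset_;
  }

 private:
  int offset_ = 0;
};
```

For a second object that starts after 8 vertices, the face corner "9" in the file becomes local vertex 0.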

MTLParser:
Same as OBJParser, but for an MTL file. The MTL filename is acquired from the OBJ file so parsing the OBJ file first is necessary. It stores the materials in MTLMaterial containers in a blender::Map

The central caller to parser and object converters: obj_importer.cc
The code here receives all the import settings from IO_obj.c and initializes

The two parsers,
Appropriate blender::Map or blender::Vector for storing MTLMaterial and Geometry.
A struct GlobalVertices that has three vectors for vertex coordinates, UV vertex coordinates & normals (this may be removed)

File Parsers
The parser doesn't store the whole OBJ/ MTL file in memory in the beginning. It reads it line by line and stores the incoming data to appropriate fields of Geometry, index_offset & MTLMaterial.
index_offset is used to store how many vertices belong to previous Geometry instances. This lets loop vertex indices index into a Mesh object's own vertex list, ranging from 0 to (total vertices - 1).
For example:

  mloop->v = curr_face.face_corners[loop_of_poly_idx].vert_index; // `vert_index` should be indexing into a mesh’s own vertices, not into the global list of coordinates.

There are some utility functions here:

split_line_key_rest : to separate the line identifier of a line in OBJ/ MTL files from the data in the rest of the line. Examples of such identifiers are v, vn, vt, usemtl, #, etc.
split_by_char to break down a string into smaller ones, delimited by a character. Useful for space-separated and / separated strings. This was a nice time gain as compared to the “>>” operator which is convenient but slow. Possible optimization: Avoid string allocations and use references in this splitting function if possible. The parser takes over 70% of the import time, & this function is the biggest time sink.
copy_string_to_float and copy_string_to_int that convert a string ( or Span of strings) to a numeric type using std::stoi/std::stof and catch exceptions if the string is not a number. A fallback value is expected from the caller so that bad data can be handled by the caller after the conversion.
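The "avoid string allocations" idea for the splitting function could look like the sketch below, which returns std::string_view pieces that point into the input instead of copying. This is not the branch's implementation: the signature is an assumption, empty pieces are skipped (so it suits whitespace-separated fields but would drop the empty slot in "1//1"), and float_or_fallback is a hypothetical stand-in for the conversion helpers.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Splits `in` on `delimiter` without copying characters. The returned views are
// only valid as long as the buffer behind `in` is alive. Empty pieces are skipped.
std::vector<std::string_view> split_by_char(std::string_view in, char delimiter)
{
  std::vector<std::string_view> parts;
  std::size_t start = 0;
  while (start <= in.size()) {
    std::size_t end = in.find(delimiter, start);
    if (end == std::string_view::npos) {
      end = in.size();
    }
    if (end > start) {
      parts.push_back(in.substr(start, end - start));
    }
    start = end + 1;
  }
  return parts;
}

// Converts text to float and returns `fallback` if the text is not a number.
float float_or_fallback(std::string_view text, float fallback)
{
  try {
    // std::stof needs a std::string, so this conversion still allocates.
    return std::stof(std::string(text));
  }
  catch (const std::invalid_argument &) {
    return fallback;
  }
  catch (const std::out_of_range &) {
    return fallback;
  }
}
```

The remaining allocation inside float_or_fallback could be removed with a parser that works on character ranges directly, but that is outside the scope of this sketch.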

After the parser is done reading, the file is closed, index_offset is gone, & we have all the Geometry and MTLMaterial instances ready to be converted to Blender objects.

Mesh and Curves Creation
MeshFromGeometry/CurveFromGeometry make an OB_MESH/OB_CURVE type Object. They rely on the parser to get correct indices. The method is straightforward: allocate an appropriately sized mesh and set its vertex coordinates, edges' vertex indices, loops, and polygons. Then set smooth shading flags if required and call the normal calculation methods. The Material and its nodetree are also added to the Object here. Its members:

a UniqueObjectPtr : std::unique_ptr to an Object with a custom deleter to free the Object with Blender’s deallocator.
A mover() that returns std::move(<the object>) so that it can be added to collections later on & thus ownership is transferred.
Possible optimization: remove obsolete Geometry instances after a Mesh block has been created from them to reduce memory pressure.
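The ownership pattern described by those two members can be sketched as follows. ObjectSketch and free_object are stand-ins for Blender's Object and its deallocator, and the counter exists only to make the freeing observable in this example.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins for Blender's Object and its deallocator.
struct ObjectSketch {
  int id;
};
static int freed_objects = 0;
void free_object(ObjectSketch *object)
{
  delete object;
  freed_objects++;
}

// Custom deleter so that std::unique_ptr frees through the right function.
struct ObjectDeleter {
  void operator()(ObjectSketch *object) const
  {
    free_object(object);
  }
};
using UniqueObjectPtr = std::unique_ptr<ObjectSketch, ObjectDeleter>;

class MeshFromGeometrySketch {
 public:
  MeshFromGeometrySketch() : object_(new ObjectSketch{1}) {}

  // Hands ownership to the caller; this instance holds nullptr afterwards.
  UniqueObjectPtr mover()
  {
    return std::move(object_);
  }

 private:
  UniqueObjectPtr object_;
};
```

If mover() is never called, the converter's destructor frees the object; if it is called, whoever receives the pointer (the collection code, for example) becomes responsible for it.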

Material creation:
Using the MTLMaterial filled by the parser, Blender Material is created and a Node tree with only Principled-BSDF node, texture, vector and normal map nodes, is added to the Material.
For this a class ShaderNodetreeWrap is used which receives an MTLMaterial reference and offers a bNodeTree that must be transferred via a public function: bNodeTree *get_nodetree. This class is responsible for creating nodes, setting socket values, linking nodes, positioning nodes & loading the Image for texture nodes.

Adding objects to collections
OBJImportCollection::OBJImportCollection() makes a new collection to put all the Objects in.
OBJImportCollection::add_object_to_collection public method: the newly created Object is added to the collection made above.

The list of Geometry, MTLMaterial, and global vertices, etc is freed now, and the total time taken is printed.

https://docs.google.com/document/d/17Uzl47OljjoKgaMbukiLHUVGQP220lPTmPS-atb65mw/ **OBJ Importer design document** **The importer’s working:** After receiving the import parameters from the operator, a parser is instantiated to read the whole file line by line and store the data in containers that we call “Geometry”. They are suited to storing the data in the OBJ format and are not Blender `Objects`. `Mesh` type and NURBS type Geometry is supported. After storing the geometry, the material library used by the file is read and all the material definitions are stored in `MTLMaterial` containers. Since the order of materials may not match the objects in the OBJ file, a `blender::Map` is used for faster lookups, with keys being the material name. From the `Geometry` and `MTLMaterial`, Blender `Mesh` or `Curve` `Object`s are created. The materials are added to the created `Object. All the objects are added to a single import collection. Total import time is printed in the console. **Data Structures:** `Geometry` stores the geometry data of an individual item, but so far hasn't been converted into a Blender object. It stores: - Geometry name, - Geometry type: that helps the parser differentiate between Mesh and Curves. GEOM_MESH, GEOM_CURVE are supported so far. - Vertex indices: they index into the full list of vertex coordinates, - UV vertex indices: they index into the full list of UV vertex coordinates. - Normals are ignored and calculated based on smooth group flags. - `MEdge` list to store edges that do not belong to a polygon. - Face elements (`FaceElem`): this struct stores one face’s smooth shading boolean flag, deform (vertex) group name that this face belongs to, and a list of `FaceCorners`. A `FaceCorner` contains one vertex’s vertex index & UV vertex index. 
- `NurbsElem` : It keeps data of one NURBS spline: vertex indices that index into the global list of coordinates, parm values, & group name to which this curve belongs to (it can also serve as an object name). - tot_loops, tot_normals, tot_uv_vertices: utility numbers that make it easy, later on, to specify limits of for loops or size of memory blocks when the data is not in a contiguous array or may have duplicates. `GlobalVertices`: Originally, the vertex coordinates were stored in a `Geometry` instance and thus they were not accessible to the other instances which can be problematic when an object in the OBJ file is initialized after the list of vertices is written. So this struct stores all vertex coordinates and UV vertex coordinates and is available to all `Geometry` instances. An instance stores its vertex coordinates using indices indexing into this global list. If in the future, normals need to be added here, they can be done. `OBJParser` : It opens/ closes the OBJ file stream, and has a public method to read the file and store the contents in the given list of `Geometry` containers. This class is a friend of `Geometry` since it edits the whole struct itself. So setters and getters would add code complexity. `VertexIndexOffset`: This class also has a number that stores how many vertices belong to other objects so that the faces in upcoming objects can refer to their vertices locally, not in the global list. The correction is to bring a number from global list to an Object's local vertex list for two items: `MLoop.v` and `MEdge.v1/2`. `MTLParser`: Same as `OBJParser`, but for an MTL file. The MTL filename is acquired from the OBJ file so parsing the OBJ file first is necessary. 
It stores the materials in `MTLMaterial` containers in a `blender::Map` **The central caller to parser and object converters.** `obj_exporter.cc` The code here receives all the import settings from `IO_obj.c` and initializes - The two parsers, - Appropriate `blender::Map` or `blender::Vector` for storing `MTLMaterial` and `Geometry`. - A struct `GlobalVertices` that has three vectors for vertex coordinates, UV vertex coordinates & normals (this may be removed) **File Parsers** The parser doesn't store the whole OBJ/ MTL file in memory in the beginning. It reads it line by line and stores the incoming data to appropriate fields of `Geometry`, `index_offset` & `MTLMaterial`. `index_offset`is used to store how many vertices belong to previous `Geometry` instances. This helps loop vertex indices that index into a Mesh object's own vertex indices, ranging from 0 to (total vertices - 1). For eg: ``` mloop->v = curr_face.face_corners[loop_of_poly_idx].vert_index; // `vert_index` should be indexing into a mesh’s own vertices, not into the global list of coordinates. ``` There are some utility functions here: - `split_line_key_rest` : to separate the line identifier of a line in OBJ/ MTL files from the data in the rest of the line. Examples of such identifiers are `v, vn, vt, usemtl, #,` etc. - `split_by_char` to break down a string into smaller ones, delimited by a character. Useful for space-separated and `/` separated strings. This was a nice time gain as compared to the “>>” operator which is convenient but slow. Possible optimization: Avoid string allocations and use references in this splitting function if possible. The parser takes over 70% of the import time, & this function is the biggest time sink. - `copy_string_to_float` and `copy_string_to_int` that convert a string ( or `Span` of strings) to a numeric type using `std::stoi`/`std::stof` and catch exceptions if the string is not a number. 
  A fallback value is expected from the caller so that the caller can handle bad data after the conversion.

After the parser is done reading, the file is closed, `index_offset` is gone, and all the `Geometry` and `MTLMaterial` instances are ready to be converted to Blender objects.

**Mesh and Curves Creation**

`MeshFromGeometry`/`CurveFromGeometry` make an `OB_MESH`/`OB_CURVE` type `Object`. They rely on the parser to provide correct indices. The method is straightforward: allocate an appropriately sized mesh and set its vertex coordinates, edges' vertex indices, loops, and polygons. Also, set smooth shading flags if required and call normal calculation methods. The `Material` and its node tree are also added to the `Object` here. Its members:
- A `UniqueObjectPtr`: a `std::unique_ptr` to an `Object` with a custom deleter that frees the `Object` with Blender’s deallocator.
- A `mover()` that returns `std::move(<the object>)` so that the object can be added to collections later on, which transfers ownership.

Possible optimization: remove obsolete `Geometry` instances after a Mesh block has been created from them to reduce memory pressure.

**Material creation:**

Using the `MTLMaterial` filled by the parser, a Blender `Material` is created, and a node tree containing only a Principled-BSDF node and texture, vector, and normal map nodes is added to the Material. For this, a class `ShaderNodetreeWrap` is used, which receives an `MTLMaterial` reference and offers a `bNodeTree` that must be transferred via a public function: `bNodeTree *get_nodetree`. This class is responsible for creating nodes, setting socket values, linking nodes, positioning nodes, and loading the Image for texture nodes.

**Adding objects to collections**

`OBJImportCollection::OBJImportCollection()` makes a new collection to put all the `Objects` in. The public method `OBJImportCollection::add_object_to_collection` adds the newly created `Object` to the collection made above.
The lists of `Geometry`, `MTLMaterial`, global vertices, etc. are freed now, and the total time taken is printed.
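
To make the parser utilities from the File Parsers section more concrete, here is a minimal standalone sketch. The function names follow the description above, but the signatures and bodies are assumptions for illustration and are not taken from the branch. It uses `std::string_view` so that splitting does not allocate a new string per token, which is one possible way to approach the optimization mentioned there.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical sketch: names mirror the design notes, bodies are assumed.

// Split "v 1.0 2.0 3.0" into the key "v" and the rest "1.0 2.0 3.0".
std::pair<std::string_view, std::string_view> split_line_key_rest(std::string_view line)
{
  const std::size_t key_end = line.find(' ');
  if (key_end == std::string_view::npos) {
    // No data after the identifier.
    return {line, std::string_view{}};
  }
  std::string_view rest = line.substr(key_end + 1);
  // Skip any additional spaces between the key and the data.
  const std::size_t rest_start = rest.find_first_not_of(' ');
  rest = (rest_start == std::string_view::npos) ? std::string_view{} : rest.substr(rest_start);
  return {line.substr(0, key_end), rest};
}

// Split on a single delimiter. The returned views point into `in`, so no
// string is allocated per token. Empty tokens are kept, so "1//3" yields
// {"1", "", "3"} (a face corner without a UV index).
std::vector<std::string_view> split_by_char(std::string_view in, char delimiter)
{
  std::vector<std::string_view> tokens;
  std::size_t start = 0;
  while (true) {
    const std::size_t end = in.find(delimiter, start);
    if (end == std::string_view::npos) {
      tokens.push_back(in.substr(start));
      break;
    }
    tokens.push_back(in.substr(start, end - start));
    start = end + 1;
  }
  return tokens;
}

// Convert a token to float, returning `fallback` when it is not a number.
// std::stof needs a std::string, so this step still allocates.
float copy_string_to_float(std::string_view src, float fallback)
{
  try {
    return std::stof(std::string(src));
  }
  catch (const std::invalid_argument &) {
    return fallback;
  }
  catch (const std::out_of_range &) {
    return fallback;
  }
}
```

With these, a line such as `v 1.0 2.0 3.0` would first go through `split_line_key_rest`, then the rest through `split_by_char(rest, ' ')`, and each token through `copy_string_to_float` with a fallback chosen by the caller. The views are only valid while the original line buffer is alive, which fits a line-by-line reader that consumes each line before reading the next.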

Jacques Lucke commented

2020-07-23 12:02:28 +02:00

Added subscriber: @JacquesLucke

Matt commented

2020-08-17 18:26:33 +02:00

Added subscriber: @mattli911

Matt commented

2020-08-17 18:26:33 +02:00

Are there any plans to improve FBX I/O, or mainly OBJ and other formats? Does Blender want to steer away from FBX, or keep supporting it?
I'm eagerly waiting on these I/O improvements, since it can be quite painfully slow to import ZBrush-type meshes etc., or to export collapsed high-poly geo to another DCC or Marmoset Toolbag.

I'm not sure why, but importing an FBX into Blender seems to take 2-4x longer than in almost any other DCC I've used. I can sometimes be waiting 5-10+ minutes to import a high-res mesh/file, where in other DCCs it might take 30 seconds or 1-2 minutes.


Ankit Meel commented

2020-08-17 20:43:55 +02:00

@mattli911 AFAIK there are no plans to rewrite FBX I/O in C++ in the near future. If someone works on it, addressing the large number of bugs would be the priority, and only later trying to improve the speed in Python itself.

Matt commented

2020-08-17 20:49:48 +02:00

> In #68936#997458, @ankitm wrote:
> @mattli911 AFAIK it's not planned to rewrite FBX I/O in C++ in the near future. If someone works on it, addressing the large number of bugs would be priority & later on trying to improve the speed in python itself.

Ah, ok. I don't HAVE to use FBX either, I guess. I wonder if USD or any other format will give a much better speed boost?


Ankit Meel commented

2020-08-17 21:00:35 +02:00

You should definitely give [USD, Alembic & Collada](https://developer.blender.org/diffusion/B/browse/master/source/blender/io/) a try as per your needs.

- https://devtalk.blender.org/t/2020-06-05-tangent-animation-labs-blender-universal-scene-description/13661
- https://code.blender.org/2020/06/changes-to-the-alembic-exporter/

Sybren A. Stüvel commented

2020-08-17 22:52:11 +02:00

@mattli911 This is not a forum, and not the place for asking general questions about future plans for Blender. [DevTalk](https://devtalk.blender.org/) is better for that.

Matt commented

2020-08-18 03:37:38 +02:00

Removed subscriber: @mattli911