Apricot BGE VBO patch and a fix for hipoly objects. #17523

Closed
opened 2008-08-25 14:19:48 +02:00 by Samuel Anjam · 16 comments

This patch is built on the apricot branch, since it contains newer rasterizer code than trunk. Thus this patch won't work on trunk.

In addition to the VBOs, I made an important fix for hipoly objects. Previously, if an object had more than 65K indices, a new display array would be created for EACH FACE instead of creating one new display array and filling that one up. This made rendering REALLY slow for faces past the 65,000th index.

The VBOs are detected and used automatically if the hardware supports them.
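
The display array fix described above can be illustrated with a minimal sketch. It is not the patch's code: `DisplayArraySketch`, `AddFace`, `CountArrays` and the 65535 limit are hypothetical names and values chosen for illustration, assuming 16-bit indices and faces that do not share vertices. The point is only that a new array is opened when the current one is full, and later faces keep filling it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a display array; not the real rasterizer type.
struct DisplayArraySketch {
    std::vector<std::uint16_t> indices;
};

// Assumed per-array capacity for 16-bit indices (illustrative value).
constexpr std::size_t kMaxIndices = 65535;

// Append one face. A new array is opened only when the current one cannot
// hold the face; subsequent faces then keep filling that new array.
void AddFace(std::vector<DisplayArraySketch>& arrays, std::size_t faceVerts)
{
    if (arrays.empty() || arrays.back().indices.size() + faceVerts > kMaxIndices) {
        arrays.emplace_back();
    }
    DisplayArraySketch& cur = arrays.back();
    for (std::size_t i = 0; i < faceVerts; ++i) {
        // In this sketch every face brings its own vertices, so the next
        // index is simply the current index count.
        cur.indices.push_back(static_cast<std::uint16_t>(cur.indices.size()));
    }
}

// Number of display arrays needed for a mesh of `faces` triangles.
std::size_t CountArrays(std::size_t faces)
{
    std::vector<DisplayArraySketch> arrays;
    for (std::size_t f = 0; f < faces; ++f) {
        AddFace(arrays, 3);
    }
    return arrays.size();
}
```

With this scheme the array count grows with the index count divided by the capacity, rather than by one array per face once the first array is full.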

Author

Changed status to: 'Open'

Author

Fixed a bug which caused random crashes on Windows. The color VBO was being over-allocated, so garbage data was loaded into the VBO from past the end of an array that was expected to be longer.

OK, some initial comments from reading the patch:

  • Please add some empty lines in InitVboSlot, it's hard to read like this.
  • In InitVboSlot, I wouldn't use array->m_vboSlot->m_verts but just float *verts. Using the member only makes it more confusing, since it is deleted at the end anyway.
  • Is a separate UpdateMeshSlotData function needed? It could do these things from IndexPrimitives too, I think, though an extra "animated" value would need to be added to RAS_MeshSlot.
  • For animated objects, the normals must be updated too.
  • I wouldn't call glVertexPointer(3, GL_FLOAT, 0, 0); and such for cleanup. There's no need: it has to be set to something else anyway, and NULL would also be invalid regardless. Or is there a reason?

Most importantly, are you seeing performance improvements compared to display lists, and if so, how much?

Author

  • I will add the empty lines to InitVboSlot.
  • You have a point there. I changed some of the major code and forgot to make that function smarter.
  • The animated flag would be very useful. I didn't get around to adding it because I don't really know where it should be set (setting it continuously in RenderMeshSlot sounds like a bad idea...). I haven't gotten around the code that much yet.
  • I thought that only the coordinates were updated... I didn't see anything else in BL_MeshDeformer::Apply. Are the normals recalculated somewhere else? Once this has been cleared up, I'll add the normal update.
  • The cleanup is there because once I add a check for available GPU memory to InitVboSlot, it is possible that some VBOs don't get fully initialized. In that case, rendering falls back to non-VBO mode for the meshes whose VBOs weren't fully initialized. When you render something with VBOs you obviously have to bind them first, and if you then render something non-VBO while some VBOs are still bound, it might crash. That's why all of them are unbound. I don't think this makes the rendering any slower.

Display lists are still the fastest method of rendering, and they beat VBOs by a little. But display lists take much more memory than VBOs, and they are certainly not suitable for the type of scenes found in today's games. Display lists are optimal for rendering really large numbers of small objects, but nowadays pretty much everything major is rendered with VBOs. This is an industry standard, and I think it would be wise to follow it in the BGE too.

Also, display lists aren't suitable for animations and transparent objects due to their static nature. VBO data can be re-uploaded, thus supporting animation and z-sorting.

VBOs compared to non-VBO rendering:
VBOs are optimal for high-poly scenes. The performance doesn't differ that much with low-poly scenes (especially when the GLSL shaders are enabled: lots of pixel calculations compared to vertex calculations). With 150K triangles I get about 52 fps in non-VBO mode and 112 fps with VBOs. One guy said in the testing thread that he got 169 fps with 700K+ triangles with my VBO patch and 13 fps without it.
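
The "animated" flag and re-upload idea discussed in this exchange can be sketched roughly as follows. This is an assumption-laden illustration, not code from the patch: `MeshSlotSketch`, `UploadFn` and `SyncSlot` are hypothetical, and the upload callback merely stands in for whatever GL call (for example glBufferData or glBufferSubData) the real code would use. Static meshes are uploaded once. Animated meshes re-upload positions and normals together every frame.

```cpp
#include <functional>
#include <vector>

// Hypothetical mesh slot; the fields here are illustrative, not RAS_MeshSlot's.
struct MeshSlotSketch {
    std::vector<float> verts;    // xyz positions
    std::vector<float> normals;  // xyz normals, kept in step with verts
    bool animated = false;       // the extra flag suggested in review
    bool uploaded = false;       // true once the buffers hold valid data
};

// Stand-in for the real GL upload of positions and normals.
using UploadFn =
    std::function<void(const std::vector<float>&, const std::vector<float>&)>;

// Call once per frame before drawing. Returns true if data was uploaded.
bool SyncSlot(MeshSlotSketch& slot, const UploadFn& upload)
{
    if (slot.uploaded && !slot.animated) {
        return false;  // static mesh: buffer contents are still valid
    }
    // First use, or a deformed mesh: refresh positions and normals together.
    upload(slot.verts, slot.normals);
    slot.uploaded = true;
    return true;
}
```

Keeping the decision in one place like this would avoid setting the flag continuously in the render loop; where the flag is actually set is left open, as in the discussion.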

I can see the advantage for animated and transparent objects, yes: it avoids uploading everything again. But I didn't find a performance difference with this patch on an apricot level. Those statistics you give, are they with or without display lists?

I'm guessing that, at least in my graphics card's driver, the display list is compiled to pretty much the same thing as a VBO. If that is the case, I also don't understand why display lists would use that much more memory?

Author

The apricot levels use the GLSL system extensively, so you can't really tell the difference between VBO and non-VBO rendering, since the pixel calculations cover it up (shaders are heavy). You start to see the difference when you are trying to render several high-poly objects (20K+ faces), and also when you have something like 200K faces visible on the screen at the same time. VBOs are all about processing large amounts of vertex data. The apricot levels are fairly low-poly and use shaders extensively, so the optimization doesn't show up that much.

Here is a document about why people should use VBOs: http://spec.unipv.it/gwpg/gpc.static/vbo_whitepaper.html

Author

Oh, and no, display lists weren't enabled during the tests ;).

I've tried artificial high-poly scenes too, no difference. Even on an animated high-poly mesh it didn't matter for me, probably because the armature deform was the bottleneck there anyway. Those 150K triangles you tested, was that with display lists?

I'd just like to know if this patch improves performance, on some graphics cards or even if just a bit :). I still think it's useful to add this even if it doesn't give immediate benefits, but I'm hoping it does, in some way I'm not thinking of.

Author

That's really weird... Which GPU do you have? Are you sure that it supports VBOs? (It should if you have GLSL shaders...)

On my computer and on this other guy's computer the VBOs give a nice boost. I have a GeForce 7400 Go.

Please try with this scene: http://rapidshare.com/files/140338836/multipoly.blend.html

Also, I have made the changes you requested, except for the removal of UpdateMeshSlot. Still working on that ;).

Well, for me, the new patch with display lists disabled doesn't improve performance compared to display lists enabled without it. It is exactly the same in fact: 289 fps for both, on an Nvidia 8800. I think the difference must be in the drivers then. Some might not be as smart at compiling display lists as others, because I can't think of another reason why this should make a performance difference.

Also, the big performance difference this other person is seeing is likely due to the bucket-creation bug that this patch fixes.

Author

Eh, you are not supposed to test it against display lists. VBOs are supposed to outperform vertex arrays, not display lists, and it doesn't really make any difference to display lists whether you compile them with VBOs or without. The VBO patch is not meant to outperform display lists but to provide an alternative that is almost as fast, more dynamic and more elegant: the standard rendering method which everyone is using nowadays.

But oh well, I attached a separate patch for the material bucket fix only. At least you can apply that ;).

Hi Samuel.

I updated your patch (BUGFIX001.patch) to current SVN. Since 2.49 is feature frozen, it was committed to the bb_dev branch instead. That is where the BGE code lives that will make its way into trunk after the official 2.49 release.

Thanks for this work. I want to implement full GLSL skinning for deformed meshes, and VBO is a very good start.

I couldn't get your test files though. Could you send them again?

Author

Hi.

I'm afraid the test files are long gone. My laptop's motherboard burned out and I sent it in for warranty repair. They couldn't repair it and didn't even send the laptop back, so I couldn't get the HD out of it. Even if I had, I don't know whether I would still have those files.

It shouldn't be that difficult to make some simple test files though ;). Just add something with a lot of vertices. About 100K should start to show a significant improvement on pretty much all up-to-date GPUs, or any GPU from the GeForce 6xxx generation onwards.

Member

Hi, trying to update the status of the patch tracker. Please respond if this patch is still viable/useful and needs review, or if it can be closed.

Changed status from 'Open' to: 'Archived'

Mitchell Stokes self-assigned this 2014-07-01 06:22:51 +02:00

We have VBOs in master (based on this patch I believe).

Reference: blender/blender#17523