makesrna crashes during the build with LTO on s390x architecture (Linux) #80639
Reference: blender/blender#80639
System Information
Operating system: Linux (Fedora)
Graphics card: N/A
Blender Version
Broken: 2.90 and master
Worked: without link time optimization (LTO)
Short description of error
makesrna crashes during the build when LTO is enabled. See https://bugzilla.redhat.com/show_bug.cgi?id=1874398#c6
Linux distributions like Fedora have enabled LTO by default (https://fedoraproject.org/wiki/LTOByDefault), which exposes the failure.
Exact steps for others to reproduce the error
Yes, it does happen with blender 2.90 as well. I believe this is another case when enabled LTO reveals some real bug in the source code.
build output
running under gdb gives:
Added subscriber: @Luya-Tshimbalanga
Added subscriber: @LazyDodo
Changed status from 'Needs Triage' to: 'Needs User Info'
I'm seemingly unable to reproduce on `x86_64-redhat-linux`. Is this issue isolated to `s390x-redhat-linux-gnu` or does it reproduce on other architectures as well?
The issue is isolated to s390x since 2.83.5, as tested on a scratch build. Other 64-bit architectures are unaffected.
s390x is not a supported platform for us; we have neither the hardware nor the development environment to do anything here, and ASan on x64 doesn't seem to turn up anything either.
So I'm unsure how much help we can be with an actual fix. I'm happy to sling some sh...uh things at the wall and see what sticks, though.
Given it is crashing during a free, more specifically in this bit of the code, you could just try taking out the abort, either by removing the line of code or by turning the CMake option `WITH_ASSERT_ABORT` to `off`, which will hopefully sidestep the crash (but the root cause will remain; pray and hope it's not going to rear its head elsewhere).
I'd be real uncomfortable with this kind of compiler behavior and would highly recommend building and running our unit tests on this platform to rule out any further issues.
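To illustrate what turning off an abort-on-error option means in practice, here is a minimal sketch of a free routine whose `abort()` call is only compiled in when a preprocessor flag is defined. Everything in it is hypothetical: `sketch_guarded_free`, `sketch_report_error`, `sketch_error_count` and the `SKETCH_ASSERT_ABORT` define are illustrative names, and the NULL check is just a stand-in, since the actual condition that triggers the abort in Blender is not shown in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter so callers can observe how many errors were reported. */
static int sketch_error_count = 0;

/* Hypothetical stand-in for an allocator's error reporting. */
static void sketch_report_error(const char *message)
{
  assert(message != NULL);
  sketch_error_count++;
  fprintf(stderr, "sketch allocator error: %s\n", message);
}

/* Free a pointer, treating NULL as a reportable error (illustrative condition).
 * Returns 1 if memory was freed, 0 if the call was rejected. */
static int sketch_guarded_free(void *ptr)
{
  if (ptr == NULL) {
    sketch_report_error("attempt to free NULL pointer");
#ifdef SKETCH_ASSERT_ABORT
    /* Only compiled in when the flag is defined, the way a build option
     * can be mapped onto a preprocessor define. */
    abort();
#endif
    return 0;
  }
  free(ptr);
  return 1;
}
```

With the flag undefined, the error is still reported but execution continues, which is why this kind of change can hide a crash without addressing whatever produced the bad state in the first place.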
I will let the Fedora team specializing in the s390x architecture know. Someone on the team will hopefully take a look and provide suggestions.
Added subscriber: @sharkcz
With the `abort()` removed from `MEM_lockfree_freeN()` I see another crash.
After Fedora enabled LTO globally we have already seen various issues in source code, such as relying on undefined behaviour or violating strict aliasing rules, leading to runtime crashes. Platforms like s390x are good at uncovering them, because the compiler has a different "view" of the source code. The only developer-visible difference should be that s390x is big endian (like ppc64 is). I can provide access to our public s390x machine if needed.
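As a generic illustration of the two bug classes mentioned here (not a diagnosis of the makesrna crash, whose cause is not identified in this thread), the sketch below shows code that stays well-defined under strict aliasing and code whose result does or does not depend on host byte order. All `sketch_*` names are hypothetical, and `sketch_float_bits` assumes a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one (such as s390x). */
static int sketch_host_is_little_endian(void)
{
  const uint16_t probe = 1;
  unsigned char first_byte;
  memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

/* Read the bit pattern of a float. Copying with memcpy is well-defined,
 * whereas *(uint32_t *)&value would violate strict aliasing rules. */
static uint32_t sketch_float_bits(float value)
{
  uint32_t bits;
  assert(sizeof(bits) == sizeof(value));
  memcpy(&bits, &value, sizeof(bits));
  return bits;
}

/* Decode 4 bytes stored in little-endian order. The shifts make the
 * result identical on little-endian and big-endian hosts. */
static uint32_t sketch_load_le32(const unsigned char *bytes)
{
  assert(bytes != NULL);
  return (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
         ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
}

/* Load 4 bytes in host order. The result depends on the host's byte
 * order, which is how endianness assumptions sneak into code. */
static uint32_t sketch_load_host32(const unsigned char *bytes)
{
  uint32_t value;
  assert(bytes != NULL);
  memcpy(&value, bytes, sizeof(value));
  return value;
}
```

Code that uses the host-order load on data written with a fixed byte order works on x86_64 and silently misreads it on a big-endian machine, while the shift-based load behaves the same everywhere.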
Changed status from 'Needs User Info' to: 'Archived'
I'm not doubting that s390x/LTO/big-endian are good at finding issues. The thing is, as I mentioned, s390x is not a supported platform for Blender and we cannot reproduce the issue on the ones that we do support. While we happily accept patches fixing issues for such unsupported platforms, we do not have the resources to go out and debug issues in such environments.
Now if the report was more specific, going "hey, over here at this specific spot, you're doing the stupid and relying on undefined behavior", it'd be a different story, but as of now this is not an actionable issue for us and I'll have to close your ticket.
Added subscriber: @ideasman42
@sharkcz I was curious if this issue pointed to bugs/bad assumptions.
If you can point to an actual error in the code I'd be interested to look into it.
OTOH, I tried building with LTO enabled and the generated warnings regarding string-size were false positives. There are some `lto-type-mismatch` warnings too, however those were in `BLI_thread_lock/unlock` threading code which isn't running in makesrna.
Did you try building with/without `-fno-strict-aliasing`?
Changed title from 'makesrna crashes during the build' to 'makesrna crashes during the build with LTO on s390x architecture (Linux)'
@ideasman42, I haven't looked into the details yet; it's on my to-do list. But my educated guess is that there should be an issue somewhere. It's based on the results we are getting after globally enabling LTO in Fedora.
It looks to me that Blender already sets `-fno-strict-aliasing`, and `makesrna` still crashes when I added that option explicitly.
Changed status from 'Archived' to: 'Resolved'