Fix for Python failure with .blend files loaded from paths including non-ASCII characters #35176
Reference: blender/blender#35176
Test platform: 64-bit Windows Vista, with the active code page set to 932 (Japanese Shift_JIS)
Test Blender revision: 56458
Problem: Freestyle fails when .blend files are loaded from paths including non-ASCII characters (e.g., Japanese).
File loading is okay (thanks to the trunk revision 56454), but Freestyle rendering results in a Python error shown below:
UnicodeDecodeError: 'mbcs' codec can't decode bytes in position 0--1: No mapping for the Unicode character exists in the target code page.
The error comes from Py_CompileString() called from python_script_exec() in source/blender/python/intern/bpy_interface.c.
The second argument of Py_CompileString() must be a byte array in the default file system encoding (Py_FileSystemDefaultEncoding).
In the test platform, Py_FileSystemDefaultEncoding is "mbcs" (i.e., Windows-specific class of encodings).
Apparently, G.main->name is a byte array in the UTF-8 encoding (please confirm this).
This variable indicates the fully qualified name of the .blend file being opened and manipulated by Blender.
The .blend file name is then embedded in a string (fn_dummy) to be passed as the 2nd argument of Py_CompileString().
In summary, the UnicodeDecodeError above results from an attempt in Py_CompileString() to decode a UTF-8 byte array as if the byte array were in the MBCS encoding.
The attached patch is intended to fix the reported issue. Code review and testing are much appreciated.
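The mismatch described in this report can be reproduced at the Python level. The following is a minimal sketch, not the code path inside Blender: it assumes code page 932 ("cp932") as a stand-in for the Windows-only "mbcs" codec, and the helper name `try_decode` and the sample path are hypothetical.

```python
def try_decode(raw: bytes, encoding: str):
    """Return the decoded text, or None if `raw` is not valid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Hypothetical .blend path with a Japanese directory name (U+65E5 U+672C U+8A9E).
path = "C:\\\u65e5\u672c\u8a9e\\scene.blend"

# Blender keeps the path as UTF-8 bytes (G.main->name in the report).
utf8_bytes = path.encode("utf-8")

# Decoding with the right codec recovers the path.
as_utf8 = try_decode(utf8_bytes, "utf-8")

# Decoding the same bytes as code page 932 either fails (None) or yields
# different, garbled text; it never gives back the original path.
as_cp932 = try_decode(utf8_bytes, "cp932")

print(as_utf8 == path, as_cp932 == path)
```

Pure ASCII paths decode identically under both codecs, which is consistent with the problem only showing up for non-ASCII paths.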
Changed status to: 'Open'
Campbell is the expert on this, so assigning to him.
G.main->name is indeed UTF-8, and the Python documentation mentions that Py_CompileString expects a string in the file system encoding, so this change makes sense to me. It looks like PyRun_File in the same function needs the same conversion to the file system encoding?
It would probably be good to split this into a utility function that converts a char* from UTF-8 to the file system default encoding so it can be reused in different places.
Hi Brecht and Campbell,
Thank you for the confirmation and positive response on this matter. Following the code review comments from Brecht, I have duly updated the patch set and made it more comprehensive. There are actually three places in the code base where file names in the file system default encoding are necessary (instead of UTF-8 as it is so far). It is noted that the identified string encoding inconsistency affects the following ways of Python script execution.
The attached patch accounts for all these execution methods, allowing: (a) .blend files and Python scripts to reside in paths including non-ASCII characters; and (b) text datablocks to have non-ASCII ID names.
Since the trunk revision 56454 is likely included in the upcoming 2.67 release and file loading from non-ASCII paths will be okay, the reported issue will affect many Blender applications including Freestyle rendering. A timely fix of the issue would hence be greatly appreciated.
About the PyC_EncodeFSDefault implementation: I found that PyUnicode_EncodeFSDefault exists in the Python API and is already used in Blender. The implementation is here: http://hg.python.org/cpython/file/56ca8eb5207a/Objects/unicodeobject.c
Perhaps we should just always call that instead to be sure it does the right thing, since your version seems to be a bit different? I'm not sure how much using e.g. "surrogateescape" instead of "strict" matters in practice, but maybe it matters in some corner cases.
Also, the strcpy should be replaced by BLI_strncpy.
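For reference, the difference between the "strict" and "surrogateescape" error handlers can be seen with Python's own codecs. This is only a sketch with a made-up file name; it does not show what the patch or PyUnicode_EncodeFSDefault does internally.

```python
# Hypothetical file name stored as Latin-1 bytes; 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.blend"

# "strict" refuses the undecodable byte.
try:
    raw.decode("utf-8", "strict")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# "surrogateescape" maps the bad byte to a lone surrogate (U+DCE9) ...
text = raw.decode("utf-8", "surrogateescape")

# ... which encodes back to the original byte when the same handler is used.
roundtrip = text.encode("utf-8", "surrogateescape")

print(strict_ok, ascii(text), roundtrip == raw)
```

The practical consequence is that "surrogateescape" lets undecodable bytes survive a decode/encode round trip, whereas "strict" turns them into an error that the caller has to handle.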
Hi Brecht and Campbell,
Thanks for the comments. I agree that PyUnicode_EncodeFSDefault would be better. Also BLI_strncpy is used instead of strcpy. I hope the updated patch set is okay now.
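As a rough Python-level analogue of such a conversion helper, `os.fsencode()` encodes a string with the file system encoding and its error handler. The sketch below assumes the input bytes are valid UTF-8; the function name `utf8_to_fs_default` is hypothetical and is not the C helper from the patch.

```python
import os
import sys


def utf8_to_fs_default(raw: bytes) -> bytes:
    """Re-encode UTF-8 bytes into the file system encoding (hypothetical helper)."""
    text = raw.decode("utf-8")   # UTF-8 bytes -> str
    return os.fsencode(text)     # str -> bytes in the file system encoding


# The encoding in use depends on the platform and Python version.
fs_encoding = sys.getfilesystemencoding()

# An ASCII-only path is unchanged under any ASCII-compatible file system encoding.
encoded = utf8_to_fs_default(b"/tmp/scene.blend")

print(fs_encoding, encoded)
```

Note that this version raises on invalid UTF-8 input, so a caller that cannot handle an exception would need a fallback.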
Added subscriber: @IRIEShinsuke
Any update on this? Many users want the problem to be resolved.
I keep seeing many Japanese Blender users suffering from this issue.
For ease of further tests and code review I have just added D267. Any action on the patch set is much appreciated.
Checking on this: note that file paths in Blender aren't strictly UTF-8. We already need to support RNA file paths that are not UTF-8.
I am afraid I missed the point. What do you mean by being not strictly utf-8?
The only requirement for the patch is that G.main->name and the ID name of text data blocks be encoded in UTF-8.
Are there cases where they are not encoded in UTF-8? If so, how do we know the encoding?
I had a conversation with Campbell on IRC. Here is a brief summary of the discussion.
The main concern is that the PyC_EncodeFSDefault() function I wrote may return NULL when the given string is not valid as a UTF-8 string, so that the conversion to the file system default encoding fails. Since callers of the function may not expect NULL, it would be nicer to return some string instead of NULL. It is recalled that the input string to PyC_EncodeFSDefault() is bpy.data.filepath + os.sep + text.name and what may cause the failure of encoding conversion is the first part. Two options were discussed:
Another concern is that some paths cannot be opened with byte strings at all (cf. BLI_fopen() and intern/utfconv/utf_winfunc.c). This may happen and there is no obvious solution for it.
Correction:
I meant on Windows.
Added subscriber: @MartijnBerger
Added subscriber: @Lockal
I have updated D267 based on the discussions with Campbell concerning a junk UTF-8 string as input. Now the helper function PyC_EncodeFSDefault() (added by the patch) always returns a valid string rather than NULL, so that callers of the function won't fail even when they don't expect NULL.
The chosen solution here is just to return the input string as it is when encoding conversion cannot be done.
I have examined several other fallback solutions, including:
for conversion from bytes to unicode, and
for conversion from unicode to bytes. The Latin-1 decoder and encoder pass through all bytes, so they are equivalent to a simple copy of the input string to output. The UnicodeEscape and RawUnicodeEscape functions involve the additional handling of backslash escaped characters such as \xNN and \uNNNN. Since backslash is a directory separator on Windows, using these functions may further complicate error conditions.
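The two observations above can be checked with Python's "latin-1" and "unicode_escape" codecs, used here as stand-ins for the C-level functions mentioned in the comment. The Windows path is a made-up example.

```python
# Latin-1 maps every byte value 0..255 to the code point of the same number,
# so decoding and re-encoding is a plain copy.
all_bytes = bytes(range(256))
latin1_roundtrip = all_bytes.decode("latin-1").encode("latin-1")

# Hypothetical Windows path whose components happen to start with "n" and "t".
win_path = b"C:\\new\\table.blend"

# unicode_escape treats backslash sequences as escapes, so "\n" and "\t"
# in the path become a newline and a tab.
escaped = win_path.decode("unicode_escape")

# latin-1 leaves the backslashes alone.
plain = win_path.decode("latin-1")

print(latin1_roundtrip == all_bytes, repr(escaped), repr(plain))
```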
Since a junk UTF-8 string cannot be a valid path anyway, it looks like the only possibility is to give up encoding conversion and fall back to BLI_strncpy().
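The chosen fallback (return the input unchanged when the conversion cannot be done) might look like this in Python. This is a sketch of the idea only: the function name is hypothetical, and the actual patch implements the fallback in C with BLI_strncpy().

```python
import os


def encode_fs_default_or_copy(raw: bytes) -> bytes:
    """Convert UTF-8 bytes to the file system encoding, or return them as-is."""
    try:
        return os.fsencode(raw.decode("utf-8"))
    except UnicodeError:
        # Junk UTF-8, or text the file system encoding cannot represent:
        # give up on the conversion and hand back the original bytes.
        return raw


junk = b"\xff\xfe.blend"  # not valid UTF-8
print(encode_fs_default_or_copy(junk) == junk)
print(encode_fs_default_or_copy(b"scene.blend"))
```

Callers always get a byte string back, so they never have to handle a NULL-like failure value.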
Changed title from 'Fix for Freestyle failure with .blend files loaded from paths including non-ASCII characters' to: 'Fix for Python failure with .blend files loaded from paths including non-ASCII characters'
Just changed the title, since the problem being addressed here is not Freestyle specific but related to Python in general.
I've been looking into this bug and oddly enough I can't redo it (my fs encoding is mbcs), but I've tried to use text which isn't mbcs compatible and it's not giving a Python error.
I'd still rather avoid adding new string conversion functions, since we can get bpy.data.filepath already.
Note, we already ran into similar problems here.
http://bugs.python.org/issue9713
Committed change to merge text compile into a single function.
Attached a patch which I think fixes the problem for the text editor, by using Py_CompileStringObject so we can convert the string into a PyObject first (using the same method used to get bpy.data.filepath).
@kjym3 could you check if this works for you?
unicode_text.diff
Note, this doesn't deal with bpy_interface.c, just compiling text.
@ideasman42
The changes by unicode_text.diff look okay. Now that we rely on Py_CompileStringObject(), which takes a file name in the form of a Unicode object, we don't have to deal with the file system default encoding.
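The same idea is visible from Python itself: the built-in `compile()` accepts the file name as a `str`, so no byte-level encoding of the name is involved. In the sketch below the path and the text datablock name are hypothetical, and the name is built as file path + os.sep + text name as described earlier in this thread.

```python
import os

# Hypothetical .blend location and text datablock name, both non-ASCII.
blend_path = os.path.join("projects", "\u65e5\u672c\u8a9e", "scene.blend")
text_name = "\u30b9\u30af\u30ea\u30d7\u30c8.py"

# Name reported in tracebacks for the compiled text.
filename = blend_path + os.sep + text_name

source = "answer = 6 * 7\n"

# The file name is passed as a Unicode string and stored unchanged.
code = compile(source, filename, "exec")

namespace = {}
exec(code, namespace)

print(code.co_filename == filename, namespace["answer"])
```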
For Freestyle, we have to make similar changes to python_script_exec() called from BPY_text_exec().
It is noted that Py_CompileStringObject() is new in Python 3.4. Windows binaries of this version are not in the lib svn repository. I just built Python 3.4c1 myself from the tarball using VS 2008 to test the patch.
Below find a revised version of your patch set, including your original changes plus mine to get Freestyle working with non-ASCII file paths.
unicode_text_v2.diff
So OK to postpone this fix until 2.71? (or whenever we bundle Py3.4).
Yes (personally I prefer to have this fix asap though).
Added subscriber: @ThomasDinges
As we won't include Python 3.4 for Blender 2.70, removing Blender 2.70 here.
This issue was referenced by blender/blender-addons-contrib@4d1a109dde
This issue was referenced by 4d1a109dde
Changed status from 'Open' to: 'Resolved'
Closed by commit 4d1a109dde.
Just realized that the problem has been left partly unaddressed. Please consider reviewing the patch just added to the task.
Changed status from 'Resolved' to: 'Open'
py-exec-fix-test.blend
Here is a quick .blend file for testing. Just press the "Run Script" button in the text editor. Without the proposed fix you will see a stack trace printed in the console (after the completion of the script execution).
@kjym3, I can't redo the bug (MSVC2013, Windows 7).
I tried both running and registering the script, and both work fine: it prints "hello" with no exception.
EDIT: somehow missed D595, looks good.
Changed status from 'Open' to: 'Resolved'
Committed, resolved.
Great, thanks!
Added subscriber: @TonyMullen
I am experiencing a related problem in the latest buildbot build as of Jun 19 (b49e6d0), and in all previous builds as far back as 2.65 that I've tried, on Japanese Windows 8 on a Vaio Ultrabook. Python fails on startup and I am unable to find any workaround to run Blender with its menus, etc. intact (I've tried putting the Blender install directly under C:\ so as to avoid Japanese characters in the path, but no luck). This also gives me the same error as described above in the console:
UnicodeDecodeError: 'mbcs' codec can't decode bytes in position 0--1: No mapping for the Unicode character exists in the target code page.
Should I open a new report for this?
Tony
Fwiw, I'm adding the screenshot I took of the Traceback.
Exactly the same error occurs when the name of the user running Blender contains non-ASCII MBCS-compatible characters such as those in Japanese. Patch D604 is intended to address this issue. The reported problem is different from what #35176 was meant to address, so filing another bug report is appreciated.
Ok, thanks. Created a separate bug report.