i18n messages extraction script: fix handling of C unicode-escapes.

rB1f5647c07d15 introduced, for the first time, a unicode escape in a
string to be translated that is extracted directly from the C code.

This revealed that the current code did not handle this case properly.
For now we work around it using Python's `raw_unicode_escape`
encoding/decoding.
Bastien Montagne 2021-02-22 18:29:52 +01:00
parent 46bdf6d59f
commit 32073993a8
1 changed file with 3 additions and 1 deletion


@@ -735,7 +735,9 @@ def dump_src_messages(msgs, reports, settings):
     _clean_str = re.compile(settings.str_clean_re).finditer
 
     def clean_str(s):
-        return "".join(m.group("clean") for m in _clean_str(s))
+        # The encode/decode to/from 'raw_unicode_escape' allows to transform the C-type unicode hexadecimal escapes
+        # (like '\u2715' for the '×' symbol) back into a proper unicode character.
+        return "".join(m.group("clean") for m in _clean_str(s)).encode('raw_unicode_escape').decode('raw_unicode_escape')
 
     def dump_src_file(path, rel_path, msgs, reports, settings):
         def process_entry(_msgctxt, _msgid):
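
The round trip used in the hunk above can be tried in isolation. The following is a minimal sketch, not code from the extraction script: the helper name `decode_c_unicode_escapes` and the sample strings are hypothetical and only illustrate how `raw_unicode_escape` behaves on text that still contains a literal backslash-u sequence.

```python
def decode_c_unicode_escapes(s: str) -> str:
    """Hypothetical helper: turn literal C-style \\uXXXX sequences into characters.

    Encoding with 'raw_unicode_escape' leaves existing backslashes untouched,
    so decoding with the same codec then interprets any \\uXXXX sequence
    that was present as plain text.
    """
    return s.encode("raw_unicode_escape").decode("raw_unicode_escape")


# A message as it might be extracted from C source: the escape is still
# six separate characters (backslash, 'u', four hex digits).
extracted = "Close \\u2715"
converted = decode_c_unicode_escapes(extracted)

print(len(extracted))  # 12
print(len(converted))  # 7

# Other backslash sequences are not interpreted by this codec.
untouched = decode_c_unicode_escapes("line\\nbreak")

# Characters that are already non-ASCII survive the round trip.
already_unicode = decode_c_unicode_escapes("caf\u00e9 \u2715")
```

One consequence of this approach, worth keeping in mind as a limitation of the sketch, is that a backslash-u that is not followed by four hexadecimal digits makes the decode step raise `UnicodeDecodeError`.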