ppc64el, Debian: video sequence editor (VSE) preview corrupt
Closed, Archived (Public)

Description

System Information
Operating system: Debian buster
Graphics card: Radeon RX 580

Blender Version
Broken: tried the packages from Debian buster, buster-backports and manually building the package from Debian unstable 2.83.5+dfsg-2
Worked: N/A

Short description of error
Blender appears to work. In the video sequence editor, preview window, video clips have vertical lines through the picture. Images (e.g. from JPEG files) are displayed correctly, only video clips are broken. I saw other bugs such as T19794 about endian problems. I'm testing on POWER9, a little endian PPC64 environment.

Exact steps for others to reproduce the error

  1. Start blender from the command line with the test Blend file as an argument
  2. Look at the preview pane: it is corrupted, vertical bars through the picture

Event Timeline

Sounds like a video driver problem.

Note that I'm not sure any active developers have access to this hardware, even so.

This report is missing:

  • Exact steps to redo the error.
  • A blend file (unless there are simple steps to redo this from the default startup).

Since this is an error others may have trouble redoing:

  • Does this happen for all video?
  • Does this happen for image sequences?
  • Can you test different codecs/formats to narrow down where the problem might be?

Yes, I'm happy to test it and create a sample blend file demonstrating the problem

I notice it happens to the videos in the sequence but the images are rendered properly

I'm a developer and usually work on other C++ projects, for example reSIProcate, but if you can tell me where to focus, I don't mind looking at the code and applying any changes.

The workstation I'm using is one of the Talos II systems, it is open hardware.

I can try a different GPU if there are known problems with the RX 580. I prefer to use only the open source amdgpu driver. I was thinking about getting one of the AMD Big Navi cards in November but it is not clear if they will work immediately on Linux systems.

I tried building 2.90.0 as a Debian package, it also has the same bug.

I attach a screenshot showing how the video looks in the preview

Notice the PNG at the bottom with the text is displayed correctly but the video is rendered with vertical bars

Richard Antalik (ISS) changed the task status from Needs Triage to Needs Information from User. Sep 18 2020, 4:36 PM

This doesn't look like a GPU issue, since the whole preview region is pushed to the GPU as one texture.
Also, I am not sure this would be an endianness issue, since x86 is also little endian.

Does it look like this if you add a movie strip and don't adjust any strip properties? Can you try to load the movie into other editors, like the movie clip editor or image editor?

As an endianness issue, I was concerned that some part of the code may see that it is running on ppc and wrongly assume all ppc == big endian. I've seen similar problems in other code.

ppc can be either little or big endian and some developers assume one implies the other.

Regarding endianness it seems we rely on __BIG_ENDIAN__ and __LITTLE_ENDIAN__ symbols to be defined correctly, but they are not defined in GCC. I don't build on Linux so I don't know if this has to be defined explicitly somewhere in make files. @Ray molenkamp (LazyDodo) Are you familiar with this?

I can see that little endian is detected at the configure step and the gcc compiler is called with -DLITTLE_ENDIAN on the command line.

However, maybe there could be one accidental block of code like this that nobody checked before:

#ifdef __ppc...
// wrongly assume big endian

Here is a sample blend file that reproduces the problem

There is a single video strip

This was created in Blender 2.90.0 on Debian ppc64el

Another problem I sometimes see when porting code is that the size of data types is not correct for every architecture. For example, the gcc on that architecture uses 64 bits for a type but the developer has guessed it is 32 bits and hardcoded that somewhere. I feel that in this case, that is less likely than an endianness issue but can't be completely sure.
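As an illustration of how code can avoid depending on either type sizes or byte order, here is a small sketch that uses fixed-width types and explicit shifts. It is not taken from Blender or ffmpeg; `load_u32_le` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C11 compile-time check: fail the build instead of guessing a type size. */
_Static_assert(sizeof(uint32_t) == 4, "uint32_t is expected to be 4 bytes");

/* Reads a 32-bit value stored in little-endian byte order.
 * Because the value is assembled with shifts, the result is the same
 * on little-endian and big-endian hosts. */
static uint32_t load_u32_le(const uint8_t *bytes)
{
  assert(bytes != NULL);
  return (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) | ((uint32_t)bytes[2] << 16) |
         ((uint32_t)bytes[3] << 24);
}
```

Code written this way has no `#ifdef` on the architecture at all, which removes the opportunity to get the ppc64 versus ppc64el distinction wrong.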

Regarding endianness it seems we rely on __BIG_ENDIAN__ and __LITTLE_ENDIAN__ symbols to be defined correctly, but they are not defined in GCC. I don't build on Linux so I don't know if this has to be defined explicitly somewhere in make files. @Ray molenkamp (LazyDodo) Are you familiar with this?

This is controlled through the main cmakelists.txt in this section and should be defined when the compiler is invoked.

Using find, I found three places in the source tree where the logic is not right for ppc64el

btSerializer.h

SDL_endian.h is good for linux but not for BSD and others perhaps

PacketMath.h has an example of assuming the size of a type of ppc64el

Here is the command I used, other searches for endian and byte_order might also be useful:

find blender/ -type f -exec egrep -C10 -Hi '(_ppc|_power)' {} \;|less

Are any of those things listed above relevant to the preview panel in the video editor?

None of those are used in the VSE. For all I know, ffmpeg is delivering the frame to us like this. A developer will have to look at this; throwing random things at the wall and asking "could it be this?" doesn't seem like a good use of anyone's time.

Can you check if other software using ffmpeg plays this properly? (video editors, players)

Note that this report is still missing:

  • Test files to redo the error.
  • Details about which codecs work and which fail. Please try different video formats to see if this is related to a specific codec.

I provided a test file, please see the earlier comment "Here is a sample blend file that reproduces the problem", it is a link to the files on Gitlab.

I ran ffplay from the command line with each of the video files used in the Blend file.

ffplay can display them without any problems.

I tried using an updated ffmpeg 4.3.1 backport, that didn't make any difference

Missed the link, uploaded here.

Looking at the preview window with different zoom settings changes the way it is corrupted. Notice that the gaps between the bars are zoomed as well.

I tried some videos with other codecs. The cat video attached is H.264, the others I tried are VP9 and WMV2. All have the same problem so it is not specific to the H.264 codec.

I continued testing today and I found that the same problem appears in the render output. I tried each render engine (Eevee, Cycles, Workbench) and it is always the same.

I tried using the Compositor as well and the video node in the compositor has the same problem.

Richard Antalik (ISS) changed the task status from Needs Information from User to Needs Information from Developers. Sep 28 2020, 5:35 AM

I am really not sure what the problem here could be.

All info here would point to one switch, which I would assume works correctly. To check, you can apply the following patch, load a video in the VSE and check the console output:

diff --git a/source/blender/imbuf/intern/anim_movie.c b/source/blender/imbuf/intern/anim_movie.c
index f5ae602946e..a05f98330bd 100644
--- a/source/blender/imbuf/intern/anim_movie.c
+++ b/source/blender/imbuf/intern/anim_movie.c
@@ -801,6 +801,7 @@ static void ffmpeg_postprocess(struct anim *anim)
   }

   if (ENDIAN_ORDER == B_ENDIAN) {
+    printf("Detected big endian machine\n");
     int *dstStride = anim->pFrameRGB->linesize;
     uint8_t **dst = anim->pFrameRGB->data;
     const int dstStride2[4] = {dstStride[0], 0, 0, 0};
@@ -847,6 +848,7 @@ static void ffmpeg_postprocess(struct anim *anim)
     }
   }
   else {
+    printf("Detected little endian machine\n");
     int *dstStride = anim->pFrameRGB->linesize;
     uint8_t **dst = anim->pFrameRGB->data;
     const int dstStride2[4] = {-dstStride[0], 0, 0, 0};

I can't see any other code directly manipulating pixels there.

I had already checked that code with gdb breakpoints. It always takes the second code path.

853	    uint8_t *dst2[4] = {dst[0] + (anim->y - 1) * dstStride[0], 0, 0, 0};
(gdb) 
855	    sws_scale(anim->img_convert_ctx,
(gdb) 
864	  if (need_aligned_ffmpeg_buffer(anim)) {
(gdb) 
874	  if (filter_y) {
(gdb) 
ffmpeg_fetchibuf (tc=<optimized out>, position=<optimized out>, anim=<optimized out>) at ./source/blender/imbuf/intern/anim_movie.c:1235
1235	  anim->last_pts = anim->next_pts;
(gdb)

I have debug symbol packages installed but some values are optimized out, I can recompile without optimizing if necessary
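For readers following the second code path: the `dst2` pointer starts at the last row of the destination and `dstStride2` is negative, which appears to make `sws_scale` write rows bottom-up. The sketch below shows the same pointer arithmetic on a plain byte buffer. It is an illustration only, assuming tightly packed rows; `copy_rows_flipped` is a hypothetical name and no ffmpeg calls are involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies `height` rows of `row_bytes` bytes from `src` into `dst`,
 * writing the destination rows in reverse order (vertical flip). */
static void copy_rows_flipped(uint8_t *dst, const uint8_t *src, int height, int row_bytes)
{
  assert(dst != NULL && src != NULL && height > 0 && row_bytes > 0);

  /* Same idea as dst2 / dstStride2: start at the last row, step backwards. */
  uint8_t *dst_last_row = dst + (ptrdiff_t)(height - 1) * row_bytes;
  const ptrdiff_t dst_stride = -(ptrdiff_t)row_bytes;

  for (int y = 0; y < height; y++) {
    memcpy(dst_last_row + y * dst_stride, src + (ptrdiff_t)y * row_bytes, (size_t)row_bytes);
  }
}
```

A flip like this reorders whole rows, so on its own it would not be expected to produce vertical bars; it is shown only to make the stride arithmetic in the trace easier to follow.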

Could it be related to alignment or sizeof(some type) on this platform?

when I zoom in a lot so the pixels become quite big, I can see that the ratio between the vertical strips of image to the vertical strips of alpha is 1:15, so there is one vertical line of pixels in every 16 pixels
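The 1-in-16 observation can be checked mechanically on a row of RGBA bytes. This is a rough sketch under the assumption that the buffer holds 4 bytes per pixel with alpha last; `opaque_pixel_period` is a hypothetical helper, not part of Blender.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the distance in pixels between the first two pixels of the row
 * whose alpha byte is non-zero, or -1 if fewer than two such pixels exist.
 * Layout assumption: 4 bytes per pixel, alpha stored last (RGBA). */
static int opaque_pixel_period(const uint8_t *rgba, int width)
{
  assert(rgba != NULL);
  int first = -1;
  for (int x = 0; x < width; x++) {
    if (rgba[4 * x + 3] != 0) {
      if (first < 0) {
        first = x;
      }
      else {
        return x - first;
      }
    }
  }
  return -1;
}
```

A row in which one visible column is followed by 15 transparent ones would give a period of 16.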

not sure if this is relevant, but there could be something around the call to sws_scale, for example
https://code.mythtv.org/trac/ticket/12888

The hardcoded 32 bit alignment also appears to be something to avoid, I found commits like this in other projects
https://github.com/FFmpeg/FFmpeg/commit/f30a41a6086eb8c10f66090739a2a4f8491c3c7a

It is a magic number, 32 in the call to av_frame_get_buffer and a few other places, the number 31 also appears in need_aligned_ffmpeg_buffer. I'm not sure how prevalent this is throughout Blender but it may be a good idea to

a) make a commit replacing all the magic numbers with a macro

b) make another commit replacing the macro with 0 in cases where alignment can be guessed by av_frame_get_buffer
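Step (a) could look roughly like the sketch below. The macro name `FFMPEG_BUFFER_ALIGN` and both helper names are hypothetical; the only part taken from the code under discussion is the `(x & 31) != 0` test, which is the same as asking whether the width is a multiple of 32.

```c
#include <assert.h>

/* Hypothetical macro replacing the literal 32 (and the derived mask 31). */
#define FFMPEG_BUFFER_ALIGN 32

/* Equivalent to (width & 31) != 0: true when width is not a multiple of 32. */
static int width_needs_aligned_buffer(int width)
{
  return (width & (FFMPEG_BUFFER_ALIGN - 1)) != 0;
}

/* Rounds a non-negative value up to the next multiple of the alignment.
 * The mask trick only works because the alignment is a power of two. */
static int align_up(int value)
{
  assert(value >= 0);
  return (value + FFMPEG_BUFFER_ALIGN - 1) & ~(FFMPEG_BUFFER_ALIGN - 1);
}
```

For example, a width of 1920 is already a multiple of 32, while 1080 is not and would round up to 1088.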

I guess changing hard coded alignment to automatic would be good practice, but I don't think this is the cause of the issue, because the buffer is actually 32 bit aligned. With the example video, av_frame_get_buffer(anim->pFrameRGB, 32) should not be executed anyway.

Since images work OK and other editors have the same issue, the bug must come from the anim_movie.c file, and there isn't much going on there: basically sws_scale fills the buffer and it is passed to the editor. You say that ffplay works correctly, however. Is Blender actually built with the ffmpeg library you tested?

I could add more debug prints there (mainly to check the result of av_frame_alloc) and "force" sws_scale to pass the image unchanged to see what happens, if you have the patience, but this is mostly guesswork on my part.

I'm happy to tweak things

I do quite a lot of C++ in other projects, for example:
https://github.com/resiprocate/resiprocate/graphs/contributors

but I'm not familiar with the Blender code base.

If you want me to query values in gdb breakpoints, test changes that you put on a branch or if you want to create some extra unit tests for me to run then I'm happy to collaborate with you until we get this fixed.

I checked with ldd and I can see that Blender links against the ffmpeg dynamic libs on my system

I tried first with the ffmpeg 4.1.6 packages from Debian stable (buster), then I built a backport package of ffmpeg 4.3.1 and I recompiled against that

https://packages.qa.debian.org/f/ffmpeg.html

Happy new year, I have some new observations:

I built a package of 2.90.1-1~bpo10+1 on Debian buster, same problem

On the same machine, I also have OBS. OBS uses ffmpeg, like Blender, behind the scenes. Similar problem in OBS, but not quite the same:

  • video appears to be OK in OBS
  • rendering a PNG in OBS is also OK
  • rendering a JPEG in OBS: I see vertical bars, like the problem in Blender

Notice that this situation is inverted: in Blender, the images are rendered correctly, whether they are PNG or JPEG, they are always OK. The video is corrupted in Blender. In OBS, it is the opposite: images, if they are JPEG, corrupted.

Thanks for new information, and I am sorry I haven't responded earlier, since you agreed to run a test.

Please apply this patch

And open .blend file using provided video file from archive:

Then paste console output here or to https://hastebin.com and provide link.

It's just a simple dump of 16 pixels in a line from the buffer at key places. The correct values for every pixel should be: 63 64 127 128.
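The patch itself is attached to the task and not reproduced in this thread. As a rough idea of what such a dump involves, here is a sketch that prints pixels in the tab-separated format seen in the console output. The function names are hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats one RGBA pixel as four tab-terminated decimal values,
 * e.g. "63\t64\t127\t128\t". Returns the value returned by snprintf. */
static int format_rgba_pixel(char *out, size_t out_size, const uint8_t *px)
{
  assert(out != NULL && px != NULL);
  return snprintf(out, out_size, "%d\t%d\t%d\t%d\t", px[0], px[1], px[2], px[3]);
}

/* Prints `count` consecutive RGBA pixels, one per line. */
static void dump_rgba_pixels(FILE *stream, const uint8_t *rgba, int count)
{
  char line[32];
  for (int i = 0; i < count; i++) {
    format_rgba_pixel(line, sizeof(line), rgba + 4 * i);
    fprintf(stream, "%s\n", line);
  }
}
```

Calling `dump_rgba_pixels(stdout, buffer, 16)` on the first row of a frame buffer would give output in the same shape as the dumps pasted in this task.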

Your patch changes two files, the second file, source/blender/sequencer/intern/render.c is not in my tree

$ ls source/blender
blendthumb       bmesh           editors            imbuf      python
blenfont         CMakeLists.txt  freestyle          io         render
blenkernel       compositor      functions          makesdna   shader_fx
blenlib          datatoc         gpencil_modifiers  makesrna   simulation
blenloader       depsgraph       gpu                modifiers  windowmanager
blentranslation  draw            ikplugin           nodes

I'm using the 2.90.1 build on Debian, can you please confirm which branch you made the patch for?

I'll try the patch anyway, only applied to source/blender/imbuf/intern/anim_movie.c it is compiling now

$ blender T80912_test.blend 
Read prefs: /home/daniel/.config/blender/2.90/config/userpref.blend
Read blend: ./T80912_test.blend
pFrameRGB buffer:
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	


Buffer after sws_scale:
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	
63	64	127	128	

Sorry, I should have specified: the patch should be applied on latest master. It should apply to 2.91 as well, I think.

Do you see same problem in example file with preview?

Here is correct preview

No, when I used your example, I saw the same white square as in that image

Interesting. Just for info, the video file uses the MPNG codec with RGBA channels, so the pixel format doesn't change and I can quickly check values.
It is possible that this codec does not trigger the bug. You can try transcoding some videos to check whether this is the case.

Can you paste console output using test-T80912.blend and VID_20150419_101911.mp4 files from description? I will compare to what I get on my machine.

Using the same Blender version as I tested your file, now using my test file and video:

$ blender test-T80912.blend 
Read prefs: /home/daniel/.config/blender/2.90/config/userpref.blend
Read blend: ./test-T80912.blend
pFrameRGB buffer:
6	7	16	255	
5	6	15	0	
2	9	16	0	
3	10	17	0	
2	5	13	0	
6	9	17	0	
8	11	19	0	
18	21	29	0	
37	40	48	0	
49	52	60	0	
59	62	70	0	
58	61	69	0	
58	59	68	0	
57	58	67	0	
71	66	77	0	
72	67	78	0	


Buffer after sws_scale:
6	7	16	255	
5	6	15	0	
2	9	16	0	
3	10	17	0	
2	5	13	0	
6	9	17	0	
8	11	19	0	
18	21	29	0	
37	40	48	0	
49	52	60	0	
59	62	70	0	
58	61	69	0	
58	59	68	0	
57	58	67	0	
71	66	77	0	
72	67	78	0	

Does that give you enough detail to determine the problem or do you need me to compile 2.91 or master with the full patch?

I don't think I will need that. But it seems I am reading the same data in both prints, so I will have to fix the patch and make one more try. Also, I am missing index info, since this is not a static picture anymore.

It seems that it depends on need_aligned_ffmpeg_buffer() whether I am looking at the same buffer or not.

Having YUV buffers available doesn't make much sense here, as the problem is that the alpha channel is set to 0 by sws_scale().

This is a bit of a shot in the dark, but can you try whether the following change helps?

diff --git a/source/blender/imbuf/intern/anim_movie.c b/source/blender/imbuf/intern/anim_movie.c
index c40e65b1c5c..fa33d774f17 100644
--- a/source/blender/imbuf/intern/anim_movie.c
+++ b/source/blender/imbuf/intern/anim_movie.c
@@ -502,7 +502,7 @@ static ImBuf *avi_fetchibuf(struct anim *anim, int position)

 BLI_INLINE bool need_aligned_ffmpeg_buffer(struct anim *anim)
 {
-  return (anim->x & 31) != 0;
+  return true;
 }

 static int startffmpeg(struct anim *anim)

If the above doesn't help, I am 99% confident this is an ffmpeg issue, and that ffplay and possibly other apps using ffmpeg work fine because they ignore the alpha channel and render only RGB at full opacity.
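To see why an application that honours alpha would show bars while one that ignores it would not, consider how a single channel is composited over a background. The sketch below assumes straight (non-premultiplied) 8-bit alpha; `composite_over` is a hypothetical helper and is not how Blender or ffplay actually draw.

```c
#include <assert.h>
#include <stdint.h>

/* Composites one 8-bit source channel over a background channel using
 * straight alpha, with rounding. All arguments are in the range 0..255. */
static uint8_t composite_over(int src, int alpha, int background)
{
  assert(alpha >= 0 && alpha <= 255);
  assert(src >= 0 && src <= 255 && background >= 0 && background <= 255);
  return (uint8_t)((src * alpha + background * (255 - alpha) + 127) / 255);
}
```

With alpha 255 the source value passes through unchanged; with alpha 0 only the background remains, however valid the RGB values are. That matches the dumps above, where the RGB values look plausible but only the first of the 16 pixels has alpha 255.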

Thanks for that feedback, I had already tried that change myself but I tried it again, on the same 2.90.1 source tree and it doesn't fix the issue. The output from the patch is below.

Can you propose any extra unit test in ffmpeg to detect the fault?

Can you propose any ffmpeg command line to detect the fault?

With either of those things, we could open an ffmpeg bug but the unit test would be ideal, that would detect any future regressions on any platform, not just ppc64le

$ blender test-T80912.blend 
Read prefs: /home/daniel/.config/blender/2.90/config/userpref.blend
Read blend: ./test-T80912.blend
pFrameRGB buffer:
6	7	16	255	
5	6	15	0	
2	9	16	0	
3	10	17	0	
2	5	13	0	
6	9	17	0	
8	11	19	0	
18	21	29	0	
37	40	48	0	
49	52	60	0	
59	62	70	0	
58	61	69	0	
58	59	68	0	
57	58	67	0	
71	66	77	0	
72	67	78	0	


Buffer after sws_scale:
6	7	16	255	
5	6	15	0	
2	9	16	0	
3	10	17	0	
2	5	13	0	
6	9	17	0	
8	11	19	0	
18	21	29	0	
37	40	48	0	
49	52	60	0	
59	62	70	0	
58	61	69	0	
58	59	68	0	
57	58	67	0	
71	66	77	0	
72	67	78	0	

Can you propose any extra unit test in ffmpeg to detect the fault?

I think we can have test for case like this.
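The core assertion of such a test could be as simple as the sketch below: after converting a source without an alpha channel to RGBA, every alpha byte is expected to be 255. This is only the checking half, with a hypothetical function name; producing the buffer with `sws_scale` is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every pixel in an RGBA buffer has alpha == 255, else 0.
 * Layout assumption: 4 bytes per pixel, alpha stored last. */
static int all_pixels_opaque(const uint8_t *rgba, size_t pixel_count)
{
  assert(rgba != NULL || pixel_count == 0);
  for (size_t i = 0; i < pixel_count; i++) {
    if (rgba[4 * i + 3] != 255) {
      return 0;
    }
  }
  return 1;
}
```

Applied to the first two pixels of the dump above (`6 7 16 255` and `5 6 15 0`), the check fails on the second pixel, which is the fault a regression test would need to catch.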

Can you propose any ffmpeg command line to detect the fault?

I would try
ffmpeg -loglevel 48 -i VID_20150419_101911.mp4 -vframes 1 -pix_fmt rgba out.png

The command should produce broken image. Please upload image and console output of the command here.

Confirmed, the ffmpeg command produces a broken image

Richard Antalik (ISS) closed this task as Archived. Jan 23 2021, 1:44 AM

Great, so please create a new report on the ffmpeg tracker and provide a link to the issue here, as I won't be able to check whether this has been resolved or not.

I will close this report since it is not a bug in Blender.

Here is the ffmpeg bug report for this issue:
https://trac.ffmpeg.org/ticket/9077

Thanks for help in identifying the cause of this

fixed upstream in ffmpeg commit
http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=2687070d9b092d3a354a6963c65197054ddf7a75

I applied the commit to ffmpeg 4.3.1

Tested with the ffmpeg command you provided above, works OK

Tested with Blender 2.90.1-1~bpo10+1 and it works OK

Thanks for assisting with this.