VSE movie strips: unreliable timing, unwanted time warping
Closed, Archived, Public

Description

Short description of the problem

When using video files in movie strips in Blender's Video Sequence Editor, the videos are sometimes impossible to sync because the timing of the video streams warps unpredictably. This breaks audio/video sync both while editing and in final renders, and makes editing movies impossible.

Introduction and my personal history with this issue

Since I started using Blender 2.5 I have experienced severe problems with syncing video to audio in Blender's Video Sequence Editor.
Syncing multi-camera footage together was even harder. I'm not sure, but I feel like the problem was non-existent before Blender 2.5.

I was very frustrated with this and took a break of a few years from using Blender for video editing.
I published this video right before I gave up:

https://youtu.be/7tq4lX6OmLc?t=40m56s

You can hear that I speak while my lips don't move, and vice versa. But it's not just the webcam part; all of the video is totally out of sync with the audio. The raw footage I captured with recordMyDesktop played fine on its own, but after editing and rendering, the audio and video never align in the screencast part. Strangely, a part captured with a Canon 550D DSLR is fine.

After a few years I finally gave Blender another try. I'm currently using Blender again to edit videos, and I managed to edit and render a 75-minute video without experiencing this problem. Here's that video:

https://youtu.be/SMrHEoci5Fk?t=39m8s

You can see a few frames of static A/V desync, but that's easily correctable. It's not what I'm talking about here.

Two days ago I recorded footage with my webcam (the same way I recorded footage for the video above) and with a camcorder.
I recorded a visible and audible hand clap to help myself sync. I tried to sync these movie strips together by their audio first, but this seemed to fail; the footage didn't align with the sound. So I tried by frames: I found and locked on the exact moment when my hands meet. After rendering, I don't see any place where the two cameras are in sync. The result is this video:

https://www.youtube.com/watch?v=vCwANW3IH60

The camcorder (on the left) seems to be in sync with the audio at first, then drifts away, but the webcam (right) is totally off right from the start. Notice how the webcam footage stops sometimes while the camcorder footage keeps rolling. If you jump to the last seconds of this video, you can see how far off the webcam footage is at the end: it is still playing in the middle of the performance while the camcorder and audio are already finished.

The strange thing is that it's not the camcorder footage that fails most; it's the webcam. The camcorder was the new element, while I had already edited webcam footage captured exactly the same way with success in Blender. Here's an example with a ton of editing where the sync is solid:

https://www.youtube.com/watch?v=fEduGnD6ZKQ

Here's a screenshot of the session:

Symptoms

The issue appears unpredictably, probably when different input video formats are used together. It affects only movie strips, never image sequences.
I know it's on when I sync movie strips at one point, then scrub to a different point and they don't match there.
I also know the issue is on when I make some cuts, only to realize after rendering that my editing decisions were completely distorted and the cuts are not where I wanted them to be.
I know it's on when the original, aligned audio no longer matches the video two minutes into the footage.
Another sign of this problem is that two sources that started in sync don't end in sync: one is still "in the middle" while the other is already over.

But didn't you just forget to..?

I generate 25% (and sometimes 50%) JPEG proxies for all the footage, and most of the time I edit using those.
I also enable Free Run timecode for all the footage.
I have A/V Sync enabled at all times.

Testing the problem

So, to finally find out what is happening, I captured 4-source footage of a 25 fps timecode counter going from 00:00:00:00 to 00:04:00:01.

The sources are:

  1. Screencast - Open Broadcaster Software recording my screen and an overlay webcam, with audio (it's just silence, but it helps check for the correct framerate)
  2. Camcorder - recording the laptop screen with audio
  3. Camera - recording the laptop screen with audio
  4. Phone - recording the laptop screen with audio

You can download all the video files here and do your own testing:
https://drive.google.com/drive/folders/0BxsUHyFo7VxfcUFJT2FTN0VWbHM?usp=sharing

Here's ffprobe output for all the files (omitting the redundant heading information):

Screencast

Input #0, matroska,webm, from 'Screencast.mkv':
  Metadata:
    ENCODER         : Lavf56.40.101
  Duration: 00:04:26.83, start: 0.000000, bitrate: 11868 kb/s
    Stream #0:0: Video: h264 (Constrained Baseline), yuv420p, 1920x1080, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn, 60 tbc (default)
    Metadata:
      DURATION        : 00:04:26.833000000
    Stream #0:1: Audio: aac (LC), 48000 Hz, stereo, fltp (default)
    Metadata:
      title           : simple_aac_recording
      DURATION        : 00:04:26.795000000

Camcorder

Input #0, mpeg, from 'Camcorder.MPG':
  Duration: 00:04:27.84, start: 0.112556, bitrate: 9530 kb/s
    Stream #0:0[0x1e0]: Video: mpeg2video (Main), yuv420p(tv), 720x576 [SAR 64:45 DAR 16:9], max. 9100 kb/s, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x80]: Audio: ac3, 48000 Hz, stereo, fltp, 256 kb/s

Camera

[mjpeg @ 0x7256a0] Changeing bps to 8
Input #0, avi, from 'Camera.AVI':
  Metadata:
    encoder         : 
    maker           : NIKON
    model           : P80
    creation_time   : 2008-01-01 00:00:00
  Duration: 00:04:25.50, start: 0.000000, bitrate: 8455 kb/s
    Stream #0:0: Video: mjpeg (MJPG / 0x47504A4D), yuvj422p(pc, bt470bg/unknown/unknown), 640x480, 8388 kb/s, 30 fps, 30 tbr, 30 tbn, 30 tbc
    Stream #0:1: Audio: pcm_u8 ([1][0][0][0] / 0x0001), 8000 Hz, 1 channels, u8, 64 kb/s

Phone
(I remuxed this video with ffmpeg to remove the GPS location tag: ffmpeg -i input.mp4 -metadata location="" -metadata location-eng="" -acodec copy -vcodec copy output.mp4)

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Phone.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.40.101
  Duration: 00:04:27.26, start: 0.000000, bitrate: 15173 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 15056 kb/s, 26.57 fps, 30 tbr, 90k tbn, 180k tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 124 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

The issue initially expressed itself strongly.
I synced the 4-camera footage, having all 4 sources match at timecode 00:00:00:10 (on frame 1 the laptop screen was obscured by my arm when I pressed the spacebar to start the timecode).

I tested with VLC, and all video files contain the full footage from 00:00:00:00 to 00:04:00:01. However, the phone never gets that far; in Blender it ends earlier. I wasn't able to correct this by importing the footage multiple times, even though VLC shows footage that Blender never reaches. Strange.

Here's the resulting video, edited and rendered with Blender 2.78:

https://youtu.be/cPP7YX1ljFE

Generally it looks like the phone lags behind by about 20 seconds, while the rest stay within 8 frames of desync relative to Blender's timecode:

I hope we can sort this out finally.

Details

Type
Bug

The problem here is that your various clips have different framerates (i.e. 30fps for the screencast, 25fps for the camcorder, 30fps for the camera, and 26.57fps for the phone).

Blender's Video Sequencer can only cope with having a single framerate for all movie clips (i.e. the one defined in the render settings). In other words, it doesn't know how to resample videos to get them to have a consistent framerate. That's why you'll see the video + audio streams getting out of sync, and also why some of the videos will look too long/short (depending on which one you used first - at least since 2.77/8, Blender will use the framerate of the first clip to determine the framerate for the project, and then all others will be interpreted accordingly).
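The scale of that distortion is easy to estimate. Below is a sketch of my own illustrating the behaviour described above (it is not Blender code, and `desync_after` is a hypothetical helper name):

```python
# The VSE reads every movie strip frame-by-frame at the project framerate,
# so a clip recorded at a different rate plays at the wrong speed while
# its audio strip keeps its real-time duration.

def desync_after(project_fps: float, clip_fps: float, seconds: float) -> float:
    """Seconds of A/V desync accumulated after `seconds` of playback.

    At playback time t the sequencer shows frame number t * project_fps,
    which in the clip's own timeline corresponds to t * project_fps / clip_fps
    seconds, while the audio plays back in real time.
    """
    video_time = seconds * project_fps / clip_fps
    return video_time - seconds

# A 25 fps camcorder clip in a 30 fps project: after two minutes the
# picture is 24 seconds ahead of its own sound.
print(desync_after(30, 25, 120))  # 24.0
```

Matching framerates produce zero drift, which is why single-source projects edit fine while mixed-source ones fall apart over time.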

AFAIK, there are currently no plans to add support for doing this sort of resampling, as there's no one working on the sequencer itself either.

What you could do in this case is to manually convert the clips to all have the same framerate (it looks like 30fps might be a good baseline, given that half your clips are in that format already). I suspect FFMPEG (or even VLC) should have options to do this.
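As a sketch of that manual conversion (assuming FFmpeg is installed; the filename is taken from the test set above, and the codec/quality options are only one reasonable choice):

```shell
# Conform the 25 fps camcorder clip to a 30 fps project rate.
# The fps filter duplicates frames as needed; the audio keeps its
# real-time duration, so A/V sync is preserved.
ffmpeg -i Camcorder.MPG -vf "fps=30" -c:v libx264 -crf 18 -c:a aac Camcorder_30fps.mp4
```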

Good luck :)

I use mixed framerates a lot, so I made Blender cope with them better than it does by default... It's not quite production-ready yet, as I still notice bugs when using it, but I no longer rely on the Speed Control strip.

https://rightclickselect.com/p/sequencer/Ppbbbc/vse-playback-rate-control

I should put the patch up somewhere, I haven't had any time to work on Blender recently though.

aligorith, I'm not sure if we are on the same page.

I know Blender will not automatically compensate for different framerates and simply uses frames 1:1, resulting in distorted video playback speed.
I have no problem using the Speed Control strip to correct this manually (as you can see in the screenshot).
The audio strips created on movie import are very useful, because they show how long the clip should be after changing the framerate.

Do you mean that the Speed Control strip doesn't work properly?

If this is a known problem, could Blender offer an option to resample video strips with ffmpeg at import so they match the project framerate?

Funkster, your patch seems like a great fit for what I want to do; I'll definitely check it out.

I downloaded your phone video, and I think it does not have a consistent framerate. As aligorith noted, its average framerate is 26.57 fps, but even with Blender set to that rate it doesn't map correctly.

If the phone is dropping frames, all hope is lost of ever synchronising it. I'm not sure if any video container stores a presentation time for each frame, but I doubt it; most video cameras have a stable clock, so it's not required. I have a cheap IP camera whose framerate drops when it's dark, and none of the usual video tools can handle it.
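One way to check whether a clip's framerate is actually constant is to dump the per-frame presentation timestamps that ffprobe reports (a sketch, assuming ffprobe from an FFmpeg build of this era, where the field was named `pkt_pts_time`):

```shell
# Print the first few presentation timestamps of the video stream;
# uneven gaps between successive values indicate a variable framerate.
ffprobe -v error -select_streams v:0 \
        -show_entries frame=pkt_pts_time -of csv=p=0 Phone.mp4 | head -n 10
```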

Bastien Montagne (mont29) closed this task as "Archived". Feb 20 2017, 10:18 AM
Bastien Montagne (mont29) claimed this task.

Yes it can; there is no bug here. This is a user-assistance task and should not be handled via this tracker. User forums like blenderartists.org or blender.stackexchange.com should be used for that matter.