
VSE: audio clipping/distorted after render
Closed, Invalid (Public)

Description

Consulted with Dr. Sybren🐱 @dr.sybren via support chat (under the name "Kite"); was told to file a bug report after he checked the .blend file.
A complete walkthrough of the problem is available here:
https://blender.stackexchange.com/questions/104502/vse-audio-now-not-rendering-properly-clipping-missing-distorted-edited

System Information
Windows 8.1, GTX 755M, i7 processor

Blender Version
Broken: 2.79 5bd8ac9

Short description of error
Playback of audio in the VSE does not match the rendered audio. Audio renders at a much higher amplification than what is heard on playback, and the rendered audio often clips. This is readily apparent if you change the Volume value of a medium/large waveform in the strip's properties panel (N) from 1.0 to 10.0.

The Master Volume in the render options offers a way to fix the issue, but without trial and error it is unknown what value to set it to. There is also an issue with keyframed audio strips not rendering when a large file is rendered (unable to produce a sample .blend at this time; I will not open a ticket for that problem now). Mixdown seems to solve the problem above, but Mixdown has no master-volume control, so clipping occurs.

A potential workaround: open a new scene in the project and import the project as a scene strip, which lets you change the master volume; trial and error is then used to find a master volume that does not clip the rendered audio. Otherwise, the volume of each individual strip with a medium-to-large waveform must be kept manually close to 1.0 to avoid clipping, regardless of what is heard in VSE playback.

Exact steps for others to reproduce the error
-Engage VSE playback and listen to the 3 sets of audio strips at speaker volume (each is the same strip set to different volume levels). No clipping should be apparent.
-Mixdown/render the audio out.
-Check the waveform of the 3 portions in Audacity (or another audio editor) and confirm that clipping occurs.
(Easily repeatable by inserting any audio track into a new .blend file, so long as the waveforms are of similar size; otherwise the 10.0 volume value should be increased/decreased to match.)

Details

Type
Bug

Event Timeline

In your User Prefs, you have System->Sound set to OpenAL (recommended for game engine usage).
Do you still have the same issue with Sound set to SDL (recommended for sequencer use) ?

In your User Prefs, you have System->Sound set to OpenAL (recommended for game engine usage).
Do you still have the same issue with Sound set to SDL (recommended for sequencer use) ?

Changing to SDL did not fix the issue (rendered and tested a few days ago).

Ignore this if it is inappropriate to ask: SDL made the rendered audio and the playback audio in the VSE sound much, much worse, which is easily noticeable when mixing many audio strips. I don't understand why it's recommended (or much of anything about audio), but I would never use something like that voluntarily given the sheer difference in output quality.

Thanks for checking that out.

SDL made the rendered audio and the playback audio in the VSE sound much, much worse,

A very interesting data point! Something we will need to look at (if only to change the tooltip).

Documentation here ( https://docs.blender.org/manual/en/dev/editors/vse/sequencer/strips/sound.html ) is missing information that was available in previous sources:

1.) Volume levels greater than 1.0 raise volume of input strip, in Decibels (dB), by 'value-1' dB.
2.) Volume levels between 0 and 1.0 reduce volume of input strip.

So, if your input strip has 'perfect' volume level, leave the Volume-value at 1.0. If your input file was recorded with low settings, use the Volume setting to raise the dB.

Does this help?

I would first check to see if you have Loudness Equalization on (Link), as that would mitigate some clipping in the realtime preview.

As for the clipping: with the provided file I could not get the same results as https://i.stack.imgur.com/7UnOy.jpg

The shown results look like they've had compression applied.

My test:

Also there is a way to control the render audio/mixdown in Scene Properties.

Cerbyo (Kite) added a comment (edited Mar 29 2018, 12:38 AM).

1.) Volume levels greater than 1.0 raise volume of input strip, in Decibels (dB), by 'value-1' dB.

The other bit of your post I understand; it's just this part here: increases by 'value-1' dB.

Just to dumb it down a bit: what does that mean in terms of the waveform when viewed in something like, say, Audacity?

Below I have a sample: the first section is what I would see/hear as a normal-sounding waveform, and next to it are 10 duplicated sections, each amplified by 1 dB more than the previous (so the far right is +10 dB over the first). My process was to select each section, go to Amplify, and increase it by the amount specified (the last one being +10 dB over the first, as shown).

I would have thought there would be some visual correlation between the numbers here, so that I could mimic one of the waveforms at any stage via Blender's mixdown. So rather than ask, I experimented and tried to see for myself, but I was unable to (visually or numerically).

If there are doubts on my problem please read/follow below my process, it provides a better example of the problem and a better .blend file:

I exported this waveform (picture 1) as an mp3 and brought it into Blender to experiment. I provided a better .blend file showing my process, but I'll use words here to explain. I packed the audio strip's file so everything is included. Process explained (moving left to right in the timeline):
https://blend-exchange.giantcowfilms.com/embedImage.png?bid=4715
-I imported the waveform from picture 1 into the .blend file.
-I mixed this file down as a .flac and then as an mp3, then imported both back into the file as shown (the .flac sounds right; the mp3 on default settings sounds horrid). I just wanted to show that I know the pitfalls of mixdown and can avoid them, so let's ignore the mp3 mixdown and use the .flac.
-I then duplicated the original strip 3 times and set different volumes. Listen to them via playback; they should sound fine.
-I mixed these down as a .flac, opened it in Audacity, and took a picture of the waveform (below). I then exported it and imported it back into Blender as an mp3 to meet upload limits. The mp3 and the .flac sound the same when Audacity does the mixdown (you can confirm this as you recreate the process).
-I copied the strip at volume 10.0 and its mixed-down version (which stays at volume 1.0, of course) and put them side by side in the VSE as the final strips, just to show the extreme example. These SHOULD sound the same in the VSE! They don't! Sound settings should be irrelevant so long as you aren't listening at a volume that blows your eardrums out: you should hear distortion from clipping in the mixed-down version, but the VSE playback of the strip before its mixdown is extremely different! (As noted in the Stack Exchange posting, experimenting with the mixdown options (accuracy 1, S32/F32, etc.) and choosing higher-quality outputs had no effect on the clipping problem.)

Below is a picture of the original AND the 3 mixed-down .flac files at differing volumes in Audacity, so you can see visually what happened to the waveforms at volumes 0.5/2/10. I don't understand the metrics, and couldn't guess what the mixed-down waveform would look like even with the simple values of 0.5/2/10.
So the formula 'value-1' dB should explain these output mixdown waveforms? I'm a visual learner and I don't understand audio, so I'd appreciate any elaboration you could provide on what this means in Audacity's terms (for example, it displays the waveform from -1.0 to 1.0, so how does that formula relate to the display?).

To work safely in Blender, I either need the playback to be the same as the rendered version, OR I need to know what the rendered waveform will be in relation to the values chosen in each strip's volume tab (for 2 reasons: 1. I need to know when it clips; 2. I need to know what it sounds like in relation to other strips it might be mixed with).

I would first check to see if you have Loudness Equalization on (Link), as that would mitigate some clipping in the realtime preview.
As for the clipping; with the provided file I could not get the same results as https://i.stack.imgur.com/7UnOy.jpg
The shown results look like they've had compression applied.
My test:


Also there is a way to control the render audio/mixdown in Scene Properties.

Hi, I only figured out how to use compression the day after uploading that; I didn't modify them in any such way. Hopefully the .blend above will let you recreate my problem, as it is better structured. You can recreate it yourself via the included original file and confirm I didn't do anything like that (using the default Audacity settings).
I'm aware of the scene workaround; it's what I've been using, and it's the best workaround found so far. It's still not enough, though, as I can't tell what the rendered output will sound like without trial and error (see above).
As for Loudness Equalization, I have no such setting available. The only setting that shows up is "disable all sound effects", but that had no effect even when checked.

Sorry, Cerbyo (Kite) , I'll try to say it a different way.

The volume level of a sound file exported from Audacity should not change when imported as a strip in the VSE, if the volume slider in the strip's properties holds the value 1.0. If the original audio strip is properly made, the audio range will run from silence to a maximum of 0 decibels (dB).

If the Strip's volume level is increased to 2.0, the original strip's content will be amplified, such that the original audio signal will be expanded to occupy the range from silence to maximum volume of +1.0 dB. The greater the volume, the wider the range to maximum volume.

At Strip Volume 3.0, a new range of silence to +2.0 dB.
At Strip Volume 4.0, a new range of silence to +3.0 dB. Etc.

I also use Audacity to edit my sound levels for VSE audio strips. The game I play is this: how close can I get to 0 dB without going over? All amplifying and recording systems have a range of sound within which they can accurately re-create and manipulate audio signals. When you go over or into 0 dB (the red area on the VU meter), you are reaching the point where the product designers think accuracy is going to break down.

In my case I do this because my editing is for low-fidelity audio, on old TV sets, used to watch standard NTSC signals on a public-access TV channel. The audio range for a hi-fidelity amplifier might be from -60 dB to 0 dB, but with an old or cheap TV the audio range might be lucky to reach from -40 dB to 0 dB. I want to use as much range as I can, but don't want viewers to have to crank the volume to where they can hear the 60 Hz hum of poorly grounded microphones, or the hum induced in poorly shielded extension cables.

So, I use Blender's VSE to place audio strips where I want them in a project. I then lock down my video and graphics effects so that nothing will be moved around, and render out the audio (only) in full-program-length passes. Then I reassemble the audio in Audacity for corrections, and re-export from Audacity for assembly in the Blender VSE project (Ctrl-H hiding all the original audio strips so the sound is not doubled).

Sorry, gotta go.

Cerbyo (Kite) added a comment (edited Mar 30 2018, 12:56 AM).

I don't fully understand the process or how the community works yet; forgive me on that front. Question: Is this case still open, do I need to give it more time, or is it a case of people being unable to recreate what I described and it being dismissed? I feel like I'm making a big deal out of something that has been this way for a very long time and is deemed a non-issue. In my quest for answers I am asking questions and creating a massive wall of text that normally would only hinder my original issue (which is still a big deal for me). The more I learn (and I'm learning a lot here), the more I conclude that this is a 'purposely' built limitation of the VSE, yet I'm the only one who thinks it's a clear, decisive limitation and a huge deal!

Sorry, Cerbyo (Kite) , I'll try to say it a different way.
The volume level of a sound file exported from Audacity should not change when imported as a strip in the VSE, if the volume slider in the strip's properties holds the value 1.0. If the original audio strip is properly made, the audio range will run from silence to a maximum of 0 decibels (dB).

Yup I understand that.

If the Strip's volume level is increased to 2.0, the original strip's content will be amplified, such that the original audio signal will be expanded to occupy the range from silence to maximum volume of +1.0 dB. The greater the volume, the wider the range to maximum volume.

This is where I don't understand. I've confirmed that rendering at volume 1.0 (the default) will not change the waveform of the imported file. But in terms of raising the volume: if you look at the picture I posted above (picture 2, reposted here), you'll see the rendered waveforms of the top strip at volumes 0.5/2.0/10. From what I understand of your post, a volume of 2.0 adds +1.0 dB. Is this the same as amplifying the waveform by +1.0 dB in Audacity? Because that doesn't add up if it is the same process. Each section to the right of the top waveform is +1.0 dB more than the previous, using Amplify. So why doesn't the rendered 2.0 strip match up with any of them? It should simply be offset by one section, should it not? Visibly it's not; in fact I can't even say what level it starts from in relation to the top waveform. You can run through this yourself using the .blend file and get the same waveform I have.

And how does this work for the 0-to-1.0 range? I'd assume it's a percentage, since 0 silences the original waveform from whatever state it was imported in. Looking at my test and the 0.5 volume render: does that look like 50% of the original waveform? It sort of does. So is it correct that the behavior differs going down from 1.0 versus up from 1.0? Because you can't go down -1 dB per step: you'd hit 0.
And this whole system makes no sense to me! If what you said is true, then why is it set up like this? Shouldn't it be a slider that starts at 0 (the state the strip is imported in) and then simply moves up or down 1 dB at a time from the original state? Then Blender could calculate when the waveform becomes undetectable (the silent state) as a unique negative number per strip, and offer a mute option to reach that state automatically if a strip needs muting for keyframing or something. You could do the same with clipping, so you know when you are hitting a level of 0 dB.

So, I use Blender's VSE to place audio strips where I want them in a project. I then lock down my video and graphics effects so that nothing will be moved around, and render out the audio (only) in full-program-length passes. Then I reassemble the audio in Audacity for corrections, and re-export from Audacity for assembly in the Blender VSE project (Ctrl-H hiding all the original audio strips so the sound is not doubled).

This is what I have been doing to salvage my project, and it is a horrible process. To do this effectively I had to divide the audio into 3 layers (and even that is a compromise for my project) and render out each layer individually at a volume that doesn't clip, then layer and modify each strip accordingly. I need one layer for just the music, making sure no music strips overlap. I need one layer for the voiceovers, and I have to compromise when they do overlap. I need a third layer for any sound effects that would otherwise overlap with the voiceovers/music and need individual tuning. This is a ridiculous process that should not be "necessary", and not something I plan on repeating. It is made even more ridiculous by the fact that VSE playback is presenting the audio the way I want it! Why can't the render sound the same as the playback? You can't modify individual strips if they overlap unless you mix them down as these huge strips, one by one, to keep their place within the project, via Audacity and back into Blender.

All this information is contrary to what I've read online and seen in tutorials. Is this common in video sequence editors? This is the only 'real' one I've tried (that isn't bare-bones). I would have thought this would be a huge deal! How can the VSE be made so powerful and yet so bare-bones in this department? And you do make movies with Blender, do you not? Are you telling me you go through this Audacity process when you make those giant, long movies? I thought Blender was a standalone VSE and audio mixer used to take all the raw data for these movies (pictures, imported videos, imported sound files) to completion.

Even if you prepare your sound files in Audacity so they are as ready as possible (all your vocals at a clear, steady volume, music, etc.), you still need to arrange and fine-tune a lot of things once you start manipulating the strips. If your project has multiple layers of audio mixing down, you are going to be stuck doing the process you described, shoving it back into Audacity in different strips, whether you like it or not.

I mean, this can't be right. People can't actually do things this way. It is like working in the VSE and having it output a different video than the one you see in the preview window (which actually was the reality before I learned about timecodes, but a fix existed as a built-in feature!). That is what I am searching for. I was told I'm crazy; then I was told it's a bug and to come here; now I'm looking crazy again. The fix has to exist somewhere. Where is it?! Every problem I've had in Blender has had a built-in fix that I just hadn't learned yet. Something as huge as this can't be the sole exception.

Responding to Cerbyo (Kite):

Question: Is this case still open, do I need to give it more time, or is it a case of people being unable to recreate what I described and it being dismissed?

I can't speak for the Blender Developers. If I understand the process, there will be a notice put in this thread, when they can make a decision on whether the problem you have described is a bug to be fixed. I agree that it would be great to have more functionality in Blender for processing audio. But I understand that this is not the place for asking for new features.

I chimed in, just to let you know that I didn't think it was a good idea to use a volume level of '10', and expect that there wouldn't be a problem with an audio strip in the VSE.

From what I understand of your post, a volume of 2.0 adds +1.0 dB. Is this the same as amplifying the waveform by +1.0 dB in Audacity? Because that doesn't add up if it is the same process. Each section to the right of the top waveform is +1.0 dB more than the previous, using Amplify. So why doesn't the rendered 2.0 strip match up with any of them? It should simply be offset by one section, should it not?

Audacity has better controls to keep you from taking your audio above the safe area of 0 dB and below.

The screen shots that you are presenting are images of a complex wave form, representing many frequencies at many different volume levels. While they are below the 0 dB level, the relationship of those original frequencies and volumes to each other remain fairly consistent. But as the recorded volume is increased, the original frequencies and tones that had higher volume values to them will be produced less accurately than the things that started off with lower volumes. So, if you render audio strips in the VSE, with the strip's volume levels above 1.0, you are more than likely cutting out any of the tones that might have been recorded properly, at just below 0 dB in the original recording.

Try to understand that the graphical representation you are showing is *not* an increase of the wave by increments of 1 dB. Rather, what you are seeing is the outer limit of the highest volume that has been recorded. Volume controls only increase the theoretical range of a waveform, by a multiplicative process. There will always be an upper limit to how loud the hardware can amplify; distortion is usually an indication that that limit has been reached somewhere in the process.

So, sounds with a lower volume and sounds with a higher volume will not be equally increased by the same amount, when the volume level of the strip is raised. Your original recording was not showing the maximum volume at 0 dB, so, increasing the volume level would not have increased the image in steps of 1 dB (even if they could have been displayed properly).

As for the business of reducing the volume of a strip, with settings between 0 and 1.0 in the audio strip's volume properties: I don't think that the scale can be linear. Decibels are a logarithmic scale, and can theoretically go to negative infinity to reach silence. Just my opinion.

As it stands now, you should not be using Blender's strip volume control for pulling up weak audio if you want the same accuracy that you can get in Audacity.
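As a rough numeric illustration of the logarithmic point above (a minimal sketch that treats the strip Volume as a plain amplitude ratio, which a later comment in this thread confirms; the helper name is hypothetical, not a Blender function):

```python
import math

def volume_to_db(volume):
    """Convert an amplitude ratio (like a strip Volume value) to decibels.
    The scale is logarithmic, so silence (0.0) maps to negative infinity."""
    if volume <= 0:
        return float("-inf")
    return 20 * math.log10(volume)

print(volume_to_db(1.0))   # 0.0 (no change)
print(volume_to_db(0.5))   # about -6.02 dB
print(volume_to_db(0.1))   # about -20 dB
print(volume_to_db(0.0))   # -inf (silence)
```

This matches the intuition above: halving the amplitude is roughly -6 dB, and no finite negative dB value ever reaches true silence.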

After having done some testing:
there is a peculiarity in the way Blender downmixes (and upmixes).

Rendering through Blender

Downmixing (and upmixing???) affects the volume.

Note:
Track names indicate format.
Duplicate track names are FLAC and WAV, FLAC being first.

Explanation/Findings:

Rendering 2 channels to 1 channel increases the amplitude.
Rendering 1 channel to 2 channels decreases the amplitude.

Rendering through Audacity

Downmixing (and upmixing???) Does not affect the volume.

Explanation/Findings:

Rendering 2 channels to 1 channel doesn't change the amplitude.
Rendering 1 channel to 2 channels doesn't change the amplitude.


I would think that Audacity's behavior is preferable?

Correction:

I chimed in, just to let you know that I didn't think it was a good idea to use a volume level of '10', and expect that there wouldn't be a problem with an audio strip in the VSE.

I meant that I thought there *would* be a problem with volume '10' on an audio strip. I don't know if this was an 'auto-correction' error, or my own fat fingers.

Cerbyo (Kite) added a comment (edited Apr 4 2018, 2:06 AM).

Edit: I recorded the VSE playback in Audacity of a volume-1.0 strip (one I had carefully prepared in Audacity and imported into the VSE as a single strip), and lined it up in Audacity with the mixed-down version of that strip. On playback in Audacity, the recorded playback sounded clearer and better, while the mixed-down version had a waveform that was clipping in some spots. Even if I was recording the VSE playback at a lower volume, and thus not getting clipping, who cares! There is no way to change the volume/amplitude of the mixed-down version to make it sound as good, because it clipped. I would have to use the scene method manually to change the volume, and that has countless problems to begin with, including the ending audio getting cut out (so you have to add an extra strip). I'm just going to record the VSE playback from now on and use that. I might still try my method below, but seriously, I can see so many things going wrong; why bother. And here's what I don't get: the volume-1.0 strip was from a mixed-down Audacity file I have open right now, and there is no clipping, and its waveform is much lower than the mixed-down waveform! THAT makes no sense based on my previous tests, and the test below, which say a waveform imported at 1.0 will export at the same volume/waveform level. Contradictions abound. Again, this was .flac at accuracy 1.

My only remaining thought is that OpenAL vs. SDL might have something to do with all this. As I stated earlier, OpenAL sounds much clearer on VSE playback than SDL. Perhaps OpenAL is not making it into the rendering process, so I'm hearing some kind of jumbled SDL output? Perhaps OpenAL is applying some unknown, amazing effect to VSE playback that makes it all sound amazing? But I haven't been able to confirm this by recording SDL vs. OpenAL via Audacity capture. Regardless, I don't know anymore; I'll stick with what is working over what is supposed to be working.

Right. As a workaround for 'all' of the mentioned oddities, limitations, and differences, and in the interests of mimicking the process of programs like Audacity, would this work?

Say I amplify all waveforms in Audacity to their maximum without clipping (an easy process in Audacity), then import these maximized waveforms into the VSE. Then I simply set the volumes of all my strips between 0 and the default 1.0 in the sequencer, alongside my video clips.

This would enable me to:

  1. avoid clipping?
  2. actually arrange the clips in a way where playback via the VSE actually mimics the exported waveforms (maybe?)?

Below is just a quick test I did. I captured a waveform in Audacity and stuck it in the VSE. I duplicated it 5 times and set the strips to volumes 1.0/0.9/0.5/0.1/0. I mixed them down as a basic .flac, but changed the accuracy to 1.

I changed the waveform display (to dB) so it's easier to analyze. Volume 1.0 looks pretty much exactly the same to me between top and bottom (top being the original, bottom being the mixed-down version that went through Blender).
What I did next was play back just the bottom in Audacity and compare it to the playback in the VSE (the version before it was mixed down, so it still has the custom volumes). They sounded the same to me.

So this seems like a viable workaround to me. Is there anything going on that I don't know about that would make this not work? All kinds of weird and crazy things were mentioned here (from my perspective, anyway) that could make anything that appears right be wrong.

Conclusion (assuming no problems with the above):
My perspective at the moment is that volumes above 1.0 do bad things I can't keep track of, so I'll just use the 0-1.0 volume scale, since it seems to operate on a basis that makes sense to me (whether it's a percentage decrease, a logarithm, or whatever) and seemingly emulates what the other programs I use do.
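A quick sanity check on this workaround (a minimal sketch, assuming the strip Volume is a straight per-sample multiplier and that clipping means leaving the [-1, 1] sample range; `clips` is a hypothetical helper, not a Blender function):

```python
def clips(samples):
    """Return True if any sample falls outside the representable [-1, 1] range."""
    return any(abs(s) > 1.0 for s in samples)

normalized = [0.3, -1.0, 0.8]   # peak-normalized, e.g. in Audacity

# A single strip at any volume from 0 to 1.0 cannot clip on its own.
assert not clips([s * 0.7 for s in normalized])

# But two overlapping strips, even at volume 1.0 each, can still sum past the limit.
mix = [a + b for a, b in zip(normalized, normalized)]
assert clips(mix)
print("single strip safe; overlapping mix can still clip")
```

So the 0-1.0 scale keeps each individual strip safe, but overlapping strips still need headroom between them.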

I like your idea, Cerbyo (Kite).
Starting with the best sound possible, and adjusting volumes downward (by ear, or by the numbers), seems reasonable if you trust your hearing. But I have been working with Blender for many years, and would caution you to stay vigilant with visual cues (from the Audacity graphical images) as well. Projects in Blender that have many small clips moved around, plus keyframe adjustments to volume levels, can be taxing on some computer processors.

Here are some things that I have done, in the past, when audio clips have not behaved as they should (in Blender):

  • Refresh the VSE by clicking the "Refresh Sequencer" button in the Sequencer Editor Header.
  • Save your work, whether it is playing properly or not, exit Blender, and reboot your computer. Other programs outside of Blender may be hogging your computer's audio capabilities.
  • And, as a final option, render and mix down all of your audio to one giant audio strip; this is best done in a non-compressed audio format. Blending audio clips, like blending video fades and wipes, requires more processing power than streaming a single audio strip.

I did my first serious non-linear editing with other linux-based editors, that required rendering of audio and video separately, and then muxing them together for the final file. Computers have gotten better and faster, so sometimes it is easy to forget that there are ways to put less stress on the way data flows in computer hardware.

It will be interesting (and possibly scary) to see how the developers will handle multi-threading of data for the VSE, in the future. I want to see improvements to the VSE, but am grateful for how far things have come.

I did my first serious non-linear editing with other linux-based editors, that required rendering of audio and video separately, and then muxing them together for the final file. Computers have gotten better and faster, so sometimes it is easy to forget that there are ways to put less stress on the way data flows in computer hardware.

Are these shortcomings limited to Blender, or is this a process all video sequence editors deal with? This seems really unstable, and contrary to what I learned in tutorials. I would go one step further and suggest that the manual be updated to include details such as the ones in this thread, and that methods of audio mixing be clearly spelled out for the user. The way people like yourself export the strips into Audacity and back into Blender should be spelled out so people know! Especially the facts that audio clips without any auditory or visual indication in the VSE playback/UI, that keyframed audio doesn't play correctly when the system is bogged down by too much data, and all these other wonderful things I'm discovering and wish I hadn't.

I would have thought it would be designed with enough safeguards to work consistently regardless of your PC's limitations. A bad PC rendering it, but rendering it badly compared to a good PC? Normally I would assume that shouldn't happen if the program is stable.

Are we not getting the same result on consecutive renders of the same item with the same settings? Is every render unique?
Is there a best-rendering-practices guide? Rendering multiple things is okay so long as your PC's CPU and RAM are below 60%, right?
I'm guessing that is not true, as I can render one project with my task manager stats all fine and plenty of headroom, yet it still skips keyframes as it renders.
I'd love to bombard people with related and unrelated questions, but I get that this is not the place to do so (unfortunately, I have yet to find the place for such answers either).
Thanks for everyone's time; I've got nothing else to add to the original bug report.

Ok, there is a lot going on here and a lot of the information written here is just plainly wrong. Let me try to summarize: you're setting the volume of an audio strip to 10 and get an output that is clipped. Well, that's expected. I tried this and whatever I do, it always sounds the same - whether I play back in Blender (SDL or OpenAL), mix down (wav or flac) or render to a video (matroska, PCM). In that sense, I cannot reproduce the bug report.

Now let me clear up some wrong information that I saw, while flying over all the comments. I didn't read everything in detail though.

  1. Volume in Blender is always an amplitude ratio and never in dB. If you check https://en.wikipedia.org/wiki/Decibel it will tell you that a volume of 10 in Blender corresponds to a 20 dB amplification. That's what I used in Audacity - it sounds and looks exactly the same as volume 10 in Blender.
  2. Neither SDL nor OpenAL are used for sequence mixing. When playing back in Blender they are only used to play back the mixed result. During rendering/mixdown neither of those libraries are used.
  3. If you have to amplify your sound files in Blender (= setting volume to something above 1) you are doing something wrong, because this means that your source sound files don't use the full spectrum available for the amplitude which decreases the quality of the sound file.
  4. Clipping will always occur if the amplitude is outside the range of -1 to 1. This means that if you have an amplitude smaller than -0.1 or bigger than 0.1 and you set the volume to 10, you will get clipping, since any amplitude will be clipped to -1 or 1 if it's outside this range. The same is true if you have the same audio strip 10 times at the same (temporal) position.
  1. Even if your machine is slow and cannot keep up during playback of a scene, the render/mixdown will always produce the same, correct output that you would get during playback on an unlimitedly powerful PC.
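To make points 1 and 4 concrete, here is a minimal Python sketch of the two relationships (the helper names `ratio_to_db` and `clip` are just illustrative, not part of any Blender API):

```python
import math

def ratio_to_db(volume):
    # Blender's volume is an amplitude ratio; in decibels that is 20 * log10(ratio)
    return 20.0 * math.log10(volume)

def clip(sample):
    # at the output stage, any amplitude outside [-1, 1] is clamped
    return max(-1.0, min(1.0, sample))

print(ratio_to_db(10.0))  # 20.0 -> volume 10 is a 20 dB amplification
print(clip(0.15 * 10.0))  # 1.0 -> a 0.15 sample at volume 10 clips
```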

So, let's summarize and let me know if I missed something.

Clipping is expected here and not a bug. Yes, it would be nice to have a master audio meter or so during playback that shows you when clipping occurs, but there are currently no resources to add this feature to Blender. If you need sophisticated audio editing, you are better off with a tool made specifically for that, like Ardour. Using JACK transport you can even sync it with Blender. Bug reports are not for feature requests, so I would close this bug.

If the output of playback, rendering and mixdown is different, I would consider this a bug. I could not reproduce this. Maybe it's platform specific? Can any developer using windows reproduce this?

Thanks @Joerg Mueller (nexyon) for the clarification.

I think the root cause, based on some educated guessing, is how Blender mixes channels.

In @Cerbyo (Kite)'s original BSE report there is a screen shot with the caption:

"I need to mix all this audio down so it isn't clipping/distorting/missing: problem is it sounds fine as is in the vse playback, but after rendered it is infact clipping/missing/distorting audio strips."

https://i.stack.imgur.com/wQgiH.jpg


Doing the quick test described in T54418#491145 would support @Cerbyo (Kite)'s reported issue:

What I hear on playback in the vse is now different from what I get after I render.

On another note, given the vast complexities of audio mastering, I would think it's best practice to use the 'split channels' option on export and do the final mixing in dedicated audio software.

Ok, changing the number of channels is another story. Blender supports surround sound (=more than two channels) in contrast to Audacity.

When the number of channels is changed, the total power (not amplitude) is kept at the same level. This means that if you go from mono to stereo, splitting one channel into two, you get the same signal but at lower volume on each channel. Going the other way, merging stereo to mono, the power of the two stereo channels is added and not averaged. This approach is volume preserving and is also what OpenAL does, though Blender doesn't use OpenAL to do this in the sequencer.

Audacity doesn't do anything like this: when splitting (mono to stereo) it simply copies the channel, and when merging it simply averages the amplitudes. This is an OK approach as long as you only have mono and stereo, but it fails as soon as you have more than two channels.

I would say that Blender behaves better than Audacity here, but if you are used to the Audacity behavior, it's understandable that you expect a different outcome. You can try to go mono -> stereo mixdown and then use that stereo to do a stereo -> mono mixdown. The result should be the same as the original mono file, except for compression and slight numerical artifacts. If this is the case, I suggest closing this bug report, since there doesn't seem to be a bug.
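The power-preserving split/merge described above can be sketched as follows. Note that the 1/sqrt(2) gain is my assumption of the usual constant-power convention; the exact factors Blender's mixer uses aren't stated in this thread:

```python
import math

SQRT2 = math.sqrt(2.0)

def mono_to_stereo(sample):
    # split one channel into two, scaling each by 1/sqrt(2)
    # so the total power stays the same (assumed convention)
    return (sample / SQRT2, sample / SQRT2)

def stereo_to_mono(left, right):
    # sum the channels (not average), again with a power-preserving gain
    return (left + right) / SQRT2

# round trip: mono -> stereo -> mono should give back the original sample
original = 0.8
l, r = mono_to_stereo(original)
restored = stereo_to_mono(l, r)
print(round(restored, 6))  # 0.8
```

This matches the round-trip test suggested above: mono to stereo and back to mono reproduces the original, up to numerical noise.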

...This approach is volume preserving and is also what OpenAL does, though Blender doesn't use OpenAL to do this in the sequencer.

This is also true when downmixing? If so, I believe this answers this task's issue.

Joerg Mueller (nexyon) closed this task as Invalid. Apr 10 2018, 8:12 PM
Joerg Mueller (nexyon) claimed this task.

Ok, then I'll close this. There is no difference in the processing of the strips between mixdown and rendering. Not even compared to playback; the only difference there is that the user settings are used for channels and sample rate.

Cerbyo (Kite) added a comment. Edited Apr 13 2018, 4:57 AM

I don't understand why you said

  1. If you have to amplify your sound files in Blender (= setting volume to something above 1) you are doing something wrong, because this means that your source sound files don't use the full spectrum available for the amplitude which decreases the quality of the sound file.

I don't understand this. Why does the meter allow 1 to 100 if you aren't supposed to use it?
So it's only okay to decrease from a volume of 1 down to 0? So I should maximize my audio strips so they are right before clipping in Audacity, then toss them in Blender and only decrease the volume or keep it the same? And someone said to split channels... but then someone said you can't do that if I use more than 2 channels in the VSE...
What do I do exactly to salvage my audio?

If I have 3 audio strips on 3 channels and all 3 are stereo audio, how do I mix down safely so it sounds the same as in the VSE?

I also still don't understand the solution to my original problem. The Blender VSE has 32 channels to use. Are you saying that I can only mix 2 of those channels for stereo audio to get a proper stereo mix in Blender? All my imported audio is stereo. Is there no way to grab the audio from VSE playback, short of recording it in Audacity, when I'm using more than 2 of the 32 channels?

If I want my playback as my render... I can't have that because it doesn't support ALS or any of the methods used for playback? That's all I want: my playback as my render. How do I get that? Can you suggest a program I can use to grab what I have in Blender then? Right now I just have Audacity record what is output through my speakers, but it sometimes jumps every so often, requiring some pesky layering techniques. If my playback is what I want, a solution is just having a way to grab the playback... so what is the best way to grab the playback?

If I render out each channel as an audio strip separately at 1.0 and merge them in Audacity, is that the best way then? If I change the volumes between 0 and 2.0 and they don't clip, and I use the method above for each channel (so no mixing), will I be safe?
But that'll still sound different from my playback, won't it? Because anything imported into Blender has ALS applied to it, which is a special thing I can't replicate and which sounds different from the original file before it's imported? In which case I just want what the playback sounds like, since it sounds best... can you please propose the best method for doing so?

If your source audio file isn't using the full dynamic range of the amplitude, it is basically wasting quality. Most audio files have a dynamic range of 16 bits. If you have to double the amplitude (volume in Blender = 2), you are already wasting one of these bits. At 10, you are wasting more than 3 bits. So for any recording engineer the goal is to utilize the whole dynamic range in order not to lose any quality during recording. When you have a file that doesn't use the full dynamic range, increasing the volume in Audacity instead of Blender doesn't make any difference in terms of outcome, so you could just do it in Blender directly. The problem is of course that you don't know by how much, so using Audacity makes sense after all. As soon as you have normalized the volume in Audacity, you are using the full dynamic range, and going higher in Blender then leads to clipping.
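The "wasted bits" arithmetic above is simply the base-2 logarithm of the required amplification (a back-of-the-envelope sketch, not Blender code):

```python
import math

def bits_wasted(volume):
    # amplifying by `volume` means the top log2(volume) bits of the
    # source's 16-bit dynamic range were never used
    return math.log2(volume)

print(bits_wasted(2))   # 1.0 -> doubling wastes one bit
print(bits_wasted(10))  # ~3.32 -> volume 10 wastes more than 3 bits
```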

All I wrote above only considers one audio file/strip. But the same is true for the whole sequence that you are creating in Blender. Mixing audio is basically adding the amplitudes, and if the sum of the amplitudes at one specific time point goes above 1, you get clipping again. The advantage of Blender is that it doesn't compute with just 16 bits, but with a higher resolution and with amplitudes that can be bigger than 1. The clipping basically occurs at the end, when the audio signal is sent to the speakers or encoded in a file. So if you have clipping in your scene, you can simply reduce the master volume instead of reducing the volume of each strip.
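As a toy model of that pipeline (sum the strip amplitudes at full precision, apply the master volume, clip only at the very end; this is not Blender's actual mixing code):

```python
def mix(strips, master_volume=1.0):
    # sum amplitudes sample by sample; intermediate values may exceed 1.0
    length = max(len(s) for s in strips)
    mixed = [sum(s[i] for s in strips if i < len(s)) for i in range(length)]
    # the master volume is applied before the final clamp to [-1, 1],
    # so lowering it can avoid clipping without touching each strip
    return [max(-1.0, min(1.0, v * master_volume)) for v in mixed]

loud = mix([[0.7, 0.6], [0.7, 0.2]])
safe = mix([[0.7, 0.6], [0.7, 0.2]], master_volume=0.5)
print(loud[0])  # 1.0 -> the summed first sample (1.4) was clipped
print(safe[0])  # 0.7 -> reduced master volume avoids the clipping
```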

With that information, let me try to answer your questions. Some people record videos with very bad hardware and then want to use audio of someone talking far away, so it's very quiet. They might want to amplify that by 100, but the quality will be very bad and they know that. A volume between 0 and 1 guarantees that no clipping occurs for a single strip. If you have more strips, that safe range shrinks by a factor of the number of strips playing in parallel, but it's unlikely that this strict limit is necessary, as it is unlikely for two or more strips to reach their maximum amplitude at exactly the same time.

Then you are confused about channels. When I write about channels, I'm talking about the number of audio channels in an audio file or of a strip. For example mono is one channel, stereo is two channels and 5.1 surround sound is 6 channels. It has nothing to do with how many strips you can have above each other in the VSE.

If all your audio is stereo and you mix down to stereo, you also have to use stereo during playback. If you have a 5.1 surround system connected to your computer, then the stereo signal is split up into four of those 6 channels (front left and back left for stereo left and the same for the right side). Therefore clipping occurs later, as the signal is split up. Use stereo playback to hear the same during playback as after mix down. If you have such a system and you can hear audio coming from the back speakers, then this is what is happening.
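A sketch of that stereo-to-5.1 routing. The channel order (FL, FR, C, LFE, BL, BR), the silent center/LFE, and the 1/sqrt(2) per-speaker gain are all my assumptions, chosen so each side's power is preserved as described above:

```python
import math

G = 1.0 / math.sqrt(2.0)  # assumed constant-power gain per split

def stereo_to_surround(left, right):
    # left feeds front-left and back-left, right feeds front-right and
    # back-right; center and LFE stay silent (assumed routing)
    return (left * G, right * G, 0.0, 0.0, left * G, right * G)

fl, fr, c, lfe, bl, br = stereo_to_surround(0.9, 0.3)
# each side's total power matches the source channel's power
print(round(fl * fl + bl * bl, 6))  # 0.81
```

Because each speaker carries only part of the signal, clipping occurs later on a surround system than on plain stereo playback, which is the effect described above.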

I don't know what you mean by "methods used for playback" and what ALS is.

So this is what I'm getting from all this:

  1. The 32 channels of the Blender VSE are called channels, but they aren't the same as audio channels (e.g. mono is 1 audio channel, stereo is 2).
  1. A strip imported into Blender retains its number of audio channels and displays as a single strip in the VSE. So even though all audio strips display themselves (when you draw the waveform) as a mono strip, they retain whatever their original number of audio channels may be (2 for stereo in my case).

Questions


  1. If I want to merge 10 audio strips on top of each other that are all stereo, it was suggested I use mixdown and select the split tracks option. What good is that, though? I don't understand how that changes things beyond giving me 2 mono tracks that I would then merge into a stereo track anyway via Audacity or whatever. The left channels would still add up the same way in Blender if I did them as a single stereo track, wouldn't they? Same with the right. Was the point to mix down every single VSE channel as 2 mono strips and then combine 20 channels of audio in Audacity? How would Audacity even know which of those 20 are the right and left sections of the stereo strip it's going to make? I just see this as mashing everything together in no order and outputting a stereo track with both its channels more or less asymmetrical.
  1. If I import a mono strip, Blender always exports a stereo strip. There is no way, that I can find, for me to change this. I can split the channels via the mixdown options, but then I have 2 mono strips with lower-volume (smaller) waveforms than what I want, and merging those back into a single mono strip via Audacity creates something bigger (more amplified) than what I want. And it sounds a bit off when I compare it after changing the amplitude back to the original input's waveform size. Selecting mono under the strip properties had no effect. The strip had a few cuts made to it, but it all sits in a single 'VSE channel' and all segments have mono selected under the N panel. And my test render is just of a segment of the first strip, which is solid with no cuts/modifications. To determine the number of channels of the output I stick it in Audacity; it is always stereo.

https://blender.stackexchange.com/questions/106534/how-do-i-render-out-a-single-mono-audio-track?noredirect=1#comment188129_106534

  1. You have to know what you want to do... If you can do the audio editing in Blender, do it there, if not use some other program.
  1. You can set the scene audio channels under properties -> scene -> audio -> format -> audio channels.
Cerbyo (Kite) added a comment. Edited Apr 21 2018, 3:46 PM
  1. You can set the scene audio channels under properties -> scene -> audio -> format -> audio channels.

Hi, I don't have any such options. I get as far as the audio tab, but then I can't find any format or audio channels options. There is only the ability to split channels, as I mentioned before. Using it on a mono track nets me 2 tracks, as mentioned, since it's always assuming my mono track is stereo.