VBV Rate Control

Part 1

<06/05/07 12:08pm> Manao | he is negating a float by printing it, adding a "-" to the string, and converting it back to float

<06/05/07 12:08pm> Manao | in the process, of course, he doesn't free the allocated string

<06/05/07 12:11pm> dynaflash | Hello, I was wondering if I might ask a question ?

<06/05/07 12:12pm> Manao | no, you can ask only one question, and you spent it by asking if you could ask a question


<06/05/07 12:13pm> dynaflash | touché. I am one of the HandBrake developers, and we are trying to control the ABR output for our AppleTV preset.

<06/05/07 12:13pm> dynaflash | rhester and I have been doing a lot of testing with vbv-maxrate and vbv-bufsize and have been getting somewhat unpredictable results.

<06/05/07 12:14pm> Manao | unpredictable how ?

<06/05/07 12:15pm> dynaflash | Well, we are attempting to use an abr of 3000. with a vbv-maxrate=4800 bufsize=7000 and are still getting periodic bitrate spikes up around 8000 kbps

<06/05/07 12:16pm> Manao | how long are the spikes ?

<06/05/07 12:17pm> dynaflash | They seem fairly short, on high motion/complex scenes, but long enough to drain the device's buffer and cause temporarily paused video

<06/05/07 12:17pm> dynaflash | rhester is doing the same for our iPod preset

<06/05/07 12:17pm> dynaflash | Was mainly hoping you might be able to point me to some documentation regarding those options

<06/05/07 12:17pm> dynaflash | google and doom9 have shown sparse results, so I thought I might go right to the source

<06/05/07 12:19pm> Manao | a VBV of 7000kbit with a max rate of 4800 means that over 1.5 seconds, the bitrate can reach 9600 kbps

<06/05/07 12:19pm> Manao | x264 should try to respect the VBV buffer

<06/05/07 12:19pm> Manao | if it doesn't, afaik, it's a bug

<06/05/07 12:19pm> Manao | still, the vbv configuration you use allows for high bitrates over rather long periods

<06/05/07 12:21pm> dynaflash | If I wanted to try to keep my spikes closer to the 5000 kbps level, what might you suggest I try ?

<06/05/07 12:21pm> Manao | CBR manages to enforce VBV, and i haven't seen it fail

<06/05/07 12:21pm> Manao | how do you define spike ? ie, on what duration do you compute the average bitrate ?

<06/05/07 12:21pm> Manao | also : do you know how a VBV buffer works ?

<06/05/07 12:23pm> dynaflash | Frankly, no.

<06/05/07 12:23pm> Manao | ok, so lets start with that:

<06/05/07 12:23pm> Manao | each time you encode a frame that is XXX bits big, you add XXX bits to the VBV buffer

<06/05/07 12:23pm> Manao | each time you encode a frame, you remove VBV_buffer_bitrate * frame_duration bits

<06/05/07 12:23pm> Manao | the VBV must never fill completely

<06/05/07 12:25pm> dynaflash | okay

<06/05/07 12:25pm> Manao | so, let's say you have a VBV buffer with a bitrate of 1mbps, and which is 1 mbit big

<06/05/07 12:25pm> Manao | if it starts empty, during one second, if you have a bitrate of 2 mbps, you'll have added 2mbit, and removed 1mbit, so the VBV will be filled

<06/05/07 12:25pm> Manao | so, the max bitrate on 1 second is 2 mbps.

<06/05/07 12:25pm> Manao | over 2 seconds, you can, at most, have a bitrate of 1.5 mbps ( +3, -2 )

<06/05/07 12:26pm> superdump | Manao: i'll put this on a page on the multimedia wiki in a bit

<06/05/07 12:26pm> Manao | and so on

<06/05/07 12:26pm> Manao | the longer the period of time you consider, the lower the average bitrate can be

<06/05/07 12:26pm> Manao | but, over a very short amount of time, the VBV doesn't really constrain the bitrate

<06/05/07 12:26pm> Manao | for example, a single frame can be as big as the VBV size, thus having a bitrate of VBV size / frame duration ( in my case, at 25fps, it's 25 mbps )
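
The buffer model described above can be sketched in a few lines of Python. This is only an illustration of the chat's description, not x264's rate control: the function name is hypothetical, the check is made once per frame after the drain, and a buffer that is exactly full is treated as still acceptable (an assumption made so that the 2 mbps and 1.5 mbps examples land on the limit).

```python
def first_vbv_overflow(frame_bits, vbv_bitrate, vbv_size, fps):
    """Return the index of the first frame that overfills the VBV, or None.

    Model from the chat: every frame adds its size to the buffer, and every
    frame interval removes vbv_bitrate / fps bits from it.
    """
    drain_per_frame = vbv_bitrate / fps
    fill = 0.0
    for index, bits in enumerate(frame_bits):
        fill += bits                              # the frame enters the buffer
        fill = max(fill - drain_per_frame, 0.0)   # the buffer drains, never below empty
        if fill > vbv_size:
            return index
    return None


# 1 mbps / 1 mbit VBV at 25 fps, fed a constant 2 mbps stream (80,000 bits per frame).
one_second = [80_000] * 25
print(first_vbv_overflow(one_second, 1_000_000, 1_000_000, 25))             # None
print(first_vbv_overflow(one_second + [80_000], 1_000_000, 1_000_000, 25))  # 25
```

One second at 2 mbps exactly fills the buffer, and the next frame at that rate overflows it, which matches the "+2, -1" arithmetic above.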

<06/05/07 12:26pm> Manao | every device should indicate which combination of VBV buffer size / VBV bitrate they can support

<06/05/07 12:26pm> Manao | that is usually done by saying a device supports level

<06/05/07 12:29pm> dynaflash | 3.1 in the case of the appleTV

<06/05/07 12:31pm> Manao | then in the AVC reference document, you'll find that the max vbv buffer size is 14000 kbit, and the VBV bitrate is 14000 kbps

<06/05/07 12:32pm> dynaflash | I believe we were in error setting vbv-maxrate to just under the video bitrate limit of 5000 mbps

<06/05/07 12:32pm> dynaflash | sorry, I mean 5000 kbps

<06/05/07 12:34pm> Manao | mmm, i didn't think 3.1 was allowing so high a bitrate

<06/05/07 12:34pm> Manao | anyway, with your settings, i don't think the bitrate spikes are a bug, and i don't think they should matter on the apple tv

<06/05/07 12:36pm> dynaflash | I should say, we have been using 3.1

<06/05/07 12:36pm> dynaflash | Apple specs it as:

<06/05/07 12:36pm> dynaflash | H.264 and protected H.264 (from iTunes Store): Up to 5 Mbps, Progressive Main Profile (CAVLC)

<06/05/07 12:37pm> Manao | umpf

<06/05/07 12:37pm> Manao | hates apple

<06/05/07 12:37pm> Manao | ok, up to 5mbps means nothing in itself, since you need the buffer size

<06/05/07 12:37pm> LordRPI | I thought that stupid apple TV was supposed to be supporting CABAC. :shudders:

<06/05/07 12:37pm> Manao | but, i would try 5 mbps / 5mbit as vbv bitrate / vbv buffer size

<06/05/07 12:37pm> Manao | LordRPI: it seems it's just an oversized ipod

<06/05/07 12:38pm> LordRPI | good thing I'm not buying one!

<06/05/07 12:38pm> dynaflash | So vbv-maxrate=5000:vbv-bufsize=5000

<06/05/07 12:38pm> dynaflash | We have been feeding it cabac successfully except for stuttering dropped frames that correspond to bitrate spikes of > 8000 in vlc

<06/05/07 12:38pm> dynaflash | in some cases up to 12000

<06/05/07 12:38pm> dynaflash | So, our quest is to try to minimize those spikes with our 2 pass 3000 kbps abr encodes

<06/05/07 12:40pm> Manao | tsss, what did i say about the definition of spike ?

<06/05/07 12:40pm> Manao | you need to tell me on what duration you compute a spike

<06/05/07 12:41pm> dynaflash | blushes

<06/05/07 12:41pm> dynaflash | well, so far, just using vlc's statistics panel

<06/05/07 12:41pm> superdump | Manao: what duration does vlc display its bit rate readout over?

<06/05/07 12:42pm> Manao | superdump: if i'm a betting man : 1 seconds


<06/05/07 12:42pm> dynaflash | yes, seems to poll every second, so its hard for me to say tbh

<06/05/07 12:42pm> Manao | but i'll ask fenrir tomorrow, i don't want to checkout the code just to check that

<06/05/07 12:42pm> Manao | dynaflash: 5000/5000 will allow 1 second spike of 10000 kbps

<06/05/07 12:42pm> Manao | 5000/2500 will allow 1 second spike of 7500 kbps

<06/05/07 12:44pm> dynaflash | ah, therein lies another question. rhester was of the opinion that if the bufsize was lower than the maxrate, x264 would ignore both. Do you know if this is true ?

<06/05/07 12:44pm> Manao | it's wrong

<06/05/07 12:44pm> dynaflash | glad I asked


<06/05/07 12:45pm> superdump | i thought he thought x264 ignored vbv max rate if vbv buf size was _not_ set ...?

<06/05/07 12:46pm> dynaflash | yesterday he also told me that eddieg's attempt would yield nothing as the bufsize was set lower than the maxrate.

<06/05/07 12:46pm> superdump | oh

<06/05/07 12:47pm> dynaflash | I was surprised as well. Also, to confirm, maxrate without a bufsize will cause maxrate to be ignored. Correct ?

<06/05/07 12:50pm> Manao | i'm checking it

<06/05/07 12:50pm> Manao | if i'm not mistaken :

<06/05/07 12:51pm> dynaflash | thnx. I ask because Sharktooth specifies vbv-maxrate without vbv-bufsize in some of his profiles. and for us, x264 throws a warning

<06/05/07 12:51pm> Manao | 0 < vbv-buff-size < vbv-maxrate / 3 -> vbv-buff-size = vbv-maxrate

<06/05/07 12:51pm> Manao | erm

<06/05/07 12:51pm> Manao | 0 < bufsize < maxrate * 3 / fps --> bufsize = maxrate * 3 / fps

<06/05/07 12:51pm> Manao | said otherwise, the buffer size must contain 3 frames at the average bitrate

<06/05/07 12:51pm> Manao | if bufsize = 0, maxrate doesn't define a VBV

<06/05/07 12:53pm> dynaflash | okay, that is helpful

<06/05/07 12:53pm> Manao | and it is set to 0
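
The two rules quoted above can be restated as a small Python helper. This is a paraphrase of what Manao reports from the x264 source at the time of the chat, not a copy of that source; the function name is hypothetical, and current x264 versions may behave differently.

```python
def effective_vbv(maxrate_kbps, bufsize_kbit, fps):
    """Return the (maxrate, bufsize) pair that would actually be used.

    Rules as stated in the chat:
      * bufsize == 0: maxrate does not define a VBV and is set to 0.
      * 0 < bufsize < maxrate * 3 / fps: bufsize is raised to maxrate * 3 / fps.
    """
    if bufsize_kbit <= 0:
        return 0, 0                           # no buffer size: VBV disabled
    min_bufsize_kbit = maxrate_kbps * 3 / fps  # room for three frames
    return maxrate_kbps, max(bufsize_kbit, min_bufsize_kbit)


print(effective_vbv(5000, 0, 25))     # (0, 0)
print(effective_vbv(5000, 100, 25))   # (5000, 600.0)
print(effective_vbv(5000, 2500, 25))  # (5000, 2500)
```

The third call shows that a bufsize lower than maxrate is accepted unchanged as long as it holds at least three frames.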

<06/05/07 12:53pm> dynaflash | thank you

<06/05/07 12:53pm> superdump | is that how the hardware/spec works or how x264 works?

<06/05/07 12:53pm> Manao | superdump: ?

<06/05/07 12:53pm> superdump | the 3 frames thing

<06/05/07 12:54pm> Manao | that's because a vbv must be big enough to contain an I frame at the average I bitrate, and the I bitrate is usually 3 times the P/B bitrate ( if not more )

<06/05/07 12:54pm> Manao | so setting a VBV under three frames will result in unpreventable ugliness

<06/05/07 12:55pm> dynaflash | lol

<06/05/07 12:55pm> dynaflash | Manao: thank you so much for your help with this. I really do appreciate it.

<06/05/07 12:58pm> Manao | np

<06/05/07 12:59pm> dynaflash | was nervous even asking

<06/05/07 12:59pm> superdump | there's no need to be nervous if your question isn't stupid and you're rational

<06/05/07 12:59pm> superdump | many handbrake users should be very nervous, for example

<06/05/07 01:00pm> dynaflash | spent a lot of time googling before coming in here

<06/05/07 01:02pm> superdump | Manao: "0 < bufsize < maxrate * 3 / fps --> bufsize = maxrate * 3 / fps" <--- is that from the h.264 spec or elsewhere?

<06/05/07 01:09pm> Manao | that's from x264's source code

<06/05/07 01:11pm> dynaflash | both values are specified in kb, correct ?

<06/05/07 01:14pm> Manao | one in kbps, the other in kbit

<06/05/07 01:15pm> dynaflash | great, thanks again. This really helps us as we were obviously going down the wrong road

Part 2

<06/06/07 12:45pm> rhester | Manao: You there?

<06/06/07 12:50pm> Manao | yes

<06/06/07 12:51pm> rhester | Manao: Understand you had a very good conversation with from the HB development group yesterday. I've been reviewing this for a loooooooong time, and I took the opportunity to review the chat log, but I still can't seem to make the math work (re: x264 vbv rate control). Would you have a few moments to clarify a couple of points?

<06/06/07 12:53pm> Manao | yep

<06/06/07 12:54pm> rhester | Just to set the tone: The real problem we're trying to solve is a) Apple's complete hiding of the actual specs, and b) local bitrate spikes that cause the Broadcom decoder chip to choke. It's the same story with the Apple TV and the iPod, though it's easier to see it on the latter.

<06/06/07 12:54pm> rhester | So...source material, in my case, is any HBO title with black-and-white content...a very good test case is From the Earth to the Moon (original 4:3 edition), title 1, disc 1.

<06/06/07 12:54pm> rhester | The two 'problem areas' are as follows:

<06/06/07 12:54pm> rhester | - The opening HBO logo - 8 seconds duration of digital black-and-white "snow" with matching audio effect. Even with 1500kbps ABR, this cripples the iPod. The only way to date I've been able to solve it is CBR, which murders the remainder of the content.

<06/06/07 12:54pm> rhester | - A black-and-white scene showing the first manned Russian cosmonaut post-landing.

<06/06/07 12:54pm> rhester | In fact, I've found virtually all black-and-white content seems to drive bitrate more than color, though it's a mystery to me why.

<06/06/07 12:54pm> rhester | So what we're trying to get to is a way to cap the *absolute* maximum bitrate spike over time n, where n is impossibly small.

<06/06/07 12:54pm> rhester | So, from your discussion yesterday, let me interpret what you said and tell me where I'm full of it

<06/06/07 12:56pm> Manao | don't worry about color, the HBO logo exist with colors, and is a killer too

<06/06/07 12:57pm> rhester | You mentioned 5000/5000 allows a 1-second spike of 10000 kbps, and 5000/2500 allows a 1 second spike of 7500. Why is that? If the second number is the buffer size and the first is measured in kbit/sec, wouldn't doubling the declared maxrate to determine max over 1 second only hold if the bufsize matches the maxrate?

<06/06/07 12:58pm> Manao | OK, first, a vbv can be defined with a bitrate/size or with a bitrate/duration

<06/06/07 12:58pm> Manao | we have duration = size/bitrate, or size = duration*bitrate

<06/06/07 12:59pm> rhester | OK, follow so far

<06/06/07 12:59pm> Manao | so 5000/5000 is a 1 second vbv buffer with a bitrate of 5000 kbps

<06/06/07 12:59pm> rhester | OK, one sec

<06/06/07 12:59pm> rhester | So 5000/5000 can spike to 10000 if you assume the vbv buffer is full at start and empties and refills in that one second, correct?

<06/06/07 12:59pm> Manao | now the maximum bitrate over a period T is when you start the period with an empty buffer, and end the period with a full buffer

<06/06/07 12:59pm> rhester | (well, technically it would empty twice)

<06/06/07 01:00pm> Manao | in such a case, the amount of data that pass through the buffer in that period is : buffer size + T * buffer bitrate

<06/06/07 01:00pm> Manao | or, said otherwise, (T + buffer duration) * buffer bitrate

<06/06/07 01:00pm> Manao | and, over that period of time T, the bitrate will be :

<06/06/07 01:00pm> Manao | (T + buffer duration) * buffer bitrate / T

<06/06/07 01:00pm> Manao | so, 5000/2500 <=> 5000/0.5 sec,

<06/06/07 01:02pm> Manao | mmmm

<06/06/07 01:03pm> rhester | But that means that absolute max for 5000/2500 would be 10000 instead of 7500, right? 5000/0.5 secs for 1 second = 10000

<06/06/07 01:03pm> rhester | (BTW, you're educating a lot of us...most are remaining respectfully silent, I don't mind looking stupid

<06/06/07 01:03pm> dynaflash | obviously neither did I

<06/06/07 01:03pm> Manao | yesterday, i must have made some miscalculations

<06/06/07 01:03pm> Manao | anyway, with 5000/2500, you get 1.5 * 5000 / 1 = 7500 kbps over 1 seconds

<06/06/07 01:05pm> rhester | OK, here's where my ignorance comes in...where did the 1.5 come from?

<06/06/07 01:05pm> Manao | T + buffer duration

<06/06/07 01:05pm> Manao | T = 1 second

<06/06/07 01:05pm> Manao | buffer duration = 0.5 second
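
The formula Manao gives, (T + buffer duration) * buffer bitrate / T, can be checked numerically. A minimal Python sketch, with a hypothetical function name and the maxrate/bufsize pairs taken from the chat (kbps and kbit):

```python
def max_average_bitrate(maxrate_kbps, bufsize_kbit, period_s):
    """Highest average bitrate the VBV allows over a period of period_s seconds."""
    buffer_duration_s = bufsize_kbit / maxrate_kbps
    return (period_s + buffer_duration_s) * maxrate_kbps / period_s


print(max_average_bitrate(5000, 2500, 1.0))  # 7500.0
print(max_average_bitrate(5000, 5000, 1.0))  # 10000.0
```

Both results agree with the figures quoted the previous day: 7500 kbps for 5000/2500 and 10000 kbps for 5000/5000, each over one second.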

<06/06/07 01:05pm> rhester | Right

<06/06/07 01:05pm> rhester | got it

<06/06/07 01:05pm> rhester | So going back to the damned-HBO-logo-on-iPod problem

<06/06/07 01:05pm> dynaflash | cause 2500 is half of 5000 ?

<06/06/07 01:06pm> rhester | dyna: Yes, that's the equation that shows the relationship of rate to bufsize that we were discussing on

<06/06/07 01:06pm> Manao | 2500 with a bitrate of 5000 means the vbv has a duration of half a second

<06/06/07 01:06pm> dynaflash | oh, that fills in a gap for me, thanks.

<06/06/07 01:07pm> Manao | ok, the HBO logo is uniformly very hard to encode

<06/06/07 01:07pm> rhester | Here's basically my goal:

<06/06/07 01:07pm> rhester | I don't mind if the logo sequence itself gets encoded really poorly, as long as a) the remainder of the content is treated 'fairly' and b) the logo doesn't cause such a spike it kills the chip, which is where we're at. Right now, I either ABR and freeze the chip, or I CBR and avoid freezes but also destroy the rest of the content.

<06/06/07 01:07pm> rhester | I'm trying to use vbv as a way to mitigate that

<06/06/07 01:08pm> Manao | ok, that's indeed how you should use it

<06/06/07 01:08pm> Manao | the thing is, I don't know exactly how x264 behaves in abr with a vbv

<06/06/07 01:08pm> Manao | what will surely happen in the logo's case is that the VBV will be empty before the logo, since the video is usually black

<06/06/07 01:10pm> rhester | If the iPod's max ABR is 1500, and I don't know the -real- ceiling or buffer size...where do I start to experiment? It sounds like 750/750 would deliver a local maxrate of 1500 in 1 sec, but within that 1 second, there could be a very wide variance in individual frame size which is actually what's freezing the playback...is that a reasonable assumption? (I'm trying to establish that 750/750 won't fix the actual issue)

<06/06/07 01:10pm> Manao | so it's really the worst case scenario, since x264 will fill the VBV a little at the start of the logo

<06/06/07 01:11pm> rhester | Absolutely the worst-case scenario, agreed - it's why it's been giving me fits for 8 months

<06/06/07 01:11pm> Manao | don't worry about very short amount of time : what kills the chip is a high bitrate over at least half a second - imho

<06/06/07 01:12pm> rhester | OK, that's consistent with my playback trials - it always plays back at -least- a half-second before the freeze

<06/06/07 01:12pm> Manao | 750/750 is too restrictive, since you can't have an average bitrate over the vbv bitrate

<06/06/07 01:12pm> Manao | so you do want the vbv bitrate to be high

<06/06/07 01:12pm> rhester | I see

<06/06/07 01:12pm> Manao | it's just the vbv size / duration that must be small

<06/06/07 01:12pm> Manao | but

<06/06/07 01:12pm> Manao | the vbv mustn't be too small

<06/06/07 01:13pm> rhester | Right...I know there's a lower limit I believe in the mid-130s, which is still not large enough to hold 3 frames

<06/06/07 01:13pm> Manao | what i would recommend is to make it 0.2/0.3 seconds big

<06/06/07 01:13pm> Manao | so, something like 1300/400 may do

<06/06/07 01:15pm> rhester | I get how you're deriving the bufsize (if I'm using 1500 ABR, a tenth of a second is 150kbit, two tenths is 300, three is 450, etc.) - but how did you arrive at the 1300?

<06/06/07 01:15pm> Manao | that would mean, over half a second, a max bitrate of 2100

<06/06/07 01:15pm> rhester | And if I use a vbv-maxrate of 1300, doesn't that mean I have to limit my ABR bitrate (-bitrate) to 1300 as well, harming overall quality for the full length of the feature?

<06/06/07 01:16pm> Manao | rhester: the 1300 is a tradeoff : you can't afford to have a too small VBV

<06/06/07 01:16pm> Manao | you can't afford to have too high a max bitrate over half a second

<06/06/07 01:16pm> Manao | so you have to trade some maxrate for some buffer duration

<06/06/07 01:17pm> rhester | I follow. I assume if you set ABR bitrate to 1500 with vbv of 1300/400, it will simply ignore the VBV?

<06/06/07 01:17pm> Manao | x264 enforces the VBV to last at least three frames

<06/06/07 01:17pm> Manao | i would recommend at least 5

<06/06/07 01:17pm> Manao | i don't know

<06/06/07 01:17pm> Manao | lemme check

<06/06/07 01:17pm> Manao | it seems to allow it

<06/06/07 01:17pm> Manao | it just warns that average bitrate > max bitrate

<06/06/07 01:18pm> rhester | Interesting...but I suppose that means you're likely to undersize.

<06/06/07 01:18pm> rhester | Makes sense.

<06/06/07 01:19pm> Manao | but i would guess that the results won't be optimal

<06/06/07 01:19pm> rhester | I'm actually starting to follow this now

<06/06/07 01:19pm> rhester | Let me mull it over for a few moments if you don't mind, I want to go back to your 1300/400 example and do the math myself to understand the precise relationship

<06/06/07 01:19pm> rhester | (It does neither me nor the other devs any good to just walk away with canned answers if we don't get where they came from.) Contemplating...

<06/06/07 01:21pm> dynaflash | Manao: if I may, is there a formula you are using to get: x bitrate of x duration = maxrate:bufsize ?

<06/06/07 01:22pm> Manao | no, it's not constrained enough

<06/06/07 01:22pm> Manao | even with wanting x bitrate on y duration, you still have maxrate=f(bufsize) ( or bufsize=f(maxrate) )

<06/06/07 01:24pm> dynaflash | I was being too simplistic, I just thought I might do this:

<06/06/07 01:24pm> dynaflash | Manao: anyway, with 5000/2500, you get 1.5 * 5000 / 1 = 7500 kbps over 1 seconds

<06/06/07 01:24pm> dynaflash | backward

<06/06/07 01:24pm> Manao | but 3750/3750 also gives 7500

<06/06/07 01:24pm> Manao | hence, it's not constrained enough

<06/06/07 01:25pm> saintdev | it depends on the capabilities/restrictions of the specific decoder, correct?

<06/06/07 01:26pm> dynaflash | gotcha. will shutup and watch

<06/06/07 01:26pm> Manao | yes, and of the minimum buffer duration you can allow, and of the average bitrate

<06/06/07 01:26pm> Manao | basically, you want maxrate > average bitrate

<06/06/07 01:26pm> Manao | and buffer duration > 5 frames

<06/06/07 01:26pm> Manao | but that still leaves you with a continuum of possible couple bitrate/duration
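
To see why a target spike alone does not pin down a single maxrate/bufsize pair, the relation can be inverted. A small Python sketch, with hypothetical function names; the two checks are the rules of thumb stated above (maxrate > average bitrate, buffer duration > 5 frames), and the 3000 kbps average and 25 fps are assumed example values:

```python
def bufsize_for_spike(spike_kbps, period_s, maxrate_kbps):
    """Buffer size (kbit) that yields the given spike for a chosen maxrate."""
    return (spike_kbps - maxrate_kbps) * period_s


def is_reasonable(maxrate_kbps, bufsize_kbit, average_kbps, fps, min_frames=5):
    """Apply the two rules of thumb from the chat to a candidate pair."""
    buffer_duration_s = bufsize_kbit / maxrate_kbps
    return maxrate_kbps > average_kbps and buffer_duration_s > min_frames / fps


# Every maxrate below gives the same 7500 kbps spike over 1 second.
for maxrate in (3750, 5000, 6000):
    bufsize = bufsize_for_spike(7500, 1.0, maxrate)
    print(maxrate, bufsize, is_reasonable(maxrate, bufsize, 3000, 25))
```

All three candidates pass both checks, so the choice among them has to come from the decoder's real limits or from testing.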


<06/06/07 01:30pm> superdump | i figured that after looking at the equations

<06/06/07 01:30pm> superdump | but that at least gives some scope

<06/06/07 01:31pm> rhester | OK...I think we have enough now to crawl back in our holes and mull this over...it sounds like we're up against a ton of empirical testing, lacking any real specs from Broadcom or Apple. Manao, thank you very much for your time and effort...it is much appreciated (and I assure you, not wasted :).

<06/06/07 01:31pm> superdump | definitely

<06/06/07 01:31pm> superdump | i won't let it go to waste

<06/06/07 01:31pm> superdump | though i may have to find an implementation specific place to hold x264 specific info

<06/06/07 01:31pm> superdump | i can probably create some place for it on wiki.multimedia.cx

Part 2b

saintdev | Manao: How did you come up with the 0.5s and 1s for the durations for the apple tv?

Manao | saintdev: for the 0.5/1.0 seconds durations : that's only a guess. I lack the exact apple tv specifications, but those values make sense

saintdev | Manao: thank you, that's what I assumed, is the .5 derived from the 1 in any way, though?

Manao | well, from what the others said, I got the impression that 1 second was too long a period for computing bitrate spikes for apple tv, so i divided it by two :)

saintdev | ok :)

Manao | and 0.25 seconds is too short a period to restrict it with a vbv anyway, so you better pray 0.5 do work :)

saintdev | so why the 0.3 there, you're keeping it above 5 frames, but why not use 0.5?

saintdev | erm 0.3 for ipod

Manao | i wanted to restrict the bitrate on a 0.5 seconds spike. if i use a VBV that is 0.5 seconds long, then i have maxbitrate = bitrate spike / 2

Manao | which, in the case of the Ipod ( max spike bitrate = 1500 ) gives a too low max bitrate

Manao | a VBV of 0.3 second allows to have a higher max bitrate

saintdev | ok

saintdev | thanks for clearing that up, and thanks for teaching us earlier!

Manao | np

Part 3

rhester | Manao: I now fully understand how vbv works, and have it working quite well for iPod and Apple TV...but I am having an issue with x264 when using vbv that I'm not sure is a bug or a feature. =) It appears from empirical testing that vbv RC only works with 1-pass ABR, 2-pass appears to completely ignore it. Am I doing something wrong?

Manao | if i am not mistaken, 2 passes takes the VBV into account when scaling the bitrate curve, but doesn't enforce it afterwards

Manao | once i'm home ( in 2 hours ), i'll be able to infirm/confirm that

rhester | That's consistent with my findings. Is there any way to "force the enforcing"? :) VBV is critical for marginal players like iPod and ATV, but losing 2-pass kinda sucks. ;)

checkers | Manao, which company do you work for again?

superdump | ateme

checkers | cheers

md` | rhester: crf aint so bad

rhester | md: CRF is utterly impractical for devices with a low bitrate ceiling like iPod. That's why VBV is so important as well. For iPod and Apple TV, ABR+VBV is the only realistic choice. And it works _very_ well in 1-pass...it's just too bad you lose the quality from a second pass. Trying to get the best of all worlds here. =)

md` | low bitrate ceiling?

md` | lolwtf?

md` | someone should be shot at apple i think?

md` | in the face, like several times

checkers | md`, probably, but that doesn't fix the problem :)

md` | maybe it will fix the problem in future products!

rhester | One can't expect to squeeze the circuitry required to support a Broadcom into a tiny device without some compromises...like no CABAC, no b-frames...

rhester | But regardless, not here to debate the relative merits of the device, merely to get the best possible content encoded for it. =)

md` | yeah yeah

md` | but what's the fun in that?

md` | :D

md` | i say, let's continue to talk about my fantasy to shoot employees of companies that make crap products

rhester | md: Make a deal with you - get VBV enforcement working in multi-pass x264 encodes, and I'll hunt down and kill the lead iPod video engineer. =)

md` | awesome!

* md` gets to work immediately

rhester | Manao: You back? :)

Manao | yep

rhester | :) I have tried every possible method of forcing x264 to enforce VBV in 2-pass encoding, and I can't find a way. Am I just out of luck? =)

Manao | iirc, two passes with average bitrate == maxbitrate was violating the VBV buffer

rhester | So does that mean it _should_ work as long as vbv_maxrate > bitrate?

Manao | no

Manao | that means it won't work

Manao | lemme check

* caro (i=Vincent@alf94-3-82-66-248-160.fbx.proxad.net) has joined #x264

bobololo | rhester: from what i've read you're trying to produce appleTV compatible stream right ?

rhester | bobololo: I'm actually working on the iPod side, but it's pretty much all the same stuff. For the iPod, I can set proper VBV to allow only for 2500kbps spikes over .5s duration, and it works very well with 1-pass 1500kbps ABR. The problem is that VBV enforcement doesn't seem to work at all in multi-pass.

bobololo | rhester: do you know what chip is used in those devices ?

bobololo | I mean is that a dedicated decoding chip

rhester | Broadcom BCM2722

bobololo | or it's software decoding ?

rhester | (in the iPod, not sure about ATV)

bobololo | rhester: let me check

bobololo | rhester ok it's a software based decoder

bobololo | I guess the actual performance depends on the implementation

rhester | Hrm...so the Broadcom is just used for video output, but not decoding?

bobololo | and there is chance it doesn't strictly comply with a normalized profile/level

bobololo | rhester: I think it is also used for the decoding, it embeds some video processing accelerator

rhester | It most definitely doesn't strictly comply with baseline level 3.0, which is what it purports to support. Hence the reason for having to artificially limit spikes via VBV buffer.

bobololo | is there another chip in the ipod ?

rhester | bobololo: Not concerning video, no.

bobololo | rhester then there is a very high probability the h.264 stream is decoded by the BCM2722

rhester | That's what I suspect as well.

rhester | It has a 32mbit buffer, so we're not quite sure why the VBV restriction is necessary, but testing shows clearly that it's needed.

bobololo | rhester: does it support cabac ?

rhester | Definitely not. Baseline only.

rhester | No CABAC, no b-frames.

bobololo | ok

bobololo | anyway, i guess you'll need test and try to find out the threshold

rhester | Using VBV RC, I've successfully gotten very good (well, as good as it can ever get) encoding results...that's not my concern (any longer). The trouble is, with such low ABRs (like 1500kbps), 2-pass encoding is critical for video quality...but it appears you must choose *either* multipass encoding _or_ VBV rate control, but not both (VBV seems to be ineffective with 2 or more passes).

* dynaflash has quit ()

Manao | k, it doesn't respect strictly the vbv

rhester | Manao: By 'doesn't strictly respect', testing would suggest 'doesn't care at all'. :) With VBV of 1500/750, I spike up to 2500 on a single pass (good!), but 8000-10000 with 2 passes (not so good). In fact, I can't see any difference from turning VBV off in 2-pass.

Manao | what average bitrate are you targeting ?

rhester | 1500

Manao | then do a CBR encoding

Manao | a one pass encoding

Manao | x264 can't profit of multipass encoding in cbr

* caro has quit ("Quitte")

Manao | and doesn't respect the vbv in two passes

rhester | That produces even worse results :/ You get what you asked for (pretty much flat 1500 ABR), but the video quality suffers enormously.

rhester | Is the lack of respect for vbv in 2 passes a bug or just can't be done?

rhester | erm, should have read 'pretty much flat 1500 CBR' above

Manao | the quality will suffer whatever you do if average bitrate == max rate

Manao | it's not a bug, pengvado is aware of it

Manao | but i don't think he's got time right now to try to make a vbv compliant two pass mode

rhester | Not so much...1500 ABR with VBV of 1500/750 supports spikes of up to 2500 in one-half second, and there's a very visible quality difference between that and 1500 CBR

rhester | OK...thanks for the info =)

Manao | no, there isn't ( or rather, there shouldn't )

rhester | Maybe my approach to CBR is faulty, then

Manao | 1500 CBR with maxrate == 1500, vbvbufsize == 750 will have the same quality

Manao | can you paste the command line ?

rhester | I've been doing qcomp=0 ratetol=0.01 to force CBR (no VBV)

rhester | What is the correct way to do it?

Manao | no

Manao | CBR == average = maxrate + VBV

rhester | -bitrate 1500 -vbv_maxrate 750 -vbv_bufsize 750, for instance?

Manao | x264 -B bitrate --vbv-maxrate bitrate --vbv-bufsize bufsize will do a true CBR

Manao | never maxrate < bitrate

Manao | that can't work
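
The rule above can be captured in a small helper that assembles the rate control part of an x264 command line. This is a hypothetical convenience function, not part of x264 or HandBrake; only the three flags come from the chat. Rejecting maxrate < bitrate outright is a choice made for this sketch (earlier in the log, x264 itself is said to only warn in that case).

```python
def x264_rate_control_args(bitrate_kbps, maxrate_kbps, bufsize_kbit):
    """Build the --bitrate / --vbv-maxrate / --vbv-bufsize arguments."""
    if maxrate_kbps < bitrate_kbps:
        raise ValueError("vbv-maxrate must not be lower than the average bitrate")
    if bufsize_kbit <= 0:
        raise ValueError("vbv-bufsize must be set, otherwise maxrate defines no VBV")
    return [
        "--bitrate", str(bitrate_kbps),
        "--vbv-maxrate", str(maxrate_kbps),
        "--vbv-bufsize", str(bufsize_kbit),
    ]


def is_cbr(bitrate_kbps, maxrate_kbps):
    """CBR in the sense used here: average bitrate equal to the VBV maxrate."""
    return bitrate_kbps == maxrate_kbps


print(" ".join(x264_rate_control_args(1500, 1500, 750)))
# --bitrate 1500 --vbv-maxrate 1500 --vbv-bufsize 750
```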

rhester | I see

rhester | But as you vary bufsize, won't that affect maximum local bitrate, thus not achieving true CBR?

Manao | CBR == respect of a VBV buffer

Manao | nothing more

Manao | nothing less

rhester | True enough. I've always oversimplified CBR to mean literally constant bitrate = every frame is the same size. Clearly not the case in reality. =)

rhester | So then what I've been doing in 1-pass (cmd line coming up)...

rhester | --bitrate 1500 --keyint 300 --min-keyint 30 --bframes 0 --no-cabac --no-fast-pskip --fullrange on --sar 1:1 --level 30 --progress --no-psnr --no-ssim --threads 0 --thread-input --vbv-maxrate 1500 --vbv-bufsize 750 --ref 2 --mixed-refs --partitions all --me umh --subme 6

rhester | that already _is_ true CBR, then?

Manao | yes

Manao | and a second pass won't improve the quality over that

Manao | and will break the vbv compliancy

rhester | Got it. So the reason it looks better with 2 passes isn't because you're taking advantage of bitrate-bucket shift but because you're not actually respecting VBV, then.

Manao | yep

Manao | exactly

rhester | Makes sense. Thanks much. :)

rhester | Urm...one more (stupid) question? :)

Manao | ?

rhester | Say I change the command line to this:

rhester | --bitrate 1500 --vbv-maxrate 2000 --vbv-bufsize 500 (spike of 2500 over .25s)

rhester | That's no longer CBR, but still won't be respected in 2-pass, correct?

Manao | yes, but it should/may be respected in 1 pass

* caro (n=torri@alf94-3-82-66-248-160.fbx.proxad.net) has joined #x264

Manao | i tried to stress the codec, and it seems he's respecting the vbv

rhester | Understood (it appears to be). And the only reason to choose that over the previous command line is if your spikes tend to be shorter in duration, then? (In other words, the 'balance' between maxrate and bufsize is purely to "tune" to the length of your longest spike?)

Manao | no, the reason would be to have a more constant quality ( not better, but more constant )

rhester | Shorter buffer duration = more constant quality?

Manao | no, higher maxbitrate = more constant quality, and average << maxbitrate == more constant quality in general

Manao | when bitrate == maxbitrate, you can't really have a constant quality, you are constrained too much by the maxbitrate

* |bond| has quit ("whoooops")

rhester | Going to re-read the mencoder manual to put this in perspective w.r.t. "quality" :) Thanks again, will leave you be and pass this along to the HandBrake group =)

Manao | i've just made a test :

Manao | br=150 vbv=60 max=300 ---> psnr 38.31

Manao | br=150 vbv=150 max=150 --> psnr 38.73

Manao | erm

Manao | br=150 vbv=150 max=150 --> psnr 38.03

Manao | here, psnr == overall psnr and is indicative of the constancy of the quality

Manao | the average psnr, in both cases, is 38.8
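
A rough Python illustration of why the two PSNR figures can differ. It assumes that "overall" PSNR is computed from the mean squared error pooled over all frames, while "average" PSNR is the mean of the per-frame PSNR values; that reading of the terms is an assumption, and the MSE values below are made up.

```python
import math


def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit video assumed)."""
    return 10.0 * math.log10(peak * peak / mse)


def average_psnr(frame_mses):
    """Mean of the per-frame PSNR values."""
    return sum(psnr(m) for m in frame_mses) / len(frame_mses)


def overall_psnr(frame_mses):
    """PSNR of the MSE pooled over all frames."""
    return psnr(sum(frame_mses) / len(frame_mses))


steady = [9.0, 9.0, 9.0, 9.0]    # constant quality
uneven = [3.0, 3.0, 3.0, 27.0]   # same total error, one bad frame
for name, mses in (("steady", steady), ("uneven", uneven)):
    print(name, round(average_psnr(mses), 2), round(overall_psnr(mses), 2))
```

With constant per-frame quality the two figures coincide; when quality fluctuates, the overall figure falls below the average one, which is why it can serve as a hint about how constant the quality is.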

rhester | *nods* Very good data, thank you!

Manao | so it's definitely better to have maxrate != rate

rhester | Right

rhester | So the overall objective is to tune things so that a) the maximum 'spike' is within device tolerances (as discussed two days ago), and b) within that split of maxrate and bufsize, you want the maxrate as high as possible, so the bufsize should be constrained so it's just large enough to hold 5 frames at maxrate, yes?

Manao | yes

rhester | Perfect. That answers another dev's question, which I believe is all we had. =)

Manao | more data and what exactly are the devices tolerances would help

rhester | It would help a great deal - and as soon as Apple tells anyone, we'll let you know <G>

Manao | s/and/on

Manao | :p

rhester | We know the chip

rhester | But that's about it

Manao | well, bobololo will find some data about it :)

rhester | :)

bobololo | I went through the chip datasheet

bobololo | it doesn't say much

bobololo | there's some accelerator

bobololo | but it's mainly sw decoding

bobololo | so it strongly depends on the codec dev skill ;)

bobololo | do ipods get some firmware update from time to time ?

rhester | Indeed - perhaps every 6 months

rhester | The last one took the max from H.264 320x240@768kbps to 640x480@1500kbps, so yes, software is a factor ;)

superdump's interpretations of the above chat.

Consider a small time period T over which one wishes to restrict the data throughput:

VBV data throughput = (T + VBV buffer duration) * VBV maximum bit rate

VBV buffer duration = VBV buffer size / VBV maximum bit rate

VBV data throughput = (T + VBV buffer size / VBV maximum bit rate) * VBV maximum bit rate = T * VBV maximum bit rate + VBV buffer size

One may prefer to consider the average data rate over time T (i.e. a bit rate spike averaged over duration T):

'Spike' bit rate averaged over time T = VBV data throughput / T = VBV maximum bit rate + VBV buffer size / T
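As a minimal sketch of the two formulas above, the worst-case throughput and the averaged spike rate can be computed directly. Python is used purely for illustration; the function names are hypothetical and not part of x264. Any consistent units work (for example kbit and kbps).

```python
def vbv_throughput(maxrate, bufsize, t):
    """Worst-case amount of data delivered in a window of t seconds.

    A full buffer can be drained on top of the data arriving at maxrate.
    """
    return t * maxrate + bufsize


def spike_rate(maxrate, bufsize, t):
    """Worst-case bit rate averaged over a window of t seconds."""
    if t <= 0:
        raise ValueError("t must be positive")
    return vbv_throughput(maxrate, bufsize, t) / t


# Settings quoted at the start of the chat: maxrate 4800 kbps, bufsize 7000 kbit.
print(spike_rate(4800, 7000, 1.5))
```

For the settings quoted at the start of the chat this gives roughly 9467 kbps over 1.5 s, in line with Manao's rough estimate of 9600 kbps.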

Beyond this, the range of VBV buffer sizes and VBV maximum bit rates that keep bit rate spikes within one's requirements is bounded by the following constraints, where frames is the minimum number of frame intervals the buffer must span:

VBV buffer duration = VBV buffer size / VBV maximum bit rate > frames / frame rate

VBV buffer size * frame rate / frames > VBV maximum bit rate > average bit rate

Mathieu Monnier advises that frames be at least 5, but x264 enforces a minimum of 3. The minimum VBV buffer duration is derived empirically from the size of the largest frames (I-frames) relative to the average frame size: Mathieu claims that I-frames are ~5x the average size of a P-frame.
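The buffer duration constraint can be turned into a lower limit on the buffer size for a given maximum bit rate. The sketch below assumes the default of 5 frames advised above; the function name is hypothetical.

```python
def min_bufsize(maxrate, fps, frames=5):
    """Buffer size that spans exactly `frames` frame intervals at `maxrate`.

    The constraint in the text is strict, so a usable buffer must be
    larger than this value.
    """
    if fps <= 0 or frames <= 0:
        raise ValueError("fps and frames must be positive")
    return maxrate * frames / fps


# A 5 Mbps maxrate at 24 fps needs a buffer of more than about 1.04 Mb.
print(min_bufsize(5.0, 24))
```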

$spike_rate = $maxrate + $bufsize / $spike_dur; --- 1

$bufsize * $fps / $frames > $maxrate > $abr; --- 2

To find the lower bounds:

$maxrate > $abr; from 2

$bufsize * $fps / $frames > $abr;

=> $bufsize > $abr * $frames / $fps; from 2

To find upper bounds:

$maxrate = $spike_rate - $bufsize / $spike_dur; from 1

but $maxrate > $abr

=> $spike_rate - $bufsize / $spike_dur > $abr;

=> $bufsize < ($spike_rate - $abr) * $spike_dur;

Similarly $bufsize * $fps / $frames > $maxrate; from 2

=> $maxrate < ($spike_rate - $abr) * $spike_dur * $fps / $frames;

$maxrate_lower = $abr;

$bufsize_lower = $abr * $frames / $fps;

$maxrate_upper = ($spike_rate - $abr) * $spike_dur * $fps / $frames;

$bufsize_upper = ($spike_rate - $abr) * $spike_dur;
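The four bounds above translate into a few lines of Python. This is only a sketch under the same assumptions as the derivation; `VbvBounds`, `vbv_bounds` and `is_valid` are hypothetical names introduced for illustration. If a lower bound comes out above its upper bound, no setting satisfies the requirements.

```python
from dataclasses import dataclass


@dataclass
class VbvBounds:
    maxrate_lower: float
    maxrate_upper: float
    bufsize_lower: float
    bufsize_upper: float


def vbv_bounds(abr, spike_rate, spike_dur, fps, frames=5):
    """Bounds on maxrate and bufsize, following the derivation above."""
    if spike_rate <= abr:
        raise ValueError("spike_rate must exceed abr")
    headroom = (spike_rate - abr) * spike_dur  # upper bound on bufsize
    return VbvBounds(
        maxrate_lower=abr,
        maxrate_upper=headroom * fps / frames,
        bufsize_lower=abr * frames / fps,
        bufsize_upper=headroom,
    )


def is_valid(abr, spike_rate, spike_dur, fps, maxrate, bufsize, frames=5):
    """Check a candidate (maxrate, bufsize) pair against relations 1 and 2."""
    within_spike = maxrate + bufsize / spike_dur <= spike_rate
    return within_spike and bufsize * fps / frames > maxrate > abr


# Apple TV figures from the examples: ABR 3 Mbps, 7.5 Mbps spike over 0.5 s.
print(vbv_bounds(3.0, 7.5, 0.5, fps=24))
print(is_valid(3.0, 7.5, 0.5, 24, maxrate=5.0, bufsize=1.25))
```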

Examples:

- Apple TV

ABR = 3 Mbps

T = 0.5 s

'Spike' rate = 7.5 Mbps

VBV maximum bit rate = VBV data throughput / T - VBV buffer size / T = 7.5 Mbps - VBV buffer size / 0.5 s > 3 Mbps

=> (7.5 Mbps - 3 Mbps) * 0.5 = 2.25 Mb > VBV buffer size

2.25 Mb * frame rate / frames = 2.25 Mb * 24 / 5 = 10.8 Mbps > VBV buffer size * frame rate / frames > VBV maximum bit rate > 3 Mbps

As an example, try taking VBV maximum bit rate as 5 Mbps, then:

VBV buffer size = (7.5 Mbps - 5 Mbps) * 0.5 s = 1.25 Mb

10.8 Mbps > 1.25 Mb * 24 fps / 5 frames = 6 Mbps > 5 Mbps therefore a valid solution

- iPod 640x480

ABR = 1.5 Mbps

T = 0.5 s

'Spike' rate = 2.5 Mbps

VBV maximum bit rate = VBV data throughput / T - VBV buffer size / T = 2.5 Mbps - VBV buffer size / 0.5 s > 1.5 Mbps

=> (2.5 Mbps - 1.5 Mbps) * 0.5 s = 0.5 Mb > VBV buffer size

0.5 Mb * frame rate / frames > VBV buffer size * frame rate / frames > VBV maximum bit rate > 1.5 Mbps

0.5 Mb * frame rate / frames > VBV buffer size * frame rate / frames > 2.5 Mbps - VBV buffer size / 0.5 s > 1.5 Mbps

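Try VBV buffer size = 0.5 Mb, then:

VBV maximum bit rate = 2.5 Mbps - 0.5 Mb / 0.5 s = 1.5 Mbps

This is the limiting case: the VBV maximum bit rate equals the ABR, so the stream is effectively CBR.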

If you are in low-delay video conferencing mode and need minimal jitter and reliable performance, use CBR mode, not VBR mode. VBR is what produces the spikes and peaks that a live low-latency system cannot deal with.

There are specialized VBR algorithms for low-latency video communication and surveillance applications, but x264 does not implement them. Its VBR is meant for storage, so don't use it here.

x264 provides --nal-hrd cbr to enable CBR.

There is no way you can guarantee that bitrate won't shoot up beyond a point because things are statistical. However you can keep it under control for 99% of the time as long as some assumptions are met.

The VBV buffer size should be the smallest value whose quality you can accept, for two reasons: (1) it limits variation in bit rate, and (2) it reduces end-to-end delay. This is the amount of buffering the encoder assumes is available at the decoder. The smaller the buffer, the worse the quality, so find the smallest value you can live with.

The VBV maximum bit rate should be set to the CBR target bit rate. It tells the rate control the maximum instantaneous bit rate it is allowed. Remember that it is a guideline: because of the statistical nature of video, it can be overshot.
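The advice above can be sketched as a small Python helper that builds an x264 command line. The helper name, the file names and the rule of sizing the buffer as bit rate times a chosen buffer duration are assumptions made for illustration; the text only says to pick the smallest buffer whose quality is acceptable.

```python
def x264_cbr_args(bitrate_kbps, buffer_seconds, infile, outfile):
    """Build an x264 argument list for a CBR-style low-latency encode.

    Assumption: bufsize = bitrate * buffer_seconds (in kbit).
    """
    bufsize_kbit = max(1, round(bitrate_kbps * buffer_seconds))
    return [
        "x264",
        "--bitrate", str(bitrate_kbps),
        "--vbv-maxrate", str(bitrate_kbps),  # maxrate equals the target bit rate
        "--vbv-bufsize", str(bufsize_kbit),
        "--nal-hrd", "cbr",
        "-o", outfile,
        infile,
    ]


# Hypothetical file names; 1500 kbps with half a second of buffering.
print(" ".join(x264_cbr_args(1500, 0.5, "in.y4m", "out.264")))
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where x264 is installed.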

Original source: https://www.cnblogs.com/xkfz007/p/6391480.html