Hacker News
All the giant companies used ffmpeg (2020) (twitter.com/id_aa_carmack)
580 points by tosh 11 days ago | 193 comments

About 14 years ago, at my first job, I worked with a colleague[1] who introduced me to the wonderful world of open source: the culture, the philosophy, how to submit patches to open source projects, etc. He also introduced me to Debian GNU/Linux which I use actively to this day.

One day, he began working on certain audio/video encoding problems that needed to be solved for our project. He chose FFmpeg as the primary tool around which he began building his solution. After having spent several days with FFmpeg, figuring out and documenting various ways of using FFmpeg to solve all our audio/video encoding problems, one day, when we went out for lunch, he declared to the team,

"FFmpeg is awesome! If you give me a shoe, I will turn it into a video file!"

[1] https://news.ycombinator.com/user?id=topa

Yeah, it’s been said that ffmpeg and vlc could play a vinyl record if you fed it to them.

It’s not actually true, but it is story true.

What if you pass it a high-res photo of a vinyl record? There ought to be a transcoder plugin for that lying around somewhere!

Imagine someday we can take photos of slices of a brain and use them to replay a person's memories, and the program used for that would of course be ffmpeg.

It sometimes creeps me out to consider how much information may one day be extracted from artifacts we create today.

This can be done, and indeed was done for an ancient photograph of an ancient record that no longer existed. In that case I think they used some photoshop magic to ‘unspiral’ the groove and something else to decode the image. The result wasn’t super high quality.

Vinyl starts as a poor material for recording music and only gets worse with age and use. Take a picture of it on old film, develop it with old techniques, and you end up with the worst possible recording medium. I'm impressed they got anything from it.

There's a recording medium which is worse than a photograph of vinyl.


If an old photo is what you've got, you make the best of extracting what you can from it.

This seems like an argument against something I didn't say in the comment you replied to. Maybe I misunderstood you.

There's at least a plausible interpretation of your earlier comment that the attempt was futile or without merit.

The OP makes clear that this was the only viable way of extracting information. 1 >>> 0.

Also, for a sufficiently early recording (anything before the mid-1940s), the medium would have been shellac or wax rather than vinyl. Shellac especially is immensely fragile and prone to shattering when handled.

Your interpretation was incorrect.

Your expression was unclear.

I'm not telling you what you meant.

I'm telling you what I heard, and, in response to your implied question, why I'd responded as I had.

I’m surprised this worked at all. Do you have a link?

Yes, NPR covered it ~15 years ago: https://www.npr.org/templates/story/story.php?storyId=118518...

I worked on an earlier version of this project where we actually had to play the albums, but we also took high-res scans (300 dpi TIFFs) of the individual records in the hope that we'd never have to physically touch them again. So I'll claim contributing to this effort. ;)

Any particular reason they couldn't use one of these laser vinyl turntables?

"It looks like somebody just got hungry and took a bite out of it," says Haber. He has positioned the record on a turntable and fitted the broken piece back into place, like it's a jigsaw puzzle. "If we spun this thing fast, the piece would come flying off, you know, and maybe hit somebody," he says.

Paragraph 2 of the link.

In that particular case, definitely. More generally, it's risk.

Every time an archivist touches the media - no matter the gloves, the clean environment, or the care applied - there's a risk of altering or even damaging it. In our case, the eventual goal was to keep the ever-deteriorating originals in nitrogen-filled vaults.

Therefore, if you can avoid having to touch it often (or ever), then you've managed to protect it better, probably longer, and more cheaply.


Static, high-resolution photography (or other imaging) is among the least invasive, lowest-risk options available.

> It’s not actually true, but it is story true.

Glad to see this phrase from The Things They Carried outlive the book.

>Yeah, it’s been said that ffmpeg and vlc could play a vinyl record if you fed it to them.

ffmpeg is doing all the heavy lifting there. VLC mostly uses libraries from ffmpeg.

I have no idea why people are so impressed by VLC. Anything built on libavcodec can play all the same files, but they won't have VLC's deficiencies in performance, quality, reliability, or usability. ffmpeg is a great project and VLC an incompetent one riding its coattails.

The only thing I can think of is the quote:

“The paraphrase of Gödel's Theorem says that for any record player, there are records which it cannot play because they will cause its indirect self-destruction.”

―Douglas R. Hofstadter, Gödel, Escher, Bach

While ffmpeg does support a lot of really old / obscure media formats, support isn’t nearly as good for professional broadcast / film video formats. DNxHD can be hit or miss and it doesn’t support formats like OMF or AAF. I have heard some broadcasters’ QC departments reject ProRes created by ffmpeg as invalid.

It is probably following the spec too closely.

Like when Apple added exFAT support and Windows couldn't handle it, because Microsoft only wrote the spec. They didn't follow it.

I found ffmpeg supports HAP Q pretty well, which is a decent intermediary format.

"Unix" for video. Love it :)

The creator of ffmpeg, Fabrice Bellard, has an impressively long list of projects. Very talented developer!


Some speculate that Fabrice Bellard is actually as many as ten different people.

This comment brings me a lot of sadness and pain.

I know and respect Fabrice from in person interactions. He is exactly who he claims to be. Maybe an interesting question, instead of this current line of discussion, is to find out what led to his successes?

Could it be replicated?

Intrigued by this exact question a long time ago, I took a look at some of Bellard's earlier projects.

I studied a bit of a partial implementation of an OpenGL software renderer written by him. I even found a bug I could fix!

Actually, his first projects were not all that advanced; they are things "normal people" could do. I think the key comes down to three points:

  - Start simple: do not try advanced projects at first, start slow, plan, don't write the first function before you know what will be the last. Use this for many small but increasingly complex projects to gain experience.

  - Understand what you do: most of Bellard's best-known projects are complex but built from many smaller moving parts and simple, understandable concepts. If the first point is respected, this one comes naturally.

  - Be consistent: do what you need/want, but do it regularly. If you look at what Bellard does in a single day, it doesn't seem that different from what most people can do in a single day. But he does it with incredible regularity.

I think these are points most people can follow. You may not get to his level, but you will certainly become a solid developer. If you're lucky enough to start projects that the "right people" need, you can see what separates devs like Torvalds or Bellard from other equally solid developers: they developed projects that attracted strong contributors. Think about the guy writing SerenityOS: he's probably as solid as one can be; nevertheless, there's a very small probability SerenityOS will ever be as commonplace as ffmpeg. And that doesn't mean either SerenityOS or its developer isn't fantastic.

Now, in Bellard's case there's just one more special thing: he is a gifted mathematician. That certainly helps with being a good developer and planner too, but becoming a mathematician of his level also takes another level of discipline.

> don't write the first function before you know what will be the last

This is some seriously deep meta-heuristics. Very insightful. Thank you.

Do we need a "Think before code" manifesto?

This is a true counterpoint to all the agile cargo cult that has developed over the last few years... Not to be taken as 100% dogma, like everything...

In Clojure we call it "Hammock-driven development" after a 2010 talk by Rich Hickey https://www.youtube.com/watch?v=f84n5oFoZBc

This is literally "banned" since Google took over the world's software engineering PR. Don't think. Start coding. Further fueled by Facebook's "move fast and break things."

If you let someone with a CS undergrad degree think about code, they immediately start doing premature optimization, basically stealing from the company, because their profs at school gave them incorrect ideas about performance and zero ideas about how a business operates.

I understand what you're referring to, but such a manifesto, if even moderately successful, would inevitably become dogma, and then lead us right back to... waterfall.

Cálmate, my friend... pretty sure the comment was a light-hearted joke.

I'm reminded of the joke that Nicolas Bourbaki quit writing math books when they found out that Serge Lang was one person.

Anyway we know Elena Ferrante is two people.

> I know and respect Fabrice from in person interactions. He is exactly who he claims to be. Maybe an interesting question, instead of this current line of discussion, is to find out what led to his successes?

While I think the GP was a joke, you ask an interesting question that I tried to answer 12 days ago on this very site. [1]

tl;dr: I'm not fast, but I'm getting faster because I understand my entire codebase. I think that may be a good portion of Mr. Bellard's success, or if it is not, it may be the best way for mere mortals to emulate his success.

[1]: https://news.ycombinator.com/item?id=29533412

>> I understand my entire codebase.

FFmpeg doesn't have any dependencies, does it? You are probably onto something.

I think it's a joke dude!

I guess it was a joke about the 10x developer(s)

1) A profound interest in programming for its own sake.

2) More money than God.

Also helps: zero fucks to give about maintaining the damn thing.

How can it be funded?

I would love to work more on open source.

If it solves a specific need of a large company, they may decide to fund the developer to maintain it and tweak it for their purposes.

If it only works for small companies or end users, maybe it can be dual-licensed or mixed-licensed, offering the basic software for free but selling closed-source add-ons for expanded functionalities.

I saw Fabrice comment here during the log4j debacle complaining about how hard it is to get funding. I'd say the answer is that it mostly doesn't, and that there is a very small paid team working on it.

That would still be ten very prolific people.

So you're saying he's the Nicolas Bourbaki of the software world.

I was thinking Erin Hunter, but tomato, tomato.

Pretty much.

Or that he is a time traveller from an advanced (human or not) species sowing the seeds for humanity's leap forward into extra-terrestrial society.

Or, perhaps more likely, sowing the seeds for humanity's demise by enabling the existence of YouTube/TikTok et al.

Demise? Maybe that's how you neutralize the next Hitlers. You let them watch TikTok and become obsessed with dancing.

When you get rejected from art academy today, you can just start a YouTube or TikTok channel with drawing tutorials, or timelapses of yourself painting. So many possibilities, when in the last century you might have just given up on your passion and gone into politics.

Ten geniuses in a long coat!

There's maybe some insight to be gained by analyzing the commits to see if the code seems to be made by different people. I remember there was the same theory about a very prolific person in the npm/JS ecosystem.

Mate, it was a joke. Nobody seriously thinks he is several people.

Maybe but I still like their idea about analyzing git commits.

The French seem to have a history of this :^) https://en.m.wikipedia.org/wiki/Nicolas_Bourbaki

Reminds me of how Philip K. Dick sent letter(s) to the FBI informing them that Stanislaw Lem was a group of communist writers working on mind-controlling sci-fi[1].

[1] https://en.wikipedia.org/wiki/Stanis%C5%82aw_Lem#Philip_K._D...

Bellard spelled backwards is Bourbaki.

And those ten people are actually a pseudonym collective of Fabrice Bellard!

Not many do for Linus Torvalds though!

It's time travel.

I hope this is wry humor and not more 10x denialism.

Is there any particular reason that any mention of Fabrice Bellard turns into a praise-fest? I mean, I get it, the guy makes neat stuff and lots of it. But the level of hype is unusual, and I wonder whether there's an origin for this meme that isn't rooted in reading his actual source code.

It's based on using it, not reading it. It's not hype. There just isn't anyone comparable out there.

I love the nonchalance with which this is presented.

"Oh here's a list of my projects."

Talented is a massive understatement. Fabrice Bellard and John Carmack are folks that get 9.5/10 as developers (10 being reserved for God).

He's kind of a "Buckaroo Banzai" type of guy...

Maybe a 10 for Terry A. Davis, then.

Fabrice Bellard, one of a short list of programmers that could make John Carmack go "whoa".

Yeah, you have the average programmer who spends so much time worrying about what FE framework to use, then you have the programmers at the level of the two you mention who just put out so many working projects. All the cargo culting most devs do is just waste.

I didn’t realise he created QEMU too! Amazing talent!

that guy is a genius

I think it is worth posting the reply from Tom Vaughan [1] (father of x265) here.

>FFMPEG is a media processing framework. It's great for simple media processing jobs (builds a graph, runs the graph). A build of "FFMPEG" includes that media processing framework plus dozens of open source libraries (filters) that do the actual work (x264, x265, libav, etc).

>So "FFMPEG" gets a lot of well deserved credit (it's the framework that lots of applications and well-known services run), but the core FFMPEG is probably a tiny fraction of the development man-hours in a build of FFMPEG. It's all the linked libraries that make it useful.

It is also worth noting this is from 2020. I am not sure why it is being submitted now.

[1] https://twitter.com/tmvn/status/1258610308962111490

libav is a part of FFmpeg, and the heart of it. But FFmpeg isn't just a skeleton. Most decoders + protocols + demuxers / muxers + filters + scaler + resampler are native components.

The most well-known 3rd party libs provide a few (albeit commonly used) video encoders. Of course, having interfaces for these external libs is what allows media operations to be consolidated into a single process workflow. As standalone tools, x264/x265 would be more cumbersome to integrate into a pipeline.

> It is also worth stating this is from 2020. I am not sure why this is submitted now.

I came across it a day or two ago in a reply to @Suhail's thread of donating to OSS projects[0]. I suspect that's where the submitter saw it as well.

[0] https://twitter.com/nurblieh/status/1473695776253648911

And VLC is a tiny fraction of FFMPEG?

No, but both vlc and ffmpeg use libavcodec.[1] HandBrake uses ffmpeg.

[1] https://en.wikipedia.org/wiki/Libavcodec

I note the ffmpeg team do consulting and accept donations:

https://ffmpeg.org/consulting.html https://ffmpeg.org/donations.html

This guy (Fabrice Bellard) is my personal hero.

I designed and built, with my team, a VOD service for a leading local entertainment TV provider in 2009.

We were amazed at the flexibility provided by ffmpeg. In no time we were able to hook up our back-end logic and go.

We opted to "risk" it with Nginx and jQuery and delivered successfully on the front-end.

The funny thing is that at the time a lot of commercially available "solutions" were a total joke and cost an arm and a leg.

I note that ffmpeg is on Vizio TVs and Vizio are being sued for not complying with open source licenses, including the ffmpeg one.


Amazing that this effort has legal funding. The stench of hypocrisy when media companies, which publicly cry "societal collapse" at personal-use IP violations, at the same time disregard the basic licenses of others, exploiting volunteer labor for commercial purposes...

I came across a set-top box that I suspected used ffmpeg for transcoding, because I recognized some artifacts I saw when I used ffmpeg's defaults for some settings.

Looked through their copyright notices and it turns out they were using ffmpeg. It's really everywhere.

Completely idle curiosity: What kind of artifacts? I don’t spend much time on AV stuff, but I always find the details fascinating.

One obvious one is the audio quality of Twitter videos, or lack thereof. That one immediately told me they were using ffmpeg's AAC encoder (it's fairly obvious on piano music). Some of ffmpeg's built-in codecs are great and some aren't; AAC is in the latter category. I've done some ABX tests with it and some transcoding tests and it's just not great; even at 320kbps, two or three transcode cycles result in clearly audible artifacts, especially in high-frequency sounds like cymbals (you get a sort of warbly characteristic). MP3 actually has better performance in ffmpeg (via LAME) than AAC at the same bitrate, using the built-in encoder.

If you've ever heard a Twitch streamer, especially people playing 8-bit games or chiptune-ish music, and wondered why the audio sounded a bit dodgy, it's ffmpeg. AAC at 160kbps (the streaming standard) is supposed to sound better than that. OBS will use FDK if available, and on macOS and Windows it'll use CoreAudio if installed, but you need to install an old MSI pulled from an old version of iTunes to get that on Windows... so the vast majority of streamers on Windows are stuck with ffmpeg-AAC.

If you're using ffmpeg on Linux, install libfdk and use libfdk_aac; if you're on macOS, use aac_at instead. Those are much better encoders and ffmpeg can use them as external libraries.
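To make that advice concrete, here's a sketch with the argv built in Python. The encoder names (`libfdk_aac`, `aac_at`, and the native `aac` fallback) are from the comment above; the file names and bitrate are placeholders, and `libfdk_aac` only exists in a build with libfdk enabled:

```python
def aac_args(platform: str) -> list[str]:
    """Hypothetical helper: pick a better AAC encoder than ffmpeg's
    native one where available, per the advice above."""
    if platform == "linux":
        codec = "libfdk_aac"   # requires an ffmpeg build with libfdk
    elif platform == "macos":
        codec = "aac_at"       # Apple AudioToolbox encoder
    else:
        codec = "aac"          # fall back to ffmpeg's native encoder
    return ["ffmpeg", "-i", "in.wav", "-c:a", codec, "-b:a", "160k", "out.m4a"]
```

A caller would hand the list to `subprocess.run`; nothing here is specific beyond the `-c:a` selection.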

Note that this isn't a criticism of ffmpeg; lack of quality in some codecs is a direct result of lack of funding (though I do fault them for, at one time, claiming the built in AAC encoder was good enough to be made the default and competitive with FDK; it wasn't and still isn't, but I think they reverted that claim at some point).

(source: I help run a couple anime DJ restreaming-based events and fixing the AAC issue is a big red warning item in my set-up guide; I once got a double transcode feed where both steps were using ffmpeg-AAC and I could instantly tell something was wrong)

For a few years, the default AAC coder was `fast`. The `twoloop` coder is better but slower, and was made the default in May 2021. Add `-aac_coder twoloop` (in older builds) and recheck.

I should do a comparison with the more recent one then; it's been a while since I first ran into this. Thanks for the pointer!

Would you simplify your ffmpeg advice for the less-programmerly among us?

I use ffmpeg all the time on my M1 Max, and if there are parameters that I should always be using, I would like to know :)

> if you're on macOS, use aac_at instead. Those are much better encoders and ffmpeg can use them as external libraries.

Hey, do you happen to know if aac_at is encoding at the slowest speed? (Most compression efficiency but longest encoding time). I can't figure out for sure, so I keep bouncing audio through afconvert since it lets me set `-q 127`, and it's a pain, I'd rather do one step.

There are two quality parameters and I'm not sure which one is `-q` in afconvert. One of them is the VBR quality, `kAudioCodecPropertySoundQualityForVBR`.

ffmpeg maps it like this:

q = 127 - q * 9;

So `-q:a 0` should do the same thing as passing 127 to AT.

There is another property though; an actual control of codec effort (`kAudioConverterCodecQuality`). That one should be accessible as `aac_at_quality` and maps 0 to 96, so there's no way to go up to the max of 127... But Apple defines "High" as 96, so I get the feeling there won't be a big difference between that and max, especially for an audio codec.
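Taking the quoted formula at face value (not re-verified against the ffmpeg source), the endpoints of the `-q:a` range work out like this:

```python
def vbr_to_at(q_a: int) -> int:
    # ffmpeg's mapping from -q:a to the AudioToolbox VBR quality,
    # per the formula quoted above: q = 127 - q * 9
    return 127 - q_a * 9

# -q:a 0 gives the AT maximum of 127; -q:a 14 lands near the minimum.
```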

Thank you, this is the best explanation I've gotten in years!

> That one should be accessible as `aac_at_quality` and maps 0 to 96, so there's no way to go up to the max of 127.

Is this a bug I should report to ffmpeg?

If there is a nontrivial file size or quality difference between those two settings, sure. If not, it's probably not worth it :)

> Windows it'll use CoreAudio if installed, but you need to install an old MSI pulled from an old version of iTunes to get that on Windows... so the vast majority of streamers on Windows are stuck with ffmpeg-AAC.

I think Windows might have a built-in AAC encoder in Media Foundation? Wonder if that's true and if it's any good.

IIRC there was some attempt at supporting it but it was broken or not very good? I forget...

Why in the world are you still fooling with AAC? Everybody supports Opus now.

Live streaming is stuck in the dark ages of RTMP with H.264 and AAC. It's basically hardcoded everywhere, there is barely any support for other codecs or protocols. You want to send streams to Twitch or YouTube, you're sending H.264 and AAC.

OBS doesn't even let you pick streaming codecs. It just assumes H.264 and AAC. You can get it to stream in other codecs by abusing the "record" function with an output to stream instead of filename, but it's a hack...

not even ffmpeg can bring an encoding back from the dregs

When transcoding some streams, if you use ffmpeg's defaults, sometimes the way it handles PTS, DTS and keyframes isn't ideal, and you'll get weird video pauses, blanks, and strange but distinct visual artifacts while the audio plays just fine.

Having spent a lot of time using ffmpeg for transcoding, it's one of those things I recognized because I'd often forget to set the correct settings and got similar results, as well.

Considering how ubiquitous it is, it's a mystery why there's so little documentation on how to use the library versions, libavcodec and libavformat.

I think you're supposed to read the header files? I have no idea how people write ffmpeg stuff. The only good tutorial I've seen is: https://github.com/leandromoreira/ffmpeg-libav-tutorial

The headers are actually pretty good.

I learned to use libav by trial and error over years, and now I'm at the point where I can just look at the headers for a refresher whenever I use it.

At a very broad level:

- Encoding and decoding are both pipelines

- Codecs and formats are both part of a pipeline

- An instance of a codec or a format is called a _context_

- The two basic operations for any context are "put data in" and "pull data out"

Putting this all together, a lot of tutorials present this as a case of "Just push the input data in, then pull the output data out!" In my experience this is backwards. This leads to "pipeline buckling" where, e.g., you pull a compressed packet from the demuxer and try to push it into the decoder, and the decoder says "hey, slow down, my buffers are full". You have to just awkwardly stick that packet in your pocket until the decoder's buffers are dry again.

I have always had better luck pulling data. E.g., if you're decoding, first try to pull a decompressed YUV frame from the decoder. If that fails, then try to get a new packet from the demuxer. Only if that fails (and you're not using the default file I/O) do you feed the demuxer another buffer. This way it never buckles. It results in a few redundant "Can you give me anything?" checks when warming up the pipeline, but you never have buffers awkwardly outliving a loop iteration. If a buffer enters your control, it's always because you already found an empty spot in the next stage to immediately dump it into.
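The pull-first loop described above can be modeled in miniature. This is a toy Python sketch of the control flow only, not the real libav API (the real counterparts are av_read_frame, avcodec_send_packet, and avcodec_receive_frame):

```python
from collections import deque

class Stage:
    """Toy model of a libav-style context: push with send(), pull with
    receive(). receive() returns None (think EAGAIN) when it has no output."""
    def __init__(self, capacity: int = 2):
        self.buf = deque()
        self.capacity = capacity

    def send(self, item):
        if len(self.buf) >= self.capacity:
            raise BufferError("full")   # the "pipeline buckling" case
        self.buf.append(item)

    def receive(self):
        return self.buf.popleft() if self.buf else None

def decode_all(packets) -> list:
    """Pull-first loop: only fetch a packet once the decoder has nothing
    left to give, so no packet ever outlives one loop iteration."""
    demuxer = iter(packets)
    decoder = Stage()
    frames = []
    while True:
        frame = decoder.receive()      # 1. try to pull output first
        if frame is not None:
            frames.append(frame)
            continue
        pkt = next(demuxer, None)      # 2. only then read more input
        if pkt is None:
            break                      # demuxer drained, decoder dry
        decoder.send(pkt)              # guaranteed room: decoder was empty
    return frames
```

The point is structural: a packet is only read after the decoder has been drained, so by construction there is always room to send it, and no buffer survives past the iteration that fetched it.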

In my experience, the problem with this is you're still going to run into issues with A/V sync if the input is muxed weird. You can end up having to buffer a ton of A or V before you get the packet for the timestamp in the present. I have an old project using fixed size packet buffers between the demuxer and decoders, and for some inputs it just deadlocks as one side gets full.

My workaround to avoid having to buffer a potentially infinite amount of data was to instantiate the demuxer twice, once for A and once for V. Then they can run out of sync with each other in terms of the muxing, and you can keep them in sync in terms of presentation timestamps.

Part of the problem is the API kept breaking every other version; ffmpeg is an amazing project for users but has historically been a nightmare to integrate with and package, for this reason.

It's gotten better in the past few years, though, but there's still a lot of legacy codebases that are useless today because they won't build against modern ffmpeg any more (even though the rest of it is fine).

API/ABI is meant to be preserved within a major version, but not across. The deprecation period is typically two years, but deletion is usually much later.

Do you have examples to the contrary from the past few years?

Header files (or Doxygen), and the few official examples. Some of them use old and no longer recommended APIs/patterns, and have obvious problems as soon as you compile and use them without changing anything.

I say this as someone who has extended and maintained a libav* language binding with similarly terrible documentation. I tried to write more examples for people. But the API surface is vast and resources are extremely scarce. Sometimes you just have to dig into actual production code of FFmpeg (the CLI side), mpv, etc. to see how other people use it.

We use libffmpeg to stream live video from RTSP CCTV cameras and the main sources I used were https://github.com/mpenkov/ffmpeg-tutorial and https://natnoob.blogspot.com.au/2011/04/modify-api-example-t...

I agree. Unless you had a lot of domain knowledge, or a lot of time with a lot of weirdly encoded media, I would say it is nearly impossible to make something as seemingly simple as a thumbnail extractor with those libraries (e.g. take the first frame or x seconds of a video and write an image file).

Your best bet is to just call ffmpeg to do what you need and not have to worry about the weirdness of the libav stuff.

Also, stuff like parsing a buffer of memory requires you to write a custom I/O driver, since the libav code is tightly coupled to ffmpeg, which only handles files and not buffers.

I've been struggling with exactly this, setting up ffmpeg to work with buffers instead of files. I'm about ready to just write frames to /tmp or maybe /dev/shm and invoke the CLI. I work with a lot of weirdly encoded media, so I'm thinking that for long-term maintainability it might be better to use the CLI anyways, as you can iterate significantly faster at the cost of control.

I'm just not sure what kind of performance overhead it will have writing to a ramdisk vs. using a buffer, as it's a non-negligible amount of memory to read/write so many extra times.
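For what it's worth, a minimal sketch of that tmpfs-plus-CLI idea (the paths, names, and codec flag are placeholders, and actually running ffmpeg is left to the caller):

```python
import os
import tempfile

def transcode_buffer(data: bytes, shm_dir: str = "/dev/shm") -> list[str]:
    """Dump an in-memory buffer to a ramdisk (falling back to the normal
    temp dir) and build an ffmpeg command line to transcode it."""
    workdir = shm_dir if os.path.isdir(shm_dir) else tempfile.gettempdir()
    src = os.path.join(workdir, "in.bin")
    dst = os.path.join(workdir, "out.mp4")
    with open(src, "wb") as f:
        f.write(data)
    # caller would run e.g. subprocess.run(cmd, check=True)
    return ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", dst]
```

On the performance question: /dev/shm is tmpfs, so the extra write stays in memory rather than hitting disk; the overhead is closer to a memcpy than to real I/O, though whether that matters depends on your data rates.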

>setting up ffmpeg to work with buffers instead of files

I'd take a look at https://github.com/FFmpeg/FFmpeg/blob/master/tools/target_de...

In io_seek I'd also clamp pos at the end to be between 0 and the filesize. That might be needed for some files. It doesn't hurt though.

>I'm thinking that for long-term maintainability it might be better to use the CLI anyways

It's definitely easier that way. You don't have to deal with gotchas like having to set some random flags or needing to specify a filename. It's just so easy to write code that fails on 1% of files which ffmpeg can actually decode.

FWIW, I created a production batch transcoding backend using python to construct ffmpeg command lines. If you don't need streaming you may not need to talk to the libraries.

(The transcoding was for transforming DVB-T captures to multi-bitrate HLS and MS Smooth Streaming formats as part of a Cloud PVR. It was a separate backup pipeline in case the commercial streaming pipeline failed, which tended to happen a lot initially.)

And very few of those big companies contribute back…

See https://news.ycombinator.com/item?id=29524103

Wrt relicensing vlc, would you have made different choices were you faced with it today? (Not trying to be confrontational, just curious.)

Some changes companies make are not accepted by the maintainers for various reasons. NDAs are a bitch in this area too.

Don't forget about gstreamer. Both FFMPEG and gstreamer are used in many production environments. That said, you also find professional codecs and transport libraries. I've worked on projects using Main Concept codecs and custom plumbing, for instance.

Genuine question: Are there any actual examples of giant companies using gstreamer?

Why I ask is that pretty much the only reason I have gstreamer installed on my personal laptop is to satisfy certain GNOME dependencies (due to GNOME, understandably, dogfooding its own stuff). Everything else (including apps I use daily, e.g. mpv, Firefox, etc.) depends on ffmpeg.

  $ sudo pacman -Rs gstreamer 
  checking dependencies...
  error: failed to prepare transaction (could not satisfy dependencies)
  :: removing gstreamer breaks dependency 'gstreamer' required by cheese
  :: removing gstreamer breaks dependency 'gstreamer' required by gnome-shell
  :: removing gstreamer breaks dependency 'gstreamer' required by gst-plugins-base-libs
  :: removing gstreamer breaks dependency 'gstreamer' required by libcheese
  :: removing gstreamer breaks dependency 'gstreamer' required by webkit2gtk
  :: removing gstreamer breaks dependency 'gstreamer' required by webkit2gtk-4.1

I have used gstreamer for the Dutch Railways, to stream the announcements to passengers in certain types of rolling stock. The announcements are generated as text, go through a text-to-speech engine and are then streamed to the rolling stock. As you'd expect, every manufacturer uses a different method to broadcast the announcement.

Which text to speech engine are you using, if I may ask?

I believe this was a commercial engine from Acapela.

RDK is the firmware used on a wide variety of set-top boxes, routers, and IoT devices: https://rdkcentral.com/

The core media pipeline is almost all gstreamer, ffmpeg, and just because I think it’ll surprise some people, Wayland.

I work at a company acquired by Motorola Solutions and we use gstreamer, but I don't know about MSI overall, just our division. We also use ffmpeg for some things, lol.

Overall, they do not use gstreamer. You guys are the outliers. :)

Don't know if you'd consider them giant, but a former colleague tells me that Pexip uses Gstreamer for video-wrangling. (Hearsay, admittedly)

gstreamer is ubiquitous in research labs that depend on video - extremely common

NVidia uses it a lot on their Jetson platform.

Yes. It’s embedded in a surprisingly large number of embedded devices.

Yes. On Windows actually :)

Debian is not the best maintained distro.

Debian doesn’t use pacman. Odds are it was Arch (though pacman has been ported to other platforms).

It was indeed Arch. Not sure about the comment about Debian as IIRC it's more or less the same in Debian. Using gstreamer seems to be more of a GNOME thing than anything to do with distros.

Debian tried to switch from ffmpeg to libav. It didn't turn out well.

But it generally gives me no trouble.

Could you give more details on how it did not turn out well?

As I recall, they were not able to maintain feature, and maybe bug-fix, parity with ffmpeg.

I’m using ffmpeg to transcode files. My users give feedback that it is really slow. At first I thought they were wrong, as I believed ffmpeg was as fast as it gets, but sure enough, after testing, transcoding (ProRes -> H.264) is many times slower than using Resolve, Silverstack, Compressor, etc. I get asked all the time whether we can't just swap to something else, but this ffmpeg monoculture means there seems to be no alternative.

Doesn't ffmpeg support hardware encoders like Nvidia's nvenc or Intel's idontrememberthename? Also AMD, most likely.

I know Handbrake supports it, as I used it for my productions in the past. And I think Handbrake uses ffmpeg, maybe?

Yes it does. I am using hw acceleration. This is on Mac hardware, so not all options are available, but the ones I have tried do not make a significant difference.

Make sure both your decoding and encoding are on hardware, then. You might be cycling frames through system RAM.

Are you sure you are using the h264_videotoolbox options? I mean, you might be, but that, for me, seems to be the problem. Ffmpeg is quite fast if you can invoke the hardware encoders.

Yes, I am using VideoToolbox. I need to clarify that ffmpeg is not slow (it runs close to realtime, a bit under 20 fps); the problem is that other tools reach 72 frames per second.
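For reference, an invocation that keeps both the decode and the encode on Apple hardware via VideoToolbox looks roughly like this. File names and the bitrate are placeholders; the point is that `-hwaccel videotoolbox` handles the decode and `h264_videotoolbox` the encode, so frames need not round-trip through a software codec:

```python
# Sketch of a fully hardware-accelerated transcode on macOS.
# Input/output names and bitrate are hypothetical.
cmd = [
    "ffmpeg",
    "-hwaccel", "videotoolbox",    # hardware-assisted decoding
    "-i", "input.mov",             # hypothetical ProRes source
    "-c:v", "h264_videotoolbox",   # hardware H.264 encoder on macOS
    "-b:v", "10M",                 # VideoToolbox is bitrate-driven, not CRF
    "output.mp4",
]
print(" ".join(cmd))
```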

The big deal about using FFMPEG is tuning it (tweaking the arguments).

I never got particularly good at that.

There are people that have made entire careers of tuning FFMPEG.
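"Tuning" here mostly means picking the right encoder knobs for your content. A sketch of the kind of flags people spend careers tweaking for software H.264 encoding with libx264 (the values shown are illustrative, not recommendations):

```python
# Common libx264 tuning knobs; file names and values are placeholders.
cmd = [
    "ffmpeg", "-i", "input.mov",
    "-c:v", "libx264",
    "-preset", "slow",    # speed/compression trade-off: ultrafast..veryslow
    "-crf", "20",         # quality target: lower = better quality, bigger file
    "-tune", "film",      # content hint: film, animation, grain, ...
    "-c:a", "aac",
    "-b:a", "160k",
    "output.mp4",
]
print(" ".join(cmd))
```

The preset/CRF pair alone can swing encode speed and file size by an order of magnitude, which is why two people running "the same" ffmpeg can get such different results.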

I liked his making a $1K donation.

Myself, I have the opposite experience: my first job was a short stint at a video conferencing company (Tandberg, bought by Cisco), and I remember being surprised that so much of their resources were dedicated to building custom transcoders.

FLOSS made internet video possible.

Made it reasonable. RealPlayer™ and QuickTime did sort of "work".

>FLOSS made the internet possible.

FTFA! In all seriousness though; there isn't an aspect of computing that hasn't been improved by F(L)OSS. (What's the L stand for?)

> there isn't an aspect of computing that hasn't been improved by F(L)OSS

Mmmm.. I'd have to disagree, based on my professional experience. FLOSS hasn't really improved business software. User interfaces, Groupware, Enterprise software, Tax software, Government software, Industrial software, Telecom, POS, EHR, etc. It's hardly ever written as FLOSS because it takes a lot of time and effort, and devs don't find it rewarding to write that stuff in their free time (most OSS is written because the developer just needed that code). When there is a FLOSS solution, it's not an improvement over the proprietary alternative.

Even in "IT", there is a huge dearth of quality FLOSS used to build... anything. There's lots of one-off projects that are good for one thing or another, but very few large robust solutions, and components never seem to mix together unless they have "custom integrations" to tie them to other systems. Microsoft will forever be the king of Enterprise software integration and business productivity tools, and each industry has its own "business software" titan. SaaS and PaaS are also more closed-source than open.

When there is some FLOSS that does something difficult like video rendering, or running distributed applications, the entire industry uses it. But for the bulk of the business world's needs, often the only solutions are proprietary.

Libre. Usually used to denote that we mean freedom, not zero cost, when saying “free”.

It's French or Spanish for free, because in our languages we split the concept of free beer and free speech into two words.

Gratuitous could be used for free of charge, since that's what we use in Latin languages (which I suppose has the same root as gracious; at least in formal French we replace gratuit with gracieux to mean the same thing).

So I wonder if English, tired of borrowing a strange word, will use gracious or gratuitous some day to denote free of charge! That would maybe avoid all these questions of what libre means!

We (English speakers) frequently use gratis for free-as-in-beer.

True, but “free” can always have that meaning as well, making it always ambiguous.

As the sibling says, we do use gratis as a loan. Gracious already means "with grace" (of manners or motions), and gratuitous "excessive" (as in violence)...

I use Libre to indicate “freedom” based software, while FOSS for “free” software.

I rarely use FLOSS as, floss is a known word for flossing teeth and for the dance move. Libre just sounds cool compared to floss.

That's like American legislative acts - the effect is completely opposite from the name.

All FOSS is open source, but not all FOSS is libre or OSS.

How can you possibly call non-permissive open source software libre? It's non-permissive, in case you hadn't noticed.

I remember the days of QuickTime, RealPlayer, and Windows Media. There were contemporary open source tools but they had zero adoption. What FLOSS made possible IMO was free Internet video, because proprietary tools were too manual and too expensive to use at large scale.

And opened the doors to the distribution of misinformation on a scale never before conceived of.

Same can be said for imagemagick

Interested to know what Twitch is using, they did a two part blog where they explained how ffmpeg was too slow, so they developed their own.

From sifting through the leak, it was a fork of ffmpeg they converted to C++, but it still retains ffmpeg naming/structures. They also dropped any code they didn't use, so it isn't very much code.

I am not surprised at all. What else could you possibly use in place of Ffmpeg?

Can confirm that VKontakte, Russia's largest social network, also does use ffmpeg for video transcoding.

I work for a Fortune 500 company. We absolutely use ffmpeg. Why would we not use the best software out there?

Yeah, I've still got to work out some audio desync issues when using the xstack filter with inputs of different frame rates and formats. I was told to fix it by reconverting all my inputs to the same framerate, but other than that, ffmpeg is really powerful software.

But yeah, for simple video tasks I either use ffmpeg or avidemux; DaVinci Resolve crashed on first use.

There's also a python module that wraps the filter syntax I think, because writing a ffmpeg command is not very accessible.
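For the xstack desync mentioned above, the "reconvert everything to the same framerate" fix can also be done inline in the filtergraph itself. A sketch (input file names, target fps, resolution, and the 2x1 layout are all assumptions):

```python
# Normalize each input's frame rate (and size) before xstack sees it,
# so the stacked streams stay in step. Inputs and layout are hypothetical.
fps = 30
filtergraph = (
    f"[0:v]fps={fps},scale=1280:720[a];"
    f"[1:v]fps={fps},scale=1280:720[b];"
    "[a][b]xstack=inputs=2:layout=0_0|w0_0[v]"   # side-by-side 2x1 grid
)
cmd = [
    "ffmpeg", "-i", "left.mp4", "-i", "right.mp4",
    "-filter_complex", filtergraph,
    "-map", "[v]",
    "out.mp4",
]
print(" ".join(cmd))
```

This won't fix audio drift on its own (audio streams may still need `aresample` or explicit mapping), but it avoids a separate reconversion pass for the video.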

It is really crazy that writing freely accessible software can get developers more reputation and money than pissing code away into a proprietary codebase they don't even own.

Proprietary code gets the job done, but free software helps the whole community of developers.

Yeah, all the medium-sized companies too. When I was in robotics I recall a project to integrate ffmpeg to provide a way to scrub through camera footage in the gathered data from one of our autonomous vehicles.

Not mentioned in the thread but YouTube encoding is also built upon ffmpeg:

https://streaminglearningcenter.com/blogs/youtube-uses-ffmpe... https://news.ycombinator.com/item?id=2831292

Really great example of the long-term impact of opensource.

The "triumph of high quality open source software" is wrong, because ffmpeg is old, hacky C/ASM code (not even C++), optimized for speed on specific hardware. I got my first experience optimizing ffmpeg's video decoder for the AMD Alchemy processor in 2006.

Let's take a look at the history of ffmpeg created by hackers for hackers.

Do you remember how many times you saw strange squares on the screen when rewinding a YouTube video or a DVD MPEG-2 stream, or random freezes when streaming Twitch, and thought it was a video stream error? But no. Most of these are programming errors in ffmpeg's hacker backend, which has many pointer bugs. It is difficult to recover state after a failure at any level because of how C handles pointer errors. The global use of ffmpeg is caused by the lack of FOSS alternatives, not by the quality of the software. It was cheaper for corporations to fix bugs and hire a developer like me than to develop their own video codec or system from scratch. All alternatives were closed source and required tons of money for tech licenses like MPEG-2. Of course, over the years, the ffmpeg ecosystem has improved and many bugs have been fixed, but this has not changed the general situation with existing and potential bugs and vulnerabilities by design. Therefore, in various video applications, you can observe random bugs and signal drops every day. Nice try, John ID_AA_Carmack.

This is conflating two very different issues. While ffmpeg certainly has its share of vulnerabilities in part caused by it being an old and crusty C project, that has little to do with video glitches. Video glitches are usually caused by corner cases in decoding, e.g. issues with reference frames during seeking. Those are logic errors, which any project handling video formats would be prone to because video codecs are ridiculously complicated these days.

Your blocky errors in video playback aren't caused by a pointer going wild due to bad pointer math in ffmpeg. They're caused by bad or corrupted input, or an improper seek to a non-IDR point in the file. ffmpeg is rock solid for playback of standard, corruption-free H.264 streams. This idea that any time you see corrupted video it's a (potentially exploitable) bug in ffmpeg is just not true. Nobody is getting "signal drops" due to ffmpeg problems.

Now if you start fuzzing ffmpeg and giving it wildly noncompliant and broken input, yes, you're going to run into memory errors at some point - which is why there are companies doing just that these days, to find and fix those security flaws. Most of those bugs aren't even in popular codecs (since those are extremely well tested), but rather in the very long tail of obscure formats that ffmpeg supports (try counting how many obscure video game formats it supports some time!), and therefore in codepaths that nobody watching YouTube or Twitch is ever going to hit.

This is a code-audit issue. Video codecs and drivers are a niche, and are heavily encumbered by patents. The number of ffmpeg researchers and developers is very small compared to the number of lines of code, closed standards, and soft/hard bugs in each series of specific video hw/sw encoders and decoders. And an AI that can analyze and validate code as complex as ffmpeg has yet to be invented.

It's a programming language issue. Any time you're writing complex format parsing code in an unsafe language like C, you're going to run into these problems. Unfortunately, Rust didn't exist when ffmpeg was created.

Patents have absolutely nothing to do with this; they don't affect the security of the resulting code.

Nobody can write code for all the formats ffmpeg supports and do it safely, in C. Nobody. You think audits work? Tell that to Infineon, who shipped a broken RSA key generator implementation. In a cryptography product. Which was audited privately. And that's a critical part of the code that should've had many eyes on it.

Perfect memory safety for C parsers for hundreds of formats? Yeah, not going to happen. It has nothing to do with the code quality of ffmpeg. It's just not humanly possible to get this right in every instance.

> It's just not humanly possible to get this right in every instance.

That's what theorem provers and model driven development are for.

I wish these tools could be used more. There should also be automated completeness checkers for requirements and specs - a lot of errors, incompatibilities and security issues are the result of ambiguous, contradictory, or incomplete specs and requirements.

> That's what theorem provers and model driven development are for.

That's building a safe programming language on top of an unsafe one. At that point you might as well just use a safe programming language.

That won't fix logic errors, but the vast majority of security issues are safety problems, especially in this kind of codebase.

There is no algorithm in the world for checking whether a video codec is working correctly or not.

The model checking would not check the output but the trace of operations.

It might prove that all bytes of the output were written to at least once, that dereferenced pointers point toward either the input or some other valid structure, and other properties like this.

I agree with the point about the RSA key generator implementation, and I like the idea that Rust is better than C.

ffmpeg/libav is being fuzzed with reasonable success. See for example https://github.com/google/oss-fuzz/tree/master/projects/ffmp... and https://security.googleblog.com/2014/01/ffmpeg-and-thousand-....

They found issues in code I wrote...

(I work for Google but do not speak for it)
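Fuzzing in miniature looks something like this: throw random bytes at a (deliberately buggy) toy parser and count which inputs crash it. Real fuzzers like the OSS-Fuzz harnesses linked above are coverage-guided and far smarter, but the core loop is the same; the parser and its "format" here are entirely made up:

```python
import random

def toy_parser(data: bytes) -> int:
    """Pretend format: 1-byte length prefix, then payload bytes."""
    length = data[0]  # crashes on empty input
    # Trusts the length prefix, so it can read past the end of the buffer:
    return sum(data[i] for i in range(1, 1 + length))

def fuzz(runs: int = 1000, seed: int = 0) -> int:
    """Feed random blobs to the parser; return how many inputs crashed it."""
    rng = random.Random(seed)  # seeded, so the run is reproducible
    crashes = 0
    for _ in range(runs):
        blob = bytes(rng.randrange(256) for _ in range(rng.randrange(8)))
        try:
            toy_parser(blob)
        except Exception:
            crashes += 1
    return crashes

print(fuzz())
```

Even this naive loop finds the two bugs (empty input, untrusted length) almost immediately, which is the point of the thread: parsers that trust their input are exactly where fuzzers earn their keep.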

What's high quality open source software? ffmpeg works well, I don't think an alternative would be much better. There will never be a software without bugs. The alternative would be spending tons of money for another piece of software with more bugs.

Good quality software means good architectural design, less code, fewer bugs, and more language features and advanced programming techniques to prevent bugs. FFmpeg works well because the hackers did a good job, but they spent millions of man-hours shooting themselves in the foot.

I think "good quality software" is a relative term, and I'd rather have useful software. What happens in big companies is that when software ages, fresh employees will say it is not good software, then they will start from a fresh code base, maybe discard some features, write some new bugs, and the software will generally end up at the same level as the old one. The same can happen in the open source community as well; someone could start a new ffmpeg.

I've worked for video streaming companies the last 5 years or so. ffmpeg all the way down.

I've only used ffmpeg for some minor video edits, it seems super powerful and is obviously quite ubiquitous. I'm curious to know what people have used it for? I'm sure there are some interesting use-cases out there.

Much smaller use case but I’ve used it to automate most of my video editing work (I work on a youtube channel). It probably cut down my work 70%-80%.

After using ffmpeg extensively in the past year, I'm not surprised at its utility being so widespread. It's amazing and has personally saved me countless hours of what would otherwise have been manual work.

I’ve been working for a broadcasting company for a little while now and can confirm I’ve seen the same. A lot of companies are making a lot of money selling services powered by ffmpeg.

ffmpeg is great.

I wrote a thing to proxy a twitch livestream and return it as lazy animated gif, so it could be embedded into, like, old-school messageboards and proprietary chat clients that allow hotlinking images. At the core it was feeding a .m3u8 URL into ffmpeg and asking it for a gif. When it turned out that gif was kind of awful for that use-case, I just asked ffmpeg to give me an mjpeg instead, problem solved.
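Roughly what that proxy boils down to: point ffmpeg at the HLS playlist and ask for an mjpeg stream on stdout, then serve those bytes to the hotlinking client. A sketch (the URL, frame rate, and scaling are placeholders, not the commenter's actual setup):

```python
# Sketch: HLS playlist in, motion-JPEG out on stdout.
# All concrete values are hypothetical.
cmd = [
    "ffmpeg",
    "-i", "https://example.com/stream.m3u8",  # hypothetical HLS source
    "-vf", "fps=5,scale=480:-1",              # throttle and shrink for embedding
    "-f", "mjpeg",                            # motion-JPEG muxer
    "pipe:1",                                 # write the stream to stdout
]
print(" ".join(cmd))
```

Compared to GIF, mjpeg sidesteps palette quantization and unbounded file growth, which is presumably why it worked better for a "lazy animated image" endpoint.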

i owe so much to fabrice, the ffmpeg community, and the upstream encoders and decoders and their respective communities. this post was a good reminder to donate.

are you saying ffmpeg is the next log4j ?

You may be joking, but if one found an RCE in FFMPEG that applied to any A/V transcoding it does, you would probably be able to target a boatload of websites. Of course not on the scale of log4j, but it would still be pretty bad.

Amazon prime video used to use it (I think both server and client, but I can't fully remember).


is there even any real alternative for working with video files that doesn't use ffmpeg?

And SQLite.

The only question I have about this is: why did he need a "reminder" to donate...
