An MP4 file first draft (twitter.com/angealbertini)
102 points by mariuz on Nov 26, 2022 | 19 comments



A revised version was subsequently tweeted at https://twitter.com/angealbertini/status/1596058494784110592


This is specifically about an MP4 that contains a PNG image, as opposed to one containing video and audio.


MP4 is a container format. You can put an Excel spreadsheet in there if you want. I’m not sure what the author is getting at.


It also has requirements on the box tree structure and order, and on which boxes are required or optional. Then there are also mappings per codec (see https://mp4ra.org/) that add requirements on boxes and on how the samples should be encoded.

For example, PNG in MP4 is mapped by having an stsd (sample description) box that has an mp4v format/box, which itself should include an esds (MPEG elementary stream descriptor) box, which will include a decoder configuration stating that the stream type is video and the object type is PNG.
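To make the box tree idea concrete, here is a minimal Python sketch of a box walker. It assumes only the basic header layout (a 32-bit big-endian size, a 4-byte type, a 64-bit size when the 32-bit field is 1, and "to the end" when it is 0). The CONTAINERS set and the name walk_boxes are my own choices for illustration. It does not descend into stsd, because that box carries a version/flags field and an entry count before its children, so reaching mp4v and esds needs extra parsing that is left out here.

```python
import struct

# Box types treated here as plain containers (children start right after the header).
# This is an illustrative subset, not the full list from the spec.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def walk_boxes(data, start=0, end=None, depth=0):
    """Yield (depth, type, offset, size) for each box found in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit size follows the type field.
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing data.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated: stop rather than guess
        yield depth, box_type, pos, size
        if box_type in CONTAINERS:
            yield from walk_boxes(data, pos + header, pos + size, depth + 1)
        pos += size
```

Usage would be something like reading the file into bytes and printing `"  " * depth + box_type.decode("latin-1")` for each tuple the generator yields.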

You can see a decode using fq (https://github.com/wader/fq) of the file used in the poster here https://twitter.com/mwader/status/1596219922304360448

Hope that was helpful


Since most of us associate mp4 with moving images, what exactly is usually wrapped in the mp4 container, and is there a good introduction on the topic available?


Audio and video. Wikipedia has a terse but comprehensive overview. It's comparable to Matroska (MKV).

It's nearly identical to Apple's QuickTime format: https://developer.apple.com/library/archive/documentation/Qu...


> Audio and video.

And images! The same container format¹ is used by AVIF.

¹ ISOBMFF (ISO base media file format), which is ISO/IEC 14496-12 (a.k.a. MPEG-4 Part 12).


If you want to go deep, ISO-14496-12 is probably what you're looking for. You can either pay ISO to read it or, hypothetically, you can google for "filetype:pdf ISO-14496-12". For how things actually work in practice, https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/mov... is a good resource.


> For how things actually work in practice

Another version of the demuxer, way simpler: https://github.com/Const-me/Vrmac/blob/master/VrmacVideo/Con...

A complete mpeg4 media file, with enough features to play the content with hardware decoding: https://github.com/Const-me/Vrmac/blob/master/VrmacVideo/Con...


Moving images.

Generally in the form of a video stream (usually h264, but it can be anything), one or more audio streams (AAC or MP3), and a timestamp index that keeps them aligned and makes it easier to seek.

It's not especially easy to find good documentation; people generally don't write software to touch it directly. They use either the OS media library or ffmpeg.
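As a rough illustration of the timestamp index: in ISOBMFF, part of it is the stts (time-to-sample) table, which stores runs of (sample_count, sample_delta) in units of the track's timescale. The Python sketch below assumes the entries have already been parsed into a list of tuples; the function names are made up for this example, and real seeking also has to consult other tables (sync samples, chunk offsets) that are ignored here.

```python
def decode_times(stts_entries, timescale):
    """Expand (sample_count, sample_delta) runs into per-sample start times in seconds."""
    times = []
    t = 0
    for count, delta in stts_entries:
        for _ in range(count):
            times.append(t / timescale)
            t += delta
    return times

def sample_at(stts_entries, timescale, seconds):
    """Return the index of the sample covering `seconds`, or None if past the end."""
    target = seconds * timescale
    t = 0      # start time of the current run, in timescale units
    index = 0  # index of the first sample in the current run
    for count, delta in stts_entries:
        run = count * delta
        if delta > 0 and target < t + run:
            return index + int((target - t) // delta)
        t += run
        index += count
    return None
```

The point of the run-length form is that a constant frame rate track needs a single entry, and a seek only has to walk the runs instead of every sample.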


Can that file only contain a single frame, or could it trivially contain a full video with one PNG per frame?


It could easily contain a video consisting of a series of images in PNG or other formats. For example, Motion JPEG files are just a series of JPEG images, and used to be a standard capture and "intermediate master" video interchange format. https://en.wikipedia.org/wiki/Motion_JPEG


Is anyone aware of an open source visual "atom viewer" for ISOBMFF files, similar to https://www.jongbel.com/manual-analysis/atombox-studio/ and Apple's old "Dumpster" and "Atom Inspector"?


fq (https://github.com/wader/fq) has mp4 support. It is quite visual but is a CLI tool (for now). It has a REPL and query language to poke around. Disclaimer: I'm the author.

Other mp4 tools I use are https://gpac.github.io/mp4box.js/test/filereader.html and the tools from bento4; ffmpeg -v trace can also give some useful output.


You could also try the Bitmovin MP4 Inspector Chrome extension: https://github.com/bitmovin/MP4Inspector


It might not have the prettiest UI, but I've used this one several times when troubleshooting encoder/muxer issues: https://github.com/essential61/mp4analyser.


Author's GitHub: https://github.com/corkami/pics

He works on things like hash collisions, image files that contain an image of their own hash, "polyglot" files and suchlike.

I think this is intended as a "minimum viable mp4 file" to show what the required binary parts are.


Does anyone have a longer overview of the different components and how they interact (tracks/streams/...) and the different things you can put in there? Something a little more in-depth that is not a full specification, I guess.


Would love to see the H264 MP4 variant! There's also a type of MP4 that is fragmented, which allows for "streaming" the MP4 file.
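For what it's worth, a fragmented file can usually be spotted by its top-level moof (movie fragment) boxes, each followed by an mdat holding that fragment's samples. Here is a small Python sketch of that check, assuming the start of the file is already in memory as bytes; the function names are hypothetical and it is a heuristic, not a validator.

```python
import struct

def top_level_types(data):
    """Return the 4-byte types of the top-level boxes found in `data`."""
    types = []
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        min_size = 8
        if size == 1 and pos + 16 <= len(data):
            # A 64-bit size follows the type field.
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            min_size = 16
        elif size == 0:
            # The box runs to the end of the data.
            size = len(data) - pos
        if size < min_size:
            break  # malformed header: stop scanning
        types.append(box_type)
        pos += size
    return types

def looks_fragmented(data):
    """Heuristic: a top-level 'moof' box suggests a fragmented MP4."""
    return b"moof" in top_level_types(data)
```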



