The "Why Use JSMPEG" section hits the nail on the head with respect to the needs of some projects I have worked on.
In particular, I once needed to stream video from a UAV to multiple tablets and phones on a local network to allow collaborative annotation. Since it was a disaster response system, there was no depending on external services, which at the time put WebRTC out of the picture (all of the easy-to-use implementations required internet access to use existing signaling services).
We ended up using MJPEG and then later a JavaScript implementation of an MPEG-1 decoder. This library certainly would have made my life a little easier at the time!
MJPEG is so freaking simple [1] and almost no one knows about it. It's one of my favorite things to play around with and in my opinion way underutilized.
I built "drawing board" a while ago using an old school image map streaming a motion jpeg. You can interact with the video without ANY JavaScript [2]. The browser stays where you are on posting because the server returns an HTTP 204 - also highly underutilized in my opinion.
MJPEG in casual parlance is more of a "technique" than a format -- and multipart/x-mixed-replace is a very loose take on even something as loosely defined as MJPEG.
Among video containers, QuickTime introduced "Motion-JPEG" [1], with each frame in the container being a JPEG, and there have been formal definitions of how to carry "JPEG-compressed video" over network protocols, such as RTP [2].
Meanwhile, mixed-replace was a Netscape trick [3] for doing server push over HTTP that never became a real force, but it was reimplemented by almost every browser trying to stay Mozilla-compatible (except IE) and happily used by webcams for dead-simple video streaming.
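For anyone who hasn't seen it, the whole trick fits in a handful of lines. Here is a rough sketch using Node's http module in TypeScript; the frame source, nextJpegFrame, is a made-up placeholder for whatever actually produces JPEG frames:

```typescript
// A minimal sketch of the multipart/x-mixed-replace technique.
// nextJpegFrame() is a hypothetical stand-in for a real frame grabber.
import * as http from "http";

declare function nextJpegFrame(): Buffer;

const BOUNDARY = "mjpegframe";

http.createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": `multipart/x-mixed-replace; boundary=${BOUNDARY}`,
    "Cache-Control": "no-cache",
  });

  // Push a new JPEG part several times a second; the browser replaces the
  // previous image with each new part, which reads as video to the viewer.
  const timer = setInterval(() => {
    const frame = nextJpegFrame();
    res.write(`--${BOUNDARY}\r\nContent-Type: image/jpeg\r\nContent-Length: ${frame.length}\r\n\r\n`);
    res.write(frame);
    res.write("\r\n");
  }, 100);

  req.on("close", () => clearInterval(timer));
}).listen(8080);
```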
> there was no depending on external services, which at the time put WebRTC out of the picture (all of the easy-to-use implementations required internet access to use existing signaling services)
That's a good rough demo of how WebRTC connections can still be established with the offer/answer exchange conveyed out-of-band.
To make it a little more friendly for tablets (and to finish the exchange before the messages expire), I'd think QR codes would be a reasonable way of passing the data without depending on an external service.
There could also be some extraneous information that can be stripped to save on the amount of data you pass between peers so that the QR code isn't excessively gross (see: https://webrtchacks.com/the-minimum-viable-sdp/)
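A rough sketch of what the offering side might look like; the data channel name is just illustrative, and trimming the SDP as in the article above would happen before encoding the QR code:

```typescript
// Sketch of out-of-band WebRTC signaling: create an offer on a LAN-only
// connection, wait for ICE gathering, and use the resulting SDP as the
// payload for a QR code. "annotations" is a hypothetical channel name.
async function makeOfferForQr(): Promise<string> {
  const pc = new RTCPeerConnection({ iceServers: [] }); // no STUN/TURN, LAN only

  // An offer needs at least one media track or data channel to describe.
  pc.createDataChannel("annotations");

  await pc.setLocalDescription(await pc.createOffer());

  // Wait until ICE candidates are folded into the local description.
  await new Promise<void>((resolve) => {
    if (pc.iceGatheringState === "complete") return resolve();
    pc.addEventListener("icegatheringstatechange", () => {
      if (pc.iceGatheringState === "complete") resolve();
    });
  });

  return pc.localDescription!.sdp; // this string is what the QR code would carry
}
```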
I don't think we ever did formal testing on this (external battery packs were plentiful), but I do recall some of the early versions of the MPEG-1 decoder causing tablets to heat up.
Yes. The link below is an early summary of the project [1], but I'll summarize here too:
We were examining how the different individuals involved in a disaster response could best communicate their needs to the pilot of the UAV without causing significant cognitive burden on the pilot.
We explored using sketching, virtual spotlights, and audio communication. The UAV's video stream and sensor data were pushed out to client devices, where users could apply annotations to the video stream that would be reflected across all devices. The pilot would see these annotations unless they toggled out of the collaborative mode.
Hardware-wise, we used a little ARM SBC (ODROID-U3) hooked into the UAV's existing piloting system to serve everything either directly to the client devices or back to servers in our mobile response lab. I believe we used Nexus 7 tablets for the UAV pilot and iPads for responders, but responders could also use their own devices if they preferred.
Video decoding in JS is very impressive - really highlights the speed of modern interpreters. I especially love that a Björk track from the '90s is featured.
I recently worked on a personal project which had to play back .webm files, and I used a similar utility:
Yet more people who cannot decode video correctly. Videos are not in the sRGB color space and must be converted to it for display in the browser. Videos also do not use the full 0-255 range; instead they use the narrow 16-235 range, where (16, 16, 16) means black and (235, 235, 235) means white. This has to be expanded to the full range used by the web canvas.
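For reference, the usual BT.601 limited-range conversion looks roughly like this; the coefficients are the common approximations, and BT.709 content needs slightly different ones:

```typescript
// BT.601 limited-range YCbCr -> full-range RGB (the usual approximate
// coefficients; BT.709 content needs a slightly different matrix).
function ycbcrToRgb(y: number, cb: number, cr: number): [number, number, number] {
  const yf = 1.164 * (y - 16);          // expand 16..235 luma to full range
  const r = yf + 1.596 * (cr - 128);
  const g = yf - 0.392 * (cb - 128) - 0.813 * (cr - 128);
  const b = yf + 2.017 * (cb - 128);
  const clamp = (v: number) => Math.max(0, Math.min(255, Math.round(v)));
  return [clamp(r), clamp(g), clamp(b)];
}
```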
It's great that you bring more information to the table, (I didn't know about the narrow range myself) but you don't have to be condescending about it.
I remember toying with this idea when I was doing some web-based slot-machine mobile games.
We had a ridiculous amount of assets (mostly animations) that had to be compressed, because one of the sales representatives noticed that it was impossible to play any of the games when connected to a 2G network.
Eventually we didn't go with this solution because it considerably reduced battery life and made the devices heat up too much.
Beyond the valid patent worries (Broadway.js uses Android code, which isn't cleared with MPEG LA the way the Cisco code is), I would be interested in a proper comparison with jsmpeg in terms of FPS and battery use on mobile. I would assume there is a CPU cost associated with more complex decoding operations.
Edit: here is jsmpeg's author talking about it:
> There's been an experiment, called Broadway.js, which tries to decode H.264 in JavaScript. And there's some demos available, but I haven't been able to make this work consistently. It's very flaky. It tries to decode different stuff, and different threads And it barely works, if it works at all, so-- and you have to download, maybe, one megabyte of JavaScript for this. It's all part of EM script. [sic — emscripten?] And it's-- yeah, it's very complicated to get working, which is why the MPEG1 form of this is so nice for this, because it's so simple. And you end up with a decoder that's 30 kilobytes in size.
This seems risky to use in production due to H.264 patent issues. MPEG-1, on the other hand, could seemingly work as a drop-in replacement for GIFs in a variety of cases with no such worries.
Media Source Extensions don't give you the low-level control needed for low latency, and streams need to be re-muxed into fMP4. WebRTC is a good option, but the infrastructure required can be a non-starter for many projects. For example, native WebRTC libraries for golang currently lack the support needed to stream low-latency video.
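For context, the MSE path being described looks roughly like this; the codec string and segment URL are placeholders, and every chunk you append has to already be a valid fMP4 fragment:

```typescript
// Rough shape of playback via Media Source Extensions. The browser controls
// buffering, which is where the extra latency comes from compared to decoding
// frames yourself the way jsmpeg does.
const video = document.querySelector("video") as HTMLVideoElement;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", async () => {
  const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
  // Placeholder URL: each appended chunk must be a complete fMP4 fragment.
  const init = await (await fetch("/segments/init.mp4")).arrayBuffer();
  sb.appendBuffer(init);
});
```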
Thor has been merged with Xiph's Daala into the IETF's NETVC effort, and both Cisco and Xiph are backing it.
NETVC is a next generation codec designed to replace H264, H265, and all future MPEG codecs with a system that is not user- and developer-hostile wrt licensing lock-in via (possibly invalid) patents.
Why not just use VP9 right now? I used to hate it, but I changed my mind recently; it's pretty nice and works in Chrome, if you don't need Apple device support (for example, for storing and archiving surveillance camera footage or ripping a library of DVDs and Blu-ray movies). Or, in the future, AV1, which is Daala + Thor + VP10.
This is very cool!
In 1999 my final-year degree project was to implement an MPEG decoder in software.
My only source of information was the MPEG technical reference manuals.
It took me 3 months to be able to decode my first frame.
It ran at less than 10 fps on an AMD K6, but I learned a lot about video and compression.
I did exactly the same, but instead purely for fun and learning. Picked a random/blind clip from a naughty movie for extra motivation to get the first frames to the screen :D
I also worked from just the reference book, with no prior knowledge of video coding at all, which made it quite a puzzle to get something on the screen and moving, but it was extremely satisfying when it all worked (to some extent, the thing was horribly slow, broke after one group of predicted frames, and I never implemented chroma, just luma)
Watch out! Some of this code may be GPL-encumbered. The decoder for JSMpeg is based in part on "Java MPEG-1 Video Decoder and Player" [1], which is licensed under the GPLv2. I am not a lawyer, but an argument could be made that the sections of JSMpeg's decoder that are directly ported from that project are a derivative work.
I notice that the demo video stops when I switch tabs. Is this by design or by accident? Can I use JSMpeg but have it play in the background? Another thing: can I have video controls?
PS: That video has a very late nineties, early double-ohs feel to it indeed. Good choice of video :)
The goal isn't to run down the users' battery. The goal is to do whatever timer callback processing a tab is doing that got throttled in the first place.
Throw in an inaudible frequency once per second at a low volume. I doubt they're doing signal processing to decide if a human can actually hear the output.
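Something along these lines with the Web Audio API, presumably; whether a given browser's throttling heuristics are actually fooled by it is an open question:

```typescript
// Sketch of the "inaudible tone" idea: a near-ultrasonic oscillator at very
// low gain. The parent suggests pulsing it once per second; a continuous tone
// is shown here just to keep the example short.
const ctx = new AudioContext();
const osc = ctx.createOscillator();
const gain = ctx.createGain();

osc.frequency.value = 20000; // at or above the limit of most adults' hearing
gain.gain.value = 0.001;     // barely above silence

osc.connect(gain);
gain.connect(ctx.destination);
osc.start();
```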
I don't know where they get the "5 seconds" of latency for DASH videos. There's definitely a startup latency, but it's not 5 seconds, and it can be avoided with some simple techniques.
They're thinking of live DASH streams, e.g. how many seconds from the video camera to the screen.
I can see it would probably be possible to get low latency, but without a fancy server stalling connections until frames become available, flushing on frame boundaries, etc., I can't see it working. Try doing such things with a CDN like Cloudflare to support lots of users...
Also, for true streaming, timing should be driven by the camera. I.e., if the camera's frame rate is 0.0001% slower than the advertised 60 fps, the display device should slow down to match.
MPEG-DASH has no ability to do this, and would lead to a "buffering" gap for a second every few hours of playback time.
It is incredibly hard to stream video from say, a Raspberry Pi to the web in a way/format that's easy to consume on multiple devices or using just the browser. This is awesome.
This begs the question: which codec is optimal in terms of "cpu-cycles-to-decode per second" for a given image quality (for some subjective measure of quality)?
"Begs the question" is widely understood to mean the same thing as "raises the question" in common usage, and complaining about it is one of the sillier prescriptivist hills to die on.
The older meaning is obscure, largely redundant, and doesn't really make any sense etymologically, so it's not really surprising that the newer meaning caught on.
(And since we're being pedantic, it's got nothing to do with grammar.)
> In modern vernacular usage, "to beg the question" is frequently[citation needed] used to mean "to invite the question" (as in "This begs the question of whether...") or "to dodge a question"
That's just useless. And you know, saying "it once was A, now it's B, so it can never be A again" is exactly as "prescriptivist".
Little things add up, and before you know it people are just stringing words together as demonstrated on a million youtube videos.
> The older meaning is obscure, largely redundant
How so? How is the "new one" (which one? heh) not redundant? If you want to say something raises a question, that's an easy way to put it right there. On the other hand, I'm not even convinced that the bastardization of "begs the question" into "raises the question" wasn't simply based on not understanding what assuming an initial point even could be: just hearing the phrase without understanding the context and using it as another way to say that something or someone raises a question. I certainly don't hear it in common usage, regardless of the phrasing used. You know, if all those other people say they just say it because "most" people do, then none of them actually has a reason. A billion times zero is zero.
And don't even get me started on people suddenly calling low framerates "lag" :P It just destroys information, and you can call it progress because the hands on the clock moved a little, but I won't.
>saying "it once was A, now it's B, so it can never be A again" is exactly as "prescriptivist".
Except I'm not telling you what you can say, I'm telling you what other people do say, which is descriptivist. You can use meaning A if you want to, but don't expect people to understand you.
>You know, if all those other people say they just say it because "most" people do, then none of them actually do have a reason.
That's like saying no one in the US has a reason to speak English, they're doing it just because everyone else does.
Language is about shared understanding, so most people around you using a particular meaning is pretty much the only reason for someone to use it.
When the comment says "understood", read it as "fluently understood". And again, "begging the question" has nothing to do with grammar, so there's no reason to sass by making grammatical errors.
Begging (for) the question is a correct literal usage of those words, under both prescriptivist and descriptivist definitions of the words. You're allowed to use the same sequence of words as an idiom in other ways. I can talk about a hot potato without metaphor, and I can talk about begging a question when not describing a fallacy.
With H.264 decoders being shipped in pretty much anything with a screen these days, that format is usually significantly more power efficient and easier to decode.
It's not the cheapest to encode, but even a low-CPU-use baseline H.264 profile will beat MPEG-1 or MJPEG any day on bandwidth and quality.
In realistic cases something like Lagarith was traditionally worthwhile simply because you couldn't read raw video from disk fast enough. I don't know whether that's still true in the age of SSDs.
66 MB/s consistently is pushing it, and resolution, bpp, and fps can all go substantially higher than the numbers you gave. More to the point, realistic systems could handle much higher resolutions (wherever the limit for that system was) using Lagarith than with raw video.
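As a rough sanity check on those numbers (uncompressed 8-bit RGB, no chroma subsampling):

```typescript
// Back-of-the-envelope raw video bandwidth, to show why reading uncompressed
// video from a spinning disk was the bottleneck.
function rawMBps(width: number, height: number, fps: number, bytesPerPixel = 3): number {
  return (width * height * bytesPerPixel * fps) / 1e6;
}

console.log(rawMBps(640, 480, 30));   // ~27.6 MB/s
console.log(rawMBps(1280, 720, 30));  // ~82.9 MB/s -- already past 66 MB/s
console.log(rawMBps(1920, 1080, 30)); // ~186.6 MB/s
```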
You can actually do this with H.264 as well. I did it for a hackathon a couple of months back. We are actually looking at doing this in production for video streaming on an internal network now that NPAPI is deprecated.
Awesome work.
I'm using it for an e-learning/robotics platform for low-latency live streaming, where it does a great job.
I'd like to know how reverse playback could be achieved.
Is there a way to "undo" a P-Frame calculation, or is a second (reverse) mpeg file a possible solution?