That's odd. My experience with Firefox on Windows is that I can't get audio to work with Zoom. Jitsi, on the other hand, works perfectly though there is a small yellow box that warns me that I should use a "fully supported browser"
We use the Zoom standalone mostly. So Firefox isn't involved in the Zoom side.
Jitsi works fine with small groups, but if we have >10 Firefox users in a room it doesn't for us. Maybe it works for you because you have much better internet or other resources than we do?
We were using a WebRTC-based tool (DFN) before Corona and had deprecated Zoom. Unfortunately, the people running it didn't manage to scale it up sufficiently in the crisis, so we had to fall back to Zoom.
In Zoom's web client, for audio to work with Firefox you'll need a current Nightly, staged to be released as version 76 in May. Edit: another poster linked the Bugzilla issue for AudioWorklets.
I am on Ubuntu 18.04 LTS using Firefox, and Jitsi hard-locked my machine: it was still sending packets, but all UI was frozen, and I had to power-cycle the machine.
> Obviously as we stopped using Zoom we have to manually upload all our personal data to Facebook now.
I feel like there is a joke that's going over my head or I simply don't understand. Why would you have to manually sign up/log in/auth/whatever with Facebook because you're using Jitsi now, especially on your own host?
The server (jitsi videobridge) DOES have the keys to decrypt the traffic - i.e. it's not e2e encrypted. This is par for the course with WebRTC SFUs (there are some "workarounds" to support e2e encryption over WebRTC that I mentioned in another thread).
The Jitsi team says the server needs roughly 5.5Mbps per Chrome user. Firefox uses a lot more bandwidth and system resources, and degrades room capacity.
Just something to keep in mind, and after some testing I saw the same results.
My AWS bill was projected to be over $1k/month, so I put it on Linode where it'll cost between $100 and $200/month. Just about any decent VPS provider would be a good option compared to AWS due to bandwidth.
> The Jitsi team says the server needs roughly 5.5Mbps per Chrome user.
That sounds like a lot! I wonder why an almost still image can use that much bandwidth, I guess it has to do with the low-latency requirement but I would love more details on that.
We use a technique called simulcast. It consists of making every participant "work a bit harder" for the good of the bunch.
That is, every participant sends 3 separate video resolutions to the server: 720p, 480p and 180p (this may change due to bandwidth constraints). Then the server will only forward the appropriate layer to each other participant. So, if you are only seeing me in a thumbnail it will only forward the 180p layer. If I become the active speaker (or you choose to pin me to the large view) the server will immediately switch to forwarding the 720p layer.
In addition, we use SVC with temporal layers, so thumbnails may be given at just 15fps or even less.
We do have a trick up our sleeve (but IIRC it had to be disabled for the time being) which involves disabling the simulcast layers that nobody is requesting. That is, if nobody is seeing me in the large view, why send the 720p layer at all?
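The forwarding logic described above can be sketched as a toy model (hypothetical names and bitrates, not Jitsi's actual code):

```python
# Toy model of simulcast forwarding in an SFU (illustrative only).
# Each sender uploads three encodings; the server picks one per viewer,
# and a sender may suspend layers that no viewer currently needs.

SIMULCAST_LAYERS = {"720p": 2500, "480p": 700, "180p": 200}  # kbps, rough figures

def layer_for_viewer(is_on_stage: bool) -> str:
    """Pick which simulcast layer to forward to a given viewer."""
    return "720p" if is_on_stage else "180p"

def layers_to_send(viewers_on_stage: list) -> set:
    """The set of layers a sender actually needs to upload."""
    return {layer_for_viewer(on_stage) for on_stage in viewers_on_stage}

# If nobody has me on the large view, only the 180p layer is needed:
assert layers_to_send([False, False, False]) == {"180p"}
# One viewer pinned me, so the 720p layer must be sent as well:
assert layers_to_send([True, False]) == {"720p", "180p"}
```

The point of the sketch is that all the codec work stays on the clients; the server only decides which already-encoded layer to route where.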
SVC spatial and quality layers [1] sound like a really good solution to the bandwidth issue. From my (extremely limited) understanding, if you skip certain packets you get a lower-resolution/quality stream. A client sends a single stream at the best quality it can tolerate, and the SFU can then forward whichever layers each client needs, depending on what resolution that client wants.
What's the state of this in Jitsi? I can only find limited info about SVC [2], and that is only on temporal layers. How much bandwidth does it even save in practice - maybe it's not worth the complexity trade-off?
Thanks for your interesting answer. It doesn't really address my question though, so I will rephrase it:
If I take a 720p video of a webcam and encode it to be delivered as progressive live streaming, the resulting stream is going to be less than one Mbps: because the image doesn't move much, I don't need many key-frames; one every 4 seconds is more than enough, and I've seen streams with no more than one every 30s (live streaming of harbours' CCTV cameras, don't ask me why). But of course it won't be realtime either, and you're gonna have a few seconds of latency. While this is OK for live streaming, it is certainly not for a video chat room.
What I'd like to know is why the latency requirement reduces the encoding efficiency that much. Do you have an idea?
Naive (and perhaps stupid) question here, why don't you request just the 720p feed from each participant and then rescale the streams as clients need them?
The server would have to downscale the received stream. That's at least one round of downscaling for each participant, possibly more if you want to send different quality versions of the same stream down to different participants.
And while your device is quite capable of rescaling your stream -- even a lowest-tier smartphone will do it with no strain on its GPU -- it's much more of a strain on the server to rescale hundreds or thousands of streams simultaneously.
Additionally: AFAIK you can't easily rescale an encoded stream: you need to decode it, rescale it and then re-encode it. For every single stream! That would be horribly computationally expensive.
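A back-of-the-envelope count makes the asymmetry concrete (illustrative accounting, not a benchmark):

```python
# Server-side codec work: forwarding already-encoded packets is essentially
# free, while transcoding means a decode per incoming stream plus an encode
# per (stream, output quality) pair -- all on the server.

def server_transcode_ops(participants: int, output_qualities: int) -> int:
    """Number of simultaneous codec pipelines a transcoding server runs."""
    decodes = participants                       # one decode per sender
    encodes = participants * output_qualities    # one encode per output variant
    return decodes + encodes

def server_forward_ops(participants: int) -> int:
    """With simulcast forwarding, the server does no codec work at all."""
    return 0

assert server_transcode_ops(100, 3) == 400  # 100 decodes + 300 encodes
assert server_forward_ops(100) == 0
```

That is the trade simulcast makes: each client encodes three times so the server never has to encode at all.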
Almost still image is a weird way of saying “numerous still images regardless of amount of motion, that have to be stitched together in precise time sorted batches to provide a smooth experience for the viewer.”
Thinking about the use case it’s obvious this isn’t a simple calendar app amount of data.
I bet if you create some images in various resolutions these services support and what looks good to your eye, then fill folders with sequences of them, you’ll see why it’s a lot of bandwidth.
Better to over-communicate and let the client deal with the organization as it's designed to, and not let some network admin nickel-and-dime over bits.
Financial economy doesn’t really say much about the literal economy of building all these toxic gadgets.
Given the big picture, I'm not sure why such a trivial concern as bandwidth's ephemeral money cost would foster such strong curiosity.
If you had 4 people, the video was going through the JVB for sure. Since a Jitsi Meet installation uses multiple components, maybe you didn't shut them all down?
Video shenanigans can be a dark art, but not that magical ;-)
Very possible the docs are wrong - I'm not a Jitsi dev, so I wouldn't be able to confirm or deny without poking around. But they do specifically state that (as linked).
That said, it is worth pointing out that this thread is specifically about videobridge (i.e. scaling beyond a full mesh).
I've been playing with a custom install of Jitsi Meet for the last few days. It was very easy to set up.
I'm kind of fearing trying to deploy it in my company. This has nothing to do with Jitsi Meet and everything to do with the shitshow that is WebRTC in browsers.
I can't find any desktop browsers that use the available GPU for hardware-accelerated video encoding. Chromium (Chrome and Edge) says it's "only available on Chrome OS and Android". Firefox has a flag with no scare-provisos, but I've been unable to tell if it's had any effect.
By default, every browser on my system was setup to prefer the Intel UHD GPU. I've got an NVidia GeForce RTX 2080 in this laptop. Why not use that?
One of the reasons is that all the browsers have a software-rendering blacklist for certain combos of OS/hardware/drivers. There are rare cases where errant WebGL code can cause a full system crash on (checks notes) Android. So if you're one of the unlucky many who have these combos, but also an operating system smart enough to put graphics drivers into userland, you're taking the fast train to turning your laptop into a blow-dryer. So they tend to take the Intel GPU over the NVidia GPU because apparently Intel's hardware+drivers aren't as buggy.
You can override it in the hidden settings, which means nobody overrides it.
All of this is to say, you could have a very powerful computer and still have very poor WebRTC performance.
I also had a very bad experience with WebRTC in browsers. Just several days ago I tried a videoconference with Jitsi Meet. Even though it was receive-only (no camera or mic), so no encoding, and the received video was pretty low-res, it generated enough load on all my cores that after a while it triggered thermal throttling.
Considering that I can play Full-HD H.264 videos with minimal load, this seems ridiculous.
Does this explain why I get a poor Jitsi experience on my laptop and a better one using the Android app on my phone?
I suspect you are hitting on the exact issue I'm having. I have a Thinkpad with just the Intel 5500 GPU, nothing superb. So, how do I figure out if at least that is being used or not?
Windows Task Manager will show you GPU utilization on the Performance tab. Start Task Manager, then start your browser and see if there is an uptick. Then start a meeting to see if there is another one. There probably won't be. Then, disable the software rendering override in your browser: (Firefox: https://wiki.mozilla.org/Blocklisting/Blocked_Graphics_Drive...) (Chrome: https://superuser.com/questions/836832/how-can-i-enable-webg...) and test again. Probably need to restart the browser after the change in settings.
Something that isn't mentioned in the docs is room (and server) capacity, or at least rough estimates.
The Jitsi team gave some specific numbers on room capacity in the forums. Each room should reliably handle 20-35 Chrome users (Firefox uses roughly double the resources), and has a cap at 75. Apparently if you're only using audio, then rooms of 70+ people will work fine.
They're working on upping this number to 500 for configurations that have multiple bridges, as well as improving the UI to account for larger meetings.
Jitsi Meet is very easy to install and set up.
"apt install jitsi-meet" and you're almost done.
The documentation is very good.
It works great and it is super userfriendly.
You have no excuses.
FYI -- I tried this in a fresh Ubuntu 18.04 LXC container and it failed spectacularly.
There's a weird interaction between the NOFILE security limit (`ulimit -n`), Java 8, and Linux-based containers (noted on both LXC and Docker). If NOFILE is too high, Java 8 will paradoxically run out of memory as it tries to allocate some huge number of file descriptors. That took a while to figure out, since generally when you get an error about exhaustion of file descriptors, you expect the opposite!
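For anyone debugging this, a quick stdlib check of the limit a child process (e.g. the JVM) will inherit -- a Linux-only sketch, and the cap value is just an example, not a recommendation:

```python
# Inspect and cap RLIMIT_NOFILE before launching Java 8, to avoid the
# pathological preallocation described above. Stdlib only; Linux/macOS.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"NOFILE soft={soft} hard={hard}")

# Lower the soft limit (never above the hard limit) for this process and
# anything it spawns afterwards:
resource.setrlimit(resource.RLIMIT_NOFILE, (min(4096, hard), hard))
assert resource.getrlimit(resource.RLIMIT_NOFILE)[0] <= 4096
```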
So just set a lower security limit, right? It doesn't seem to work, I'm assuming because something in the Jitsi install process decides it needs to set them really high for the install session. I haven't debugged what component specifically is doing this yet. This results in the install scripts crashing when the installer tries to initialize the Java CA store.
In theory one could wait for that crash, reset the file limit, and then resume the install process. This works to a certain point, but every time I tried it, the crash in the CA store initialization would leave the dpkg database in a badly broken state that `apt-get --fix-broken install`, `aptitude`, and some manual dpkg futzing weren't able to fix in the course of the 3-ish hours I spent working on it (it complains on the libc6-linux-dev package install that /usr/include/linux/something-something.dpkg-new doesn't exist; that folder structure isn't present on disk).
Technically Jitsi is set up by this point, because the crash occurs as part of the certbot bootstrap. But I was working on getting Janus to work anyway and only tried jitsi-meet because it had an autoinstaller that would set up Prosody, JVB, and all that good stuff. So it wasn't really worth plugging through the rest of the way or dealing with the broken half-installed Jitsi, especially since the install documentation has many warnings about how you must configure the SSL bits properly.
But it'd be great if someone wanted to fix all that and make my life easier. :)
Billions of users who aren't using linux distros with aptitude have an excuse. This isn't remotely close to the "just works" that you need to get wide adoption.
Alternatives to Jitsi are Janus (mentioned in this thread) but also OpenVidu (formerly Kurento) and Mediasoup, which provide building blocks to roll your own. All of these also have demos that you can use out of the box for your own conferences.
Just a nitpick: Kurento still exists on its own, and provides a generalist framework of components to build a service that can handle media.
OpenVidu is one such service: it builds _upon_ Kurento, and can itself be used to make videoconference rooms very easily. Otherwise, Kurento is just a media server that handles media but doesn't have the concept of "rooms", "publishers", "consumers", scalability, reliability, etc., so you would have to develop all those things on your own.
Riot isn't an alternative to Jitsi so much as a complement to it. Matrix is for text chat, and Riot uses Jitsi for calls; while you can embed a Jitsi client into Riot rooms, and it auto-shares the URL, AFAIK group calls are not E2EE even if the room is.
One-on-one calls through Riot _are_ encrypted, but only if made in a one-on-one direct chat.
I might be breaking the rules, but can you point me towards any information about how to integrate my existing self-hosted jitsi server into my self-hosted matrix install? Finding this seems elusive.
Last week I could not manage to use it for a three-way conversation (after 1h of trying everything, and re-initializing the stuff several times). Sometimes the sound of one participant was missing, sometimes the video of one. When we got it to work for a subset of two people, the CPU of everybody was at 100%.
I have been considering developing a simple video conferencing solution using this approach with WebRTC. Basically, I was planning on doing something like this example [0] of server-side peers or just a set of forced TURN servers, whichever approach being deployed on servers at the edge.
I figured using anycast IPs, and having server to server communication across regions before then sending back down to the client would be ideal. Each offer and answer signaling would be done via distributed pubsub (e.g. embedded nats) and general persistence also distributed (e.g. something simple like dqlite). Has anyone had success with distributing WebRTC channels like this? Are there any concerns with redistributing, say, raw h.264 and opus as is without special concern for buffering or transcoding? How do slower consumers handle a fast set of UDP packets?
Also, I doubt there's a way, but anyone familiar with an end to end encrypted approach with WebRTC and DTLS when a server is in the middle? I figure not since offer/answer is for a single peer instead of broadcast and I see no approach in the browser to doubly encrypt media streams.
Using your first solution (a custom pion SFU relay), and selectively routing the streams between "edge" servers would be similar to Octo, as shown here.
Using TURN servers instead would not have the benefit of an SFU (i.e. clients will have to upload to each peer) - it's still a full mesh network.
Using TURN servers with an SFU would work similarly to your pion solution; however, it would also use more bandwidth, as it would forward the same streams multiple times for each peer routed through that TURN server (instead of once per stream with an SFU).
As for e2e encryption over WebRTC via an SFU - yes, this is possible, but it's currently very messy (wasm video encoding and encryption streamed over an SFU-bound datachannel with full mesh distribution of the encryption key). There are plans to implement "Insertable Streams", which you will be able to transform (e.g. encrypt), and which will allow this to work without the hacks.
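The core idea is simple to sketch: clients encrypt frame payloads with a key the SFU never sees, so the server can route frames but not read them. A conceptual Python sketch (the real thing would use Insertable Streams in the browser, and a real cipher such as AES-GCM rather than the XOR stand-in here):

```python
# Conceptual model of frame-level e2e encryption through an SFU.
# XOR is a placeholder for a real AEAD cipher; do not use it for real crypto.

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy symmetric transform: applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

shared_key = b"group-key"        # distributed among clients, never to the SFU
frame = b"encoded video frame"

sealed = xor_bytes(frame, shared_key)  # sending client encrypts the payload
# ... the SFU forwards `sealed` without being able to read it ...
assert sealed != frame
assert xor_bytes(sealed, shared_key) == frame  # receiving client decrypts
```

The messy part in practice is exactly what the comment above says: distributing `shared_key` to all participants (full mesh) and getting encrypted frames through a pipeline that expects to parse them.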
>As for e2e encryption over webrtc via an SFU - yes, this is possible, but its currently very messy (wasm video encoding and encryption streamed over an SFU-bound datachannel with full mesh distribution of the encryption key). There are plans to implement "Insertable Streams" which you will be able to transform (e.g. encrypt) which will allow this to work without the hacks.
So the current Jitsi Meet, the one on the website, is NOT e2e encrypted?
I would even say it's more friendly than Zoom. I only tried the hosted version for both, but the experience with Jitsi was orders of magnitude simpler and faster than Zoom.
For creating a room, you need an account on Zoom and it's generally complicated to get to from the homepage. With Jitsi, right on meet.jit.si, you simply pick your url name and you're in.
For joining, Zoom kept trying to push me to download the app, and I had to press two tiny buttons to get it to open in the browser. But by default you have no audio or video, so I had to spend 10 minutes teaching my parents how to enable theirs. With Jitsi, it opens right in the browser, and video/audio work with no issue once they accept the browser permissions.
At every step, Zoom tried getting me to sign up or download their app, whereas Jitsi works fully in the browser with no account. Creating the room is a single click and I get to pick the URL too.
No account aside (which both have), Zoom really tries to make you download the app and you have to click a bunch of tiny buttons to get it to open in the browser. Jitsi doesn't do that, it all works right in the browser, including screen sharing and all other features.
I like Jitsi because it doesn't require any signups and it's simple to create a channel/room. But the quality of video and audio seems a bit lacking. Does anyone have suggestions on how to improve Jitsi Meet's video and audio quality when working with the official smartphone apps and platform (not self-hosted)? Sometimes (on a broadband connection), for some participants, the video from a broadcaster freezes once in a while, and worse, the audio is often not clear even when the video seems fine. This is with just three or four people in the call, with one person turning on video and audio (like a presenter) and the rest with no video and their mics muted.
Are any of the participants using Firefox by chance? There are known issues being worked out ¹, some in collaboration with Firefox itself ², to improve performance and compatibility.
For now, it seems the best audio/video quality can be achieved by all participants using a Chromium-based browser such as Chrome and Edge.
Which configurations can ensure video and audio at a good enough quality with audio being clear and video not dropping frames (even if the video is a bit blurry, that’s fine)?
You can lower the video quality in the bottom right menu. You can even limit your own camera's capture resolution, for example #config.constraints.video.height.max=360 . See https://github.com/jitsi/jitsi-meet/blob/master/config.js for all the options.
Depends on the bandwidth charges of the SFU host. Mostly that means computing the volume (5.5Mb/s * participants * duration), though some hosts aren't volume-billed, in which case it might instead be a matter of whether sufficient data rate is available (drop the duration) or of paying for the peak consumed. That's just the A/V conferencing, not file sharing or other incidentals one often finds useful.
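A worked example of that volume formula, using the 5.5Mb/s figure quoted earlier in the thread (the meeting sizes and durations are assumptions for illustration):

```python
# Egress volume estimate: rate (Mb/s) * participants * duration, converted to GB.
MBPS_PER_USER = 5.5  # figure quoted by the Jitsi team for a Chrome user

def monthly_volume_gb(participants: int, hours_per_month: float) -> float:
    """Total server egress in GB for a given meeting load."""
    seconds = hours_per_month * 3600
    megabits = MBPS_PER_USER * participants * seconds
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes

# e.g. 10 participants meeting 20 hours a month:
vol = monthly_volume_gb(10, 20)
assert round(vol) == 495  # ~495 GB of egress
```

Against a volume-billed host you would then multiply that figure by the per-GB egress price; against a flat-rate host you only need the peak data rate (here 5.5Mb/s * 10 = 55Mb/s).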
I wanted to evaluate it but was very put off when, on trying to connect to my Google Calendar, it asked for full access to my YouTube profile. I get that there is some YouTube integration, but binding those things together without any mention of why was enough for me to click cancel. Sucks because I really want to ditch Zoom.
Jitsi has really been kicking it at my school lately; I wrote a chat bot for our Nextcloud that creates meetings, and now up to 150 people can easily join one conversation!
At times, the term is used to describe a type of video routing device, while at other times it is used to indicate support for the routing technology rather than a specific device.
An SFU is capable of receiving multiple media streams and then deciding which of those streams should be sent to which participants.
I tried installing this on two different Ubuntu machines using their Ubuntu repo and it did not work. It worked fine on both Android and iOS. This is about what I expect from a Java application - the Java ecosystem is so convoluted and fragile that the only real options are Android, iOS, and Docker. I wish someone would write something like this in Go or Rust.
Is that possible to change, perhaps if self-hosting? Video quality of screen sharing and frame rate are absolutely atrocious, to the point that you have to wait ten seconds until text becomes readable.
Obviously as we stopped using Zoom we have to manually upload all our personal data to Facebook now.