Instead of each of us spending hundreds of dollars on an optimal immersive video call setup just for ourselves, we need effective designers working on video chat interfaces to solve the basic problems that still plague us.
* Why can't I quickly and immediately flag my intent to speak? Once you introduce lag into a conversation (and _every_ video chat conversation includes lag), you can't get away with "let's just be unmuted and hope for the best".
* Why can't everyone in a chat see participants in the same order? How can we truly feel like we are speaking in a virtual space together when we do not share the same view? This one is the most frustrating since it has the greatest impact compared to how easy it is to implement. It is hard to overstate the benefit of having a consistent shared experience.
* Why can't I tell at a glance whether I'm muted or not? Why do I have to find the mute button every time? Everyone has a keyboard, why are we avoiding it like the plague?
* Choosing the correct input/output sources is one of the most common sources of issues. And yet, we continue to hide it behind confusing settings. I've only noticed Discord get the memo and let you adjust sources straight from the main screen.
> * Why do people have to appear in a random order?
The service we use shows people in order of appearance, except you yourself (who is always prepended to the list).
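To make the difference concrete, here is a minimal sketch of the two orderings being discussed: "order of appearance with yourself prepended" versus one order shared by every viewer. The `Participant` type and its `user_id` and `joined_at` fields are hypothetical names for illustration, not any real service's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    user_id: str      # hypothetical stable identifier
    joined_at: float  # hypothetical join timestamp assigned by the server


def shared_order(participants):
    """Same list for every viewer: sort by join time, tie-break on id."""
    return sorted(participants, key=lambda p: (p.joined_at, p.user_id))


def self_first_order(participants, me):
    """Order of appearance, with the local user moved to the front."""
    by_join = shared_order(participants)
    mine = [p for p in by_join if p.user_id == me]
    others = [p for p in by_join if p.user_id != me]
    return mine + others


room = [
    Participant("carol", 3.0),
    Participant("alice", 1.0),
    Participant("bob", 2.0),
]

# Under the self-first rule, each viewer gets a different list...
print([p.user_id for p in self_first_order(room, "bob")])
print([p.user_id for p in self_first_order(room, "carol")])
# ...while the shared rule gives everyone the same one.
print([p.user_id for p in shared_order(room)])
```

The only requirement for a shared view in this sketch is that every client sorts on keys that are identical for all viewers; the tie-break on `user_id` keeps the order stable when two people join at the same instant.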
> * Why can't I tell whether people are present/engaged when their video is off?
Why should you be able to tell? Quite often, meetings are a waste of time, and even in the ones which aren't, only a small amount of information will be useful to each person. It's easy to read HN and pay attention for your name or the name of the team you're part of while on a call. Besides that, what counts as 'present/engaged'? I think everyone would be present (in case something be asked of them). Does 'engaged' mean they need to sit there and nod at everything even if it's irrelevant to them?
> * Why can I only see 5 people if someone is sharing their screen?
This reminds me of my two huge pet peeves:
* Why am I forced to see people? If I'm deep into helping someone else out, I want to make their screen as big as possible; I'm not looking at their face at all.
* Zoom spends some effort making my lips and voice sync up. Why don't you make my clicks sync up as well when I'm sharing?
All these problems come from these video call services trying to serve a million masters. If you're in a meeting it might be nice to have a big red X while you're muted, but it's distracting in an online class when you're always muted. If you're in a class or doing a demonstration you might always want to see the screen, but if you're doing a presentation you may want to see both the face and the screen. How many people should you show while someone is sharing their screen? In my online class I would say 0, and in my meetings I would say everybody.
Zoom shows a huge text overlay if you're speaking while on Mute.
It also has the space bar to temp unmute, basically a tap to speak mode.
I think the UI route for Zoom is wrong though as most users are just lazy and can't be bothered with even minimal effort. So, it should play a loud-ass sound when you speak on mute.
Or just electrocute, to solve humanity's problems long term.
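For what it's worth, the "you're speaking on mute" check itself can be sketched in a few lines. This is only an illustration under loud assumptions: it uses a plain RMS level against a made-up threshold of 0.05 on samples normalized to [-1.0, 1.0], whereas a real client would presumably use proper voice activity detection. The function names are hypothetical and not Zoom's.

```python
import math


def rms(samples):
    """Root-mean-square level of one audio frame (samples in [-1.0, 1.0])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_warn(samples, muted, threshold=0.05):
    """True when the mic is muted but the frame is loud enough to look like speech.

    The threshold is an assumed value for illustration only.
    """
    return muted and rms(samples) >= threshold


loud_frame = [0.5, -0.5, 0.5, -0.5]
quiet_frame = [0.001, -0.001, 0.001, -0.001]

print(should_warn(loud_frame, muted=True))    # loud while muted
print(should_warn(loud_frame, muted=False))   # loud but not muted
print(should_warn(quiet_frame, muted=True))   # muted but quiet
```

Whether the warning is then an overlay, a sound, or something else is exactly the design question being argued about above.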
Yes, exactly! And why can't it show me when my audio is breaking up on the receiver's computer? It could be their system or mine; either way, I'd like an indicator of dropped audio.
> Why can't I tell at a glance whether I'm muted or not? Why do I have to find the mute button every time? Everyone has a keyboard, why are we avoiding it like the plague?
Would be nice if they just drew a giant red X when you’re muted.
My gripe is multiple screens while presenting. Give me the choice to show whichever screen my mouse is on.
My experience with "hand raising" interfaces has been that it takes clicks, when it should be a first-class operation as easy as speaking itself. Indicators for speaking, too, are insufficient. I think every video chat app has that, but what they don't have is "I am _about_ to speak", like "I would like to jump in now": an interface equivalent to that thing people do with their body language to signal they're about to speak (they raise their head, open their mouth, perhaps lean in a bit). Not as extreme as "raising my hand".