I just want to say that this is really actually kind of mind-blowing from an audio engineering perspective.
Outputting audio from multiple laptops in the same room is easy. Perfectly syncing it is harder. Implementing echo cancellation across all of that is quite a bit trickier than regular single-device echo cancellation.
But then treating all the laptop microphones as a kind of microphone array, having to deal with sync issues and phase issues and background noise issues... that's hard core.
Kudos to the engineering team on this one. This is actually pretty amazing.
Technically very impressive, and meets a real need. My office is still running telecom hardware from a decade ago, all the wireless mics have dead batteries, and the office is reluctant to replace any of it: so many meetings are completely virtual, so why have custom in-office hardware?
This essentially replaces that expensive proprietary hardware with a matrix of laptops, and essentially every user gets a mic.
As someone who grew up with iPhones coming out around the time I was in middle school, when apps producing noises like this were used as pranks in class, I have to ask that people don't do this. The sound is painful.
Yeah I'm pretty sure there are plenty of adults that can hear those frequencies though. It's not like everyone reaches 18 and suddenly loses hearing.
Stupid devices IMO. When they first came out I downloaded a sample audio file from the manufacturer's website to see if I could hear it. I couldn't... because they encoded it as MP3 and it was completely filtered out by the encoding! Literally an empty file.
In our region, some people have devices to repel moles or martens using "Sounds that are inaudible for humans".
I have yet to come near to one of these devices that I can't hear.
And it's not only me, my wife can also hear them, as well as my daughter.
I also know some people who can't hear anything from these devices, but it feels like the statistics about what people can hear and what not are not that up to date.
Agreed. My mother thinks I'm lying. I will admit I feel a little bit special that I can hear her motion activated cat-poop repeller thing.
I visited her recently and wasn't sure if I had finally aged out of my sensitive ears (34 now) or if her batteries needed to be replaced.
FWIW I wonder if people do not experience physical pain from certain sounds, because people seem to be totally fine with sirens but it feels like I'm having a spike pushed into the side of my head.
When I was a teenager, the applause from the end of year school talent show caused me physical pain — enough that the teachers noticed and got me out of the hall.
This no longer seems to be the case, as I'm living right by a major junction and get random full volume sirens at least six times in the average day. I hate them, but they don't hurt.
I used to hear the remotes from TVs, especially old Philips ones and LGs with the single chip on them. That was until I hit 44... after that it's hit or miss, or just my imagination.
I too am in awe of the audio engineering challenges and opportunities here.
But I don't necessarily know that Meet is trying to tackle all this? Are they using the mics as a microphone array & processing signals across phases? Could be missing it but I don't see that they said so. Perhaps they're just picking the loudest mic for a given speaker? Or any of a dozen other simpler tactics?
The current baseline is to manually mute and unmute microphones. So picking the best microphone sounds like a better idea already. If other people make a sound, I think it would be acceptable if that sound was missed/softened.
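For what it's worth, the "just pick the best microphone" idea can be sketched in a few lines. This is only an illustration, assuming each device delivers a short frame of float samples at the same moment; the device names and the `pick_loudest_mic` helper are made up here, and nothing in the announcement says Meet works this way.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples (floats in [-1, 1])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def pick_loudest_mic(frames_by_device):
    """Return the device id whose current frame has the highest RMS level."""
    return max(frames_by_device, key=lambda dev: rms(frames_by_device[dev]))

# Hypothetical frames captured at the same moment on three laptops.
frames = {
    "laptop_a": [0.01, -0.02, 0.015, -0.01],
    "laptop_b": [0.30, -0.25, 0.28, -0.31],
    "laptop_c": [0.05, -0.04, 0.06, -0.05],
}
print(pick_loudest_mic(frames))
```

A real system would presumably smooth the choice over time so the selected mic doesn't flip on every frame.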
In a large room, perfect syncing is actually impossible since different listeners will be far enough from each other's speakers to cause, at best, comb filtering, and at worst, audible delays.
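To put rough numbers on the comb filtering: when a listener hears two equal-level copies of the same signal with a small delay between them, cancellation notches land at odd multiples of 1/(2 x delay). A small sketch, assuming a speed of sound of 343 m/s and a made-up `comb_notches` helper:

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C

def comb_notches(path_difference_m, max_hz=20000.0):
    """Notch frequencies (Hz) when two equal copies of a signal arrive
    with the given path-length difference between them."""
    if path_difference_m <= 0:
        return []
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * delay_s)  # odd multiples of 1 / (2 * delay)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches

# A listener 0.343 m closer to one laptop than another hears a 1 ms offset.
print(comb_notches(0.343)[:3])
```

With only about a third of a metre of path difference, the first notch already sits in the middle of the speech band, and it moves as the listener moves.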
I assume that if speakers are not set too loud then either you're close and the algorithm works or you're far and sound is quiet enough that it's not an issue.
Also imagine that 44.1kHz on one laptop will not exactly equal 44.1kHz on another, or that one can run at 96kHz and the others at 44.1kHz, etc. That means everything has to be dynamically resampled in real time whilst preserving quality and low latency.
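A back-of-the-envelope sketch of why this matters. The 50 ppm clock error is an assumed, illustrative figure rather than a measurement of any real laptop, and both helper names are hypothetical:

```python
def drift_samples(nominal_hz, ppm_error, seconds):
    """Samples by which a clock with the given ppm error drifts from nominal."""
    return nominal_hz * (ppm_error / 1_000_000) * seconds

def resample_ratio(source_hz, target_hz):
    """Ratio a resampler must apply to convert source_hz audio to target_hz."""
    return target_hz / source_hz

drift = drift_samples(44_100, 50, 600)  # 10 minutes at an assumed 50 ppm error
print(f"drift: {drift:.0f} samples, about {drift / 44_100 * 1000:.0f} ms")
print(f"96 kHz to 44.1 kHz ratio: {resample_ratio(96_000, 44_100)}")
```

So even two devices nominally at the same rate would slide apart by tens of milliseconds over a meeting unless the resampling ratio is continuously adjusted.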
I think this is much simpler than what you're suggesting. Careful microphone level management can handle this. No need for audio sync. I know they use the word "sync" but that's a very broad term.
If the distance from the microphone to an "unwanted source" is three times the distance from the microphone to the wanted source, phasing likely won't be an issue.
There are always caveats with engineering, but it's a decent rule of thumb assuming equal-volume sources... I can imagine it's not too hard to detect that anyway; we've been able to do realtime FFT for a very long time.
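The arithmetic behind that 3:1 rule of thumb, as a quick sketch. It assumes free-field point sources of equal loudness, so that sound pressure falls off with the inverse of distance; real rooms with reflections will behave worse. The `level_drop_db` helper is made up for illustration.

```python
import math

def level_drop_db(wanted_distance_m, unwanted_distance_m):
    """How many dB quieter the unwanted source is at the microphone,
    assuming equal-volume sources and the inverse-distance law."""
    return 20 * math.log10(unwanted_distance_m / wanted_distance_m)

# Talker at 0.5 m, the neighbour's laptop speaker at 1.5 m (a 3:1 ratio).
print(f"{level_drop_db(0.5, 1.5):.1f} dB")
```

At 3:1 the unwanted source comes in roughly 9.5 dB down, which is why any phase cancellation it causes tends to be shallow.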
No, probably not. There are no phase issues if you just don't transmit the signal. The hard part would be to determine who's in the room, then who's talking, and then mix appropriately to eliminate feedback and optimize speaker sound quality. None of which requires phase-accurate signal synchronization.
If they're actually able to "sync" (again a poorly defined term) given the problems associated with network latency and different hardware it would border on magic.
Is a neural network even really necessary? This seems like something where careful application of some normal math would work. Find a loud event, correlate the loud event across devices to get offsets (by sequence ID, not clock time), do some fancy math to apply inverse sine waves or what not.
That's not to diminish the accomplishment or say that it's easy (or that I could do it), but I don't think a neural network is necessary here.
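A toy version of the "correlate the loud event across devices" step, to show the normal-math approach is at least plausible. This is a brute-force cross-correlation in plain Python on made-up sample data; `best_lag` is a hypothetical helper, and a real implementation would use an FFT-based correlation on much longer windows.

```python
def best_lag(reference, other, max_lag):
    """Lag (in samples) at which `other` best matches `reference`.
    A positive result means `other` hears the event later."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best

# The same short "loud event" as captured by two devices, 3 samples apart.
mic_a = [0, 0, 1.0, 0.5, -0.3, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 0, 1.0, 0.5, -0.3, 0, 0]
print(best_lag(mic_a, mic_b, max_lag=5))
```

Working in sample indices rather than wall-clock time matches the "by sequence ID, not clock time" point above.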
I have a very strong feeling this is more marketing BS than 100% solved technical achievement. (disclaimer: xoogler, no inside info, just familiar with recent Google and separately, audio processing)
I agree that this is amazing - but what I don't like about it is the fact that a 3rd party is doing this, when it should really be a built-in feature of the operating system - or at least, be implemented as close to the device as possible. From what I can glean from this breathless press release, this functionality requires a fair bit of cloud ... anathema to audio professionals, but maybe not so to the professional management classes.
Too many times these kinds of services are wrapped up at the application layer, where really they belong in the operating system. For example, wouldn't this be a perfect thing to implement as a plugin for Pulseaudio, or JACK, or even .. VST?
(Disclaimer: I work on high end microphone and audio products at a well-known hardware manufacturer of such, where much more effort is being made to make the devices, themselves, smarter ..)
I would honestly try to sync audio output based on a shared time reference, something along the lines of what AES67/Ravenna/Dante does, but you can be a little more lax and use NTP or system time since you don't need to be sample accurate.
For the microphones that would be a little harder, but it's not that tough, since a few high-end manufacturers have phased microphone arrays for videoconferencing. You could probably get close, but the fact is you need the audio from all the sources in a single location for processing, so you can do phase analysis on it and possibly find an optimal delay for each by checking the group delay.
The advantage they have is some latency is acceptable and they don't need to do it on a low power device.
I don't see what this has to do with Gemini but maybe that's just marketing...
My money is that outside of a couple of dog and pony demos with everyone on one well-administered LAN you could not make this work with system time and NTP on consumer devices. You will regularly see 100ms difference in NTP time.
The fact that phased array microphones exist has nothing to do with the point we are discussing, which is audio coherence across heterogeneous devices whose only real connection is a web browser.
I'm thinking more some sort of system with a sync point registered per device and using that as a time reference.
It's not inconceivable that they could easily detect multiple devices in a room and find a sync point based on microphone input from a speaker.
Once you have a sync point found you can then set a delay on all devices to try to match that sync point. Nobody said this is easy or everyone would be doing it but it's simple enough.
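Roughly what I mean by setting a delay per device, as a sketch. It assumes you have already measured when a shared sync sound reached each device relative to some common reference; the offsets and the `alignment_delays` helper are invented for illustration.

```python
def alignment_delays(arrival_offsets_ms):
    """Extra delay (ms) to add on each device so that all of them line up
    with the device that hears or plays the sync point last."""
    latest = max(arrival_offsets_ms.values())
    return {dev: latest - off for dev, off in arrival_offsets_ms.items()}

# Hypothetical measured offsets of the sync point on three laptops.
offsets = {"laptop_a": 12.0, "laptop_b": 40.0, "laptop_c": 25.0}
print(alignment_delays(offsets))
```

You can only ever add delay, so everything gets aligned to the slowest device, which is why some acceptable latency budget matters here.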
The phased array microphones are more of a pipe dream, but you would absolutely be able to do something approaching that with multiple devices in a single room, depending on how accurately you can predict microphone location within the room. I'm reasonably sure you could start by just using the closest mic, and then over time, as you improve sync, you can try to use multiple.
As I said they get every single audio stream in and out into their servers and they have full control of the audio the tab is playing and the timing of that.
I don't see this being any different to what the likes of Sonos/Google Home/ Apple Home etc are doing with synced appliances for stereo/ multichannel devices, it's likely significantly harder because it's heterogeneous devices as you said.
All that doesn't answer my question of how you would do this at the OS level? You don't have any of the required information per device, only the central server has even the hope of having all the relevant information and control.
We agree that doing it at the OS level is probably the wrong direction. I think you could get there with PNTP and audio hardware support, which is more how Sonos etc do it afaik but then again you aren’t solving the heterogeneous device problem.
It is apparently a good example of something that needs performant neural nets in the cloud to solve. At first glance it looks like a low-level hardware-firmware problem. Market conditions prevent solving it at that level though, so we had to wait for the right combination of resources, new signal processing and heavy cloud compute.
Meet, to me, is the perfect balance of functionality and simplicity (as an end user). And features like this only make that more apparent.
Zoom and Teams always frustrate me, even though they're what I use far more often.
It's too bad that video conferencing is largely a game of who has the most boxes checked - IT admins and A/V folk are the ones that need to be convinced, and Meet just doesn't ...meet... their needs
Meet frustrates me immensely because I can't just fullscreen someone sharing an HD screen onto my HD screen and get 1:1 pixels with an overlay of faces (like Zoom does). Instead, I have to either not be able to read small text, or awkwardly pan in the recently-added "zoom" view.
I also think this is perhaps the top issue I have with Meet. The current UX does not consider the fact that people share their screen most of the time. It should have the ability to go full screen at 1:1 scale with some essential floating menus, even more so in recent years, when remote teams are the new normal.
It is pretty frustrating how limited the layout options are. The best workaround is to open a second Meet window and join the meeting in "companion mode" [0]. Then you can at least position the windows how you like, with each focusing on a different thing.
In Firefox, click on the small "Picture in picture" button, then click on the "full-screen" corner button to make any of the videos full-screen. OK you don't have the faces overlay, but you'll have a nice full-screen version of the presentation.
Just in case any Meet engineers are reading... It needs to let me put my self-view right under the webcam lens, so I can stare at myself and still be looking at the camera.
It's horribly slow on some computer/browser combos, which is my main gripe with it. Video quality is also much worse, but that doesn't matter too much to me. And as with all Google stuff the UX is kinda weird and non-intuitive. What do all the different join options do? When should I use what? Why must all presentations be so small if I also want to see the presenter? Zoom is much better here.
Meet's GUI eats screen space and reduces the size of video feeds as a result, with no way to do anything about it. It took years for it to let you choose audio and video sources from the button itself like Zoom has had forever; before that it forced you to go into the settings panels and tap through 4 screens. Same with video feed layouts. I really dislike Meet compared to Zoom.
Zoom also transmits much higher quality video and screen share feeds than meet does.
This new feature is very impressive, although most echo cancellation is "mute everyone except one speaker", which leads to walkie-talkie style half-duplex talking, which really hurts normal communication flows. I wonder how they do it here.
It very rarely matters which product is "better" in any objective sense. If you are a team that works with Teams, your meetings will be on Teams. Teams could be much worse than Zoom or Meet; you wouldn't move your group from Teams for one meeting anyway.
We work with a lot of external customers. Teams, GotoMeeting, GotoWebinar, Skype, Google Meet, Zoom. And honestly, they all suck.
Getting logged in is always a problem. Setting my name permanently is literally always a problem (I have zero idea why). Viewing someone else's screen always stinks because the scaling is just not good.
On top of that you can't compare Teams to Google Meet because literally every organization has theirs configured differently. For some organizations, I can use my phone for audio. For others I can only use my laptop. So it's never an Apples to Apples comparison. Sometimes you have access to chat history; other times if you reconnect you lose chat history. The differences inside a product go on and on.
I used to use a lot of random services while at work, and then I found Jitsi (browser based) and honestly I’m astounded it hasn’t taken over because of its simplicity or ability to just handle anything thrown at it.
My Meet calls always end up pixelated, where Zoom looks perfect. That’s basically the only thing I care about after the audio quality, I often cannot read someone’s screen when shared via Meet. Zoom also feels lower latency but I never checked if that’s the case.
You might be hitting QoS limitations where traffic from Google is bundled with YouTube et al and served at a lower priority?
In my experience Zoom is always slightly better, but at a nitpicking level. I use meet day in day out, and fallback to whatever we have in hand if it's unworkable (pixelation would hit that line), and never saw zoom or Skype or discord being significantly better at these times.
Agree, I find Meet is the simplest and works the best. I really appreciate the audio filtering -- don't ever hear colleague's typing or dogs barking or lawn mowing.
I think we are technically supposed to use Zoom at my employer, but almost everyone (at least in eng) just uses Meet instead. The UX is so much better, and it's rarer that something isn't working with someone's setup.
Also I swear the audio latency is worse with Zoom -- I find myself accidentally interrupting people more often.
Microsoft convinced the CIO that they can't meet regulatory requirements unless everything is recorded and under central compliance records management.
If Meet did "autorecord" combined with "autoshare recording with recipients" (specifically including Google Group recipients)... then it would be perfect.
As it is, we're gradually shifting more things to Zoom where that's either part basic functionality, or part easy to automate as things like Zapier are well integrated into the API.
We only use Meet for team meetings now, and we did use Gong.io to solve the above... but Gong is pretty expensive just to allow those who were out that week to catch up on a recording when they're back and if they care to.
Well I'd hope that's a property of a specific meeting and not all meetings, but team weeklies where those not present want to catch up, and the recording is only going to those regular attendees (members of a specific Google Group)... sure, that's a thing we would like to have.
Not sure what Gong's pricing is, but we evaluated a few different notetakers and settled on https://fireflies.ai/. $18 / month gets a recording and summary sent out to all invitees to the calendar invite, uploaded into Hubspot, etc. Very valuable for our sales calls.
You might be surprised finding out that nearly all meetings are recorded by one or more participants for transcription, automatic notes, summaries and action points.
Meeting recordings are available in Google Apps Script.
Autosharing can be easily solved with a simple Google Apps Script. I scan my personal calendar and share the recording to a Slack channel when I detect a group meeting with a recording available. Mail me, so I can arrange something for you.
Autorecording - yeah, this is missing. There is a paid Chrome extension which does that, but I have never tested it.
I record all meetings locally with OBS, regardless of the tool. But that is only useful to me, as I don't ask for authorization and am not sharing them. It is only because I know my mind will sometimes drift and I want to be able to replay part of the meeting.
Having said that, so many people don't record meetings that I or someone else can't attend. This is annoying. It should be automatized so that if you miss part or all of a meeting you can still watch the record as long as you were invited.
That's a great idea to record locally. Do you keep a deep history or wipe after 30/90/365 days? Do you need any special hardware for it?
My memory is not great. I'm starting to have more calls with clients and it would be great to have a record rather than trying to take notes and conduct a conversation at the same time.
> It should be automatized so that if you miss part or all of a meeting you can still watch the record as long as you were invited
People generally don't like knowing that everything they say is being recorded
> That's a great idea to record locally. Do you keep a deep history or wipe after 30/90/365 days? Do you need any special hardware for it?
If there is anything I want to keep, I will usually take notes, but sometimes I cut the small part I need and store it in a special folder. I just wipe the main "obs" directory once in a while, every other week or so. If I haven't felt the need to play back a video, I doubt I will in the future, so I don't keep a lot of retention; it is pretty much only to help me when I am getting distracted or multitasking during a meeting. Most of the time when I feel the need to play back the video, it is immediately after the meeting, because I know something important had been said but I wasn't 100% focused and want to be sure I haven't missed anything.
But I don't have lots of meetings, a handful a week usually. I don't need special hardware; OBS seems to be heavily multithreaded. It might hurt the battery usage if I am not plugged in, but no core is going very high in terms of CPU and I don't feel any slowness. I am recording with the 2500/160kbps veryfast (medium CPU usage, standard quality) setting at 1080p; it takes like 1MB every 3 seconds.
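The "1MB every 3 seconds" figure checks out against those bitrates. A quick sketch of the arithmetic, assuming a constant bitrate, decimal megabytes, and ignoring container overhead (the helper name is made up):

```python
def bytes_per_second(video_kbps, audio_kbps):
    """Approximate recording size per second for the given bitrates."""
    return (video_kbps + audio_kbps) * 1000 / 8

rate = bytes_per_second(2500, 160)           # 2500 kbps video + 160 kbps audio
mb_per_3s = rate * 3 / 1_000_000             # decimal megabytes per 3 seconds
gb_per_hour = rate * 3600 / 1_000_000_000    # decimal gigabytes per hour

print(f"{mb_per_3s:.2f} MB per 3 s, {gb_per_hour:.2f} GB per hour")
```

So an hour-long meeting lands a bit over a gigabyte, which is why wiping the directory every couple of weeks is enough for a handful of meetings.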
The process is exactly like that of streaming a computer game along with local microphone audio, except one pushes the "Start Recording" button instead of the "Start Streaming" button inside of OBS. There's got to be a million (or more) howtos written on the subject.
Hardware-wise, it's pretty straightforward: GPUs (including the ones that are a part of most non-Xeon Intel CPUs) have been up to the task of realtime video compression for around a decade or maybe more, which allows for the heavy lifting to be done in specialized silicon.
A bigger concern than the technical practicality might be legal concerns that generally surround audio recording.
For instance: In my state, I am permitted to record any conversation that I am a participant in -- I don't need permission from anyone but myself, and I don't need to notify anyone.
But in the US alone, there are also 49 other states' worth of laws on the subject, and they can vary quite a lot.
I have always had way more issues with meet than zoom. Guessing it is something about browser compatibility but participants regularly have issues with mics and cameras that I don’t see with zoom.
With Meet, time from opening the meeting link to joining the meeting is often <1 sec, whereas others (especially Teams) have painfully slow loading screens.
Meet is nowhere near "<1 sec" for me, but my employer doesn't provide the best hardware, either. It takes several seconds for the camera to initialize.
Teams' load time is atrocious, though, by comparison.
Meet is perfect, I’ve always loved it. Slick and not bulky. Works beautifully in the browser. Simple features that work seamlessly. The ability to share single chrome tabs is a godsend. Zoom has always felt like a clunky product to me.
As an "IT admin", it's not that I need convincing. It's that the rest of Google's "office" products are absolute amateur hour and support is non-existent.
> the rest of Google's "office" products are absolute amateur hour
Not quite sure about this. Gmail (both personal and company accounts) is IMO a great email client with loads of handy features, and I always felt Google Docs/Sheets/Slides/Drawings were well put together. Are they the best in class? I guess it depends what you are looking for. Could Drive be faster? Yes, that would be nice. But "amateur hour" sounds like we are simply using different products.
I have used Microsoft Office for 20 years and Google Docs/Workspace for 15 years. I much prefer the Google suite over the Microsoft one. I prioritize quality of user experience, ease of use, performance, platform portability, freedom from constant intrusive software updates, and not messing with the built-in Emacs key bindings built into every Mac (Various MS Office programs inconsistently disable those key bindings). YMMV, esp with UI and UX, but for me there's no contest.
Regarding support, that is not an issue: I have never needed it for Google, but occasionally have needed it for MS Office. So I can see why, if you are using Office, you would value that.
This isn't an area where simplicity wins. Zoom is targeted more towards professional use whereas Meet is more for casual or social use. It doesn't really support multiple displays, like I can't watch screen sharing on one display while moving the chat and video windows off to another display. Scaling options for screen sharing are inadequate and it doesn't even support real full screen display. Meet lacks advanced audio options that make it unusable for things like music lessons. Zoom has more third party integrations available.
I think because Meet is fully browser based, some of the features like sharing audio with screen are only available via chrome or chromium based browsers -- definitely not on Firefox in my own testing. I don't know the exact reason for that, but it's kind of sad.
Yeah downloading an app for video calls feels very skype. And it's a non-starter for me personally given Zoom's history of lying about privacy and encryption.
That's fair. Everyone comes to these sorts of positions with a stack of values[1]. How that stack is ordered often determines one's choices.
One of the interesting things for me from a technological perspective is whether or not we're converging on what the ideal set of features for a "video phone" would be. If there was a broad enough consensus on the core feature set I would hope an 'appliance' version would be available which would eliminate needing to use a general purpose processor (and all the risks that entails) for this sort of meeting.
[1] From your response I infer that you value "no native code" and "sandboxed javascript" highly which guides you to the choice to use Meet, vs someone who might stack "User Experience" more highly than those two and end up at a different choice.
I think more critically it doesn't support remote control. I have no idea how they have missed that critical feature. Do people really enjoy "now scroll down, no a bit more... back a bit... there! stop!"??
Meet requires a Google account to attend a video call while Zoom does not. So if you're hosting a meeting with strangers, using Meet is not user friendly.
At least personally I refuse to make a Google account bc it requires giving Google my phone number... But I know that's a bit of technoludditism in the current zeitgeist
Is this new? I have definitely joined Meet calls from my personal laptop and I don't have a personal Google account. This includes my last job interview, and, well, I got the job.
Perhaps it's meeting specific, or organization specific?
You can join a Google Meet meeting without a Google account if the meeting creator enables the “Anyone with the link can join” setting, or if they accept you when you click the link.
It's been over 6 months since I've looked into this, so it's possible my info is out of date - but the “Anyone with the link can join” setting is only available to people with a paid Google business account. Nobody in my department had anything like that
Unfortunately, it's only a matter of time until one of their VPs decides to "improve it" with the usual imbecile ideas they come up with.
>Google Meet now scans all your files so they can get shared automatically with all members on the meeting when it is appropriate to enrich the conversation. You cannot opt out of this feature.
GitLab offers a nice course about transitioning to remote work. One of the items in their guidelines about online meetings is not having hybrid meetings. That is, all those attending must attend from their own device rather than the meeting room one (see Jabra), precisely for the reasons this feature tries to address.
It does in ways you likely don't understand. At large enterprises with satellite offices, most of us don't want to sit at our desks and annoy our coworkers who are also in-office by taking calls at our desks. That leaves meeting rooms, which are at a premium.
Full single device for everyone would mean either everyone loudly shouting at their desks from within their noise canceling headphones or the company giving everyone an office with a door. As even Google doesn't do this for their employees, this is the next best thing we can do to save the sanity of our coworkers and respect remote colleagues dialing in by including them in a shared room meeting.
I sometimes work out of a WeWork here in Bangalore, and they have these tiny one-person soundproof cubicles for taking calls. All they have inside is a cushioned bench to sit on and a table large enough to hold a 15" laptop.
It's a great idea, IMO. You can fit 10-12 of these in the same space as a regular conference room. This solves exactly the problem you're describing.
> You can fit 10-12 of these in the same space as a regular conference room. This solves exactly the problem you're describing.
So it's kinda like sitting in a conference room, except you're talking to other people via a shitty audio quality with latency, random noise, talking over each other due to not seeing each other. You don't even have the benefit of a more ergonomic personal setup like more monitors. When people are in the office, let them do their meetings in person ...
If your entire team is in the same location, then you should certainly have your meetings in person. It'd be silly to force everyone to join a Meet/Zoom. I'm proposing a solution for situations when that's NOT the case (e.g everyone is fully remote or you have a hybrid team).
When you have a hybrid team, the office people can sit in a conference room (for which this Meet feature will be great) and the remote people will connect remotely ...
I worked for a large company with offices all over the globe where 95% of meetings were online. I worked there for several years and don't remember anyone ever complaining about the setup, even though we were in an open office.
Rules are meant to be broken. I've participated in many meetings where, for one reason or another, some participants couldn't join in person. Making it easier - for them and for us - is a technical challenge, and not something to be decided by an arbitrary rule.
Remote is extreme and is still not the norm. It is easier to baseline on things optimised for remote then relax than try to shoehorn in person into remote which is then doomed to fail.
I worked for a company that was headquartered in London and had satellite offices in Spain and Germany. After we all went remote during the pandemic the EU offices said they felt so much more engaged with the rest of the company because they were no longer disadvantaged by default for not being in HQ and in person bad habits were penalizing them
It's definitely doomed to fail if CEOs want you to think it's doomed.
Your comment is handwavy with words like "feel" and "bad habits". There are real issues with remote work, but there are also ways to mitigate the downsides of it. It's easier for some people to just dismiss the idea entirely and pretend that in-office work is the better alternative without any problems. I have definitely seen that happen in my company and here.
Yes - my response was to the rejection that rules are to be broken.
I have worked three fully remote jobs, and the ones that did it best fully leaned into being remote.
There is no hard or fast rule, I've been in a team social where I was the only person remote and it still felt relatively natural. I've done a fully remote follow a recipe and cook at home session which was also pretty fun.
I've also been in places that do the bare minimum of what constitutes remote "oh we use Zoom and screenshare" and dictate to people where everyone is cam off. And the difference is night and day.
I think that a little bit of cargo culting wrt remote etiquette is probably a net good thing because I posit many people still don't know what good looks like.
Personally, it was an eye-opener for me to see what difference the personal contact makes. I now prefer companies which are in-person or hybrid (which doesn't optimize for remote).
It's optimizing for the lowest common denominator. When a majority of the team is present in office, it's silly to downgrade the experience for everyone.
You are right, but sometimes hybrid is just a reality. For us it would be an absolute game changer not to have to rush to find an empty meeting room or phone booth.
The meeting room (should) have a much higher quality mic than an attendee's laptop. You want that mic.
Worse, when attendees use their mic, someone in the room asks a question, which remote viewers cannot/do not hear, because it is not picked up by the mic on the laptop. Or they join and don't mute, resulting in feedback loops.
I do feel the need for Google's feature though: the pandemic has meant a reduction in offices, so now we're ending up in "co-working" spaces, which — despite this being the bread and butter of the business — have in my experience alarmingly poor quality meeting rooms. To the extent that those present have given up on them. Literally, we were in one with no cabling. We asked the company for a cable, and they gave us a DP cable, but the room was HDMI. In these situations, yeah, nobody is going to want to waste the time trying to deal with getting the coworking space provider to deliver a quality product, and features like the one here help make up the difference.
I have literally never worked in a place with coworkers who understand VC tech well enough to understand what you're suggesting and who are diligent enough to actually do that. Getting that level of cognizance, over the entire seated body of a conference room, for the entire meeting, is unicorn territory IME.
I just exited my startup and now work for a much bigger company. I'm going from a Google-based productivity stack to a Microsoft one, and so far Teams is the worst and best change. Google Meet was fantastic for "just working" especially for less sophisticated meeting guests. The only time Meet didn't work is when corporate IT departments at customers actively blocked Meet so their minions used the "approved meeting solution".
As far as Teams vs. Meet, it seems like Teams is great when you are working with people that have climbed its learning curve. Teams is also filled with UX paths where it takes one or two extra clicks (and thoughts) to do simple things like share a file.
I can give several. Most of them are not Meet specific but about how Meet integrates with the rest of Google's products. This is actually a massive overall problem with Google Workspace in general IMO.
1) You can call someone in Teams. Sure, Google Chat has "Start Meeting now" but it's very passive. I know some will see this as negative but ringing has massive advantages around UX.
2) Chat, Teams will persist chat after meeting ends. In fact, depending on how you hold the meeting, it might dump the contents into chat channel so it's preserved in more open manner.
3) You can have visible meetings. You can start a meeting in a Teams chat channel so everyone can see the meeting is going on. It creates that in the office feeling of two people working on a whiteboard nearby that if topic interests you, you can join in.
4) Sharing Documents, Add Word Document to meeting, everyone is granted permissions. Done.
The fact so many Google Workspaces companies have Slack is just frustrating. You have these two products that barely talk.
Auto-detect people not using headphones, and prevent them from speaking. Until they put some on.
Ideally showing them some customisable scolding message.
So far, any feedback / echo cancellation I've encountered just makes everybody's life miserable. Degradation to non-duplex voice (because all except the loudest speaker are attenuated down), the "seaside noise" effect, etc.
This is the reason why a phone call over GSM or landline still often "feels" better than any HD video call with people on screens: Low-latency duplex audio.
Most* of this goes away if you just wear headphones.
Maybe this can be fixed by making the algorithm way more complicated, as the announced feature does. But I'd be surprised.
[*]: "Most":
2 people with headphones sitting near each other still cause echoes for each other and other participants. Fixing that is truly novel, and needed even for headphone users.
This is the part where I mention how Bose, despite making a $300 headset, didn't put a pin/wire on the wired connection for the mic.
(I usually use the laptop mic nonetheless, though, because for some reason if you want to record audio with bluetooth, the quality of the audio output becomes potato.)
The thing though is that it's entirely possible to have a setup with loud monitoring (e.g. performers on stage with floor wedges to hear the music and themselves) using speakers. It works well with directional dynamics and stage condensers close to your mouth, but fails horribly with the omni ECMs on your laptop/webcam several feet away.
I love my WFH setup with a dynamic right in front of my mouth and have the system properly rung out so I can monitor myself with no feedback. The cardioid pattern greatly reduces the sound coming from the off-axis speakers and the noise gate takes care of the rest. Obviously this isn't for everyone though and good luck having something like this in the office unless you have a room with a door.
Kudos to Meet team for supporting this! In the video conf world, where most problems are well solved, this is such an amazing feature to differentiate and be customer first!
> Available for Google Workspace customers with the Gemini Enterprise, Gemini Business, Gemini Education, Gemini Education Premium, and the AI Meetings and Messaging add-on.
google meet is part of their office suite, called google workspace. the higher tiers of google workspace include access to the paid tier of gemini.
also, in a previous iteration of google's AI branding, meet had a feature that would create llm-generated meeting notes for you. i'm unsure if this still exists
It's not related to Gemini. I suspect they're putting non-LLM stuff behind the Gemini paywall to make Gemini look more profitable than it is. Like most LLM rollouts, it could maybe become cashflow-positive, but it will likely never be profitable because of the many billions it takes to compete with OpenAI and Meta.
Bundling could also be a strategy to get people to try LLM products. You want X, you have to buy a package with both X and Y to get X. Many people will think, "I'm paying for Y, might as well try getting something out of it."
Meet is nice and nicer than Teams and Zoom in my opinion, but for 1-on-1 pair programming I think Facebook Messenger actually has the best experience - when you're on the same screen resolution, the screen sharing is 100% zoom (unlike Meet which has padding around the edges), and you can move seamlessly between mobile and desktop (i.e. answer a video call on mobile, the other party sets up screen sharing, and you go to a laptop to continue the call). Too bad I'm not automatically Facebook friends with all my co-workers!
What if the digitized voice output had an inaudible frequency modulation or signature added to it, somewhere in a range the human ear can't hear? You could then detect that signal and filter out its source, or merge / manipulate it.
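For anyone curious what "detect said signal" could look like, here is a minimal sketch using the Goertzel algorithm to measure the energy at one marker frequency. Everything specific here is an assumption for illustration: the 19 kHz marker, the 48 kHz sample rate, the 10 ms frame, and the function name `goertzel_power`. Whether such a tone would survive a lossy codec, or actually be inaudible to everyone in the room, is a separate question.

```python
import math


def goertzel_power(samples, sample_rate, freq):
    """Return the squared DFT magnitude at the bin closest to freq."""
    n = len(samples)
    k = round(n * freq / sample_rate)      # nearest DFT bin
    coeff = 2 * math.cos(2 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


SAMPLE_RATE = 48_000   # assumed
MARK_FREQ = 19_000     # hypothetical marker frequency
N = 480                # one 10 ms frame


def tone(freq, amplitude):
    return [amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(N)]


# A 500 Hz tone stands in for speech; the marked frame adds a faint 19 kHz tone.
plain = tone(500, 0.5)
marked = [a + b for a, b in zip(plain, tone(MARK_FREQ, 0.05))]

p_plain = goertzel_power(plain, SAMPLE_RATE, MARK_FREQ)
p_marked = goertzel_power(marked, SAMPLE_RATE, MARK_FREQ)
```

Comparing `p_marked` against `p_plain` (or against a fixed threshold) tells you whether a frame carries the marker. Goertzel is handy here because it evaluates a single frequency bin without computing a full FFT.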
Now fixing search, that would be truly mind blowing.
What search? Some search within Meets? Or are you talking about Google Search in general, in which case I don't think the Meets team has much to do with that.
Do any of the video call services support full duplex audio? It feels so stifling to have to be perfectly silent while others are speaking, compared to the give and take of a normal conversation.
What you're running into is almost certainly noise cancelling. It's turning off your incoming audio so it doesn't intrude on your microphone picking up your voice.
Disabling or tweaking noise cancelling or audio modes (i.e. to "headset") on basically every meeting software will eventually get you a combination that doesn't do this. It's sometimes a bit hidden though, and many have chosen to just say "everyone gets maximum noise cancelling" rather than trying to guess based on your audio device(s) so it doesn't always do it automatically or obviously.
+1
And usually, the one person in the meeting not using headphones does not feel the problem: he can speak all the time and be heard correctly, while the others cannot interrupt him while he is speaking
Pretty sure literally all of them do, but generally only full duplex.
Not triplex or quadruplex etc.
In other words, if you're having a 1-1 video call, both of you have your audio working at all times.
But as you go to 3, 4, 5, 10, 20 participants, it generally continues to be just max 2 simultaneous audio streams, determined by whoever has been the loudest recently.
Unless you turn on special features like music mode etc.
This is a feature, not a bug, because otherwise background noise and sounds would start adding up to become intolerable. (Why we usually try to intentionally stay on mute anyways, so we don't accidentally become even just that second audio stream.)
Last N audio with N about 3 is pretty common. That might be you and three others, or three speakers including you. Beyond that isn't very common, because a) it's not very useful, and b) Google's WebRTC software (which is commonly embedded elsewhere) would only decode three audio streams anyway.
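To make the "last N" idea concrete, here is a toy sketch of picking which streams to forward based on who has been loudest recently. It is only an illustration, not how Meet or any WebRTC server actually does it; the function names, the per-frame energy dictionaries, and the smoothing constant `alpha` are all made up for the example.

```python
def update_levels(levels, frame_energy, alpha=0.8):
    """Exponentially smooth per-participant energy so 'recently loud' counts."""
    for pid, energy in frame_energy.items():
        levels[pid] = alpha * levels.get(pid, 0.0) + (1 - alpha) * energy
    return levels


def select_last_n(levels, n=3):
    """Return the ids of the n participants with the highest smoothed level."""
    ranked = sorted(levels, key=lambda pid: levels[pid], reverse=True)
    return ranked[:n]


# Hypothetical per-frame energies for five participants.
frames = [
    {"alice": 0.9, "bob": 0.1, "carol": 0.0, "dave": 0.0, "erin": 0.2},
    {"alice": 0.8, "bob": 0.0, "carol": 0.7, "dave": 0.0, "erin": 0.1},
]

levels = {}
for frame in frames:
    update_levels(levels, frame)

# Only these streams would be forwarded and mixed; everyone else is dropped.
forwarded = select_last_n(levels, n=3)
```

The smoothing is what makes it "loudest recently" rather than "loudest right now", so a speaker is not dropped the instant they pause for breath.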
I feel like Zoom does the best job of the corporate offerings. Google Meet feels like the worst and most stifling.
I was talking to someone who specialises in corporate communication at a conference last week and she confirmed my feeling above as well. Her thing is “talking over each other is a key part of human interaction and forcing one at a time is stifling and unnatural”
They seem to be designed differently. The business ones try to replicate a business meeting where someone talks at the front of the room, shows slides, and prompts for questions. Discord is more like a LAN party.
Might be also related to the delay in communications. There's the time the packets traverse the networks, but also delays due to buffering to accommodate poor connections.
Total audio delay is from record buffering, sampling (typically 20 ms samples), encoding, packetization (1-5 samples per packet), time in transit, decode, jitter buffer, playout buffer.
You could reduce sample size, and send fewer samples per packet to reduce total delay, but overhead goes way up (overhead is near 50% at 20ms samples, one per packet). In theory, you should be able to do something nice for people doing audio and video by including audio on the video packets, but it's not simple, so I think most conferences don't do it.
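The overhead trade-off is easy to put numbers on. A rough sketch, assuming plain IPv4 + UDP + RTP headers (20 + 8 + 12 bytes) and an assumed codec bitrate of 24 kbps; the bitrate is my assumption, not something from the comment above, and a lower bitrate or extra headers (SRTP, IPv6) would push the 20 ms figure closer to the 50% mentioned.

```python
# Header sizes in bytes: minimum IPv4 header, UDP header, fixed RTP header.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
HEADERS = IPV4_HEADER + UDP_HEADER + RTP_HEADER


def payload_bytes(bitrate_bps, frame_ms, frames_per_packet=1):
    """Encoded audio bytes carried by one packet."""
    return bitrate_bps * frame_ms * frames_per_packet / 8000


def packet_overhead(bitrate_bps, frame_ms, frames_per_packet=1):
    """Fraction of each packet that is headers rather than audio."""
    payload = payload_bytes(bitrate_bps, frame_ms, frames_per_packet)
    return HEADERS / (HEADERS + payload)


def packetization_delay_ms(frame_ms, frames_per_packet=1):
    """Audio that must be buffered before a packet can be sent."""
    return frame_ms * frames_per_packet


ASSUMED_BITRATE = 24_000  # bits per second, assumed for illustration
for frame_ms, per_packet in [(10, 1), (20, 1), (20, 3)]:
    share = packet_overhead(ASSUMED_BITRATE, frame_ms, per_packet)
    delay = packetization_delay_ms(frame_ms, per_packet)
    print(f"{frame_ms} ms x {per_packet}: {share:.0%} headers, {delay} ms buffered")
```

Under these assumptions the header share is roughly 57%, 40%, and 18% for the three cases, while the buffered audio goes from 10 ms to 20 ms to 60 ms. Lower latency is paid for in bandwidth overhead.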
We use google meet internally where i work but i often have to work with external companies using teams, zoom and whatever else. Google Meet just works, all day, everyday for us (barring a fucked up client audio setup...) and this is such a sweet feature we've struggled with, I genuinely look forward to testing this
Another advantage of Meet is that it is under U.S. control and law. Zoom has its development team in China. That country actively targets dissidents. It’s easier when those controlling the software are easy for them to control.
Better to use Meet than Zoom if the Chinese are in your threat profile or a larger threat.
I'm surprised no one here has mentioned around.co yet. They've had this feature for years which was the main reason our team has been using it. Good to see other services catching up.
My company has conference rooms that have specialized hardware for Microsoft Teams meetings. The audio really sucks -- everyone must project their voice for people joining online to hear. If someone speaks in a low voice, nothing can be heard on Teams even though it is clear enough for people in the room. I don't understand how we haven't solved this problem yet.
The functionality here seems interesting, but I assume only works well if everyone or almost everyone brings a laptop. It probably won't work well for those situations where only one or two people take their laptops to an in-person meeting.
But when will Google roll out a standalone app for Meet on Windows and macOS? It couldn't be that hard and it would have better performance than running in the browser.
What makes you think it'll have better performance if it was rewritten as a native app? All the underlying code that handles audio and network communication is already built using fast C++/Rust code. This includes things like WebRTC, audio and video capture, compression, the WebAudio API, etc.
In products like Meet, JavaScript just acts as a glue language that sticks all these APIs together. All the heavy lifting is actually done by the browser, and modern browsers are some of the most efficient language runtimes humanity has ever built.
The killer Slack feature is the ability to draw on the other end's screen. I really wish I had that sometimes in (small) Google Meet meetings, simply to avoid the "it's on your left. Your other left. Okay down 3 items. That was four items. Go up one. No, now you've gone up 2 items…"
Or mid-prod outage trying to get some engineer to type a command to run for you, and they can't dictate from speech to terminal, because they don't know shell/Unix commands at all. Then I can just write the command on their screen, for them to copy character by character… (to some extent, chat solves this. To some extent. I also wish we could get some basic markdown.)
What do you mean cross-reference to them? You can link to any heading. Just place your cursor on the heading and the page URL will update. You can then link to that either from within the document or externally.
You can’t write “see Figure 2 in Section 1.2.3” and have those numbers update automatically. This is used everywhere in academia and technical reports.
You can use it with just a personal Google account - does that require a phone number? Alternatively, someone with a Google account can create a meeting (https://meet.new) and send you the link, which you can join without any Google account at all.
I've always thought it would be nice if microphones can be merged when on a phone call with AirPods in. The mics in those headphones are really far away from my face and when it's windy talking on them is almost impossible. It would be nice to be able to just talk into the mic on my phone. Then the phone could boost my voice from multiple mics and filter out background noise.
Sounds impressive, but also overengineered? Don't headphones solve the same problem?
Though I guess in a hybrid meeting they'd make it harder to hear the people in the room. (But you'd hear them through the laptop, normally? I guess there would be an uncanny delay...)
I think you misunderstood the feature. It fixes the issue where there are multiple people in a room joining the same meeting from their laptops while sitting next to each other. It basically combines all microphones / speakers in that room into one big speaker / microphone just like a fancy conference setup.
> Available for Google Workspace customers with the Gemini Enterprise, Gemini Business, Gemini Education, Gemini Education Premium, and the AI Meetings and Messaging add-on.
Odd. Today I was on a Meet with my girlfriend and her audio kept dropping to silence when she would put the phone down and walk around the kitchen. If she spoke loudly enough it would cut back in. As if it was waiting to recognize voice versus ambient noise.
We use Meet a lot, and I'd never noticed that happening before and wondered if it was due to some setting.
I think in most VC software (including Meet) there is a threshold you have to be above, or it's considered just background noise, and not transmitted.
In Discord, you can actually view your audio in reference to the threshold, which helps get that setting set appropriately for the hardware you have. No such luck for the "real" VC apps though.
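For reference, the plain threshold gate being described is about as simple as audio processing gets. A minimal sketch, assuming frames of float samples in the range -1..1; the threshold and hold values are arbitrary, and this is not Meet's or Discord's actual code.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(frames, threshold=0.05, hold_frames=2):
    """Return the frames, with silence substituted while the gate is closed.

    The gate opens when a frame's RMS reaches the threshold and stays open
    for hold_frames more frames so word endings are not chopped off.
    """
    out = []
    hold = 0
    for frame in frames:
        if rms(frame) >= threshold:
            hold = hold_frames
            out.append(list(frame))
        elif hold > 0:
            hold -= 1
            out.append(list(frame))
        else:
            out.append([0.0] * len(frame))
    return out


quiet = [0.01, -0.01, 0.01, -0.01]   # background hum
loud = [0.5, -0.5, 0.5, -0.5]        # someone talking
gated = noise_gate([quiet, loud, quiet, quiet, quiet, quiet])
```

A gate like this only looks at level, so loud dishes would open it just as well as a voice. That is the difference from a gate driven by some kind of speech detection.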
I've done a fair amount of audio engineering, and my perception today of Meet was that it wasn't the simple threshold / noise gate that I would expect. For instance, I could see her moving her lips and putting down dishes, which I'm sure were noisy, but until she spoke loudly enough for a recognizable voice to overcome the dish sounds, she was essentially muted by the app. I think it was being gated based on vocal range or actual identifiable speech, not just volume. That was what seemed new to me.
I however think (emulated) stereo sound and low latency would do wonders. Sadly this feature here will only introduce latency.
Feedback cancellation for external speakers would also be amazing, so you can get rid of your headphones at home. Close up mic and noise gate works, but is fiddly (and no easy noise gate and compressor on linux...)
A bit off-topic but I'm going to take this opportunity to ask for help. In Zoom, I can share multiple screens/windows at once by CMD + Selecting multiple windows. How do I do that in Google Meet. I want to do a presentation and I want to show/share two separate screens/windows (not monitors) at once.
This is amazing news for my hybrid team. It has always been a struggle figuring out which laptop to use, moving around so you get closer to the mic etc. Hopefully it is as seamless as they portray it here in the blog.
I’ve thought about why you couldn’t do this forever! Amazing technology but I always thought multiple audio input should be able to make a room a better source not worse and full of feedback.
Fantastic idea. This is the first actually useful (as in “step change useful”, not just “here’s another way of browsing your cat photos”) feature I’ve seen from Google in years.
does this mean that when a new person joins a meeting that isn't on mute by default, we won't have to deal with the echoes from everybody else in the office while they scramble to find the mute button?
Are you using firefox? I use this extension called Stylus that lets you write custom css for pages and automatically apply it when you visit the page. There's probably something similar for chromium browsers
Yeah but why do they make the UX so painful? It's their product, they should make it better for the people who use it every day. Why make our lives worse lol wtf, it's good software so I obviously like using it, and want this small change and I'm sure someone like Google can swing it
Do other platforms already provide something similar? While I've been picking up calls mostly in meeting / silent rooms (times of gathering around a jabra like a camping bonfire are fortunately gone for now) - some of the users of a platform that I've built (https://flat.social) would occasionally experience the atrocious feedback whistle while being in physical proximity.
I'm talking about when I hear my own voice. It happens when the person I am talking to is using speakers and a microphone, and my voice is picked up by their speakers and transmitted back to me. I didn't know this was controversial.
It can happen on any service and it means the echo cancellation on their end got confused -- it's an adaptive algorithm but occasionally adapts wrong.
Best way to fix it is to mute yourself for ~5 seconds and then unmute, which should reset their echo cancellation algorithm.
If that doesn't fix it, then have both of you pause and not talk for ~5 seconds. This should absolutely reset it, as the echo cancellation algorithm has now definitively learned what silence is supposed to sound like.
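If "adaptive algorithm" sounds abstract: the textbook version is something like a normalized LMS (NLMS) filter that keeps estimating how the far-end audio leaks into the microphone and subtracts that estimate. Below is a bare-bones sketch with a made-up three-tap echo path and no near-end speech. It is not what any particular service ships; real cancellers add double-talk detection, nonlinear suppression, and much longer filters.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of far_end from mic.

    Returns the residual signal (mic minus the estimated echo).
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        norm = sum(h * h for h in history) + eps
        # Nudge the weights toward whatever would have reduced the error.
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual


def power(signal):
    return sum(s * s for s in signal) / len(signal)


random.seed(0)
far_end = [random.uniform(-1, 1) for _ in range(2000)]
echo_path = [0.0, 0.6, 0.3]  # hypothetical room: short delay plus attenuation
mic = [
    sum(g * far_end[n - k] for k, g in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

residual = nlms_echo_cancel(far_end, mic)
```

Once the weights have converged, the tail of `residual` is far quieter than the tail of `mic`. The catch is that the filter adapts on whatever it hears, so if the weights drift to a bad estimate the echo comes back until it re-converges.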
The fact that you can't easily go full screen on Meets is bizarre. Makes it a non-starter for pairing. Also, the dedicated app's session lasts for like a day which makes it useless, so I'm stuck having to scroll through my 3 chrome windows and 40 tabs to see where my meets tab is. Although I'm sure the session timeout is some sort of gsuite configuration
I don't trust Google with barely anything these days anymore (except Gmail just because it has been so long, and Maps), but Google Meet is the one thing that I prefer Google's solution over anyone else's.
Meet is just so much better than Zoom, Teams, FaceTime, WhatsApp Video, etc.
I'm so glad they are tackling this specific issue. Pretty amazing feat if it works well.