New look at WebRTC usage in Google Meet

I hadn't looked at Google Meet's webrtc-internals for a while, so during a meeting last week I decided it was a good time to check the latest changes that had been added.

P2P Connections

One of the first things that I checked was whether Google Meet was using P2P connections when there are only two participants in the room, and I was surprised to find that it was not the case.  P2P support was included in the past (see the P2P-SFU transitions discussion) but has apparently been removed.

This increases the infrastructure cost (not an issue for Google) and slightly increases the end-to-end latency for 1:1 calls, but given that Google Meet is probably deployed in many points of presence, the increase is likely small. The simplicity of not having to handle another type of connection and the transition between them (P2P <-> SFU) is a big benefit, so it looks like a reasonable trade-off.

ICE candidates and (NO) TURN servers

Google Meet is not configuring any ICE servers anymore, and their servers provide candidates for both IPv4 and IPv6 of the following types: UDP (random port), TCP (same random port as UDP), UDP (port 3478) and SSLTCP (port 443).
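To illustrate the four candidate flavours, here is a small sketch that parses standard a=candidate lines and extracts the transport and port. The candidate strings themselves are made up for the example (addresses, priorities and foundations are hypothetical); only the transport/port pattern follows what Meet's servers offer.

```javascript
// Hypothetical candidate lines matching the four flavours described above.
const candidates = [
  "candidate:1 1 udp 2130706431 203.0.113.7 39874 typ host",
  "candidate:2 1 tcp 2130706175 203.0.113.7 39874 typ host tcptype passive",
  "candidate:3 1 udp 2130706430 203.0.113.7 3478 typ host",
  "candidate:4 1 ssltcp 2130705919 203.0.113.7 443 typ host",
];

// Fields in an a=candidate line are space separated:
// foundation, component, transport, priority, address, port, ...
function parseCandidate(line) {
  const parts = line.split(" ");
  return { transport: parts[2], port: Number(parts[5]) };
}

const parsed = candidates.map(parseCandidate);
// parsed[3] -> { transport: "ssltcp", port: 443 }
```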

This is the PeerConnection configuration; as you can see, iceServers is empty:

{ iceServers: [], iceTransportPolicy: "all", bundlePolicy: "max-bundle", rtcpMuxPolicy: "require", iceCandidatePoolSize: 0, sdpSemantics: "unified-plan", extmapAllowMixed: true }

{ advanced: [{ googScreencastMinBitrate: { exact: 100 } }] }
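Recreating an equivalent connection is straightforward. This is a minimal sketch using the configuration from the dump above; note that sdpSemantics, extmapAllowMixed and googScreencastMinBitrate are Chrome-specific non-standard fields, and RTCPeerConnection itself is a browser-only API, so the constructor call is shown as a comment.

```javascript
// Configuration equivalent to the one dumped from webrtc-internals above.
const config = {
  iceServers: [],               // no STUN/TURN servers at all
  iceTransportPolicy: "all",
  bundlePolicy: "max-bundle",
  rtcpMuxPolicy: "require",
  iceCandidatePoolSize: 0,
  sdpSemantics: "unified-plan", // Chrome-specific, now the default
  extmapAllowMixed: true,       // Chrome-specific
};

// In a browser context you would create the connection like this:
// const pc = new RTCPeerConnection(config);
```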

The fact that it doesn't use any TURN servers shouldn't be a huge issue given that the servers support SSLTCP; the same approach was also seen in other apps in the past, like Jitsi or Houseparty.  The main implications could be the lack of support in other browsers like Firefox, and some very strict proxies/firewalls blocking the pseudo-SSL connection used by SSLTCP.

New RTP header extensions

Google Meet has always used many header extensions, and this time I saw two that I hadn't seen in the past.

One header extension is related to the new AV1 video codec and is used to provide scalability information that the server can use for selective filtering and forwarding of video layers.  It is interesting because AV1 is not used by Google Meet, so maybe this is an indication of work in progress or of support for the new codec in internal versions of the product.


(@murillo and @HCornflower mentioned on Twitter that even though this header extension was defined for AV1 it can also be used for VP9, and that is apparently the case based on a quick test with a sniffer.)

The other header extension provides information about the target bitrates, resolutions and framerates of the layers, so that the server can use that information to decide which layer to forward instead of making those decisions based on the received bitrate, like all SFUs do today.  The advantage is that this declared value is more stable and reliable than the calculated one, which can have temporal spikes depending on the network, the pacing implementation and even the content being sent.
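The selection logic this enables can be sketched in a few lines. The layer table below is hypothetical (the real extension carries equivalent per-layer targets); the idea is simply to forward the highest layer whose declared target fits the receiver's estimated bandwidth.

```javascript
// Hypothetical simulcast/SVC layers with sender-declared targets.
const declaredLayers = [
  { spatialId: 0, targetBitrateKbps: 150,  width: 320,  height: 180, fps: 15 },
  { spatialId: 1, targetBitrateKbps: 500,  width: 640,  height: 360, fps: 24 },
  { spatialId: 2, targetBitrateKbps: 1500, width: 1280, height: 720, fps: 24 },
];

// Pick the highest layer whose declared target fits the available
// bandwidth; fall back to the lowest layer if nothing fits.
function selectLayer(layers, availableKbps) {
  let best = layers[0];
  for (const layer of layers) {
    if (layer.targetBitrateKbps <= availableKbps) best = layer;
  }
  return best;
}

// selectLayer(declaredLayers, 800) -> the 640x360 layer
```

The advantage over bitrate-based selection is that these targets do not fluctuate with pacing or content complexity, so the server switches layers less erratically.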


Special RTP Control Messages (RTCP)

There were a couple of RTCP features in the SDP negotiation that got my attention.

The first one is a new RTCP message called RRTR.  Well, it is not really new, as it was standardized a long time ago (RFC 3611), but I hadn't seen it used until now.  The purpose of the RRTR message and the corresponding DLRR is to be able to calculate the round-trip time for receiving streams, in a similar way to how it is calculated for sending streams with SR and RR messages.  Basically, RRTR sends a timestamp and DLRR sends back the original timestamp plus the delay between the RRTR and DLRR messages.
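The arithmetic from RFC 3611 is simple; here it is sketched in seconds for clarity (the wire format actually uses compact 32-bit NTP timestamps):

```javascript
// RTT per RFC 3611: the receiver sends an RRTR with a timestamp; the
// sender echoes it back in a DLRR block together with the delay it held
// the report before replying.
// nowSeconds:        arrival time of the DLRR at our side
// lastRrTimestamp:   the timestamp we originally put in our RRTR
// delaySinceLastRr:  how long the peer waited before answering
function roundTripTime(nowSeconds, lastRrTimestamp, delaySinceLastRr) {
  return nowSeconds - lastRrTimestamp - delaySinceLastRr;
}

// We sent the RRTR at t=10.0s, the peer waited 0.5s, the DLRR arrived
// at t=10.7s, so the RTT is 10.7 - 10.0 - 0.5 = 0.2s.
```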

a=rtcp-fb:96 rrtr

The other interesting thing is that Google Meet enables transport-cc RTCP messages for audio, using audio packets for bandwidth estimation and not only video packets like most platforms do.  This was problematic in the past, but apparently it is safe enough now, and in that case it looks like a very good idea to have it enabled.

a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
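This is easy to check programmatically; a small helper like the one below (the two SDP lines are the ones quoted above) tells you whether a given payload type has transport-cc feedback negotiated:

```javascript
// SDP fragment from the negotiation quoted above.
const sdp = [
  "a=rtpmap:111 opus/48000/2",
  "a=rtcp-fb:111 transport-cc",
].join("\r\n");

// Returns true when the payload type has transport-cc feedback negotiated.
function hasTransportCc(sdpText, payloadType) {
  return sdpText
    .split(/\r?\n/)
    .includes(`a=rtcp-fb:${payloadType} transport-cc`);
}

// hasTransportCc(sdp, 111) -> true
```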

Video Encoding

Google Meet uses VP9 with profile 0, although for some reason it falls back to VP8 whenever a Firefox browser joins the room.  One interesting thing about the video encoding is the attribute "useadaptivelayering" included in the SDP negotiation.  The attribute has been there for a while and there are two versions of it, but there is nothing about it in the open-source version of Chromium/WebRTC, nor in the Chrome binary 🤔, so I would assume it is something only used in an old or custom version of libwebrtc (mobile?).

a=fmtp:96 useadaptivelayering_v2=true; useadaptivelayering=true
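For anyone who wants to inspect these flags themselves, a generic fmtp parser is enough to pull the key/value pairs out of the line above:

```javascript
// Splits an a=fmtp line into its key/value parameters.
function parseFmtp(line) {
  const params = line.replace(/^a=fmtp:\d+\s*/, "");
  const result = {};
  for (const pair of params.split(";")) {
    const [key, value] = pair.trim().split("=");
    result[key] = value;
  }
  return result;
}

const fmtp = parseFmtp(
  "a=fmtp:96 useadaptivelayering_v2=true; useadaptivelayering=true"
);
// fmtp.useadaptivelayering === "true"
```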

Audio Encoding

Google Meet is using Opus as expected (with dtx and inbandfec enabled), but the most interesting part is that redundancy encoding (red) is also negotiated for the audio channels.  This encoding allows sending redundant copies of audio frames when needed, to provide better robustness against packet loss (detailed analysis of RED in WebRTCHacks).

But even though it is negotiated, it is not really active for sending, because it is included in the codec list after Opus.  I verified this by introducing 15% packet loss and checking the packets sent and received with a sniffer.

a=rtpmap:111 opus/48000/2

a=fmtp:111 minptime=10; useinbandfec=1; stereo=0; sprop-stereo=0; usedtx=1
a=rtpmap:63 red/48000/2
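The codec order is what decides this: the first payload type in the m= line is the one used for sending. A sketch of the munging an endpoint could do to actually activate RED (the m-line below is hypothetical apart from the payload types quoted above):

```javascript
// Moves a payload type to the front of an m= line so it becomes the
// preferred (and therefore actually sent) codec.
function preferPayload(mLine, payloadType) {
  const parts = mLine.split(" ");
  const header = parts.slice(0, 3); // "m=audio 9 UDP/TLS/RTP/SAVPF"
  const payloads = parts.slice(3).filter((pt) => pt !== String(payloadType));
  return [...header, String(payloadType), ...payloads].join(" ");
}

const mLine = "m=audio 9 UDP/TLS/RTP/SAVPF 111 63";
// preferPayload(mLine, 63) -> "m=audio 9 UDP/TLS/RTP/SAVPF 63 111"
```

Note that the standard way to do this today is RTCRtpTransceiver.setCodecPreferences() rather than string munging, but the string version shows the mechanism more directly.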

Media Acquisition

The last thing I took a look at is the configuration requested for the microphone and camera.  The most interesting thing I found was that the framerate is explicitly limited to 24fps, when the default is usually 30fps.


Video constraints: {deviceId: {exact: ["xxxxxxxxx"]}, advanced: [{frameRate: {min: 24}}, {height: {min: 720}}, {width: {min: 1280}}, {frameRate: {max: 24}}, {width: {max: 1280}}, {height: {max: 720}}, {aspectRatio: {exact: 1.77778}}]} 
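These constraints are plain JavaScript objects; here is the same structure as a reusable builder (the deviceId is a placeholder, and the getUserMedia call is browser-only, so it is shown as a comment):

```javascript
// Builds the video constraints dumped above for a given camera deviceId.
function buildVideoConstraints(deviceId) {
  return {
    deviceId: { exact: [deviceId] },
    advanced: [
      { frameRate: { min: 24 } },
      { height: { min: 720 } },
      { width: { min: 1280 } },
      { frameRate: { max: 24 } }, // capped at 24fps instead of the usual 30
      { width: { max: 1280 } },
      { height: { max: 720 } },
      { aspectRatio: { exact: 1.77778 } }, // 16:9
    ],
  };
}

// Browser-only usage:
// const stream = await navigator.mediaDevices.getUserMedia({
//   video: buildVideoConstraints("xxxxxxxxx"),
// });
```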

There are other things that can be interesting about Google Meet, like the usage of a single peer connection (apart from the extra one for screen sharing), the creation of offers always from the browser side, the multiplexing of audio over 3 transceivers, or the codec selection (VP9), but those have been like that for a while and there are other places where they have been discussed.

What do you think? Any other interesting features that you found in Google Meet?  Feel free to share them here or on Twitter.

