How are Images/Videos sent in WhatsApp?

I've been involved in the development of several IM&P services in the past, and one of the core features was always the ability to share media files (audio, video, PDF...) with other users or groups.


Probably the most popular of these services was TU Me, which was based on SIP but used HTTP to upload and download files from central storage. SIP was used to send a message with the URL to the other end so that the recipient could download the media file.

In the cases where I had to use XMPP there were typically two approaches, at least if you needed to support offline messaging:
  • Inefficient in-band alternatives where the media files are sent over the XMPP connection itself.
  • Custom extensions (or ones that are not widely supported, like XEP-0363) based on HTTP uploads.
But today I was curious about how this is done in WhatsApp.  I found an open source Python implementation (yowsup) that looks great and gave me an idea of how it could be done, but I wanted to confirm which protocol the most recent WA clients use and also get some captures of the messages to see what everything looks like.

Obviously the easiest way was to use the WA Web Client, because the Chrome developer tools let you inspect its traffic, so I tried sending and receiving some files from the browser.

Sending a file

1/ Request a url to upload the media file

The first thing the client does when it needs to send a media file is to request an upload URL from the server.  This is done over the WebSocket signaling channel with a message that carries the action "encr_upload", the type of the content ("image" or "video") and a long id that looks like a hash of the file to be sent.


The server replies with a "status=200" and a dedicated URL where the file has to be uploaded.  In my case the URL was: https://mmg.whatsapp.net/u/f/CIX33...._kOA/Ajlc....78Xf
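
As a rough sketch of this step, the snippet below hashes a file and wraps the result in an upload request. The capture only shows the "encr_upload" action, the content type and a long hash-like id, so the JSON layout, the field names (`action`, `type`, `hash`) and the choice of base64-encoded SHA-256 are all assumptions made for illustration, not the real wire format.

```python
import base64
import hashlib
import json


def build_upload_request(file_bytes: bytes, media_type: str) -> str:
    """Build a hypothetical "encr_upload" request for the signaling channel."""
    if media_type not in ("image", "video"):
        raise ValueError("unsupported media type: " + media_type)
    # Assumption: the long id is a base64-encoded SHA-256 digest of the file.
    digest = hashlib.sha256(file_bytes).digest()
    file_id = base64.b64encode(digest).decode("ascii")
    # Assumption: a JSON layout; the real message format is not shown here.
    return json.dumps({"action": "encr_upload", "type": media_type, "hash": file_id})


request = build_upload_request(b"fake image bytes", "image")
print(request)
```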

2/ Upload the media file

The file is uploaded to the URL received in the previous message using an HTTP POST, as you can see in the next capture:


It looks as if the file is uploaded already encrypted (in addition to the HTTPS transport), probably as part of the E2E encryption feature in WhatsApp.
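
The upload itself can be sketched with the Python standard library. This only prepares the POST request without sending it; the placeholder URL, the `application/octet-stream` content type and the helper name `build_upload_post` are assumptions, and the body is assumed to have been encrypted by the client beforehand.

```python
import urllib.request


def build_upload_post(upload_url: str, encrypted_bytes: bytes) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP POST carrying the encrypted media."""
    return urllib.request.Request(
        upload_url,
        data=encrypted_bytes,
        method="POST",
        # Assumption: opaque binary body; the real headers are not shown here.
        headers={"Content-Type": "application/octet-stream"},
    )


# Placeholder URL; the real one comes from the server reply in step 1.
req = build_upload_post("https://upload.example.invalid/u/f/PLACEHOLDER", b"\x00" * 16)
print(req.get_method(), req.full_url, len(req.data))
# Sending it would be: urllib.request.urlopen(req)
```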

3/ Send a message to the recipient of the media file

After the file is uploaded to the mmg.whatsapp.net server, the client uses the WebSocket connection to send the actual message to the other participant (yellow line in the first capture).  This message was bigger than I expected (2-10 KB), so I thought it very likely included a thumbnail/preview.

I confirmed that on the receiver side.  The size of the base64 preview image shown in the UI was more or less the size of that signaling message.
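
To relate the two sizes, it helps to remember that base64 turns every 3 raw bytes into 4 characters. The sketch below checks that relation against Python's own encoder; the 3000-byte buffer is just a stand-in for a small preview image, not data taken from a real message.

```python
import base64
import math


def base64_size(raw_size: int) -> int:
    """Length in characters of the padded base64 encoding of raw_size bytes."""
    return 4 * math.ceil(raw_size / 3)


thumbnail = bytes(3000)  # stand-in for a small preview image
encoded = base64.b64encode(thumbnail)
assert len(encoded) == base64_size(len(thumbnail))
print(len(thumbnail), "raw bytes ->", len(encoded), "base64 characters")
```

Under that assumption, a 2-10 KB base64 payload would correspond to a preview of roughly 1.5-7.5 KB of raw image data.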
