Live Audio and Video over WebRTC’s datachannel

Over the summer, UNINETT IoU has developed a WebRTC demonstrator that attempts something “naughty”…

As part of our work on WebRTC and on low-latency collaboration tools, we decided to find an answer to the following research questions:

Is it possible to transfer live audio and video over the data-channel in WebRTC?
If yes, can we achieve lower latency with data-channels than with WebRTC media-channels?

Our demonstrator, titled WebRTC data-media, is now available (also on GitHub). In short, the demonstrator

  • consists of a node.js-based server and an HTML+CSS+JavaScript-based WebRTC client,
  • applies the socket.io framework to provide “rooms” in which peers exchange basic signaling,
  • sets up separate, independent data-channels for audio and video content (see the sketch below this list),
  • applies “getUserMedia” to grab live audio and video from the microphone and camera,
  • applies the “ScriptProcessorNode” class to grab, transfer, and play out raw audio samples,
  • applies canvas’s “drawImage” and “toDataURL” to grab, compress, and send video frames.
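For readers who want a feel for the setup, the following is a minimal sketch of how such a peer connection with two independent data-channels and a live capture could be wired together. The room name, channel labels, event names, and variable names are illustrative assumptions, not taken from the demonstrator’s actual source.

```javascript
// Illustrative sketch only -- room name, channel labels and handlers are assumptions.
const socket = io();                      // socket.io connection used for signaling
socket.emit('join', 'demo-room');         // join a signaling "room" (hypothetical event name)

const pc = new RTCPeerConnection();

// Two independent data-channels, one for audio and one for video.
// Unordered, no retransmissions: favour low latency over reliability.
const audioChannel = pc.createDataChannel('audio', { ordered: false, maxRetransmits: 0 });
const videoChannel = pc.createDataChannel('video', { ordered: false, maxRetransmits: 0 });
audioChannel.binaryType = 'arraybuffer';
videoChannel.binaryType = 'arraybuffer';

// Relay SDP and ICE candidates through the socket.io room.
pc.onicecandidate = (e) => { if (e.candidate) socket.emit('signal', { candidate: e.candidate }); };
socket.on('signal', async (msg) => {
  if (msg.sdp) {
    await pc.setRemoteDescription(msg.sdp);
    if (msg.sdp.type === 'offer') {              // the other peer creates the offer the same way
      await pc.setLocalDescription(await pc.createAnswer());
      socket.emit('signal', { sdp: pc.localDescription });
    }
  } else if (msg.candidate) {
    await pc.addIceCandidate(msg.candidate);
  }
});

// Grab live audio and video. Note that the tracks are *not* added to the peer
// connection; raw samples and frames are pushed over the data-channels instead.
navigator.mediaDevices.getUserMedia({ audio: true, video: true })
  .then((stream) => { /* hand the stream to the audio and video capture code */ });
```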

The implementation of the demonstrator is a success. Both live audio and video are transferable over WebRTC data-channels. Hence the answer to our first question is a definitive “yes”.

However, measurements (to be published in our Multimedia Delay Database) show no significant improvement in delay compared to what “vanilla” WebRTC media-channels can offer.

For audio, delay is at best similar, but raw data-channel audio degrades in quality when buffer lengths are reduced to the supported minimum for ScriptProcessorNode, i.e. 256 samples. Packet loss/jitter is probably caused by the fact that ScriptProcessorNode’s JavaScript code is executed in the web page’s main thread. Utilizing the upcoming AudioWorklet API will potentially improve upon this, since separate threads for audio processing will be available. However, AudioWorklets are (at the time of writing) not yet supported by any browser. (Only a developer’s patch seems to exist for Chromium.)
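As a rough illustration of the ScriptProcessorNode approach (the buffer size matches the 256-sample minimum mentioned above; variable names such as `audioChannel` are assumptions carried over from the earlier sketch, and both peers are assumed to run at the same sample rate):

```javascript
// Sketch of the ScriptProcessorNode-based audio path -- not the demonstrator's actual code.
const audioCtx = new AudioContext();
const source = audioCtx.createMediaStreamSource(stream);       // `stream` from getUserMedia
const processor = audioCtx.createScriptProcessor(256, 1, 1);   // 256 samples is the smallest supported buffer

// Sender side: onaudioprocess runs on the page's main thread, which is
// where the observed packet loss/jitter most likely originates.
processor.onaudioprocess = (event) => {
  const samples = event.inputBuffer.getChannelData(0);         // Float32Array of 256 raw samples
  if (audioChannel.readyState === 'open') {
    audioChannel.send(samples.buffer.slice(0));                // copy, since the buffer is reused
  }
};
source.connect(processor);
processor.connect(audioCtx.destination);

// Receiver side: schedule each incoming block back-to-back for playout.
let playTime = 0;
audioChannel.onmessage = (event) => {
  const samples = new Float32Array(event.data);
  const buffer = audioCtx.createBuffer(1, samples.length, audioCtx.sampleRate);
  buffer.copyToChannel(samples, 0);
  const node = audioCtx.createBufferSource();
  node.buffer = buffer;
  node.connect(audioCtx.destination);
  playTime = Math.max(playTime, audioCtx.currentTime);
  node.start(playTime);
  playTime += buffer.duration;
};
```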

For video, delay is also very similar, at best slightly better with data-channel transfer. The most significant limiting factor in this case seems to be a combination of the maximum frame rate provided by the available cameras and the necessary buffering (and buffer copying) of video frames in the code. A maximum of 30 frames per second implies an added 33 ms of delay for each frame buffered.
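For reference, the canvas-based capture loop looks roughly like the sketch below (frame rate, resolution, JPEG quality, and element ids are assumptions for illustration):

```javascript
// Sketch of the canvas-based video path -- parameters are assumptions, not measured settings.
const localVideo = document.querySelector('video#local');   // plays the getUserMedia stream
const canvas = document.createElement('canvas');
canvas.width = 640;
canvas.height = 480;
const ctx2d = canvas.getContext('2d');

// At 30 fps, every frame sitting in a buffer adds roughly 1000 ms / 30 ≈ 33 ms of delay.
setInterval(() => {
  ctx2d.drawImage(localVideo, 0, 0, canvas.width, canvas.height);  // grab the current frame
  const frame = canvas.toDataURL('image/jpeg', 0.5);               // JPEG-compress into a data URL string
  if (videoChannel.readyState === 'open' && videoChannel.bufferedAmount < (1 << 20)) {
    videoChannel.send(frame);                                      // skip the frame if the channel is backed up
  }
}, 1000 / 30);

// Receiver side: decode the data URL straight into an <img> element.
videoChannel.onmessage = (event) => {
  document.querySelector('img#remote').src = event.data;
};
```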

Attempts were made (in an early version of the demonstrator) to minimize buffering by pushing raw, uncompressed video frames across the data-channel. But as the data-channel capacity was limited to ~150 Mbps, only very low resolution video (less than VGA) could be transmitted, and no measurements were performed for this version. Whether data-channel capacity can be increased and/or buffer handling made more efficient by applying multi-threading via Worklets is currently an open question.
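A quick back-of-the-envelope calculation shows why: assuming raw frames of 4 bytes per pixel (RGBA), even VGA at 30 fps needs about twice the measured channel capacity.

```javascript
// Back-of-the-envelope bitrate for raw RGBA frames (pixel format is an assumption).
const width = 640, height = 480, bytesPerPixel = 4, fps = 30;
const bitsPerSecond = width * height * bytesPerPixel * fps * 8;
console.log((bitsPerSecond / 1e6).toFixed(0) + ' Mbps');  // ≈ 295 Mbps, about twice the ~150 Mbps measured capacity
```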

A future version of the demonstrator will aim to implement and utilize Worklets for both audio and video processing.
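Based on the current draft of the AudioWorklet specification, the audio side of that might look roughly like the sketch below; this has not been tested against any browser, and the names are ours.

```javascript
// --- capture-processor.js, loaded into the AudioWorklet scope (sketch of the draft API) ---
class CaptureProcessor extends AudioWorkletProcessor {
  process(inputs, outputs, parameters) {
    // Runs on the dedicated audio rendering thread, off the page's main thread.
    const channel = inputs[0][0];                           // Float32Array, 128 samples per render quantum
    if (channel) this.port.postMessage(channel.slice(0));   // hand a copy to the main thread for sending
    return true;                                            // keep the processor alive
  }
}
registerProcessor('capture-processor', CaptureProcessor);

// --- main page ---
audioCtx.audioWorklet.addModule('capture-processor.js').then(() => {
  const worklet = new AudioWorkletNode(audioCtx, 'capture-processor');
  worklet.port.onmessage = (event) => {
    if (audioChannel.readyState === 'open') audioChannel.send(event.data.buffer);
  };
  source.connect(worklet);
});
```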

(Note: This blog will be updated with diagrams and explicit results soon…)
