Multimedia Infrastructure

The “Multimedia Infrastructure” project surveys and investigates tools and technologies within the multimedia domain. A trend towards more openness in software solutions for multimedia applications, together with decreasing costs for high-quality multimedia hardware, enables new uses of multimedia in education and at new scales.

Among the topics addressed by this project are
– Streaming
– Multimedia collaboration (high quality videoconferencing, web-meetings)
– Audio challenges
– Search and management of multimedia content

The “Multimedia Infrastructure” project follows up on work performed in the completed “Open Multimedia Tools” project.

Live lecture streaming and capture by smartphones

Smartphones equipped only with a standard wired hands-free set have recently been applied as capture devices for lectures, using UNINETT IoU’s experimental Live Streaming service, with reasonable success.

Many cloud services support live streaming and recording from smartphones (e.g. Justin.tv, YouTube, UStream, Facebook, …); however, they typically handle only a single stream per channel.

Live streaming a lecture with a single mobile phone as the capture device has its challenges with respect to placement:

  1. A medium distance from the lecturing scene gives the best video quality. Less zooming means better quality, as zoom on smartphones is digital, i.e. the image is cropped and the resolution lowered.
  2. A long distance is preferable to be non-invasive. Placing a camera “up in the face” of a lecturer will often make him or her feel uncomfortable. Capturing both the blackboard and a projection screen is usually also easier from a longer distance.
  3. A very short distance is required if the built-in microphone or a standard wired hands-free set is to be used for audio capture.

Point 3 could potentially be remedied by applying a Bluetooth hands-free device. We did not have a suitable Bluetooth device, but we did have multiple phones (with cracked displays) lying around, so investing in a new wireless device seemed unnecessary.

Our solution was to install the MIV Dev team’s “RTP Camera” app on two Android phones and configure both to push their streams to the same server (see the images below). We then manually “merged” the SDPs output by the phones such that audio was received from the phone with the wired hands-free set connected to it, and video from the phone mounted on our camera stand.
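For illustration, the merged session description might look roughly like the data sketch below. The addresses, ports and codecs are hypothetical and will differ from what the RTP Camera app actually produces; the point is simply that the audio m-line (with its attributes) is copied from the SDP of one phone and the video m-line from the SDP of the other.

    v=0
    o=- 0 0 IN IP4 192.0.2.10
    s=Lecture capture (merged)
    c=IN IP4 192.0.2.10
    t=0 0
    m=audio 5004 RTP/AVP 97
    a=rtpmap:97 MPEG4-GENERIC/48000/2
    m=video 5006 RTP/AVP 96
    a=rtpmap:96 H264/90000

Here the m=audio section originates from the phone in the speaker’s pocket and the m=video section from the phone on the camera stand, so the receiving side picks up audio from one stream and video from the other.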

The phone capturing and streaming audio was placed in the speaker’s pocket. The earplugs of the wired hands-free set were tucked inside the speaker’s collar, leaving the hands-free microphone outside on the chest.

To be as non-invasive as possible, we placed the video capture rig at the upper row of the lecture theatre, and hence chose to trade off some video quality.

The resulting live stream and recording had a quality that was “good enough” according to the students following the course where capturing was tested. Audio quality was very good, but the video quality has room for improvement. Wireless streaming of high-quality video using RTP has its challenges with respect to packet loss.

A commercial alternative to our setup seems to be getting close to production: higgs.live is about to offer an app for multi-phone live streaming to Facebook and YouTube.

Live Audio and Video over WebRTC’s datachannel

UNINETT IoU has over the summer developed a WebRTC demonstrator which attempts something “naughty”…

As part of our work on WebRTC as well as our work within low latency collaboration tools, we decided to find an answer to the following research questions:

Is it possible to transfer live audio and video over the data-channel in WebRTC?
If yes, can we achieve lower latency with data-channels than with WebRTC media-channels?

Our demonstrator, titled WebRTC data-media, is now available (also on GitHub). In short, the demonstrator

  • consists of a node.js based server and an HTML+CSS+JavaScript based WebRTC client,
  • applies the socket.io framework to provide “rooms” in which peers exchange basic signaling,
  • sets up separate, independent data-channels for audio and video content,
  • applies “getUserMedia” to grab live audio and video from the microphone and camera,
  • applies the “ScriptProcessorNode” class to grab, transfer, and play out raw audio samples,
  • applies canvas’s “drawImage” and “toDataURL” to grab, compress, and send video frames (see the sketch below the list).
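As an illustration of the approach (a minimal sketch, not the demonstrator’s actual source, which is available on GitHub), the sender side might look roughly as follows. The channel names, canvas resolution, JPEG quality and frame interval are assumptions made for this example, and the RTCPeerConnection “pc” is assumed to be already set up and signaled via socket.io.

    // Sketch of the sender side: raw audio via ScriptProcessorNode, JPEG frames via canvas.
    // Assumes "pc" is an established RTCPeerConnection.
    const audioChannel = pc.createDataChannel('audio');    // separate channel for raw samples
    const videoChannel = pc.createDataChannel('video');    // separate channel for JPEG frames

    navigator.mediaDevices.getUserMedia({ audio: true, video: true }).then(stream => {
      // Audio: grab raw samples on the main thread and ship them as Float32 buffers.
      const ctx = new AudioContext();
      const source = ctx.createMediaStreamSource(stream);
      const grabber = ctx.createScriptProcessor(256, 1, 1); // smallest supported buffer size
      grabber.onaudioprocess = e => {
        const samples = e.inputBuffer.getChannelData(0);    // Float32Array of 256 samples
        if (audioChannel.readyState === 'open') audioChannel.send(samples.buffer.slice(0));
      };
      source.connect(grabber);
      grabber.connect(ctx.destination);                     // output is silence, but keeps the node processing

      // Video: draw each frame to a canvas, JPEG-compress it, and send the resulting data URL.
      const video = document.createElement('video');
      video.srcObject = stream;
      video.play();
      const canvas = document.createElement('canvas');
      canvas.width = 640;                                   // resolution chosen for the example
      canvas.height = 480;
      const g = canvas.getContext('2d');
      setInterval(() => {
        g.drawImage(video, 0, 0, canvas.width, canvas.height);
        const frame = canvas.toDataURL('image/jpeg', 0.7);  // compressed frame as a string
        if (videoChannel.readyState === 'open') videoChannel.send(frame);
      }, 33);                                               // roughly 30 frames per second
    });

On the receiving side, the incoming Float32 buffers are queued for playback through the Web Audio API, and each received data URL can simply be assigned to an image element’s src for display.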

The implementation of the demonstrator is a success: both live audio and video are transferable over WebRTC data-channels. Hence the answer to our first question is a definite “yes”.

However, measurements (to be published in our Multimedia Delay Database) show no significant improvement in delay compared to what “vanilla” WebRTC media channels can offer.

For audio, delay is at best similar, but raw data-channel audio degrades in quality when buffer lengths are reduced to the supported minimum for ScriptProcessorNode, i.e. 256 samples. The packet loss/jitter is probably caused by the fact that ScriptProcessorNode’s JavaScript code is executed in the web page’s main thread. Utilizing the upcoming AudioWorklet API could potentially improve upon this, since separate threads for audio processing will be available. However, AudioWorklets are (at the time of writing) not yet supported by any browser; only a developer’s patch seems to exist for Chromium.
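For illustration, a minimal sketch of how the audio grabbing could be moved off the main thread with the AudioWorklet API, based on the draft specification and therefore untested here; the file name grabber-worklet.js, the processor name “grabber”, and the variables stream and audioChannel are assumptions carried over from the sketch above.

    // grabber-worklet.js - runs on the dedicated audio rendering thread, not the main thread.
    class GrabberProcessor extends AudioWorkletProcessor {
      process(inputs, outputs, parameters) {
        const channel = inputs[0][0];                          // Float32Array, 128 samples per block
        if (channel) this.port.postMessage(channel.slice(0));  // copy samples to the main thread
        return true;                                           // keep the processor alive
      }
    }
    registerProcessor('grabber', GrabberProcessor);

    // Main thread: route the microphone through the worklet and forward samples on the data-channel.
    const ctx = new AudioContext();
    ctx.audioWorklet.addModule('grabber-worklet.js').then(() => {
      const node = new AudioWorkletNode(ctx, 'grabber');
      node.port.onmessage = e => audioChannel.send(e.data);    // e.data is a Float32Array
      ctx.createMediaStreamSource(stream).connect(node);
      node.connect(ctx.destination);                           // silent output, keeps the graph running
    });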

For video, delay is also very similar, and at best slightly better with data-channel transfer. The most significant limiting factor in this case seems to be a combination of the maximum frame rate provided by the available cameras and the necessary buffering (and buffer copying) of video frames in the code. A maximum of 30 frames per second implies an added 33 ms of delay for each frame buffered.

Attempts were made (in early versions of the demonstrator) to minimize buffering by pushing raw uncompressed video frames across the data channel. But as the data-channel capacity was limited to ~150 Mbps, only very low-resolution video (less than VGA) could be transmitted; uncompressed VGA (640×480) at 30 frames per second already requires roughly 220 Mbit/s with 24-bit colour. Hence no measurements were performed for this version. Whether data-channel capacity can be increased and/or buffer handling made more efficient by applying multi-threading via Worklets is currently an open question.

A future version of the demonstrator will aim to implement and utilize Worklets for both audio and video processing.

(Note: This blog post will be updated with diagrams and explicit results soon…)

High Quality Multimedia Tools for Sophisticated Users

During 2015 the Multimedia Infrastructure innovation project (MMI2015) has continued UNINETT’s work towards a future infrastructure capable of satisfying strict requirements for multimedia content and transport quality.

In MMI2015 a fair amount of time has been spent collaborating with test users from the domain of music and musical theatre. The Department of Music at NTNU and Dokkhuset scene have been the most active user group, but other Norwegian institutions, including the University of Tromsø and the University of Oslo, have joined in on experiments.

This user group, musicians and music producers, has a very important and valuable quality: they have very strict requirements for sound quality as well as for latency in both sound and video. For UNINETT, collaborating with such users is valuable, as they very likely represent the more general users of the future, i.e. the users UNINETT will be providing resources and services to.

During 2015 the MMI2015 project has contributed to several achievements:

  • Cribrum – a multi-display manager: MMI2015 has supervised a team of students (the “screen team”) from the Department of Computer and Information Science at NTNU in developing a user-friendly open source software package for management of multimedia content in a multi-display infrastructure. Dokkhuset scene’s 8-screen infrastructure has been the target test bed. The resulting tool is operational and available online at https://sourceforge.net/projects/cribrum/, and it has already been applied in educational settings.
  • streamer.uninett.no: A foundation for a new version of this test service for multimedia streaming has been developed and tested. The new service depends only on open source tools, with ffmpeg as the core transcoder and segmenter (see the sketch after this list). An attempt was made to apply MPEG-DASH as the streaming transport technology; however, it turns out that DASH is not yet straightforwardly supported in browsers on desktop, Android and iOS devices. The current version of the service therefore applies HTTP Live Streaming (HLS) as its transport format. By applying Momovi.com’s open source HLS player for the browser, the streaming service makes content available on “all” devices, with adaptive bit-rate streaming as well as time-shift features. The final “polish” to the service remains before it becomes available online.
  • High capacity music collaboration infrastructure: MMI2015 has been supporting NTNU’s work on establishing high-capacity 10 Gbps links to relevant rooms and halls for musical collaboration in the city of Trondheim. Three larger venues (Dokkhuset scene, Olavshallen and Rockheim) as well as NTNU’s orgelsal are now connected. Two studios at the Department of Music are also connected, enabling multiple students in parallel to test their skills in live production of concerts and events without needing to be on-site. Work on getting NRK, the Norwegian television broadcaster, directly connected too is ongoing.
  • Business models for streaming at Dokkhuset scene: MMI2015 has supervised two MSc students in their work on a business model analysis for future multimedia streaming services at Dokkhuset scene.
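As a rough illustration of the ffmpeg-based transcoding and segmenting the new streamer.uninett.no foundation builds on (the service’s actual ffmpeg options and paths are not reproduced here; the input URL, segment length and file names below are made-up examples), a live HLS segmenter can be driven from node.js along these lines:

    // Sketch: spawn ffmpeg to transcode an incoming stream and segment it for HTTP Live Streaming.
    const { spawn } = require('child_process');

    const ffmpeg = spawn('ffmpeg', [
      '-i', 'rtmp://localhost/live/lecture',       // hypothetical input stream
      '-c:v', 'libx264', '-preset', 'veryfast',    // H.264 video for broad device support
      '-c:a', 'aac',                               // AAC audio
      '-f', 'hls',
      '-hls_time', '6',                            // roughly 6 second segments
      '-hls_list_size', '10',                      // keep a rolling window in the playlist
      '-hls_flags', 'delete_segments',             // drop old segments during live streaming
      '-hls_segment_filename', '/var/www/hls/lecture_%05d.ts',
      '/var/www/hls/lecture.m3u8',                 // playlist served over plain HTTP
    ]);

    ffmpeg.stderr.on('data', d => process.stderr.write(d));             // ffmpeg logs to stderr
    ffmpeg.on('exit', code => console.log('ffmpeg exited with', code));

Adaptive bit-rate streaming additionally requires several such renditions at different bit-rates plus a master playlist referencing them; that part is omitted from the sketch.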

Most of the activities pursued in MMI2015 have been suggested for continuation in a Multimedia Infrastructure 2016 project.