WebRTC: Video telephony without a browser plugin

Direct Line

© Lead Image © Yanik Chauvin, Fotolia.com

© Lead Image © Yanik Chauvin, Fotolia.com

Article from Issue 154/2013

The WebRTC protocol converts your web browser into a communications center, supporting video chat over a peer-to-peer connection without the need for helper apps or browser plugins.

People who use video chat and other forms of real-time Internet communication often rely on Skype or similar tools. Web browsers too often depend on Flash or Java plugins for real-time communication. The latest generation of browsers, however, offer a powerful new tool for building real-time communication into scripts and homegrown web applications. WebRTC (Real-Time Web Communication) [1] supplements the new HTML5 standard by bringing native real-time communication to the browser.

WebRTC can handle video chat and similar formats. Communication occurs directly from browser to browser, without the need for an intervening web application. In this article, I show how easy it is to build a homegrown Internet video chat application by integrating WebRTC with the usual collection of web developer tools: HTML, JavaScript, CSS, and Node.js.

WebRTC is jointly promoted by browser vendors such as Google, Mozilla, and Opera. (Microsoft considers WebRTC to be too complicated and has presented UC-RTC [2] as its own design for real-time communication in browsers.) Although the WebRTC specification is not yet complete, the Google Chrome and Mozilla Firefox 22 web browsers already largely support it. WebRTC is a free standard described in a set of IETF documents [3], and W3C has already accepted a draft for a programming interface [4] for WebRTC in the browser.

You'll find a demo video that describes some of WebRTC's capabilities on YouTube [5]. The demo, which comes from the Mozilla project, shows a video call from a Firefox browser on a cellphone via the public telephone network.

How It Works

Figure 1 shows the data flow for a WebRTC session. Application and configuration data go their separate ways: The server acts as a router that accepts configuration information (i.e., IP addresses or information about video and audio formats) from one browser and forwards it to the other. WebRTC calls this the signaling process. The HTTP or WebSocket protocol is a useful choice for transferring the configuration data, with JavaScript Object Notation (JSON) providing the structure.

Figure 1: Configuration and application data go separate ways in WebRTC: The configuration data are routed by an intermediate server, whereas the protocol transmits the application data from peer to peer.

After exchanging the IP addresses on the signaling server, the browsers establish a peer-to-peer connection, as shown in Figure 1. The connection is used to transmit the application data directly between browsers via TCP or UDP, saving time and data traffic.

To open the connection, WebRTC works around Network Address Translation (NAT) routers or firewalls through the use of Interactive Connectivity Establishment (ICE) [6]. ICE retrieves usable IP addresses and ports for sending and receiving data over peer-to-peer connections.

ICE first tries to determine the IP address and port using Session Traversal Utilities for NAT (STUN) [7]. To do so, it sends a message to a public STUN server and receives the sender's address in return. However, if the NAT router blocks STUN, the IP address and port are provided by a relay server on the web using Traversal Using Relays around NAT (TURN) [8]. ICE reports the appropriate IP addresses and ports to WebRTC by triggering a onicecandidate event. The data is wrapped, along with the protocol to be used – TCP or UDP – in an ICE candidate-type object.

Sample Application

The following sample application uses WebRTC in the browser to implement a video chat. The browser takes the image from the webcam and the sound from the microphone and passes it to a second browser via WebRTC. The client application in this article uses HTML, CSS, and JavaScript for the implementation. Node.js is used for the signaling server.

The example is initially restricted to the Firefox browser, but you can port it to Chrome. Figure 2 shows the sample application running in Firefox on Ubuntu Linux. The other end is using Firefox on Windows 7.

Figure 2: At a glance: On the left is the image from the local webcam; on the right, the image from the remote webcam.

The getUserMedia() programming interface [9] in the latest versions of Firefox, Chrome, and Opera provides access to the webcam and microphone on the local machine, but it first asks the user's permission in a pop-up. WebRTC combines the video and audio streams to a media stream, as shown in Figure 3, which it can then process. A media stream can contain any number of video and audio tracks; one audio track includes two stereo channels.

Figure 3: The media stream combines video and audio streams for use in HTML video and audio elements, as well as in peer-to-peer connections.

Video and Audio Signals

Listings 1-3 [10] demonstrate how to integrate the local webcam image into an HTML document (Figure 4). The HTML document in Listing 1 references two JavaScript files in the header area (lines 3-4). The core.js file contains functions for several examples in this article, such as mediastream.js. In the body of the HTML document in Listing 1, the video element with the ID local is waiting in line 8 for a connection to a media stream.

Listing 1

HTML with a Video Element


Figure 4: Initial success: Firefox playing the image from the local webcam in the browser.

The call to the mozGetUserMedia() JavaScript method in Listing 2 (lines 2-8) grabs the images from the webcam and the sound from the microphone and bundles it into a media stream. The code in lines 4-6 passes in the stream as a parameter to the callback function. The connectStream() function in line 5 (and in Listing 3) plays the media stream in the video element. If a problem occurs, a callback function is called in line 7 to handle errors. Firefox makes the specification of a callback function for this case mandatory.

Listing 2



Listing 3

core.js (Excerpt)


Listing 3 shows the JavaScript connectStream() function for the Firefox browser. Line 2 uses the querySelector() method with a CSS selector to select an element from the HTML document – preferably a video element. Line 4 binds the media stream to this element using the Firefox-specific mozSrcObject attribute.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • An open source, federated video sharing platform

    With PeerTube you can self-host your videos without the limitations embedded in YouTube and similar platforms.

  • Easy Video Telephony

    A video telephony system with huge user benefits does not have to be complex. This project starts a phone call with just a single button press and switches channels automatically on a TV.

  • PeerTube

    PeerTube, the Fediverse's video platform, offers a decentralized, open source way to watch videos and live stream your own content. We'll show you how to get started and even set up your own instance.

  • Tube Archivist

    Tube Archivist indexes videos or entire channels from YouTube in order to download them with the help of the yt-dlp tool.

  • Post-PRISM Privacy

    Linux users didn't need the recent NSA eavesdropping scandal to convince them that securing communication was a good idea. Free software developers have been creating secure tools for years that offer similar functionalities to all of those popular but very leaky services with ridiculous names.

comments powered by Disqus
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters

Support Our Work

Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.

Learn More