Making WebRTC Simple with conversat.io

WebRTC is awesome, but it’s a bit unapproachable. Last week, my colleagues and I at &yet released a couple of tools we hope will help make it more tinkerable and pose a real risk of actually being useful.

As a demo of these tools, we very quickly built a simple product called conversat.io that lets you create free, multi-user video calls with no account and no plugins, just by going to a url in a modern browser. Anyone who visits that same URL joins the call.

conversat.io

The purpose of conversat.io is twofold. First, it’s a useful communication tool. Our team uses And Bang for tasks and group chat, so being able to drop a link to a video conversation “room” into our team chat that people can join is super useful. Second, it’s a demo of the SimpleWebRTC.js library and the little signaling server that runs it, signalmaster.

(Both SimpleWebRTC and signalmaster are open sourced on Github and MIT licensed. Help us make them better!)

Quick note on browser support

WebRTC currently only works in Chrome stable and Firefox Nightlies (with the media.peerconnection.enabled preference enabled in about:config).

Hopefully we’ll see much broader browser support soon. I’m particularly excited about having WebRTC available on smartphones and tablets.

Approachability and adoption

I firmly believe that widespread adoption of new web technologies is directly correlated to how easy they are to play with. When I was a new JS developer, it was jQuery’s approachability that made me feel empowered to build cool stuff.

My falling in love with JavaScript all started with doing this with jQuery:

$('#demo').slideDown();

And then seeing the element move on my screen. I knew nothing. But as cheesy as it sounds, this simple thing left me feeling empowered to build more interesting things.

Socket.io did the same thing for people wanting to build apps that pushed data from the server to the client:

// server:
client.emit("something", {
    some: "data"
});
// client:
socket = io.connect();
socket.on("something", function (data) {
    // here's my data!
    console.log(data);
});

Rather than having to figure out how to set up long-polling, BOSH, and XMPP in order to get data pushed out to the browser, I could now just send messages to the browser. In fact, if I didn’t want to, I didn’t even have to think about serializing and de-serializing. I could now just pass simple JavaScript objects seamlessly back and forth between the client and server.

I’ve heard some “hardcore” devs complain that tools like this lead to too many poorly made tools and too many “wannabe” developers who don’t know what they’re doing. That’s garbage.

Approachable tools that make developers feel empowered to build cool stuff are the reason the web is as successful and vibrant as it is.

Tools like this are the gateway drug for getting us hooked on building things on these types of technologies. They introduce the concept and help us think about what could be built. Whether or not we ultimately end up building the final app with the tool whose simplicity introduced it to us is irrelevant.

The potential of WebRTC

I’m convinced WebRTC has the potential to have a huge impact on how we communicate. It already has for our team at &yet. Sure, we already used stuff like Skype, Facetime, and Google Hangouts. But the simplicity and convenience of just opening a URL in a browser and instantly being in a conversation is powerful.

Once this technology is broadly available and on mobile devices, it’s nothing short of a game changer for communications.

Challenges

There are definitely quite a few hurdles that get in the way of just playing with WebRTC: complexity and browser differences in instantiating peer connections, generating and processing signaling messages, and attaching media streams to video elements.

Even once you have those things, you still need a way to let two users find each other, and a mechanism for each user to send the proper signaling messages directly to the other user or users they want to connect to.

SimpleWebRTC.js is our answer to the clientside complexities. It abstracts away API differences between Firefox and Chrome.
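
To give a feel for the kind of difference being smoothed over, here is a minimal sketch of picking whichever getUserMedia variant a browser exposes. The helper name pickGetUserMedia is made up for this example and is not SimpleWebRTC’s actual code; webkitGetUserMedia and mozGetUserMedia are the prefixed names Chrome and Firefox have used.

```javascript
// Hypothetical helper, not part of SimpleWebRTC.
// Returns the first getUserMedia variant found on the given
// navigator-like object, bound so it can be called directly,
// or null if none is available.
function pickGetUserMedia(nav) {
    var names = ['getUserMedia', 'webkitGetUserMedia', 'mozGetUserMedia'];
    for (var i = 0; i < names.length; i++) {
        if (typeof nav[names[i]] === 'function') {
            return nav[names[i]].bind(nav);
        }
    }
    return null;
}

// In a browser you would use it roughly like this
// (onStream and onError are your own callbacks):
// var getUserMedia = pickGetUserMedia(navigator);
// if (getUserMedia) {
//     getUserMedia({video: true, audio: true}, onStream, onError);
// }
```

Passing the navigator object in, rather than reading the global, keeps the helper easy to try outside a browser.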

Using SimpleWebRTC

At its simplest, you just need to include the SimpleWebRTC.js script and provide a container for your local video and a container for the remote video(s), like this:

<!DOCTYPE html>
<html>
    <head>
        <script src="http://simplewebrtc.com/latest.js"></script>
    </head>
    <body>
        <div id="localVideo"></div>
        <div id="remoteVideos"></div>
    </body>
</html>

Then you just init a webrtc object and tell it which containers to use:

var webrtc = new WebRTC({
    // the id of (or actual element) to hold "our" video
    localVideoEl: 'localVideo',

    // the id of or actual element that will hold remote videos
    remoteVideosEl: 'remoteVideos',

    // immediately ask for camera access
    autoRequestMedia: true
});

At this point, if you run the code above, you’ll see your video turn on and render in the container you gave it.

The next step is to actually specify who you want to connect to.

For simplicity and maximum “tinkerability” we do this by asking that both users who want to connect to each other join the same “room”, which basically means: call “join” with the same string.

So, for demonstration purposes we’ll just tell our webrtc to join a certain room once it’s ready (meaning it’s connected to the signaling server). We do this like so:

// we have to wait until it's ready
webrtc.on('readyToCall', function () {
    // you can name it anything
    webrtc.joinRoom('your awesome room name');
});

Once a user has done this, he/she is ready and waiting for someone to join.

If you want to test this locally, you can either open it in Firefox and Chrome or in two tabs within Chrome. (Firefox doesn’t yet let two tabs both access local media).

At this point, you should automatically be connected and be having a lively (probably very echo-y!) conversation with yourself.

If you happen to be me, it’d look like this:

henrik in conversat.io

The signaling server

The example above will connect to a sandbox signaling server we keep running to make it easy to mess around with this stuff.

We aim to keep it available for people to use to play with SimpleWebRTC, but it’s definitely not meant for production use and we may kill it or restart it at any time.

If you want to actually build an app that depends on it, you can either run one yourself or, if you’d rather not mess with it, we can host one, keep it up to date, and help scale it for you. The code for that server is on GitHub.

You can just pass a URL to a different signaling server as part of your config by passing a “url” option when initiating your webrtc object.
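
As a small sketch of what that might look like, the helper below builds the same options object as the earlier example and adds the “url” option. The function name buildConfig and the address http://localhost:8888 are placeholders for illustration only; use whatever address your own signaling server listens on.

```javascript
// Hypothetical helper: the same options as the earlier example,
// plus the "url" option pointing at your own signaling server.
function buildConfig(signalingUrl) {
    return {
        localVideoEl: 'localVideo',
        remoteVideosEl: 'remoteVideos',
        autoRequestMedia: true,
        // address of the signaling server to use instead of the sandbox
        url: signalingUrl
    };
}

// In the browser, with the SimpleWebRTC script loaded
// (the address here is a placeholder):
// var webrtc = new WebRTC(buildConfig('http://localhost:8888'));
```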

So, what’s it actually doing under the hood?

It’s not too bad, really. You can read the full source of the client library here: https://github.com/HenrikJoreteg/SimpleWebRTC/blob/master/simplewebrtc.js and the signaling server here: https://github.com/andyet/signalmaster/blob/master/server.js

The process of starting a video call in conversat.io looks something like this:

  1. Establish connection to the signaling server. It does this with socket.io and connects to our sandbox signaling server at: http://signaling.simplewebrtc.com:8888

  2. Request access to local video camera by calling browser prefixed getUserMedia.

  3. Create or get local video element and attach the stream that we get from getUserMedia to the video element.

    firefox:

    element.mozSrcObject = stream; element.play();

    webkit:

    element.autoplay = true;
    element.src = webkitURL.createObjectURL(stream);

  4. Call joinRoom which sends a socket.io message to the signaling server telling it the name of the room it wants to connect to. The signaling server will either create the room if it doesn’t exist or join it if it does. All I mean by “room” is that the particular socket.io session ID is grouped by that room name so we can broadcast messages about people joining/leaving that room to only the clients connected to that room.

  5. Now we play an awesome rocket lander game that @fritzy wrote while we wait for someone to join us.

  6. When someone else joins the same “room” we broadcast that to the other connected users and we create a Conversation object that we’ve defined which wraps the browser’s peerConnection. The peer connection represents, as you’d probably guess, the connection between you and another person.

  7. The signaling server broadcasts the new socket.io session ID to each user in the room and each user’s client creates a Conversation object for every other user in the room.

  8. At this point we have a mechanism of knowing who to connect to and how to send direct messages to each of their sessions.

  9. Now we use the peerConnection to create an “offer” and store our local offer and set it in our peer connection as the local description. This contains information about how another client can reach and talk to our browser.

    peerConnection.createOffer();

    We then send this over our socket.io connection to the other people in the room.

  10. When a client receives an offer, we add it to our peer connection:

    var remote = new RTCSessionDescription(message.payload);
    peerConnection.setRemoteDescription(remote);

    and generate an answer by calling peerConnection.createAnswer() and send that back to the person we got the offer from.

  11. When the answer is received we set it as the remote description. Then we create and send ICE Candidates much in the same way. This will negotiate our connection and connect us.

  12. If that process is successful we’ll get an onaddstream event from our peer connection and we can then create a video element and attach that stream to it. At this point the video call should be in progress.
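
The “room” bookkeeping from steps 4 through 8 can be pictured with a few lines of plain JavaScript. This is only an in-memory illustration of grouping session IDs by room name, not signalmaster’s actual code; createRooms and its methods are names invented for this sketch.

```javascript
// Illustrative only: an in-memory stand-in for the room
// bookkeeping, not how signalmaster is actually written.
function createRooms() {
    var rooms = {}; // room name -> array of session IDs

    return {
        // Adds sessionId to the room (creating it if needed) and
        // returns the IDs of everyone else already in it, i.e. the
        // peers that should be told about the newcomer.
        join: function (name, sessionId) {
            var members = rooms[name] || (rooms[name] = []);
            var others = members.filter(function (id) {
                return id !== sessionId;
            });
            if (members.indexOf(sessionId) === -1) {
                members.push(sessionId);
            }
            return others;
        },
        // Removes sessionId and drops the room once it is empty.
        leave: function (name, sessionId) {
            var members = rooms[name] || [];
            var index = members.indexOf(sessionId);
            if (index !== -1) {
                members.splice(index, 1);
            }
            if (rooms[name] && members.length === 0) {
                delete rooms[name];
            }
        },
        // Returns a copy of the current member list for a room.
        members: function (name) {
            return (rooms[name] || []).slice();
        }
    };
}
```

The list returned by join is what lets each client create one Conversation per existing peer, and lets the server announce the new session ID only to that room.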

If you wish to dig into it further, send pull requests and file issues on the SimpleWebRTC project on github.
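
For a compact picture of steps 9 through 11 above, here is a rough sketch of a single handler for incoming signaling messages. It is not how SimpleWebRTC is written. It assumes the promise-returning forms of setRemoteDescription, createAnswer, setLocalDescription and addIceCandidate found in current browsers, and the message shape ({type, payload}), the send callback and the name handleSignal are all assumptions made for this example.

```javascript
// Hypothetical dispatcher for incoming signaling messages.
// pc:      an RTCPeerConnection (promise-based API assumed)
// message: {type: 'offer' | 'answer' | 'candidate', payload: ...}
// send:    function that relays a message back over the
//          signaling channel to the peer we are talking to
function handleSignal(pc, message, send) {
    if (message.type === 'offer') {
        // Step 10: store the remote offer, then create and send an answer.
        return pc.setRemoteDescription(message.payload)
            .then(function () {
                return pc.createAnswer();
            })
            .then(function (answer) {
                return pc.setLocalDescription(answer).then(function () {
                    send({type: 'answer', payload: answer});
                });
            });
    }
    if (message.type === 'answer') {
        // Step 11: the answer becomes our remote description.
        return pc.setRemoteDescription(message.payload);
    }
    if (message.type === 'candidate') {
        // Step 11: ICE candidates are passed along the same way.
        return pc.addIceCandidate(message.payload);
    }
    // Unknown message types are ignored.
    return Promise.resolve();
}
```

Taking the peer connection and the send function as arguments keeps the handler independent of any particular signaling transport, socket.io or otherwise.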

The road ahead

This is just a start. Help us make this stuff better!

There’s a lot more we’d like to see with this:

  1. Making the signaling piece more pluggable (so you can use whatever you want).
  2. Adding support for pausing and resuming video/audio.
  3. It’d be great to be able to figure out who’s talking and emit an event to other connected users when that changes.
  4. Better control over handling/rejecting incoming requests.
  5. Setting max connections, perhaps determined based on HTML5 connection APIs?

Hit me up on twitter (@henrikjoreteg) if you do something cool with this stuff or run into issues or just want to talk about it. I’d love to hear from you.

Keep building awesome stuff, you amazing web people! Go go gadget Internet!

About Henrik Joreteg

Henrik Joreteg is a Partner at &yet, where he’s written dozens of realtime apps two dozen ways. At &yet, he works on And Bang and provides consulting and training on JavaScript and HTML5 applications. Henrik also curates RealtimeConf and the Keeping it Realtime Newsletter. He believes WebRTC is the most interesting technology to hit the web in many years.


About Robert Nyman [Editor emeritus]

Technical Evangelist & Editor of Mozilla Hacks. Gives talks & blogs about HTML5, JavaScript & the Open Web. Robert is a strong believer in HTML5 and the Open Web and has been working since 1999 with Front End development for the web - in Sweden and in New York City. He regularly also blogs at http://robertnyman.com and loves to travel and meet people.



27 comments

  1. Julien

    Ho, woah, that’s awesome. Can’t wait to get rid of skype and co’s to favor this kind of open technology!

    March 21st, 2013 at 02:44

  2. louisremi

    “I’ve heard some “hardcore” devs complain that tools like this lead to too many poorly made tools and too many “wannabe” developers who don’t know what they’re doing. That’s garbage.

    Approachable tools that make developers feel empowered to build cool stuff is the reason the web is as successful and vibrant as it is.”

    Amen.

    March 21st, 2013 at 05:03

    1. Steve Price

      Preach it, brother.

      March 21st, 2013 at 11:44

  3. Nikos Roussos

    and an open-source alternative:
    http://hibuddy.monkeypatch.me/
    :)

    March 21st, 2013 at 09:00

    1. Henrik Joreteg

      That’s what’s awesome about these technologies. There will be lots of alternatives. Solving the federation problem could make it really interesting.

      Also, To clarify, both the signaling server and client library are MIT licensed.

      The conversat.io site itself is a static html page that just uses those tools. “view source” FTW :)

      March 22nd, 2013 at 07:03

  4. Chris Peterson

    I’m a developer, but not really a web developer. It’s posts like this that make me realize that I need to extend my skillset to include javascript so I can do cool stuff like this.

    March 21st, 2013 at 11:10

  5. Fabian

    Just saying: very cool! :)

    March 21st, 2013 at 12:39

  6. Robert O’Callahan

    Firefox supports createObjectURL. Also, we just landed a patch for nightly/FF22 (soon to be on FF21 as well) that makes autoplay work.

    March 21st, 2013 at 19:28

    1. Henrik Joreteg

      Excellent news, thanks!

      March 22nd, 2013 at 07:04

  7. Clayton Gulick

    The thing that still confuses me about WebRTC is the ICE negotiation of firewalls. The way I understand it, you’ll pretty much never get an actual peer-to-peer connection because everyone is sitting behind local firewalls, so all the traffic needs to be proxied through a central server (which sort of defeats the whole purpose). I’m afraid of setting up a WebRTC server experiment, because it seems like I’ll end up paying for a ton of proxied bandwidth. I’m confused by the ICE docs I’ve read though, so I could be missing something obvious. How does ICE allow peer-to-peer when both parties are sitting behind some SOHO router?

    March 30th, 2013 at 19:45

    1. Clayton Gulick

      To be more specific, I mean in the case where TURN is employed – i.e. symmetric NAT. I understand UDP hole-punching. It’s the TURN fallback of ICE which makes me nervous, since I don’t have a good sense of how frequently it will be used, and what the bandwidth costs will be.

      March 30th, 2013 at 20:03

      1. Henrik Joreteg

        Hi, Clayton. Unfortunately I’m not quite familiar enough with NAT traversal to be of much help in this case. Perhaps Robert (this blog’s editor) could direct you to someone at Mozilla who could help clarify this.

        April 7th, 2013 at 08:19

        1. Robert Nyman [Editor]

          Best way would probably be to ask the #media channel in irc://irc.mozilla.org.

          April 8th, 2013 at 10:11

  8. Fran

    Awesome stuff. I’ve been following WebRTC since several months ago and to find people working on this kind of tools is really appreciated.

    April 1st, 2013 at 15:12

    1. Henrik Joreteg

      Awesome! Happy for any feedback and/or pull requests :)

      Cheers!

      April 8th, 2013 at 08:57

  9. Damian

    So how easy is it to keep out uninvited people? I’m sure it won’t take long before unsavory types realise that they can just go to a url and make trouble. Vocal spam perhaps, or simply people eavesdropping.

    I really like these open innovations that come out, but you have to remember that not everyone on the internet plays nicely.

    April 2nd, 2013 at 15:07

    1. fileneed

      well, on tinychat they just have some big shiny kick and ban buttons for that…

      April 6th, 2013 at 20:12

    2. Henrik Joreteg

      This is definitely something that needs to be supported. Simply throwing everyone immediately into a chat was a quick-start way to remove as many barriers to entry as possible.

      In terms of how it would work, it’s something that needs to be done by the signaling server. Rather than immediately joining when someone else hits that url it would need to send a message saying a new user wants to join. In that case, you’d also want to give people some way to say who they are, etc.

      This would not be too terribly hard to add. It just adds complexity when the goal was to get people connected as quickly/simply as possible with as few lines of required code as possible. But that’s also why we open sourced it so people can tweak it to their needs.

      Anyway, for the purposes of conversat.io, in order to keep things dead simple for this app/library we opted to just let the guessability of the URL determine the security of the room.

      A little trick if you’re worried about unexpected guests: If you just hit the “let’s go” button on the conversat.io page without entering anything it will create a room with a long random string as the url, for example: http://conversat.io/5e06147d-55f3-48d9-8e85-511a8130de8e . This makes it easy to generate more “secure” rooms than simply doing something like http://conversat.io/test which is much more likely to have uninvited visitors.

      April 8th, 2013 at 08:54

  10. Guo

    Hi, are you developing it for supporting smartphones? In other word, how can you make it, just use the webRTC library Firefox supports or you write some other JS codes?

    April 5th, 2013 at 00:27

    1. Henrik Joreteg

      That’s mostly up to smartphone and mobile browser manufacturers. Since it’s built on the WebRTC standard if support exists in mobile browsers then it should Just Work™ :)

      April 8th, 2013 at 08:56

      1. Guo

        Can WebRTC be a part of HTML 5? Or will Firefox Mobile (Android) support it in the future? I do think it may help couples who are in long distance relationships.
        And another question: If I use Python or some other programming language, can I build a web app like conversat.io? Or does WebRTC take less memory?

        April 8th, 2013 at 19:04

        1. Robert Nyman [Editor]

          WebRTC is a separate draft at W3C. And yes, the idea is to have support on mobile as well.
          When it comes to building apps, I’m sure you’ll be able to build a good one.

          April 8th, 2013 at 23:52

  11. Jon Ellis

    Web RTC is truly a breath of fresh air. We have just implemented it for our online tutoring app, as a replacement for Flash. Pity it’s only available on Chrome at the moment, any idea of the implementation date on Firefox?

    Customer feedback has been excellent. We have blogged about our implementation experience at http://blog.tutorhub.com/2013/04/09/tutorhub-enters-the-world-of-webrtc.

    April 13th, 2013 at 07:46

    1. Robert Nyman [Editor]

      WebRTC covers many different things, and when it comes to getUserMedia, it has been in Firefox for some time.
      As mentioned in the article, it is behind a preference in Firefox Nightly, though, and it is planned to be available soon in default Firefox.

      April 13th, 2013 at 08:57

  12. Rahul

    Hello,
    Cannot seem to get it to work. Any firewall ports that need to be opened ?
    Thanks.

    April 14th, 2013 at 03:04

  13. Russ Petersen

    Wow, you made it so simple. Great job! I’m looking at using SimpleWebRTC for a secure 2 party video chat, with an encrypted connection. I’ve been digging through the code to find where I can use encryption, but so far I’m mostly lost. Can you give me an idea where I should start digging more?

    Thanks,
    Russ

    April 18th, 2013 at 15:10

    1. Henrik Joreteg

      WebRTC has a lot of security features baked in already for the media streams: http://www.html5rocks.com/en/tutorials/webrtc/basics/#toc-security

      If you want to control who has access to joining a particular room, you’d have to do that by modifying the signaling server to include permission checks, etc.

      April 19th, 2013 at 11:41
