Thursday, 27 December 2012

WebRTC is the new battleground for peer-to-peer vs. server-based models for communications

I'm doing a really deep dive into WebRTC technology and business models at the moment. My view is that it's going to be a huge trend during 2013, and will be subject to the highest levels of hype, hope, marketing, debunking, politics, ignorance and misinformation. I'm not predicting it will take over the world (yet) - but I certainly am predicting that it's going to be a major disruptor.

**NEW Feb 2013 - Disruptive Analysis WebRTC report - details here** 

It's a fast-moving and multi-layer landscape encompassing telcos, network suppliers, device vendors, Internet players, software developers, chip vendors, industry bodies, enterprise communications specialists and probably regulators. Because my research and analysis "beat" covers all of those, I'm hoping to be the best-placed analyst and strategy consultant to decode the various threads and tease out predictions, opportunities, threats and variables.

One of the most interesting aspects is the linkage between intricate technology issues, and the ultimate winners and losers from a business point of view. Just projecting based on the "surface detail" from PR announcements and vendor slide-decks misses what's going on beneath. I'm finding myself going ever deeper down the rabbit-hole, before I can return and emerge with a synthesised and sanitised picture of possibilities, probabilities and impossibilities. That's not to say that there isn't also a set of top-down commercial and end-user trends driving WebRTC as well - but that's for another day.

A cursory glance at the WebRTC landscape reveals a number of technical battlegrounds - or at least foci of debate:

  • Codec choices, especially VP8 vs. H.264 for video
  • Current draft WebRTC vs. Microsoft's proposed CU-RTC-Web vs. whatever Apple has up its sleeve
  • The role of WebSockets, PeerConnection, SPDY and assorted other protocols for creating realtime-suitable browser or application connections
  • What signalling protocols will get adopted along with WebRTC - SIP, XMPP and so on
  • What does WebRTC offer that Flash, Silverlight and other platforms don't?
  • What bits of all this does each major browser support, when, and how? How and when are browsers updated?

While a lot of these seem remote and abstruse, there is another (mostly unspoken) layer of debate here:

Is WebRTC mostly about browser-to-browser use cases? Or is it aimed more at browser-server/gateway applications?

That is the secret question which is both chicken and egg here. Certain of the technical debates above tend to favour one set of use cases over the other - perhaps by making things easier for developers, or introducing the role of third parties who operate the middle-boxes and monetise them as "services". Because of this, it is also the hidden impetus behind various proposals and political machinations of various vendors and service providers. Other, less "Machiavellian" players are going to find themselves in the role of passengers on the WebRTC train, their prospects enhanced or damaged by these external factors beyond their control.

Let's take an example. Cisco and Ericsson are both fans of H.264 being made a mandatory video codec for WebRTC. Now there are some good objective reasons for this - it is widespread on the Internet and on mobile devices and it is acknowledged as being of good quality and bandwidth-efficiency. But.... and this is the pivot point... it is not open-source, but instead incurs royalty payments for any application with more than 100K users. Conversely, Google's preferred VP8 is royalty-free but has limited support today - especially in terms of hardware acceleration on mobile devices. Maybe in future we'll see VP8-capable chipsets, but for now it has to be done in software, at considerable cost in terms of power use.

On the face of it, Cisco and Ericsson are behaving entirely rationally and objectively here. A widely adopted, hardware-embedded codec is clearly a good basis for WebRTC. But.... by choosing one with a royalty element, they are also swaying the market towards use-cases that have business models associated with them; especially ones that are based on "services" rather than "functions", as someone, somewhere, will need to pay the H.264 licence. (Ericsson is a member of the MPEG-LA patent holders for it, too). That works against "free-to-air" WebRTC applications that work purely in a browser-to-browser or peer-to-peer fashion. I guess that it could just push the licensing cost onto the browser providers, ie Google and Mozilla etc, but that doesn't help non-browser in-app implementations of WebRTC APIs.

But looking more broadly at all the battles above, I see a "meta-battle" which perhaps hasn't even been identified, and which also links to things like WebSockets (which is a browser-server protocol) and PeerConnection (browser-browser) as well as the role of SIP (very server/gateway-centric).

In a browser-to-browser communications scenario, there is very little role for communications service providers, or those vendors who provide complex and expensive boxes for them. Yes, there is a need for addressing and assorted capabilities for dealing with IP and security complexities like firewalls and NATs, but the actual "business logic" of the comms capability gets absorbed into the browser, rather than a server or gateway. It's a bit like having the Internet equivalent of a pair of walkie-talkies - once you've got them, there's no recurring service element tied to "sessions". Only with WebRTC, they'd be "virtual" walkie-talkies blended into apps and web-pages.

Now, the server-side specialists have other considerations here too. Firstly, they have existing clients - telcos - that would like to inter-work with all the various end-points that support WebRTC. Those organisations want to re-use, extend and entrench their existing service models, especially telephony and SIP/IMS-based platforms and offerings. Various intermediaries such as Voxeo, Twilio and others are helping developers target and extend the reach of those services via APIs, as discussed in my last post. Some vendors like SBC suppliers are perhaps a bit less exposed than those more focused on switching and application servers.

There is also the enterprise sector here, which will clearly like to see its call-centres and websites connect to end-users via whatever channel makes most sense. WebRTC offers all sorts of possibilities for voice, video and data interaction with customers and suppliers. They'd also (generally) prefer to reduce their reliance on expensive services-based business models in the middle, but they're a bit more pragmatic if the costs become low enough to be ignored in the wider scheme of things.

Now all of this looks like a big Venn diagram. There are some use-cases for which servers and gateways are absolutely essential - for example, calling from a browser to a normal phone. Equally, there are others for which P2P makes a huge amount of sense, especially where lowest-latency connections (and maximum security/privacy) are desirable. It's the bit in the middle that is the prize - how exactly do we do video-calling, or realtime gaming, or TV-hyper-karaoke, or a million other possible new & wonderful applications? Are they enabled by communications services? Or are they just functions of a browser or web-page? We don't have a special service provider to enable italic words online, so why do we need one for spoken words or moving visuals?

This isn't the only example of a P2P vs. P2Server battle - obviously the music industry knows this, as well as (historically) Skype. But it goes further, for example in local wireless connectivity (Bluetooth or WiFi Direct, vs. service-based hotspots or Qualcomm's proposed LTE-Direct). The Internet itself tends to reduce the role of service providers, although the line dividing them from content/application providers is much more blurry.

It would be wrong to classify Google as being purely objective here either. Despite high-profile moves like Google Voice, Gmail and Chat, I think that its dirty secret is that it doesn't actually want to control or monetise communications per se. I suspect it sees a trillion-dollar market in telecoms services such as phone calls and SMSs that could - eventually - be dissipated to near-zero, with those sums diverted into alternate businesses in cloud infrastructure, advertising and other services.

I suspect Google believes (as do I) that a lot of communications will eventually move "into" applications and contexts. You'll speak to a taxi driver from the taxi app, send messages inside social networks, or conclude business deals inside a collaboration service. You'll do interviews "inside" LinkedIn, message/speak to possible partners inside a dating app etc. If your friend wants to meet you at the pub, you'll send the message inside a mapping widget showing where it is... and so on.

I think Google wants to monetise communications context rather than communications sessions, through advertising or other enabling/exploiting capabilities.

Even when abstracted via network APIs, conventional communications services pull through a lot of "baggage" (ie revenue and subscriber lock-in). They perpetuate the use of scarce (and costly) resources like E.164 phone numbers.

I also think that Microsoft and Apple are somewhere in the middle of this continuum, which is why they are procrastinating. They both have roles to play in both scenarios - and therefore, perhaps, are the kingmakers. Both are advocates on the specific issue of H.264 - Apple because of FaceTime, and Microsoft for reasons that seem unclear to me, as Skype is adopting VP8. More generally, Microsoft seems more server/network-centric, but is also wary of doing anything that allows the IE browser to fall further behind.

Either way, this contretemps is about more than just technology - it is, ultimately, rooted in the nature of WebRTC as a business. Specifically, it is about drawing the boundary between WebRTC services and WebRTC features.

I'm not making a judgement call here. This is not so much an iceberg analogy as a tectonic one. We've got a number of plates colliding. The action - the subduction zone - is occurring at a deep level. And over the next few years we're going to get some sudden movements that generate earthquakes and tsunamis.

(Amusingly, the first line on the tectonics web-page says "When two oceanic plates collide, the younger of the two plates, because it is less dense, will ride over the edge of the older plate" - perhaps a better analogy than I realised at first!)

Keep reading this blog over the coming days: I'm working on the first seismic map of the WebRTC world. Sign up for updates here and follow @disruptivedean on Twitter.
