The Next Internet? Inside PARC’s Vision of Content Centric Networking

8/7/12

The Internet may be hurtling toward collapse under the strain of too much traffic. But PARC research fellow Van Jacobson thinks he knows how to fix it.

He’s done it before. Back in the mid-1980s, when the Internet was seeing its first modest surge in usage, Jacobson noticed that data packets were piling up on the message routers of the day, like cars waiting for cross-traffic to clear before entering an intersection. Working with fellow Berkeley computer science instructor Mike Karels, he came up with a small change to the Transmission Control Protocol (TCP) that, in essence, allowed packets to ease into the intersections gradually, curing the congestion. Later, Jacobson also came up with a way to compress the “headers” or address sections of Internet Protocol (IP) packets from 40 bytes down to about 3 or 4 bytes, which made a big difference at a time when so many packets were still squeezing through narrow telephone lines.

But the challenges the Internet is facing today are very different, and call for a much broader solution, Jacobson believes. He argues that the global computing network was never designed to carry exabytes of video, voice, and image data to consumers’ homes and mobile devices, as it’s now doing, and that it will never be possible to increase wireless or land-line bandwidth fast enough to keep up with demand. In fact, he thinks the Internet has outgrown its original underpinnings as a network built on physical addresses, and that it’s time to put aside TCP/IP and start over with a completely novel approach to naming, storing, and moving data.

Jacobson’s alternative is called Content Centric Networking, or CCN, and it’s grown into the single biggest internal project at PARC, the Xerox-owned research center that’s famous as the birthplace of graphical computing, laser printing, and the Ethernet standard. If the ideas behind CCN were broadly adopted, PARC researchers believe, it would speed the delivery of content and vastly reduce the load on the networking equipment at the Internet’s core.

It would also pose a challenge to the model of utility-style storage and processing that’s come to be known as cloud computing. And that might undermine many current business models in the software and digital content industries—while at the same time creating new ones. In other words, it’s just the kind of revolutionary idea that has remade Silicon Valley at least four times since the 1960s. And this time, PARC doesn’t want to miss out on the rewards.

“When there is widespread adoption of CCN there will be lots of opportunities to build valuable businesses on top of it that are really impossible to foresee today,” says Teresa Lunt, vice president of PARC’s Computing Science Laboratory. “The main reason we’re investing is because we’re in love with the technology, and we want CCN to make it out into the world…[but] we know that PARC will be able to participate in the upside as well.”

Replacing “Where Is It?” with “Who Wants It?”

To understand why Content Centric Networking is so different, you have to start by looking at today’s Internet, which was designed back in the days when there were only a handful of machines that needed to talk to each other, and the network was used mainly for short bursts of point-to-point communication. In this established scheme, every piece of content has a name, but to find it you have to know in advance where it’s stored—which means the whole system is built around host identifiers and file hierarchies like www.xconomy.com/san-francisco/2012/08/07/the-next-internet/. (The first part of that URL gets translated into the IP address 63.246.24.145, which leads to the server at St. Louis, MO-based Contegix where Xconomy’s content database is hosted. The rest refers to the sub-sub-sub-folder on that server where WordPress, our content management system, stored this page.)
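
(For the technically inclined, here is a rough Python sketch of that first, location-centric step: the host-to-address lookup every browser performs before it can ask for anything. The address returned when you run it may differ from the one quoted above.)

```python
import socket
from urllib.parse import urlparse

url = "http://www.xconomy.com/san-francisco/2012/08/07/the-next-internet/"
parts = urlparse(url)

# Step 1: DNS turns the host name into the address of one particular machine.
ip = socket.gethostbyname(parts.hostname)
print(parts.hostname, "->", ip)   # the article quotes 63.246.24.145

# Step 2: the rest of the URL is just a path handed to that one machine.
# The name tells you nothing about where else the same page might live.
print("ask", ip, "for", parts.path)
```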

The fundamental idea behind Content Centric Networking is that to retrieve a piece of data, you should only have to care about what you want, not where it’s stored. Rather than transmitting a request for a specific file on a specific server, a CCN-based browser or device would simply …

Wade Roush is a contributing editor at Xconomy.



  • hhemken

    “How much might consumers be willing to pay for such a service?”

    We all know the answer: Nothing. They (we) will want to pay nothing. Somebody else will have to pay for that vast new caching infrastructure. Now that web advertising seems to be collapsing, the “who pays for it?” question seems like an enormous gaping hole in the model.

  • patpentz

    This would also be great as a replacement for email addresses (and physical addresses). Instead of addresses like johndoe@xyz.com or ‘John Doe, 130 Park Place, Denver, CO’, we would simply send email or physical mail to ‘John Doe’, with additional text to select only the correct John Doe. Change of address becomes trivial, and transparent to others. The actual address is hidden as well.

    • H.

      Never mind replacing email addresses – it could easily replace email itself (just to give one example). Just write some text in your favourite word processor, set it to be accessible by your good buddy John Doe, then sit back and relax while the document is automatically stored in the internet-based cloud and John scribbles his response on top of it. Want to do IM too? Just type faster. Security? Standard. Top quoting? Redundant. And no more ‘your message was rejected because your attachments were too big’ either. All the benefits of email and IM combined; none of the current jumping-through-hoops drawbacks. I imagine the MS Outlook team wouldn’t be too happy, but hey, can’t please everyone.

  • R.

    This article feels like reading about the Internet in the late 90s. Aside from the smartphone-based, personalized edge applications, how is this proposal different from the existing CDN industry? Very few big companies use a purely centralized model anymore. They’re all using Akamai or one of its competitors.

    • H.

      In addition to being standards-based rather than vendor-specific, the really big difference is that it scales all the way from the giant video distributors down to individual users. Instead of a content provider having a single master server and relying on a bunch of ‘dumb’ cache servers dotted around the world to reduce latency, you treat all servers as equal peers, replicate data across them according to whatever propagation rules are appropriate, and rely on crypto where necessary to determine who has rights to access a particular resource. It’s a bit like the difference between centralized source control systems like Subversion and distributed ones like Git – the latter blast the former out of the water for power, flexibility and multi-user support.
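
      To make the “any peer holding a valid copy will do” point concrete, here’s a rough sketch in Python. The catalogue, names and hashing are my own invention for illustration, not the actual CCN wire format (which signs each chunk of data rather than looking digests up in a table):

      ```python
      import hashlib

      # Toy catalogue: content name -> digest the publisher vouches for.
      # The point is that you trust the data itself, not the machine it came from.
      CATALOG = {"/xconomy/next-internet": hashlib.sha256(b"page bytes").hexdigest()}

      class Peer:
          """Stand-in for any cache node or origin server: a dict of name -> bytes."""
          def __init__(self, store):
              self.store = store
          def get(self, name):
              return self.store.get(name)

      def fetch(name, peers):
          """Accept the first copy whose digest matches, wherever it comes from."""
          expected = CATALOG[name]
          for peer in peers:
              data = peer.get(name)
              if data is not None and hashlib.sha256(data).hexdigest() == expected:
                  return data
          raise LookupError("no peer returned a valid copy of " + name)

      origin = Peer({"/xconomy/next-internet": b"page bytes"})
      cache = Peer({})   # an empty neighbourhood cache simply falls through to the origin
      print(fetch("/xconomy/next-internet", [cache, origin]))
      ```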

      I’m a little surprised the Parc folks didn’t mention the next logical step. Today’s consumer desktops are becoming more and more like a miniature version of the internet itself: a random assortment of untrusted applications, each sandboxed to prevent it causing mischief amongst the file system and its peers. In addition, users themselves are moving away from owning a single general-purpose PC to possessing multiple task-optimised devices (smartphone, tablet, smart TV, games console, NAS, etc). Apply the same “describe what it is, not where it’s stored” philosophy to users’ files, and various problems such as having ready access to your data from any of your devices, worrying about whether or not your backups are up-to-date, or viewing a document’s entire change history become non-issues: your data gets replicated and secured across your devices/home cloud/internet cloud according to whatever standard rules you specified when you set your network up. This might even be a better way for CCN to get its foot in the door: start at the individual user level and gradually work up.

      So, lots of opportunity to radically consolidate and simplify how consumer computing works today. It’s mostly a matter of working out the practical implementation issues (synchronization and security models, etc.), and figuring out how the heck to get major vendors like Apple and Microsoft and Facebook and Twitter to buy in when so much of their current business models are based on user lock-in and system balkanization. But if anyone can figure it out, I reckon the Parc folks can.

  • http://www.facebook.com/rraisch Rob Raisch

    We looked pretty extensively at the idea of authoritative naming of resources back in the early ’90s while developing the IETF “Uniform Resource Names” draft – see http://tools.ietf.org/id/draft-ietf-uri-urn-req-00.txt – and came to the conclusion that the idea of truly authoritatively naming a resource wasn’t really viable because we couldn’t think of a reasonable way to name things *authoritatively*.

    To illustrate, assume I write an article entitled “Authoritative Naming Won’t Work” and name it “robraisch/articles/no_authoritative_names” and later, I discover and fix a typo.

    Is my article now a different version of the original or simply a correction to the existing version? At what point do changes to my article require a new name?

    If my changes do require that I republish my article with a new name, what happens to any external references that might exist to the resource by its original name?

    What happens if changes I make invalidate references to it by external agents that rely on its original form?

    If I choose to publish my article in text, HTML, PDF, mp3, and SuperFutureDocumentFormat(tm), do I need to make five names for it or only one?

    And most importantly, who decides whether my article has one name irrespective of any changes I may make to it or formats I choose to publish it in?

    In the end, we reached the conclusion that the publisher of a resource must control its name but that doing so couldn’t be considered “authoritative” in any sense other than that which the publisher chose.

    • has

      “If my changes do require that I republish my article with a new name, what happens to any external references that might exist to the resource by its original name?”
      One option might be to attach a GUID to the document and let everyone else worry about what names/labels/metadata/whatever they want to associate with that. Or just treat the original name as canonical and alias any subsequent names to that. Once the process of storing and retrieving files is better abstracted over, the software can make such nuts and bolts invisible to users.
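
      A rough sketch of that aliasing idea, with made-up names and a plain Python dict standing in for whatever resolution service would actually do the job:

      ```python
      import uuid

      # The article gets one permanent identifier...
      doc_id = uuid.uuid4()

      # ...and any number of human-facing names simply point at it.
      aliases = {
          "robraisch/articles/no_authoritative_names": doc_id,
          "robraisch/articles/no_authoritative_names_rev2": doc_id,  # name after the typo fix
      }

      def resolve(name):
          """Old and new names keep working; readers never notice the rename."""
          return aliases[name]

      assert resolve("robraisch/articles/no_authoritative_names") == \
             resolve("robraisch/articles/no_authoritative_names_rev2")
      ```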

      “What happens if changes I make invalidate references to it by external agents that rely on its original form?”
      Isn’t this just the sort of problem hyperlinking was invented for? In such a long-lived system the only constant is change, so use it in ways that accommodate change gracefully instead of fighting it. And as a fallback, there’s always search, of course.

      Also, FWIW, rather than devise a new naming scheme it might just be simplest to retain the existing URL scheme and just tweak the way URLs are treated so that any trusted server which holds a copy of the desired resource can handle them. Technically, the domain and path portions of a URL are supposed to be treated as opaque anyway; it’s just an accident of design that they’re somewhat human-readable, encouraging humans to futz with their internal structure or assign special meaning to it.

      “If I choose to publish my article in text, HTML, PDF, mp3, and SuperFutureDocumentFormat(tm), do I need to make five names for it or only one?”
      This one is dead easy: it’s a complete non-issue once the user is no longer responsible for specifying precisely where the file is stored. The article should appear as a single resource which is declared as being available in five different representations. That allows the client to specify the article name and the format they’d most like to get it in, and leaves the server to worry about all the back-end storage and retrieval details.
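
      This is basically HTTP content negotiation. A rough server-side sketch in Python, with the resource and its formats invented purely for illustration:

      ```python
      # One resource, several representations; the client's Accept header picks one.
      REPRESENTATIONS = {
          "text/html":       b"<h1>Authoritative Naming Won't Work</h1>",
          "application/pdf": b"%PDF-1.4 ...",
          "text/plain":      b"Authoritative Naming Won't Work",
      }

      def negotiate(accept_header):
          """Return the first representation the client says it will accept."""
          for item in accept_header.split(","):
              media_type = item.split(";")[0].strip()   # ignore q-values in this toy
              if media_type == "*/*":
                  return "text/html", REPRESENTATIONS["text/html"]
              if media_type in REPRESENTATIONS:
                  return media_type, REPRESENTATIONS[media_type]
          return None, None   # a real server would answer 406 Not Acceptable

      print(negotiate("application/pdf, text/plain;q=0.8")[0])   # -> application/pdf
      ```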

      See Roy Fielding’s seminal paper on Representational State Transfer, aka REST. It’s just a pity so many web programmers today misunderstand it (or don’t understand it at all) – REST may not solve every problem, but it puts a good dent in quite a few of them.

      • JerryL

        Sorry, but this completely hand-waves on the real issues. For example, you say it’s dead easy to deal with multiple representations of the same document. But what counts as “the same”? Of the list given, the text version will certainly contain only a subset of what might be in the HTML and PDF. At the least, it will be missing formatting; it may well be missing embedded graphics, hyperlinks, what have you. The same, or different? Let’s add to the mix. All those formats are nominally final-form; how about an editable format? The same, or different? How about the same document translated into French?

        REST is a great idea, but ultimately it has problems with exactly the same issue. A given URI is supposed to name “the same thing” over time. But except in rather boring (though important) cases, “the same thing” does *not* mean “the same bytes”, nor even “a semantically equivalent representation of what you got last time”. No, it means “the same resource”, for some abstract notions of “resource” and “the same”. There are many cases where the appropriate definitions of these things are clear – and then REST works well. But there are other cases where the entire difficulty is figuring out what these notions should map to – if they even *have* natural mappings. Absent those, REST gets you nothing very useful.
        — Jerry

        • has

          “Sorry, but this completely hand-waves on the real issues. For example, you say it’s dead easy to deal with multiple representations of the same document. But what counts as “the same”? Of the list given, the text version will certainly contain only a subset of what might be in the HTML and PDF. At the least, it will be missing formatting; it may well be missing embedded graphics, hyperlinks, what have you.”

          You see this as a problem but it isn’t. It’s the *client’s choice* whether they want to see the full semantic HTML version, the pretty-looking PDF or the bare-bones text version.

          Assuming the document isn’t encrypted, the RESTful approach can provide further benefits too. For example, the server may host a single canonical file and perform the conversions to the lossier formats on the fly. If such conversions are common, the converted formats may be automatically cached for efficiency. This reduces workload on the original author, who only has to save the document in one format then leave the server to worry about the rest.
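
          A rough sketch of that convert-and-cache idea, with a toy markup-stripper standing in for a real HTML-to-text or HTML-to-PDF converter:

          ```python
          import re

          CANONICAL = {"article-42": "<p>Hello, <b>world</b></p>"}   # the author saves one format
          _cache = {}                                                # (name, fmt) -> converted copy

          def to_text(html):
              """Toy lossy conversion: just strip the tags."""
              return re.sub(r"<[^>]+>", "", html)

          def get(name, fmt="html"):
              """Serve the canonical copy, or convert on first request and cache the result."""
              if fmt == "html":
                  return CANONICAL[name]
              if (name, fmt) not in _cache:
                  _cache[(name, fmt)] = to_text(CANONICAL[name])   # only "text" in this toy
              return _cache[(name, fmt)]

          print(get("article-42", "text"))   # converted once, served from cache thereafter
          ```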

          “Let’s add to the mix. All those formats are nominally final-form; how about an editable format? The same, or different?”

          Things get a bit spicier once you add editing into the mix. For example, if a client GETs the HTML representation, modifies it and PUTs it back on the server, how should the system deal with the old PDF and text versions? Assuming the system tracks the resource’s full revision history (as a modern content system should), one option is to treat all old representations as the previous revision and declare the uploaded HTML document as the only representation available for the current revision. If versioning isn’t supported, delete the obsolete representations and either let the server regenerate them on the fly (see above) or else leave the author to save and upload the additional formats themselves.
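
          Roughly how the “old representations become the previous revision” rule might look, purely as illustration:

          ```python
          # Each resource is a list of revisions; each revision maps format -> bytes.
          history = {"article-42": [{"html": b"<p>v1</p>", "pdf": b"%PDF v1", "text": b"v1"}]}

          def put(name, fmt, data):
              """An edit in one format starts a new revision holding only that
              representation; the untouched PDF/text copies stay with the old one."""
              history[name].append({fmt: data})

          put("article-42", "html", b"<p>v2, edited</p>")
          current, previous = history["article-42"][-1], history["article-42"][-2]
          assert "pdf" not in current and "pdf" in previous
          ```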

          “How about the same document translated into French?”

          Non-issue again. The server declares which language(s) the document is available in and the client negotiates to obtain whichever representation best suits it.

          “REST is a great idea, but ultimately it has problems with exactly the same issue.”

          REST deals with the file format and language concerns pretty well, as long as you remember it’s all about *client choice*, not about the server imposing its own opinions on the client. Revision tracking needs more thought, but like I say REST should be the starting point, not the complete answer in itself.

  • http://twitter.com/xRDVx Ardyvee

    This sounds a lot like Freenet, if you ask me.

  • Volt Lover


    “The request has to travel from the Apple TV over my Wi-Fi network, into Comcast’s servers, then across the Internet core, and finally to Yahoo.”

    It would be better if the author understood how the Internet works. NO Comcast server is proxying that request.

    • Jed

      It doesn’t matter whether Comcast uses NAT or not; the author is exactly correct. Engage brain next time?

  • http://www.facebook.com/rraisch Rob Raisch

    Another issue that strikes me as problematic is how Content Centric Networking could be used to support content-specific censorship. It’s not unimaginable that Verizon or Comcast might use CCN to restrict their customers’ access to content from Netflix. One useful aspect of Internet address-based routing is that “bits is bits”: irrespective of their source, they are all equal. In CCN, bits no longer enjoy this opacity because they “belong” to a publisher-identifiable resource.

    • Mark

      bits no longer enjoy this opacity because they “belong” to a publisher-identifiable resource.

      Good thought by the way …

  • http://newstechnica.com David Gerard

    He appears to have described BitTorrent with magnet links.

    • http://www.facebook.com/rrohbeck Ralf-Peter Rohbeck

      Yup. More precisely, a distributed content-addressable system like a DHT.

  • Happy Heyoka

    @David Gerard wrote: “He appears to have described BitTorrent with magnet links.”

    My thoughts exactly – the torrent file describes the desired content and a way to check that you have a valid copy. You don’t actually care as a user where that content might be. With some support from the infrastructure (e.g. anycast) it would kill HTTP bandwidth-wise (and maybe speed-wise) for content that is in common and frequent use on a wide scale.
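
    That “name that doubles as a validity check” is just a cryptographic hash. A rough Python sketch, assuming SHA-1 in the spirit of BitTorrent’s info-hash:

    ```python
    import hashlib

    def magnet_name(content):
        """Name the content by its own hash, as a magnet link's info-hash does."""
        return "urn:sha1:" + hashlib.sha1(content).hexdigest()

    def is_valid_copy(name, candidate):
        """Any peer, cache or mirror can serve the bytes; the name proves the copy."""
        return magnet_name(candidate) == name

    original = b"exabytes of video"
    name = magnet_name(original)
    print(is_valid_copy(name, original))          # True: this copy is the real thing
    print(is_valid_copy(name, b"tampered copy"))  # False: reject and try another peer
    ```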

    I suspect that Van has in mind an additional level of indirection; off to read a slightly more technical description…

  • http://covac-software.com/ Christian Sciberras

    Now here’s the real question.

    Do you browse the web with BitTorrent? No?

    Since the web amounts to a good chunk of the internet, how will the next internet (which won’t be current-web-friendly) be the “Next Internet”?

    Don’t get me wrong, the research some scientists are doing is commendable. The hype some people come up with? Not at all.

    Especially from a guy who doesn’t know the difference between a file on a server and URL rewriting (AKA routing).

  • Lawrence

    First of all, the Internet did not use TCP/IP in 1985 – it used UUCP until the late 1980s, so what are you talking about??

  • http://www.ur2die4.com/ amanfromMars

    Thanks for all of that mould breaking info/intelligence, Wade. It is good to hear that not all are content with things as they are, whenever they can and need to be better in Beta Virtual Programs, …. SMARTR Future AI Works in Immaculate Stealthy Progress.

    “Similarly, in a content-centric network, if you want to watch a video, you don’t have to go all the way back to the source,” Lunt says. “I only have to go as far as the nearest router that has cached the content, which might be somebody in the neighborhood or somebody near me on an airplane or maybe my husband’s iPad.”

    For leading original novel content-centric networks though, in order that one can lead in a CHAOSystem*, one will always need to feed and seed a direct, and ideally exclusive connection to source, to ensure there is no conflict and cross interference with competition or opposition, although it is as well to consider that in such networks, is it the nature of the leading original novel content itself, which sublimely delivers all of that effortlessly and invisibly and anonymously.

    Words Shared Create and Control Worlds in and for Worlds in Control of Creation with Shared Words ……. which is a Fabulous Fact and an Outrageous Fiction which does not Suffer Fools to Use and Abuse and Misuse ITs Tools.

    * Clouds Hosting Advanced Operating Systems