the internet

This article is part of the series the interwebz.

The genesis of the internet was a US Department of Defence project to provide communications between the Advanced Research Projects Agency (ARPA) research bodies. This original network was called ARPANET.

The ARPANET project started in the 1960s, and its initial deployment in 1969 connected four sites in the western US: three in California and one in Utah. By 1975, it was international, using satellites and undersea cables to reach the UK and the rest of Europe. The community consisted of universities and military research organisations.

ARPANET predates the earliest personal computers by ten years.

The purpose of ARPANET was to allow member bodies to share computing resources and to exchange research findings. It introduced some quite remarkable new technologies, particularly communication resilience through packet switching. If a link went down, the network could re-route through alternative channels. Given that ARPANET was a product of the Cold War, this was a feature to mitigate the effects of a nuclear strike.

In simple terms, ARPANET was a means to connect networks together. The network at one site was now able to connect to the network at another site. It eventually became known as the Internet, from the term “inter-network”.

ARPA, or DARPA (Defence Advanced Research Projects Agency) as it became, sponsored some other notable research that has become mainstream. One of the member bodies was Xerox, particularly their research division in California – the Palo Alto Research Center (PARC). PARC invented, among other things, the laser printer and a graphical user interface called WIMP (Windows, Icons, drop-down Menus and Pointer). This user interface was adopted by Apple, and later pretty much everyone else. Just think that as we point and click and send messages around the world, we are using military technology from the Cold War.

Katie Hafner and Matthew Lyon’s “where wizards stay up late” explores this history and is a wonderful read.

Two events in 1983 and 1984 shaped today’s internet. The underlying technical protocols were replaced by TCP/IP, the protocols we still use today, and the Department of Defence separated from ARPANET, forming their own sub-net called MILNET. These changes paved the way for significant expansion, with ARPANET becoming a sub-net of the nascent internet.

For Facebook’s edification, that is a lot longer than 25 years ago.

The next pertinent event came in 1991 when the High Performance Computing Act, the “Gore Bill”, was enacted by the US Congress. This effectively relinquished US government control and opened the internet to everyone. It also removed the prohibition on commercial activities on the internet. Al Gore is often misquoted or taken out of context as claiming to be the “inventor of the internet”. He can rightly claim, I believe, to be the “father of the Information Superhighway”, a term he first used.

For historical interest, the President who signed this Act into US law said at the time:

“The Act would help unlock the secrets of DNA, open up foreign markets to free trade, and promise cooperation between government, academia, and industry”. George H W Bush

Who would have thought?

how does the internet work?

The internet is, at its most basic, just a mechanism for getting messages or data from one network to another.

To send something, the computer breaks it into pieces called packets, addresses these packets, and forwards them to the internet. These packets then get routed to their destination, often through several hops and possibly different paths. Once they are at their destination, they are reassembled, acknowledged if required, and then processed.
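For the curious, here is a minimal sketch of that break-up-and-reassemble idea in Python. It is a toy, not how real network software is written: the packet layout, the function names and the destination address (taken from a range reserved for documentation) are all my own assumptions for illustration.

```python
import random

def to_packets(message: bytes, destination: str, size: int = 8) -> list[dict]:
    """Break a message into addressed, numbered packets."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"to": destination, "seq": seq, "total": len(chunks), "payload": chunk}
        for seq, chunk in enumerate(chunks)
    ]

def reassemble(packets: list[dict]) -> bytes:
    """Put packets back in order, whatever order they arrived in."""
    if not packets:
        raise ValueError("nothing arrived")
    ordered = sorted(packets, key=lambda p: p["seq"])
    if len(ordered) != ordered[0]["total"]:
        raise ValueError("some packets are missing")
    return b"".join(p["payload"] for p in ordered)

message = b"Those amazing pictures of Saturn"
packets = to_packets(message, destination="192.0.2.10")  # hypothetical address
random.shuffle(packets)  # packets may take different paths and arrive out of order
received = reassemble(packets)
```

The sequence number is what lets the receiver rebuild the message even though the packets were shuffled along the way.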

If a response is required, then the same process is repeated to send it.

A service called the Domain Name System (DNS) is used to find out where to send the packets. This is just an address book that translates domain names (such as www.pfeiffer.net.au, where this article is) into an address that computers can understand. The DNS is at the heart of the internet and exists in many copies in many places.
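The address book idea can be sketched in a few lines of Python. The dictionary below is a made-up stand-in for the DNS, using names and addresses reserved for documentation; `look_up` and `ask_real_dns` are names I have invented. The second function hands the question to the operating system's real resolver through the standard library's `socket.gethostbyname`, and it needs a working network connection.

```python
import socket

# Hypothetical address book; these names and addresses are reserved for examples.
ADDRESS_BOOK = {
    "www.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.25",
}

def look_up(name: str) -> str:
    """Translate a domain name into an address, as an address book would."""
    try:
        return ADDRESS_BOOK[name.lower()]  # domain names are not case-sensitive
    except KeyError:
        raise LookupError(f"no address known for {name}") from None

def ask_real_dns(name: str) -> str:
    """Ask the operating system's resolver; needs a working network connection."""
    return socket.gethostbyname(name)
```

Calling `look_up("www.example.com")` returns the address from the toy book, while `ask_real_dns` returns whatever the real DNS says at the time.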

One of the great strengths of this design is that the internet does not need to be aware of the contents of the packet. This means that anything can be sent by this method. Email, video streams and banking transactions are all handled the same way. This ubiquity is also considered a weakness by many and is discussed in the article on why the internet is broken.

the internet today

Today we have a pervasive and ubiquitous internet. If a device can compute, it can likely communicate. If it can communicate it is likely connected to the internet. These devices include traditional computer devices and mobile devices such as smart phones and tablets. It includes industrial and consumer sensors and devices that we think of as “things”. It includes surveillance cameras, and radio and television broadcasts.

Even our landline phone is involved. Our analogue conversation is digitised, directed through the internet, and converted to analogue form at the other end. Own a modern car? It is most likely connected.

The internet is not limited to this planet. Space probes are connected to the internet. Those amazing pictures of Saturn? They came from Cassini, an internet-connected spacecraft.

There are a lot of connected devices. Cisco, who make many of the devices that route traffic around the internet, estimated that in 2012 there were somewhere between eight and ten billion internet-connected devices. They projected that in 2020 there would be 40 billion such devices.

“99% of physical objects that may one day join the network are still unconnected” Cisco, 2012 Annual Report

This has massive implications for the future of the internet. The first of these is communications saturation. Is there enough communications bandwidth to support all these devices and the data they publish or consume, or will we need to start rationing?

The rationing of scarce resources is hardly new, but it is interesting to consider what the study of economics teaches us. A black market? Hoarding? Inflation? Wealth defined by communications capability? War over telecommunications resources? Sounds excessive, but not outside the realms of possibility. An interesting problem for others to solve, I think.

The second problem is that there are nowhere near enough addresses for all these devices. For a device to be on the internet it needs an address and with the current addressing scheme there are only around four billion possible addresses. When the technology was designed several decades ago, no-one predicted this level of proliferation of internet connections.
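The arithmetic is easy to check. The sketch below assumes the current scheme is IPv4, whose addresses are 32 bits long, and uses Python's standard `ipaddress` module to count them. The device figures are simply Cisco's numbers as quoted in this article.

```python
import ipaddress

# Every possible address under the current (IPv4, 32-bit) scheme.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses

# Cisco's figures as quoted in this article.
devices_2012_low = 8_000_000_000
devices_2020_projected = 40_000_000_000

print(f"possible addresses: {ipv4_total:,}")
print(f"addresses per device, 2020 projection: {ipv4_total / devices_2020_projected:.2f}")
```

Even the lower 2012 estimate is nearly double the number of addresses available, and in practice some of those addresses are reserved and cannot be handed out at all.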

This problem has inspired an aggregation solution, where several devices sit behind an aggregator and share an address. This has been workable, more or less, but it is not an enduring solution. There is an alternative addressing scheme that does not have these limits, but it is certainly not in common usage.
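This sharing trick is usually called network address translation (NAT). A rough sketch of the bookkeeping follows. The class name, the starting port number and all the addresses are my own inventions for illustration, and a real aggregator tracks a good deal more than this.

```python
import itertools

class Aggregator:
    """Toy address-sharing box: many private devices, one public address."""

    def __init__(self, public_address: str):
        self.public_address = public_address
        self._ports = itertools.count(40000)  # arbitrary starting port
        self._outbound = {}  # (private address, private port) -> public port
        self._inbound = {}   # public port -> (private address, private port)

    def send(self, private_address: str, private_port: int) -> tuple[str, int]:
        """Return the public (address, port) an outgoing packet appears to come from."""
        key = (private_address, private_port)
        if key not in self._outbound:
            public_port = next(self._ports)
            self._outbound[key] = public_port
            self._inbound[public_port] = key
        return self.public_address, self._outbound[key]

    def receive(self, public_port: int) -> tuple[str, int]:
        """Work out which private device a reply on this port belongs to."""
        return self._inbound[public_port]

box = Aggregator("203.0.113.5")           # hypothetical public address
phone = box.send("192.168.1.20", 51000)   # two devices behind the same box
laptop = box.send("192.168.1.21", 51000)
```

Both devices appear to the outside world under the one public address, and the box uses the port number to work out who a reply is for.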

It is worth noting that the addresses under the old scheme have been rationed for some time. Another interesting problem for someone to solve – and sooner, rather than later, would be good.

Some other things worth considering. What goes on the internet, stays on the internet. Every purchase, post or tweet we make is copied, backed-up, archived and generally replicated several times. It is never deleted. We love this because if an important site goes down we can often use their fail-over facility, because all their data is replicated. And, by their data, I mean our data. When we suddenly realise we have been stupid and deleted something important some time ago, we don’t worry because we can usually recover it. We can do this because it hasn’t actually been deleted at all, just removed from view. However, we don’t like this because of that embarrassing selfie we posted. It is there for eternity, or even longer. It’s like a weed – remove it and it just reappears, despite our best and continual efforts.

The resilience of the internet is another matter. When it was designed, someone forgot to put an on-off switch in the specifications. Consequently, it would be almost impossible to turn the internet off today. Assuming it was considered important to deactivate the internet, just how that would be achieved is not well understood. Consensus suggests that the most likely successful method is to turn off every single device connected to, and part of, the internet. If we only did part of the task, the internet’s inbuilt resilience would find a way to overcome the loss of large parts of its fabric. Once a device was reconnected, the internet would restore it.

Given Cisco’s estimates, that means every man, woman and child on the planet would need to find and disconnect five or more devices each, all within a short period. Achieving this would be the victory of hope over experience.

Assuming we were successful in shutting down the internet, the world without it is unthinkable.

It is not just a First World issue. Sure, not being able to get an Uber, order our café latte to go, or check our online betting agency will be a massive inconvenience. But there are alternatives.

For the Third World, the situation would be dire. The internet delivers the support for addressing problems of health, poverty and education. It also supports the execution of trade and its concomitant economic benefit.

Without the internet, humanity would revert to something like Western civilisation’s Dark Ages, where information and knowledge were controlled by the few.

We cannot exist without the internet. But I’d still like that on-off button, just in case.

This resilience prompts an interesting thought exercise. There are many huge data centres that support the internet. Provided by the likes of Amazon, Microsoft, IBM and Akamai, they are distributed around the world, and are in geologically stable locations. They are largely unmanned, self-maintaining and self-sufficient. Most have their own independent power supplies, often using thermal energy or wind farms. They tend to be linked by fibre-optic cable, which will withstand just about anything except a pair of bolt-cutters. The concentrators, switches, hubs and routers which handle the traffic are housed in similar self-sufficient facilities.

In the event of a global catastrophe which may eradicate all of humanity – think pandemic disease or meteors, or our own stupidity – it is conceivable there will only be two survivors: cockroaches and the internet. Certainly, much of the fabric of the internet might be destroyed or become inoperable, but remember that it was designed for such a situation. It is designed to be resilient, and to run in a degraded mode. Even an invasion by aliens is unlikely to get rid of our embarrassing selfie.

The internet is a massive, massive creation. It uses military technology largely invented in the 1960s, is controlled by no-one, and the world’s economy and livelihood depend upon it. Arguably it is the biggest thing that humankind has ever built.

how big is the internet?

The simple answer is that no-one truly knows.

The best approximations come from a few studies where they have measured some of the larger players, such as Amazon, and have extrapolated, based on estimates of their share of the internet. The studies have periodically been performed by companies such as Cisco and IBM, or publications such as Forbes magazine.

What they seem to agree on is that the internet today holds about ten zettabytes of data.

Now, a zettabyte is just one of those fundamentally incomprehensible quantities. It is 10^21 bytes, written as a one followed by 21 zeroes. Try as I might, I cannot find a way to put it into some frame of reference. A billion trillion? That hardly helps.
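One rough way to get a feel for the number is to count disk drives. The sketch below assumes a one-terabyte drive as the yardstick, which is my choice rather than anything from the studies.

```python
ZETTABYTE = 10**21  # bytes
TERABYTE = 10**12   # bytes, the assumed size of one disk drive

internet_bytes = 10 * ZETTABYTE             # the estimate quoted in this article
drives_needed = internet_bytes // TERABYTE  # how many 1 TB drives that would fill

print(f"{ZETTABYTE:,} bytes in a zettabyte")
print(f"{drives_needed:,} one-terabyte drives for ten zettabytes")
```

That comes to ten billion drives, which is more than one for every person on the planet.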

There is evidence to suggest that this estimate is conservative. It is known that there is more storage at what is called the edge of the internet – our phones, for example – than there is at the core of the internet. Unfortunately, no-one has come up with a reliable method for including this data into internet size estimates.

The internet is not static and is growing at a disturbing rate. Conventional wisdom had the internet doubling in size every 18-24 months. This wisdom was challenged by an IBM study that revealed that 90% of the data on the internet today was created in the last two years alone. Our current upload rate for data is roughly 2.5 exabytes per day (an exabyte is 10^18 bytes).

If this data is correct, we have tenfold growth every 18-24 months, rather than twofold. This is disturbing because this growth rate is greater than our ability to make storage devices. Simply, we are going to run out of disk drives, unless a solution can be found.
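The tenfold figure follows directly from the 90% claim, as this small calculation shows. It takes the IBM figure at face value and assumes steady growth when working out the equivalent doubling time. The variable names are mine.

```python
import math

share_created_recently = 0.90  # IBM figure quoted in this article
period_years = 2

# If 90% is new, the remaining 10% is everything that existed two years ago.
growth_factor = 1 / (1 - share_created_recently)

# Equivalent doubling time at that pace, assuming steady exponential growth.
doubling_time_months = period_years * 12 * math.log(2) / math.log(growth_factor)

print(f"growth over {period_years} years: about {growth_factor:.0f} times")
print(f"equivalent doubling time: about {doubling_time_months:.1f} months")
```

On those assumptions the internet would be doubling roughly every seven months, rather than every 18-24.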

If we take Cisco’s estimates, we have tens of billions of connected devices, accessing an incomprehensible amount of data.

This is available twenty-four hours a day, 365 days a year. Further, based on estimates from Berkeley Laboratories, it seems that somewhere between two and five percent of the world’s energy production is devoted to keeping the internet running.

And we use it every day without paying it too much attention.

For something so vast and complex, how do we humans use it? Enter Tim Berners-Lee and the World Wide Web. Which leads nicely to my next article.

digression

There is apparently a patron saint of the internet. It is Isidore of Seville, who wrote a twenty-book series called Etymologies, or the Origins. In this he tried to record everything that was known then. Published after his death in 636CE, it was considered the encyclopedia of all human knowledge for the next thousand years.
