Tuesday 4th December 2012 (08:30AM)
(As with all "concise and simple" articles I assume no prior knowledge of the subject and keep the length to less than 1500 words.)
Have you ever wondered what makes it possible to serve web pages like this one, deliver an email or join a video conference, all seemingly in an instant? The answer is the Internet, a decentralised global network of computers. This article explains what's involved.
Devices on the Internet communicate using different yet related protocols. A protocol is a set of rules and conventions that explain how to communicate information. For example, although nothing to do with the Internet, Morse code is a sort of communication protocol.
For the Internet to work, protocols are layered: each layer provides specific functionality needed to make the complete stack of layers work as a whole. Layers depend upon each other, with lower layers providing services for the layers above.
At the very lowest layer are protocols concerning how information is physically sent from one device to another. At the very highest layer are protocols for information sharing systems such as the World Wide Web (WWW or just the "Web"), Email, Internet Relay Chat (IRC) and Voice Over IP to name but a few.
Layers are not the end of the story. Messages are split into chunks of data, sent over the network and then reassembled by the recipient. Each protocol has a different name for such chunks of data: for example, at the data link layer they are called "frames", at the network layer "packets" and at the transport layer they are called "segments" or "datagrams". Chunks of data from higher level protocols are wrapped within those of the lower levels.
Data chunks contain two types of information: control information and the payload (containing data from the higher level protocols). Control information is used by the protocol to fulfil its function while the payload is handed over to the next protocol up in the layers described above.
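To make the wrapping concrete, here is a toy sketch in Python. The header format (a length-prefixed JSON blob) and the function names wrap, unwrap and receive are my own inventions for illustration; real protocols use compact binary headers defined in their RFCs. The addresses are reserved documentation values.

```python
import json

def wrap(layer, control, payload):
    """Build a chunk: this layer's control information followed by the payload."""
    header = json.dumps({"layer": layer, **control}).encode()
    # A 2-byte length prefix tells the receiver where the header ends.
    return len(header).to_bytes(2, "big") + header + payload

def unwrap(chunk):
    """Split a chunk back into (control information, payload)."""
    size = int.from_bytes(chunk[:2], "big")
    header = json.loads(chunk[2:2 + size])
    return header, chunk[2 + size:]

# Sending: each lower layer wraps the chunk handed down from the layer above.
message = b"Hello, Web!"
segment = wrap("transport", {"dst_port": 80}, message)
packet = wrap("network", {"dst_ip": "192.0.2.10"}, segment)
frame = wrap("data link", {"dst_mac": "00:00:5e:00:53:01"}, packet)

def receive(frame):
    """Receiving: peel off one layer at a time, handing the payload upwards."""
    seen, chunk = [], frame
    for _ in range(3):
        header, chunk = unwrap(chunk)
        seen.append(header["layer"])
    return seen, chunk

print(receive(frame))
# (['data link', 'network', 'transport'], b'Hello, Web!')
```

Notice that each layer only ever reads its own control information; the payload is passed on untouched.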
Furthermore, each connected device needs a unique address so it can be found on the network. These addresses are numbers called IP addresses. They are implemented at the network layer by the Internet Protocol (IP), hence the name.
Blocks of numbers are assigned to different "entities" (governments, educational institutions, organisations and companies) by the Internet Assigned Numbers Authority (IANA). Such entities further assign sub-blocks until we arrive at a single IP address assigned to an individual device on the network.
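Python's standard ipaddress module can illustrate how a block is carved into sub-blocks. This is only a sketch: the block 198.51.100.0/24 is a range reserved for documentation, and the split into four equal parts is an assumption chosen for the example rather than how any real entity allocates its addresses.

```python
import ipaddress

# A block of 256 addresses assigned to some hypothetical organisation.
block = ipaddress.ip_network("198.51.100.0/24")

# The organisation splits its block into four sub-blocks of 64 addresses each.
sub_blocks = list(block.subnets(new_prefix=26))
for sub in sub_blocks:
    print(sub, sub.num_addresses)

# Eventually a single address is handed to an individual device.
device = next(sub_blocks[1].hosts())
print(device, device in block)
```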
IP addresses generally come in two types reflecting the version of IP that they correspond to: IPv4 and IPv6. The maximum number of IPv4 addresses is 4,294,967,296. While this may sound like a lot of potential addresses I'm afraid we've already run out. As a result IPv6 caters for a huge number of addresses: 340,282,366,920,938,463,463,374,607,431,768,211,456 (that's 2^128). The Internet is currently transitioning from IPv4 to IPv6.
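The two totals follow directly from the size of the addresses: an IPv4 address is 32 bits long and an IPv6 address is 128 bits long. A quick check in Python (the variable names are mine):

```python
import ipaddress

ipv4_total = 2 ** 32   # every possible 32-bit address
ipv6_total = 2 ** 128  # every possible 128-bit address

# The standard library agrees when asked for the size of "all addresses".
assert ipv4_total == ipaddress.ip_network("0.0.0.0/0").num_addresses
assert ipv6_total == ipaddress.ip_network("::/0").num_addresses

print(f"{ipv4_total:,}")  # 4,294,967,296
print(f"{ipv6_total:,}")  # 340,282,366,920,938,463,463,374,607,431,768,211,456
```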
Because humans are no good at remembering long numbers, we refer to devices connected to the Internet with human-friendly domain names such as bbc.co.uk. Whenever we make a request to a domain we must use the Domain Name System (DNS) to map the domain name to the IP address of the correct device on the Internet (so, looking up the bbc.co.uk domain gives an IP address of a device owned by the BBC). DNS is a distributed lookup service: responsibility for names is delegated through a hierarchy of name servers, domains are registered via domain name registrars (with whom you register your ownership of a domain), and the whole system is policed, via laws and legal interventions, by governments and business interests (such as the RIAA and BPI).
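You can ask your own computer to perform such a lookup. The sketch below uses socket.getaddrinfo from Python's standard library, which hands the question to the operating system's resolver; the helper name resolve is my own. The addresses returned for a real domain depend on where and when you ask, so I don't show any output.

```python
import socket

def resolve(name):
    """Return the sorted, de-duplicated IP addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    # Each result ends with a socket address whose first item is the IP address.
    return sorted({info[4][0] for info in infos})

# With a working network connection you could try:
#   resolve("bbc.co.uk")
```

Passing something that is already an IP address, such as resolve("127.0.0.1"), simply returns that address without any lookup.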
However, any single device may be running several different networked applications at the same time (for example, an email client, a web server or a chat service). Numbered ports on a connected device function in a similar way to numbered mail boxes in a block of flats: applications know to "listen" on specific port numbers for only their messages. Standards dictate how certain port numbers map to particular application protocols. For example, unencrypted requests on the web get sent to port 80.
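Here is a minimal sketch of two applications on the same machine finding each other via a port number, using Python's standard socket module. The function name loopback_demo is hypothetical. I ask the operating system for any free port (by binding to port 0) rather than using port 80, because listening on low-numbered ports usually needs administrator privileges.

```python
import socket

def loopback_demo(message=b"hello"):
    """Send message from a 'client' to a 'server' listening on a local port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # The "server" application listens on a port chosen by the OS.
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        port = server.getsockname()[1]

        # The "client" application addresses its message to that IP and port.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            conn, _addr = server.accept()
            with conn:
                received = conn.recv(1024)
    return port, received

port, received = loopback_demo(b"ping")
print(f"message {received!r} arrived on port {port}")
```

Only the application listening on that particular port receives the message; anything sent to a different port on the same IP address would go elsewhere (or nowhere).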
It is because of the existence of ports that there are many more devices connected to the Internet than there are IPv4 addresses. The Internet is, in fact, a collection of many local networks. Within each local network devices are assigned unique IP addresses but the device that is connected to the wider Internet (called a router) only has a single external IP address. How can the router's single IP address be shared by all the devices on the local network?
The answer is Network Address Translation (NAT) - the router maps an unused external port to a device on the internal (local) network. For example, all requests arriving at the router on a particular port (say, port 40001) could simply be forwarded to the device on the local network identified by a locally unique (internal) IP address.
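The bookkeeping a router does for NAT can be sketched as a pair of lookup tables. This is a deliberately simplified model of my own: the class name NatTable, its methods and the starting port of 40000 are all hypothetical, and a real router also tracks things like the protocol, the remote endpoint and timeouts. The addresses are reserved documentation and private-network values.

```python
class NatTable:
    """Toy model of a router sharing one external IP address via ports."""

    def __init__(self, external_ip, first_port=40000):
        self.external_ip = external_ip
        self._next_port = first_port
        self._out = {}  # (internal_ip, internal_port) -> external_port
        self._in = {}   # external_port -> (internal_ip, internal_port)

    def outbound(self, internal_ip, internal_port):
        """Map an internal device and port to an external port, reusing old mappings."""
        key = (internal_ip, internal_port)
        if key not in self._out:
            self._out[key] = self._next_port
            self._in[self._next_port] = key
            self._next_port += 1
        return self.external_ip, self._out[key]

    def inbound(self, external_port):
        """Find which internal device a reply should be forwarded to, if any."""
        return self._in.get(external_port)

nat = NatTable("203.0.113.5")
print(nat.outbound("192.168.1.10", 51000))  # ('203.0.113.5', 40000)
print(nat.outbound("192.168.1.11", 51000))  # ('203.0.113.5', 40001)
print(nat.inbound(40001))                   # ('192.168.1.11', 51000)
print(nat.inbound(40002))                   # None
```

Two local devices can use the same internal port number because the router gives each of them a different external port.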
Finally, network traffic is policed by firewalls using techniques such as IP address blocking. A firewall is like a border control checkpoint: every chunk of data is examined and either discarded or allowed to pass depending on a set of rules. IP address blocking is a specific example of a firewall rule: blacklist any traffic to or from a specific block of IP addresses (such as those from outside China).
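An IP address blocking rule can be sketched in a few lines using Python's standard ipaddress module. The function name allow, the BLOCKED list and the blocked range itself (203.0.113.0/24, reserved for documentation) are assumptions made for the example; real firewalls have far richer rule sets.

```python
import ipaddress

# Hypothetical rule set: a single blocked block of addresses.
BLOCKED = [ipaddress.ip_network("203.0.113.0/24")]

def allow(src, dst, blocked=BLOCKED):
    """Return False if either end of the traffic is inside a blocked block."""
    for addr in (src, dst):
        ip = ipaddress.ip_address(addr)
        if any(ip in net for net in blocked):
            return False  # discard the chunk
    return True  # let the chunk pass

print(allow("198.51.100.7", "192.0.2.1"))   # True
print(allow("203.0.113.9", "192.0.2.1"))    # False: blocked source
print(allow("192.0.2.1", "203.0.113.200"))  # False: blocked destination
```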
Some firewalls are sophisticated enough to do deep packet inspection, where the content of network traffic is examined for the purposes of security, data mining, eavesdropping or censorship. Chunks that are deemed bad are not allowed through. Encrypting messages at the higher levels of the Internet stack goes some way to circumventing such measures, although traffic analysis (an examination of network behaviour) can still be used to infer the intent of a message.
Unfortunately, the details of specific protocols can't be explored in such a short introductory essay. However, if you're interested in finding out more you should investigate the Internet Engineering Task Force's (IETF) database of Requests for Comment (RFC) used to define the various protocols.
And remember, most of the above appears to happen instantaneously to connect computers that could be on opposite sides of the planet - something that fills me with amazement.
1499 words.
Image credit: © 2007 rickz under a Creative Commons license.