Internet Architecture vs Security
How Security Pragmatism Overcame the Internet Architecture Conservationists
The turn of the millennium was a hidden yet decisive turning point in the development of the Internet. The notion of the middlebox emerged and began to change the course of an Internet design based on the end-to-end principle. Let us use the distance of almost two decades to look back at what has been happening, how impatient market forces prevailed over Internet architecture conservationists, and how security has been driving this change.
The "religion" upon which the Internet was born was the "end-to-end principle", articulated in 1981 by Saltzer, Reed and Clark. It foresees smart end devices and dumb networks in order to decouple application innovation from network maintenance. Another principle is that of Internet transparency, summarised retrospectively in 2000 by Brian Carpenter in RFC 2775. It holds that IP packets flow over the Internet unaltered, so that they can be processed in the same way at any point they pass through. This requires no processing state to be kept in the network and is thus well aligned with the end-to-end principle. Under the original principle, an IP packet's addresses uniquely identified the source and destination machines. The RFC also went into several reasons why transparency was already diminishing at that time.
Internet adoption in the nineties (see [IU], [IWS] and [ISC-S]) owed a great deal to these principles. Core networks remained simple, which allowed almost uncomplicated growth inside the networks. Most adjustments occurred in the early nineties and concerned operational practice, namely how best to organise the IP address space (see CIDR). At the same time innovation gained momentum through its separation from the infrastructure, which tends to be maintained in a conservative and capital-intensive way. The nature of the IP model spread the Internet across technological and geographic borders in a way that balkanized telecom technologies hadn't managed before.
The growth unveiled the first serious weakness: IP address space utilisation reached an alarming level (see IPv4 address exhaustion), a problem that had been recognised back in the eighties and still exists today. The IETF standardisation process offered a new version of the IP protocol, IPv6, to dramatically enlarge the address space. Yet the standardisation process turned out to be lengthy, and the migration to IPv6 appeared to be a chicken-and-egg problem: IPv6 networks would lose their IPv4 users, and IPv6 users would lose their IPv4 connectivity.
The market was not prepared to wait for the IETF. Routers already shared IP addresses among multiple devices by a technique known as "Network Address Translation" (NAT). Devices behind such a NAT have their own private IP addresses (typically beginning with 10., 172.16. through 172.31., or 192.168.), and the NAT maps them to its public IP address using TCP/UDP port numbers. This was not without controversy, as NATs departed from the transparent Internet model: an IP address no longer identifies a machine, and the IP packet is changed en route. Yet NATs solved a compelling problem, and simplicity of adoption prevailed over conservation of the architectural model. NATs were the most significant precursors of what were later termed "middleboxes".
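The port-based mapping described above can be sketched in a few lines of Python. This is a toy model for illustration only, not how any particular router implements it: the class name `PortNat`, the starting port 40000 and the addresses used are all assumptions introduced here.

```python
import itertools


class PortNat:
    """Toy port-translating NAT: many private endpoints share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        self._out = {}  # (private_ip, private_port) -> public_port
        self._in = {}   # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet, creating state if needed."""
        key = (private_ip, private_port)
        if key not in self._out:
            public_port = next(self._next_port)
            self._out[key] = public_port
            self._in[public_port] = key
        return self.public_ip, self._out[key]

    def inbound(self, public_port):
        """Map a reply back to the private endpoint, or None without state."""
        return self._in.get(public_port)


# Two private hosts using the same source port share one public address.
nat = PortNat("203.0.113.5")
a = nat.outbound("192.168.1.10", 5060)
b = nat.outbound("192.168.1.11", 5060)
```

Both departures from transparency are visible in the sketch: the packet's source is rewritten, and the box in the middle has to keep per-flow state (`_out` and `_in`) for replies to find their way back.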
The next problematic situation inviting "middleboxes" was the adoption of VoIP. VoIP is by nature a more complicated technology than the web and email that dominated Internet traffic in 2000. If nothing else, VoIP data must flow in real time between multiple parties. The brand-new VoIP protocols and a great deal of backwards compatibility with the PSTN didn't make things easy. Perhaps adopting HTTP would have kept things a bit simpler, but that's speculative and it wasn't attempted until as late as 2020 (RIPT BoF). Instead the Session Initiation Protocol (SIP) emerged and inherited the NAT problem from the very beginning. SIP telephones announced themselves to their peers by their private IP address, whereas they were only reachable through their router's shared IP address. VoIP simply didn't work.
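A minimal sketch of the mismatch, assuming a server that can compare the address a phone advertises in its signaling with the source address the packet actually arrived from. The function names and sample addresses are hypothetical; the three networks are the private ranges reserved by RFC 1918.

```python
import ipaddress

# Private IPv4 ranges reserved by RFC 1918.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(host):
    """True if the address lies in one of the RFC 1918 private ranges."""
    addr = ipaddress.ip_address(host)
    return any(addr in net for net in RFC1918)


def behind_nat(advertised_host, packet_source_host):
    """True if the peer advertises a private address it was not seen from."""
    return is_rfc1918(advertised_host) and advertised_host != packet_source_host


# Hypothetical example: the phone advertises its own LAN address,
# but the server sees the router's public address as the packet source.
broken = behind_nat("192.168.1.10", "203.0.113.5")
```

A peer that sends media to the advertised address in this situation sends it to an address that is not routable on the public Internet, which is why the calls failed.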
VoIP adopters were not keen to accept the IETF conservationists' offer to await the arrival of IPv6 and the deprecation of NATs. At the end of 2018 IPv6 adoption is still limited, which is solid evidence that waiting for it would have been a poor choice. Compromise proposals began to emerge that suggested how to marry application awareness with the transport network.
At the turn of the century the notion of middleboxes emerged. In 2000 in Australia, the IETF held a "foglamps BoF" that later turned into the "midcom" working group. The group defined the notion of "middleboxes" in RFC 3303 and began to design devices that try to reconcile applications with network connectivity, in a departure from the conventional Internet architecture model. The group created several proposals for interaction between applications, NATs and firewalls. The very words "control" and "firewall" were controversial at the time in the IETF, whose historical inclination was towards "stupid, simple and fast" routers. The word "control" too closely resembled the legacy telco technology from which the IETF had been striving to distance itself. All in all, the midcom effort gained only isolated acceptance.
History repeated itself: once again the market did not wait, and it placed VoIP middleboxes in the networks. Where NATs had fixed the shortage of IPv4 address space, Session Border Controllers (SBCs) fixed NATs for VoIP. That proved the conservationists right in that one hack (NAT) did produce another hack (SBC). Yet phone calls worked, because a single box bridged the gap between applications and transport. This departure from the end-to-end principle was not merely a matter of aesthetics. By design it has the shortcomings against which the conservationists had warned. To name just some: the SBCs' hop-by-hop security model is weaker than end-to-end security, the SBCs become a single point of failure, and their monolithic nature fails when signaling takes a different route than media. Keeping the application logic up to date with all other equipment soon becomes an architectural bottleneck as well.
The other driver for the emergence of SBC middleboxes was the increasing demand for security. The growth of the Internet has been accompanied by a growth in attacks with different motivations: curiosity, fame, fraud, espionage, schadenfreude. The initial attacks of the eighties, such as the Internet Worm in 1988, were succeeded by orchestrated attacks using botnets, such as Slash Zero in 2012, that employ millions of compromised hosts to attack their victims. It was clear that network administrators cannot cope with the attacks without knowing what's going on. The answer that emerged was Deep Packet Inspection (DPI) appliances: all kinds of middleboxes, SBCs among them, that intercepted, inspected, analyzed and policed application traffic.
The importance of the intelligence gained by DPI appliances has begun to diminish, though. The interest in user privacy and the growing use of encryption and end-to-end security ([SSL], [HTTPS]) have made life much harder for attackers. Yet network security administrators are confronted with the very same problem: they have started to lose insight into what's going on. The intelligence value of DPI appliances is shrinking, and the balance between security risks, user privacy and network security needs to be readjusted. The dynamics of clouds continue to reinforce this trend. The "good old" Internet model with hosts identified by an IP address is being replaced by ephemeral virtual instances that pop up and disappear at any time and communicate with the outside over encrypted channels. In fact, the notion of a host as known in the original Internet architecture has become outdated.
We recognise two major trends in winning back the balance between end-to-end user security and the administrator's network security: log analytics and dynamic network access management, used in combination.
The advantage of log analytics rests on the observation that there are already many machines in the network infrastructure with thorough insight into what's going on. They produce extensive log events across the whole stack: IP routers log the IP flows they encounter, proxy servers the URLs requested. The logs survive the ephemeral nature of cloud instances. What it takes is collecting these logs horizontally across the cloud and vertically across the protocol stack, then aggregating and analyzing them. A crystal-clear picture of the network then emerges, as images did in the chemical bath in the ancient age of analog photography.
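As a rough illustration of aggregating "vertically across the protocol stack", the following sketch joins router flow events and proxy request events by source address. The record fields (`src`, `dst_port`, `action`, `url`) and the sample data are assumptions made up for this example; real log formats differ from product to product.

```python
from collections import defaultdict


def correlate(router_flows, proxy_requests):
    """Group events from the IP layer and the HTTP layer by source address."""
    by_source = defaultdict(
        lambda: {"flows": 0, "denied_flows": 0, "urls": []}
    )
    for flow in router_flows:
        entry = by_source[flow["src"]]
        entry["flows"] += 1
        if flow["action"] == "deny":
            entry["denied_flows"] += 1
    for request in proxy_requests:
        by_source[request["src"]]["urls"].append(request["url"])
    return dict(by_source)


# Hypothetical events as they might arrive from two different devices.
router_flows = [
    {"src": "198.51.100.7", "dst_port": 22, "action": "deny"},
    {"src": "198.51.100.7", "dst_port": 23, "action": "deny"},
    {"src": "192.0.2.20", "dst_port": 443, "action": "allow"},
]
proxy_requests = [
    {"src": "198.51.100.7", "url": "/admin/login"},
]
summary = correlate(router_flows, proxy_requests)
```

Neither log alone says much, but the combined record for a single source (repeated denied flows plus a request for an administrative URL) starts to look like a probe.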
With this intelligence in place, we can find out early what traffic is trying to break into a network, and where a break-in has already happened. Yet finding out what's going on is not sufficient: a counter-action must follow. This must be an automated process, since modern attacks are automated, orchestrated on a large scale, and often launched in off-hours. Manual intervention simply cannot keep pace with attacks of this sophistication.
Real-time automated response is what really helps to regain the security balance, and it is what SDN technologies lend themselves to. SDN technology, BGP FlowSpec or OpenFlow to name two examples, makes it possible to adjust network access policies in real time. With the intelligence gained during the analytics process, one can spot offending traffic sources and block them early. One can also close the network by default and open only small pinholes for hosts after additional validation, for example using multi-factor authentication.
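The policy logic behind such an automated response might look like the sketch below. It only models the decision (block offenders, deny by default, open pinholes after validation); pushing the result to the network through BGP FlowSpec or OpenFlow is deliberately left out. The class `AccessPolicy`, its threshold and the sample addresses are hypothetical.

```python
class AccessPolicy:
    """Default-deny policy with validated pinholes and automatic blocking."""

    def __init__(self, deny_threshold=3):
        self.deny_threshold = deny_threshold
        self.blocked = set()   # source addresses blocked by analytics
        self.pinholes = set()  # (source address, destination port) pairs

    def ingest(self, denied_counts):
        """Block sources whose denied-flow count reaches the threshold.

        Returns the set of sources blocked by this call.
        """
        offenders = {
            src for src, count in denied_counts.items()
            if count >= self.deny_threshold
        }
        newly_blocked = offenders - self.blocked
        self.blocked |= newly_blocked
        return newly_blocked

    def open_pinhole(self, src, dst_port, mfa_passed):
        """Open a pinhole only for validated sources that are not blocked."""
        if mfa_passed and src not in self.blocked:
            self.pinholes.add((src, dst_port))
            return True
        return False

    def allows(self, src, dst_port):
        """Everything is denied unless an explicit pinhole exists."""
        return src not in self.blocked and (src, dst_port) in self.pinholes


policy = AccessPolicy(deny_threshold=2)
newly_blocked = policy.ingest({"198.51.100.7": 2, "192.0.2.20": 0})
opened = policy.open_pinhole("192.0.2.20", 443, mfa_passed=True)
```

Because `allows` checks the block list before the pinholes, a source that is blocked later loses access even if a pinhole had been opened for it earlier.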
In conclusion, we have observed that the original Internet design based on the end-to-end principle and transparency has changed over the decades. Security technology, absent from the early Internet, has been a major driving force in this process. In the eighties and nineties, packet-filtering firewalls and NATs split the end-to-end Internet into networks with deliberately limited interconnection. Later, Deep Packet Inspection began to dig even deeper into the packets. In recent years, DPI appliances as combined intelligence and enforcement devices have begun to lose importance due to advances in end-to-end security. What is emerging instead is the collaborative collection of logs from all places in the network to regain that intelligence: logs that are correlated with information about packet flows, analyzed, and used to counter security attacks dynamically.
Trends in the evolution of the Internet architecture have been spotted by academics, and various articles have been published on security as its driving force. Many observations that middleboxes are here to stay in some form, and on what implications they have for the Internet architecture, were published in the 2000-2005 timeframe.
A similar view on the role of middleboxes in the computer networks can be found in the following 2004 academic article:
It is worth mentioning that one of the co-authors, Robert Morris, is also known as the author of the Internet Worm mentioned above.
Blumenthal and Clark, the latter one of the authors of the end-to-end paper, published the paper "Rethinking the Design of the Internet: The End-to-End Arguments vs. the Brave New World" in 2001. The paper identifies the lack of trust mechanisms in the original Internet design as a key reason for today's departure of the Internet architecture from the initial model.
The IETF also acknowledged in 2004, in "The Rise of the Middle and the Future of End-to-End", that security is the most pressing force: "the single most important change from the Internet of 15 years ago is the lack of trust between users".