Why are we still here? Just to suffer?
You’ve probably stared at your fair share of little “Loading…” texts and funky throbbers. Probably far too many and for far too long. It’s almost a spiritual experience waiting for the next page to load when you’re in a hurry and the network is barely there. Even better if you’re taking an online exam with 300 or so virtual peers and the poor server you’re all trying to talk to is cursing the gods that wrought its wretched existence. Maybe you’ve had that same experience, but the other people were all next to you in the same big room, on the same big LAN.
Maybe it isn’t something loading slowly, but outright failing to connect. We’ve all tried accessing a computer from another computer and wondered why “Refused to connect” shows up on our screens. Maybe you’ve run out of options and tried typing in different IPs only for one of them to work, but only sometimes.
What does that have to do with any of this?
So what is it that is actually slow or broken in these scenarios? Is it the network? Is it the server? Is it your client? Or is it something else entirely? Usually it’s a mix of all three. But how would you even know? Did you ever actually check? How would you even do that? They tell you that your device is wireless, but if you actually look inside it, you’ll see lots of cables and wires. Are they lying to you about the networks as well?
Networking is a notoriously complex beast that we managed to abstract away in our everyday usage. It’s a bunch of copper and optical-fiber cables or even radio signals, hubs and switches and routers, network stacks galore, and even more applications on top of them.
It’s not a single entity that you can just look at and say “this is the network”. It’s a whole bunch of things that are all connected together somehow. To make sense of this mess, we’ve come together to create a reference that we call the OSI model.
The main character
The Open Systems Interconnection (OSI) model is a set of seven layers that are used to describe the way data is transmitted over a network. It’s a standard that is used by many networking protocols and (other) standards. The layers look like this:
Each layer is responsible for a specific part of the process. The layers are ordered from the lowest to the highest. The lowest layer is the physical layer which deals with the actual physical transmitters (wires, antennas, etc). The highest layer is the application layer, which deals with stuff like JSON and/or REST. It’s the closest one to the user and the layer that you’ll (probably) have most of your interactions with as a developer.
The model described here isn’t the end-all-be-all. Some technologies bleed into several layers. Some RFCs allude to which layer they belong to, but never actually specify it. All that to say, in the wild you are probably going to see something that doesn’t quite fit the model or breaks some assumptions. That’s just the nature of trying to simplify or model an inherently very complex and intangible thing.
Getting back on topic…
Shorthands
It’s kinda cumbersome to write out Layer 7, Layer 5 or the Transport layer. So whenever you see something like L7, L5 or Transport it refers to that layer.
There’s probably an official shorthand somewhere, but for the purpose of this journey, my notation will do.
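If it helps to see the numbering written down, here is a minimal Python sketch of the seven layers and the shorthand. The names `OsiLayer` and `shorthand` are made up for this article and don't come from any standard or library.

```python
from enum import IntEnum


class OsiLayer(IntEnum):
    """The seven OSI layers, numbered from lowest to highest."""
    PHYSICAL = 1
    DATA_LINK = 2
    NETWORK = 3
    TRANSPORT = 4
    SESSION = 5
    PRESENTATION = 6
    APPLICATION = 7


def shorthand(layer: OsiLayer) -> str:
    """Hypothetical helper: render a layer in the article's 'L<number>' notation."""
    return f"L{layer.value}"


if __name__ == "__main__":
    # Print every layer with its shorthand, lowest first.
    for layer in OsiLayer:
        print(shorthand(layer), layer.name.replace("_", " ").title())
```

So `shorthand(OsiLayer.TRANSPORT)` and "the Transport layer" mean the same thing here.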
What’s up with those parentheses?
Nice catch, it means you’re paying attention.
It’s because those layers are where things get a bit muddy. It’s also why I said JSON and REST when talking about the Application layer even though JSON is a format and REST is an architecture.
The Session and Presentation layers are commonly rolled into the Application layer because they come in fixed pairs 99% of the time. It would be weird to have, for example, an HTTPS request (L7) not using TLS (L6) on something that isn’t sockets (L5). Sockets here mean network sockets, aka the thing that’s just above the TCP or UDP connections (we’ll get to it), and not WebSockets, which are L7.
It’s easier to just omit L5 and L6 because L7 implies what they should be. If you’re doing something weird and need to specify everything, you can always do that. But for the rest of this article, we’ll just act as if they’re not there:
This also means that we’re basically talking about RFC 1122, aka the TCP/IP model. It’s a bit of a bait-and-switch, but it’s what the Internet is based on. All the concepts are analogous, but the exact details may vary a tiny bit in some places. We’ll just squint a bit, pretend that they’re the same thing, and move on.
Blissful ignorance
An important thing to notice about the OSI diagrams so far is the arrows. Each layer only knows what the layer below it is, and how to talk to it. The dotted arrows to the higher layer (the ones pointing downwards) represent communication in the sense that data is being transferred. That means, for example, that the Transport layer knows how to talk to the Network layer and what the Network layer is, but the Network layer only knows that something is reading from or writing to it and doesn’t care about who is doing it or what is being transmitted. It sounds a bit vague when said out loud, but it’s really basic after you grok it once.
Capsules all the way down
To get away with not caring about who or what is being transmitted, every layer relies on encapsulation. Besides the data it carries, each layer has its own metadata. The metadata is usually in the form of headers and/or footers. It contains all the information that’s needed to route the data to the right place. The data the layer carries is usually some opaque blob of bits that the layer doesn’t really care about too much. For an HTTP request (Ethernet, IP, TCP, HTTP), it looks something like this:
Each layer is responsible for its own headers and footers. It adds them when it gets some data to process, modifies them when it communicates within the same layer, and removes them when it passes the data back to the higher layer. So the dashed arrow in the OSI diagram represents the data being sent, and nothing but the data.
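To make the wrapping and unwrapping a bit more concrete, here is a toy sketch in Python. The readable tags like `[TCP]` and `[FCS]` are placeholders I made up; real headers are binary structures with actual fields. The function names are hypothetical too. The only point is the nesting order.

```python
def encapsulate(payload: bytes, header: bytes, footer: bytes = b"") -> bytes:
    """Wrap a payload in one layer's metadata without looking inside it."""
    return header + payload + footer


def decapsulate(unit: bytes, header: bytes, footer: bytes = b"") -> bytes:
    """Strip one layer's metadata and return the payload for the layer above."""
    if not (unit.startswith(header) and unit.endswith(footer)):
        raise ValueError("unexpected header or footer")
    return unit[len(header):len(unit) - len(footer)]


# Placeholder metadata per layer, innermost first (assumption: stand-ins only).
STACK = [
    ("TCP", b"[TCP]", b""),
    ("IP", b"[IP]", b""),
    ("Ethernet", b"[ETH]", b"[FCS]"),
]


def send(data: bytes) -> bytes:
    """Pass application data down the stack, adding metadata at each layer."""
    for _name, header, footer in STACK:
        data = encapsulate(data, header, footer)
    return data


def receive(frame: bytes) -> bytes:
    """Pass a frame up the stack, removing metadata at each layer."""
    for _name, header, footer in reversed(STACK):
        frame = decapsulate(frame, header, footer)
    return frame


if __name__ == "__main__":
    request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
    frame = send(request)
    print(frame)
    print(receive(frame) == request)
```

Notice that neither `encapsulate` nor `decapsulate` ever inspects the payload. That's the blissful ignorance from before.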
L1: They’re in your walls
OK, first of all, let’s not forget that you’re still bashing your head against the table because your bank’s website has been loading a random image for the last 6 minutes.
Your cool wireless keyboard, that you modded yourself, just died (you forgot to turn off the RGB). You plug it into the computer and, while you wait for it to get some charge, you can inspect the first layer of our network onion.
It’s the physical layer, the layer that deals with the actual physical wires and transmitters. It is where all the tingly and sparky stuff lives. It’s also the only layer you can touch and taste (to check if the wire is live).
In this case, you just nudge your cable to see if it’s plugged in properly. You also remember to do it on both your PC and router. While walking from one to the other, you make sure the cable is not obviously damaged or severely twisted. If you have one, you can use a cable tester to check the cable for integrity. If you crimped your own cable, make sure that you got the wiring right (T568A vs T568B) and didn’t accidentally create a crossover cable:
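If you’d rather not trust your eyes, the wiring check can be sketched in a few lines of Python. The pin orders below are the standard T568A and T568B color sequences (pin 1 to pin 8); the function names are hypothetical, and you’d have to type in what you see in each plug yourself.

```python
# Wire colors from pin 1 to pin 8 for each standard.
T568A = ("white/green", "green", "white/orange", "blue",
         "white/blue", "orange", "white/brown", "brown")
T568B = ("white/orange", "orange", "white/green", "blue",
         "white/blue", "green", "white/brown", "brown")


def identify_end(pins):
    """Return 'T568A', 'T568B', or None if the plug matches neither."""
    pins = tuple(pins)
    if pins == T568A:
        return "T568A"
    if pins == T568B:
        return "T568B"
    return None


def classify_cable(end_one, end_two):
    """Classify a cable from the color order seen in both plugs."""
    first, second = identify_end(end_one), identify_end(end_two)
    if first is None or second is None:
        return "miswired"
    # Same standard on both ends is a normal patch cable;
    # one end of each is the classic crossover.
    return "straight-through" if first == second else "crossover"


if __name__ == "__main__":
    print(classify_cable(T568B, T568B))
    print(classify_cable(T568A, T568B))
```

Anything that comes back as `"miswired"` is a good candidate for a re-crimp.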
If you’re not using a cable, check if your adapter is plugged in and working properly. You can look at the LEDs on the adapter to see if they’re blinking to indicate activity. If you have a smartphone or other Wi-Fi enabled device nearby, you can probably load up some sort of app to show you signal strengths and other info.
A common problem with wireless connections is having too many Wi-Fi networks nearby and on the same channel and frequency. This can cause interference and make it hard to connect to the network. Check if the network you’re trying to connect to is on a different channel from most of the other networks.
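Once you have a list of nearby networks and their channels (from whatever scanner app you used), the check itself is simple counting. A rough Python sketch follows; the scan results and function names are invented for illustration, and it only looks at networks on the exact same channel.

```python
from collections import Counter


def networks_per_channel(scan):
    """Count how many networks were seen on each channel."""
    return Counter(channel for _ssid, channel in scan)


def neighbours_on_same_channel(scan, my_ssid):
    """List the other networks that share a channel with my_ssid."""
    channels = dict(scan)  # ssid -> channel
    my_channel = channels[my_ssid]
    return [ssid for ssid, channel in scan
            if channel == my_channel and ssid != my_ssid]


if __name__ == "__main__":
    # Made-up scan results as (SSID, channel) pairs.
    scan = [
        ("HomeNet", 6),
        ("CoffeeShop", 6),
        ("Printer-Setup", 6),
        ("Upstairs", 1),
        ("Guest", 11),
    ]
    print(networks_per_channel(scan))
    print(neighbours_on_same_channel(scan, "HomeNet"))
```

If your network shares its channel with most of the neighbourhood, switching it to a quieter one in the router settings is worth a try.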
It’s pretty rudimentary, but there’s not much more you can do beyond this point without more specialized equipment. So if everything seems kinda right, we can move on to something a bit less tangible.
L2: A Link Between Worlds
OK, so you’ve checked the cable and the adapter. Your keyboard sucked up enough of those sweet, sweet electrons, and your fingers are itching to type. Hopefully not from fiddling with small cables in very dusty places. You are now ready to look into the Ether(net).
Link layer protocols transfer data between two directly connected devices. They also usually detect and mitigate minor errors that may occur in the physical layer. Ethernet is technically also a physical protocol, but it’s a part of OSI L2. It’s distinct from IEEE 802.11 (Wi-Fi), but here we’ll just treat them like they’re the same thing.
The data is sent in chunks called packets, which contain frames. Why two names? Because packets are the things that L1 cares about and contain additional parts such as analog info (inter-packet gaps), while frames are the digital bits that L2 handles.
There are a couple of ways of looking at Ethernet frames. They’re mostly “upgrades” that came along as the standard evolved. First came Ethernet made by DIX (Digital, Intel, Xerox), which evolved into Ethernet II a couple of years later. That’s the specification that’s used basically everywhere nowadays. The IEEE also made a standard based on Ethernet called IEEE 802.3, aka Carrier-sense multiple access with collision detection (CSMA/CD). They’re basically the same thing.
A packet is made up of the preamble, the SFD (start frame delimiter), and the frame. The frame is made of the header, the payload, and the FCS (frame check sequence). The header contains the destination and source MAC addresses, followed by a two-byte field that holds either the type of the frame (the EtherType, in Ethernet II) or the length of the payload (in IEEE 802.3). The payload is the actual data being transferred (some data from the higher layers). The FCS is a CRC-32 checksum that’s used to detect errors. If the FCS doesn’t match, the frame is discarded.
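Here is a small Python sketch of that frame layout, assuming Ethernet II (destination MAC, source MAC, two-byte EtherType, payload, four-byte FCS). The function names are hypothetical. The FCS is computed with `zlib.crc32`, which uses the same CRC-32 polynomial, and packed little-endian, which to my understanding matches the byte order on the wire; treat that detail as an assumption. The preamble and SFD are left out because software normally never sees them.

```python
import struct
import zlib


def build_frame(dst: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
    """Assemble header + payload and append the CRC-32 as the FCS."""
    body = dst + src + struct.pack("!H", ethertype) + payload
    # Assumption: FCS stored little-endian after the payload.
    return body + struct.pack("<I", zlib.crc32(body))


def parse_frame(frame: bytes) -> dict:
    """Split a frame into its fields, discarding it if the FCS is wrong."""
    if len(frame) < 18:  # 14-byte header + 4-byte FCS
        raise ValueError("frame too short")
    body, fcs = frame[:-4], frame[-4:]
    if struct.pack("<I", zlib.crc32(body)) != fcs:
        raise ValueError("FCS mismatch, frame discarded")
    dst, src, ethertype = struct.unpack("!6s6sH", body[:14])
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "ethertype": ethertype,
        "payload": body[14:],
    }


if __name__ == "__main__":
    frame = build_frame(
        dst=bytes.fromhex("ffffffffffff"),  # broadcast address
        src=bytes.fromhex("020000000001"),  # made-up locally administered MAC
        ethertype=0x0800,                   # IPv4
        payload=b"opaque blob from the layer above",
    )
    print(parse_frame(frame))
```

Flip a single bit anywhere in the frame and `parse_frame` raises instead of returning garbage, which is the whole job of the FCS. The sketch ignores the minimum payload size and padding that real Ethernet requires.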
