Oversimplifying the Internet: TCP

When two people talk, they need to speak the same language. The same applies to computers – but instead of languages, they use protocols to communicate with each other. Without a common language, the risk of a misunderstanding is high.

I have introduced a few protocols earlier in this blog: DNS and HTTP in this post, and HTTPS in this post. These protocols are application layer protocols – meaning, the language that the applications use to speak with each other. In my previous examples, a browser (the application which you use to look at internet sites) communicates with a server using either HTTP or HTTPS. But for this information to flow between the applications that exist on different computers, it needs to be transported from one computer to the other. And to transport information from one computer to another, a transport layer protocol is needed.

In this post, I will introduce one of the most common protocols used for transporting information on the internet, TCP. I will follow this post later with another one introducing another common transport layer protocol, UDP. Despite sharing the purpose of transporting information between two computers, these two protocols have quite different uses. Many other transport layer protocols exist, but when you are going about your regular daily internet tasks, it is highly likely that your data is being transported with one of these two.

As a grossly oversimplified introduction: TCP is reliable but slow, UDP is unreliable but fast. When it is crucial that all data packets are received in the right order (such as a file transfer), TCP is used. When it is more important that there is as little latency as possible (such as a video conference), UDP is used.

It needs to be said that TCP offers more than just keeping track of the sent and received data. TCP is able to control how fast data is sent between the endpoints, which means that an overly enthusiastic device cannot flood a more sluggish counterpart. In addition, TCP uses algorithms that detect congestion in the network and slow down to avoid making it worse. However, in the spirit of gross oversimplifications, we will skip the details of these aspects in this post.

Now, let us give the stage to our trusty carrier pigeon, who will demonstrate TCP to us, in a way that only a pigeon can. I will also include screen captures to demonstrate what the same data looks like in Wireshark.

TCP

In all simplicity, a TCP connection has a beginning and an end, with data being sent in the middle. I will explain each part in its own section below: TCP 3-way handshake, TCP data transfer and Closing the TCP connection. But I will first introduce some of the concepts that we need in our TCP connection.

For a server to be able to accept TCP connections, it has to have a TCP port open and listening for new connections. Imagine the server as an apartment building, with 65535 windows; some are open, some are closed. The pigeon needs to know which window to fly to.

The same applies the other way around as well: for the server to be able to send us a reply, it needs to know which port we are listening on for that reply. So when we decide to initiate a connection to the server, we need to pick a port to listen on, and let the server know the port number.

In this example, we will form a TCP connection to port 1000 on the server – this will be the destination port of the packets that we send. We will also pick the port where we will be listening for the server’s replies. Let us select port 33388 – this will be the source port of the packets that we send.
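To see the two ports in practice, here is a minimal sketch using Python's standard socket module. It runs both ends on the same machine, and it deviates from the example above in two ways that are my own assumptions: the server asks the operating system for any free port instead of port 1000 (low ports usually need administrator rights), and the client lets the operating system pick the source port instead of insisting on 33388. The function name demo_ports is made up for this illustration.

```python
import socket


def demo_ports():
    """Open a loopback TCP connection and report the ports on both ends."""
    # The server opens a "window": it binds to a port and listens on it.
    # Port 0 means "let the operating system pick a free port".
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    server_port = server.getsockname()[1]

    # The client connects to the server's port. The operating system
    # picks the client's source port automatically.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    client.connect(("127.0.0.1", server_port))
    conn, peer_address = server.accept()

    result = {
        "server_listening_port": server_port,
        "client_source_port": client.getsockname()[1],
        "client_destination_port": client.getpeername()[1],
        "source_port_seen_by_server": peer_address[1],
    }

    for s in (conn, client, server):
        s.close()
    return result


if __name__ == "__main__":
    for name, port in demo_ports().items():
        print(f"{name}: {port}")
```

The destination port the client uses is the port the server listens on, and the source port the client uses is exactly what the server sees as the sender's port. That is how the reply pigeon knows which window to fly back to.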

As a side note: in many TCP connections, the server port will be a low port (below 1024) and the client port will be a higher port. The ports 0-1023 are called system ports, and the ports 49152-65535 are called ephemeral ports. The ports in between, 1024-49151, are called user ports. System ports and user ports can be assigned to a certain service by the Internet Assigned Numbers Authority (IANA), whereas the ephemeral ports will never be assigned.
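The three ranges are easy to express as a small helper. This is only a sketch of the ranges listed above; the function name port_range is something I made up for the illustration.

```python
def port_range(port: int) -> str:
    """Return the name of the range that a TCP port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError(f"not a valid port number: {port}")
    if port <= 1023:
        return "system"
    if port <= 49151:
        return "user"
    return "ephemeral"


if __name__ == "__main__":
    for port in (80, 1000, 33388, 49152):
        print(port, port_range(port))
```

By these ranges, the server port 1000 from our example is a system port, while our client port 33388 technically sits in the user range. Operating systems are free to pick client ports from a range of their own choosing, so a client port like this is not unusual.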

In addition to the ports, we will include some other information in each packet as well. Among other things, each TCP packet tells how many bytes of data the sender of the packet has sent thus far in the connection, and how many bytes of data they have seen from the other endpoint in the connection. This is done using the sequence number and the acknowledgement number, respectively. With this information it is possible to notice if packets have disappeared from the middle, and to resend the missing data.
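To make the "notice if packets have disappeared" part a bit more concrete, here is a rough sketch of how a receiver could spot a gap from the sequence numbers alone. It assumes, as in the rest of this post, that the first data byte has the sequence number 1. The function name and the (sequence number, length) representation are my own simplifications, not how a real TCP stack stores things.

```python
def missing_ranges(segments, total_bytes):
    """Find the byte ranges that never arrived.

    segments is a list of (sequence_number, length) pairs for the data
    packets that did arrive; total_bytes is how much data was sent.
    """
    gaps = []
    next_expected = 1  # the first data byte has sequence number 1
    for seq, length in sorted(segments):
        if seq > next_expected:
            # Bytes next_expected .. seq-1 were skipped: a pigeon got lost.
            gaps.append((next_expected, seq - 1))
        next_expected = max(next_expected, seq + length)
    if next_expected <= total_bytes:
        gaps.append((next_expected, total_bytes))
    return gaps


if __name__ == "__main__":
    # Three packets were sent (8 + 8 + 4 bytes), but the middle one is missing.
    print(missing_ranges([(1, 8), (17, 4)], total_bytes=20))
```

In the usage example, the packets starting at bytes 1 and 17 arrived, so bytes 9 to 16 are reported as missing and would need to be sent again.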

TCP 3-way Handshake

Like any proper new acquaintances, we will begin the interaction with a handshake! The TCP handshake begins with a SYN packet. In the SYN packet, we also send our initial sequence number, which is a random number. To simplify things, we will select the initial sequence number to be 0. We also send the acknowledgement number as 0, as we currently do not know what sequence number the server will use.

The server is happy to form a TCP connection with us, so it responds to our handshake request. It sends a SYN ACK packet back, indicating that it acknowledges our SYN packet. Again, the server picks a random number as its sequence number – to simplify things, the server will now pick its initial sequence number to be 0. With the acknowledgement number the server does two things: it acknowledges the sequence number we had selected, which was 0, and it indicates that the next byte it is waiting to see from us is the first data byte, so it increments our sequence number by 1. Thus, in the SYN ACK packet, the server uses acknowledgement number 1.

To finalize the three-way TCP handshake, we send an ACK packet back. From now on, we will use the sequence number to express the next byte in the connection that we will send, and the acknowledgement number to express the next byte in the connection that we expect to see. As we have not sent any data yet, the next byte we will send, and thus the sequence number, is 1. Likewise, we acknowledge the server’s initial sequence number 0 by incrementing it by 1, so our acknowledgement number is 1 as well.
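The numbers in the three handshake packets can be written down as a few lines of Python. This is just bookkeeping for illustration, not a real network conversation; the function name and the dictionary layout are my own. Real initial sequence numbers are random 32-bit values, so the sketch wraps around at 2**32, but with the default of 0 it reproduces the numbers used above.

```python
SEQ_SPACE = 2**32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn=0, server_isn=0):
    """Return the three handshake packets as (sender, packet) pairs."""
    # 1. We send SYN with our initial sequence number.
    syn = {"flags": "SYN", "seq": client_isn, "ack": 0}
    # 2. The server answers with its own initial sequence number and
    #    acknowledges ours by incrementing it by 1.
    syn_ack = {
        "flags": "SYN, ACK",
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQ_SPACE,
    }
    # 3. We acknowledge the server's sequence number in the same way.
    ack = {
        "flags": "ACK",
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % SEQ_SPACE,
    }
    return [("us", syn), ("server", syn_ack), ("us", ack)]


if __name__ == "__main__":
    for sender, packet in three_way_handshake():
        print(sender, packet)
```

After the handshake, both sides sit at sequence number 1 and acknowledgement number 1, which is where the data transfer picks up.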

TCP data transfer

As the first pigeon flies off with our ACK packet, we decide that we do not want to wait – the TCP connection is now open, and we can send the first data packet immediately. The first data packet that we need to transport is 8 bytes long. The sequence number remains at 1, as the next byte that we are sending will be the first byte, and the acknowledgement number remains at 1, as we have not seen any data from the server yet.

The server receives our first data packet, and thus has now received 8 bytes of data from us. They send us an ACK packet back, acknowledging that the next byte that they are expecting is byte number 9, but that they have still not sent any data so their sequence number remains at 1.

After sending the ACK packet, they send us a data packet back. Their data packet contains 11 bytes of data. They still use sequence number 1, as the next byte they are sending is the first byte, and they use the acknowledgement number 9, as that is the next byte they expect to see from us.

We again acknowledge receiving the data packet by sending an ACK packet back to the server. We use the sequence number 9, as that is the next byte we would be sending, and we use the acknowledgement number 12, as that is the next byte we would expect to see from the server.
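The same exchange can be replayed with a tiny bit of bookkeeping code. This is a simplified sketch with names of my own choosing (Endpoint, data_exchange); it assumes every pigeon arrives, in order, and only tracks the two counters described above.

```python
class Endpoint:
    """Tracks the sequence and acknowledgement numbers of one side."""

    def __init__(self, name, seq=1, ack=1):
        self.name = name
        self.seq = seq  # the next byte we will send
        self.ack = ack  # the next byte we expect to see from the other side

    def send(self, data_len=0):
        packet = {"from": self.name, "seq": self.seq,
                  "ack": self.ack, "len": data_len}
        self.seq += data_len  # sending data moves our sequence number
        return packet

    def receive(self, packet):
        # Only in-order data moves the acknowledgement number forward.
        if packet["seq"] == self.ack:
            self.ack += packet["len"]


def data_exchange():
    us, server = Endpoint("us"), Endpoint("server")
    trace = []
    for sender, receiver, data_len in [
        (us, server, 8),    # our 8-byte data packet
        (server, us, 0),    # the server's ACK
        (server, us, 11),   # the server's 11-byte data packet
        (us, server, 0),    # our ACK
    ]:
        packet = sender.send(data_len)
        receiver.receive(packet)
        trace.append(packet)
    return trace


if __name__ == "__main__":
    for packet in data_exchange():
        print(packet)
```

The last packet in the trace is our ACK with sequence number 9 and acknowledgement number 12, matching the numbers in the walkthrough.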

Closing the TCP connection

This was enough for us – we will now close our side of the TCP connection. We do this by sending a FIN packet to the server, still using the sequence number 9 and the acknowledgement number 12. For the sequence and acknowledgement numbers that follow, the FIN packet is treated as if it carried one byte of data. The server acknowledges our FIN packet – and as said, it increments its acknowledgement number by 1, now using the acknowledgement number 10. The server also wishes to close its side of the TCP connection, so it acknowledges our FIN packet with a FIN packet of its own.

We receive the server’s FIN packet, and acknowledge it. Our sequence number is now 10, as sending the FIN packet incremented it by 1. And as the server’s FIN packet also counts as one byte, we increment our acknowledgement number by 1 as well, from 12 to 13.

Both sides have now sent a FIN packet and acknowledged the other side’s FIN packet, which finally closes the TCP connection.
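Here is the closing sequence as the same kind of bookkeeping sketch. The function name is my own, and the starting values are the numbers we ended up with after the data transfer. One detail I am adding on top of the explanation above: in a real capture the FIN packets normally have the ACK flag set too, so I show them as "FIN, ACK".

```python
def close_connection(our_seq=9, our_ack=12):
    """Return the three closing packets, starting from our current numbers."""
    # 1. We close our side. The FIN counts as one byte of data.
    our_fin = {"from": "us", "flags": "FIN, ACK",
               "seq": our_seq, "ack": our_ack}
    # 2. The server acknowledges our FIN (ack = our seq + 1) and closes
    #    its own side in the same packet. Its sequence number is the
    #    byte we were expecting next from it.
    server_fin = {"from": "server", "flags": "FIN, ACK",
                  "seq": our_ack, "ack": our_seq + 1}
    # 3. We acknowledge the server's FIN, which also counts as one byte.
    last_ack = {"from": "us", "flags": "ACK",
                "seq": our_seq + 1, "ack": our_ack + 1}
    return [our_fin, server_fin, last_ack]


if __name__ == "__main__":
    for packet in close_connection():
        print(packet)
```

With the numbers from our example, the final ACK carries the sequence number 10 and the acknowledgement number 13.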

—

This was an example of a short and trouble-free TCP connection. All pigeons got to their destination without any issues and they did not fumble the contents of the packets they were carrying. If this were always the case, then the added latency from the handshake and the closing of the connection would not really be justifiable – then the benefits gained from using UDP would be more desirable.

But on the internet, that is not always the case. The pigeons get lost on the way. One pigeon may face a massive headwind while the rest of the pigeons get to their destination swiftly. One sender can be too eager and send pigeons out faster than the other side is able to receive them. It may even be that when the pigeon arrives at the destination, there is such a massive crowd of other pigeons that the pigeon cannot reach the window it is headed to.

These are the situations where TCP gets to shine. When one pigeon decides to visit a nice bird bar on the way instead of heading to its destination, TCP is able to recover and resend the data the drunkard pigeon was supposed to transport. Both sides will let the other one know the maximum amount of data they can handle at any point, and if the sender has sent that maximum amount, they need to wait until the other side confirms that it can take more in. And finally, both sides will constantly evaluate how fast the pigeons are being sent and received, and use that information to avoid causing congestion in the network.
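The "wait until the other side confirms" rule boils down to a small calculation. Here is a heavily simplified sketch of it; the function and parameter names are my own, and a real TCP stack combines this with its congestion control, which I am leaving out entirely.

```python
def sendable_bytes(next_seq, peer_ack, window):
    """How many more bytes may be sent right now.

    next_seq: the next byte we would send
    peer_ack: the latest acknowledgement number received from the other side
    window:   the maximum amount of data the other side said it can handle
    """
    in_flight = next_seq - peer_ack  # sent, but not yet acknowledged
    return max(0, window - in_flight)


if __name__ == "__main__":
    # We have sent bytes 1-8 and nothing has been acknowledged yet.
    print(sendable_bytes(next_seq=9, peer_ack=1, window=8))
    # The other side has acknowledged everything up to byte 8.
    print(sendable_bytes(next_seq=9, peer_ack=9, window=8))
```

In the first case the window is full and we have to wait; in the second case the acknowledgement has freed the window up and we may send another 8 bytes.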

This makes TCP sound like the sole saviour of the internet. There are however cases where it is more desirable that the traffic flows as fast as it can, where it does not matter if some pigeons find something better to do, or if some pigeons fail to reach the right window because it is overcrowded with other pigeons. These cases are more suitable for UDP – and, as promised, more on that in a later post.

There we go again – one more protocol simplified! Thank you for reading, and see you again next time.
