Let's code a TCP/IP stack (2016) (saminiir.com)
440 points by peter_d_sherman on June 27, 2021 | 49 comments


Past related threads:

Let's code a TCP/IP stack, 1: Ethernet & ARP (2016) - https://news.ycombinator.com/item?id=17316487 - June 2018 (47 comments)

Let's Code a TCP/IP Stack: TCP Retransmission - https://news.ycombinator.com/item?id=14701199 - July 2017 (30 comments)

Let's code a TCP/IP stack, 1: Ethernet and ARP - https://news.ycombinator.com/item?id=11234229 - March 2016 (49 comments)


Have a look at a TCP/IP stack written for ham radio, originally for CP/M: http://www.ka9q.net/code/ka9qnos/


Nostalgia anecdote: we used KA9Q as the routing software on a 286 PC with an ISDN uplink.


ISDN was super fast - I remember playing Warcraft 2 on an ISDN connection with a friend down the street :)


Growing up in the UK stuck with 56Kbps modems, when prompted to select network speeds I remember seeing things like ISDN (128Kbps) and T1/T3 Cable (1Mbps) and thinking: wow, those Americans have lightning Internet.


Residential ISDN was very rare in the US. I rented a house in the 90s with a few fellow nerds and we got ISDN in our house, but it was definitely expensive (and required a network terminal device) and you always had to be routed to a specialist at the phone company if there were any issues, since most of the support people had never heard of it.


You could get ISDN2 installed on domestic premises in the UK. It wasn't cheap though, because as soon as you wanted a bonded 128k channel the cost could become somewhat eye watering for a domestic user (or even a small business).

The good thing though was that you got a guaranteed 64k rather than the vagaries of a 56k modem that in practice never actually worked at 56k, more like 30-45k depending on phone line quality.

On top of that you could (if your wallet was fat enough) get ISDN30 (E1) which was akin to a T1 but ~500k faster.

I did a fair amount of commissioning of these things back in the 90s.
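
For reference, the "~500k faster" figure follows from the channel counts. A quick sketch of the arithmetic, assuming the standard framing (E1: 32 timeslots of 64 kbit/s, of which 30 are bearer channels; T1: 24 channels of 64 kbit/s plus 8 kbit/s of framing). The variable names are only for illustration:

```python
# Line-rate arithmetic for bonded ISDN2, E1 and T1 (all values in kbit/s).
DS0 = 64  # one 64 kbit/s bearer channel

isdn2_bonded = 2 * DS0   # two bonded B channels
e1 = 32 * DS0            # 32 timeslots; 30 carry bearer traffic, hence "ISDN30"
t1 = 24 * DS0 + 8        # 24 channels plus 8 kbit/s of framing overhead

print(isdn2_bonded)  # 128
print(e1)            # 2048
print(t1)            # 1544
print(e1 - t1)       # 504
```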


Oh how the tides have turned.


ISDN was the only way to get < 300ms to Blizzard EU services from South Africa…

WCIII was p2p IIRC so this let me play ranked matches against EU players.

Shout out to anyone who remembers “Warcraft III ZAF-1” — I swear that sequence of characters is burned into my muscle memory.



Coming from C/C++ and having worked for 5 years in the Java world now, I must admit that the naming "conventions" and variable abbreviations in C code, or more specifically Linux kernel code, form a high entry barrier for me. Why is it so hard to write things out? We all look up to the various Linux philosophies (keep it simple, do one thing, etc), but to me it feels that this kind of code style is not written with a human reader in mind, but comes from the programmer perspective of "what is the shortest way to type as few characters as possible."

This kind of overhead seems so unnecessary and alienating.


C and Unix are from the early 1970s. They date from an era that predates the prevalence of large, bitmapped displays. There is a famous photo of Dennis Ritchie and Ken Thompson at a teletype in 1972 (http://www.columbia.edu/cu/computinghistory/teletype/ken-and...). When the luxuries of modern IDEs, bitmapped displays, and high amounts of untapped computational power are unavailable, the programming environment and operating system need to adapt to the technologies that are available. The convenience of terse names is more pronounced in an environment of teletypes.

On the flip side, Smalltalk, also developed in the 1970s, has much longer names. But Smalltalk was developed for the Xerox Alto (https://en.wikipedia.org/wiki/Xerox_Alto), an expensive machine with a bitmapped display, luxurious even by 1980s standards. Lisp also gradually embraced longer names as technology improved; compare the early Lisps of the early 1960s to Common Lisp, which first appeared in the mid 1980s.

C and Unix are products of their relatively spartan environments from the early 1970s, whereas Java, a product of the 1990s, was influenced by Smalltalk, which was born under less austere conditions.


> On the flipslide, Smalltalk, also developed in the 1970s, has much longer names. But Smalltalk was developed for the Xerox Alto (https://en.wikipedia.org/wiki/Xerox_Alto), an expensive machine with a bitmapped display, luxurious even by 1980s standards.

Long selectors in Smalltalk make sense as they're split and interleaved with arguments. Even a single programmer would probably write the same code for both environments differently when it comes to naming things.


> are from the early 1970s

But why do we still have to suffer in 2021?


We’re still using QWERTY keyboards, some things are just a product of history.


I feel the same way. This limitation might have made sense in times of teletype, limited variable names and small displays, but that was decades ago! We have powerful IDEs that automatically finish every identifier we write, huge, high-resolution displays and exhaustive online documentation available at all times. I don't see any reason to continue writing code like that: In the comments of every single coding style article posted on HN, people are quick to throw stuff like "code is written once, but read many times" around. Why wouldn't this apply to C or Linux kernel code?


I think the name shortening is very common in practically every science - and when we are not restricted to ASCII it is even worse, you would see mostly single letter names, combined with subscripts, superscripts, fonts, extra alphabets like Greek and so on.

At least things like ifr_flags are easy to read, easy to type, and unique enough that you can search for them.


They literally had size limits on variable names back then.


Not just variables. When I was doing 68k assembly, even around year 2000, the assembler had a limit of 8 bytes for all labels. So, functions, branches, jumps, etc all limited to 8 byte labels. I was in college at the time and any sort of external linker or anything was beyond us at the time, so it meant all of our programs were in one monolithic file. It was hard to manage and give meaningful names by hand to labels in a 40k LOC assembly file where every label had to both be unique and only the first 8 bytes mattered.
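
To illustrate why "only the first 8 bytes mattered" hurts in a large file, here is a minimal sketch that groups labels an assembler with such a limit would treat as identical. The function name and the sample labels are invented for this example and do not model any particular assembler:

```python
from collections import defaultdict

def find_label_collisions(labels, significant=8):
    """Group labels that collide when only the first `significant`
    bytes of each label are compared."""
    groups = defaultdict(list)
    for label in labels:
        groups[label.encode("ascii")[:significant]].append(label)
    # Keep only prefixes shared by more than one distinct label.
    return {
        prefix.decode("ascii"): names
        for prefix, names in groups.items()
        if len(set(names)) > 1
    }

collisions = find_label_collisions(
    ["print_header", "print_hex", "print_footer", "loop1", "loop2"]
)
print(collisions)  # {'print_he': ['print_header', 'print_hex']}
```

Descriptive names with a common prefix collide almost immediately, which pushes you toward short, cryptic, hand-numbered labels.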


This is an awesome reference but I wish the code was shown as part of the tutorials.

The Github code today is somewhat more complicated to follow.

If you know of any tutorials that walk through a TCP/IP stack with all relevant code I'd love to hear of it.


Have you seen W. Richard Stevens' book TCP/IP Illustrated? Specifically volume 2, 'The Implementation', which is, I think, what you're looking for (albeit slightly dated).


Yes, I own the series, but its focus is a little more on completeness. I know I'm getting picky :) but TCP is pretty complex, so I keep looking out for the most minimal code and walkthrough to understand all the state management and other flow.


Take a look at Adam Dunkels' lwIP and uIP - http://dunkels.com/adam/software.html.

Also Jeremy Bentham's TCP/IP Lean: Web Servers for Embedded Systems is pretty good.


Woah, I love Adam's work. Thanks for sharing.


Not a tutorial, but did you look at PicoTCP?

https://github.com/tass-belgium/picotcp


That version is abandoned and has quite a few severe bugs. There is a GPL fork which is maintained https://github.com/virtualsquare/picotcp


In addition to the Stevens books already mentioned, there's also volume 2 of Comer's Xinu OS book:

Operating System Design, Vol. 2: Internetworking with Xinu, ISBN-13: 978-0136374145 (https://www.amazon.com/Operating-System-Design-Vol-Internetw...).

Unfortunately, this book was written in 1987, so it's a bit dated (e.g. no IPv6). I think it's still useful to learn the basics of a TCP/IP implementation.


Interesting, thanks for the reference.


Jon Gjengset has a series of videos implementing TCP in Rust.

https://youtu.be/bzja9fQWzdA


There is also a book: TCP/IP Lean, http://www.iosoft.co.uk/tcplean.php


Ethernet, ARP, IP, ICMP and UDP are all the fun, until you get to do TCP.



Notice these classic technologies. They’re SIMPLE. That got lost in later development up the stack.


Ethernet and IP are simple. But TCP is super complex to me with all the state management and buffers and sliding windows. It works well but I find it terribly daunting to dig into (I've tried a few times, and still aim to succeed sometime).
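
For anyone else daunted by the state management: the connection lifecycle on its own fits in a small table. Below is a minimal sketch after the RFC 793 state diagram. The event names and the `run` helper are invented for illustration, states are spelled with underscores rather than the RFC's hyphens, and everything that makes TCP hard in practice (sequence numbers, buffers, sliding windows, retransmission timers) is deliberately left out:

```python
# Simplified TCP connection state machine (after the RFC 793 state diagram).
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "active_open"): "SYN_SENT",        # sends SYN
    ("LISTEN", "recv_syn"): "SYN_RECEIVED",       # sends SYN+ACK
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",  # sends ACK
    ("SYN_RECEIVED", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",       # sends FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # sends ACK
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",        # simultaneous close
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",      # sends ACK
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",          # sends FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}

def run(events, state="CLOSED"):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {state}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited

# A client that opens, then closes first; and the server it talks to.
client = run(["active_open", "recv_syn_ack", "close",
              "recv_ack", "recv_fin", "timeout_2msl"])
server = run(["passive_open", "recv_syn", "recv_ack",
              "recv_fin", "close", "recv_ack"])
print(" -> ".join(client))
print(" -> ".join(server))
```

The table is the easy part; the difficulty described above comes from what has to happen around each transition.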


Ethernet was simple; with VLANs it isn't so simple anymore.


I mean ethernet, the protocol. It just wraps packets with some headers and a checksum. Are you saying it gets more complex than that? (I haven't dug into ethernet much more deeply than I describe).
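
To make the "some headers and a checksum" description concrete, here is a minimal sketch of parsing an Ethernet II header, including the optional 4-byte 802.1Q tag that the VLAN comment upthread refers to. The function name and the sample frames are made up for this example; it assumes at most one VLAN tag and that the trailing CRC-32 has already been stripped, as it usually is before software sees the frame:

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier

def parse_ethernet(frame: bytes):
    """Parse an Ethernet II header with at most one 802.1Q VLAN tag."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    offset, vlan_id = 14, None
    if ethertype == ETHERTYPE_VLAN:
        # 2 bytes of tag control info (priority, DEI, 12-bit VLAN ID),
        # followed by the real EtherType.
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        vlan_id = tci & 0x0FFF
        offset = 18
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "vlan": vlan_id,
        "ethertype": ethertype,
        "payload": frame[offset:],
    }

# Hypothetical frames: a broadcast ARP frame, and an IPv4 frame on VLAN 100.
untagged = bytes.fromhex("ffffffffffff" "020000000001" "0806") + b"arp-payload"
tagged = bytes.fromhex("ffffffffffff" "020000000001" "8100" "0064" "0800") + b"ip-payload"

print(parse_ethernet(untagged))
print(parse_ethernet(tagged))
```

A single tag only moves the payload by four bytes. Stacked tags, trunking and per-VLAN forwarding are where the extra complexity comes in, and none of that is covered here.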


I don't think I would describe TCP/IP as "simple". There are parts of the spec that contradict each other! It's impossible to implement fully.


Can you elaborate?


The layer system used by TCP/IP (layer 2 -> 3 -> 4) makes it "simpler" because everything can be compartmentalized without worrying because it was designed to be that way. The real problem comes when you use TCP/UDP for higher level protocols and that's where the real difficulty is.


There's really no such system. In TCP/IP, we refer to "layer 3" and "layer 4" protocols, in reference to OSI, but TCP/IP isn't an OSI protocol, the lines between L2 and L3 are blurred by things like ARP and DHCP (or completely eliminated in overlay networks and multicast), and none of the rest of the layers are even informally specified for TCP/IP.

If all you're saying is that TCP/IP has, among its many goals, a separation of concerns, sure. But most major systems designs have separations of concerns.
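
ARP shows the blurring directly: a single packet carries both hardware (L2) and protocol (L3) addresses. A minimal sketch of parsing the 28-byte ARP packet for IPv4 over Ethernet follows; the function name, dictionary keys and sample addresses are invented for illustration:

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"  # 28 bytes for IPv4 over Ethernet

def parse_arp(packet: bytes):
    """Parse an ARP packet for IPv4 over Ethernet."""
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, packet[:28]
    )
    return {
        "hardware_type": htype,   # 1 = Ethernet (a layer 2 concept)
        "protocol_type": ptype,   # 0x0800 = IPv4 (a layer 3 concept)
        "operation": oper,        # 1 = request, 2 = reply
        "sender_mac": sha.hex(":"),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": tha.hex(":"),
        "target_ip": socket.inet_ntoa(tpa),
    }

# Hypothetical request: "who has 10.0.0.2? tell 10.0.0.1"
request = struct.pack(
    ARP_FORMAT, 1, 0x0800, 6, 4, 1,
    bytes.fromhex("020000000001"), socket.inet_aton("10.0.0.1"),
    bytes(6), socket.inet_aton("10.0.0.2"),
)
print(parse_arp(request))
```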


The fundamental problem is that once you break/compartmentalize into layers, it's harder to apply global optimizations at local layers. You are trading off performance for ease of implementation and compositionality. You have to bubble optimizations up the stack.

This is partly why QUIC is based on UDP instead of TCP. You don't have all the information needed at the TCP layer to do the kind of optimizations that QUIC does.


This lack of "global optimization" is, I think, something that should be solved by a new protocol stack. (If it could be done without replacing the whole stack I'd be in favor of it, but I have no idea how.)

Say you have a backend tcp http REST app that takes requests to modify a widget. To receive the request, the payload will pass over a series of networks and protocols, all with potentially different restrictions and modifications, until it gets to a load balancer, and then maybe a service mesh router, to a network service listening on the node of some container orchestrator, and finally read by the backend app.

All of the layers will have been replaced numerous times, and possibly the actual data payload modified. But the only thing your backend app knows is the final state. And then, somewhere along the way, there is an error. It could have happened anywhere along the chain, for who knows why. But you and the app server won't know when or where or why the error happened because none of the information about the changing layers (or network operations along the way) is carried along.

If the entire stack were not actually a stack, but instead a kind of commit log of transactions and instructions, we could see every single step in a transaction (ideally verified by cryptographic checksum). This would make it faster to diagnose problems and allow automation to work around known issues - in addition to making it harder for attackers to mess with traffic. In the simplest examples it would look like a stack, but for more complex paths it would look like a log. We could also potentially use this method to do away with service meshes.
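
A rough sketch of what such a per-request log could look like, as a hash chain where each hop's record covers the digest of the previous one. Everything here is hypothetical (the function names, record fields and hop names are invented). Note also that a plain hash chain only detects changes to earlier records; resisting a malicious hop that rewrites the whole chain would additionally need per-hop signatures, which this sketch omits:

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder digest for the first record

def _digest(hop, action, prev):
    body = {"hop": hop, "action": action, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_hop(log, hop, action):
    """Append a record whose digest covers the previous record's digest."""
    prev = log[-1]["digest"] if log else GENESIS
    log.append({"hop": hop, "action": action, "prev": prev,
                "digest": _digest(hop, action, prev)})
    return log

def verify(log):
    """Return the index of the first record that fails verification, or None."""
    prev = GENESIS
    for i, rec in enumerate(log):
        expected = _digest(rec["hop"], rec["action"], rec["prev"])
        if rec["prev"] != prev or rec["digest"] != expected:
            return i
        prev = rec["digest"]
    return None

log = []
append_hop(log, "load-balancer", "terminated TLS")
append_hop(log, "mesh-router", "rewrote Host header")
append_hop(log, "backend", "read request body")
print(verify(log))  # None: the chain is intact

log[1]["action"] = "did nothing"
print(verify(log))  # 1: the altered record no longer matches its digest
```

With a log like this the backend could at least see which hop touched the request and in what order, which is the diagnostic property described above.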


Sounds a bit to me like that would necessitate describing your internal network to every outside server or client it trades requests with, unless your routers become much more sophisticated.

I cannot imagine that would ever be acceptable for most companies.


Yes. I found it was quite possible to code up TCP/IP from scratch for my own tiny (arguably toy) bare metal environment here https://github.com/billforsternz/bmz


The Internet Computer is going to fix that (https://dfinity.org/)


Is there something similar but in Python or Rust or Go? Not really keen on reading C.


Jon Gjengset implements TCP in Rust here. It's long, but it is very good: https://youtu.be/bzja9fQWzdA





