Years ago, I attempted to build a user-space network stack in C [0] that processes raw packets through the TUN interface, and I got it working up to a point. It currently includes a simple shell for configuring IP addresses, routes, and such. A hybrid structure reminiscent of both mbuf and sk_buff is used to hold the network packets. However, after completing the UDP implementation I didn't find the time or motivation to implement TCP. If you want to check it out, the link is at [0].
Many years ago, I wrote a pcap/tcpdump parser in pure bash, because it's all I knew how to write "programs" with. It was, of course, the slowest and most brittle thing of all time, but it did actually work. And was kinda fun. Wish I still had that code somewhere.
If you compile a minimal Linux kernel without a TCP/IP stack -> 400KB.
If you add a TCP/IP stack -> 800KB.
For a project where I only needed to send a temperature reading, I just wrote a small C program in userspace that sent the value in a hand-crafted UDP message. That saved a lot of space (and complexity) :-).
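To give an idea of what "crafting" a UDP message by hand involves, here is a minimal sketch in Python (the commenter used C and did not share code, so this is not their program). The function names, addresses, ports, and payload are all made up for illustration. It only builds the IPv4 and UDP headers with their checksums; putting the bytes on the wire, for example behind an Ethernet header on a raw packet socket, is left out.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_udp_ipv4(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                   payload: bytes) -> bytes:
    """Return an IPv4 packet (no options) carrying one UDP datagram."""
    src = socket.inet_aton(src_ip)
    dst = socket.inet_aton(dst_ip)
    udp_len = 8 + len(payload)

    # The UDP checksum covers a pseudo-header, the UDP header, and the payload.
    pseudo = src + dst + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_len)
    udp_hdr = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)
    # A computed value of 0 is transmitted as 0xFFFF (0 means "no checksum").
    udp_csum = internet_checksum(pseudo + udp_hdr + payload) or 0xFFFF
    udp_hdr = struct.pack("!HHHH", src_port, dst_port, udp_len, udp_csum)

    # 20-byte IPv4 header: version 4, IHL 5, TTL 64, checksum filled in below.
    ip_hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + udp_len, 0, 0,
                         64, socket.IPPROTO_UDP, 0, src, dst)
    ip_csum = internet_checksum(ip_hdr)
    ip_hdr = ip_hdr[:10] + struct.pack("!H", ip_csum) + ip_hdr[12:]
    return ip_hdr + udp_hdr + payload


# Hypothetical temperature report between two documentation-range addresses.
packet = build_udp_ipv4("192.0.2.1", "192.0.2.2", 40000, 9999, b"temp=21.5")
```

A receiver validates both checksums the same way: summing the header (or pseudo-header plus datagram) with the checksum field included should yield zero.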
The majority of the Linux kernel's source code is device drivers. The overwhelming majority of that is not included in the kernel image by default, but instead made available as kernel modules you can enable as needed. E.g., your thermostat probably doesn't need support for an obscure game controller, so doesn't have those drivers, but it could if you were so inclined.
Modern TCP/IP stacks have a lot of extra code, including anti-spoofing, performance enhancements (e.g. zero-copy integration with network hardware), various attack prevention measures (SYN flood protection, randomization of sequence numbers, etc.), support for various hardware offloading (many network cards will do checksum offloading, etc.), IPv6 (which also originally mandated IPsec support), and support for lower-layer (layer 2) protocols (mostly just ARP for Ethernet, but there are still others around).
If you disable ARP, you can have a group of servers on the same network configured with the same IP! If a server acting as a routing frontend forwards packets to a backend server's network interface by MAC address (you need a kernel extension for this trickery), that backend server will recognize itself as the destination, swap the source and destination IPs, and respond directly to the client (without going back through the routing frontend).
Alternatively, you can accomplish the same thing without disabling ARP by adding the common IP address as an alias on the loopback interface. That lets the backend recognize itself as the destination while avoiding ARP conflicts.
This was a trick used by IBM's WebSphere software load balancer back in the 90's-00's
> This was a trick used by IBM's WebSphere software load balancer back in the 90's-00's
Cisco IOS SLB can work in a similar way: a virtual IP is added as an alias to the loopback on each server in a farm. An advantage over the more widely used L3 balancing is that there is no need to rewrite headers in IP packets.
>If you disable ARP, you can have a group of servers on the same network configured with the same IP!
The downside to this is that a switch/bridge will not learn the MAC address and will continue to flood these packets to every port in that segment. So if you do decide to do this, make sure you set up a dedicated VLAN. :)
ARP is for the LAN devices. L2 switches don't rely on ARP to build up their forwarding tables; they just inspect the source MAC of every Ethernet frame they receive and associate it with the port it arrived on. Frames with unknown destination MACs are flooded to every other port, but that stops as soon as every device in the LAN has sent at least one frame.
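The learning behaviour described here can be sketched in a few lines of Python. This is a toy model, not how any particular switch is implemented: the class and method names are made up, MACs are plain strings, and real switches also age out table entries and handle VLANs, which is omitted.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy L2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the set of ports it is sent out of."""
        # Learning step: the source MAC is reachable via the ingress port.
        self.table[src_mac] = in_port

        out_port = self.table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast destination: flood to every other port.
            return self.ports - {in_port}
        if out_port == in_port:
            # Destination lives on the ingress segment: nothing to forward.
            return set()
        return {out_port}


sw = LearningSwitch([1, 2, 3])
first = sw.receive(1, "02:00:00:00:00:0a", "02:00:00:00:00:0b")   # flooded
reply = sw.receive(2, "02:00:00:00:00:0b", "02:00:00:00:00:0a")   # unicast
second = sw.receive(1, "02:00:00:00:00:0a", "02:00:00:00:00:0b")  # unicast
```

In this model the first frame goes out of ports 2 and 3, while the reply and every later frame between the two hosts use a single port. No ARP traffic is involved in building the table; only the source MAC of frames the hosts actually send.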
I did a similar thing in Python[0]. Probably not as well written and, to be honest, I just made up the address resolution algorithm. I got as far as pinging an internet host with ICMP. I like that mine is completely contained in a (short) notebook, though (the OP article misses many details that are in the larger source code that is referenced).
I hadn't seen this article and did mine all from Wikipedia! There is a huge jump in complexity for TCP, though, and I lost interest a bit. Part 3 of this covers that so maybe one day I'll read that and finish mine.
I found it very rewarding and it's definitely something that is doable by any level of programmer if you're interested in networking.
Years ago I instrumented a nuclear power plant. I did the client-side development on Sun workstations. I actually got hired because of my TCP/IP experience, which I got from taking "Operating Systems" at CMU. The plant computer, on the other hand, was a minicomputer that had no TCP/IP stack, so that team had to create one.
One minute into it, the article says, "The dmac and smac are pretty self-explanatory fields"
This immediately turns off anyone reading it who doesn't know what those things mean. The thought process will be, "Oh, this article is for those for whom these fields are self-explanatory. Since it's not for me, I'll stop reading"
The full quote would be "The dmac and smac are pretty self-explanatory fields. They contain the MAC addresses of the communicating parties (destination and source, respectively)." It does explain them. However, this is an article about how to make a network stack, so it is safe to assume the reader knows something about networking beforehand.
I don't get where the author gets the 10.0.0.4 IP address from, the one used to test ARP resolution. What is it supposed to be the address of? A fake device accessible to the made-up Ethernet device programmed here? Or is it an actual device on the author's network?
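For anyone who still finds the two fields opaque, here is a small Python sketch of pulling them out of a raw Ethernet frame. The article itself is in C; the function names and the sample frame below are made up for illustration. The header is 14 bytes: 6 bytes of destination MAC, 6 bytes of source MAC, and a 2-byte EtherType.

```python
import struct


def mac_to_str(raw: bytes) -> str:
    """Format 6 raw bytes as the usual colon-separated hex notation."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_eth_header(frame: bytes):
    """Return (dmac, smac, ethertype) from the first 14 bytes of a frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dmac, smac, ethertype = struct.unpack("!6s6sH", frame[:14])
    return mac_to_str(dmac), mac_to_str(smac), ethertype


# Made-up example: a broadcast frame carrying ARP (EtherType 0x0806).
frame = bytes.fromhex("ffffffffffff" "020000000001" "0806") + bytes(28)
dmac, smac, ethertype = parse_eth_header(frame)
```

Here `dmac` is the broadcast address, `smac` is the sender's address, and `ethertype` tells the stack which protocol handler gets the rest of the frame.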
Can someone explain that?
A TAP device is like a software-emulated Ethernet link (or any layer 2?). So if you send packets into it, they get delivered directly to your user-level program. It's then up to your program to decide what IP address(es) it wants to have, reply to ARPs, etc. Normally this kind of thing is handled by the OS, and adding IP addresses to an interface requires root permissions (as does opening the TAP device). Networking is largely cooperative, and a bad actor with root permissions on your network can do bad things.
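To make "it's up to your program to decide what IP it has" concrete, here is a rough Python sketch of the ARP side only. It is not the article's C code: the function name and the MAC addresses are invented, and 10.0.0.4 is used simply because it is the address asked about above. Opening the TAP device and wrapping the reply in an Ethernet header are left out.

```python
import socket
import struct

# ARP for IPv4 over Ethernet: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes in total).
ARP_FMT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def arp_reply_for(request: bytes, my_ip: str, my_mac: bytes):
    """Return an ARP reply payload if `request` asks who has `my_ip`, else None."""
    (htype, ptype, hlen, plen, oper,
     sha, spa, tha, tpa) = struct.unpack(ARP_FMT, request[:28])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None  # not Ethernet/IPv4 ARP
    if oper != ARP_REQUEST or tpa != socket.inet_aton(my_ip):
        return None  # not a question about the address we chose to own
    # Answer with our MAC as sender; the original sender becomes the target.
    return struct.pack(ARP_FMT, 1, 0x0800, 6, 4, ARP_REPLY,
                       my_mac, tpa, sha, spa)


# Hypothetical values: the user-space stack claims 10.0.0.4.
MY_MAC = bytes.fromhex("020000000004")
asker_mac = bytes.fromhex("020000000005")
request = struct.pack(ARP_FMT, 1, 0x0800, 6, 4, ARP_REQUEST,
                      asker_mac, socket.inet_aton("10.0.0.5"),
                      bytes(6), socket.inet_aton("10.0.0.4"))
reply = arp_reply_for(request, "10.0.0.4", MY_MAC)
```

Nothing outside the program has to know about 10.0.0.4 in advance: the address "exists" on the emulated link only because the program answers requests for it.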
Forgetting to mention that explicitly in the article is a big miss, I think. It makes the ARP part feel like it's missing crucial information or is not actually entirely explained, while it's the previous part that misses something.
[0] https://github.com/cakturk/unet