Hacker News
Ask HN: Teaching Greybeard IT
60 points by Damogran6 on Oct 14, 2022 | 93 comments
I'm finding that I picked up a LOT of IT knowledge over the past 40 years...and that some of our new hires are missing out on that foundation...skills are weak on networking, bottleneck troubleshooting, and understanding what a healthy or unhealthy process looks like.

There's stuff online about mutexes, race conditions, and von Neumann bottlenecks individually, but I'm having a hard time finding material that covers the concept in general.

Do you have any suggestions on where I can point people, or should I just start throwing stuff together myself?




I was just thinking the other day that a non-profit wiki for sysadmins would be invaluable.

It could cover all the basics but also feature case-studies of real-world examples of solving problems, so there could be a sort of knowledge base too that would attract seekers and lead them into the foundational content.

10 or 15 years ago, I remember being able to find detailed problem solving information on Google. Today the search results are a morass of affiliate and SEO content multiplied by plagiarism served from a mountainous dung heap of user-generated blather.


I'd love to share security experience, suitably redacted, learned from past engagements...it's easy to do at the water cooler, but that audience is limited.


Worked in info sec for a while now. Allow me to explain how every one of those stories goes.

"We told them that you can't be running a production service on a Windows 7 desktop underneath the developers desk. They responded there was no budget for a server this year and that they couldn't take the time to port it anyway since they were too busy. We went to management and they told us to stop bothering the developers and they didn't give a damn about security that was our job so fix it. [3 minutes later] so anyway after they exfilled with the passwords and the social security info of all the employees we managed to get things restored from our 3 month old backups, and the CEO fired everyone in security. So that's why I'm currently in the job market."

Or

"Ya so we got this meeting invite from some PM and get to his meeting with like 3 execs two dozen devs and enough middle management to make a B2B salesman weep for joy. They start the meeting by thanking everyone for the past two years of diligent work and the nights and weekends and promised to reimburse everyone for the legal costs associated with the ongoing divorces. Finally they ask us as security if we can give it the final approval so it can go into production at midnight. We explain we've never even heard of this project till now and what the hell is going on. Then one of the anonymous herd of grey suited PHB explains that they didn't invite us to meetings or ask for our help because they needed to "move fast and break things" and that security would've just slowed down their rockstar ninja wizard developers. Meanwhile my coworker has been poking at this for the past 10 minutes of the meeting and says there is no way in hell this thing is ready to ship. When asked why he pointed out that passwords were being sent in plaintext via a GET parameter in every request, every field was a SQL injection vulnerability and for some reason he was able to randomly kill processes by PID running on the server if he created a username that had a non ASCII character. The PM who called the meeting said we couldn't let good be the enemy of perfect and they were shipping anyway, which was met with thunderous applause. Well then you know what happened after, the lawsuits have mostly died down and the good news is my defense proved I didn't have any personal liability."


I blame responsible disclosure. If your wide open infrastructure and customer data is not fair game for anyone who wants to host warez and bankrupt you, then the risk of security vulnerabilities is just another low priority business risk, we'll deal with it when it happens. It's not an existential risk.


I think a LOT of this can be distilled down to a pessimistic anecdote...that doesn't mean they're not relevant.

Like how the 3rd-party vendor likes to make passwords like %companyName%%YearOfEngagement%%symbol%

and how that might be bad, especially if your NAS admin console is discovered to be internet facing.


100x this.

If you’re not in an org that takes this stuff seriously, don’t walk, run.


They take it seriously enough to spend millions of other people's money to cover their ass with a handwavy hire-a-bunch-of-security-engineers and then lock them away with their toys and completely ignore them until they're needed as a scapegoat.


Where I work it’s a pretty respectable setup. Security efforts aren’t half assed.

So I’m staying.


Costs of ongoing divorce - ahha, made my day!


This; something open ended, between a wiki and stackoverflow, would be a wonderful resource to offer curious minds.

For anyone wishing to collaborate on this, I want to help or at least be in the loop so I can contribute an article or three! Let's band together.

Email in profile.


> I was just thinking the other day that a non-profit wiki for sysadmins would be invaluable.

That would be awesome. I also know that some non-profits are actually looking for help, mostly due to remoteness and lack of connections. A non-profit wiki for sysadmins, with a section on non-profits that need help, would be amazing.


I admire the idealism but I don't see it working.

This is exactly the kind of thing the ServerFault.com community talked about back in 2009-2010 when it launched-- creating "canonical" questions, etc. I was keen to answer questions about products and concepts that were important to me. I think a lot of other people were, too.

In the beginning there were well-researched and thoughtful questions, and there was a lot of fertile ground to write about basic concepts. Then it became like Stack Overflow-- "Shut up and give me teh c0dez!" kinds of questions. Then it became Not Fun(tm) anymore.

You can see how well it "worked".


Hehe, so how did they assess which answer should be canonical? How did they manage the different answers?

Because this is still what we should be doing but we have got to be a bit smarter about it than just appointing experts.


Good catch! The questions were canonical, the answers were voted on by the community. I edited my post to use the right word.

Canonical questions: https://meta.serverfault.com/a/1987


Yeah I think this is a great idea, the trick is: how do we organize the knowledge?

The reason I ask is because I am working on that question (heyho! :) so my answer is that we need a way to prioritize the information so that people don't drown in it and that there is a clear way forward. Making these assessments about information sources should be a foundation in how we communicate so that we don't get as bad a case of tragedy of the commons as other wikis have.


I am on board. Pretty much running DevSecOps 24/7 anyway ;)

A place to share old war stories would be appreciated. There was a post about an intern live upgrading and it reminded me of the pain of doing things like sendmail major version shifts without any tape backups available.

Thing I miss the most is having an IRC channel of trustworthy netizens. You know, someone to just request a ping from, or test out a service quickly!


Oh those places still exist, for example I would trust #cl-school or the people on the guix matrix.


#greybeards on https://web.libera.chat/ is now live come join me ;)


That's a fantastic idea. I got my start in tech by being a Jr Linux Admin and I haven't had the opportunity for mentorship beyond a few roles.

I've learned a lot in the last five years, but I still have a lot of respect for the IT Greybeards.


Scott Simpson over at Lynda.com and now LinkedIn Learning has done a fantastic job of covering much of the entry-level stuff over the past decade, with some courses on intermediate concepts. It is, however, behind a paywall. So yes, a non-profit wiki / maybe even online academy for "Greybeard IT" would be very valuable. It seems those coming into the workforce today not only lack the skills to triage and troubleshoot issues across systems but also have had no exposure to why things are the way they are. From personal experience, it took years to collect enough stories and insights from those who came before to get where I am today.


Write your own blog. Start with just plain text files served via a lightweight web server of choice just to get over the starting hurdle.


I second this. The people who want the knowledge read.


Setting up a LAMP stack used to be the starting point for learning web dev. I like the idea of starting with plain text first.


I would recommend this.


Ask the new hires if they care. They probably realize they need to learn AWS, Terraform, Kubernetes and Leetcode, anything else is secondary.

The guys with 5-10 years of experience are likely more interested.


But when the black box stops working, how do you find out where to look to fix it? I get that the state of the art is advancing ever onward...but there's still a MAC address involved...waaaaay down the stack.


I am in the same kind of ball park as OP. Been doing sysadmin for a long time, and I teach at the university too... see a lot of students graduating and their network skills are weak. Bad troubleshooting skills at linux/unix fundamentals.

When I talk to friends doing sysadmin work about this, they kind of gravitate towards: "I throw this in AWS, need k8s, terraform, cloud-whatever-hot-tech, etc, and if it breaks, my json/ansible/CI/CD will just restart it" as the prevalent attitude now.

I feel OP, but I do wonder if we are just moving to a space where people do not optimize/troubleshoot at that level? I do a LOT of hands-on grunt work, performance optimizing servers / kernels / networks, but I am at a university. I think my skills are probably worthless in today's "real IT" world. (which is sad, I would love to get a new job :)


The optimizing end is still valid right?

Unless you're doing serverless (which, of course, more and more of us are).


Sure. And if you happen to actually venture into that low end of the stack, you learn. But the idea that you should know everything, just in case (or just because us olds had to learn it) is extremely wasteful.

I use a black box (car) to get around, and I'm fine with that. I didn't need to learn more than "change oil", "inflate tyres", "how to (dis)connect a battery". I'm good.

I use a black box (TV) to view my entertainment content. All I know is "cycle the power so it works again".

I treat my computers the same. Yes, I've built multipliers from individual transistors, and I've seen and used the blinkenlights, and I've written entire kernels and networking stacks: I don't need it much these days. And if I need it, guess what, there are still people who make it their business to know all the things, and I'm happy to give them money so they can figure things out. (Because most of that knowledge has collected dust by now)

We spend our mental capacity where it's needed. Not everything is equally important, and abstractions exist so not everybody has to learn everything. (I mean, you don't fab your own ICs either, do you?)


Where the car analogy breaks down is the mechanic. If the mechanic can't reliably fix my issues, I'm taking my business elsewhere.

If the cloud provider doesn't … well it takes a lot to move a stack to a different cloud provider.

And what matters is results: sometimes with a car, I need to know a bit about how the car functions in order to help the mechanic. If I know more, I can sometimes point the mechanic in the right direction. And sometimes, you're broken down on the side of the road and you don't have a mechanic, and you need "how to change tires", "how much force is too much force on a lug?", "how to jump", "how to repair/replace battery enough to get to the mechanic".

Sometimes, I get better answers from a cloud provider if I can describe to them exactly where they've gone wrong.

> and abstractions exist so not everybody has to learn everything.

In an ideal world. Problem with cloud stuff is that support is usually dumb as bricks, and the only way to get anything done is to break the glass on the abstraction & get your hands dirty.


But junior devs aren't expert mechanics, they're more like delivery drivers. You're the mechanic and in a few years some of those drivers will graduate there, others won't.

AWS support has always been excellent IMO and highly technical if required.


I get so excited thinking about the opportunities that will arise over the next 10-20 years for people who understand the lower-level (or are willing to figure it out) well-enough to clean up the AWS mess.


Yeah sorry, I regret posting the comment, it was a bit snarky. I've been working 25 years now. Everything is different now. I don't understand a lot of where infrastructure is going; every year it seems more complicated and expensive with little upside. I do think people are interested in your deeper knowledge, but it isn't the green people. It's the guys doing 2nd-3rd line support for more advanced problems. They're likely overwhelmed already though, everyone is drowning.


While there is a movement to serverless codeless whizbang cloudhosting....I think it's overrepresented on HN. There's still a ton of Enterprise IT that hasn't made the transition, and it's often being managed by green staff.


I really wish there was a "no bullshit guide to cloud" that describes how it works behind the scenes - stuff like what a "serverless VPC connector" really is.

Sadly the only people who could write such a book are likely under NDA from the cloud provider.


Just deploy a second black box and load balance them. (I wish this was sarcastic.)


But that's the point isn't it. Stuff has become more commoditised so people don't need to care unless it keeps breaking. Then they call in the handful of contractors with the 10+ years experience to fix it


That's why you have junior and senior devops staff.

Juniors do the stuff as it should work, seniors troubleshoot it when stuff doesn't work as advertised.

Interested juniors learn from seniors and become seniors themselves, uninterested/unmotivated stay juniors


Turn it on and off again.


That will leave the system turned off?! I mean no code, no bugs, but still...


Why fix it?

That sort of knowledge is undervalued. The new hires make more not understanding that stuff.

Why not just take the knowledge with you when you retire, and make no attempt to pass it on?

Why is it always up to a lone determined engineer to pass the knowledge on, and not the VPs job to listen to the recommendations to do so, and provide support and training?

Because they think you love it so much, you won't be able to help yourself.

Break this chain. Give them the suggestion - you provide training, time to make content, introduce it to the new people. Explain if they don't get it, what will happen...

And when they say no, take the knowledge and burn it. It doesn't have to be your job to throw yourself on the fire to save their systems. The iron will endure or it won't.

Use the extra time to garden instead.


Please do it yourself! Your level of knowledge is rare and could be helpful to a lot of people. I'd happily buy a copy of "The Greybeard IT Bible" if you wrote/published it as I'm sure many others would. Would even be cool to have a pro version where you do short podcasts/talks on certain tricks of the trade included.


As a retired sysadmin with 30 years experience, I can tell you with a good amount of certainty that nobody will care if they aren't already sysadmins. It is viewed as entirely unimportant to understand these things, and the only thing you need to get a job is good whiteboard/leet coding skills.

The world changed.


It's unimportant to a certain kind of IT person that's always been out there. But it's useful to people that are curious on how things work...I really don't think those people went away.


> and the only thing you need to get a job is good whiteboard/leet coding skills

I have over 8 years experience and don't find this is the case. Maybe it is only for junior roles nowadays? Sure I've been leetcode'd in interviews (and sometimes failed!) but normally it's more conversation driven with a little project and/or code review.


I’m not a sysadmin, but want to at least read the contents page - after 20 years full-stack dev I’m painfully aware there’s a lot of things I don’t understand lower down the stack.


What is a "full stack" developer? This is something that always bothered me...


IMO it should be somebody who can build a Linux kernel module or a single page web app. They can do explicit memory management in C, or query a DB via ORM. But the way it's used is usually just "can write both server and client JavaScript."


A developer that writes Javascript and sometimes runs it in Node as a server process or as a desktop app in Electron, but actually doesn’t know the /full/ stack at all.


I always assume someone who can build a small production web app from DB Schemas to the front end, including authentication and payments but nothing more specialist than that. Able to jump in anywhere along that stack and be dangerous enough. Probably abstracted via frameworks, so knowing one in-depth front-end and back-end.


I'm thinking pancakes.


I’ve had similar thoughts. I wrote a lot of docs on an internal wiki at a prior company, and also wrote some public docs on my site. In the end nobody really read it and it was too much effort to keep it up to date. That said, I would love to learn from you even though I’m a PM now. I have an insatiable curiosity about how things /really/ work that served me well for more than a decade as an SRE/DevOps/SysAdmin.


What brought this up was a constant and unending battle with our IT department over the use of our Endpoint security tool. It indexes the local disk so that you can query for file hashes across the enterprise at scale.

That means you want to be really careful on what parts of the OS you whitelist. A print queue folder has NO value being indexed and can cause performance issues. And this product persists in having a reputation for being the culprit, even though it hasn't been for more than 5 years.

What we're encountering though, is that the IT staff have NO bottleneck troubleshooting skills. They can't tell you whether a system is running slow because it's CPU, RAM, or I/O bound...they don't know that the underlying compute could be an issue, they don't know to check /var/log for hints that something might be wrong...they just point, rigidly, at the EDR saying 'it occasionally pops to the top of the process list, therefore, it's the reason the system is slow.'
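
A rough sketch of the CPU / RAM / I/O triage described above, in Python. The field names and thresholds are illustrative assumptions, not figures from this thread; on Linux the inputs would typically come from tools such as vmstat or top, or from /proc.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One snapshot of system-wide metrics (all field names are hypothetical)."""
    cpu_busy_pct: float       # user + system CPU time, 0-100
    iowait_pct: float         # CPU time spent waiting on I/O, 0-100
    mem_available_pct: float  # memory still available, 0-100
    swap_in_per_s: float      # pages swapped in per second


def classify(s: Sample) -> str:
    """Return a first guess at which resource the system is bound on.

    Thresholds are assumptions chosen for illustration; tune them per system.
    """
    # Check memory pressure first: swapping also shows up as I/O wait.
    if s.mem_available_pct < 5 and s.swap_in_per_s > 0:
        return "memory"
    if s.iowait_pct > 20:
        return "io"
    if s.cpu_busy_pct > 90:
        return "cpu"
    return "no obvious bottleneck"
```

The ordering is the point: a process sitting at the top of the process list only matters if the system-wide numbers say CPU is actually the scarce resource.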


I hear you. I have the same feelings on "web developers" these days. A lot of them start with some framework and cannot write a simple Form submission in HTML. Many cannot explain the difference between a regular Form submission and an AJAX submission but they surely know how to do ajax.


At a previous employer, our software eng org migrated from on prem infrastructure supported by a systems engineering team to infrastructure in the cloud owned by the dev team.

One challenge was that, while we were lucky to score some talented systems engineers who joined dev teams, it wasn't easy to communicate to leadership that owning and managing infrastructure from the public internet all the way into the database requires knowledge and experience that software engineers might not have.

I tried putting together a PluralSight channel for us to use internally to help onboard new team members. We had access to AWS sandbox accounts where we had no proprietary data, and could build infra as needed to learn and/or experiment with ideas/designs/new services.

You can target people with this experience when hiring, but just getting qualified devs was hard a year ago, let alone people who also had systems experience/knowledge. A lot of the candidates worked somewhere with a QA team who tested their code, a dev ops team who managed a CI/CD process and managed releasing their code, a database team who owned the database cluster, a networking team who owned networking infrastructure, etc.

Really what you're looking for is somebody who likes to code, but also is interested in systems enough to at some point go read the RFC, run Wireshark, etc. I know a little about networking because I ran my own firewall on OpenBSD years ago, have been hosting e-mail and web for over 20yrs, etc. We got lucky with a few hires who lived and breathed code and systems engineering, but it will be challenging to retain people like that in a hot market.

At the end of it all, I couldn't really come up with a way to effectively accelerate learning these skills and getting that experience.

I started the same day as a new grad, and will never forget when, at the end of the day, he said "you mean I have to keep learning?" Yes, yes we do. Still do.

edit: I'd be interested in working on putting content together if people think it would be valuable.


You’ve just identified the (obvious to systems engineers) problem of every movement that tries to bypass IT Operations and has Developers just do it. System engineering is actually a completely different skill than development, and past a very small size, it’s not reasonable to expect the same person to know everything.

DevOps and the Cloud are just the most recent marketing campaigns trying to promise that one can get rid of “those pesky Ops people who just cost you money anyway”. Things can run fine when there are no problems, but then you realize that nobody who’s left is able to fix things when they go wrong. The mentality of “just reboot (or redeploy the container)” only takes you so far.


It's not quite aimed at IT/System admin, but Handmade Hero[0] is an excellent resource for learning how a computer works from the ground up. It starts with an excellent intro to C on Windows[1] and works its way up from there. It's really long and involved, but I've found that internalizing the general ethos is quite valuable, even if you just watch a handful of episodes.

[0] https://handmadehero.org/

[1] https://www.youtube.com/watch?v=F3ntGDm6hOs&list=PLEMXAbCVnm...


Maybe it is wrong of me to say it like this, but it seems like there is a trichotomy in the world of working people.

You have people like you, that will voluntarily study their hearts out to find the textbook or proper way to solve things,

You have people that seemingly don’t care, but learn by doing, making mistakes, learning holistically

And then you have the workers that just don’t care and wanna be done with their 9 to 5 at 8 and 4.

You’re going to get a wildly skewed sample looking at HN or Reddit, because, by and large, most people here are the first type.

But as someone that is thoroughly the second type, I don’t give a flying fuck about von Neumann and mutexes and race whatever because attaching a word doesn’t do anything to help me understand the problem and the solution. I didn’t spend years reading about theory, I spent years looking at logging software, memorizing errors, and looking at data. I have no doubt that I could look at a server with an issue and eventually figure out the nuance of it and solve the issue without ever having heard of any of the proper names.

I promise you I am not trying to be rude or condescending when I type all this, but the honest answer is this: some people aren’t academics, and any attempt to give them knowledge in the form of “book smarts” and higher level reading is gonna blow up in your face.


> 9 to 5 at 8 and 4.

Hey, we're WFH now, it's 10->4 with a few hour-long breaks in between.


Downvote because it doesn't really have anything to do with what he is asking.

You claim that someone who does the job is valuable, nobody disagrees.

He asks; where can I find good explanations of concepts that I've found useful throughout my career and I want to pass on to my juniors?

See how this justifies a downvote?


The statement is that if he cannot verbally pass that information on and have them get interested/excited, no amount of documenting and cataloging that info is going to make them suddenly want to read it. No level of explanation is going to be helpful to them when they don’t care enough to listen/read.

And I claimed nothing about “someone who does the job is valuable”. My claim was that there are, generally speaking, three types of workers and the vast majority of workers are unlike OP, in the sense that they probably don’t care in the slightest because they are 9-5 workers or someone who would rather learn themselves.


You are ascribing an extreme personality to these people while criticizing the other extreme and I'm saying that the most reasonable thing is somewhere in the middle; just help the man with what he intends to do.


I’m trying to stop him from wasting his time. If he wants to make a repository to help the world at large, that is a fantastic idea because there is someone who will read it. But if he wants to do this specifically with regards to passing on information to his juniors, he should just actually train/teach them. If they are receptive to the info they will ask questions. If they aren’t, no amount of cataloguing will change that. That was my whole point. The dude is clearly looking for a project and is barking up a tree that is gonna make him feel useless.


Okay, that is a fair take, in the end it will depend on the student what kind of teaching methodology will work best but it is indeed unlikely that they will gain more from him trying to write a book than him sitting down with them.

Maybe with that as a baseline understanding the next interpretation of his question becomes: where can I find others in a similar predicament so that we can collaborate on building the book (as I gain experience coaching these juniors and need support network when I am flummoxed)?


Have you thought of a "retirement" career as a troubleshooting consultant? Hang out your shingle and have two weeks of "fun" every other month and keep the ball rolling.

Incidentally, I have learned much by closing my mouth and listening to those, like yourself, with more experience. So Thanks and keep up the good work.


I think you probably have a lot of valuable knowledge that should be written down.

What knowledge about networking do new hires lack? How does one get comfortable with networking? I'm an undergrad, and my plan for that topic was to learn how to read pcap files and also learn to use wireshark to monitor my network.


it's all so ad-hoc and random. Firewalls have expensive and cheap operations...passing a packet? Cheap. Opening the packet and rewriting the header? Expensive.

That can be leveraged for evil.

How do you even organize that kind of random trivia?


> How do you even organize that kind of random trivia?

Having been slowed down in the past by this kind of thinking: My advice is don't.

You throw the information together and forget about organization (as in a final presentable organization), you just write. Use systems like tags (metadata) to attach concepts and use queries on the tags to find related things. As you build up the knowledge base you can start organizing it more properly, into real essays or chapters and sections. Not all knowledge is amenable to pre-organization before you begin working, and even if you try to have some initial organization as you start writing you will find missing details and then have to fit them in.
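
A minimal sketch of the "write first, tag, query later" approach, in Python. The `Notes` class and its method names are hypothetical, made up for illustration; any notes tool with tags would do the same job.

```python
from collections import defaultdict


class Notes:
    """Unordered notes with tags; structure is discovered later by querying."""

    def __init__(self):
        self.notes = []                 # list of (text, frozenset of tags)
        self.by_tag = defaultdict(set)  # tag -> indices into self.notes

    def add(self, text, *tags):
        """Store a note as-is and index it under each tag."""
        idx = len(self.notes)
        self.notes.append((text, frozenset(tags)))
        for tag in tags:
            self.by_tag[tag].add(idx)
        return idx

    def find(self, *tags):
        """Return texts of notes carrying every given tag, in insertion order."""
        if not tags:
            return [text for text, _ in self.notes]
        # .get avoids creating empty index entries for unknown tags
        hits = set.intersection(*(self.by_tag.get(t, set()) for t in tags))
        return [self.notes[i][0] for i in sorted(hits)]
```

Usage would look like `kb.add("Rewriting a header is expensive.", "firewall", "performance")` followed later by `kb.find("firewall")`; chapters and essays can be assembled from query results once enough notes have piled up.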


I don't know much about firewalls. I wasn't aware they could alter traffic, for example. What do you mean it can be leveraged for evil?

As for organization, I just dump everything I learn in a markdown file on that topic, then look for structure later. So I would just open a file, call it `firewalls.md` and write that down in there. You can accrete a lot of written knowledge over time this way.


They can be used for all types of stuff.

Imagine two companies merging. They both used 10.100.0.0/16. If you were in one of the companies wouldn't it be nice if the other re-ip'd?

... Guess what.. they won't and you don't want to. So you make a box, that turns your ips into 10.101.0.0/16 and theirs into 10.102.0.0/16... and you leave your IPs alone :).

Yeah, some things won't work across that bridge. But it'll get you started.

Firewalls can do really, really evil stuff. DNS? Yeah, we'll answer ALL dns queries.. Even ones you send to 8.8.8.8, or 1.1.1.1...

The BOFH playbook is large, and varied. :)
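
The merger trick above (both sides really on 10.100.0.0/16, presented to each other as 10.101.0.0/16 and 10.102.0.0/16) boils down to keeping the host part of an address and swapping the network part. A small Python sketch of just that arithmetic, using the standard `ipaddress` module; the `translate` function is a hypothetical helper, and a real firewall does this per packet in the header rather than on strings.

```python
import ipaddress


def translate(addr: str, real_net: str, mapped_net: str) -> str:
    """Map addr from real_net onto the same host offset inside mapped_net."""
    ip = ipaddress.ip_address(addr)
    src = ipaddress.ip_network(real_net)
    dst = ipaddress.ip_network(mapped_net)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("networks must be the same size")
    if ip not in src:
        raise ValueError(f"{ip} is not in {src}")
    offset = int(ip) - int(src.network_address)
    return str(dst.network_address + offset)


# Both companies really use 10.100.0.0/16; the box in the middle presents
# ours as 10.101.0.0/16 and theirs as 10.102.0.0/16.
REAL = "10.100.0.0/16"
print(translate("10.100.3.7", REAL, "10.101.0.0/16"))  # prints 10.101.3.7
print(translate("10.100.3.7", REAL, "10.102.0.0/16"))  # prints 10.102.3.7
```

This only rewrites addresses in the header, which is one reason some things won't work across the bridge: protocols that carry IP addresses inside their payload still see the untranslated ones.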


Networking, not just firewalls, is about looking at a packet, opening it up, and either passing it along to the next link in the chain, or making a change to (typically) the header and sending it along.

There's a TTL or Time To Live value to make packets eventually die; otherwise, packets could potentially move from router to router forever. So at a minimum, your firewall, or router, rewrites the header to decrement the TTL by one with every hop.

Which spawns a really interesting thought experiment that results in the creation of the traceroute utility.
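
The thought experiment can be played out without touching a network. Below is a toy Python simulation, not the real utility: the router names are made up, and the model assumes every router answers when it drops a packet, which real networks do not guarantee.

```python
def forward(path, ttl):
    """Walk a packet along path (a list of node names, destination last).

    Each router decrements TTL by one; a router that decrements it to zero
    drops the packet and reports back. Returns (who_replied, reached_dest).
    """
    for hop in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return hop, False      # "time exceeded" comes from this router
    return path[-1], True          # packet arrived at the destination


def traceroute(path, max_ttl=30):
    """Discover the path by sending probes with TTL 1, 2, 3, ..."""
    seen = []
    for ttl in range(1, max_ttl + 1):
        who, done = forward(path, ttl)
        seen.append(who)
        if done:
            break
    return seen
```

Calling `traceroute(["r1", "r2", "r3", "dest"])` returns `["r1", "r2", "r3", "dest"]`: each probe dies one hop further along, and the router that killed it gives itself away by replying.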


One way is to just dump all the info like a reference. Another would be to write some of the stories describing the scenarios that forced you to learn this.

Take a look at Julia Evans’ writing for inspiration.

https://jvns.ca/

Also, https://rachelbythebay.com/w/

Providing the real life scenario makes the reference memorable - the human brain is set to remember things that matter and discard things that don’t. A story establishes the usefulness of the things.


I really like the resources recommended on teachyourselfcs.com


I went ahead and registered greybeardit.org. I will create a GitHub Pages open source project where people can submit articles; it will be 100% open source.


https://gitlab.com/greybeardit/greybeardit.org open to anyone that wants to contribute


Probably more examples, less theory. Maybe set up labs for them, e.g. demonstrate unhealthy processes eating up memory and what tools you can look at to see it.
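
One possible lab exercise along those lines, sketched in Python: take periodic memory readings for a process and ask whether they look like a leak. The function name, the defaults, and the "never shrinks" rule are all assumptions for illustration; on Linux the samples could come from the VmRSS field in /proc/<pid>/status or from `ps -o rss=`.

```python
def looks_like_leak(rss_kb, min_samples=5, min_growth_pct=20.0):
    """Heuristic: RSS never shrinks across samples and grows overall.

    rss_kb is a list of resident-set sizes in kB taken at regular intervals.
    The defaults are illustrative assumptions, not recommended values.
    """
    if len(rss_kb) < min_samples or rss_kb[0] <= 0:
        return False
    never_shrinks = all(b >= a for a, b in zip(rss_kb, rss_kb[1:]))
    growth_pct = 100.0 * (rss_kb[-1] - rss_kb[0]) / rss_kb[0]
    return never_shrinks and growth_pct >= min_growth_pct
```

A healthy process usually plateaus or goes up and down with load; this crude check only flags the case where memory climbs steadily and never comes back.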


That's certainly aspirational...but a long way from getting words down on a blog.


Fair point.


"How Linux Works" from no starch press is very good for a bunch of stuff that is not taught in CS.


Oh man no one cares any more. They fudge through it slowly when there's a problem and forget about it immediately. I've seen teams waste 3 months unable to fix a simple issue and I've walked in and seen it right away. Can I pass these skills on? No because no one gives a fuck. And quite frankly my soul died enough that I don't either. Just over 5000 days until I'm retired.


These people are at a different point in their career path than you are. They don't have the same decades of experience that you do, to immediately see the problem. Perhaps they will now, having solved it once, as you have many times before. And it may well be that they do not have the same level of passion and dedication that you had at that point in your career. The field is much larger now than it was then, meaning fewer places for people to drop out.

Regardless, I hope you enjoy your retirement when it comes. You have earned it.


Most of the time it's not the answer I know, it's how to find it. All I want to do is walk them through the thought process and instil some rigour in diagnostics.


Here is what I would suggest. Write it down in a blog post or something and make it easy to find. After that, it's out of your hands. If you think it has value there are at least a few other people who would as well.


The people are of a different quality, and in a different time, than IT used to have. We needed more people, and the result wasn't more of the same but a widening of the range of what is acceptable.

It used to be that you couldn't get people fresh out of school without basic network knowledge; over the past decade or so, you can. Heck, it's more likely that new candidates don't know how things work, because they have been trained to pass hiring barriers instead of actually having to know how things work.

This makes sense in other areas, especially those that have existed for a long time (e.g. accounting, metal work) and aren't layered in abstractions covering 10 different disciplines. But IT is relatively young, yet still works on the same foundations it always did. Even something like Ethernet, which is used by mainframes, minicomputers and microcomputers but also by clouds, IoT and workforce, is simply ignored because it's supposed to magically work and is abstracted away by 10 syscalls, 100 node packages and 20 layers of XHR promises. When it breaks, you suddenly don't know how to fix it. Because you don't know what those 100 node packages are doing (which would be unrealistic anyway) and don't even know what a syscall is, you won't know how to find out whether that is the problem, and even if you did, you wouldn't know how to fix it.

Some basic understanding of the ecosystem in which IT exists wouldn't be too much to ask, but because of the high demand we're just accepting overspecialised people for jobs that don't exist (no role in a company is "react-redux integration specialist"), because it used to work out in the past (but in the past you had to have the same foundational knowledge).

This is the equivalent of throwing spaghetti at the wall and seeing what sticks, because it's cheaper to hire 10 new people and hope one of them lucks out and accidentally fixes something than to hire or train one person who can actually consistently diagnose and resolve issues.

And before someone assumes this means some people are "bad" because they don't know "everything" (I'm not talking about knowing everything): this is mainly a disconnect between what is needed and what is applied for. That in turn is because it's hard to describe what is needed, and in turn hard for a candidate to present how they would fulfil that need (job postings asking for 15 years of senior Kubernetes experience come to mind).

If you know a bit about tasks, queues, control loops, you can pretty much fill a job for legacy systems like servers running bare metal Linux or Windows, but also containers, container orchestrators and orchestration tooling for infrastructure as code, because at the end of the day they all are implementations of the same principles, but slightly specialised for the problem at hand. They all do the same thing: you specify what you want, and it is up to the system to make it real and report how that's going. Reading the output of invoke-rc.d, service, systemctl, kubectl, docker and nerdctl is specialised on resource names, but identical in terms of action flow. They all have a concept of a resource, a concept of something that needs to happen and a concept of telling you how it went/how it's going. But teaching people that you need to understand the principles and apply them on different implementations and abstraction levels is hard, so we just teach people the contents of a man page instead of actually telling them that the man command exists and you can just read it whenever you like...
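
To make the "same principles" point concrete, here is a toy sketch in Python of one pass of such a control loop. It is not how systemd or Kubernetes are implemented; `Resource`, `reconcile` and the `act` callback are names invented for illustration. The shape is what matters: a resource, a desired state, an action when they differ, and a report of how it went.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    desired: str  # what you asked for, e.g. "running"
    actual: str   # what the system currently observes


def reconcile(resources, act):
    """One pass of a control loop: compare, act on drift, report status.

    `act` is a hypothetical callback that tries to converge one resource
    and returns the state it observed afterwards.
    """
    report = {}
    for res in resources:
        if res.actual == res.desired:
            report[res.name] = "in sync"
            continue
        res.actual = act(res)
        if res.actual == res.desired:
            report[res.name] = "converged"
        else:
            report[res.name] = "still drifting"
    return report
```

Call it with a stopped "db" resource whose desired state is "running" and an `act` that succeeds, and the report says "converged"; with an `act` that fails, it says "still drifting". That is roughly the story every one of those tools tells you in its own vocabulary.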


TBH even the foundation has changed dramatically. It's hard to convince me we have rock solid foundations in a quickly changing industry.

Most of the feedback loop is whether you can get jobs with ever higher compensation. If that loop stops considering old school sysadmin foundation skills, then people will stop learning those skills because they can get jobs without them.

There's been lots of progress on making things convenient for developers/IT, so that's also why: you don't need those old skills to use those, or at least your entire team doesn't.


Well, that's just it, the foundations really haven't changed. We still use TCP and UDP, we still use Ethernet, we still use Linux (and Windows and macOS), we still have the pattern of an application needing a configuration stored somewhere where it can read it to exhibit the desired behaviour. We still have the concept of 'services' where it's the job of a (semi-)automatic management or orchestration system to ensure that they are operating within parameters. Nothing about that has changed, regardless of your daily tasks.

If you write in Go, Rust, Java, C#.NET, Python or TypeScript, all of those foundations stand and are practically unchanged. Some layers might have been added, e.g. when you describe how your service is supposed to run in a Docker manifest, or in a Nomad configuration, or just a systemd unit file. But those are just formatting issues with plenty of manuals to go around. Yet still, when something goes wrong (e.g. the application exits with some status integer), an engineer used to be able to understand that:

  - Your application or its management system detected a problem
  - The OS provided an indication that your application has an issue
  - If you don't know all codes, you know that they are perhaps called "exit code" or "return code", and you at least know how to look those up

Say it turns out that your application is throwing a SEGFAULT. As an engineer, aren't you supposed to know that a segmentation fault happened, and that a shared library is one of the first things to suspect before going deep with a debugger? Maybe run a command to find out which libraries are needed, and check that they are accessible and contained within the context of your application (e.g. if it runs not in a complete OS but in a container, jail or chroot).
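
For the "status integer" part, a small Python sketch of what reading that number can look like. `describe_exit` and `run_and_describe` are helper names I made up; the conventions they rely on are real: Python's subprocess reports a child killed by signal N as return code -N on POSIX, and shells conventionally report the same thing as 128 + N.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a process return code into a human-readable hint."""
    if returncode == 0:
        return "exited cleanly (status 0)"
    if returncode < 0:
        # subprocess convention on POSIX: -N means "killed by signal N".
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    if returncode > 128:
        # Shell convention: 128 + N means the child died from signal N.
        try:
            name = signal.Signals(returncode - 128).name
            return f"exit status {returncode}, shell convention for {name}"
        except ValueError:
            pass  # not a known signal number, treat as a plain exit status
    return f"exited with status {returncode}"


def run_and_describe(cmd):
    """Run a command (list of arguments) and describe how it ended."""
    return describe_exit(subprocess.run(cmd, check=False).returncode)
```

So a return code of -11 from subprocess, or 139 from a shell, both point at SIGSEGV on the usual platforms, and that alone tells you to stop looking at the application's own error handling and start looking at memory, libraries and the environment it runs in.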

While in theory only a part of an entire team might need to know this, having mostly teams where 0% of the team members know this might be a problem, don't you agree?

Same goes for "I am calling this HTTP API but it doesn't work", that's just a lack of information to begin with. This question is asked plenty of times, and usually the first thing we need to know is "what is the error" because for some reason the developer doesn't understand that it could be a myriad of things and asking for help because "the thing had a booboo" doesn't mean much. Then onwards to:

  - Is the URL valid
  - Is the FQDN valid
  - Are you even using an FQDN or did you think everything in the world runs on your DHCP-supplied search domain (common in windows land)
  - Can you resolve the name via DNS
  - Can you connect to the IP on the port (commence lesson about netcat and similar tools, every single time, again and again)
  - Did you ensure the security groups or legacy firewalls are actually configured for your desired traffic
  - Is the service on the other end actually up and does it even exist
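
The first few items of that checklist fit in a few lines of Python using only the standard library. This is a sketch under assumptions, not a diagnostic tool: `check_endpoint` is a made-up name, the "no dot in the host" test is a crude stand-in for "is this an FQDN", and it cannot see security groups or firewalls directly, only whether the TCP connect succeeds.

```python
import socket
from urllib.parse import urlsplit


def check_endpoint(url, timeout=3.0):
    """Walk the checklist for a URL and return one finding per step."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return [f"invalid URL: {url!r}"]
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return [f"invalid URL: {url!r}"]

    host = parts.hostname
    port = port or (443 if parts.scheme == "https" else 80)
    findings = [f"URL ok: host={host} port={port}"]

    if "." not in host:
        # Crude heuristic: a bare name depends on the local search domain.
        findings.append("warning: host has no dot, probably not an FQDN")

    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        findings.append(f"DNS failed: {exc}")
        return findings
    addresses = sorted({info[4][0] for info in infos})
    findings.append(f"DNS ok: {', '.join(addresses)}")

    try:
        with socket.create_connection((host, port), timeout=timeout):
            findings.append("TCP connect ok")
    except OSError as exc:
        findings.append(f"TCP connect failed: {exc}")
    return findings
```

As a rule of thumb (not a guarantee), a connect that is refused suggests nothing is listening on that port, while one that times out suggests something in between is dropping the traffic. Either way you now have an actual error to put in the question.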

None of this is deep systems understanding for greybeards; it's just basic client-server model programming. But still, there are more teams where nobody knows how to verify connectivity than teams that have at least 1 member who does.

Every time someone needs to reach out to that special someone they think magically knows everything in the world, that's a waste of time for two people, and a break from concentration. Advanced issues like "my packet payload gets truncated and I don't know why, they are just TLS and below the MTU size", sure, can't expect everyone to understand how to check that stuff out. But we're talking about the basics here.

We're running low on greybeards and we're not getting new ones because "they don't need to know" (developers, ops), and this is going to be a problem that gets bigger, not smaller.


Depending on the org, it can be worse than "no one cares". I was at a large, cancerously growing "tech" company, and this type of problem solving was seen as a threat to the large number of managers who had neither management experience nor significant IC/technical experience, and whose sole purpose in life was to grow head count so they could increment their level.

For these people needless complexity was a benefit. I saw complex neural networks poorly built for the most laughably simple problem, but arguing that it could be solved more simply got you a talking to, and anyone who actually did solve the problem was quickly let go.

Part of the consequences of being in a tech bubble that has lasted so long is that reality stops being important, so technical skills don't matter and at the extreme threaten more important illusions.

In a company where you're expected to make more money than you spend, simplicity and technical competency can be hugely valuable. In a world where 'profit' is a foreign concept, trying to save time and resources is seen as a threat to the complexity theater other people are creating to boost the importance of their work/team.

Don't get me wrong, there was plenty of bs in the pre-bubble world, but there was a time when technical teams had greybeards you would consult and learn from. People like this are one of the reasons I started a career in tech to begin with.


I don't know that anybody ever cared. The barrier to entry re: running an IT infrastructure has fallen so far that gross incompetence still delivers a serviceable-enough product.


Eh, I wouldn't say it's fallen. There's just as much incompetence per capita as there ever was. I just think there are a lot more people working in the tech sector than ever before so it's more noticeable.


There are so many more tools to allow the incompetent to "deliver" a reasonable-enough result. There's a certain amount of "don't let perfect be the enemy of the good" in there, for sure, but like the grandparent poster, I've walked into many gigs where "long-standing problems" turn out to be easily fixed. Nobody with sufficient background knowledge and experience ever looked at them and they just festered.


That's a depressingly accurate portrayal of the situation.


I doubt that is accurate.

Humans are a diverse population. Apathy is one response to facing overwhelming pressure of legacy systems, monolithic organizations. Perseverance is another.

Look at George Hotz and https://comma.ai/ - I am skeptical that he and his organization would take a "no one gives a fuck" attitude toward broken shit.

- https://www.youtube.com/watch?v=FOuCKQfybz8 personality snippet



