This, to me, is the main reason why it's a bad idea to host and maintain a presence with a CMS yourself. Most likely you are not an entity with the capabilities necessary to do so, and neither are most of the hosters that offer "managed Drupal/WordPress/... hosting".
It takes a relatively large team to reliably shield software as complex as a modern CMS from abuse, and the mitigations have to be applied so quickly that you cannot wait until the task is high enough on the to-do list of someone who has to do this for 10 different CMSs.
If you try to bring it up, there's always some overseas contractor with six months of experience who will say "It's a few clicks to install and a few plugins to protect it, security 100% no problem - what's so difficult? Why do you make a big deal out of it?".
Say, hypothetically, I got my Drupal site built for cheap and now I'm hacked. If I pay someone another $100 to install the patch and get rid of the malware, I still come out ahead.
This is the catch. Depending on what was done to your site, there's no "getting rid of the malware" for $100.
Those cheap hosting providers don't provide automated backups of your database. You probably have a backup of your site's files -- that's probably not perfectly current, but close enough maybe -- but I bet you haven't got regular backups of your database.
Drupal is one of many CMSs that store tons and tons of code in the database, including executable PHP. So how do you go about ensuring that all of that is clean, changing all the passwords that may have been compromised, and making sure there are no other backdoors or shells left behind, for $100?
As with the majority of security issues, it was done for convenience: not every user has access to the hosting provider, so the option existed to make their lives easier.
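(For what it's worth, even on cheap shared hosting a nightly database dump is usually just a cron one-liner, assuming you have shell access and mysqldump available - credentials and paths here are made up:)

    # Hypothetical crontab entry: dump the Drupal DB nightly, gzipped and date-stamped.
    0 2 * * * mysqldump --single-transaction -u backup -p"$DB_PASS" drupal | gzip > "$HOME"/backups/drupal-$(date +\%F).sql.gz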
Thankfully they removed this option in Drupal 8, the latest version. You could also restrict users from accessing the functionality so it wasn't that terrible. In practice few sites actually use the option, but when they do it can make troubleshooting a giant pain in the ass.
The fact that php.module ever existed in the codebase is a downright travesty. As soon as any privileged user was compromised (i.e. someone with "administer users" or "administer site configuration" permissions) the attacker had arbitrary remote code execution.
My projects had a patch to remove that entire module from core on each build.
One example I did many years ago: I wanted a download page that varied what it served by user agent, providing instructions for anything the visitor would need to install first (because .NET stuff used to go in the user-agent string), and providing the appropriate download link for their platform. Most of this could be done in JavaScript, but at least one part of the mix couldn't be, yet could still be detected from the server.
Now, which is easier: making a new module to serve this page or filter its output, or just enabling PHP code for this page and writing it directly in PHP on this page only?
I've never seen someone make an informed trade-off decision like that and acknowledge it.
Instead they just believe they are 100% secure; then, when they get hacked, they act all surprised and with great hypocrisy say "security is our number one priority at shitshow.com, we take security extremely seriously.".
Not that I'm suggesting there's some great incentive to do better. As we've seen with huge hacks like Equifax and many other companies, they just get a slap on the wrist, and so "we are sorry" PR statements after the fact continue to be their strategy.
Seems like the right thing to do for most CMS type situations is something that generates the site offline... and only uploads a static site to the web.
I'm a big fan of Jekyll (and more recently, Gatsby), and I think you're correct in saying that in most cases where a CMS is used, a static site generator would be a better option.
However, I see a couple of problems standing in the way of more widespread adoption of static site generators:
1) I've never seen a static site generator that non-technical users felt comfortable with. Programmers like terminal interfaces. Your average user in 2018 has probably never opened the terminal.
2) Possibly even more significant, I don't think the people making the choice to use a CMS for a simple business-info or blog site are aware of the trade-offs that they're making, or of the benefits that a static site will provide down the line. For a lot of small business owners I've spoken with, Wordpress is basically synonymous with "web presence."
I wonder if there's a market for a web-based GUI like what Wordpress provides, but which runs jekyll under the hood and uploads the files to s3 or something like that. Is this what companies like Squarespace and Wix do?
Your first point is why I built Jekyll+ [1], which I initially created for the new Starbucks website [2] (which runs on Jekyll). It supports multilingual content and we're now adding a whole bunch of new features (you can check the TODO list in the README [3]).
We also have a pretty neat solution for site generation/hosting (JekyllPro) which is rolling out next month.
Using these two, none of the contributors realize they're using Jekyll. It's not as user-friendly as Squarespace, but it's pretty darn good.
The project I work on, Netlify CMS (https://www.netlifycms.org/), is almost precisely what you described in your last paragraph. I say "almost" because it only officially supports GitHub as a backend at the moment, from which you can deploy to a host. (I'm also currently working on supporting more backends, so that will change soon.) The CMS is open source (MIT license), built as a static web page which connects from your browser to an API for an arbitrary backend and edits the content stored there - from there, you can build the content with whatever static site generator you want. This lets you build sites that non-technical users can keep up to date without tying yourself to a specific tech stack for the actual website.
Please put that description somewhere on the site. I tried to try out Netlify CMS three times because I love Netlify the service, but couldn't figure out what it is any of those times.
That said, it didn't help that you don't support Gitlab, which is where all my sites are, and which is the more important reason why I couldn't get it to work.
Yup, improving backend support (and refactoring the backend API to make developing custom backend support easier) is currently my top priority. Initial GitLab support (without the editorial workflow) is very close to being complete - I hope to be releasing an initial PR this coming week if all goes as planned. My latest update in that thread is here: https://github.com/netlify/netlify-cms/pull/517#issuecomment...
I'm a huge fan of Netlify, everything you guys do is great.
We (Graphia) took a similar approach for our document management system. Essentially it looks and acts like a regular CMS but it sits on top of a git repo instead of a database; and publishes via Hugo.
It's not intended to be a fully-fledged CMS but the API would support it without much work; the UI is definitely the time consuming portion.
http://www.graphia.co.uk for anyone who is interested, not quite ready for prime time just yet. Soon.
That sounds really similar to Netlify CMS, especially the use of Git for content storage+version history and the focus on a decoupled UI for editing content. It's really cool to hear from somebody exploring the same space! Git as a backend for content is a really interesting concept that's worked very well for us so far, and I'd be interested to hear how you're implementing that and what issues you've run into. The internationalization approach you describe on the features page is pretty intriguing as well - that's a feature that we should improve our support for in Netlify CMS.
Feel free to ping me using the contact info in my profile or at @benaiah in our Gitter room (https://gitter.im/netlify/NetlifyCMS) if you're interested in discussing this elsewhere.
Why not use WordPress as the backend for your static site? The user can create pages and posts like normal, and once they are done you can build the static pages from the WordPress API.
Some players are moving into this space. Headless CMS providers like Contentful or Prismic let you build a very nice content entry UI and asset manager. Then you can consume content dynamically via the API or set up triggers to build the site statically whenever content is updated.
They definitely appeal to IT teams, which is a very different sales pitch than you would see for most enterprise CMS products, but rest assured the actual product is very user-friendly. The content entry UI is very straightforward and has easy-to-use hooks for live preview like any other CMS.
I spent a lot of years working in tech consulting and did a lot of CMS integrations. We desperately wanted to push clients to Contentful and away from ornery beasts like AEM, but Adobe can sell AEM by saying it's part of the "marketing cloud" and that its magic power is in user engagement and analytics and whatever other marketingspeak lies. Contentful was the developer favorite, but it was hard to convince businesses it was in their best interests even though it totally was.
There definitely is a market for a product like this. I think the best thing would be a native app with deep git integration. Engineers can see the entire project and edit templates while end users get a GUI to drag and drop those templates to make an article. The output from the user is encoded to either json or protobufs, and then passed through a function that outputs html which is then uploaded to s3. Everything would be under version control so you could just operate as if it were a custom designed webpage. You could fund development by having a store that takes a cut of templates that other people create.
I would work on this if I wasn't already running another startup.
Publii looks promising, but while it's "open source," the source is all wrapped up in an Electron app that could easily be reclosed at a later date.
There's also Netlify CMS, but I haven't done much with it. I think the GitHub requirement puts people off, and work on the GitLab support issue (GitLab has free private repositories) is slow.
What we need is something like Publii that's not packaged in a way that's obviously meant to make closing the source easy. That story is too old, and too common, to believe their intentions are good.
With something like Hugo, you only need a single executable and all you have to do is decide on a theme, write Markdown files, and compile it on a shell/cmd/powershell with a command you can save in a text file for later reference.
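The whole workflow is roughly this (site and theme names are just examples; the commands are standard Hugo usage):

    hugo new site myblog
    cd myblog
    git clone https://github.com/theNewDynamic/gohugo-theme-ananke themes/ananke
    echo 'theme = "ananke"' >> config.toml
    hugo new posts/first-post.md    # write your Markdown in the file this creates
    hugo                            # builds the static site into ./public, ready to upload

Save those last two commands in a text file and that's the entire "CMS".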
Imho, if you really want to write a blog and host it yourself, this should not be too hard to learn. Wordpress doesn't install itself automatically, either.
As a sysadmin who has seen and fixed Drupal/WP/Joomla/Dreamweaver etc. nightmares, I personally have settled on a combination of Emacs org-mode HTML export (don't forget how beautiful the LaTeX export can be, that you can run programs inside org-mode, the Google Calendar integration, etc.) and AsciiDoc with Asciidoctor (nicer default HTML export settings, but I suggest custom pure CSS3/HTML5 with no JS!).
That said, back on subject:
Look, this is the main reason you devs tend to hear disdain from us sysadmins when you want to run some new web service in whatever language fad of the day (Node, Python, Perl, PHP, ASP.NET, Ruby, Scala, Go, etc.) instead of generating files with your app that can then be served to the user using the tried-and-true tools (Apache, nginx, Hiawatha, HAProxy, etc.) that have had sometimes decades of public internet exposure to claw their way through.
Ah, but what about $functionality? Most web devs think they need a scripting-language backend when they really don't. There are uses for it and it has its place, but it's been way overdone.
Sidenote: I think everyone should check out the Hiawatha webserver, over at (https://www.hiawatha-webserver.org/). Hugo wrote it with security in mind, and by default it addresses many things that no other webserver does. It's also GPLv2 (which is a big deal for me), and its performance is second to none. I'm not a big PHP person, but if I were doing a webapp where a backend like that was needed, I would probably try PHP-FPM with Hiawatha (mbed TLS)/HAProxy on a really nicely hardened boxen and edge firewall (you are using nftables now, right?).
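To be concrete about the nftables part, a bare-bones default-drop ruleset looks something like this (ports are just examples; adjust for your stack):

    # create a table and an input chain that drops by default
    nft add table inet filter
    nft add chain inet filter input '{ type filter hook input priority 0; policy drop; }'
    # allow established traffic, loopback, and only the services you actually expose
    nft add rule inet filter input ct state established,related accept
    nft add rule inet filter input iif lo accept
    nft add rule inet filter input tcp dport '{ 22, 80, 443 }' accept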
If you like Hiawatha, you should take a look at Banshee (https://www.banshee-php.org/), a secure PHP framework made by the same person as Hiawatha. A good alternative to Drupal, Wordpress and other pieces of spaghetti code.
Yeah, I like it, but I've tended to prefer static pages for the very reasons mentioned. I'll have to try it again sometime. I really wish Hugo would explain why he went with MIT for it, though... as I've been working hard at what I call GNU'izing my stack.
Well, it's a bit old now, but a few years ago the Hiawatha blog ran a test of load under attack, and Hiawatha handled a simulated (D)DoS very well.
I used to work a LONG time ago for a company that built Yahoo Small Business ecommerce websites. They used a tool called RTML ("Robert T Morris Language"), which was essentially HAML, powered by Perl, and constructed using a GUI (no text editor interface). Select where you want a node > New node > select node type > Save...each action was a full page load. The process was excruciating, and I was in charge of developing tools to automate the upload of "template" files from the developer's machine to the client's Yahoo Small Business store.
At the end of the day, the customer would load up all their goods into the Yahoo database and hit "Publish", at which point a multi-hour job would kick off to run their RTML against the database. The result was a completely static ecommerce website. Only the cart/checkout pages were dynamic, and those were using shared code.
The development process was a nightmare. The client experience was really bad (want to fix a typo? You're going to wait a couple hours). But I'll be damned if it was possible for anyone to hack those things. There was nothing to hack.
I've recently found the perfect middle ground: Django with Wagtail and Bakery. Django is Django, Wagtail is a very well designed CMS, and Bakery allows you to export Django views as static files. That way, I managed to convert my dynamic website to a static one with a few lines of code (basically a loop that iterates over all URLs on my site).
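If anyone wants to try it, the baking step is just a couple of management commands once your views are registered as buildable (these are django-bakery's commands, from memory):

    python manage.py build         # renders every registered view out to BUILD_DIR as flat files
    python manage.py buildserver   # preview the baked site locally
    python manage.py publish       # sync BUILD_DIR to the configured S3 bucket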
That's of course a good solution, but then you want your Ajaxy Web 2.0 things in there, maybe some more complicated forms, and you're soon looking at something that takes quite some effort to build compared to what you get from WordPress/Drupal etc.
Funnily enough, having worked on all kinds of websites for the past 15 years, I'm still at a loss when asked what the all-round good solution is.
Maybe the problem is the phrase "all round". There are trade-offs to be made and complex cost/benefit calculations required.
To further complicate matters, things change. A website that was a perfect fit for a Wordpress template easily grows beyond it. And a site that seemed ideal for a custom Django/Rails backend might eventually settle into the kind of functionality groove where you begin to wonder if an off-the-peg solution might not have been a better idea.
WordPress at least automatically updates itself. There's been a few times when auto updates broke but, overall, it has been a tremendous help in reducing the number of exploited web servers out there.
I have no idea. We more or less took one look at all the crazy sh*t they were doing with pixels, trackers, inbound marketing, outbound marketing, SEO, sitemaps, special codes for FB, GA, dynamic landing pages...
You know it's all trivial garbage... except that if you want to replace it, you need to support it and train them to use the new tools, and at every point you're going to encounter resistance about WHY you're doing it. You say "well, security and speed", and they're like "it's fast enough and we've never been hacked".
So until the site grinds to a halt and they also get hacked, they basically view you as a blocker.
That's how early CMSs like Vignette worked, and they evolved into solutions like WP and Drupal, which are better for the web publishing use case - easier to maintain and develop, and capable of real-time UGC and publishing (real-time is crucial for SEO in commercial publishing).
Static site generators are simply a file-system caching layer. Drupal used to have one of those and it was a terrible idea as the number of page variants on a commercial site very quickly hits file-system limitations and performance is appalling. Better to use a proper CDN for a similar result but with massively reduced latency due to content being served from the edge.
I've always wanted to try it. I'm more interested in Statamic which is premium, but seems better supported and I'd be more comfortable putting clients on it. From an outside perspective without having used either, that is.
The original Drupalgeddon vulnerability was seriously problematic in the speed of exploitation. Automated exploits were developed only hours after the announcement and, as far as I remember, were widespread enough ~7h later to consider all unpatched installations compromised.
This exploit was far more lenient in timing: there was advance notice of a serious vulnerability, and the automated exploits came weeks later this time.
There are certainly many people running a CMS themselves that don't have the capabilities to keep them reasonably secure. But it's not that hard if you have a reasonably small attack surface, e.g. few plugins and no user-generated content/accounts. And of course you need to subscribe to the relevant security mailing list.
Sure, I could run a reasonably secure Drupal site. But I'm at a successful startup, have had years of experience hardening websites, and know my IT-sec hygiene. This also makes me not the target audience for CMSs like Drupal. Most of us agree that static site generators are a much better solution for most use cases. The people who need WordPress and Drupal often don't yet know which use cases they need; they are still exploring the possibilities.
Currently they have to deal with at least 5 different CMSs, 4 shopping systems, and some hand-made sites using at least 4 different frameworks and 2 backend languages - plus legacy code with self-written frameworks in PHP 4 on at least 2 important sites.
Most CMS security is basic stuff like file permissions (where Wordpress sites often get into trouble), applying community patches quickly (both Drupalgeddons), and managing JS library dependencies (a notable hole in the Drupal patching ecosystem).
It takes some expertise to do these correctly, but it's not rocket science. I taught myself and have self-hosted Wordpress and Drupal sites for 8+ years with very few security incidents.
That said, I'm moving my Wordpress and Drupal sites to platforms like WP Engine and Pantheon. But not because of security concerns--because the platforms will accelerate our development process by making it easier for developers to create staging instances and deploy to production.
I'd change that to "enterprise CMS". Experienced sole proprietor web devs have often moved to lightweight, non-enterprise, non-feature-packed CMSes, this being one reason.
Which leads to a concentration of hosting among a handful of large enough providers (e.g. AWS, Google, Azure)? I completely get your point, but I'm struggling to see a way forward here.
Windows gets exploits too, but that doesn't mean it'd be a good idea to replace it with Chrome OS and lose all the software that doesn't exist outside of Windows.
> It takes a relatively large team to reliable shield software as complex as modern CMS from abuse
No. The article is about a security hole from 2014. It does not take a team to keep your software updated. I have been running a bunch of CMSs for many years now, and keeping them updated takes me almost no time. If I add up all the time I've spent on it, it's probably a day per year or so.
My mistake. But it still holds that keeping a single system up to date takes a few hours a year at most. The updates themselves are usually done in minutes. Only once every few years will there be some issue that takes a few hours to deal with.
>But it still holds that keeping a single system up to date takes like a few hours a year max
This... simply isn't true.
Sure, you can just set your system to update and reboot once a week, and then when an update is incompatible with some of your software, well, spend the time to clean up the mess.
This is going to break every few years, and that mess is going to take significant effort, especially if you haven't touched that software for a few years.
A more proactive approach is to watch the security mailing lists and update when appropriate, but that's an hour or two a week of effort... remember, you have to keep your full stack updated: your CMS, your web server, your OS, etc., etc. You need to watch for holes in each of these things, and when there is a hole, you need to figure out whether applying the update will break your current configuration, and then apply said update.
Either way, it's a significant amount of skilled labor.
Sysadmins don't get paid what SWEs get paid, but we ain't free.
To go deeper into what others will respond to this with, yes, distributions cover some of this. But consider that every few years your distro will go out of support.... and you will have to figure out how to upgrade that, which is often pretty painless, but it does require either a high tolerance for risk or a lot of reading.
Then consider the other bits of your stack: your web server, your PHP, your CMS - all those packages/versions are sometimes changed mid-release and often changed on those major releases.
This is all true, but it's not unique to CMS software. Everything you write here is true of operating hosted software in general.
Hosting a Drupal site on Pantheon or Acquia does NOT relieve the site owner of testing and applying patches, fixing the site when a patch breaks something, etc.
In terms of hosting, the LAMP stack itself is very stable; setting Linux, Apache, and MySQL to auto-apply stable updates is very unlikely to break Drupal. Even in PHP, going from 5.x to 7.x required some careful testing, but minor updates are generally fine.
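On Debian/Ubuntu, for example, auto-applying stable security updates is basically two commands (the stock unattended-upgrades package; defaults are sensible, if memory serves):

    sudo apt-get install unattended-upgrades
    sudo dpkg-reconfigure --priority=low unattended-upgrades   # turns on the daily security-update run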
The security advantage of open-source CMS software is that the community update process reminds site owners to do this stuff, and gives them the code to do it.
In contrast, a custom-developed CMS can sit there with security holes in it for months or years, and the owner will have no idea until their site is hacked.
> it's a bad idea to host and maintain a presence with a CMS yourself
I would have thought saying so on Hacker News addresses the readership of Hacker News. Aka myself. So I objected.
Talking about schools: I would say it depends. If the kids run the CMS, I would say the learning outweighs the downside of having some VM hacked that only holds public information. So for a CMS with public information I would say: let the kids have fun with it!
There might also be use cases where that's not a good idea. Really depends on what the CMS is used for.
Well, I'd guess many readers, myself included, have worked for schools, churches and other institutions that needed a website - it's a very common type of customer. The hard problem is that you usually can't really help them with this level of problem on their level of budget when choosing the common CMS solutions.
The Drupal security team is doing a great job, but "abandoned" installs of similar tools are everywhere. Even high-profile users of Drupal like Cambridge Analytica* continue to use a vulnerable version - though their install is not exploitable due to disabled features, AFAIK.
Well, after several years not much has gone wrong. There was a single case where an update broke the update process, forcing users into a single manual (still one-click) update.
I.e. the worst that happened was that, in one instance, WordPress was as bad as any other major CMS by not having working automated updates.
What went right, however, is that probably thousands of WordPress infections have been prevented by automated updates that work 99% of the time.
Wordpress surely isn't perfect, but the fact that they have automated updates that work most of the time means they're more secure than any of their major competitors for now.
You're talking about whether something has gone wrong with the auto-update system, specifically.
However, the parent comment was referencing the risk inherent in granting a PHP application write permissions on its own files. Doing so means that almost any exploit in the application can result in persistence, because malware can be written into (hidden in) core Wordpress files. You would not necessarily know how often this happens, because it happens to individual sites.
The Wordpress community decided that the risk of not patching quickly outweighed the risk of allowing Wordpress to write its own files. I happen to agree with them; unpatched vulnerabilities are like candy for script kiddies and fall easily to automated attacks. And most people running Wordpress are not sophisticated.
That said, for teams that are on the ball about managing updates, running Wordpress with locked-down file/directory permissions is definitely more secure.
If your auto-update system also means a single RCE vulnerability can then embed malicious content into the program itself, that is worse than non automatic updates.
> I would trust wordpress to keep their signing keys safe.
What signing keys? Wordpress's automatic updates aren't signed, so your trust is horrendously misplaced.
Someone already did the work for them to implement it[0], and rather than commit it, a Wordpress developer wrote a blogpost saying signing isn't really that important[1].
Can't reply to stephenr because comment thread is too deep, but it's perfectly possible to run WordPress without write permissions anywhere near executable code and then to schedule automatic updates at a different privilege level.
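Roughly: the code is owned by a separate deploy user, the web server user can't write to it, and a cron job under the deploy user runs the updates via wp-cli. A sketch, with illustrative paths and usernames:

    # code owned by "deploy"; the web server (www-data) can read but not write
    chown -R deploy:www-data /var/www/site
    find /var/www/site -type d -exec chmod 750 {} \;
    find /var/www/site -type f -exec chmod 640 {} \;
    chown -R www-data /var/www/site/wp-content/uploads   # uploads still need to be writable
    # deploy's crontab then handles updates at its own privilege level:
    # 0 3 * * * wp core update --path=/var/www/site && wp plugin update --all --path=/var/www/site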
Since this is done extremely rarely, that leads me to believe that it's more of a cultural problem (perhaps with the WordPress team, even) than a technical one.
Spent 5 years in shared PHP web hosting, can confirm that most of the people are muddling through with the bare minimum of understanding to get WordPress going.
If php has permission to extract an archive with new files over the current ones, it has permission to write a malicious file over a current one.
I’m not talking about a vulnerability on the wordpress central infra that delivers a malicious update via the legitimate update path.
I’m talking about a vulnerability in the millions of wordpress installs that have filesystem permissions to overwrite themselves, to facilitate “auto update”.
This is almost always at the core of WP exploits. Sure, there have been some SQL injection and XSS, but almost every WP exploit I've seen in person has been related to files being able to be written to the filesystem and become a publicly accessible URL immediately.
I don't do as much with WP these days, but have worked with it some over the years, and work with multiple folks in the geographic area who primarily do WP work (friends with folks who run WP meetings and do all-WP work, etc).
I've seen dozens of exploited systems in the last 2-3 years, and been asked to help 'clean them up'. Every single one of those was some variation on some bot exploiting some code (in core or in a plugin) that allowed for some obfuscated code to be written to a publicly accessible directory, with names like "includes.php" or "sys.php" (things like that) - common words that no one who is not a WP expert would even think to question in the first place (no one leaves BACKDOOR_EXPLOIT.php as a file name, for example).
Yes, you can install stuff like wordfence and similar tools to continually scan your disk to look for 'infections', but... why allow those to be written to the drive in the first place? I know the answer is "convenience", but it's been a huge community price to pay.
I'd really like it if WP offered an inbuilt way of doing installs/updates via ssh. The defaults seem to require giving it plaintext FTP credentials to itself. I know there was an ssh plugin, and I know it couldn't be the only option (or perhaps even the default), but having it as an available option on install would still be nice...
You can always just grab a tarball, uncompress it locally, and then connect to the server with your favorite SFTP client to upload the new files to the server. It's not exactly elegant but it is possible.
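Something like this works, if anyone hasn't done it before (server and path are placeholders; the recursive put assumes a reasonably recent OpenSSH and key-based auth):

    curl -LO https://wordpress.org/latest.tar.gz
    tar -xzf latest.tar.gz               # unpacks into ./wordpress/
    # upload the new core files over the old ones; wp-config.php isn't in the
    # tarball, so your configuration is left alone
    sftp -b - user@example.com <<'EOF'
    put -r wordpress/* /var/www/html/
    EOF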
I did that more than once back in the good ol' days before the fancy schmancy automatic updates. After you do it a few times, though, you'll probably just give up and modify the file permissions so that WordPress can update itself at which point...
If the user that WordPress (or the web server stack, I suppose) is running as has write permissions on the directory where WordPress lives, it can update itself directly without having to FTP to localhost or whatever. Of course, typically we'd say it would be a bad idea for the application to be able to do that but blah blah risk assessment blah blah trade-offs blah blah ...
For a long while, this was a widely accepted/standard way of hosting multiple customers on a shared web server, using something like suexec+php-cgi or (even better, IMO) mpm-itk. Then, each customer's WordPress site would run under its own uid/gid in order to isolate them from one another. A single customer could break their own instance or allow it to get hacked over and over again, but at least it couldn't get to other users' sites. I'm not sure how the large shared web hosts are handling this nowadays or if anything has even changed.
(Edit: After reading your other comments, mgkimsal, it's obvious that you already know all of this but I'm not going to delete/edit my comment now.)
No worries - thanks. I'm sometimes deeply aware of and involved in it, then move on to other areas and don't have to think about it for a while. The apparent move to nginx/FPM as a PHP "default" for many situations has, I admit, perplexed me a bit - my muscle memory has not adapted to the newer ways of dealing with this approach.
So many of the problems in WP stem from the extreme 'hackability' from the early days, and a resistance to push for more security by default (the lack of ability to even attempt to install updates via ssh is, imo, a good example).
I’m not sure that really improves the situation much. It’d still have effective write permission to itself.
On shared hosting I'd have thought a user-account cronjob calling the app via CLI (not via curl) could do the updates - but WordPress has always gone for "offer a shit solution to everyone" over "offer a secure solution to some subset of everyone".
It's by no means a big improvement, but the current behaviour has forced me, on more than one occasion, to enable an FTP server just to accommodate it. Just... more work (and increased insecurity, even if only for a small window of time).
My default for WP installs is to take write permission off everything all the time, then re-enable it when I know I'll be doing something that needs to write (updates, etc). There are likely some actual WP users who are not technical yet use the system to do their own content updates, but almost everyone I know who's using WordPress is managing it on behalf of their clients, doing all the content updates for them. These people could/should probably get familiar enough with ssh and stronger security practices, but the "one click update" for themes/plugins/core seems to give a very strong (but... problematic) sense of security.
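In practice that's just a pair of commands wrapped around the update (path is illustrative):

    chmod -R a-w /var/www/site                        # day to day: nothing is writable
    chmod -R u+w /var/www/site                        # before an update: give the owner write access back
    # ...run the core/plugin/theme updates, then lock it down again...
    chmod -R a-w /var/www/site
    chmod -R u+w /var/www/site/wp-content/uploads     # uploads dir stays writable for the owner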
You misunderstand: in WordPress, this is completely automated, enforced, opt-out, one might say. It's rather hard to disable, but only applies for security patches.
The exploit is in file uploads. Nothing to do with comments. The first published exploit pointed at user registration which had a profile picture upload field.
The last time switching off comments helped (as far as I can remember but note I only remember the more serious secholes) was 13 years ago and the only reason that wasn't called Drupalgeddon because barely anyone used it back then and naming them wasn't in fashion (and we had two more RCE bugs the first half of 2005 anyways before we kicked out the old XML-RPC library and replaced it with a better one) ... but DRUPAL-SA-2005-002 was very, very long ago. And while Adrian wrote form API around the same time for theming purposes, we ran with it to avoid that kind of bug happening again. We? Or... just me? should I take the blame? It was an ... emerging decision but a lot of it were on me. As they years has passed, form API became frighteningly complex and a lot of that is certainly on me, someone misused it and bam! Drupalgeddon2 . It makes me sad but I do not feel guilty. I felt guilty for Drupalgeddon because that was a silly one, more of a process fail than a technical one, this was not silly, this was just too complex. Frankly, catching the bug in D7 was very near impossible, whoever ported this code to D8 should've seen it because it clearly violated the most fundamental form API principles but when you are busy porting so much code, it slipped through the cracks. Just as Rachel said for Drupalgeddon: I shouldn't feel guilty, others could've caught it too. And, after all, I am out now so it has nothing to do with me any more. Or that's what I am telling myself these empty, sad days and nights.
I'm taking down my old Drupal site and moving the content I care about to Medium. But there are a lot of links out in the wild to the Drupal pages.
So I'd like to configure a bunch of redirects to bounce people from the old Drupal urls to the new Medium pages.
Will I have to do it by hand for each URL I care about, or are there tools to help with that, like scraping Google or my logs for a histogram of incoming links to determine which ones matter most?
I'd appreciate any links or suggestions about tools or techniques to migrate out of a CMS like Drupal and redirect old links to the new pages. How to play well with google and SEO, how to normalize the urls, what kind of redirects to use, what nuances are there about which urls to redirect, etc?
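Not a full answer, but for the "which URLs matter most" part, a quick histogram from your access logs goes a long way (log path and field position assume the usual combined log format):

    # count requests per path, most-hit first; those are the URLs worth redirecting
    awk '{ print $7 }' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -50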
What you're asking about could amount to a fair amount of work and is probably too much to fully answer in a comment reply, so here's an alternative...
If you want to do something quickly now and make the finer adjustments later, you could simply spider and download your current site with wget and then drop the resulting static HTML into Netlify.
It's free and I just did that with an old Wordpress site that I didn't want to take the time to upgrade and deal with but needed to make secure right away.
I kept a copy of the Wordpress install to work with locally and the live site now just returns static html hosted on Netlify that can't be attacked.
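The wget step is essentially one command (the domain is a placeholder); the resulting folder can then be dragged into Netlify's deploy page or pushed with their CLI:

    wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com/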
I migrated my personal site to Hugo after this latest one. I wrote about some of the reasoning and methods[0], involving a drupal2hugo export and some regexes to keep the previous URLs. Submitted a new sitemap to Google etc. Still haven't written up the details, but it wasn't complicated.
EDIT: I have another Drupal site that actually needs to stay as such, and have other Hugo sites already, some of which were D7 sites. Just for context.
D5: https://www.drupal.org/files/issues/2018-03-28/sa-core-2018-...
D6: https://www.drupal.org/files/issues/2018-03-28/SA-CORE-2018-...
D7: https://cgit.drupalcode.org/drupal/rawdiff/?h=7.x&id=2266d2a...
D8: https://cgit.drupalcode.org/drupal/rawdiff/?h=8.5.x&id=5ac87...