I also don't really get why we don't apply spam filtering retroactively a few seconds/minutes after it arrived. At first you don't know the domain yet, but at some point you can go "oh, this domain now reached more than 1000 different inboxes for the first time, and of the 50 users who saw the message, 45%* marked it as spam. Must be spam, let's move these and future messages to spam folders for all who haven't downloaded/opened the message yet." There is so much spam that is (nearly) identical and reaches a large percentage of people, but it's just left in the inbox for years by outlook/gmail/etc., even if I don't log into the account for months at a time.
If the spammers can only reach an initial small sample and the domain is next to useless after that, even 99ct domains should not be worth it.
* Or whatever is a normal number. I know lots of people just leave the message as 'read' and don't bother marking it. I don't know how many users do this. Maybe one could also keep track of users who regularly mark something as spam and only count the percentage among those.
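To make the idea concrete, here is a rough sketch of the retroactive check, including the footnote's suggestion of only counting users who regularly use the spam button. All names (`DomainStats`, `should_quarantine`, `active_reporters`) and the default thresholds are made up for illustration, taken from the numbers in the comment rather than from any real provider's system.

```python
from dataclasses import dataclass, field


@dataclass
class DomainStats:
    """Per-sending-domain counters (hypothetical structure)."""
    inboxes_reached: set = field(default_factory=set)  # users who received the mail
    seen_by: set = field(default_factory=set)          # users who opened it
    marked_spam_by: set = field(default_factory=set)   # users who hit "spam"


def should_quarantine(stats, active_reporters,
                      min_inboxes=1000, min_viewers=50, spam_ratio=0.45):
    """Return True if unopened copies should be moved to spam.

    Thresholds are the illustrative numbers from the comment, not real values.
    """
    if len(stats.inboxes_reached) < min_inboxes:
        return False
    # Only count viewers who regularly mark things as spam.
    viewers = stats.seen_by & active_reporters
    if len(viewers) < min_viewers:
        return False
    complaints = stats.marked_spam_by & viewers
    return len(complaints) / len(viewers) >= spam_ratio


def inboxes_to_clean(stats):
    """Users who received the message but have not opened it yet."""
    return stats.inboxes_reached - stats.seen_by
```

The point of `inboxes_to_clean` is that nothing is taken away from people who already saw the message; only the untouched copies get moved.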
Gmail and Yahoo will do retroactive classification to some extent, but not broadly. It's a little more common to defer delivery by varying amounts based on historical sender reputation, but again that impacts a minority of legitimate email - maybe 10-20%-ish, off the top of my head - and mostly only by a few minutes.
And yes, every major consumer email provider tracks complaint and response rate metrics (as well as many other metrics and indicators) and uses those as part of their filtering. A spam ratio of > 2% is often enough to cause filtering - that's actually toward the very top end of the complaint rate spectrum for messages that are delivered to the inbox.
Spam filtering already relies on throttling. If you're a domain with unknown reputation, you're generally forbidden [by the recipient] from sending a lot of email until a reputation is established.
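For illustration, recipient-side throttling of this kind could look roughly like the sliding-window limiter below. The class name, the 0.0 to 1.0 reputation score, and the 100x scaling for trusted senders are all assumptions made for the sketch; real providers do not publish their limits.

```python
from collections import defaultdict, deque


class DomainThrottle:
    """Sliding-window limit on accepted mail per sending domain (sketch)."""

    def __init__(self, base_limit=100, window_seconds=3600):
        self.base_limit = base_limit          # messages per window for unknown domains
        self.window = window_seconds
        self.accepted = defaultdict(deque)    # domain -> timestamps of accepted mail
        self.reputation = defaultdict(float)  # 0.0 = unknown, 1.0 = fully trusted

    def limit_for(self, domain):
        # Hypothetical scaling: a fully trusted domain gets 100x the base limit.
        return int(self.base_limit * (1 + 99 * self.reputation[domain]))

    def allow(self, domain, now):
        """Return True to accept, False to defer (e.g. with an SMTP 4xx reply)."""
        timestamps = self.accepted[domain]
        # Drop timestamps that have fallen out of the window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit_for(domain):
            return False
        timestamps.append(now)
        return True
```

A deferred message is not lost; a well-behaved sender retries later, by which time the window has moved on or the domain has earned a higher limit.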
This suggests that a weakness of the global email system is being exploited by spammers: receiving mail servers aren't gossiping enough information about new mail sending domains.
Presumably mail server operators are reporting obvious spammers to (centralised) blacklists, but it would perhaps be possible to better tune a heuristic (and increase the cost to spammers) by sharing information on the number of non-spam messages received.
This could actually be done in a provable and relatively privacy-preserving way, if sending mail servers included signatures of the hashes of the emails they were sending. Every email that was received by a domain of unknown reputation could have its hash+sig sent to a public distributed log somewhere.
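A minimal sketch of what a hash+sig log entry might contain, so that only a digest of the message ever leaves the receiving server. Note the big simplification: HMAC with a shared key is used here as a stand-in so the example runs on the standard library alone; the actual proposal would need a public-key signature (verifiable by anyone, e.g. against a key published in DNS). The function names and the entry layout are invented for illustration.

```python
import hashlib
import hmac


def message_digest(raw_message: bytes) -> str:
    """SHA-256 of the raw message; the content itself is never logged."""
    return hashlib.sha256(raw_message).hexdigest()


def sign_digest(digest: str, key: bytes) -> str:
    # Stand-in only: a real scheme would use an asymmetric signature here.
    return hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()


def log_entry(sender_domain: str, raw_message: bytes, key: bytes) -> dict:
    """Build the record a receiver would append to the public log."""
    digest = message_digest(raw_message)
    return {"domain": sender_domain, "hash": digest, "sig": sign_digest(digest, key)}


def verify_entry(entry: dict, key: bytes) -> bool:
    """Check that the signature matches the logged hash."""
    expected = sign_digest(entry["hash"], key)
    return hmac.compare_digest(expected, entry["sig"])
```

Anyone reading the log can then count distinct hashes per domain without learning what the messages said.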
If this was combined with some sort of good-behaviour bond that domain registrars required (for domains that send email) and which was paid back after a reputation was established, it would make cheap domains much more expensive for spammers.
This is tricky though. You don't want to make these heuristics public, or spammers will just access them to know when to switch domains, and if you say "you can't see our reputation list unless you're Gmail, Yahoo, or Outlook", that can start to look a lot like collusion.
You're right, but my hope is that forcing spammers to switch domains will increase their costs to the point that spamming is no longer financially viable.
If the cost of domains is already such a significant expenditure that they need to look for sub-one dollar registrations, then requiring, say, a $10 bond on all domains with an MX record might erase their profit margins completely.
(There is a question of what constitutes "good behaviour" and whether that can be gamed by having spam domains reporting each other as sending legitimate email, but if these ratings are public then people can choose which ratings to trust. Domain age would probably be a good heuristic there too.)
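The bond argument above is basically arithmetic, so here is a back-of-the-envelope sketch. Every number in the usage below is hypothetical (nobody in this thread has real figures for spam revenue per inbox), and it assumes a spam domain never earns its bond back.

```python
def profit_per_domain(inboxes_reached, revenue_per_inbox,
                      registration_cost, bond=0.0):
    """Spammer's profit from one throwaway domain.

    Assumes the bond is forfeited, i.e. the domain never builds a good reputation.
    """
    return inboxes_reached * revenue_per_inbox - registration_cost - bond


def break_even_inboxes(revenue_per_inbox, registration_cost, bond=0.0):
    """Inboxes a domain must reach before it pays for itself."""
    return (registration_cost + bond) / revenue_per_inbox


# Hypothetical inputs: 1000 inboxes before filtering kicks in,
# $0.002 expected revenue per inbox, a $0.99 domain.
without_bond = profit_per_domain(1000, 0.002, 0.99)
with_bond = profit_per_domain(1000, 0.002, 0.99, bond=10.0)
```

With these made-up inputs the domain is slightly profitable without the bond and clearly loses money with it; the combination of a small initial sample and a bond is what does the damage.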
I think that's how spam filtering started for large providers (in addition to classic keywords and such). But the temptation to do something smarter is strong, and as usual, "smart" things end up being wrong.
An issue with this method is that company A may subscribe to company B's mailing list en masse, and then tag its mails as spam, causing all mails from B to be classified as spam. (Maybe that's what happened to get Paypal and Stripe banned from Gmail, who knows.)
You're right; complaint rate and similar metrics tracking were implemented by major consumer email providers in the early 00s, if I recall correctly. The technique has been around long enough for those providers to have systems in place to control for edge cases like the one you mentioned - there are enough subtle behavioral differences between an average user reporting something as spam, vs. a deliberate 'complaint brigade', in order to be able to discern reasonably well between the two - or at least, to reduce the impact of the latter. Users marking 'this is not spam' for messages that mistakenly land in spam is a key metric as well, and is part of how a lot of senders recover from short-term spam folder delivery.
Whether or not additional complexity is justified is something that can be measured in a case like this - and that's exactly what Gmail does, using a variety of different metrics, which allows them to make an informed decision about whether to take on the additional complexity cost.
Should they weight false positives for this message type more heavily in their accuracy metrics? Definitely, these are particularly critical to classify correctly.
That's one of the reasons why greylisting used to be effective; when a mail-delivery is attempted you give it a soft-fail, disconnecting the sender.
The expectation is that by the time they try to deliver a second time you'll get "spam" results if you query DNS-based blacklists, etc, as other people have reported it already.
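For anyone who hasn't run it, greylisting as described above boils down to something like this. It's a simplified sketch: the `Greylist` class and the `is_blacklisted` callable are invented names, and the callable stands in for a real DNS-based blacklist query.

```python
class Greylist:
    """Soft-fail the first delivery attempt, then re-check on retry (sketch)."""

    def __init__(self, min_delay=300, is_blacklisted=lambda ip: False):
        self.min_delay = min_delay            # seconds a sender must wait before retrying
        self.is_blacklisted = is_blacklisted  # placeholder for a DNSBL lookup
        self.first_seen = {}                  # (ip, sender, recipient) -> first attempt time

    def check(self, client_ip, sender, recipient, now):
        """Return an SMTP-style reply string for this delivery attempt."""
        key = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            # First attempt, or a retry that came too soon: temporary failure.
            return "451 try again later"
        # By now other receivers may have reported the sender.
        if self.is_blacklisted(client_ip):
            return "550 rejected"
        return "250 ok"
```

The delay only helps if the blacklists actually catch up in the meantime, which is exactly why it stops working when the spam comes from hosts nobody is willing to list.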
These days greylisting doesn't seem to be so useful, as >50% of the spam I receive is sent from gmail/yahoo/similar. Hosts too big to block (sigh).