Most personal *mail servers* won't, but most *sources of e-mail* sending un-auth...

jsprogrammer · on June 29, 2015

Do you have any data on the percentage of spam email sources (of all email sources) that send 1-10 emails per day?

I'd expect that for a spam source to be effective, it'd need to send many more messages. Some spammers may try to run many low-rate sources, but you could likely do statistical analysis on the content of the messages to differentiate the spam sources from the non-spam sources.

Further, allowing the occasional doesn't-look-like-spam-but-is-actually-spam message (if such a thing exists), isn't the end of the world. The user can mark it as spam and have the source blocked forever.

The real spam problem that needed to be addressed was mailboxes being completely flooded with unsolicited messages. So flooded that it becomes impossible to read or even find the legitimate messages.

msandford · on June 29, 2015

> Do you have any data on the percentage of spam email sources (of all email sources) that send 1-10 emails per day?

Do you have any method by which all the different mailservers worldwide can successfully collaborate to determine precisely how many emails a particular mailserver sends per day?

You're proposing that mailservers with a low emails-per-day metric be trusted. How is that metric established? How does everyone agree on it? Do we have to implement a new protocol in addition to SMTP in order for this to be viable?

The most obvious response to this is "well each service provider can just count the stats internally and use that" but it's a non-starter. That means that Google with their, what, 1000 email servers (or more) now has to figure this out. And every other service provider too.

Finally, given all the service providers out there, and the total number of "consumer" internet connected computers, giving all the computers a free 1-10 email pass means that spam is back to the worst of 2000-2010 levels. Here's some math:

There are 115mm households in the US http://quickfacts.census.gov/qfd/states/00000.html

81% internet usage https://en.wikipedia.org/wiki/Internet_in_the_United_States#...

That's 93mm consumer connections

Let's suppose that there are 20 big players in the mail provider category, Google, Yahoo, Microsoft (hotmail, office365, etc), AOL, comcast, at&t, time warner, etc. 20 seems like a good number.

Since these systems can't be made to communicate with one another easily about the sent-count of every IP address in the world, they just do it locally. That means that the virus that infected your computer and turned it into a spam-bot gets:

10 free messages * 20 major providers * 93mm computers = 18 billion spams per day

And that's just from the computers in the US. Once you take this global, you're probably talking at least 100 major providers and 300-500 million computers. Then you're talking hundreds of billions of spams a day (10 * 100 * 300-500 million = 300-500 billion).
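To make the arithmetic above easy to re-run with different guesses, here is a minimal sketch. The function name `spam_per_day` and the constants are only illustrative; the inputs are the figures quoted in the comment (115mm households, 81% internet usage, 10 free messages, 20 or 100 providers), and it assumes the worst case where every connected machine uses its full quota at every provider.

```python
# Back-of-the-envelope estimate, assuming every machine spends its whole
# free quota at every provider (a worst case, not a measurement).

def spam_per_day(free_messages: int, providers: int, machines: int) -> int:
    """Messages per day if each machine sends `free_messages` to each provider."""
    return free_messages * providers * machines


US_HOUSEHOLDS = 115_000_000   # figure quoted in the comment
INTERNET_SHARE_PCT = 81       # figure quoted in the comment

# Integer math avoids floating-point rounding noise.
us_connections = US_HOUSEHOLDS * INTERNET_SHARE_PCT // 100

us_total = spam_per_day(10, 20, us_connections)
global_low = spam_per_day(10, 100, 300_000_000)
global_high = spam_per_day(10, 100, 500_000_000)

print(f"US connections: {us_connections:,}")
print(f"US spam/day:    {us_total:,}")
print(f"Global spam/day: {global_low:,} to {global_high:,}")
```

Changing the number of providers or the share of infected machines scales the result linearly, so the estimate is only as good as those two guesses.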

Seems like a reasonable idea until you look at how it would actually be implemented. Then it doesn't seem so great.

jsprogrammer · on June 29, 2015

Modern spam filtering is not purely based on white/black-lists. Statistical analysis of the content is performed.

You don't need to auto-spam-box sources that are sending low volume emails to the same set of addresses, especially when the messages don't get categorized as spam by the content analysis.

You don't need the kind of global coordination you are talking about. Further, whitelists don't solve the problem. Under your scenarios, you can just as easily receive low volume spam from millions of fake accounts at the big providers. You're just relying on the provider to solve your spam problem.

vidarh · on June 30, 2015

Sorry, but these claims just demonstrate that you have not operated a large-volume mail server.

Yes, most of us do need to automatically block such sources because content analysis does not do a good job, and experience demonstrates that the vast majority of such messages are still spam.

> You're just relying on the provider to solve your spam problem.

Yes, we are. And you should be extremely happy they put in the effort they do, or your e-mail would be completely useless.

jlgaddis · on June 29, 2015

> Modern spam filtering is not purely based on white/black-lists. Statistical analysis of the content is performed.

Not purely, no, but content analysis is typically performed after whitelist/blacklist checks. If a host connecting to my mail servers is on a blacklist, there is no content analysis because the mail will be rejected before it gets that far.
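The ordering described above (list checks first, content analysis only for mail that survives them) can be sketched roughly as follows. Everything here is hypothetical: the `BLOCKLIST` set, the toy `content_score` word counter, and the 0.2 threshold stand in for a real DNSBL lookup and a real statistical filter, neither of which is specified in the thread.

```python
from dataclasses import dataclass

# Hypothetical blocklist; 203.0.113.0/24 is a documentation-only range.
BLOCKLIST = {"203.0.113.7"}

# Toy stand-in for a statistical content filter.
SPAMMY_WORDS = {"viagra", "lottery"}


@dataclass
class Verdict:
    action: str            # "reject", "spam", or "deliver"
    content_scanned: bool  # whether the body was ever looked at


def content_score(body: str) -> float:
    """Fraction of words in the body that appear in SPAMMY_WORDS."""
    words = body.lower().split()
    if not words:
        return 0.0
    return sum(w in SPAMMY_WORDS for w in words) / len(words)


def handle(client_ip: str, body: str, threshold: float = 0.2) -> Verdict:
    # Step 1: the list check happens at connection time, before any content
    # is examined, so a listed host never reaches the content filter.
    if client_ip in BLOCKLIST:
        return Verdict("reject", content_scanned=False)
    # Step 2: only mail from unlisted hosts is scored.
    if content_score(body) >= threshold:
        return Verdict("spam", content_scanned=True)
    return Verdict("deliver", content_scanned=True)
```

The point of the sketch is the control flow: a perfectly innocent-looking body from a listed host is rejected without ever being scored.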

c22 · on June 30, 2015

> 10 free messages * 20 major providers * 93mm computers = 18 billion spams per day

I don't get this, is this assuming all 93 million consumer connections are infected with the same malware?

vidarh · on June 30, 2015

It's assuming that they're all infected by some malware. Of course not all of them are, but realistically there are also thousands of providers, not 20.

vidarh · on June 30, 2015

How would my mail server know that the sender is a low volume source?

It's trivial to randomly cycle through the domains you're sending to in order to deliver at a low rate to any single domain.
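A small simulation shows why a per-receiver count says little about a sender's total volume. This is only a sketch: it uses a deterministic round-robin instead of random cycling so the numbers are reproducible, and the domain names and message counts are made up.

```python
from collections import Counter
from itertools import cycle, islice


def per_domain_counts(domains: list[str], total_messages: int) -> Counter:
    """Count messages each domain receives when the sender rotates through them."""
    return Counter(islice(cycle(domains), total_messages))


# Hypothetical target list: 1000 receiving domains.
domains = [f"example{i}.test" for i in range(1000)]

# One sender pushing 5000 messages a day in total.
counts = per_domain_counts(domains, 5000)

print("total sent:", sum(counts.values()))
print("most any single domain saw:", max(counts.values()))
```

Each receiving domain sees only a handful of messages, so from where any one mail server sits, a high-volume sender looks identical to a low-volume one.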

> Further, allowing the occasional doesn't-look-like-spam-but-is-actually-spam message (if such a thing exists), isn't the end of the world.

It is when you allow the occasional such message from each of tens of millions of compromised machines worldwide.

Recently we had to block inbound mail from most of Ukraine because hundreds of thousands of IP addresses belonging to Ukrainian ISPs were used to spam us. I started manually blocking /24s, but couldn't keep up, so we moved to blocking /16s, and even that was going too slowly to avoid affecting our systems.
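For readers unfamiliar with the /24 and /16 notation, the escalation described above might be automated along these lines. This is a sketch, not what was actually run: the `escalate_blocks` helper and its threshold rule (block the whole /16 once enough distinct /24s inside it have sent spam) are assumptions made for illustration, using Python's standard `ipaddress` module.

```python
import ipaddress
from collections import Counter


def escalate_blocks(spam_ips: list[str], threshold: int) -> list[ipaddress.IPv4Network]:
    """Return networks to block: /24s by default, the parent /16 when busy enough.

    `threshold` is a made-up tuning knob: the number of distinct spamming
    /24s inside one /16 at which the whole /16 gets blocked instead.
    """
    # Collapse each offending address to the /24 that contains it.
    slash24s = {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in spam_ips}

    # Count how many distinct bad /24s fall inside each /16.
    per16 = Counter(net.supernet(new_prefix=16) for net in slash24s)

    blocked = set()
    for net in slash24s:
        parent = net.supernet(new_prefix=16)
        blocked.add(parent if per16[parent] >= threshold else net)
    return sorted(blocked)


# Addresses from private and documentation ranges, purely illustrative.
sample = ["10.1.1.5", "10.1.2.9", "10.1.3.4", "192.0.2.10"]
for net in escalate_blocks(sample, threshold=3):
    print(net)
```

A /24 covers 256 addresses and a /16 covers 65,536, which is why moving up a level trades precision for speed and increases collateral damage.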

Another time recently we had to block a particularly notorious Asian ISP that basically doesn't care that the millions of addresses in their netblocks all get hijacked for spamming purposes (or quite possibly they do care as a result of being paid to look the other way).

This is business as usual when you operate anything but the smallest mail servers where you might get lucky and not get detected by spammers.

We still receive spam sent from malware, addressed to domains of clients we haven't hosted for half a decade, that's been picked up from people's mail clients.

This is why a lot of people decide it's easier to just blanket block all dynamic IPs. It costs a lot of money to stay on top of spam, and a lot more if you decide to try to avoid collateral damage.

> The real spam problem that needed to be addressed was mailboxes being completely flooded with unsolicited messages. So flooded that it becomes impossible to read or even find the legitimate messages.

The practices you complain about are the reason why the spam problem is now contained sufficiently for e-mail to still be usable.

jsprogrammer · on June 30, 2015

Sounds like you can map out the compromised addresses and selectively block them.

Avoiding auto-spam-boxing new domains (as opposed to IP addresses) would be helpful. Any strategy that attempts to leverage multiple machines would still need to register and maintain many domain names, making such a spam strategy costly and more technically difficult. Further, an easy check for malicious users is to examine the websites at the domains suspected of spamming, or even just the WHOIS info.

jcranmer · on June 30, 2015

If spam detection requires human intervention, it is too slow. Domain reputation needs to converge within 3-5 minutes; even 30 minutes means you have lost the spam race.