Ask HN: open source Posterous-style email validation?

patio11 · on June 20, 2010

pyspf (Google it) will do SPF checking for you. If you'd rather do it yourself, SPF is really, really simple to validate in your language of choice. However, not everybody uses SPF.
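To illustrate how small the core of an SPF check is, here is a minimal Python sketch. It is not pyspf's API and not a complete evaluator: the function name `check_spf_record` is hypothetical, the record is passed in as a string instead of being fetched from DNS, and only the `ip4`, `ip6`, and `all` mechanisms are evaluated. A real validator also needs `a`, `mx`, `include`, `redirect`, and the DNS lookup limits.

```python
import ipaddress

# Map SPF qualifiers to result names; "+" is the default qualifier.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def check_spf_record(record, sender_ip):
    """Evaluate only the ip4/ip6/all mechanisms of an SPF record string.

    Hypothetical helper for illustration; the record would normally come
    from a DNS TXT lookup on the sending domain.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # no usable SPF record
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
        # Mechanisms that need DNS (a, mx, include, ...) are skipped here.
    return "neutral"  # nothing matched and there was no "all" term
```

For example, `check_spf_record("v=spf1 ip4:192.0.2.0/24 -all", "198.51.100.1")` falls through to `-all` and reports a fail.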

As for "validating" the rest of the email headers, well... I want to strike a balance between "sure you can do that, good luck!" and "the entire anti-spam community has tried this and it is basically impossible, which is why we rely heavily on IP reputation and Bayes-based approaches which do not treat the contents of the headers as semantically meaningful, since they are in the hands of the enemy".

davi · on June 20, 2010

Thanks very much, that's helpful. Maybe good enough for a small, experimental project (i.e. one step beyond 'nothing'). An open source effort to take a crack at the larger scope you lay out would be a good thing.

frognibble · on June 20, 2010

Here's a sketch for checking the validity of the sender. It does not handle all cases and I am sure it has some holes. I am interested in feedback on this. Are there other things to check? Are these checks "safe" for some definition of safe?

Step 1: If DKIM header present, then use result of DKIM validation.

Step 2: If sending domain has SPF record, then use result of SPF validation.

Step 3: If message passes SPF check using a conservatively guessed SPF record, then treat the message as valid.

Step 4: If message came from same IP address as other messages for user and some headers match headers from previous messages (fuzzy match on message id?), then treat the message as valid.

Step 5: What next? Messages will make it past the previous steps.
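The steps above can be sketched as a simple fallback chain. This is only an illustration in Python: the function name `validate_sender` and its parameters are hypothetical, the individual checks (DKIM, SPF, guessed SPF, history matching) are assumed to be computed elsewhere and passed in, and treating Step 5 as "not validated" is an assumption, since the post leaves that step open.

```python
def validate_sender(dkim_result, spf_result, guessed_spf_result, matches_history):
    """Return (is_valid, deciding_step) for one incoming message.

    dkim_result / spf_result: "pass", "fail", or None when the message has
    no DKIM header or the domain publishes no SPF record.
    guessed_spf_result: result of checking against a conservatively
    guessed SPF record ("pass" or anything else).
    matches_history: True if IP and headers resemble earlier messages.
    """
    # Step 1: a DKIM header is present, so its verdict decides.
    if dkim_result is not None:
        return dkim_result == "pass", "dkim"
    # Step 2: the domain publishes SPF, so its verdict decides.
    if spf_result is not None:
        return spf_result == "pass", "spf"
    # Step 3: a guessed SPF record can only ever accept, never reject.
    if guessed_spf_result == "pass":
        return True, "guessed-spf"
    # Step 4: same IP and similar headers as previous messages.
    if matches_history:
        return True, "history"
    # Step 5: nothing matched; assumed here to mean "hold the message".
    return False, "unverified"
```

Returning the deciding step alongside the verdict makes it easier to log which mechanism accepted a message, which matters when the weaker steps are the ones doing the accepting.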

JoachimSchipper · on June 20, 2010

Well, each of these has problems.

DKIM, which is not widely deployed, typically protects the message body, the From: and To: headers, and other headers. If this is actually used, you only have to worry about replayed messages (a hacker sends 1,000,000 copies of a legitimate blog post), which is a manageable problem. Unfortunately, you can't do anything if this header is not present - even if I have a Yahoo/GMail/... address, which would otherwise be DKIM'ed, I may have sent this message via another mail server.

SPF, which checks that the server sending the mail is authorized to do so, would work reasonably well, or at least hand off the issue to the administrator of the sending mail server. Unfortunately, there are quite a few domains without SPF or which SOFTFAIL all; worse, prank-loving coworkers may have access to the same mailserver.

"Same IP address" falls afoul of the pranking coworkers again, and is a very weak heuristic anyway.

There are at least two solutions that work. The actually secure one is requiring the user to PGP- or S/MIME-sign all mail; the other one is to send back a challenge. Mailing list managers typically do this: they send a message with "Subject: 23dsaf2: please confirm post" and accept any response that contains 23dsaf2 in the subject.
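A minimal Python sketch of the challenge approach, assuming the token is stored with the pending post until a reply arrives. The function names `make_challenge` and `is_confirmation` are hypothetical, and the 8-character token only mirrors the length of the example above; a longer token would be harder to guess.

```python
import secrets

def make_challenge():
    """Create a random token and the subject line of the challenge mail."""
    token = secrets.token_hex(4)  # 8 hex characters, unpredictable
    subject = f"{token}: please confirm post"
    return token, subject

def is_confirmation(reply_subject, token):
    """Accept any reply whose subject still contains the token."""
    return token.lower() in reply_subject.lower()
```

Because most mail clients keep the original subject and prepend "Re:", the user only has to hit reply and send for the pending post to be accepted.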

frognibble · on June 20, 2010

The context of this thread is creating a Posterous-style email validation. Posterous does not use either of the two solutions that you suggest.

It's OK that DKIM is not widely deployed because the logic falls back to other mechanisms when the DKIM header is not present. DKIM is deployed on GMail and Yahoo Mail, so it is worth doing. Replay attacks are easy to defeat by not posting duplicate content. It's probably a good idea to do dup detection anyway, to handle the case where the user accidentally sends the message twice.
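Duplicate detection of this kind could look roughly like the following Python sketch. The class name `DuplicateDetector` is hypothetical, fingerprints are kept in memory only, and the normalization (collapsing whitespace, ignoring case) is an assumption about what should count as "the same" post.

```python
import hashlib

class DuplicateDetector:
    """Remember a fingerprint of every accepted post and reject repeats."""

    def __init__(self):
        self._seen = set()  # a real service would persist this

    @staticmethod
    def _fingerprint(sender, body):
        # Collapse whitespace and case so trivial re-wrapping by mail
        # clients does not defeat the comparison.
        normalized = " ".join(body.split()).lower()
        data = f"{sender.lower()}\n{normalized}".encode("utf-8")
        return hashlib.sha256(data).hexdigest()

    def is_duplicate(self, sender, body):
        """Return True if this sender already posted the same content."""
        fingerprint = self._fingerprint(sender, body)
        if fingerprint in self._seen:
            return True
        self._seen.add(fingerprint)
        return False
```

This handles both the replayed message and the accidental double send, since each arrives with the same sender and body.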

JoachimSchipper · on June 21, 2010

Hmm, yes, I was just pointing out that there are other solutions.

Yes, I agree that DKIM+duplicate detection is fairly good; you just can't rely on it being present, and if it isn't you have to fall back to much less reliable stuff.

japherwocky · on June 20, 2010

Zed Shaw's Lamson project (http://lamsonproject.com) has some solid code for handling most of the messiest parts of dealing with email - bounces, Unicode, etc.

It's structured in a way that makes it very easy to snip out the parts you want to use without necessarily using all the rest.

phreeza · on June 20, 2010

Wasn't it shown yesterday that Posterous has basically no security at all?

http://news.ycombinator.com/item?id=1441997

convel · on June 20, 2010

This security hole is now fixed. We had a specific problem with the way we dealt with SPF records. Dustin didn't set any up, and there was a specific way that Robin Duckett's email server responded that caused us to flag it as a false negative for spoofing.

http://news.ycombinator.com/item?id=1443143

MichaelApproved · on June 20, 2010

What keeps someone else behind the same smtp server from spoofing an email?

_delirium · on July 4, 2010

A lot of SMTP servers implementing SMTP AUTH will add an annotation "(Authenticated sender: localusername)" or similar, which will let you distinguish between different users of the same mail server, even when they spoof the From: header. Not sure if that's the solution Posterous is using, or how widespread it is, though.
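As a rough Python sketch, the annotation can be pulled out of the Received: headers with a regular expression. The function name `authenticated_senders` is hypothetical, and the pattern assumes the exact "(Authenticated sender: ...)" wording quoted above, which varies between servers. The value is also only meaningful when the header was added by a server you have reason to trust, since earlier Received: lines can be forged.

```python
import re
from email import message_from_string

# Matches the annotation wording quoted above; other servers may differ.
AUTH_SENDER = re.compile(r"\(Authenticated sender:\s*([^)\s]+)\)", re.IGNORECASE)

def authenticated_senders(raw_message):
    """Return every 'Authenticated sender' name found in Received: headers,
    in header order (newest hop first)."""
    msg = message_from_string(raw_message)
    found = []
    for received in msg.get_all("Received", []):
        match = AUTH_SENDER.search(received)
        if match:
            found.append(match.group(1))
    return found
```

Comparing the returned name against the one seen on a user's earlier messages would separate two users of the same mail server even when both use the same From: address.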

risotto · on June 20, 2010

No, it was shown there was a bug.

Plus, this validates the OP's question. It's not hard to receive emails somewhere, but it is hard to clean them and to validate the sender with SPF and a few other email non-standards.

I've been using Google App Engine's email service, but it chokes and rejects a ton of emails that I forward to it.

Maybe Posterous could spin off an email cloud service? It'd be a distraction from their core competency, but I could see a lot of value.

risotto · on June 20, 2010

Or maybe SendGrid already is this? I haven't used it, but they are a big name in cloud email services.

karimyaghmour · on June 20, 2010

I'm still wondering what Posterous plans to do when they reach enough of a critical mass that spammers will actively try to impersonate existing accounts. Generalized, non-sender-server-enforced sender authentication does not exist. That's why SPF and DKIM came along ... I'm sure they've had to pore over this. Anyone have a link on design/discussion?

pyre · on June 20, 2010

They could always go with GPG/PGP.

MichaelApproved · on June 20, 2010

Validating with headers is like securing a webpage by keeping the URL a secret, or by checking the browser user agent and IP address. It gives a false sense of security and is very vulnerable to cracks.

If you're going to validate with headers then feel free to call it usable but don't call it security.

kljensen · on June 20, 2010

Is there any degree of "free" validation if you route all the emails through another service that probably does some of this? E.g., a Gmail account that forwards all incoming mail on to your servers.

quadhome · on June 20, 2010

Am I missing something obvious against backtracking the headers to the server immediately before your own? If the IP of that machine differs in future emails, ask for confirmations?
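A minimal Python sketch of that idea, assuming the top-most Received: header is the one written by your own server and that it records the connecting machine's address in square brackets, as many servers do. The function names `handoff_ip` and `needs_confirmation` are hypothetical, and `known_ips` stands for whatever set of addresses has been stored for the user.

```python
import re
from email import message_from_string

# An address in square brackets, with an optional "IPv6:" tag.
BRACKETED_IP = re.compile(r"\[(?:IPv6:)?([0-9a-fA-F:.]+)\]")

def handoff_ip(raw_message):
    """Return the IP recorded in the newest Received: header, or None."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    if not received:
        return None
    # Only the header added by our own server is trustworthy; anything
    # below it was supplied by the sending side.
    match = BRACKETED_IP.search(received[0])
    return match.group(1) if match else None

def needs_confirmation(raw_message, known_ips):
    """Ask for confirmation when the hand-off IP is missing or new."""
    ip = handoff_ip(raw_message)
    return ip is None or ip not in known_ips
```

Only the first header is inspected on purpose: older Received: lines further down are under the sender's control and prove nothing.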

MichaelApproved · on June 20, 2010

Since you can't validate beyond the SMTP server you're going to have trouble when two or more people are behind the same server.

quadhome · on June 21, 2010

Presumably it's their server's issue at that point.

If a server allows multiple people to send from the same address, then you can't validate further than that.