Email Address Validation for the Masses
Posted by: Mordred in Uncategorized, tags: email, validation
The problem:
The user is expected to enter his email address in a form field.
The solution(s):
- Pass it through a simple check to see if it contains an @ sign.
- Pass it through a regexp checking for “name@domain.com” format.
- Pass it through a really complex regexp checking for full RFC 2822 correctness.
- Pass it through a really working really complex regexp checking for really full RFC 2822 correctness.
- Pass it through a really working really complex regexp checking for really full RFC 2822 correctness, and then check whether the domain in the domain part exists.
- Send a unique random token (in a link back to your site) to the email.
Email addresses can take really, really complex forms and still be RFC-valid. They can be RFC-valid, but the particular email provider may accept only a subset of what the RFC allows (for example, for the local part of the address - think “username” - Gmail allows only alphanumerics and dots, 6-30 characters). The domain name may not exist. It may exist, but have no MX record (i.e. it is most likely not set up to accept mail). The domain may accept mail, but the local part may point to a non-existent account.
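To make the point concrete, here is a small Python sketch. The pattern is a made-up example of the “name@domain.com” kind of check, not one taken from any particular library, and the sample addresses are illustrative. It shows how such a check turns away addresses that are perfectly valid:

```python
import re

# Hypothetical "name@domain.com" style pattern (an assumption for illustration).
NAIVE_PATTERN = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9-]+\.[a-z]{2,4}")


def naive_accepts(address):
    """Return True if the naive pattern matches the whole address."""
    return NAIVE_PATTERN.fullmatch(address) is not None


# Syntactically valid addresses that the naive pattern still rejects.
VALID_BUT_REJECTED = [
    "user+tag@example.com",      # "+" is allowed in the local part
    "o'brien@example.com",       # so is an apostrophe
    "user@mail.example.museum",  # subdomain, and a TLD longer than 4 letters
]

for address in ["name@domain.com"] + VALID_BUT_REJECTED:
    verdict = "accepted" if naive_accepts(address) else "rejected"
    print(address, "->", verdict)
```

Loosening the pattern fixes these three cases, but the next odd-looking address is always around the corner.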
So what do we do? We use the last of the above options: we send a mail, and wait for proof that it’s readable by the user. But why don’t we do a validity check first? At least some sanity check? Okay, but keep it as a warning only: tell the user if the address doesn’t “look” valid, but don’t expect that your Really Correct Check ™ is actually correct, so don’t deny the form submission. Unless you’re okay with angry users / customers of course.
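A minimal Python sketch of the token approach could look like the following. Everything in it is an assumption made for illustration: the in-memory dictionary stands in for whatever storage you actually use, the 24-hour lifetime and the example.com URL are placeholders, and the actual sending of the mail is left out.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 24 * 60 * 60  # assumed lifetime: 24 hours

# Stand-in for a database table: token -> (email, expiry timestamp).
_pending = {}


def issue_token(email, now=None):
    """Create a unique random token and remember which address it belongs to."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # unguessable and safe to put in a URL
    _pending[token] = (email, now + TOKEN_TTL_SECONDS)
    return token


def confirmation_link(token, base_url="https://example.com/confirm"):
    """Build the link that goes into the mail (base_url is a placeholder)."""
    return f"{base_url}?token={token}"


def confirm(token, now=None):
    """Return the confirmed address, or None if the token is unknown or expired."""
    now = time.time() if now is None else now
    entry = _pending.pop(token, None)  # pop makes every token single-use
    if entry is None:
        return None
    email, expires_at = entry
    if now > expires_at:
        return None
    return email
```

The address counts as valid only once `confirm` returns it, which happens when somebody who can read that mailbox follows the link.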
What if you want an email address that does not belong to the user (think e-cards)? Well, then you sigh, pick one of the checking methods above except the last one, use it as a warning only (not as a fail test), and just send the mail and hope for the best.
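The “warning only” idea can be sketched in Python roughly like this. The function name and the particular checks are assumptions picked for illustration; the point is that the function only returns messages to show next to the field and never decides whether the form is accepted.

```python
def email_warnings(address):
    """Return human-readable warnings; an empty list means "looks fine".

    This only inspects the shape of the address. It never rejects anything:
    the caller shows the warnings and accepts the submission regardless.
    """
    warnings = []
    address = address.strip()
    local, at, domain = address.rpartition("@")
    if not at:
        warnings.append("The address has no @ sign.")
        return warnings
    if not local:
        warnings.append("There is nothing before the @ sign.")
    if "." not in domain:
        warnings.append("The domain part has no dot in it.")
    if any(ch.isspace() for ch in address):
        warnings.append("The address contains whitespace.")
    return warnings


for candidate in ["name@domain.com", "namedomain.com", "name@localhost"]:
    print(candidate, "->", email_warnings(candidate) or "no warnings")
```

An address like name@localhost gets a warning here even though it can be perfectly deliverable on some setups, which is exactly why the result should stay a hint and not become a gate.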
Obviously, in both cases (whether the address belongs to the user or not), there is the problem that we’ll send mail to any given address, no questions asked. As we saw, short of sending an email there is no meaningful question we can ask that will tell us whether an address is “valid”, so we have a problem with spambots here. The solution is well known - deny access to the bots by protecting the form with a good, strong CAPTCHA.