Email Regular Expression Pattern
^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$
Description
^                       # start of the line
  [_A-Za-z0-9-]+        # must start with one or more (+) of the characters in the brackets [ ]
  (                     # start of group #1
    \\.[_A-Za-z0-9-]+   # a dot "." followed by one or more (+) of the characters in the brackets [ ]
  )*                    # end of group #1, this group is optional (*)
  @                     # must contain an "@" symbol
  [A-Za-z0-9]+          # followed by one or more (+) of the characters in the brackets [ ]
  (                     # start of group #2 - optional further domain labels
    \\.[A-Za-z0-9]+     # a dot "." followed by one or more (+) of the characters in the brackets [ ]
  )*                    # end of group #2, this group is optional (*)
  (                     # start of group #3 - final label (TLD) checking
    \\.[A-Za-z]{2,}     # a dot "." followed by at least 2 letters from the brackets [ ]
  )                     # end of group #3, this group is required
$                       # end of the line
Taken together, this means the part before the “@” must start with one or more characters from “_A-Za-z0-9-”, optionally followed by any number of “\\.[_A-Za-z0-9-]+” groups, and is then followed by an “@” symbol. The domain name must start with one or more characters from “A-Za-z0-9”, optionally followed by further dot-separated labels “\\.[A-Za-z0-9]+” (such as the “.com” in “.com.au”), and must end with a final label “\\.[A-Za-z]{2,}” (such as “.com”, “.net”, or the “.au” in “.com.au”), which starts with a dot “.” and contains at least 2 letters.
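To make the group-by-group description easier to check against real code, here is a minimal sketch that compiles the same expression with Java's Pattern.COMMENTS flag, which lets the pattern carry whitespace and "#" comments. The class name CommentedEmailPattern and the method isValid are hypothetical names chosen for this illustration; they are not part of the original example.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the original article.
final class CommentedEmailPattern {

    // Same expression as above. With Pattern.COMMENTS, whitespace and
    // everything from "#" to the end of a line are ignored by the engine.
    static final Pattern EMAIL = Pattern.compile(
          "^[_A-Za-z0-9-]+          # first run of allowed characters\n"
        + "(\\.[_A-Za-z0-9-]+)*     # group 1: optional dot-separated runs\n"
        + "@                        # the at sign\n"
        + "[A-Za-z0-9]+             # first domain label\n"
        + "(\\.[A-Za-z0-9]+)*       # group 2: optional further labels\n"
        + "(\\.[A-Za-z]{2,})$       # group 3: required final label, letters only\n",
        Pattern.COMMENTS);

    static boolean isValid(String email) {
        return EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("mkyong.100@mkyong.com.au")); // true
        System.out.println(isValid("mkyong@gmail.com.1a"));      // false
    }
}
```

Writing the pattern this way keeps the explanation next to each group, at the cost of a slightly longer string literal.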
Example in Java
package com.mkyong.regex;

import java.util.regex.Pattern;

public class EmailValidator {

    private static final String EMAIL_PATTERN =
        "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@"
        + "[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$";

    private final Pattern pattern;

    public EmailValidator() {
        pattern = Pattern.compile(EMAIL_PATTERN);
    }

    /**
     * Validate an email address with the regular expression.
     *
     * @param email the email address to validate
     * @return true if the email address is valid, false otherwise
     */
    public boolean validate(final String email) {
        // A Matcher is not thread-safe, so create a new one for each call
        return pattern.matcher(email).matches();
    }
}
Emails that match:
1) “mkyong@yahoo.com”, “mkyong-100@yahoo.com”, “mkyong.100@yahoo.com”
2) “mkyong111@mkyong.com”, “mkyong-100@mkyong.net”, “mkyong.100@mkyong.com.au”
3) “mkyong@1.com”, “mkyong@gmail.com.com”
Emails that don’t match:
1) “mkyong” – must contain an “@” symbol
2) “mkyong@.com.my” – the domain cannot start with a dot “.”
3) “mkyong123@gmail.a” – “.a” is not a valid TLD, the last TLD must contain at least two characters
4) “mkyong123@.com” – the domain cannot start with a dot “.”
5) “mkyong123@.com.com” – the domain cannot start with a dot “.”
6) “.mkyong@mkyong.com” – the email’s first character cannot be a dot “.”
7) “mkyong()*@gmail.com” – the part before the “@” allows only letters, digits, underscores and dashes
8) “mkyong@%*.com” – the domain allows only letters and digits
9) “mkyong..2002@gmail.com” – double dots “..” are not allowed
10) “mkyong.@gmail.com” – the part before the “@” cannot end with a dot “.”
11) “mkyong@mkyong@gmail.com” – a second “@” is not allowed
12) “mkyong@gmail.com.1a” – the last TLD must contain only letters, so it cannot contain a digit
Unit Test – EmailValidator
package com.mkyong.regex;

import org.testng.Assert;
import org.testng.annotations.*;

/**
 * Email validator Testing
 * @author mkyong
 *
 */
public class EmailValidatorTest {

    private EmailValidator emailValidator;

    @BeforeClass
    public void initData() {
        emailValidator = new EmailValidator();
    }

    @DataProvider
    public Object[][] ValidEmailProvider() {
        return new Object[][]{
            {new String[] {
                "mkyong@yahoo.com", "mkyong-100@yahoo.com", "mkyong.100@yahoo.com",
                "mkyong111@mkyong.com", "mkyong-100@mkyong.net", "mkyong.100@mkyong.com.au",
                "mkyong@1.com", "mkyong@gmail.com.com"
            }}
        };
    }

    @DataProvider
    public Object[][] InvalidEmailProvider() {
        return new Object[][]{
            {new String[] {
                "mkyong", "mkyong@.com.my", "mkyong123@gmail.a",
                "mkyong123@.com", "mkyong123@.com.com",
                ".mkyong@mkyong.com", "mkyong()*@gmail.com", "mkyong@%*.com",
                "mkyong..2002@gmail.com", "mkyong.@gmail.com", "mkyong@mkyong@gmail.com",
                "mkyong@gmail.com.1a"
            }}
        };
    }

    @Test(dataProvider = "ValidEmailProvider")
    public void ValidEmailTest(String[] Email) {
        for (String temp : Email) {
            boolean valid = emailValidator.validate(temp);
            System.out.println("Email is valid : " + temp + " , " + valid);
            // Every address from the valid provider must match
            Assert.assertTrue(valid);
        }
    }

    @Test(dataProvider = "InvalidEmailProvider", dependsOnMethods = "ValidEmailTest")
    public void InValidEmailTest(String[] Email) {
        for (String temp : Email) {
            boolean valid = emailValidator.validate(temp);
            System.out.println("Email is valid : " + temp + " , " + valid);
            // Every address from the invalid provider must be rejected
            Assert.assertFalse(valid);
        }
    }
}
Unit Test – Result
Email is valid : mkyong@yahoo.com , true
Email is valid : mkyong-100@yahoo.com , true
Email is valid : mkyong.100@yahoo.com , true
Email is valid : mkyong111@mkyong.com , true
Email is valid : mkyong-100@mkyong.net , true
Email is valid : mkyong.100@mkyong.com.au , true
Email is valid : mkyong@1.com , true
Email is valid : mkyong@gmail.com.com , true
Email is valid : mkyong , false
Email is valid : mkyong@.com.my , false
Email is valid : mkyong123@gmail.a , false
Email is valid : mkyong123@.com , false
Email is valid : mkyong123@.com.com , false
Email is valid : .mkyong@mkyong.com , false
Email is valid : mkyong()*@gmail.com , false
Email is valid : mkyong@%*.com , false
Email is valid : mkyong..2002@gmail.com , false
Email is valid : mkyong.@gmail.com , false
Email is valid : mkyong@mkyong@gmail.com , false
Email is valid : mkyong@gmail.com.1a , false
PASSED: ValidEmailTest([Ljava.lang.String;@1a626f)
PASSED: InValidEmailTest([Ljava.lang.String;@1975b59)
===============================================
com.mkyong.regex.EmailValidatorTest
Tests run: 2, Failures: 0, Skips: 0
===============================================
===============================================
mkyong
Total tests run: 2, Failures: 0, Skips: 0
===============================================
Reference
1. http://en.wikipedia.org/wiki/E-mail_address
2. http://tools.ietf.org/html/rfc2822#section-3.4.1
Want to learn more about regular expressions? I highly recommend the best and classic book, “Mastering Regular Expressions”.



Hi everyone!
mkyong, thanks very much for this expression.
While learning to use regex, and to allow the + and - characters, I slightly modified your code.
Here is the final regex, which seems to work correctly:
^[_A-Za-z0-9-+]+(\\.[_A-Za-z0-9-+]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$
Cheers everyone,
Yashvin
From Mauritius.
Hi Yashvin,
Thanks for sharing your code.
An e-mail such as test@mky-ong.com will fail validation.
Ya, you can customize the regex easily. Just add the “-” to the character class after the “@”: [A-Za-z0-9-]+
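The two tweaks discussed in this exchange can be sketched as follows. This is only an illustration under stated assumptions: the class name RelaxedEmailValidator is hypothetical, the "+" and "-" are placed at the end of each character class so that both are read as literals, and the hyphen is allowed in every domain label except the last one (the regex posted above allows it only in the first label). Note that this sketch still accepts labels that begin or end with a hyphen, which real host names do not allow.

```java
import java.util.regex.Pattern;

// Hypothetical variant for illustration; not the article's EmailValidator.
final class RelaxedEmailValidator {

    // "+" is allowed before the "@", and "-" is allowed in the domain labels
    // before the final label. Both sit at the end of their character class
    // so they are treated as literal characters.
    static final Pattern EMAIL = Pattern.compile(
          "^[_A-Za-z0-9+-]+(\\.[_A-Za-z0-9+-]+)*"
        + "@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*(\\.[A-Za-z]{2,})$");

    static boolean isValid(String email) {
        return EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("test@mky-ong.com"));       // true
        System.out.println(isValid("mkyong+100@gmail.com"));   // true
        System.out.println(isValid("mkyong..2002@gmail.com")); // false
    }
}
```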
I found errors in the local part, the domain part and the top-level domain part:
———–
The local part is not limited to those characters. It may contain (copied and pasted from your links):
———–
* Uppercase and lowercase English letters (a-z, A-Z)
* Digits 0 to 9
* Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
* Character .
———–
Each label of the domain name is limited to 63 characters, and the whole domain name to 255!
———–
The top-level domain could be limited to {2,5} (it was {2,}), at least until the Asian IDNA-encoded top-level domains arrive in the wild…
———–
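The points in the comment above can be sketched as two separate checks. This is a rough illustration, not a full RFC parser: the class and method names are hypothetical, the local-part pattern covers only the unquoted characters listed above, and the length limits assume 63 characters per domain label and 255 for the whole domain name (the limits given in RFC 1035).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class EmailPartChecks {

    // Unquoted local-part characters from the list above; "-" is last in
    // the class so it is read as a literal.
    private static final String ATOM = "[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+";
    static final Pattern LOCAL_PART =
        Pattern.compile("^" + ATOM + "(\\." + ATOM + ")*$");

    static final int MAX_LABEL_LENGTH = 63;   // assumed limit per label
    static final int MAX_DOMAIN_LENGTH = 255; // assumed limit for the full name

    static boolean localPartOk(String localPart) {
        return LOCAL_PART.matcher(localPart).matches();
    }

    static boolean domainLengthsOk(String domain) {
        if (domain.isEmpty() || domain.length() > MAX_DOMAIN_LENGTH) {
            return false;
        }
        // The -1 limit keeps empty trailing labels, so "a.com." is rejected
        for (String label : domain.split("\\.", -1)) {
            if (label.isEmpty() || label.length() > MAX_LABEL_LENGTH) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(localPartOk("mykong+100"));    // true
        System.out.println(localPartOk("mkyong..2002"));  // false
        System.out.println(domainLengthsOk("gmail.com")); // true
    }
}
```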
You are returning false on the valid case:
mykong+100@gmail.com
Gmail parses e-mails sent to it with a + character; mykong@gmail.com will receive the above e-mail, but it will be filtered as a “100” e-mail.
This is useful functionality when submitting valid e-mails to potential spam processes; for example a business promotion. You can see if your e-mail is getting re-sold. Unfortunately for the consumer, many validators disallow the + character.
It is not possible to validate an email address with a regular expression in the general case. The only way to validate an email address is to send a mail and wait for a response.
The best you can do with a regexp is to validate that the syntax is correct as defined in RFC 2822, but the regexp that covers that is about one page long and much more complicated than your example. To make matters worse, only a few syntactically correct email addresses actually have someone or something that receives mail at them.
My advice is that if you have to check that someone gives you an email address, do not try too hard. Make do with checking for an @ sign and perhaps at least one dot in the domain (this of course forbids UUCP-style addresses), or even just that the string is long enough to contain an email address.
OK, this is a nice exercise to play around with, but do not under any circumstances use it in a real-world application!
Sorry for the rant, but I have seen too many people use these simplistic measures to validate web forms and unknowingly prevent people from contacting them because their email addresses contain “strange” characters.
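The permissive check suggested in the comment above could look roughly like the sketch below. The class name MinimalEmailCheck and the method looksLikeEmail are hypothetical names for this illustration, and the exact rule (an "@" with something before it, plus a dot inside the domain) is one reading of that advice, not a standard.

```java
// Hypothetical helper for illustration only.
final class MinimalEmailCheck {

    static boolean looksLikeEmail(String input) {
        if (input == null) {
            return false;
        }
        // Split on the last "@" so an "@" inside a quoted local part is tolerated
        int at = input.lastIndexOf('@');
        if (at <= 0) {
            return false; // no "@", or nothing in front of it
        }
        String domain = input.substring(at + 1);
        // Require a dot that is neither the first nor the last domain character
        return domain.indexOf('.') > 0
            && domain.lastIndexOf('.') < domain.length() - 1;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("mykong+100@gmail.com")); // true
        System.out.println(looksLikeEmail("mkyong"));               // false
        System.out.println(looksLikeEmail("mkyong@localhost"));     // false
    }
}
```

A check like this only filters out obvious typos. Confirming that an address really receives mail still needs a confirmation message, as the comment points out.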
Hi Stefan,
Thanks for sharing your personal experience and advice.