How to validate an email address with a regular expression
Email Regular Expression Pattern
^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$
Description
^                      # start of the line
[_A-Za-z0-9-]+         # must start with one or more (+) characters from the bracket [ ]
(                      # start of group #1
  \\.[_A-Za-z0-9-]+    # followed by a dot "." and one or more (+) characters from the bracket [ ]
)*                     # end of group #1, this group is optional (*)
@                      # must contain an "@" symbol
[A-Za-z0-9]+           # followed by one or more (+) characters from the bracket [ ]
(                      # start of group #2 - first level TLD checking
  \\.[A-Za-z0-9]+      # followed by a dot "." and one or more (+) characters from the bracket [ ]
)*                     # end of group #2, this group is optional (*)
(                      # start of group #3 - second level TLD checking
  \\.[A-Za-z]{2,}      # followed by a dot "." and at least 2 letters from the bracket [ ]
)                      # end of group #3
$                      # end of the line
Taken together, the email address must start with one or more characters from “_A-Za-z0-9-”, optionally followed by groups of “.[_A-Za-z0-9-]+”, and then an “@” symbol. The domain name must start with one or more characters from “A-Za-z0-9”, optionally followed by further dot-separated labels “\\.[A-Za-z0-9]+” (group #2, e.g. the .com in .com.au or .com.my), and must end with a final label “\\.[A-Za-z]{2,}” (group #3, e.g. .com, .net, or the .au in .com.au), which starts with a dot “.” followed by at least 2 letters.
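As a rough illustration of the description above, the pattern can be assembled from named pieces in Java. The class name EmailPatternParts and the constant names are made up for this sketch. The assembled string is intended to be the same as the pattern shown at the top (in Java string form, with doubled backslashes).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; names are not from the article.
final class EmailPatternParts {

    // One or more letters, digits, underscores or dashes
    static final String LOCAL_ATOM = "[_A-Za-z0-9-]+";

    // Local part: an atom, optionally followed by ".atom" groups (group #1)
    static final String LOCAL_PART = LOCAL_ATOM + "(\\." + LOCAL_ATOM + ")*";

    // A domain label: letters and digits only
    static final String DOMAIN_LABEL = "[A-Za-z0-9]+";

    // Final label (group #3): a dot followed by at least two letters
    static final String FINAL_LABEL = "\\.[A-Za-z]{2,}";

    static final String EMAIL =
        "^" + LOCAL_PART + "@"
        + DOMAIN_LABEL + "(\\." + DOMAIN_LABEL + ")*"   // group #2 is optional
        + "(" + FINAL_LABEL + ")$";

    static final Pattern EMAIL_PATTERN = Pattern.compile(EMAIL);

    static boolean matches(String candidate) {
        return EMAIL_PATTERN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(EMAIL);
        System.out.println(matches("mkyong.100@mkyong.com.au")); // true
        System.out.println(matches("mkyong123@gmail.a"));        // false
    }
}
```

Splitting the pattern this way makes it easier to see which piece to change when a rule needs loosening, for example allowing "-" in DOMAIN_LABEL.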
Java Regular Expression Example
Here’s a Java example to show the use of regex to validate an email address.
package com.mkyong.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailValidator {

    private static final String EMAIL_PATTERN =
        "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@"
        + "[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$";

    // Compiled once in the constructor and reused for every call
    private final Pattern pattern;

    public EmailValidator() {
        pattern = Pattern.compile(EMAIL_PATTERN);
    }

    /**
     * Validate an email address with the regular expression
     * @param email email address to validate
     * @return true if the email is valid, false otherwise
     */
    public boolean validate(final String email) {
        // A local Matcher keeps validate() safe to call from several threads
        Matcher matcher = pattern.matcher(email);
        return matcher.matches();
    }
}
Emails that match:
1. “mkyong@yahoo.com”, “mkyong-100@yahoo.com”, “mkyong.100@yahoo.com”
2. “mkyong111@mkyong.com”, “mkyong-100@mkyong.net”, “mkyong.100@mkyong.com.au”
3. “mkyong@1.com”, “mkyong@gmail.com.com”
Emails that don’t match:
1. “mkyong” – must contain an “@” symbol
2. “mkyong@.com.my” – the domain cannot start with a dot “.”
3. “mkyong123@gmail.a” – “.a” is not a valid TLD, the last TLD must contain at least two characters
4. “mkyong123@.com” – the domain cannot start with a dot “.”
5. “mkyong123@.com.com” – the domain cannot start with a dot “.”
6. “.mkyong@mkyong.com” – the email’s first character cannot be a dot “.”
7. “mkyong()*@gmail.com” – the part before the “@” allows only letters, digits, underscores and dashes, separated by single dots
8. “mkyong@%*.com” – the domain allows only letters and digits
9. “mkyong..2002@gmail.com” – double dots “..” are not allowed
10. “mkyong.@gmail.com” – the part before the “@” cannot end with a dot “.”
11. “mkyong@mkyong@gmail.com” – a double “@” is not allowed
12. “mkyong@gmail.com.1a” – the last TLD must contain letters only, no digits
Unit Test
Here’s the unit test for the above email validator.
package com.mkyong.regex;

import org.testng.Assert;
import org.testng.annotations.*;

/**
 * Email validator Testing
 * @author mkyong
 *
 */
public class EmailValidatorTest {

    private EmailValidator emailValidator;

    @BeforeClass
    public void initData() {
        emailValidator = new EmailValidator();
    }

    @DataProvider
    public Object[][] ValidEmailProvider() {
        return new Object[][] {
            { new String[] {
                "mkyong@yahoo.com", "mkyong-100@yahoo.com", "mkyong.100@yahoo.com",
                "mkyong111@mkyong.com", "mkyong-100@mkyong.net", "mkyong.100@mkyong.com.au",
                "mkyong@1.com", "mkyong@gmail.com.com"
            } }
        };
    }

    @DataProvider
    public Object[][] InvalidEmailProvider() {
        return new Object[][] {
            { new String[] {
                "mkyong", "mkyong@.com.my", "mkyong123@gmail.a",
                "mkyong123@.com", "mkyong123@.com.com",
                ".mkyong@mkyong.com", "mkyong()*@gmail.com",
                "mkyong@%*.com", "mkyong..2002@gmail.com",
                "mkyong.@gmail.com", "mkyong@mkyong@gmail.com",
                "mkyong@gmail.com.1a"
            } }
        };
    }

    @Test(dataProvider = "ValidEmailProvider")
    public void ValidEmailTest(String[] Email) {
        for (String temp : Email) {
            boolean valid = emailValidator.validate(temp);
            System.out.println("Email is valid : " + temp + " , " + valid);
            // TestNG argument order is (actual, expected)
            Assert.assertEquals(valid, true);
        }
    }

    @Test(dataProvider = "InvalidEmailProvider", dependsOnMethods = "ValidEmailTest")
    public void InValidEmailTest(String[] Email) {
        for (String temp : Email) {
            boolean valid = emailValidator.validate(temp);
            System.out.println("Email is valid : " + temp + " , " + valid);
            Assert.assertEquals(valid, false);
        }
    }
}
Unit Test – Result
Here’s the unit test result.
Email is valid : mkyong@yahoo.com , true
Email is valid : mkyong-100@yahoo.com , true
Email is valid : mkyong.100@yahoo.com , true
Email is valid : mkyong111@mkyong.com , true
Email is valid : mkyong-100@mkyong.net , true
Email is valid : mkyong.100@mkyong.com.au , true
Email is valid : mkyong@1.com , true
Email is valid : mkyong@gmail.com.com , true
Email is valid : mkyong , false
Email is valid : mkyong@.com.my , false
Email is valid : mkyong123@gmail.a , false
Email is valid : mkyong123@.com , false
Email is valid : mkyong123@.com.com , false
Email is valid : .mkyong@mkyong.com , false
Email is valid : mkyong()*@gmail.com , false
Email is valid : mkyong@%*.com , false
Email is valid : mkyong..2002@gmail.com , false
Email is valid : mkyong.@gmail.com , false
Email is valid : mkyong@mkyong@gmail.com , false
Email is valid : mkyong@gmail.com.1a , false
PASSED: ValidEmailTest([Ljava.lang.String;@1a626f)
PASSED: InValidEmailTest([Ljava.lang.String;@1975b59)
===============================================
com.mkyong.regex.EmailValidatorTest
Tests run: 2, Failures: 0, Skips: 0
===============================================
===============================================
mkyong
Total tests run: 2, Failures: 0, Skips: 0
===============================================
According to http://tools.ietf.org/html/rfc3696#page-5 the following characters are legal in the local part of an E-Mail address:
! # $ % & ' * + - / = ? ^ _ ` . { | } ~
So the resulting regex would be:
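The regex from this comment is missing here. Purely as a sketch, and not the commenter's original, a pattern that admits the characters listed above in the local part might look like the following. The class name Rfc3696LocalPart is made up, and two things are assumptions: the dot is still only allowed between non-empty pieces (as in the article's pattern), and the domain part is copied unchanged from the article.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the regex the commenter posted.
final class Rfc3696LocalPart {

    // Letters, digits and ! # $ % & ' * + / = ? ^ _ ` { | } ~ -
    // The dash is placed last so it is read as a literal character.
    static final String ATOM = "[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+";

    static final Pattern EMAIL = Pattern.compile(
        "^" + ATOM + "(\\." + ATOM + ")*@"
        + "[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$");

    static boolean matches(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("user+mailbox@example.com"));  // true
        System.out.println(matches("mkyong()*@gmail.com"));       // false, ( and ) are not in the list
    }
}
```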
I need the code changed so that it does not accept repeated parts such as .com.com.
Please help me.
Hi,
This validation works fine, but it does not reject an address like this one:
Example: abc@software.co.zasdsjjh
Can anybody please give a solution for this?
Thanks
Sireesha
Thanks a lot….
Great stuff, thanks! Really helped me.
Turns out this somehow ended up being contributed to Struts. You might be interested in seeing how it turned out; https://issues.apache.org/jira/browse/WW-2805
The final version we use, which covers all the emails we have received so far, is:
"^[_A-Za-z0-9-+]+(\\.[_A-Za-z0-9-+]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*(\\.[A-Za-z]{2,})$"
If anybody receives emails that this regexp does not match, please post them in this thread.
This regexp leaves out a character that I need in my regexp/validator, namely &
Is there any way to use this in the regexp?
What do you mean by “leaves out a character”?
BTW, I use a simple caching class for efficiency, so that patterns are compiled only once:
Usage:
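The caching class and its usage did not survive in this comment. A minimal sketch of what such a cache could look like follows; the class name PatternCache and the method get are made up for illustration and are not the commenter's code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.regex.Pattern;

// Hypothetical helper: compiles each regex once and reuses the Pattern afterwards.
final class PatternCache {

    private static final ConcurrentMap<String, Pattern> CACHE = new ConcurrentHashMap<>();

    private PatternCache() {
        // static helper, no instances
    }

    // Returns the compiled Pattern for the regex, compiling it on first use only.
    static Pattern get(String regex) {
        return CACHE.computeIfAbsent(regex, Pattern::compile);
    }

    public static void main(String[] args) {
        String regex = "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@"
            + "[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$";
        Pattern first = PatternCache.get(regex);
        Pattern second = PatternCache.get(regex);
        System.out.println(first == second);                             // true, same instance
        System.out.println(first.matcher("mkyong@yahoo.com").matches()); // true
    }
}
```

Pattern instances are immutable and safe to share between threads, so caching them like this is harmless. Matcher instances are not, so each call should create its own.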
Lots:
user+mailbox@example.com
customer/department=shipping@example.com
$A12345@example.com
!def!xyz%abc@example.com
all legal according to http://tools.ietf.org/html/rfc3696#page-5
Although java.util.regex is fairly simple, I am still struggling to make it work, but having read through this article along with the code in the comment section, I will give it another go tonight… Thanks for the code, MKYong.
I think this would help make the code shorter and easier… valid and invalid emails from a *.csv file are stored in two separate files.
Thanks for your invaluable sharing
Hi everyone!
mkyong, thanks very much for this expression.
While learning to use regex, I slightly modified your code to allow the + and - characters.
Here is the final regex, which seems to work correctly:
^[_A-Za-z0-9-+]+(\\.[_A-Za-z0-9-+]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$
Cheers everyone,
Yashvin
From Mauritius.
Hi Yashvin,
Thanks for sharing your code
This one worked really well and catered for Gmail ‘plussed’ addresses, e.g. blah+test1@gmail.com. I have, however, added:
: an additional - to cater for @nz.blah-stuff.com (the “nz.” made yours fail)
: a limit of 4 at the end, since otherwise @blah.asdfasdf was valid, which makes no sense. Make it 5 or 6 if you like, but it just adds a little more validation.
[_A-Za-z0-9-+]+(\\.[_A-Za-z0-9-+]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*(\\.[A-Za-z]{2,4})
Thanks a heap for this one!!!!
Note: I tested this on http://www.cis.upenn.edu/~matuszek/General/RegexTester/regex-tester.html (however I had to change the double \\ to a single \ ).
Simon
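Simon's pattern can be tried out in Java with a sketch like the one below (the class name SimonPatternDemo is made up here). Matcher.matches() requires the whole input to match, so the missing ^ and $ do not matter in this setup. One trade-off of {2,4} is that it also rejects real TLDs longer than four letters, such as .museum.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the pattern posted in the comment above.
final class SimonPatternDemo {

    static final Pattern EMAIL = Pattern.compile(
        "[_A-Za-z0-9-+]+(\\.[_A-Za-z0-9-+]+)*@"
        + "[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*(\\.[A-Za-z]{2,4})");

    static boolean matches(String candidate) {
        // matches() anchors the pattern to the whole input
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("blah+test1@gmail.com"));    // true
        System.out.println(matches("user@nz.blah-stuff.com"));  // true
        System.out.println(matches("user@blah.asdfasdf"));      // false
        System.out.println(matches("user@example.museum"));     // false, TLD longer than 4
    }
}
```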
An e-mail such as test@mky-ong.com will fail validation.
Yes, you can customize the regex easily: just add a “-” to the group after the @… [A-Za-z0-9-]+
I found errors in the local, domain and top level part:
———–
The local part is not limited to characters (c&p from your links):
———–
* Uppercase and lowercase English letters (a-z, A-Z)
* Digits 0 to 9
* Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
* Character .
———–
Each domain label is limited to 63 characters, and the whole domain name to 255!
———–
The top-level domain can be limited to {2,5} (was {2,}) unless the Asian IDNA-encoded top-level domains have arrived in the wild…
———–
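The length limits mentioned in this comment are awkward to express inside the regex itself, so one option is a separate check on the domain after the pattern has matched. The sketch below assumes the DNS limits of 63 characters per label and 255 for the whole name. The class and method names are made up for illustration.

```java
// Hypothetical helper; checks only lengths, not which characters are used.
final class DomainLengthCheck {

    static final int MAX_LABEL_LENGTH = 63;
    static final int MAX_DOMAIN_LENGTH = 255;

    static boolean hasValidLengths(String domain) {
        if (domain == null || domain.isEmpty() || domain.length() > MAX_DOMAIN_LENGTH) {
            return false;
        }
        // The limit of -1 keeps empty trailing pieces, so "example.com." is caught too
        for (String label : domain.split("\\.", -1)) {
            if (label.isEmpty() || label.length() > MAX_LABEL_LENGTH) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasValidLengths("mkyong.com.au")); // true
        System.out.println(hasValidLengths("a..b"));          // false, empty label
    }
}
```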
You are returning false on the valid case:
mykong+100@gmail.com
Gmail parses e-mails sent to it with a + character; mykong@gmail.com will receive the above e-mail, but it will be filtered as a “100” e-mail.
This is useful functionality when submitting valid e-mails to potential spam processes; for example a business promotion. You can see if your e-mail is getting re-sold. Unfortunately for the consumer, many validators disallow the + character.
It is not possible to validate an email address with a regular expression in the general case. The only way to validate an email address is to send a mail and wait for a response.
The best you can do with a regexp is to validate that the syntax is correct as defined in RFC 2822, but the regexp that covers that is about one page long and much more complicated than your example. To make matters worse, only a few syntactically correct email addresses actually have someone or something that receives mail on them.
My advice is that if you have to check that someone gives you an email address, do not try too hard. Make do with checking for an @ sign and perhaps at least one dot in the domain (this of course forbids UUCP-style addresses), or even just that the string is long enough to contain an email address.
OK, this is a nice exercise to play around with, but do not under any circumstances use it in a real-world application!
Sorry for the rant, but I have seen too many people use these simplistic measures to validate web forms and unknowingly prevent people from contacting them because their email addresses contain “strange” characters.
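The lenient check Stefan describes (an @ sign plus at least one dot in the domain) could be sketched roughly as follows. The class and method names are made up, and treating the last @ as the separator is an assumption of this sketch. As noted in the comment, it rejects addresses whose domain has no dot.

```java
// Hypothetical sketch of a deliberately lenient syntax check.
final class LenientEmailCheck {

    static boolean looksLikeEmail(String input) {
        if (input == null) {
            return false;
        }
        // Split at the last @; anything before it is accepted as the local part
        int at = input.lastIndexOf('@');
        if (at <= 0 || at == input.length() - 1) {
            return false; // no @, nothing before it, or nothing after it
        }
        String domain = input.substring(at + 1);
        int firstDot = domain.indexOf('.');
        int lastDot = domain.lastIndexOf('.');
        // At least one dot, and the domain neither starts nor ends with one
        return firstDot > 0 && lastDot < domain.length() - 1;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("!def!xyz%abc@example.com")); // true
        System.out.println(looksLikeEmail("mkyong"));                   // false
    }
}
```

A check like this only filters out obvious typing mistakes. Whether the address really receives mail can only be confirmed by sending a message to it.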
Hi Stefan,
Thanks for sharing your personal experience and advice.