Domain name regular expression example

Domain Name Regular Expression Pattern


^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$

The pattern above (written as a Java string literal, so "\\." stands for the regex "\.") ensures that a domain name meets the following criteria:

  1. Each label (the part between dots) may contain only a-z, A-Z, 0-9 and the hyphen (-)
  2. Each label must be between 1 and 63 characters long
  3. The last label (the TLD) must be letters only, at least 2 and at most 6 characters long
  4. A label must not start or end with a hyphen (-) (e.g. -google.com or google-.com)
  5. The domain name can include subdomains (e.g. mkyong.blogspot.com)

Description


^                       # Start of the line
 (                      # Start of group #1
   (?!-)                # Label can't start with a hyphen
   [A-Za-z0-9-]{1,63}   # Label is [A-Za-z0-9-], between 1 and 63 characters long
   (?<!-)               # Label can't end with a hyphen
   \\.                  # Followed by a dot "."
 )+                     # End of group #1; must appear at least once, and may repeat for subdomains
 [A-Za-z]{2,6}          # TLD is [A-Za-z], between 2 and 6 characters long
$                       # End of the line

Note
This regular expression pattern should match most domain names in real-world use, but not every valid one; for example, it rejects TLDs longer than 6 characters.
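
The two look-around assertions in the breakdown above can be tried on a single label in isolation. The following is a minimal sketch, not part of the original example: the class name LabelLookaroundDemo and the method isValidLabel are hypothetical, and the pattern is just the label part of the full expression without the trailing dot and TLD.

```java
import java.util.regex.Pattern;

class LabelLookaroundDemo {

    // Label part only: negative lookahead (?!-) rejects a leading hyphen,
    // negative lookbehind (?<!-) rejects a trailing hyphen.
    private static final Pattern LABEL =
            Pattern.compile("(?!-)[A-Za-z0-9-]{1,63}(?<!-)");

    // Hypothetical helper: true if the whole string is one valid label.
    static boolean isValidLabel(String label) {
        return LABEL.matcher(label).matches();
    }

    public static void main(String[] args) {
        // Build a 64-character label to exercise the {1,63} length limit.
        String tooLong = new String(new char[64]).replace('\0', 'a');

        System.out.println(isValidLabel("mkyong-info")); // true
        System.out.println(isValidLabel("-mkyong"));     // false: leading hyphen
        System.out.println(isValidLabel("mkyong-"));     // false: trailing hyphen
        System.out.println(isValidLabel(tooLong));       // false: longer than 63
    }
}
```

A hyphen in the middle of a label is still accepted, because the look-arounds only inspect the first and last positions of the label.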

List of valid domain names

  1. www.google.com
  2. google.com
  3. mkyong123.com
  4. mkyong-info.com
  5. sub.mkyong.com
  6. sub.mkyong-info.com
  7. mkyong.com.au
  8. g.co
  9. mkyong.t.t.co

List of invalid domain names, and why.

  1. mkyong.t.t.c - TLD must be between 2 and 6 characters long
  2. mkyong,com - Comma is not allowed
  3. mkyong - No TLD
  4. mkyong.123 - Digits are not allowed in the TLD
  5. .com - Must start with [A-Za-z0-9]
  6. mkyong.com/users - Slash is not allowed; only a bare domain name is accepted, not a path
  7. -mkyong.com - Cannot begin with a hyphen -
  8. mkyong-.com - Cannot end with a hyphen -
  9. sub.-mkyong.com - A label cannot begin with a hyphen -
  10. sub.mkyong-.com - A label cannot end with a hyphen -

1. Java Regular Expression Example

A simple Java example that validates a domain name with the regular expression pattern above.

DomainUtils.java

package com.mkyong.regex;

import java.util.regex.Pattern;

public class DomainUtils {

	private static final String DOMAIN_NAME_PATTERN = "^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$";

	// Compiled once and reused; Pattern instances are immutable and thread-safe.
	private static final Pattern pDomainNameOnly = Pattern.compile(DOMAIN_NAME_PATTERN);

	// Returns true if the given string matches the domain name pattern.
	public static boolean isValidDomainName(String domainName) {
		return pDomainNameOnly.matcher(domainName).find();
	}

}
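
One detail worth checking when reusing this class is the difference between find() and matches(). In Java, "$" also matches just before a line terminator at the very end of the input, so a find()-based check accepts a name followed by a trailing newline. The sketch below compares the two calls on the same pattern string; the class DomainMatchModes and its methods viaFind and viaMatches are hypothetical names introduced only for this illustration.

```java
import java.util.regex.Pattern;

class DomainMatchModes {

    // Same pattern string as DOMAIN_NAME_PATTERN in DomainUtils.
    static final Pattern DOMAIN =
            Pattern.compile("^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$");

    // Mirrors DomainUtils: succeeds if the anchored pattern is found.
    static boolean viaFind(String s) {
        return DOMAIN.matcher(s).find();
    }

    // Hypothetical stricter variant: the entire input must match.
    static boolean viaMatches(String s) {
        return DOMAIN.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "google.com", "google.com\n", "mkyong-.com" };
        for (String s : samples) {
            // Show the newline explicitly so the output stays on one line.
            String shown = s.replace("\n", "\\n");
            System.out.println(shown + " find=" + viaFind(s) + " matches=" + viaMatches(s));
        }
    }
}
```

For inputs without a trailing line terminator, such as the test data in this article, both calls give the same answer. Whether the stricter variant is preferable depends on where the input comes from.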

2. Unit Test with jUnit

A jUnit example.

DomainUtilsTestParam.java

package com.mkyong.regex;

import static org.junit.Assert.assertEquals;
import java.util.Arrays;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(value = Parameterized.class)
public class DomainUtilsTestParam {

	private String domain;
	private boolean expected;

	public DomainUtilsTestParam(String domain, boolean expected) {
		this.domain = domain;
		this.expected = expected;
	}

	@Parameters(name = "{index}: isValid({0})={1}")
	public static Iterable<Object[]> data() {
		return Arrays.asList(new Object[][] {
			{ "www.google.com", true },
			{ "google.com", true },
			{ "mkyong123.com", true },
			{ "mkyong-info.com", true },
			{ "sub.mkyong.com", true },
			{ "sub.mkyong-info.com", true },
			{ "mkyong.com.au", true },
			{ "sub.mkyong.com", true },
			{ "sub.sub.mkyong.com", true },
			{ "g.co", true },
			{ "mkyong.t.t.co", true },
			{ "mkyong.t.t.c", false },      // TLD must be between 2 and 6 characters long
			{ "mkyong,com", false },        // comma is not allowed
			{ "mkyong", false },            // no TLD
			{ "mkyong.123", false },        // digits are not allowed in the TLD
			{ ".com", false },              // must start with [A-Za-z0-9]
			{ "mkyong.a", false },          // TLD needs at least two characters
			{ "mkyong.com/users", false },  // slash is not allowed
			{ "-mkyong.com", false },       // cannot begin with a hyphen -
			{ "mkyong-.com", false },       // cannot end with a hyphen -
			{ "sub.-mkyong.com", false },   // a label cannot begin with a hyphen -
			{ "sub.mkyong-.com", false }    // a label cannot end with a hyphen -
		});
	}

	@Test
	public void test_validDomains() {
		assertEquals(expected, DomainUtils.isValidDomainName(domain));
	}

}

Output: all tests passed.

3. Unit Test with TestNG

A TestNG example.

DomainUtilsTestParam.java

package com.mkyong.regex;

import org.testng.Assert;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;

public class DomainUtilsTestParam {

	@DataProvider
	public Object[][] ValidDomainNameProvider() {
		return new Object[][] { { new String[] {
			"www.google.com", "google.com",
			"mkyong123.com", "mkyong-info.com",
			"sub.mkyong.com", "sub.mkyong-info.com",
			"mkyong.com.au", "sub.mkyong.com",
			"sub.sub.mkyong.com", "g.co", "mkyong.t.t.co"
		} } };
	}

	@DataProvider
	public Object[][] InvalidDomainNameProvider() {
		return new Object[][] { { new String[] {
			"mkyong.t.t.c", "mkyong,com",
			"mkyong", "mkyong.123",
			".com", "mkyong.a",
			"mkyong.com/users", "-mkyong.com",
			"mkyong-.com", ".com", "sub.-mkyong.com", "sub.mkyong-.com"
		} } };
	}
	
	@Test(dataProvider = "ValidDomainNameProvider")
	public void ValidDomainNameTest(String[] domainName) {

		for (String temp : domainName) {
			boolean valid = DomainUtils.isValidDomainName(temp);
			System.out.println("Valid domain name : " + temp);
			Assert.assertEquals(valid, true);
		}

	}

	@Test(dataProvider = "InvalidDomainNameProvider", 
              dependsOnMethods = "ValidDomainNameTest")
	public void InValidDomainNameTest(String[] domainName) {

		for (String temp : domainName) {
			boolean valid = DomainUtils.isValidDomainName(temp);
			System.out.println("Invalid domain name : " + temp);
			Assert.assertEquals(valid, false);
		}
	}

}

Result


Valid domain name : www.google.com
Valid domain name : google.com
Valid domain name : mkyong123.com
Valid domain name : mkyong-info.com
Valid domain name : sub.mkyong.com
Valid domain name : sub.mkyong-info.com
Valid domain name : mkyong.com.au
Valid domain name : sub.mkyong.com
Valid domain name : sub.sub.mkyong.com
Valid domain name : g.co
Valid domain name : mkyong.t.t.co
Invalid domain name : mkyong.t.t.c
Invalid domain name : mkyong,com
Invalid domain name : mkyong
Invalid domain name : mkyong.123
Invalid domain name : .com
Invalid domain name : mkyong.a
Invalid domain name : mkyong.com/users
Invalid domain name : -mkyong.com
Invalid domain name : mkyong-.com
Invalid domain name : .com
Invalid domain name : sub.-mkyong.com
Invalid domain name : sub.mkyong-.com
PASSED: ValidDomainNameTest([Ljava.lang.String;@4661e987)
PASSED: InValidDomainNameTest([Ljava.lang.String;@117b8cf0)

===============================================
    Default test
    Tests run: 2, Failures: 0, Skips: 0
===============================================

About the Author

mkyong
Founder of Mkyong.com, loves Java and open source.

Comments

Dominick

Looks like you have an extra space in your Description breakdown (not the code itself). Note the extra space between the ! and the -

(?! -) #Can’t start with a hyphen

Chirag Katudia

The above regex, ^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\.)+[A-Za-z]{2,6}$, doesn't work for TLDs longer than 6 characters (which Microsoft AD supports), and it also won't work for TLDs that contain digits.

e.g. abcxyz.internal123

A better regex is: ^(?:[a-zA-Z0-9]+(?:\-*[a-zA-Z0-9])*\.)+[a-zA-Z0-9]{2,63}$

Here is the full list of available TLDs (note especially the XN-- entries):
http://data.iana.org/TLD/tlds-alpha-by-domain.txt
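
The claim in this comment can be checked with a short sketch that runs both expressions over the same inputs. This is only an illustration under stated assumptions: the class name TldPatternComparison and the method names article and proposed are hypothetical, and both patterns are copied as written in the article and in the comment.

```java
import java.util.regex.Pattern;

class TldPatternComparison {

    // Pattern from the article (TLD: letters only, 2 to 6 characters).
    static final Pattern ARTICLE =
            Pattern.compile("^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$");

    // Pattern proposed in the comment (TLD: letters or digits, 2 to 63 characters).
    static final Pattern PROPOSED =
            Pattern.compile("^(?:[a-zA-Z0-9]+(?:\\-*[a-zA-Z0-9])*\\.)+[a-zA-Z0-9]{2,63}$");

    static boolean article(String s) {
        return ARTICLE.matcher(s).matches();
    }

    static boolean proposed(String s) {
        return PROPOSED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "google.com", "abcxyz.internal123", "mkyong.123", "-mkyong.com" };
        for (String s : samples) {
            System.out.println(s + " article=" + article(s) + " proposed=" + proposed(s));
        }
    }
}
```

The proposed pattern does accept abcxyz.internal123, but it also accepts an all-digit TLD such as mkyong.123, which the article lists as invalid, and it no longer limits a label to 63 characters. Which behaviour is wanted depends on the use case.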

ed

This definition is wrong.

TLDs can be longer than 6 characters (.construction as one of many examples). Go ahead – dig it….

Letter-Digit-Hyphen is a convention for host names (which are a subset of domain names). That causes confusion. Strictly speaking a domain name can have any 8-bit value in any octet. (see RFC 1034/1035, not 1123.)

AM

Are you trying to say that a domain name could start with a hyphen?

mkyong

RegEx is updated to filter out the start and end of hyphen.

Huq

Hello MyKong. Regexp is one of my favourite topics, but I know nothing about Java. I wanted to compile and try your code and share the joy.

Could you possibly write a few lines explaining how to compile DomainUtilsTestParam.java and then test that it works?

Kind regards

Philip Tellis

Does this work with international domain names, i.e. domain names that contain, say, European or Arabic characters?
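
One way to explore this question is to convert the name to its ASCII (Punycode) form with the standard java.net.IDN class before applying the article's pattern. The sketch below is an assumption-laden illustration, not the author's answer: the class IdnDomainCheck and its methods isValidRaw and isValidAfterPunycode are hypothetical names.

```java
import java.net.IDN;
import java.util.regex.Pattern;

class IdnDomainCheck {

    // Pattern from the article, unchanged.
    static final Pattern DOMAIN =
            Pattern.compile("^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$");

    // Applies the pattern to the name exactly as given.
    static boolean isValidRaw(String name) {
        return DOMAIN.matcher(name).find();
    }

    // Hypothetical helper: converts to the ASCII form first, then applies the pattern.
    static boolean isValidAfterPunycode(String name) {
        try {
            return DOMAIN.matcher(IDN.toASCII(name)).find();
        } catch (IllegalArgumentException e) {
            // IDN.toASCII rejects input that cannot be converted.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "b\u00fccher.de"; // "bücher.de", written with a Unicode escape
        System.out.println(IDN.toASCII(name));          // xn--bcher-kva.de
        System.out.println(isValidRaw(name));           // false: ü is outside [A-Za-z0-9-]
        System.out.println(isValidAfterPunycode(name)); // true
    }
}
```

This only helps when the TLD itself is plain ASCII letters. An internationalized TLD in its ASCII form (for example xn--p1ai) contains digits and hyphens and is longer than 6 characters, so the [A-Za-z]{2,6} part of the pattern still rejects it.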