How to validate an HTML tag with a regular expression

HTML tag Regular Expression Pattern

<("[^"]*"|'[^']*'|[^'">])*>

Description

<		# start with an opening bracket "<"
 (		#   start of group #1
   "[^"]*"	#	a string enclosed in double quotes - "string"
   |		#	..or
   '[^']*'	#	a string enclosed in single quotes - 'string'
   |		#	..or
   [^'">]	#	any single character except a single quote, a double quote or ">"
 )		#   end of group #1
 *		# group #1 repeated 0 or more times
>		# end with a closing bracket ">"

An HTML tag starts with an opening bracket "<", followed by any number of double-quoted strings ("string"), single-quoted strings ('string'), or other characters that are neither a quote nor ">". An unmatched double quote ("string), an unmatched single quote ('string), or a ">" that is not enclosed in quotes is not allowed. Finally, the tag ends with a closing bracket ">".
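
The commented breakdown above can also live inside the Java source itself. The following is a minimal sketch, assuming the standard Pattern.COMMENTS flag, which makes the regex engine ignore whitespace and "#" comments in the pattern. The class name VerboseHtmlTagPattern is hypothetical and not part of the original example.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
final class VerboseHtmlTagPattern {

    // Same pattern as above. With Pattern.COMMENTS, whitespace and
    // everything from "#" to the end of the line are ignored.
    static final Pattern HTML_TAG = Pattern.compile(
          "<               # opening bracket\n"
        + "(               # start of group 1\n"
        + "   \"[^\"]*\"   # a complete double-quoted string\n"
        + " | '[^']*'      # or a complete single-quoted string\n"
        + " | [^'\">]      # or one character that is not a quote or the closing bracket\n"
        + ")*              # group 1 repeated zero or more times\n"
        + ">               # closing bracket\n",
        Pattern.COMMENTS);

    public static void main(String[] args) {
        // A closing bracket inside quotes is accepted.
        System.out.println(HTML_TAG.matcher("<input value='>'>").matches()); // true
        // A closing bracket outside quotes ends the tag too early.
        System.out.println(HTML_TAG.matcher("<input value=> >").matches());  // false
    }
}
```

The flag only changes how the pattern text is read; spaces in the input string are still matched as ordinary characters.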

Java Regular Expression Example

package com.mkyong.regex;
 
import java.util.regex.Matcher;
import java.util.regex.Pattern;
 
public class HTMLTagValidator {

    private static final String HTML_TAG_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

    // Pattern is immutable and thread-safe, so it is compiled once and shared.
    private static final Pattern PATTERN = Pattern.compile(HTML_TAG_PATTERN);

    /**
     * Validate an HTML tag with a regular expression.
     * @param tag HTML tag to validate
     * @return true if the tag is valid, false otherwise
     */
    public boolean validate(final String tag) {
        // Matcher is not thread-safe, so a new one is created per call
        // instead of being kept in a field.
        Matcher matcher = PATTERN.matcher(tag);
        // matches() succeeds only if the whole input matches the pattern.
        return matcher.matches();
    }
}
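
The validator above calls matches(), so the whole input must be a single tag. As a rough sketch of how the same pattern could instead locate tags inside longer text, the example below uses Matcher.find(). The class HtmlTagFinder and its findTags method are hypothetical helpers, not part of the original tutorial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, shown only to illustrate find() versus matches().
final class HtmlTagFinder {

    // Same regular expression as HTML_TAG_PATTERN above.
    private static final Pattern HTML_TAG =
            Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    // Collects every substring of the text that the pattern accepts as a tag.
    static List<String> findTags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = HTML_TAG.matcher(text);
        while (matcher.find()) {        // find() scans forward to the next match
            tags.add(matcher.group());  // group() returns the matched text
        }
        return tags;
    }

    public static void main(String[] args) {
        // The ">" inside the quoted attribute value does not end the first tag.
        System.out.println(findTags("<p class=\"a>b\">Hi <b>there</b></p>"));
        // prints: [<p class="a>b">, <b>, </b>, </p>]
    }
}
```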

HTML tags that match:

1. “<b>” , “<input value='>'>”
2. “<input value='<'>” , “<b/>”
3. “<a href='http://www.google.com'>”
4. “<br>” , “<br/>”
5. “<input value="" id='test'>” , “<input value='' id='test'>”

HTML tags that don't match:

1. “<input value=" id='test'>” – an unmatched double quote is not allowed
2. “<input value=' id='test'>” – an unmatched single quote is not allowed
3. “<input value=> >” – a bare closing bracket > is not allowed; it has to be enclosed in single or double quotes
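
Note that the pattern only checks how brackets and quotes are balanced. It does not look at tag or attribute names. The sketch below probes a few edge cases to show this; the class HtmlTagPatternLimits and its isTag method are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical class, used only to probe edge cases of the pattern.
final class HtmlTagPatternLimits {

    // Same regular expression as in the tutorial.
    private static final Pattern HTML_TAG =
            Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    // True if the whole input is accepted by the pattern.
    static boolean isTag(String input) {
        return HTML_TAG.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTag("<>"));     // true: group 1 may repeat zero times
        System.out.println(isTag("<123>"));  // true: tag names are not checked
        System.out.println(isTag("<b><i>")); // false: two tags, not one
        System.out.println(isTag("<b"));     // false: closing bracket is missing
    }
}
```

If stricter checks are needed, such as a valid tag name, they would have to be added on top of this pattern.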

Unit Test – HTMLTagValidatorTest

package com.mkyong.regex;
 
import org.testng.Assert;
import org.testng.annotations.*;
 
/**
 * HTMLTag validator Testing
 * @author mkyong
 *
 */
public class HTMLTagValidatorTest {
 
    private HTMLTagValidator htmlTagValidator;

    @BeforeClass
    public void initData() {
        htmlTagValidator = new HTMLTagValidator();
    }

    // Tags the pattern is expected to accept.
    @DataProvider
    public Object[][] ValidHTMLTagProvider() {
        return new Object[][] {
            new Object[] {"<b>"},
            new Object[] {"<input value='>'>"},
            new Object[] {"<input value='<'>"},
            new Object[] {"<b/>"},
            new Object[] {"<a href='http://www.google.com'>"},
            new Object[] {"<br>"},
            new Object[] {"<br/>"},
            new Object[] {"<input value=\"\" id='test'>"},
            new Object[] {"<input value='' id='test'>"}
        };
    }

    // Tags the pattern is expected to reject.
    @DataProvider
    public Object[][] InvalidHTMLTagProvider() {
        return new Object[][] {
            new Object[] {"<input value=\" id='test'>"},
            new Object[] {"<input value=' id='test'>"},
            new Object[] {"<input value=> >"}
        };
    }

    @Test(dataProvider = "ValidHTMLTagProvider")
    public void ValidHTMLTagTest(String tag) {
        boolean valid = htmlTagValidator.validate(tag);
        System.out.println("HTMLTag is valid : " + tag + " , " + valid);
        Assert.assertTrue(valid);
    }

    @Test(dataProvider = "InvalidHTMLTagProvider",
          dependsOnMethods = "ValidHTMLTagTest")
    public void InValidHTMLTagTest(String tag) {
        boolean valid = htmlTagValidator.validate(tag);
        System.out.println("HTMLTag is valid : " + tag + " , " + valid);
        Assert.assertFalse(valid);
    }
}

Unit Test – Result

HTMLTag is valid : <b> , true
HTMLTag is valid : <input value='>'> , true
HTMLTag is valid : <input value='<'> , true
HTMLTag is valid : <b/> , true
HTMLTag is valid : <a href='http://www.google.com'> , true
HTMLTag is valid : <br> , true
HTMLTag is valid : <br/> , true
HTMLTag is valid : <input value="" id='test'> , true
HTMLTag is valid : <input value='' id='test'> , true
HTMLTag is valid : <input value=" id='test'> , false
HTMLTag is valid : <input value=' id='test'> , false
HTMLTag is valid : <input value=> > , false
PASSED: ValidHTMLTagTest("<b>")
PASSED: ValidHTMLTagTest("<input value='>'>")
PASSED: ValidHTMLTagTest("<input value='<'>")
PASSED: ValidHTMLTagTest("<b/>")
PASSED: ValidHTMLTagTest("<a href='http://www.google.com'>")
PASSED: ValidHTMLTagTest("<br>")
PASSED: ValidHTMLTagTest("<br/>")
PASSED: ValidHTMLTagTest("<input value="" id='test'>")
PASSED: ValidHTMLTagTest("<input value='' id='test'>")
PASSED: InValidHTMLTagTest("<input value=" id='test'>")
PASSED: InValidHTMLTagTest("<input value=' id='test'>")
PASSED: InValidHTMLTagTest("<input value=> >")
 
===============================================
    com.mkyong.regex.HTMLTagValidatorTest
    Tests run: 12, Failures: 0, Skips: 0
===============================================
 
 
===============================================
mkyong
Total tests run: 12, Failures: 0, Skips: 0
===============================================

Want to learn more about regular expressions? I highly recommend the classic book “Mastering Regular Expressions”.




About the Author

mkyong