How to validate an HTML tag with a regular expression

HTML tag Regular Expression Pattern


<("[^"]*"|'[^']*'|[^'">])*>

Description


<		# start with the opening bracket "<"
 (		#   start of group #1
   "[^"]*"	#	a complete double-quoted string - "string"
   |		#	..or
   '[^']*'	#	a complete single-quoted string - 'string'
   |		#	..or
   [^'">]	#	any single character except a single quote, a double quote or ">"
 )		#   end of group #1
 *		#   repeat group #1 zero or more times
>		# end with the closing bracket ">"

An HTML tag starts with an opening bracket "<", followed by any mix of double-quoted strings ("string"), single-quoted strings ('string') and other characters. An unpaired double quote ("string), an unpaired single quote ('string) or a ">" that is not enclosed in quotes is not allowed inside the tag. Finally, the tag ends with a closing bracket ">".
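The commented layout above can be compiled almost as written. As a minimal sketch, assuming Java's Pattern.COMMENTS flag (which ignores whitespace and "#" comments inside the pattern), the same expression can be kept readable in source code. The class name CommentedHtmlTagPattern is only for illustration and is not part of the original tutorial.

```java
import java.util.regex.Pattern;

class CommentedHtmlTagPattern {

    // Same expression as above, compiled in free-spacing mode.
    // With Pattern.COMMENTS, whitespace and everything from '#' to the
    // end of the line are ignored by the regex engine.
    static final Pattern HTML_TAG = Pattern.compile(
        "<              # opening bracket\n" +
        "(              # start of group #1\n" +
        "  \"[^\"]*\"   # a complete double-quoted string\n" +
        "  | '[^']*'    # or a complete single-quoted string\n" +
        "  | [^'\">]    # or one character that is not a quote or a closing bracket\n" +
        ")*             # end of group #1, repeated zero or more times\n" +
        ">              # closing bracket\n",
        Pattern.COMMENTS);

    public static void main(String[] args) {
        // The '>' inside the quoted value does not end the tag
        System.out.println(HTML_TAG.matcher("<a href='x>y'>").matches()); // true
        // The single quote is never closed, so the tag is rejected
        System.out.println(HTML_TAG.matcher("<a href='x>").matches());    // false
    }
}
```

The one-line form of the pattern behaves the same way; the free-spacing form only trades compactness for readability.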

Java Regular Expression Example


package com.mkyong.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;
 
public class HTMLTagValidator{
	
   private static final String HTML_TAG_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

   // A compiled Pattern is immutable and thread-safe, so compile it once
   private final Pattern pattern;

   public HTMLTagValidator(){
	  pattern = Pattern.compile(HTML_TAG_PATTERN);
   }

  /**
   * Validate html tag with regular expression
   * @param tag html tag for validation
   * @return true if the whole string is one valid html tag, false otherwise
   */
  public boolean validate(final String tag){

	  // A Matcher is not thread-safe, so keep it local instead of in a field.
	  // matches() succeeds only if the entire input matches the pattern.
	  Matcher matcher = pattern.matcher(tag);
	  return matcher.matches();

  }
}

HTML tags that match:

1. “<b>” , “<input value='>'>”
2. “<input value='<'>” , “<b/>”
3. “<a href='http://www.google.com'>”
4. “<br>” , “<br/>”
5. “<input value="" id='test'>” , “<input value='' id='test'>”

HTML tags that don't match:

1. “<input value=" id='test'>” – an unpaired double quote is not allowed
2. “<input value=' id='test'>” – an unpaired single quote is not allowed
3. “<input value=> >” – an unquoted ">" inside the tag is not allowed; it has to be enclosed in single or double quotes
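Note that the validator calls matches(), which succeeds only when the entire input string is a single tag, so a string such as "Hello <b>world</b>" returns false. As a rough sketch, assuming the goal is to locate tags embedded in longer text rather than to validate one tag, Matcher.find() can be used with the same pattern. The class HtmlTagFinder and the method findTags are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlTagFinder {

    // Same pattern as in HTMLTagValidator
    private static final Pattern HTML_TAG =
        Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    // Collects every substring of text that matches the tag pattern
    static List<String> findTags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = HTML_TAG.matcher(text);
        while (matcher.find()) {      // find() scans for the next match
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String text = "Hello <b>world</b>, see <a href='http://www.google.com'>this</a>";

        // matches() needs the whole string to be one tag
        System.out.println(HTML_TAG.matcher(text).matches()); // false

        // find() picks out each tag inside the text
        System.out.println(findTags(text));
        // [<b>, </b>, <a href='http://www.google.com'>, </a>]
    }
}
```

The pattern does not check tag names, so with find() it also accepts any "<...>" span, for example "< b and c >" in the text "a < b and c > d". Treat the result as a list of candidates, not as proof of well-formed HTML.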

Unit Test – HTMLTagValidatorTest


package com.mkyong.regex;

import org.testng.Assert;
import org.testng.annotations.*;
 
/**
 * HTMLTag validator Testing
 * @author mkyong
 *
 */
public class HTMLTagValidatorTest {
 
	private HTMLTagValidator htmlTagValidator;
    
	@BeforeClass
        public void initData(){
		htmlTagValidator = new HTMLTagValidator();
        }
	
	@DataProvider
	public Object[][] ValidHTMLTagProvider() {
    	   return new Object[][]{
		   new Object[] {"<b>"}, 
                   new Object[] {"<input value='>'>"},
		   new Object[] {"<input value='<'>"}, 
		   new Object[] {"<b/>"},
                   new Object[] {"<a href='http://www.google.com'>"},
		   new Object[] {"<br>"},
                   new Object[] {"<br/>"},
		   new Object[] {"<input value=\"\" id='test'>"},
                   new Object[] {"<input value='' id='test'>"}
	   };
	}
	
	@DataProvider
	public Object[][] InvalidHTMLTagProvider() {
	    return new Object[][]{
		  new Object[] {"<input value=\" id='test'>"},
	  	  new Object[] {"<input value=' id='test'>"},
	  	  new Object[] {"<input value=> >"}
	    };
	}
	
	@Test(dataProvider = "ValidHTMLTagProvider")
	public void ValidHTMLTagTest(String tag) {
		
	    boolean valid = htmlTagValidator.validate(tag);
	    System.out.println("HTMLTag is valid : " + tag + " , " + valid);
	    Assert.assertTrue(valid);
	   
	}
	
	@Test(dataProvider = "InvalidHTMLTagProvider", 
                 dependsOnMethods="ValidHTMLTagTest")
	public void InValidHTMLTagTest(String tag) {
		
	   boolean valid = htmlTagValidator.validate(tag);
	   System.out.println("HTMLTag is valid : " + tag + " , " + valid);
	   Assert.assertFalse(valid);
	   
	}
}

Unit Test – Result


HTMLTag is valid : <b> , true
HTMLTag is valid : <input value='>'> , true
HTMLTag is valid : <input value='<'> , true
HTMLTag is valid : <b/> , true
HTMLTag is valid : <a href='http://www.google.com'> , true
HTMLTag is valid : <br> , true
HTMLTag is valid : <br/> , true
HTMLTag is valid : <input value="" id='test'> , true
HTMLTag is valid : <input value='' id='test'> , true
HTMLTag is valid : <input value=" id='test'> , false
HTMLTag is valid : <input value=' id='test'> , false
HTMLTag is valid : <input value=> > , false
PASSED: ValidHTMLTagTest("<b>")
PASSED: ValidHTMLTagTest("<input value='>'>")
PASSED: ValidHTMLTagTest("<input value='<'>")
PASSED: ValidHTMLTagTest("<b/>")
PASSED: ValidHTMLTagTest("<a href='http://www.google.com'>")
PASSED: ValidHTMLTagTest("<br>")
PASSED: ValidHTMLTagTest("<br/>")
PASSED: ValidHTMLTagTest("<input value="" id='test'>")
PASSED: ValidHTMLTagTest("<input value='' id='test'>")
PASSED: InValidHTMLTagTest("<input value=" id='test'>")
PASSED: InValidHTMLTagTest("<input value=' id='test'>")
PASSED: InValidHTMLTagTest("<input value=> >")

===============================================
    com.mkyong.regex.HTMLTagValidatorTest
    Tests run: 12, Failures: 0, Skips: 0
===============================================


===============================================
mkyong
Total tests run: 12, Failures: 0, Skips: 0
===============================================

Want to learn more about regular expressions? I highly recommend the classic book “Mastering Regular Expressions”.



About Author

Founder of Mkyong.com, love Java and open source stuff.

Comments

Shibu Muzhangil
8 years ago

This is not working for me. I always get false, whether the given text contains HTML tags or not.