How to validate an HTML tag with a regular expression
HTML tag Regular Expression Pattern
<("[^"]*"|'[^']*'|[^'">])*>
Description
<               # start with an opening bracket "<"
(               # start of group #1
  "[^"]*"       # allow a string enclosed in double quotes - "string"
  |             # ..or
  '[^']*'       # allow a string enclosed in single quotes - 'string'
  |             # ..or
  [^'">]        # any character except a single quote, a double quote or ">"
)               # end of group #1
*               # 0 or more times
>               # end with a closing bracket ">"

An HTML tag starts with an opening bracket "<", followed by any mix of double-quoted strings ("string"), single-quoted strings ('string') and other characters. An unpaired double quote ("string), an unpaired single quote ('string) or a closing bracket ">" outside of quotes is not allowed. Finally, the tag ends with a closing bracket ">". Note that the pattern only checks that brackets and quotes are balanced; it does not check tag or attribute names.
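The commented breakdown of the pattern can also be handed to Java almost as written. The following is only a minimal sketch, assuming the `Pattern.COMMENTS` flag, which tells the engine to ignore whitespace and `#` comments inside the pattern. The class name `CommentedHtmlTagPattern` is made up for illustration and is not part of the original example.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
public class CommentedHtmlTagPattern {

    // Same pattern as above, in free-spacing form. Every line ends with "\n"
    // because a '#' comment runs until the end of the line.
    static final Pattern HTML_TAG = Pattern.compile(
          "<              # opening bracket\n"
        + "(              # start of group #1\n"
        + "  \"[^\"]*\"   # a double-quoted string\n"
        + "  |            # or\n"
        + "  '[^']*'      # a single-quoted string\n"
        + "  |            # or\n"
        + "  [^'\">]      # any character except a quote or a closing bracket\n"
        + ")*             # end of group #1, repeated 0 or more times\n"
        + ">              # closing bracket\n",
        Pattern.COMMENTS);

    public static void main(String[] args) {
        System.out.println(HTML_TAG.matcher("<input value='>'>").matches()); // true
        System.out.println(HTML_TAG.matcher("<input value=> >").matches());  // false
    }
}
```

The compact one-line pattern and this commented form should accept exactly the same strings; the commented form is just easier to read and maintain.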
Java Regular Expression Example
package com.mkyong.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HTMLTagValidator {

    private Pattern pattern;
    private Matcher matcher;

    private static final String HTML_TAG_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

    public HTMLTagValidator() {
        pattern = Pattern.compile(HTML_TAG_PATTERN);
    }

    /**
     * Validate html tag with regular expression
     * @param tag html tag for validation
     * @return true valid html tag, false invalid html tag
     */
    public boolean validate(final String tag) {
        matcher = pattern.matcher(tag);
        return matcher.matches();
    }
}
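The validator class keeps its `Matcher` in an instance field. `Pattern` objects are immutable and safe to share, but `Matcher` objects are not thread-safe, so a shared validator instance could misbehave under concurrent calls. One possible alternative is sketched below, assuming a stateless helper is acceptable. The names `StatelessHtmlTagValidator` and `isValid` are hypothetical, and the `null` check is an added assumption rather than behaviour of the original class.

```java
import java.util.regex.Pattern;

// Hypothetical stateless variant of the validator; not the original class.
public final class StatelessHtmlTagValidator {

    // Compiled once; Pattern instances are immutable and safe to share.
    private static final Pattern HTML_TAG =
            Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    private StatelessHtmlTagValidator() { }

    /** Returns true only if the whole input is one tag with balanced quotes. */
    public static boolean isValid(String tag) {
        // A fresh Matcher is created per call, so no state is shared.
        // Assumption: null input is treated as invalid instead of throwing.
        return tag != null && HTML_TAG.matcher(tag).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("<br/>"));                      // true
        System.out.println(isValid("<input value=\" id='test'>")); // false
        System.out.println(isValid(null));                         // false
    }
}
```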
HTML tags that match:
1. <b> , <input value='>'>
2. <input value='<'> , <b/>
3. <a href='http://www.google.com'>
4. <br> , <br/>
5. <input value="" id='test'> , <input value='' id='test'>
HTML tags that do not match:
1. <input value=" id='test'> - an unclosed double quote is not allowed
2. <input value=' id='test'> - an unclosed single quote is not allowed
3. <input value=> > - a bare closing bracket ">" is not allowed; it has to be enclosed in single or double quotes
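The third non-matching case fails because the validator uses `matches()`, which requires the whole string to be a single tag. As a rough sketch of the difference, the same pattern can be used with `find()` to pull tags out of a longer text. The class `HtmlTagFinder` and the method `findTags` are hypothetical names introduced here for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; shows find() versus matches() with the same pattern.
public class HtmlTagFinder {

    private static final Pattern HTML_TAG =
            Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    /** Collects every substring of the text that the pattern accepts as a tag. */
    public static List<String> findTags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher m = HTML_TAG.matcher(text);
        while (m.find()) {          // find() scans for the next partial match
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        // matches() must consume the entire input, so the trailing " >" makes it fail.
        System.out.println(HTML_TAG.matcher("<input value=> >").matches()); // false
        // find() is satisfied with the leading part that forms a tag.
        System.out.println(findTags("<input value=> >")); // [<input value=>]
        // A quoted ">" does not end the tag early.
        System.out.println(findTags("<p>Hi <input value='>'> there</p>"));
        // [<p>, <input value='>'>, </p>]
    }
}
```

For validating a single tag, `matches()` is the stricter and usually the more appropriate choice; `find()` is closer to what is needed when scanning a document fragment.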
Unit Test – HTMLTagValidatorTest
package com.mkyong.regex;

import org.testng.Assert;
import org.testng.annotations.*;

/**
 * HTMLTag validator Testing
 * @author mkyong
 *
 */
public class HTMLTagValidatorTest {

    private HTMLTagValidator htmlTagValidator;

    @BeforeClass
    public void initData() {
        htmlTagValidator = new HTMLTagValidator();
    }

    // Tags that the pattern is expected to accept
    @DataProvider
    public Object[][] ValidHTMLTagProvider() {
        return new Object[][] {
            new Object[] {"<b>"},
            new Object[] {"<input value='>'>"},
            new Object[] {"<input value='<'>"},
            new Object[] {"<b/>"},
            new Object[] {"<a href='http://www.google.com'>"},
            new Object[] {"<br>"},
            new Object[] {"<br/>"},
            new Object[] {"<input value=\"\" id='test'>"},
            new Object[] {"<input value='' id='test'>"}
        };
    }

    // Tags that the pattern is expected to reject
    @DataProvider
    public Object[][] InvalidHTMLTagProvider() {
        return new Object[][] {
            new Object[] {"<input value=\" id='test'>"},
            new Object[] {"<input value=' id='test'>"},
            new Object[] {"<input value=> >"}
        };
    }

    @Test(dataProvider = "ValidHTMLTagProvider")
    public void ValidHTMLTagTest(String tag) {
        boolean valid = htmlTagValidator.validate(tag);
        System.out.println("HTMLTag is valid : " + tag + " , " + valid);
        Assert.assertTrue(valid);
    }

    @Test(dataProvider = "InvalidHTMLTagProvider", dependsOnMethods = "ValidHTMLTagTest")
    public void InValidHTMLTagTest(String tag) {
        boolean valid = htmlTagValidator.validate(tag);
        System.out.println("HTMLTag is valid : " + tag + " , " + valid);
        Assert.assertFalse(valid);
    }
}
Unit Test – Result
HTMLTag is valid : <b> , true
HTMLTag is valid : <input value='>'> , true
HTMLTag is valid : <input value='<'> , true
HTMLTag is valid : <b/> , true
HTMLTag is valid : <a href='http://www.google.com'> , true
HTMLTag is valid : <br> , true
HTMLTag is valid : <br/> , true
HTMLTag is valid : <input value="" id='test'> , true
HTMLTag is valid : <input value='' id='test'> , true
HTMLTag is valid : <input value=" id='test'> , false
HTMLTag is valid : <input value=' id='test'> , false
HTMLTag is valid : <input value=> > , false
PASSED: ValidHTMLTagTest("<b>")
PASSED: ValidHTMLTagTest("<input value='>'>")
PASSED: ValidHTMLTagTest("<input value='<'>")
PASSED: ValidHTMLTagTest("<b/>")
PASSED: ValidHTMLTagTest("<a href='http://www.google.com'>")
PASSED: ValidHTMLTagTest("<br>")
PASSED: ValidHTMLTagTest("<br/>")
PASSED: ValidHTMLTagTest("<input value="" id='test'>")
PASSED: ValidHTMLTagTest("<input value='' id='test'>")
PASSED: InValidHTMLTagTest("<input value=" id='test'>")
PASSED: InValidHTMLTagTest("<input value=' id='test'>")
PASSED: InValidHTMLTagTest("<input value=> >")

===============================================
com.mkyong.regex.HTMLTagValidatorTest
Tests run: 12, Failures: 0, Skips: 0
===============================================

===============================================
mkyong
Total tests run: 12, Failures: 0, Skips: 0
===============================================
Want to learn more about regular expressions? I highly recommend the classic book "Mastering Regular Expressions".
