How to read XML file in Java – (SAX Parser)

A SAX parser works differently from a DOM parser: it does not load the whole XML document into memory, and it does not create an object representation of the document. Instead, the SAX parser uses callback methods (typically by extending org.xml.sax.helpers.DefaultHandler) to inform clients of the XML document structure.


Because it streams the document instead of building a tree, a SAX parser is generally faster and uses less memory than a DOM parser.

The main SAX callback methods are:

  • startDocument() and endDocument() – Methods called at the start and end of an XML document.
  • startElement() and endElement() – Methods called at the start and end of a document element.
  • characters() – Method called with the text content between the start and end tags of an XML document element. The parser may deliver the text of a single element in more than one call.
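
To make the callback order concrete, here is a minimal sketch that records each callback as it fires while parsing a small in-memory document. The class name SaxEventTrace and the method trace() are illustrative names, not part of the SAX API, and whitespace-only text is skipped to keep the trace short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative helper (hypothetical name): records SAX callbacks in firing order.
class SaxEventTrace {

    // Parses the given XML text and returns one entry per callback.
    static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) {
                events.add("startElement:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May fire more than once per element; skip whitespace-only chunks.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("characters:" + text);
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        trace("<staff><nickname>mkyong</nickname></staff>").forEach(System.out::println);
    }
}
```

The trace starts with startDocument and ends with endDocument, with each element's start and end events nested in between.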

1. XML file

Create a simple XML file.


<?xml version="1.0"?>
<company>
	<staff>
		<firstname>yong</firstname>
		<lastname>mook kim</lastname>
		<nickname>mkyong</nickname>
		<salary>100000</salary>
	</staff>
	<staff>
		<firstname>low</firstname>
		<lastname>yin fong</lastname>
		<nickname>fong fong</nickname>
		<salary>200000</salary>
	</staff>
</company>

2. Java file

Use SAX parser to parse the XML file.


import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadXMLFile {

    public static void main(String[] args) {

        try {

            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();

            DefaultHandler handler = new DefaultHandler() {

                // Flags set in startElement() so that characters() knows
                // which element the incoming text belongs to.
                boolean bfname = false;
                boolean blname = false;
                boolean bnname = false;
                boolean bsalary = false;

                @Override
                public void startElement(String uri, String localName, String qName,
                        Attributes attributes) throws SAXException {

                    System.out.println("Start Element :" + qName);

                    if (qName.equalsIgnoreCase("FIRSTNAME")) {
                        bfname = true;
                    }

                    if (qName.equalsIgnoreCase("LASTNAME")) {
                        blname = true;
                    }

                    if (qName.equalsIgnoreCase("NICKNAME")) {
                        bnname = true;
                    }

                    if (qName.equalsIgnoreCase("SALARY")) {
                        bsalary = true;
                    }

                }

                @Override
                public void endElement(String uri, String localName,
                        String qName) throws SAXException {

                    System.out.println("End Element :" + qName);

                }

                // Prints the text of the current element and clears its flag.
                // Note: this assumes the text arrives in a single call.
                @Override
                public void characters(char[] ch, int start, int length) throws SAXException {

                    if (bfname) {
                        System.out.println("First Name : " + new String(ch, start, length));
                        bfname = false;
                    }

                    if (blname) {
                        System.out.println("Last Name : " + new String(ch, start, length));
                        blname = false;
                    }

                    if (bnname) {
                        System.out.println("Nick Name : " + new String(ch, start, length));
                        bnname = false;
                    }

                    if (bsalary) {
                        System.out.println("Salary : " + new String(ch, start, length));
                        bsalary = false;
                    }

                }

            };

            saxParser.parse("c:\\file.xml", handler);

        } catch (Exception e) {
            e.printStackTrace();
        }

    }

}

Result


Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
Start Element :salary
Salary : 100000
End Element :salary
End Element :staff
Start Element :staff
Start Element :firstname
First Name : low
End Element :firstname
Start Element :lastname
Last Name : yin fong
End Element :lastname
Start Element :nickname
Nick Name : fong fong
End Element :nickname
Start Element :salary
Salary : 200000
End Element :salary
End Element :staff
End Element :company
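
The handler above prints whatever the first characters() call delivers, but a SAX parser is allowed to split one element's text across several calls, which is a likely cause of the truncated values some readers report. A minimal sketch of one common workaround, assuming it is acceptable to report the text when the element ends, is to append every chunk to a buffer and read it in endElement(). The class name BufferedTextHandler and the method read() are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative handler (hypothetical name): buffers text until the element ends.
class BufferedTextHandler extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        text.setLength(0); // a new element starts: drop text collected so far
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        text.setLength(0);

        if (qName.equalsIgnoreCase("firstname")) {
            lines.add("First Name : " + value);
        } else if (qName.equalsIgnoreCase("lastname")) {
            lines.add("Last Name : " + value);
        } else if (qName.equalsIgnoreCase("nickname")) {
            lines.add("Nick Name : " + value);
        } else if (qName.equalsIgnoreCase("salary")) {
            lines.add("Salary : " + value);
        }
    }

    // Parses XML text and returns one formatted line per recognised element.
    static List<String> read(String xml) {
        BufferedTextHandler handler = new BufferedTextHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<company><staff><firstname>yong</firstname>"
                + "<lastname>mook kim</lastname></staff></company>";
        read(xml).forEach(System.out::println);
    }
}
```

Parsing from a StringReader wrapped in an InputSource also shows how to parse XML held in a String, since parse(String, handler) treats its first argument as a URI rather than as XML content.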
Warning
This example may throw exceptions when reading a UTF-8 XML file; see the article on how to read a UTF-8 XML file in SAX.
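
As a rough sketch of one way to handle this, assuming the problem is that the parser does not decode the file's bytes as UTF-8, the bytes can be decoded explicitly with an InputStreamReader and handed to the parser through an InputSource. The class name Utf8SaxRead and the method readText() are hypothetical, and the input here is an in-memory byte array; a FileInputStream could be passed in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative helper (hypothetical name): parses a byte stream as UTF-8 XML.
class Utf8SaxRead {

    // Returns all character data of the document, decoding the bytes as UTF-8.
    static String readText(InputStream in) {
        final StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };

        // The Reader fixes the decoding, so the parser receives characters, not bytes.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            InputSource source = new InputSource(reader);
            source.setEncoding("UTF-8");
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><nickname>caf\u00e9</nickname>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(readText(new ByteArrayInputStream(xml)));
    }
}
```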
Note
You may also be interested in: How to read XML file in Java – (DOM Parser)

About the Author

mkyong
Founder of Mkyong.com, loves Java and open source stuff.

Comments

Ap

I am trying to validate one xml , I am getting exception as cvc-elt.5.2.2.1:The value ’00’ of element ‘x’ does not match the fixed {value constraint}’0′. Can any one help to resolve this asap?

mehfatiem

How can I import org.xml.sax. …? I use Eclipse as a development kit. Where can I find the org.sax library?

Mark Tielemans

You can keep track of open/closed elements much more gracefully using a Map:

Map<String, Boolean> elements = new HashMap<String, Boolean>();

onStart:

elements.put(qName, true);

onEnd:

elements.put(qName, false);

Nice guide!

Erik

Hello! Nice guide. I had a very simple scenario with XML like: … … … This made it possible to just store the opening tag's qName in a variable and then handle it accordingly in the characters method:

class XXX {
    /* Item defined elsewhere */
    private final LinkedList items = new LinkedList();
    private final SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
    …
    private final DefaultHandler handler = new DefaultHandler() {
        private Item item;
        private String qName;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            this.qName = qName;
            if ("ROW".equalsIgnoreCase(qName)) {
                item = new Item();
            }
            …

Erik

You just have to cut and paste the code into an IDE and code format it…

NAVEEN REDDY (NAVEE)

if i gave xml file
Company 1

Doe
and i want output for only employee name .

Amit Thakur

Hi MKyong,

Issue: I am facing a problem with XML parsing (SAX parser) on a Unix machine. The same JAR and Java code behave differently on Windows and on Unix. Why? :(

Windows machine: works fine. Using the SAX parser to load a huge XML file, all values are read and populated correctly. Charset.defaultCharset() is windows-1252.

Unix machine: I created the JAR, deployed it to Tomcat on Unix and executed it. When loading the same huge XML file, I noticed that some values or characters are populated empty or incomplete, e.g. the country name is populated as "ysia" instead of "Malaysia" or…

Tony

Hi ,

I want to do the same. Let's say I have added one or more child attributes to that XML. I don't want to change my POJO/Java file, but the output still has to print correctly. I want it to be generic for any XML. Can you please suggest how to proceed?

srikanth

I am facing a problem while developing an app: the SAX parser is not reading "&"… it crashes when the data contains &. How can I resolve this issue? Thanks in advance.

Won-Hyung Park

Great example, but sometimes characters in xml tag are missing with above example. Please refer to this link : http://stackoverflow.com/questions/18460518/why-some-characters-are-missing-when-i-parse-a-xml-tag-using-saxparser

Pankaj

Great tutorial only one question if i pass a static file here

saxParser.parse(“c:\myxmlfile.xml”, handler) my code works without any issues but if pass a string
that i get as response from a server which is same as ticket.xml i get this error Exception in thread “main” java.net.MalformedURLException: no protocol

so what i am doing wrong?

Anil

Hello Mkyong

I am trying to parse one XML using SAX parser code similar to one you detailed in your article.

One of field has below type of data as shown for description tag which is not fully read for that field.
It will just read ABC and ignores the rest. Can you please advice on this, how can I handle this?

ABC
[[0]: PCR, [1]: [0..5.0]%, [2]: [>5.0..10.0]%

Anil Kumar pal

was able to read if enclose whole description with CDATA tags

Anil

Hello Mkyong

I am trying to parse one XML using SAX parser code similar to one you detailed in your article.

One of field has below type of data as shown for description tag which is not fully read for that field.
It will just read ABC and ignores the rest. Can you please advice on this, how can I handle this?

ABC
[0]: PCR, [1]: [0..5.0]%

suresh

i want to convert plain text with veriable field length to xml how should i achieve it

Avnish alok

one can visit this blog to know more in java

Andre Santos

yes!! Works perfectly

trackback
Service Stock di Java | Start from Bad Coding

[…] for getting through the proxy, see this post. Meanwhile, reading XML with the SAX parser can be seen here; to keep things easy I combined them into one […]

Reji

How can we use this parse to access the set of multiple occurrence of data containing nested records

Like

John

21
BeerLand

KIM

20
High Spirits

sp

Hi, can some one tell how to handle empty axml tags while parsing ? like or

Mal1990

Hi ,

I am totally new to java . I need to parse a EDI XML and display its attributes alone text file. The text file should be pipe delimited(|). ie; all the attributes should be separated by | symbol.

This has to be done using sax parser. Kindly help me on this

sony

too good!!!!!thanks mykong….

Quin

How can you sort the data, for example, if you want to sort by last name element on your xml file, output it out on java, need help.

JC

DOM Parser post helped me alot.
But I am stuck with 1 thing,could you please suggest which parser to use for redundant tags.eg :-

empData

Tom
25
English
French

Swiss
25
English
German
French

Each config can have 1 or more tag and each can have 1 or more tag.With Dom parser,either I am getting only first tag or all for all employees,I want corresponding to each employee.

Thanks

sirj77

Hey!
I’ve tried to run this example, but received an error:
“Error: Could not find or load main class Tasks.XML Parser SAX”
Could anyone help me?

123mo

Run using ” java ReadXMLFile “

Andres

Thanks for you post!!!
it helped me a lot

vijaya

Thanks, it helped me a lot.

Mark Tielemans

Bit lame I can’t edit that comment :P.

Gubs

Hi Monk,

StartElement is not getting called in my sample below code. You have any idea ?
public class ReadXMLFileUsingSAXParser {

String fname = null;
String lname = null;
String sal = null;

/**
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException {

SAXParserFactory factory = SAXParserFactory.newInstance();
try {
SAXParser parser = factory.newSAXParser();

DefaultHandler handler = new DefaultHandler();

parser.parse("src/main/resources/testSaxParser.xml", handler);

} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

}

public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("firstname")) {
fname = attributes.getValue("firstname");
System.out.println("Firstname " + fname);
}
}
}

Nicholas

hi Gubs,

i have no idea.

Erik

startElement should be inside DefaultHandler… check the code in the article one more time.

hendi santika

I’ve tried the code, but i have a problem here.
The result is :
End Element :firstname
End Element :lastname
End Element :nickname
End Element :salary
End Element :staff
End Element :firstname
End Element :lastname
End Element :nickname
End Element :salary
End Element :staff
End Element :company

Why does it happen ???

Thanks

Erik Larsson

I have the same problem, please answer someone!

Sim

Same problem here too :(

Sim

Solution (found at: http://stackoverflow.com/questions/6301678/java-sax-program-doesnt-go-to-startelement-method):

Check the import statement for the Attribute parameter, it should be:
import org.xml.sax.Attributes;

krishnaveni

Hi.,
This is really helped me..Thanks a lot for gave me such a nice tutorial…