SAX Error – MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
Written on December 17, 2009 at 8:23 pm by
mkyong
Problem
When some special UTF-8 characters inside a XML file, and your SAX’s parser is not configure to parse the UTF-8 properly, the following exception will be thrown.
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence. ...
Solution
The solution is quite simple, get the content in UTF-8 format, and override the SAX input source.
File file = new File("c:\\file-utf.xml"); InputStream inputStream= new FileInputStream(file); Reader reader = new InputStreamReader(inputStream,"UTF-8"); InputSource is = new InputSource(reader); is.setEncoding("UTF-8"); saxParser.parse(is, handler);
You can read the full example here – how do read UTF-8 XML file with SAX parser
This article was posted in Java category.
Oracle Magazine - Free Magazine
Oracle Magazine contains technology strategy articles, sample code, tips, Oracle and partner news, how to articles for Java's developers and DBAs, and more.
Securing & Optimizing Linux: The Hacking Solution - Free Guide
A comprehensive collection of Linux security products and explanations in the most simple and structured manner on how to safely and easily configure and run many popular Linux-based applications and services.
The Windows 7 Guide: From Newbies to Pros - Free Guide
In this 46 page guide you will be introduced to Windows 7 and what it has to offer. This guide will go over the software compatible issues, you will learn about the new taskbar, how to use and customize Windows Aero, what Windows 7 Libraries are all about, what software is included in Windows 7, and how easy networking is with Windows 7 along with other topics.
All Java Tutorials
- Java Core Technology - Java RegEx, Java XML, Java I/O, Java Misc
- J2EE Frameworks - Hibernate, Spring 2.5, Spring MVC, Struts 1.x, Struts 2.x
- Build Tools - Maven, Archiva
- Unit Test - jUnit, TestNG
- Client Scripts - jQuery
[...] SAX Error – MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence Common SAX error for XML file contains Unicode character. [...]
[...] you used normal SAX’s way to parse it, you may encounter this “Invalid byte 1 of 1-byte UTF-8 sequence” [...]