How to write a UTF-8 file in Java

In Java, OutputStreamWriter accepts a charset and uses it to encode a character stream into a byte stream. Passing StandardCharsets.UTF_8 to the OutputStreamWriter constructor writes the data to the file as UTF-8.


  try (FileOutputStream fos = new FileOutputStream(file);
       OutputStreamWriter osw = new OutputStreamWriter(fos, StandardCharsets.UTF_8);
       BufferedWriter writer = new BufferedWriter(osw)) {

      writer.append(line);

  }
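To make the character-to-byte encoding visible, here is a minimal, self-contained sketch of the same pattern. The class name Utf8WriterSketch, the method writeAndReadBytes, and the temporary file are assumptions made for illustration. The two Chinese characters are written as Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class name, used only for this sketch.
class Utf8WriterSketch {

    // Writes the text to a temporary file as UTF-8 and returns the raw bytes on disk.
    static byte[] writeAndReadBytes(String text) {
        try {
            Path path = Files.createTempFile("utf8-sketch", ".txt");
            try {
                try (FileOutputStream fos = new FileOutputStream(path.toFile());
                     OutputStreamWriter osw = new OutputStreamWriter(fos, StandardCharsets.UTF_8);
                     BufferedWriter writer = new BufferedWriter(osw)) {
                    writer.append(text);
                }
                return Files.readAllBytes(path);
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "\u4F60\u597D" is the two-character string 你好.
        byte[] bytes = writeAndReadBytes("\u4F60\u597D");

        // Each of these characters takes three bytes in UTF-8.
        System.out.println(bytes.length); // prints 6

        // The bytes are E4 BD A0 E5 A5 BD.
        for (byte b : bytes) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

Inspecting the bytes rather than opening the file in an editor avoids any doubt about which encoding the editor assumed.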

Since Java 7, many file I/O and NIO writers accept a charset as an argument, which makes writing data to a UTF-8 file very easy. For example:


  // Java 7
  Files.write(path, lines, StandardCharsets.UTF_8);

  // Java 8
  Files.newBufferedWriter(path); // UTF-8 by default

  // Java 11
  new FileWriter(new File(fileName), StandardCharsets.UTF_8);
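As a rough check that these three APIs are interchangeable, the sketch below writes the same lines with each of them and compares the resulting bytes. The class name Utf8ApiComparison, the IoAction interface, and the helper methods are hypothetical names introduced for this example. It assumes Java 11 or later because of the FileWriter constructor.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class Utf8ApiComparison {

    // A write step that may throw IOException.
    interface IoAction {
        void run(Path path) throws IOException;
    }

    // Runs the write step against a temporary file and returns the bytes it produced.
    static byte[] writeThenRead(IoAction action) {
        try {
            Path path = Files.createTempFile("utf8-compare", ".txt");
            try {
                action.run(path);
                return Files.readAllBytes(path);
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Java 7: Files.write with an explicit charset.
    static byte[] viaFilesWrite(List<String> lines) {
        return writeThenRead(path -> Files.write(path, lines, StandardCharsets.UTF_8));
    }

    // Java 8: Files.newBufferedWriter(path) without a charset uses UTF-8.
    static byte[] viaNewBufferedWriter(List<String> lines) {
        return writeThenRead(path -> {
            try (BufferedWriter writer = Files.newBufferedWriter(path)) {
                for (String line : lines) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        });
    }

    // Java 11: FileWriter with an explicit charset.
    static byte[] viaFileWriter(List<String> lines) {
        return writeThenRead(path -> {
            try (BufferedWriter writer =
                     new BufferedWriter(new FileWriter(path.toFile(), StandardCharsets.UTF_8))) {
                for (String line : lines) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        });
    }

    public static void main(String[] args) {
        // "\u4F60\u597D" is the string 你好, escaped to avoid source-encoding issues.
        List<String> lines = Arrays.asList("line 1", "\u4F60\u597D");
        byte[] a = viaFilesWrite(lines);
        byte[] b = viaNewBufferedWriter(lines);
        byte[] c = viaFileWriter(lines);
        System.out.println(Arrays.equals(a, b) && Arrays.equals(b, c)); // prints true
    }
}
```

All three approaches end every line with the platform line separator, which is why the byte arrays match.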

1. Write to UTF-8 file

This example shows a few ways to write some Chinese characters to a UTF-8 file.

UnicodeWrite.java

package com.mkyong.io.howto;

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class UnicodeWrite {

    public static void main(String[] args) {

        String fileName = "c:\\temp\\test.txt";
        List<String> lines = Arrays.asList("line 1", "line 2", "line 3", "你好,世界");

        writeUnicodeJava7(fileName, lines);
        //writeUnicodeJava8(fileName, lines);
        //writeUnicodeJava11(fileName, lines);
        //writeUnicodeClassic(fileName, lines);

        System.out.println("Done");
    }

    // in the old days
    public static void writeUnicodeClassic(String fileName, List<String> lines) {

        File file = new File(fileName);

        try (FileOutputStream fos = new FileOutputStream(file);
             OutputStreamWriter osw = new OutputStreamWriter(fos, StandardCharsets.UTF_8);
             BufferedWriter writer = new BufferedWriter(osw)
        ) {

            for (String line : lines) {
                writer.append(line);
                writer.newLine();
            }

        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    public static void writeUnicodeJava7(String fileName, List<String> lines) {

        Path path = Paths.get(fileName);
        try {
            Files.write(path, lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    // Java 8 - Files.newBufferedWriter(path) defaults to UTF-8; the charset is passed explicitly here
    public static void writeUnicodeJava8(String fileName, List<String> lines) {

        Path path = Paths.get(fileName);

        try (BufferedWriter writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8)) {

            for (String line : lines) {
                writer.append(line);
                writer.newLine();
            }

        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    // Java 11 adds Charset to FileWriter
    public static void writeUnicodeJava11(String fileName, List<String> lines) {

        try (FileWriter fw = new FileWriter(new File(fileName), StandardCharsets.UTF_8);
             BufferedWriter writer = new BufferedWriter(fw)) {

            for (String line : lines) {
                writer.append(line);
                writer.newLine();
            }

        } catch (IOException e) {
            e.printStackTrace();
        }

    }

}

Output

[Figure: the generated UTF-8 file]
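One way to check the result without relying on a text editor is to read the file back with an explicit charset. The sketch below is an illustration under assumptions: the class name Utf8RoundTrip, the method writeUtf8ThenRead, and the temporary file are made up for this example. It writes the lines as UTF-8, then reads them back once as UTF-8 and once as ISO-8859-1 to show what a mismatched reader does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class Utf8RoundTrip {

    // Writes the lines as UTF-8, then decodes the file with the given charset.
    static List<String> writeUtf8ThenRead(List<String> lines, Charset readCharset) {
        try {
            Path path = Files.createTempFile("utf8-roundtrip", ".txt");
            try {
                Files.write(path, lines, StandardCharsets.UTF_8);
                return Files.readAllLines(path, readCharset);
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The second entry is 你好,世界 written with Unicode escapes.
        List<String> lines = Arrays.asList("line 1", "\u4F60\u597D,\u4E16\u754C");

        // Reading with the charset used for writing returns the original text.
        System.out.println(lines.equals(writeUtf8ThenRead(lines, StandardCharsets.UTF_8))); // prints true

        // Reading with a different charset garbles the Chinese characters.
        System.out.println(lines.equals(writeUtf8ThenRead(lines, StandardCharsets.ISO_8859_1))); // prints false
    }
}
```

If a UTF-8 file looks corrupted in an editor or console, the second case is a common cause: the bytes on disk are correct, but the reader decodes them with another charset.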

Download Source Code

$ git clone https://github.com/mkyong/core-java

$ cd core-java/java-io

Comments

BOB
6 years ago

how to use scanner input in java to convert unicode code point to utf8, utf16, utf32

Pranava
10 years ago

How to create a file with chinese characters in the file name

Kauhsick
5 years ago

Helped me a lot thanks

cyber
6 years ago

I had this problem, but simply switching from Gedit to VSCode worked for me

Rahulsingh
10 years ago

awesome !

nitin
11 years ago

Code ain’t working for me may be because of this : http://stackoverflow.com/a/4053854

Please provide another working solution asap. I guess there is something in Apache Commons for the same.

ganba
14 years ago

Hello there,
This code does not really work for me.
Here’s how I tested it. My test.txt file is saved with UTF-8 encoding and contains this line:
—————
w été jedn? stron? ôpèç Ûg ütà
—————

My test program below first reads the file in BufferedReader and then writes in Writer.

—————
package test;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;
import java.io.Writer;

public class Test_temp {

    public Test_temp() {

        String sPath = "E:/workspace/project/src/test/test.txt";

        if (sPath != null && !sPath.trim().equals("")) {
            try {
                Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(sPath + ".new"), "UTF8"));
                BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(sPath), "UTF8"));

                String s = null;
                while ((s = in.readLine()) != null) {
                    String UTF8Str = new String(s.getBytes(), "UTF8");
                    System.out.println("[" + UTF8Str + "]");
                    out.append(UTF8Str).append("\r\n");
                }
                System.out.println("Reading Process Completly Successfully.");
                in.close();
                out.flush();
                out.close();
            } catch (UnsupportedEncodingException ue) {
                System.out.println("Not supported : " + ue.getMessage());
            } catch (IOException e) {
                System.out.println(e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        new Test_temp();
    }
}
—————

The new generated file (test.txt.new) is also encoded with UTF-8 but characters are corrupted:
———–
?w ?t? jedn? stron? ?p?? ?g ?
———–
Could you please tell me what do I do wrong?
Thanks

Tony
14 years ago
Reply to  ganba

ganba

are you sure that you’re using a UTF-8 reader to validate the file?

Regards,

Tony