Java – Count number of lines in a file
This article shows a few Java examples that get the total number of lines in a file. The steps are similar:
- Open the file.
- Read line by line, and increase the count by 1 for each line.
- Close the file.
- Read the count.
Java methods tested:
- Files.lines
- BufferedReader
- LineNumberReader
- BufferedInputStream
At the end of the article, we also show the performance of the different ways of counting the total number of lines in a large file that contains 5 million lines and 1053 characters per line.
1. Files.lines (Java 8)
Files.lines is the most straightforward implementation.
public static long countLineJava8(String fileName) {
    Path path = Paths.get(fileName);
    long lines = 0;
    // try-with-resources closes the file behind the stream (needs java.util.stream.Stream)
    try (Stream<String> stream = Files.lines(path)) {
        // parallel() is much slower here, this task works better with sequential access
        //lines = stream.parallel().count();
        lines = stream.count();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return lines;
}
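One detail worth knowing: Files.lines(path) decodes the file as UTF-8, and it typically fails with an UncheckedIOException when it meets bytes that are not valid UTF-8. For files in another encoding, the overload that takes a Charset can be used. The following is a minimal sketch; the class name LineCountCharset and the method name countLines are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original examples
public class LineCountCharset {

    // Counts lines, decoding the file with the given charset.
    public static long countLines(Path path, Charset charset) {
        try (Stream<String> stream = Files.lines(path, charset)) {
            return stream.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("latin1", ".txt");
        // 0xE9 is 'é' in ISO-8859-1, but it is not valid UTF-8 on its own
        Files.write(tmp, new byte[] {'a', (byte) 0xE9, '\n', 'b', '\n'});
        System.out.println(countLines(tmp, StandardCharsets.ISO_8859_1));
        Files.delete(tmp);
    }
}
```

The charset has to match the file, so this only helps when the encoding is known in advance.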
2. BufferedReader
This example uses BufferedReader to read line by line and increase the count.
public static long countLineBufferedReader(String fileName) {
    long lines = 0;
    try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
        while (reader.readLine() != null) lines++;
    } catch (IOException e) {
        e.printStackTrace();
    }
    return lines;
}
3. LineNumberReader
LineNumberReader is similar to the BufferedReader above.
public static long countLineNumberReader(String fileName) {
    File file = new File(fileName);
    long lines = 0;
    try (LineNumberReader lnr = new LineNumberReader(new FileReader(file))) {
        // read to the end, the reader keeps track of the line number
        while (lnr.readLine() != null) ;
        lines = lnr.getLineNumber();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return lines;
}
4. BufferedInputStream
This BufferedInputStream example is copied from this StackOverflow Answer.
public static long countLineFast(String fileName) {
    long lines = 0;
    try (InputStream is = new BufferedInputStream(new FileInputStream(fileName))) {
        byte[] c = new byte[1024];
        int count = 0;
        int readChars = 0;
        boolean endsWithoutNewLine = false;
        while ((readChars = is.read(c)) != -1) {
            for (int i = 0; i < readChars; ++i) {
                if (c[i] == '\n')
                    ++count;
            }
            endsWithoutNewLine = (c[readChars - 1] != '\n');
        }
        // count the last line if it has no trailing newline
        if (endsWithoutNewLine) {
            ++count;
        }
        lines = count;
    } catch (IOException e) {
        e.printStackTrace();
    }
    return lines;
}
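The subtle part of countLineFast is the endsWithoutNewLine flag: the loop counts '\n' bytes, and one more line is added when the last byte read is not a newline. The sketch below restates the same logic over any InputStream, with a long counter, so the edge cases can be tried on in-memory data. The class name NewlineCountDemo and its countLines method are illustrative names, not from the StackOverflow answer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class restating the countLineFast logic
public class NewlineCountDemo {

    // Counts '\n' bytes, plus one for a last line that has no trailing newline.
    public static long countLines(InputStream in) {
        byte[] buffer = new byte[8192];
        long count = 0;
        boolean endsWithoutNewLine = false;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == '\n') {
                        count++;
                    }
                }
                endsWithoutNewLine = buffer[read - 1] != '\n';
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return endsWithoutNewLine ? count + 1 : count;
    }

    private static long count(String text) {
        return countLines(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println(count("a\nb\n")); // terminated last line
        System.out.println(count("a\nb"));   // unterminated last line
        System.out.println(count(""));       // empty input
    }
}
```

Because only '\n' bytes are counted, files that use a lone '\r' as the line terminator would be reported as a single line.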
5. Benchmark
5.1 Create a large file containing 5 million lines, with 1053 characters per line and a file size of about 5G.
public static void writeLargeFile() {
    String fileName = "/home/mkyong/large-file.txt";
    // 1053 chars per line: 13 chars tripled four times (13 * 81)
    String content = "Hello 123456 ";
    content = content + content + content;
    content = content + content + content;
    content = content + content + content;
    content = content + content + content;
    System.out.println(content.length()); // 1053
    try (BufferedWriter bw = new BufferedWriter(new FileWriter(fileName))) {
        for (int i = 0; i < 5_000_000; i++) {
            bw.write(content);
            bw.write(System.lineSeparator());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
5.2 Rerun the same method 5-10 times and take the average. Here are the results:
- Files.lines: 6-8 seconds.
- BufferedReader: 6-8 seconds.
- LineNumberReader: 6-8 seconds.
- BufferedInputStream: 4-5 seconds.
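The averaging in 5.2 could be automated with a small harness along these lines. This is a rough sketch, not a rigorous benchmark (it ignores JIT warm-up and disk caching), and the names LineCountTimer, averageMillis and countWithBufferedReader are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.function.ToLongFunction;

// Hypothetical timing harness, not the code used for the numbers above
public class LineCountTimer {

    // Runs the counter `runs` times on the same file and returns the mean wall-clock time in ms.
    public static double averageMillis(ToLongFunction<String> counter, String fileName, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            counter.applyAsLong(fileName);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    // Example counter to time, equivalent in spirit to the BufferedReader version
    static long countWithBufferedReader(String fileName) {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(fileName))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String fileName = args[0]; // path to the file to measure
        double avg = averageMillis(LineCountTimer::countWithBufferedReader, fileName, 5);
        System.out.printf("BufferedReader: %.1f ms on average%n", avg);
    }
}
```

Any of the four methods can be passed in as a method reference, since each takes a file name and returns a long.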
The BufferedInputStream approach (from the StackOverflow Answer) is the fastest way to count the number of lines in a large file (5G file size and 5 million lines). Still, the difference is not that significant, and the implementation is error-prone and a bit complicated. If we test with a smaller file, such as 1G and 1 million lines, we hardly notice the difference.
In the end, the Java NIO Files.lines is simple to use and its performance isn’t that much different, which makes it the best choice to count the number of lines in a file.
6. wc -l
On Linux, the command wc -l is the fastest way to count the number of lines in a file.
$ time wc -l large-file.txt
5000000 large-file.txt
real 0m2.344s
user 0m0.113s
sys 0m1.306s
$ time wc -l large-file.txt
5000000 large-file.txt
real 0m0.630s
user 0m0.092s
sys 0m0.537s
Any inputs or ideas on the algorithm behind the wc -l command?
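One likely part of the answer: POSIX defines wc -l as reporting the number of newline characters, so the command never needs to decode text or build line strings, it only scans bytes. The second run above is probably faster because the file was already in the operating system's page cache. The same byte-scanning idea can be sketched in Java with a FileChannel and a large buffer. This is not wc's actual implementation, and the class name NewlineByteCounter and the 1 MiB buffer size are arbitrary choices.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a wc -l style counter
public class NewlineByteCounter {

    // Counts '\n' bytes only, so an unterminated last line is not counted (as with wc -l).
    public static long countNewlines(Path path) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20); // 1 MiB, an arbitrary size
        long count = 0;
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            while (channel.read(buffer) != -1) {
                buffer.flip(); // switch from filling the buffer to draining it
                while (buffer.hasRemaining()) {
                    if (buffer.get() == '\n') {
                        count++;
                    }
                }
                buffer.clear(); // make the whole buffer available for the next read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

Whether this beats the BufferedInputStream version would need to be measured; no timing is claimed here.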
Download Source Code
$ git clone https://github.com/mkyong/core-java
$ cd java-io
LineNumberReader lnr = new LineNumberReader(new FileReader(new File("C:\\data\\error\\MyFile.csv")));
lnr.skip(Long.MAX_VALUE);
// Add 1 because the line number starts at 0; this overcounts by one if the file ends with a line terminator
System.out.println(lnr.getLineNumber() + 1);
lnr.close();
As pointed out earlier, this does not seem to be an efficient way to count the number of lines. What if the file has a huge number of lines, say 1,500,000, and I want an efficient way to read only the last 1,000 lines?
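One possible approach to reading only the last lines is to scan the file backwards for newline bytes with a RandomAccessFile, so that only the tail of the file is read. The sketch below assumes '\n' line endings, UTF-8 text, and a tail that fits in memory; the class name TailLines and the method lastLines are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: returns up to n trailing lines of a file
public class TailLines {

    public static List<String> lastLines(Path path, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            long length = raf.length();
            if (length == 0) {
                return new ArrayList<>();
            }
            // Ignore one trailing newline so it is not treated as an extra empty line
            long end = length;
            raf.seek(length - 1);
            if (raf.read() == '\n') {
                end = length - 1;
            }
            long start = 0; // offset of the first byte of the wanted lines
            int newlines = 0;
            byte[] block = new byte[8192];
            long pos = end;
            outer:
            while (pos > 0) {
                int size = (int) Math.min(block.length, pos);
                pos -= size;
                raf.seek(pos);
                raf.readFully(block, 0, size);
                // Walk the block from its end towards its start
                for (int i = size - 1; i >= 0; i--) {
                    if (block[i] == '\n' && ++newlines == n) {
                        start = pos + i + 1;
                        break outer;
                    }
                }
            }
            byte[] tail = new byte[(int) (end - start)]; // assumes the tail is under 2 GB
            raf.seek(start);
            raf.readFully(tail);
            String text = new String(tail, StandardCharsets.UTF_8);
            return new ArrayList<>(Arrays.asList(text.split("\n", -1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Scanning for the byte 0x0A is safe in UTF-8 because that value never appears inside a multi-byte sequence. With Windows line endings, each returned line would still carry a trailing '\r'.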
But this is not efficient from a performance perspective if the file contains a large number of long lines, say 2 million lines of 2,000 characters each.
This method would create a lot of String objects, since the readLine method returns the data it reads as a String.
Another thing you could do is this:
// readAllLines loads the whole file into memory, so this only suits small files
List<String> allLines = Files.readAllLines(file.toPath());
int fileLength = allLines.size();
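Path path = Paths.get("response.txt");
long lineCount;
// try-with-resources closes the file behind the stream
try (Stream<String> stream = Files.lines(path)) {
    lineCount = stream.count();
}
System.out.println(lineCount);
Files.lines is a new feature in Java 8 (the Files, Path and Paths classes date from Java 7),
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;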
Sir, how can I find a particular word in a Java file?
How can I count the number of new lines in a Word document, such as doc or docx, in Java?
It was very helpful. thanks a lot
Open the text file. Click on Edit>Go To. Give a number much larger than the expected number of lines. Say, if you expect to have 2000 lines, give 5000 or just give 200000. It gives an error “The Line number is beyond the total number of lines” and shows the nearest available line. If you have 2379 lines it gives 2380. So whatever it shows minus 1 is the actual number of lines in the text file. Cheers!!