How to get URL content in Java

This Java example shows how to get the content of the page at the URL “mkyong.com” and save it to a local file named “test.html”.


package com.mkyong;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class GetURLContent {
	public static void main(String[] args) {

		// save to this filename
		String fileName = "/users/mkyong/test.html";

		try {
			// get URL content
			URL url = new URL("http://www.mkyong.com");
			URLConnection conn = url.openConnection();

			// try-with-resources closes the reader and the writer,
			// even if an exception is thrown while copying.
			// UTF-8 is an assumption about the page encoding.
			try (BufferedReader br = new BufferedReader(
					new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
				BufferedWriter bw = Files.newBufferedWriter(
					Paths.get(fileName), StandardCharsets.UTF_8)) {

				String inputLine;
				while ((inputLine = br.readLine()) != null) {
					bw.write(inputLine);
					// readLine() strips the line terminator, so write it back
					bw.newLine();
				}
			}

			System.out.println("Done");

		} catch (MalformedURLException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}

	}
}
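
When the page does not need to be processed line by line, the same download can be written as a plain byte copy, which avoids having to guess the page's character encoding. The following is only a sketch of that alternative, not the method used in the listing above; the class name DownloadWithNio, the helper download and the relative output path test.html are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class DownloadWithNio {

    // Copies the raw bytes behind the URL into target, replacing any
    // existing file, and returns the number of bytes written.
    public static long download(URL url, Path target) throws IOException {
        try (InputStream in = url.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.mkyong.com");
            Path target = Paths.get("test.html"); // hypothetical output location
            System.out.println("Saved " + download(url, target) + " bytes");
        } catch (IOException e) {
            System.err.println("Download failed: " + e);
        }
    }
}
```

Because the bytes are copied unchanged, the saved file keeps whatever encoding and line endings the server sent.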

About the Author

mkyong
Founder of Mkyong.com, loves Java and open source stuff.

Comments

Adam
Guest

I have written a program that takes data from a certain API. But I want to continuously fetch data from the API and save it to a file, so that the data I get in the second request (with some modifications) is appended to the first, and so on, for over a thousand requests. A quick reply would be appreciated.
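
One way to approach the appending part of this question is to open the file in append mode for every record. This is a minimal sketch under assumptions: the class name AppendExample, the helper appendRecord, the file name responses.txt and the placeholder strings standing in for the API responses are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class AppendExample {

    // Appends one record plus a line separator to the end of the file.
    // CREATE makes the file on the first call; APPEND keeps earlier records.
    public static void appendRecord(Path file, String record) throws IOException {
        byte[] bytes = (record + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        Files.write(file, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path out = Paths.get("responses.txt"); // hypothetical output file
        for (int i = 1; i <= 3; i++) {
            String data = "  response " + i + "  "; // stand-in for data fetched from the API
            appendRecord(out, data.trim());         // trim() stands in for "some modifications"
        }
    }
}
```

For a thousand or more requests it may be cheaper to keep a single BufferedWriter open for the whole run instead of reopening the file for each record.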

Guest
Guest

Thank you, it works like a charm :)

prakash thakur
Guest

I want to get data on who is hitting my website. How can I do that? Please help.

Shafraz
Guest

Hello, I want to retrieve the title from an MVC framework that uses the file tiles.xml to assign the title and set the body content.
I can only pass in the URL to retrieve the title.
Any ideas, please?
Thanks in advance.

pushkar kamra
Guest

I need to get all the web pages of the website, not just the current page.

umesh sharma
Guest

What is the JUnit test case for the above program?
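
The program above is hard to unit test as written, because main talks to the network and writes to a fixed path. One possible approach, sketched here under assumptions, is to move the copy loop into a method that takes a Reader and a Writer, so a test can feed it in-memory data. The class name ContentCopier and the method copy are hypothetical, and the check in main uses a plain AssertionError so the sketch has no JUnit dependency; in a JUnit test the same comparison would typically be an assertEquals call.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class ContentCopier {

    // Copies every character from in to out and returns how many were copied.
    // Nothing here touches the network or the file system.
    public static long copy(Reader in, Writer out) throws IOException {
        char[] buffer = new char[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a downloaded page
        String page = "<html>\n<body>hello</body>\n</html>\n";
        StringWriter out = new StringWriter();
        long copied = copy(new StringReader(page), out);
        if (!page.equals(out.toString()) || copied != page.length()) {
            throw new AssertionError("copy changed the content");
        }
        System.out.println("copy test passed");
    }
}
```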

Rudradev Pathak
Guest

I want to read only the text content from a web page, not the JavaScript, CSS, or HTML tags. How should we write the code?
I used many patterns to replace all of those with a space, but it's not working.
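
A common reason this kind of replacement fails is that script and style blocks have to be removed together with their contents before the remaining tags are stripped. The following is a rough sketch using only java.util.regex; the class name TextOnly and the method stripTags are hypothetical. Regular expressions cannot handle every HTML page, and character entities such as &amp;amp; are left undecoded, so a real HTML parser is likely to be more reliable.

```java
import java.util.regex.Pattern;

public class TextOnly {

    // <script>...</script> and <style>...</style>, including their contents
    private static final Pattern SCRIPT_OR_STYLE = Pattern.compile(
            "<(script|style)\\b[^>]*>.*?</\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // HTML comments
    private static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);
    // Any remaining tag
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    public static String stripTags(String html) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = COMMENT.matcher(text).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Collapse the runs of spaces left behind by the removals
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p { color: red; }</style>"
                + "<script>var a = 1 < 2;</script></head>"
                + "<body><p>Hello <b>World</b></p></body></html>";
        System.out.println(stripTags(html)); // prints "Hello World"
    }
}
```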

sunayana
Guest

Hello Sir,

I am getting below error when I execute the above program, please guide.

java.net.ConnectException: Connection timed out: connect
	at java.net.DualStackPlainSocketImpl.connect0(Native Method)
	at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
	at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
	at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
	at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
	at java.net.PlainSocketImpl.connect(Unknown Source)
	at java.net.SocksSocketImpl.connect(Unknown Source)
	at java.net.Socket.connect(Unknown Source)
	at java.net.Socket.connect(Unknown Source)
	at sun.net.NetworkClient.doConnect(Unknown Source)
	at sun.net.www.http.HttpClient.openServer(Unknown Source)
	at sun.net.www.http.HttpClient.openServer(Unknown Source)
	at sun.net.www.http.HttpClient.<init>(Unknown Source)
	at sun.net.www.http.HttpClient.New(Unknown Source)
	at sun.net.www.http.HttpClient.New(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
	at GetURLContent.main(GetURLContent.java:25)

Ram
Guest

You need to check your internet connectivity.

Sama Ansari
Guest

I need help with peer-to-peer networking on Android. Please help me.

Mittal
Guest

Hi MKYONG

How do I test this program with JUnit? Please tell me the steps.

nick
Guest

Hey, did you get the JUnit test case for this program? If so, please mail it to me at nhnngpl@gmail.com ASAP. It's urgent.

Shrikant
Guest

Using this code I am able to get the source code of the site, but the source code I get is not complete; some part of the site is missing.
Please give me a solution so that I get the full source code of the site.

Dinesh
Guest

I am also getting only part of the HTML source. I am using BufferedReader to read the InputStream.

Anand
Guest

Hi,
I’m trying to save HTML content from web services to the SD card on Android. I have a list of URLs that each point to an HTML page. How do I download those pages from the web services to the SD card?

Manas Ranjan
Guest

Hi, when I execute the above code on my local machine (Windows 7) it works fine, but when I try to execute the same code it gives:

java.net.UnknownHostException: http://www.mkyong.com
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:175)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
	at java.net.Socket.connect(Socket.java:546)
	at java.net.Socket.connect(Socket.java:495)
	at sun.net.NetworkClient.doConnect(NetworkClient.java:178)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:409)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
	at sun.net.www.http.HttpClient.<init>(HttpClient.java:240)
	at sun.net.www.http.HttpClient.New(HttpClient.java:321)
	at sun.net.www.http.HttpClient.New(HttpClient.java:338)
	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:935)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:876)
	at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:801)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
	at GetURLContent.main(GetURLContent.java:22)

Can you suggest what the reason for this would be? Thanks in advance.

Mittal
Guest

Please connect your system to the Internet and then try again.

Harsha
Guest

I am getting this exception:

java.io.IOException: Access is denied

Is there a way to bypass authentication?

Harsha
Guest

Sorry, that's working. It looks like I had permission issues with creating the file.

Ajay
Guest

Thanks for this code. It would be a great help if you could tell me how to get the website content into a txt file. I was trying, but this code outputs the HTML code along with the content. Please help me out.

Thanks

Ashim
Guest

I want the reverse case: read a file (HTML content) from the hard disk and display it as an HTML page on the web. (I am using Java EE, Hibernate, JSF 2.0, and GlassFish 3+ as the server.)
Your help will be highly appreciated.

Shmilfke
Guest

How do I get the program to do this regularly, for example every five minutes?

Rohan Sethi
Guest

Use a timer.
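
One way to follow this suggestion is a ScheduledExecutorService from java.util.concurrent. The sketch below is an illustration under assumptions: the class name PeriodicFetch and the helper runEvery are hypothetical, and the task only prints a message where the real program would fetch the URL. The demo in main uses a one-second period and stops after about three seconds so that it terminates; for the interval asked about, the call would be runEvery(task, 5, TimeUnit.MINUTES) without the shutdown.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicFetch {

    // Runs the task immediately and then again after every period.
    // Exceptions are caught because an uncaught exception would stop
    // all further runs of a task scheduled with scheduleAtFixedRate.
    public static ScheduledExecutorService runEvery(Runnable task, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                e.printStackTrace();
            }
        }, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println("fetch the URL here"); // placeholder task
        ScheduledExecutorService scheduler = runEvery(task, 1, TimeUnit.SECONDS);
        Thread.sleep(3000);   // let the demo run a few times
        scheduler.shutdown(); // stop scheduling so the JVM can exit
    }
}
```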

Shmilfke
Guest

How do I get it to check the website at regular times, for example, every five minutes?

Thanks!

vivek ghavle
Guest

just put the code line

main(args);

at the end of the program.

Kurret
Guest

Thanks, that was really good! You do a great job!

peter
Guest

Hi,
Thanks, mkyong, but this code doesn't work for HTTPS sites.
Any idea how to go about it?

rasul
Guest

Thank you for this information.

??????? ?????????
Guest

nice post,
thanx

Jonathan
Guest

I want to write a program that gets my notifications from facebook. Do you think the code above would work?

Jawahar
Guest

Hi,

When I try the above program, I am getting error as below:

java.net.ConnectException: Connection timed out: connect

Please help

Cristian Rivera
Guest

Hello Jawahar, I just tried out the above code and it seemed to work for me. The only part that wasn’t included with the HTML document was the site's images. If you don’t mind me asking, did you change the output directory from String fileName = “/users/mkyong/test.html”; to your own path?

Cristian Rivera
Guest

I'm sorry about my last reply. I misread your comment. A connection timeout mostly occurs if you have an inconsistent internet connection or if the site is having trouble. I just tried out the code a few minutes ago and I had no problems.
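
Setting explicit timeouts does not fix a bad connection, but it makes the failure show up after a known delay instead of after the operating system's default. The following is a minimal sketch, assuming a five-second limit is acceptable; the class name TimeoutExample and the helper openWithTimeouts are hypothetical, while setConnectTimeout and setReadTimeout are standard URLConnection methods.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

public class TimeoutExample {

    // Prepares a connection with the same limit for connecting and reading.
    // openConnection() only creates the object; the network is not contacted
    // until connect() or getInputStream() is called.
    public static URLConnection openWithTimeouts(URL url, int timeoutMillis) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(timeoutMillis); // limit for establishing the connection
        conn.setReadTimeout(timeoutMillis);    // limit for each blocking read
        return conn;
    }

    public static void main(String[] args) {
        try {
            URLConnection conn = openWithTimeouts(new URL("http://www.mkyong.com"), 5000);
            conn.connect(); // a java.net.SocketTimeoutException is thrown if the limit elapses
            System.out.println("Connected, content type: " + conn.getContentType());
        } catch (IOException e) {
            System.err.println("Could not connect: " + e);
        }
    }
}
```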