How to get URL content in Java

In this Java example, we show you how to get the content of a page from the URL “mkyong.com” and save it to a local file named “test.html”.

package com.mkyong;
 
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
 
public class GetURLContent {
	public static void main(String[] args) {
 
		try {
			// get URL content
			URL url = new URL("http://www.mkyong.com");
			URLConnection conn = url.openConnection();

			// save to this filename; FileWriter creates the file if it does not exist
			String fileName = "/users/mkyong/test.html";
			File file = new File(fileName);

			// try-with-resources closes the reader and the writer,
			// even if an exception is thrown while copying
			try (BufferedReader br = new BufferedReader(
					new InputStreamReader(conn.getInputStream()));
				BufferedWriter bw = new BufferedWriter(
					new FileWriter(file.getAbsoluteFile()))) {

				String inputLine;

				while ((inputLine = br.readLine()) != null) {
					bw.write(inputLine);
					// readLine() strips the line terminator, so add it back
					bw.newLine();
				}
			}

			System.out.println("Done");

		} catch (MalformedURLException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
 
	}
}
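
The listing above reads the page as text, line by line, using the platform default charset. As an alternative, here is a minimal sketch, assuming Java 7 or later, that copies the raw bytes with Files.copy so that line breaks and encoding reach the file unchanged. The class name CopyURLContent and the method name download are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical class name, not part of the original example.
class CopyURLContent {

	// Copies the bytes behind the URL into target (overwriting it)
	// and returns the number of bytes written.
	static long download(URL url, Path target) throws IOException {
		try (InputStream in = url.openStream()) {
			return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
		}
	}

	public static void main(String[] args) throws IOException {
		if (args.length != 2) {
			System.err.println("usage: java CopyURLContent <url> <output-file>");
			return;
		}
		long bytes = download(new URL(args[0]), Paths.get(args[1]));
		System.out.println("Done, " + bytes + " bytes");
	}
}
```

Running it with the arguments http://www.mkyong.com and /users/mkyong/test.html would mirror the original example.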

About the Author

mkyong
Founder of Mkyong.com and HostingCompass.com; loves Java and open source stuff. Follow him on Twitter, or befriend him on Facebook or Google Plus. If you like my tutorials, consider making a donation to these charities.

Comments

  • Sama Ansari

    I need help with peer-to-peer networking on Android. Please help me.

  • Mittal

    Hi MKYONG

    How can I test this program with JUnit? Please tell me the steps.

  • Shrikant

    Using this code I am able to get the source code of the site, but the source code I get is not complete; some part of the site is missing.
    Please give me a solution so that I get the full source code from the site.

    • Dinesh

      I am also getting only part of the HTML source. I am using BufferedReader to read the InputStream.

  • Anand

    Hi,
    I’m trying to save HTML content from web services to the SD card on Android. I have a list of URLs, each pointing to an HTML page. How do I download them from the web services to the SD card?

  • Manas Ranjan

    Hi,
    When I execute the above code on my local machine (Windows 7) it works fine, but when I execute the same code it gives
    java.net.UnknownHostException: http://www.mkyong.com
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:175)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
    at java.net.Socket.connect(Socket.java:546)
    at java.net.Socket.connect(Socket.java:495)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:178)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:409)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:240)
    at sun.net.www.http.HttpClient.New(HttpClient.java:321)
    at sun.net.www.http.HttpClient.New(HttpClient.java:338)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:935)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:876)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:801)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
    at GetURLContent.main(GetURLContent.java:22)

    Can you suggest what the reason for this might be?
    Thanks in advance.

    • Mittal

      Please connect your system to the Internet and then try again.

  • Harsha

    I am getting

    java.io.IOException: Access is denied exception.

    Is there a way to bypass authentication?

    • Harsha

      Sorry, it is working now. It looks like I had permission issues with creating the file.

  • Ajay

    Thanks for this code. It would be a great help if you could tell me how to get the website content into a txt file. I tried, but this code outputs the HTML markup along with the content. Please help me out.

    Thanks

  • Ashim

    I want the reverse case: read a file (HTML content) from the hard disk and display it as an HTML page on the web. (I am using Java EE, Hibernate, JSF 2.0, and GlassFish 3+ as the server.)
    Your help will be highly appreciated.

  • Shmilfke

    How do I get the program to do this regularly, for example every five minutes?

  • Shmilfke

    How do I get it to check the website at regular intervals, for example every five minutes?

    Thanks!

    • vivek ghavle

      just put the code line

      main(args);

      at the end of the program.
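
A note on the suggestion above: calling main(args) recursively would refetch immediately, with no pause, and each call deepens the call stack. A minimal sketch of a timed alternative, assuming the standard ScheduledExecutorService is acceptable here; the class name ScheduledFetch and the method name repeat are hypothetical, and the task passed in stands for the download logic from the article.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical class name, not part of the original example.
class ScheduledFetch {

	// Runs task once immediately and then again every period.
	// The returned handle can be used to cancel the schedule.
	static ScheduledFuture<?> repeat(ScheduledExecutorService scheduler,
			Runnable task, long period, TimeUnit unit) {
		Runnable safeTask = () -> {
			try {
				task.run();
			} catch (RuntimeException e) {
				// An uncaught exception would cancel all later runs,
				// so report it and keep the schedule alive.
				e.printStackTrace();
			}
		};
		return scheduler.scheduleAtFixedRate(safeTask, 0, period, unit);
	}
}
```

For a check every five minutes, one could create the scheduler with Executors.newSingleThreadScheduledExecutor() and call repeat(scheduler, task, 5, TimeUnit.MINUTES), where task wraps the URL download.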

  • Kurret

    Thanks, that was really good! You do a great job!

  • peter

    Hi,
    Thanks, mkyong, but this code doesn’t work for https sites.
    Any idea how to go about it?
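
The comment gives no error message, so the cause is unknown. One possibility, stated here as an assumption: the site redirects from http to https, and HttpURLConnection does not follow redirects that switch protocol. For an https URL, url.openConnection() itself returns an HttpsURLConnection, so the reading code stays the same. The sketch below follows redirects by hand; the class name FollowRedirects, the method name download, and the limit MAX_REDIRECTS are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical class name, not part of the original example.
class FollowRedirects {

	// Hypothetical limit so that a redirect loop cannot run forever.
	static final int MAX_REDIRECTS = 5;

	static boolean isRedirect(int status) {
		return status == 301 || status == 302 || status == 303
				|| status == 307 || status == 308;
	}

	// Saves the page to target, following redirects manually
	// (including redirects from http to https).
	static void download(URL start, Path target) throws IOException {
		URL url = start;
		for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
			// Works for http and https: HttpsURLConnection extends HttpURLConnection.
			HttpURLConnection conn = (HttpURLConnection) url.openConnection();
			conn.setInstanceFollowRedirects(false);
			int status = conn.getResponseCode();
			if (isRedirect(status)) {
				String location = conn.getHeaderField("Location");
				conn.disconnect();
				if (location == null) {
					throw new IOException("Redirect without Location header from " + url);
				}
				// Resolves relative Location values against the current URL.
				url = new URL(url, location);
				continue;
			}
			try (InputStream in = conn.getInputStream()) {
				Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
			}
			return;
		}
		throw new IOException("Too many redirects starting at " + start);
	}
}
```

If the failure is instead an SSLHandshakeException, the likely area to look at is the certificate trust setup of the JVM rather than the reading code.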

  • http://juge.ir rasul

    Thank you for this information.

  • http://www.asrepooya.com ??????? ?????????

    nice post,
    thanx

  • Jonathan

    I want to write a program that gets my notifications from facebook. Do you think the code above would work?

    • http://www.mkyong.com mkyong

      I can’t comment much on this, since I don’t know how Facebook handles notifications. It looks like it uses a “push” technique to display them. You can try, but I don’t think the simple solution above will work well; you may need to add extra handling for Facebook authentication.

      • Jonathan

        Thank you, you are right. I ended up using RestFB.

  • Jawahar

    Hi,

    When I try the above program, I am getting error as below:

    java.net.ConnectException: Connection timed out: connect

    Please help

    • Cristian Rivera

      Hello Jawahar, I just tried out the above code and it seemed to work for me. The only part that wasn’t included with the HTML document was the site’s images. If you don’t mind me asking, did you change the output directory in String fileName = “/users/mkyong/test.html”; to your own?

    • Cristian Rivera

      I’m sorry about my last reply; I misread your comment. A connection timeout mostly occurs if you have an inconsistent Internet connection or if the site is having trouble. I just tried out the code a few minutes ago and I had no problems.
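
Explicit timeouts will not cure a bad connection, but they make the program fail quickly instead of hanging. A minimal sketch, assuming the millisecond values are picked by the caller; the class name TimeoutExample and the method name openWithTimeouts are hypothetical.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical class name, not part of the original example.
class TimeoutExample {

	// Prepares a connection that gives up after connectMillis while connecting
	// and after readMillis while waiting for data. Nothing is sent yet;
	// the connection is opened on the first read, e.g. getInputStream().
	static URLConnection openWithTimeouts(URL url, int connectMillis, int readMillis)
			throws IOException {
		URLConnection conn = url.openConnection();
		conn.setConnectTimeout(connectMillis);
		conn.setReadTimeout(readMillis);
		return conn;
	}
}
```

The returned connection can replace the conn variable in the article's example. If the machine sits behind a proxy, which is another common cause of this error, the JVM may also need the http.proxyHost and http.proxyPort system properties.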