Jsoup – Check Redirect URL
In this article, we will show you how to use Jsoup to check whether a URL is going to redirect.
1. URL Redirection
Normally, a redirecting URL returns an HTTP status code such as 301, 302 or 307, and the target URL is given in the “location” field of the response header.
Here is a sample HTTP response header:
HTTP code : 301 moved permanently
{
Location=https://mkyong.com,
Server=GSE,
Cache-Control=no-cache, no-store, max-age=0, must-revalidate
}
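The header check described above can be sketched in plain Java, without any HTTP library. This is only an illustration: the class `RedirectHeaders` and its methods are hypothetical names, and the set of status codes treated as redirects (301, 302, 303, 307, 308) is this sketch's assumption. Header names are matched case-insensitively, because servers may send either "Location" or "location".

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Jsoup.
class RedirectHeaders {

    // Assumption: these 3xx codes are the ones that point to a new URL.
    static boolean isRedirectStatus(int statusCode) {
        switch (statusCode) {
            case 301: case 302: case 303: case 307: case 308:
                return true;
            default:
                return false;
        }
    }

    // Returns the redirect target, or empty if the response is not a redirect
    // or carries no Location header.
    static Optional<String> redirectTarget(int statusCode, Map<String, String> headers) {
        if (!isRedirectStatus(statusCode)) {
            return Optional.empty();
        }
        // HTTP header names are case-insensitive, so look them up that way.
        Map<String, String> caseInsensitive = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        caseInsensitive.putAll(headers);
        return Optional.ofNullable(caseInsensitive.get("Location"));
    }

    public static void main(String[] args) {
        Map<String, String> headers = Map.of("Location", "https://mkyong.com", "Server", "GSE");
        System.out.println(redirectTarget(301, headers).orElse("no redirect"));
        System.out.println(redirectTarget(200, headers).orElse("no redirect"));
    }
}
```

With the sample header above, the first call prints the target URL and the second prints "no redirect", because a 200 response is not treated as a redirect even if a Location header is present.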
2. Jsoup Example
2.1 By default, Jsoup follows redirects automatically, so the response holds the status code and URL of the final page.
RedirectExample.java
package com.mkyong.crawler;

import java.io.IOException;

import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

public class RedirectExample {

    public static void main(String[] args) throws IOException {

        String url = "http://goo.gl/fb/gyBkwR";

        // Redirects are followed by default, so this is the final response
        Response response = Jsoup.connect(url).execute();
        System.out.println(response.statusCode() + " : " + response.url());

    }

}
Output
200 : https://mkyong.com/mongodb/mongodb-remove-a-field-from-array-documents/
2.2 To test the URL redirection, set followRedirects to false.
Response response = Jsoup.connect(url).followRedirects(false).execute();
System.out.println(response.statusCode() + " : " + response.url());

// A redirect response carries its target URL in the "location" header
System.out.println("Is URL going to redirect : " + response.hasHeader("location"));
System.out.println("Target : " + response.header("location"));
Output
301 : http://goo.gl/fb/gyBkwR
Is URL going to redirect : true
Target : http://feeds.feedburner.com/~r/FeedForMkyong/~3/D_6Jqi4trqo/...
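In the output above the target is an absolute URL, but HTTP also allows the location header to hold a relative reference such as "/new-page". A minimal sketch of one way to handle that case, using only java.net.URI from the JDK, is shown below. The class name `LocationResolver` and the example URLs are made up for illustration.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of Jsoup.
class LocationResolver {

    // Resolves the value of a location header against the URL that was requested.
    // An absolute location is returned unchanged; a relative one is resolved
    // against the request URL. URI.create throws IllegalArgumentException
    // if either string is not a valid URI (for example, if it contains spaces).
    static String resolve(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String requestUrl = "http://example.com/old/page";

        // Absolute target: kept as is
        System.out.println(resolve(requestUrl, "https://example.org/new"));

        // Path-absolute target: scheme and host come from the request URL
        System.out.println(resolve(requestUrl, "/new/page"));
    }
}
```

The resolved value can then be passed to the next `Jsoup.connect(...)` call instead of the raw header value.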
3. Jsoup Example, Again
3.1 This example follows the redirect chain recursively and prints the status code and URL of each hop.
RedirectExample.java
package com.mkyong.crawler;

import java.io.IOException;

import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;

public class RedirectExample {

    public static void main(String[] args) throws IOException {

        String url = "http://goo.gl/fb/gyBkwR";

        RedirectExample obj = new RedirectExample();
        obj.crawl(url);

    }

    private void crawl(String url) throws IOException {

        // Do not follow redirects, so each hop can be inspected
        Response response = Jsoup.connect(url).followRedirects(false).execute();
        System.out.println(response.statusCode() + " : " + url);

        // If the response has a "location" header, crawl the target URL next
        if (response.hasHeader("location")) {
            String redirectUrl = response.header("location");
            crawl(redirectUrl);
        }

    }

}
Output
301 : http://goo.gl/fb/gyBkwR
301 : http://feeds.feedburner.com/~r/FeedForMkyong/~3/D_6Jqi4trqo/...
200 : https://mkyong.com/mongodb/mongodb-remove-a-field-from-array-documents/
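The recursive version above keeps going for as long as the server keeps sending a location header, so two URLs that redirect to each other would never terminate. Below is a sketch of an iterative variant with a hop limit and loop detection. To keep it runnable without a network connection, the HTTP call is abstracted as a `lookup` function that returns the location header for a URL. The class `RedirectTracer`, the `lookup` parameter, the `maxHops` parameter and the example.com URLs are all assumptions of this sketch, not part of Jsoup.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Jsoup.
class RedirectTracer {

    // Returns the chain of URLs visited, starting with startUrl.
    // lookup: gives the location header value for a URL, or empty if none.
    // maxHops: maximum number of redirects to follow.
    static List<String> trace(String startUrl,
                              Function<String, Optional<String>> lookup,
                              int maxHops) {
        List<String> chain = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String current = startUrl;

        // seen.add returns false for a URL already visited, which ends a loop
        while (seen.add(current)) {
            chain.add(current);
            if (chain.size() > maxHops) {
                break; // hop limit reached
            }
            Optional<String> location = lookup.apply(current);
            if (!location.isPresent()) {
                break; // no redirect, this is the final URL
            }
            // The location header may be relative, so resolve it
            current = URI.create(current).resolve(location.get()).toString();
        }
        return chain;
    }

    public static void main(String[] args) {
        // Fake redirect table standing in for real HTTP responses
        Map<String, String> fake = Map.of(
                "http://example.com/a", "/b",
                "http://example.com/b", "https://example.org/c");

        List<String> chain = trace("http://example.com/a",
                url -> Optional.ofNullable(fake.get(url)), 10);
        System.out.println(chain);
    }
}
```

To use this with Jsoup, the `lookup` function could wrap the same call as in the example above, `Jsoup.connect(url).followRedirects(false).execute()`, and return the value of `response.header("location")`, converting the checked IOException as needed.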