Avoiding circular redirects with HttpClient 4.1.1

How can I avoid a circular redirect with HttpClient 4.1.1? I am getting this error:

    executing requestGET http://home.somehost.com/Mynet/pages/cHome.xhtml HTTP/1.1
    org.apache.http.client.ClientProtocolException
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:822)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
        at edu.uci.ics.crawler4j.url.WebURL.setURL(WebURL.java:122)
        at edu.uci.ics.crawler4j.crawler.CrawlController.addSeed(CrawlController.java:207)
        at edu.uci.ics.crawler4j.example.advanced.Controller.main(Controller.java:31)
    Caused by: org.apache.http.client.CircularRedirectException: Circular redirect to 'http://home.somehost.com/Mynet/pages/Home.xhtml'
        at org.apache.http.impl.client.DefaultRedirectStrategy.getLocationURI(DefaultRedirectStrategy.java:168)
        at org.apache.http.impl.client.DefaultRedirectStrategy.getRedirect(DefaultRedirectStrategy.java:193)
        at org.apache.http.impl.client.DefaultRequestDirector.handleResponse(DefaultRequestDirector.java:1021)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:482)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)

Here is my code:

    DefaultHttpClient client = null;
    try {
        // Set url
        //URI uri = new URI(url.toString());
        client = new DefaultHttpClient();
        client.getCredentialsProvider().setCredentials(
                new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT, AuthScope.ANY_REALM),
                new UsernamePasswordCredentials("test", "test"));

        URL url1 = new URL(url);
        HttpURLConnection connection = (HttpURLConnection) url1.openConnection();
        connection.setFollowRedirects(false);

        HttpGet request = new HttpGet(url);
        final HttpParams params = new BasicHttpParams();
        HttpClientParams.setRedirecting(params, false);
        HttpContext context = new BasicHttpContext();

        System.out.println("----------------------------------------");
        System.out.println("executing request" + request.getRequestLine());
        HttpResponse response = client.execute(request, context);
        HttpEntity entity = response.getEntity();
        System.out.println(response.getStatusLine());

        InputStream content = entity.getContent();
        BufferedReader in = new BufferedReader(new InputStreamReader(content));
        String line;
        while ((line = in.readLine()) != null) {
            // System.out.println(line);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

You can set ClientPNames.ALLOW_CIRCULAR_REDIRECTS to true, which allows redirects to the same location.

  client.getParams().setParameter(ClientPNames.ALLOW_CIRCULAR_REDIRECTS, true); 

See here for more information.

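You are already avoiding it. HttpClient detects the circular redirect and throws an exception. If it did not, it would keep following redirects forever (until you decided to kill the process). If that is what the server responds with, there are not many other options.

The only way to truly avoid a circular redirect loop is to fix the server.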
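To make the detection concrete, here is a minimal model of how a client can spot a redirect loop: remember every location it has already visited and fail as soon as one repeats. This is a simplified sketch, not HttpClient's actual code. The `RedirectLoopDemo` class, the `follow` method, and the in-memory redirect table are hypothetical stand-ins for real HTTP responses.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class RedirectLoopDemo {

    /**
     * Follows entries in a simulated redirect table (hypothetical, no real HTTP)
     * until a location has no further redirect. Throws if a location repeats.
     */
    public static String follow(String start, Map<String, String> redirects) {
        Set<String> visited = new HashSet<>();
        String current = start;
        while (redirects.containsKey(current)) {
            // Set.add returns false when the location was already visited.
            if (!visited.add(current)) {
                throw new IllegalStateException("Circular redirect to '" + current + "'");
            }
            current = redirects.get(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, String> loop = new HashMap<>();
        loop.put("/a", "/b");
        loop.put("/b", "/a");
        try {
            follow("/a", loop);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Without the `visited` check, the `while` loop in this model would never terminate for the `/a` and `/b` table, which is the "redirect forever" case described above.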

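If you want to know what is going on (for example, why the page seems to work in a browser but not from your program), try turning on some extra HttpClient logging. In particular, make sure you can see all the HTTP headers being sent back and forth. You can then look at the conversation that takes place when you make the same request in a browser and note the differences. It could be a missing cookie, crazy browser detection, etc.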
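One way to turn on that logging in HttpClient 4.x, which logs through Commons Logging, is to set a few system properties before the first client is created. The sketch below assumes no other logging backend (such as Log4j) is configured; the class and method names are illustrative only.

```java
final class HttpClientLogging {

    /**
     * Routes Commons Logging to its built-in SimpleLog and enables
     * header and context logging while keeping raw wire dumps quiet.
     */
    public static void enableHeaderLogging() {
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        // Debug output for HttpClient in general, including request/response headers.
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http", "DEBUG");
        // Suppress the full wire log; set this to DEBUG to see message bodies too.
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http.wire", "ERROR");
    }

    public static void main(String[] args) {
        enableHeaderLogging();
        // Create and use the DefaultHttpClient after this point.
    }
}
```

The same properties can be passed as `-D` options on the command line instead of being set in code.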

There are many ways to trace browser communication. Here are a few that I use often, from easiest to hardest (IMHO):

  • Firefox + HttpFox (or LiveHttpHeaders, Firebug, etc.)
  • Fiddler (Windows only)
  • Wireshark / tcpdump

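For low-level testing, try using telnet (unless you are on Windows, in which case you are probably better off with something like PuTTY/plink) and rule in or out the changes that cause the circular redirect.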
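If telnet is not at hand, the same low-level experiment can be approximated from Java with a plain socket: send a hand-written request, read the raw response, and look at the status line and `Location` header. This is a rough sketch under the assumption of a plain HTTP (not HTTPS) server; the `RawHttpProbe` class and its methods are hypothetical names, and the host and path in `main` are taken from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class RawHttpProbe {

    /** Builds the request text you would otherwise type into telnet. */
    public static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    /** Sends the request over a plain socket and returns the raw response text. */
    public static String send(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // "Connection: close" asks the server to close the socket when it is done,
            // so reading until end of stream collects the whole response.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(send("home.somehost.com", 80, "/Mynet/pages/cHome.xhtml"));
    }
}
```

Adding or removing headers in `buildRequest` (cookies, `User-Agent`, and so on) one at a time makes it easier to find which difference triggers the redirect loop.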

There is a bug in Apache HttpClient, present since 4.0, that causes circular redirects, and it has not been fixed even in the latest version.

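In DefaultRequestDirector.java, an HttpRedirect is created to perform the redirect, and it reuses all the headers from the original HttpGet. The problem is that it also reuses the Host header, which means the server still receives the original host after the client tries to redirect to the new URI.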
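To illustrate the point about the Host header: after a redirect to a different host, the header should name the redirect target rather than the original host. The helper below is a hypothetical sketch that uses only `java.net.URI`. It is not part of HttpClient, it assumes absolute http or https URIs, and the redirect target in `main` is an invented example.

```java
import java.net.URI;

final class HostHeaderDemo {

    /** Returns the Host header value a request to the given absolute URI should carry. */
    public static String hostHeaderFor(URI uri) {
        String host = uri.getHost();
        int port = uri.getPort(); // -1 when the URI does not specify a port
        boolean defaultPort = port == -1
                || ("http".equalsIgnoreCase(uri.getScheme()) && port == 80)
                || ("https".equalsIgnoreCase(uri.getScheme()) && port == 443);
        // The port is omitted from the header when it is the scheme's default.
        return defaultPort ? host : host + ":" + port;
    }

    public static void main(String[] args) {
        URI original = URI.create("http://home.somehost.com/Mynet/pages/cHome.xhtml");
        // Hypothetical redirect target on a different host and port.
        URI redirected = URI.create("http://login.example.org:8080/auth");
        System.out.println("Host before redirect: " + hostHeaderFor(original));
        System.out.println("Host after redirect:  " + hostHeaderFor(redirected));
    }
}
```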

I fixed this by reimplementing DefaultRequestDirector:

    public class RedirectRequestDirector extends DefaultRequestDirector {

        RedirectRequestDirector(
                final HttpRequestExecutor requestExec,
                final ClientConnectionManager conman,
                final ConnectionReuseStrategy reustrat,
                final ConnectionKeepAliveStrategy kastrat,
                final HttpRoutePlanner rouplan,
                final HttpProcessor httpProcessor,
                final HttpRequestRetryHandler retryHandler,
                final RedirectHandler redirectHandler,
                final AuthenticationHandler targetAuthHandler,
                final AuthenticationHandler proxyAuthHandler,
                final UserTokenHandler userTokenHandler,
                final HttpParams params) {
            super(requestExec, conman, reustrat, kastrat, rouplan, httpProcessor,
                    retryHandler, redirectHandler, targetAuthHandler, proxyAuthHandler,
                    userTokenHandler, params);
        }

        @Override
        protected RoutedRequest handleResponse(RoutedRequest roureq,
                HttpResponse response, HttpContext context)
                throws HttpException, IOException {
            RoutedRequest req = super.handleResponse(roureq, response, context);
            if (req != null) {
                String redirectTarget = req.getRoute().getTargetHost().getHostName();
                req.getRequest().getOriginal().setHeader("Host", redirectTarget);
            }
            return req;
        }
    }

And the DefaultHttpClient:

    public class RedirectHttpClient extends DefaultHttpClient {

        @Override
        protected RequestDirector createClientRequestDirector(
                final HttpRequestExecutor requestExec,
                final ClientConnectionManager conman,
                final ConnectionReuseStrategy reustrat,
                final ConnectionKeepAliveStrategy kastrat,
                final HttpRoutePlanner rouplan,
                final HttpProcessor httpProcessor,
                final HttpRequestRetryHandler retryHandler,
                final RedirectHandler redirectHandler,
                final AuthenticationHandler targetAuthHandler,
                final AuthenticationHandler proxyAuthHandler,
                final UserTokenHandler stateHandler,
                final HttpParams params) {
            return new RedirectRequestDirector(
                    requestExec, conman, reustrat, kastrat, rouplan, httpProcessor,
                    retryHandler, redirectHandler, targetAuthHandler, proxyAuthHandler,
                    stateHandler, params);
        }
    }

Now it no longer complains about circular redirects.

Check whether your request is being sent to a proxy before it reaches the URL you requested.
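A quick way to see whether the JVM's default settings would route a URL through a proxy is to ask `java.net.ProxySelector`. This is only a sketch: HttpClient 4.x may not consult the JVM proxy settings at all (a proxy can also be configured through its own parameters), so treat the result as a hint rather than proof. The `ProxyCheck` class and `isDirect` method are illustrative names.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

final class ProxyCheck {

    /** Returns true if every entry in the list is a direct (no proxy) connection. */
    public static boolean isDirect(List<Proxy> proxies) {
        for (Proxy proxy : proxies) {
            if (proxy.type() != Proxy.Type.DIRECT) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // URL taken from the question; replace it with the one you are requesting.
        URI target = URI.create("http://home.somehost.com/Mynet/pages/cHome.xhtml");
        List<Proxy> proxies = ProxySelector.getDefault().select(target);
        System.out.println(isDirect(proxies)
                ? "JVM defaults connect directly to " + target.getHost()
                : "JVM defaults use a proxy: " + proxies);
    }
}
```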

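You can try this (it uses the builder API introduced in HttpClient 4.3, together with Spring's HttpComponentsClientHttpRequestFactory):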

    RequestConfig requestConfig = RequestConfig.custom()
            .setCircularRedirectsAllowed(true)
            .build();

    HttpClient httpClient = HttpClients.custom()
            .setDefaultRequestConfig(requestConfig)
            .setRedirectStrategy(new LaxRedirectStrategy())
            .build();

    HttpComponentsClientHttpRequestFactory requestFactory =
            new HttpComponentsClientHttpRequestFactory();
    requestFactory.setHttpClient(httpClient);