从Commons HttpClient迁移到HttpComponents Client

我想从Commons HttpClient(3.x)迁移到HttpComponents Client(4.x),但是如何处理重定向很困难。 代码在Commons HttpClient下正常工作,但在迁移到HttpComponents Client时中断。 一些链接会产生不良重定向,但当我将“http.protocol.handle-redirects”设置为“false”时,大量链接会完全停止工作。

Commons HttpClient 3.x:

private static HttpClient httpClient = null; private static MultiThreadedHttpConnectionManager connectionManager = null; private static final long MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds static { //HttpURLConnection.setFollowRedirects(true); CookieManager manager = new CookieManager(); manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL); CookieHandler.setDefault(manager); connectionManager = new MultiThreadedHttpConnectionManager(); connectionManager.getParams().setDefaultMaxConnectionsPerHost(1000); // will need to set from properties file connectionManager.getParams().setMaxTotalConnections(1000); httpClient = new HttpClient(connectionManager); } /* * Retrieve HTML */ public String fetchURL(String url) throws IOException{ if ( StringUtils.isEmpty(url) ) return null; GetMethod getMethod = new GetMethod(url); HttpClient httpClient = new HttpClient(); //configureMethod(getMethod); //ObjectInputStream oin = null; InputStream in = null; int code = -1; String html = ""; String lastModified = null; try { code = httpClient.executeMethod(getMethod); in = getMethod.getResponseBodyAsStream(); //oin = new ObjectInputStream(in); //html = getMethod.getResponseBodyAsString(); html = CharStreams.toString(new InputStreamReader(in)); } catch (Exception except) { } finally { try { //oin.close(); in.close(); } catch (Exception except) {} getMethod.releaseConnection(); connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME); } if (code <= 400){ return html.replaceAll("\\s+", " "); } else { throw new Exception("URL: " + url + " returned response code " + code); } } 

HttpComponents客户端4.x:

 private static HttpClient httpClient = null; private static HttpParams params = null; //private static MultiThreadedHttpConnectionManager connectionManager = null; private static ThreadSafeClientConnManager connectionManager = null; private static final int MAX_CONNECTION_IDLE_TIME = 60000; // milliseconds static { //HttpURLConnection.setFollowRedirects(true); CookieManager manager = new CookieManager(); manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL); CookieHandler.setDefault(manager); connectionManager = new ThreadSafeClientConnManager(); connectionManager.setDefaultMaxPerRoute(1000); // will need to set from properties file connectionManager.setMaxTotal(1000); httpClient = new DefaultHttpClient(connectionManager); // HTTP parameters stores header etc. params = new BasicHttpParams(); params.setParameter("http.protocol.handle-redirects",false); } /* * Retrieve HTML */ public String fetchURL(String url) throws IOException{ if ( StringUtils.isEmpty(url) ) return null; InputStream in = null; //int code = -1; String html = ""; // Prepare a request object HttpGet httpget = new HttpGet(url); httpget.setParams(params); // Execute the request HttpResponse response = httpClient.execute(httpget); // The response status //System.out.println(response.getStatusLine()); int code = response.getStatusLine().getStatusCode(); // Get hold of the response entity HttpEntity entity = response.getEntity(); // If the response does not enclose an entity, there is no need // to worry about connection release if (entity != null) { try { //code = httpClient.executeMethod(getMethod); //in = getMethod.getResponseBodyAsStream(); in = entity.getContent(); html = CharStreams.toString(new InputStreamReader(in)); } catch (Exception except) { throw new Exception("URL: " + url + " returned response code " + code); } finally { try { //oin.close(); in.close(); } catch (Exception except) {} //getMethod.releaseConnection(); connectionManager.closeIdleConnections(MAX_CONNECTION_IDLE_TIME, TimeUnit.MILLISECONDS); connectionManager.closeExpiredConnections(); } } if (code <= 400){ return html; } else { throw new Exception("URL: " + url + " returned response code " + code); } } 

我不想要重定向,但在HttpClient 4.x下,如果我启用重定向,那么我得到一些不受欢迎的,例如http://www.walmart.com/ => http://mobile.walmart.com/ 。 在HttpClient 3.x下,没有发生这样的重定向。

在不破坏代码的情况下,我需要做什么才能将HttpClient 3.x迁移到HttpClient 4.x?

这不是HttpClient 4.x的问题,可能是目标服务器处理请求的方式,因为用户代理是httpclient,它可以作为移动处理(目标服务器可能会考虑除了可用的浏览器之外,例如chrome,mozilla等移动。)

请使用以下代码手动设置

  httpclient.getParams().setParameter( org.apache.http.params.HttpProtocolParams.USER_AGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2" );