HttpClient: log in, search, and retrieve XML content

I want to use HttpClient to log in to a website. Once logged in, I want to run a search and retrieve the content of the search results.

    import java.io.BufferedReader;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.cookie.Cookie;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.protocol.HTTP;

    /**
     * An example that demonstrates how HttpClient APIs can be used to perform
     * form-based logon.
     */
    public class TestHttpClient {
        public static void main(String[] args) throws Exception {
            DefaultHttpClient httpclient = new DefaultHttpClient();

            // Fetch the start page to pick up any initial cookies.
            HttpGet httpget = new HttpGet("http://projecteuler.net/");
            HttpResponse response = httpclient.execute(httpget);
            HttpEntity entity = response.getEntity();
            System.out.println("Login form get: " + response.getStatusLine());
            if (entity != null) {
                entity.consumeContent();
            }

            System.out.println("Initial set of cookies:");
            List<Cookie> cookies = httpclient.getCookieStore().getCookies();
            if (cookies.isEmpty()) {
                System.out.println("None");
            } else {
                for (Cookie cookie : cookies) {
                    System.out.println("- " + cookie);
                }
            }

            // Submit the login form.
            HttpPost httpost = new HttpPost("http://projecteuler.net/index.php?section=login");
            List<NameValuePair> nvps = new ArrayList<NameValuePair>();
            nvps.add(new BasicNameValuePair("IDToken1", "username"));
            nvps.add(new BasicNameValuePair("IDToken2", "password"));
            httpost.setEntity(new UrlEncodedFormEntity(nvps, HTTP.UTF_8));

            response = httpclient.execute(httpost);
            System.out.println("Response " + response.toString());
            entity = response.getEntity();
            System.out.println("Login form get: " + response.getStatusLine());
            if (entity != null) {
                // Print the body of the POST response line by line.
                InputStream is = entity.getContent();
                BufferedReader br = new BufferedReader(new InputStreamReader(is));
                String line;
                while ((line = br.readLine()) != null) {
                    System.out.println(line);
                }
            }

            System.out.println("Post logon cookies:");
            cookies = httpclient.getCookieStore().getCookies();
            if (cookies.isEmpty()) {
                System.out.println("None");
            } else {
                for (Cookie cookie : cookies) {
                    System.out.println("- " + cookie);
                }
            }

            httpclient.getConnectionManager().shutdown();
        }
    }

When I print the output from the HttpEntity, I get the content of the login page. How do I get the page content after logging in with HttpClient?

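The POST should mimic the form submission. There is no need to fetch the login page first. Looking at http://projecteuler.net, the form appears to be posted to index.php, so I would try changing the POST URL: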

    HttpPost httpost = new HttpPost("http://projecteuler.net/index.php");
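
To make "mimic the form submission" a bit more concrete, here is a minimal sketch of the kind of `application/x-www-form-urlencoded` body a browser sends for a form, built with the JDK only. The class name `FormBody` is hypothetical, and the field names `IDToken1` and `IDToken2` are simply copied from the question; the real names have to be read from the login form's HTML. `UrlEncodedFormEntity` in the question's code builds a body of this shape from the list of name/value pairs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormBody {

    /** Joins the fields as name=value pairs separated by '&', URL-encoding names and values. */
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(urlEncode(field.getKey()))
                .append('=')
                .append(urlEncode(field.getValue()));
        }
        return body.toString();
    }

    private static String urlEncode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("IDToken1", "username"); // field names taken from the question,
        fields.put("IDToken2", "password"); // not verified against the real form
        System.out.println(encode(fields));
    }
}
```

If the printed body does not contain the same fields the browser sends for the real form, the server will most likely just show the login page again.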

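Use something like Firebug to see what actually happens in the browser. You may need to follow the redirect after logging in (HttpClient supports this). There also appears to be a parameter named "login" with the value "Login" being posted.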
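
Both suggestions can be sketched roughly as follows. The extra parameter could be added to the question's code with `nvps.add(new BasicNameValuePair("login", "Login"));`. For the redirect, HttpClient 4.x may not automatically follow a redirect that is returned in response to a POST, so one option is to read the `Location` header yourself and request that URL with the same client, which keeps the session cookies. The sketch below uses only the JDK and covers just the decision and URL-resolution step; the class name `LoginRedirect`, the status `302`, and the target `/home` are assumptions made for illustration, not values observed from the site.

```java
import java.net.URI;

public class LoginRedirect {

    /** True for the status codes commonly used to redirect after a form POST. */
    static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302
            || statusCode == 303 || statusCode == 307;
    }

    /** Resolves a Location header value (absolute or relative) against the requested URL. */
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        // With Apache HttpClient 4.x these two values would come from
        // response.getStatusLine().getStatusCode() and
        // response.getFirstHeader("Location") (which may be null).
        int status = 302;          // assumed for illustration
        String location = "/home"; // hypothetical redirect target

        if (isRedirect(status)) {
            String next = resolveLocation("http://projecteuler.net/index.php", location);
            // Request this URL with an HttpGet on the SAME client, after consuming
            // the POST response entity, so that the login cookies are sent along.
            System.out.println("Follow redirect to: " + next);
        }
    }
}
```

The body of that follow-up GET, rather than the body of the POST response, is where the logged-in page content would be expected.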