HttpClient: log in, search, and get XML content
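I want to use HttpClient to log in to a website. After logging in, I want to search for something and retrieve the content of the search results.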
    import java.io.BufferedReader;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.cookie.Cookie;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.protocol.HTTP;

    /**
     * An example that demonstrates how HttpClient APIs can be used to perform
     * form-based logon.
     */
    public class TestHttpClient {

        public static void main(String[] args) throws Exception {
            DefaultHttpClient httpclient = new DefaultHttpClient();

            // Fetch the start page first so the server can set its initial cookies.
            HttpGet httpget = new HttpGet("http://projecteuler.net/");
            HttpResponse response = httpclient.execute(httpget);
            HttpEntity entity = response.getEntity();
            System.out.println("Login form get: " + response.getStatusLine());
            if (entity != null) {
                // Release the connection before issuing the next request.
                entity.consumeContent();
            }

            System.out.println("Initial set of cookies:");
            printCookies(httpclient.getCookieStore().getCookies());

            // Submit the login form.
            HttpPost httpost = new HttpPost("http://projecteuler.net/index.php?section=login");
            List<NameValuePair> nvps = new ArrayList<NameValuePair>();
            nvps.add(new BasicNameValuePair("IDToken1", "username"));
            nvps.add(new BasicNameValuePair("IDToken2", "password"));
            httpost.setEntity(new UrlEncodedFormEntity(nvps, HTTP.UTF_8));

            response = httpclient.execute(httpost);
            System.out.println("Response " + response.toString());
            entity = response.getEntity();
            System.out.println("Login form post: " + response.getStatusLine());
            if (entity != null) {
                // Print the response body line by line.
                InputStream is = entity.getContent();
                BufferedReader br = new BufferedReader(new InputStreamReader(is));
                String line;
                while ((line = br.readLine()) != null) {
                    System.out.println(line);
                }
                br.close();
            }

            System.out.println("Post logon cookies:");
            printCookies(httpclient.getCookieStore().getCookies());

            httpclient.getConnectionManager().shutdown();
        }

        // Print each cookie on its own line, or "None" if the list is empty.
        private static void printCookies(List<Cookie> cookies) {
            if (cookies.isEmpty()) {
                System.out.println("None");
            } else {
                for (Cookie cookie : cookies) {
                    System.out.println("- " + cookie.toString());
                }
            }
        }
    }
When I print the output from the HttpEntity, it prints the content of the login page. How do I get the page content after logging in with HttpClient?
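The POST should mimic the form submission. There is no need to fetch the login page first. Looking at http://projecteuler.net, the form appears to be posted to index.php, so I would try changing the POST URL: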
HttpPost httpost = new HttpPost("http://projecteuler.net/index.php");
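Use something like Firebug to see what happens in the browser. You may also need to follow the redirect after login (HttpClient supports this). There also appears to be a parameter named "login" with the value "Login" being posted.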
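To make the suggestions above concrete, here is a minimal sketch of the same idea: post to index.php, include the "login"="Login" pair, keep cookies, and follow the redirect. Note that it deliberately uses the JDK's built-in java.net.http.HttpClient (Java 11+) rather than Apache HttpClient 4 from the question, so that it runs without extra libraries. The class and method names (FormLoginSketch, encodeForm, loginFields, loginRequest, newSessionClient) are made up for illustration, the field names IDToken1 and IDToken2 are simply copied from the question's code, and the credentials are placeholders. Compare the printed body against what Firebug shows before relying on it.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class FormLoginSketch {

    // Encode name/value pairs as application/x-www-form-urlencoded,
    // preserving the iteration order of the map.
    static String encodeForm(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Field names are assumptions taken from the thread; verify them
    // against the real form in the browser.
    static Map<String, String> loginFields(String user, String password) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("IDToken1", user);
        fields.put("IDToken2", password);
        fields.put("login", "Login"); // the extra parameter mentioned in the answer
        return fields;
    }

    // Build (but do not send) the login POST against index.php.
    static HttpRequest loginRequest(String user, String password) {
        String body = encodeForm(loginFields(user, password));
        return HttpRequest.newBuilder(URI.create("http://projecteuler.net/index.php"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // A client that stores cookies between requests and follows redirects.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = loginRequest("username", "password");
        // Show what would be sent, without touching the network.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(encodeForm(loginFields("username", "password")));
    }
}
```

To actually send the request, something like newSessionClient().send(request, HttpResponse.BodyHandlers.ofString()) should work, reusing the same client for the later search request so the session cookie is sent along. With Apache HttpClient 4 the equivalent change is adding one more BasicNameValuePair("login", "Login") to the nvps list and posting to the index.php URL shown above.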