Simplest (legal) way to programmatically get the Google search result count?

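I want to use Java code to get the estimated result count for certain Google search queries (over the whole web).

I only need to run a few queries per day, so at first the Google Web Search API, although deprecated, looked good enough (see, e.g., "How can you search Google programmatically (Java API)"). But as it turned out, the numbers returned by this API are very different from the ones returned by www.google.com (see http://code.google.com/p/google-ajax-apis/issues/detail?id=32). So these numbers are useless to me.

I also tried Google Custom Search Engine, which exhibits the same problem.

What do you think is the simplest solution for my task?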

    /**
     * @author RAJESH Kharche
     */
    // Open NetBeans
    // Choose Java -> Project
    // Name it GoogleSearchAPP
    package googlesearchapp;

    import java.io.IOException;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;
    import java.util.Scanner;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    public class GoogleSearchAPP {

        public static void main(String[] args) {
            try {
                Scanner s1 = new Scanner(System.in);
                System.out.println("Enter Query to search: "); // get the query to search
                // nextLine() keeps multi-word queries intact; next() would stop at the first space
                String str = s1.nextLine();
                long result = getResultsCount(str);
                System.out.println("Results:" + result);
            } catch (IOException ex) {
                Logger.getLogger(GoogleSearchAPP.class.getName()).log(Level.SEVERE, null, ex);
            }
        }

        private static long getResultsCount(final String query) throws IOException {
            final URL url = new URL("https://www.google.com/search?q=" + URLEncoder.encode(query, "UTF-8"));
            final URLConnection connection = url.openConnection();
            connection.setConnectTimeout(60000);
            connection.setReadTimeout(60000);
            connection.addRequestProperty("User-Agent", "Google Chrome/36"); // put the browser name/version
            // Scan the HTTP response line by line; try-with-resources closes the reader
            try (Scanner reader = new Scanner(connection.getInputStream(), "UTF-8")) {
                while (reader.hasNextLine()) {
                    final String line = reader.nextLine();
                    // Look for the "resultStats" field, because the number we want follows it
                    if (!line.contains("\"resultStats\">")) {
                        continue;
                    }
                    // Take the text up to the next tag, keep only the digits, and parse as long
                    // (counts in the billions do not fit in an int)
                    return Long.parseLong(line.split("\"resultStats\">")[1].split("<")[0].replaceAll("[^\\d]", ""));
                }
            }
            return 0;
        }
    }

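What you can do is programmatically perform an actual Google search. The simplest way is to request the URL https://www.google.com/search?q=QUERY_HERE and then scrape the result count from the returned page.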

Here is a quick example of how to do this:

    // Needs java.io.IOException, java.net.URL, java.net.URLConnection,
    // java.net.URLEncoder and java.util.Scanner
    private static long getResultsCount(final String query) throws IOException {
        final URL url = new URL("https://www.google.com/search?q=" + URLEncoder.encode(query, "UTF-8"));
        final URLConnection connection = url.openConnection();
        connection.setConnectTimeout(60000);
        connection.setReadTimeout(60000);
        connection.addRequestProperty("User-Agent", "Mozilla/5.0");
        // The marker literal was lost from this snippet; "resultStats" is an
        // assumption based on the other answer's code
        final String marker = "\"resultStats\">";
        // try-with-resources closes the reader on every exit path
        try (Scanner reader = new Scanner(connection.getInputStream(), "UTF-8")) {
            while (reader.hasNextLine()) {
                final String line = reader.nextLine();
                if (!line.contains(marker)) {
                    continue;
                }
                // long, because counts in the billions do not fit in an int
                return Long.parseLong(line.split(marker)[1].split("<")[0].replaceAll("[^\\d]", ""));
            }
        }
        return 0;
    }

To use it, you can do the following:

    final long count = getResultsCount("horses");
    System.out.println("Estimated number of results for horses: " + count);
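
The parsing step can be separated from the network call so that it can be checked offline. Below is a minimal sketch of that idea, not a definitive implementation: the class name `ResultStatsParser` and method `parseCount` are hypothetical, and the `"resultStats">` marker is only assumed from the code in this thread (Google's actual markup may differ or change). It returns an `OptionalLong` so that a missing marker is distinguishable from a count of zero, and it uses `long` because counts in the billions exceed `Integer.MAX_VALUE`.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ResultStatsParser {

    // Assumed marker, taken from the snippets in this thread; the real page markup may differ.
    private static final Pattern STATS = Pattern.compile("\"resultStats\">([^<]*)<");

    /** Returns the digits found between the marker and the next tag, or empty if there are none. */
    static OptionalLong parseCount(String html) {
        Matcher m = STATS.matcher(html);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        // Drop everything except digits, e.g. "About 2,480,000,000 results" -> "2480000000"
        String digits = m.group(1).replaceAll("[^\\d]", "");
        if (digits.isEmpty()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(digits));
    }

    public static void main(String[] args) {
        // Hand-written sample line, not real Google output
        String sample = "<div id=\"resultStats\">About 2,480,000,000 results<nobr> (0.52 seconds)</nobr></div>";
        System.out.println(parseCount(sample));
    }
}
```

With this split, `getResultsCount` would only need to read the response into a string and hand it to `parseCount`, and the caller can decide what to do when no count is found.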