Java: Read in text files from a directory on the internet

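Does anyone know how to recursively read in files from a specific directory on the internet, in Java? I want to read in all the text files from this website directory: http://www.cs.ucdavis.edu/~davidson/courses/170-S11/Female/

I know how to read in multiple files from a folder on my computer, and how to read in a single file from the internet. But how can I read in multiple files from the internet without hard-coding the URLs?

Things I've tried: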

    import java.io.File;

    // List the files on my Desktop
    final File folder = new File("/Users/crystal/Desktop");
    // listFiles() returns null if the path is not a readable directory
    File[] listOfFiles = folder.listFiles();
    if (listOfFiles != null) {
        for (File fileEntry : listOfFiles) {
            if (!fileEntry.isDirectory()) {
                System.out.println(fileEntry.getName());
            }
        }
    }

Another thing I tried:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;

    // Reading data from the web
    try {
        // Create a URL object
        URL url = new URL("http://www.cs.ucdavis.edu/~davidson/courses/170-S11/Female/5_1_1.txt");
        // Read all of the text returned by the HTTP server;
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
            String htmlText; // String that holds current file line
            // Read through file one line at a time. Print line
            while ((htmlText = in.readLine()) != null) {
                System.out.println(htmlText);
            }
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        // If another exception is generated, print a stack trace
        e.printStackTrace();
    }

Thanks!

Since the URL you mentioned has directory indexing enabled, you are in luck. You have a few options here.

Parse the HTML with SAX2 or any other XML parser to find the href attributes of the a tags. I think htmlunit would work too.
Use a little regexp magic to match the links in the index page's HTML.
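
As a rough illustration of the regexp option, here is a minimal sketch. It assumes the server returns a plain HTML index whose links are written as double-quoted href attributes, that the files of interest end in .txt, and that they are UTF-8 text. The class and method names (DirectoryTextReader, extractTextLinks, fetch) are made up for this example, and it only looks at one directory level rather than recursing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; none of these names come from the question.
class DirectoryTextReader {

    // Captures the value of a double-quoted href inside an <a ...> tag.
    // Assumption: the index page quotes its hrefs with double quotes.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s+[^>]*?href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns the absolute URI of every link in the index HTML that ends in ".txt".
    static List<URI> extractTextLinks(String indexHtml, URI base) {
        List<URI> links = new ArrayList<>();
        Matcher m = HREF.matcher(indexHtml);
        while (m.find()) {
            String href = m.group(1);
            if (!href.toLowerCase().endsWith(".txt")) {
                continue; // skip sort links, parent directory, subdirectories
            }
            try {
                links.add(base.resolve(href)); // relative href -> absolute URI
            } catch (IllegalArgumentException e) {
                // href is not a valid URI (e.g. unescaped spaces); skip it
            }
        }
        return links;
    }

    // Downloads the resource at the given URI and returns its text.
    static String fetch(URI uri) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(uri.toURL().openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DirectoryTextReader <directory-url>");
            return;
        }
        URI directory = URI.create(args[0]);
        // Fetch the index page, then read each linked text file in turn.
        for (URI file : extractTextLinks(fetch(directory), directory)) {
            System.out.println("== " + file + " ==");
            System.out.print(fetch(file));
        }
    }
}
```

Passing the directory URL from the question as the only argument should print each linked .txt file in turn. The directory URL needs its trailing slash so that relative links resolve inside the directory. A regexp is fragile against unusual markup, so a real HTML parser is the safer choice if the index format is not under your control.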
