Find all occurrences of the string "the" in a .txt file

Here is my code:

    // Import io so we can use file objects
    import java.io.*;

    public class SearchThe {
        public static void main(String args[]) {
            try {
                String stringSearch = "the";
                // Open the file c:\test.txt as a buffered reader
                BufferedReader bf = new BufferedReader(new FileReader("test.txt"));

                // Start a line count and declare a string to hold our current line.
                int linecount = 0;
                String line;

                // Let the user know what we are searching for
                System.out.println("Searching for " + stringSearch + " in file...");

                // Loop through each line, stashing the line into our line variable.
                while (( line = bf.readLine()) != null) {
                    // Increment the count and find the index of the word
                    linecount++;
                    int indexfound = line.indexOf(stringSearch);

                    // If greater than -1, means we found the word
                    if (indexfound > -1) {
                        System.out.println("Word was found at position " + indexfound + " on line " + linecount);
                    }
                }

                // Close the file after done searching
                bf.close();
            }
            catch (IOException e) {
                System.out.println("IO Error Occurred: " + e.toString());
            }
        }
    }

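I want to find every occurrence of the word "the" in the file test.txt. The problem is that once my program finds the first "the" on a line, it stops looking for more.

It also treats words such as "then" as if they were the word "the".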

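Use a case-insensitive regular expression with word boundaries to find all instances and variations of "the".

indexOf("the") cannot tell "the" apart from "then", because both start with "the". Likewise, "the" appears in the middle of "anathema".

To avoid this, use a regular expression and search for "the" with a word boundary (\b) on either side. Prefer word boundaries over splitting on " " or calling indexOf(" the ") (with a space on each side), because neither of those finds "the." or other instances next to punctuation. You can also make the search case-insensitive so that it finds "The" as well.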

    // Needed at the top of the file:
    // import java.util.regex.Matcher;
    // import java.util.regex.Pattern;

    Pattern p = Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

    while ((line = bf.readLine()) != null) {
        linecount++;
        Matcher m = p.matcher(line);

        // indicate all matches on the line
        while (m.find()) {
            System.out.println("Word was found at position " + m.start() + " on line " + linecount);
        }
    }
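As a rough, self-contained illustration of the word-boundary approach, the sketch below runs the same pattern against a single string instead of a file. The class name WordBoundaryDemo, the helper findPositions, and the sample sentence are all made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Whole-word, case-insensitive pattern for "the".
    private static final Pattern THE =
            Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

    // Returns the start index of every whole-word match in the given line.
    static List<Integer> findPositions(String line) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = THE.matcher(line);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical sample line that includes the tricky cases:
        // a capitalised "The", "then", "anathema", and "the" before a period.
        String sample = "The cat, then the anathema; the.";
        System.out.println(findPositions(sample)); // prints [0, 14, 28]
    }
}
```

"then" and "anathema" are skipped because the letters around "the" leave no word boundary there, while "The" and "the." are both reported.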

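You shouldn't rely on indexOf alone here, because it matches the search string wherever it occurs as a substring. Since "then" contains the string "the", it counts as a match too.

More about indexOf

indexOf

public int indexOf(String str, int fromIndex): Returns the index within this string of the first occurrence of the specified substring, starting at the specified index. The integer returned is the smallest value k for which: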
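Separately from the quoted definition, the fromIndex parameter is what lets a loop continue past the first match on a line. The following is a minimal sketch of that idea; the class name IndexOfAllDemo, the helper findAll, and the sample string are invented for illustration. It still reports plain substring matches, so "then" and "other" are counted.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfAllDemo {
    // Collects the index of every occurrence of target in line.
    // These are substring matches, not whole-word matches.
    static List<Integer> findAll(String line, String target) {
        List<Integer> hits = new ArrayList<>();
        if (target.isEmpty()) {
            return hits; // an empty target would match at every index
        }
        int from = 0;
        int idx;
        // Restart the search just past the previous hit until indexOf returns -1.
        while ((idx = line.indexOf(target, from)) != -1) {
            hits.add(idx);
            from = idx + 1;
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample line.
        System.out.println(findAll("the other then", "the")); // prints [0, 5, 10]
    }
}
```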

You should split each line into words, loop over the words, and compare each one with "the".

    String[] words = line.split(" ");

    for (String word : words) {
        if (word.equals("the")) {
            System.out.println("Found the word");
        }
    }

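The snippet above visits every "the" in the line, whereas indexOf always returns only the first match.

Your current implementation only finds the first instance of 'the' on each line.

Consider splitting each line into words, iterating over the list of words, and comparing each word with 'the':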

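    while ((line = bf.readLine()) != null) {
        linecount++;
        int position = 0; // index in the line where the current word starts
        String[] words = line.split(" ");

        for (String word : words) {
            if (word.equals(stringSearch)) {
                System.out.println("Word was found at position " + position + " on line " + linecount);
            }
            // Move past this word and the single space that followed it
            position += word.length() + 1;
        }
    }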

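It doesn't sound like the point of the exercise is to make you proficient in regular expressions (I don't know, it may be, but it seems a bit basic for that), even though a regular expression is indeed the real-world solution for this kind of thing.

My advice is to focus on the basics and use indexOf and substring to test the strings. Think about how you would account for the fact that string comparison is naturally case-sensitive. Also, is your reader always closed (i.e. is there any way bf.close() would not be executed)?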
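One possible reading of that advice, sketched without regular expressions: compare case-insensitively with String.regionMatches, check the neighbouring characters by hand, and let try-with-resources close the reader even if readLine throws. The class BasicSearchDemo and its helpers are hypothetical names, and a StringReader stands in for the file so the sketch runs on its own. Treating "letter or digit" as the definition of a word character is an assumption and is not identical to the regex \b rule.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class BasicSearchDemo {
    // True if index i is inside s and holds a letter or digit.
    private static boolean isWordChar(String s, int i) {
        return i >= 0 && i < s.length() && Character.isLetterOrDigit(s.charAt(i));
    }

    // Start indexes of word in line, ignoring case, whole words only.
    static List<Integer> findWholeWord(String line, String word) {
        List<Integer> hits = new ArrayList<>();
        if (word.isEmpty()) {
            return hits;
        }
        for (int i = 0; i + word.length() <= line.length(); i++) {
            boolean matches = line.regionMatches(true, i, word, 0, word.length());
            // Reject hits that have a word character directly before or after them.
            if (matches && !isWordChar(line, i - 1) && !isWordChar(line, i + word.length())) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Reads the source line by line; try-with-resources closes the reader
    // whether the loop finishes normally or an IOException is thrown.
    static List<String> search(Reader source, String word) throws IOException {
        List<String> report = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineCount = 0;
            while ((line = reader.readLine()) != null) {
                lineCount++;
                for (int pos : findWholeWord(line, word)) {
                    report.add("line " + lineCount + ", position " + pos);
                }
            }
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for new FileReader("test.txt") in this sketch.
        Reader sample = new StringReader("The cat\nthen the end.\n");
        for (String hit : search(sample, "the")) {
            System.out.println(hit);
        }
    }
}
```

With the sample text, this reports position 0 on line 1 and position 5 on line 2, and skips "then".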

You would be better off using a regular expression for this kind of search. As a quick and dirty workaround, you can change stringSearch from

 String stringSearch = "the";

to

 String stringSearch = " the ";
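To show where this workaround holds and where it falls short, here is a small sketch with made-up sample lines; SpacePaddedDemo and foundWithSpaces are hypothetical names. Note that when there is a match, the index returned by indexOf points at the leading space, one character before the word itself.

```java
class SpacePaddedDemo {
    // True if the line contains "the" with a space on both sides.
    static boolean foundWithSpaces(String line) {
        return line.indexOf(" the ") > -1;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines.
        String[] samples = { "in the box", "the cat", "see the.", "by then we left" };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + foundWithSpaces(s));
        }
    }
}
```

Only "in the box" matches. "then" is correctly rejected, but "the" at the start of a line or directly before punctuation is missed.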
