How to count words in a text file, Java 8 style

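I'm working on an assignment that first counts the number of files in a directory and then prints a word count for each file. The file count works fine, but I'm having trouble converting some code my instructor gave me from a class that does a frequency count into a simpler word count. I also can't seem to find the right code to look at each file and count its words (I'm trying to find something "generic" rather than specific, but I've been testing the program with one specific text file). This is the expected output: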

    Count 11 files:
    word length: 1 ==> 80
    word length: 2 ==> 321
    word length: 3 ==> 643

However, this is what is actually printed:

    primes.txt
    but
    are
    sometimes
    sense
    refrigerator
    make
    haiku
    dont
    they
    funny
    word length: 1 ==> {but=1, are=1, sometimes=1, sense=1, refrigerator=1, make=1, haiku=1, dont=1, they=1, funny=1}
    .....
    Count 11 files:

I'm using two classes: WordCount and FileCatch8.

WordCount:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.AbstractMap.SimpleEntry;
    import java.util.Arrays;
    import java.util.Map;
    import static java.util.stream.Collectors.counting;
    import static java.util.stream.Collectors.groupingBy;

    /**
     *
     * @author
     */
    public class WordCount {

        /**
         *
         * @param filename
         * @return
         * @throws java.io.IOException
         */
        public Map<String, Long> count(String filename) throws IOException {
            //Stream<String> lines = Files.lines(Paths.get(filename));
            Path path = Paths.get("haiku.txt");
            Map<String, Long> wordMap = Files.lines(path)
                    .parallel()
                    .flatMap(line -> Arrays.stream(line.trim().split(" ")))
                    .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
                    .filter(word -> word.length() > 0)
                    .map(word -> new SimpleEntry<>(word, 1))
                    //.collect(Collectors.toMap(s -> s, s -> 1, Integer::sum));
                    .collect(groupingBy(SimpleEntry::getKey, counting()));
            wordMap.forEach((k, v) -> System.out.println(String.format(k,v)));
            return wordMap;
        }
    }

And FileCatch8:

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;

    /*
     * To change this license header, choose License Headers in Project Properties.
     * To change this template file, choose Tools | Templates
     * and open the template in the editor.
     */

    /**
     *
     * @author
     */
    public class FileCatch8 {
        public static void main(String args[]) {
            List<String> fileNames = new ArrayList<>();
            try {
                DirectoryStream<Path> directoryStream = Files.newDirectoryStream(Paths.get("files"));
                int fileCounter = 0;
                WordCount wordCnt = new WordCount();
                for (Path path : directoryStream) {
                    System.out.println(path.getFileName());
                    fileCounter++;
                    fileNames.add(path.getFileName().toString());
                    System.out.println("word length: " + fileCounter + " ==> "
                            + wordCnt.count(path.getFileName().toString()));
                }
            } catch (IOException ex) {
            }
            System.out.println("Count: " + fileNames.size() + " files");
        }
    }

The program uses Java 8 streams with lambda syntax.

Word count example:

    Files.lines(Paths.get(file))
         .flatMap(line -> Arrays.stream(line.trim().split(" ")))
         .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
         .filter(word -> !word.isEmpty())
         .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
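For reference, here is a minimal, self-contained sketch of the same pipeline. The class name `WordFrequency` and the method `frequencies` are made up for illustration. Taking a `Stream<String>` of lines instead of a file name is an assumption made here so the snippet can be tried without any files on disk, and the `TreeMap` is only there to keep the printed order stable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordFrequency {

    // Hypothetical helper: maps each word to the number of times it occurs in the lines.
    static Map<String, Long> frequencies(Stream<String> lines) {
        return lines
                // split every line into space-separated tokens
                .flatMap(line -> Arrays.stream(line.trim().split(" ")))
                // drop everything that is not a letter and normalize case
                .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase())
                // tokens that were only punctuation or digits are now empty
                .filter(word -> !word.isEmpty())
                // group equal words and count each group, sorted by word
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("The cat", "the DOG, the end");
        frequencies(sample.stream())
                .forEach((word, n) -> System.out.println(word + "=" + n));
    }
}
```

To run it against a real file, something like `frequencies(Files.lines(Paths.get(file)))` should work, ideally inside a try-with-resources block so the file handle is closed.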

File count:

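    // Both calls are equivalent: without a depth argument, walk already descends to unlimited depth.
    // Note that the result includes directories (and the starting path itself), not only regular files.
    Files.walk(Paths.get(file), Integer.MAX_VALUE).count();
    Files.walk(Paths.get(file)).count();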
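Because `Files.walk` also returns directories, the plain `count()` will usually be higher than the number of files. If only regular files should be counted, a filter can be added. The following is a sketch under that assumption; the class name `RegularFileCount` and the method `countRegularFiles` are hypothetical, and wrapping the `IOException` in an `UncheckedIOException` is just one possible way to handle it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class RegularFileCount {

    // Hypothetical helper: counts regular files below dir, recursively, skipping directories.
    static long countRegularFiles(Path dir) {
        // try-with-resources closes the directory handles held by the stream
        try (Stream<Path> entries = Files.walk(dir)) {
            return entries.filter(Files::isRegularFile).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Uses the first argument as the directory, or the current directory if none is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("Count: " + countRegularFiles(dir) + " files");
    }
}
```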

In my opinion, the simplest way to count the words in a file with Java 8 is:

    Long wordsCount = Files.lines(Paths.get(file))
            .flatMap(str -> Stream.of(str.split("[ ,.!?\r\n]")))
            .filter(s -> s.length() > 0)
            .count();
    System.out.println(wordsCount);

And to count all the files:

    Long filesCount = Files.walk(Paths.get(file)).count();
    System.out.println(filesCount);
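Putting the two pieces together, a rough sketch of the whole task might look like the program below. It is not the instructor's expected solution. The class name `DirectoryWordCount`, the default directory `files` (taken from the question), the separator pattern, and the output format are all assumptions. Two details differ from the question's code: each file is opened through its full `Path` rather than `getFileName()`, and the path passed in is the one that is actually read. `Files.lines` decodes the files as UTF-8, and `Files.list` does not descend into subdirectories.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DirectoryWordCount {

    // Counts the non-empty tokens in the given lines (whitespace and , . ! ? act as separators).
    static long countWords(Stream<String> lines) {
        return lines
                .flatMap(line -> Arrays.stream(line.split("[\\s,.!?]+")))
                .filter(token -> !token.isEmpty())
                .count();
    }

    // Counts the words in a single file, read as UTF-8.
    static long countWords(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            return countWords(lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "files");

        // Collect the regular files directly inside dir (not recursive).
        List<Path> files;
        try (Stream<Path> entries = Files.list(dir)) {
            files = entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }

        System.out.println("Count " + files.size() + " files:");
        for (Path file : files) {
            // Pass the full path so the file is found regardless of the working directory.
            System.out.println(file.getFileName() + " ==> " + countWords(file));
        }
    }
}
```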
