Find duplicate words in a string and count how many times each is repeated

I need to find the repeated words in a string and then count how many times each one is repeated. So basically, if the input string is this:

String s = "House, House, House, Dog, Dog, Dog, Dog"; 

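I need to create a new string list without the repetitions, and store the number of repetitions of each word somewhere else, like this:

New string: "House, Dog"

New int array: [3, 4]

Is there an easy way to do this in Java? I have managed to separate the string using s.split(), but how do I count the repetitions and eliminate them in the new string? Thanks!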

You have already done the hard work. Now you can use a Map to count the occurrences:

    Map<String, Integer> occurrences = new HashMap<String, Integer>();

    for (String word : splitWords) {
        Integer oldCount = occurrences.get(word);
        if (oldCount == null) {
            oldCount = 0;
        }
        occurrences.put(word, oldCount + 1);
    }

Calling occurrences.get(word) will tell you how many times a word occurred. You can construct a new list by iterating through occurrences.keySet():

    for (String word : occurrences.keySet()) {
        // do something with word
    }

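Note that the order of what you get out of keySet is arbitrary. If you need the words to be sorted by when they first appear in your input String, you should use a LinkedHashMap instead.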
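A minimal sketch of the LinkedHashMap variant, assuming the words are separated by a comma and optional whitespace. The class name OrderedWordCount and the helper count are made up for illustration, and Map.merge needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderedWordCount {
    // Counts each word; LinkedHashMap keeps keys in order of first appearance.
    static Map<String, Integer> count(String input) {
        Map<String, Integer> occurrences = new LinkedHashMap<>();
        for (String word : input.split(",\\s*")) {
            // merge() inserts 1 for a new word, otherwise adds 1 to the old count
            occurrences.merge(word, 1, Integer::sum);
        }
        return occurrences;
    }

    public static void main(String[] args) {
        Map<String, Integer> occurrences = count("House, House, House, Dog, Dog, Dog, Dog");
        List<String> words = new ArrayList<>(occurrences.keySet());
        List<Integer> counts = new ArrayList<>(occurrences.values());
        System.out.println(words);  // [House, Dog]
        System.out.println(counts); // [3, 4]
    }
}
```

Because both lists are read from the same LinkedHashMap, the count at index i belongs to the word at index i.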

As others have mentioned, use String::split(), followed by some map (HashMap or LinkedHashMap), and then merge your result. For completeness' sake, here is the code.

    import java.util.*;

    public class Genric {
        public static void main(String[] args) {
            // LinkedHashMap keeps the words in order of first appearance
            Map<String, Integer> unique = new LinkedHashMap<String, Integer>();
            for (String string : "House, House, House, Dog, Dog, Dog, Dog".split(", ")) {
                if (unique.get(string) == null)
                    unique.put(string, 1);
                else
                    unique.put(string, unique.get(string) + 1);
            }
            String uniqueString = join(unique.keySet(), ", ");
            List<Integer> value = new ArrayList<Integer>(unique.values());

            System.out.println("Output = " + uniqueString);
            System.out.println("Values = " + value);
        }

        // Joins the elements of a collection with the given delimiter
        public static String join(Collection<String> s, String delimiter) {
            StringBuffer buffer = new StringBuffer();
            Iterator<String> iter = s.iterator();
            while (iter.hasNext()) {
                buffer.append(iter.next());
                if (iter.hasNext()) {
                    buffer.append(delimiter);
                }
            }
            return buffer.toString();
        }
    }

New string: Output = House, Dog

Int array (or rather, list): Values = [3, 4] (you can use List::toArray to get an array).
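To illustrate that last remark, here is a small sketch of two ways to turn the list of counts into an array. List.toArray gives a boxed Integer[]; for a primitive int[] a stream with mapToInt is one option (Java 8 or later). The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class CountsToArray {
    // Unboxes a List<Integer> into a primitive int[].
    static int[] toIntArray(List<Integer> counts) {
        return counts.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 4);

        Integer[] boxed = values.toArray(new Integer[0]); // List::toArray, boxed result
        int[] primitive = toIntArray(values);             // primitive result

        System.out.println(Arrays.toString(boxed));     // [3, 4]
        System.out.println(Arrays.toString(primitive)); // [3, 4]
    }
}
```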

Try this:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class DuplicateWordSearcher {
        public static void main(String[] args) {
            String text = "arbkcd se fgadfssfds ft gh f ws wfvxsghdhjjkf sd je wed adf";

            List<String> list = Arrays.asList(text.split(" "));
            // The set holds each distinct word once
            Set<String> uniqueWords = new HashSet<String>(list);
            for (String word : uniqueWords) {
                System.out.println(word + ": " + Collections.frequency(list, word));
            }
        }
    }

    import java.util.HashMap;
    import java.util.Set;

    public class StringsCount {
        public static void main(String args[]) {
            String value = "This is testing Program testing Program";
            String item[] = value.split(" ");

            HashMap<String, Integer> map = new HashMap<>();
            for (String t : item) {
                if (map.containsKey(t)) {
                    map.put(t, map.get(t) + 1);
                } else {
                    map.put(t, 1);
                }
            }

            // Print each word followed by its count
            Set<String> keys = map.keySet();
            for (String key : keys) {
                System.out.println(key);
                System.out.println(map.get(key));
            }
        }
    }

This may help you in some way.

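    String st = "I am am not the one who is thinking I one thing at time";
    String[] ar = st.split("\\s");
    Map<String, Integer> mp = new HashMap<String, Integer>();
    int count = 0;
    for (int i = 0; i < ar.length; i++) {
        // the rest of the loop is missing from the original post
    }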

If this is homework, then all I can say is: use String.split() and a HashMap.

(I see you have already found split(). You are on the right track, then.)

You can use a prefix tree (trie) data structure to store the words and keep track of each word's count in the prefix tree nodes.

    #include <cstddef>
    #include <stdexcept>
    #include <string>
    using namespace std;

    #define ALPHABET_SIZE 26

    // Structure of each node of prefix tree
    struct prefix_tree_node {
        prefix_tree_node() : count(0) {
            for (int i = 0; i < ALPHABET_SIZE; i++) child[i] = NULL;
        }
        int count;
        prefix_tree_node *child[ALPHABET_SIZE];
    };

    // Root of the prefix tree (not declared in the original snippet)
    prefix_tree_node *root = new prefix_tree_node();

    void insert_string_in_prefix_tree(string word) {
        prefix_tree_node *current = root;
        for (unsigned int i = 0; i < word.length(); i++) {
            // Map 'a'..'z' to 0..25; anything else becomes an out-of-range value
            unsigned int letter = static_cast<unsigned int>(word[i] - 'a');

            // Invalid alphabetic character
            // Note: change this condition depending on the scenario
            if (letter >= ALPHABET_SIZE)
                throw runtime_error("Invalid alphabetic character");

            if (current->child[letter] == NULL)
                current->child[letter] = new prefix_tree_node();
            current = current->child[letter];
        }
        current->count++;
        // Insert this string into Max Heap and sort them by counts
    }

    // Data structure for storing in Heap will be something like this
    struct MaxHeapNode {
        int count;
        string word;
    };

After inserting all the words, you have to print the words and their counts by iterating over the max heap.
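The max-heap step can be sketched in Java, the language used in the rest of this thread, with java.util.PriorityQueue and a comparator that puts the largest count first. This is only an illustration under stated assumptions: the counts come from a plain Map rather than from the trie above, and the class and method names are invented. Note that a PriorityQueue iterator does not visit elements in priority order, so the sketch drains the heap with poll().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class MaxHeapWordCounts {
    // Returns "word=count" strings ordered from the highest count to the lowest.
    static List<String> byCountDescending(Map<String, Integer> counts) {
        // Reversed comparison turns Java's min-heap into a max-heap on the count
        PriorityQueue<Map.Entry<String, Integer>> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        heap.addAll(counts.entrySet());

        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            Map.Entry<String, Integer> top = heap.poll(); // largest remaining count
            result.add(top.getKey() + "=" + top.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("house", 3);
        counts.put("dog", 4);
        counts.put("cat", 1);
        System.out.println(byCountDescending(counts)); // [dog=4, house=3, cat=1]
    }
}
```

Words with equal counts come out in no particular order relative to each other.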

    // Program to find the number of repeating characters in a string
    // Developed by Subash
    import java.util.Scanner;

    public class NoOfRepeatedChar {
        public static void main(String[] args) {
            // input through keyboard
            Scanner sc = new Scanner(System.in);
            System.out.println("Enter a string :");
            String s1 = sc.nextLine();

            // formatting String to char array
            String s2 = s1.replace(" ", "");
            char[] ch = s2.toCharArray();

            int counter = 0;

            // for-loop to compare each character with the whole character array
            // (the loop headers and the count were garbled in the source; this part is an assumed reconstruction)
            for (int i = 0; i < ch.length; i++) {
                int count = 0;
                for (int j = 0; j < ch.length; j++) {
                    if (ch[i] == ch[j])
                        count++;
                }
                if (count > 1) {
                    boolean flag = false;
                    // for-loop to check whether the character is already referenced or not
                    for (int k = i - 1; k >= 0; k--) {
                        if (ch[i] == ch[k]) // the character is already referenced
                            flag = true;
                    }
                    if (!flag)
                        counter = counter + 1;
                }
            }
            if (counter > 0) // there are repeating characters
                System.out.println("Number of repeating characters in the given string is/are " + counter);
            else
                System.out.println("Sorry there is/are no repeating characters in the given string");
        }
    }

    public static void main(String[] args) {
        String s = "sdf sdfsdfsd sdfsdfsd sdfsdfsd sdf sdf sdf ";
        String st[] = s.split(" ");
        System.out.println(st.length);
        Map<String, Integer> mp = new TreeMap<String, Integer>();
        for (int i = 0; i < st.length; i++) {
            // the rest of the loop is missing from the original post
        }
    }

If you pass it a String argument, it will count the repetitions of each word.

    /**
     * @param str the text to analyze
     * @return map containing each word and, as its value, the number of repetitions
     */
    public Map<String, Integer> findDuplicateString(String str) {
        String[] stringArrays = str.split(" ");
        Map<String, Integer> map = new HashMap<String, Integer>();
        Set<String> words = new HashSet<String>(Arrays.asList(stringArrays));
        int count = 0;
        for (String word : words) {
            // Count how often this distinct word appears in the full array
            for (String temp : stringArrays) {
                if (word.equals(temp)) {
                    ++count;
                }
            }
            map.put(word, count);
            count = 0;
        }
        return map;
    }

Output:

    Word1=2, word2=4, word2=1,. . .

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class CountRepeatedWords {
        public static void main(String[] args) {
            countRepeatedWords("Note that the order of what you get out of keySet is arbitrary. If you need the words to be sorted by when they first appear in your input String, you should use a LinkedHashMap instead.");
        }

        public static void countRepeatedWords(String wordToFind) {
            String[] words = wordToFind.split(" ");
            // LinkedHashMap keeps the words in order of first appearance
            Map<String, Integer> wordMap = new LinkedHashMap<String, Integer>();

            for (String word : words) {
                wordMap.put(word, (wordMap.get(word) == null ? 1 : (wordMap.get(word) + 1)));
            }

            System.out.println(wordMap);
        }
    }

I hope this helps you.

    public void countInPara(String str) {
        // Maps each count to the comma-separated words that occur that many times
        Map<Integer, String> strMap = new HashMap<Integer, String>();
        List<String> paraWords = Arrays.asList(str.split(" "));
        Set<String> strSet = new LinkedHashSet<>(paraWords);
        int count;

        for (String word : strSet) {
            count = Collections.frequency(paraWords, word);
            strMap.put(count, strMap.get(count) == null ? word : strMap.get(count).concat("," + word));
        }

        for (Map.Entry<Integer, String> entry : strMap.entrySet())
            System.out.println(entry.getKey() + " :: " + entry.getValue());
    }

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public class DuplicateWord {
        public static void main(String[] args) {
            String para = "this is what it is this is what it can be";
            List<String> paraList = Arrays.asList(para.split(" "));
            System.out.println(paraList);
            int size = paraList.size();

            // Compare every word with every other word and store the running count
            Map<String, Integer> duplicatCountMap = new HashMap<String, Integer>();
            for (int j = 0; size > j; j++) {
                int count = 0;
                for (int i = 0; size > i; i++) {
                    if (paraList.get(j).equals(paraList.get(i))) {
                        count++;
                        duplicatCountMap.put(paraList.get(j), count);
                    }
                }
            }
            System.out.println(duplicatCountMap);

            // Split the map into a list of counts and a set of words
            List<Integer> myCountList = new ArrayList<>();
            Set<String> myValueSet = new HashSet<>();
            for (Map.Entry<String, Integer> entry : duplicatCountMap.entrySet()) {
                myCountList.add(entry.getValue());
                myValueSet.add(entry.getKey());
            }
            System.out.println(myCountList);
            System.out.println(myValueSet);
        }
    }

Input: this is what it is this is what it can be

Output:

    [this, is, what, it, is, this, is, what, it, can, be]

    {can=1, what=2, be=1, this=2, is=3, it=2}

    [1, 2, 1, 2, 3, 2]

    [can, what, be, this, is, it]

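    import java.util.HashMap;
    import java.util.Scanner;

    public class class1 {
        public static void main(String[] args) {
            Scanner in = new Scanner(System.in);
            String inpStr = in.nextLine();
            int key;
            // The type parameters were stripped in the source; <String, Integer> is an assumption
            HashMap<String, Integer> hm = new HashMap<String, Integer>();
            String[] strArr = inpStr.split(" ");
            for (int i = 0; i < strArr.length; i++) {
                // the rest of the loop is missing from the original post
            }
        }
    }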

Please use the code below. By my analysis it is the simplest. I hope you like it:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Scanner;
    import java.util.Set;

    public class MostRepeatingWord {

        // Returns the word with the highest count, or null if no word occurs more than once
        String mostRepeatedWord(String s) {
            String[] splitted = s.split(" ");
            List<String> listString = Arrays.asList(splitted);
            Set<String> setString = new HashSet<String>(listString);
            int count = 0;
            int maxCount = 1;
            String maxRepeated = null;
            for (String inp : setString) {
                count = Collections.frequency(listString, inp);
                if (count > maxCount) {
                    maxCount = count;
                    maxRepeated = inp;
                }
            }
            return maxRepeated;
        }

        public static void main(String[] args) {
            System.out.println("Enter The Sentence: ");
            Scanner s = new Scanner(System.in);
            String input = s.nextLine();
            MostRepeatingWord mrw = new MostRepeatingWord();
            System.out.println("Most repeated word is: " + mrw.mostRepeatedWord(input));
        }
    }

    package day2;

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;

    public class DuplicateWords {
        public static void main(String[] args) {
            String S1 = "House, House, House, Dog, Dog, Dog, Dog";
            String S2 = S1.toLowerCase();
            // Split on the comma and any following whitespace so no word keeps a trailing comma
            String[] S3 = S2.split(",\\s*");
            List<String> a1 = new ArrayList<String>();
            HashMap<String, Integer> hm = new HashMap<>();

            // Both loops run to S3.length so the last word is counted too
            for (int i = 0; i < S3.length; i++) {
                if (!a1.contains(S3[i])) {
                    a1.add(S3[i]);
                } else {
                    continue;
                }
                int Count = 0;
                for (int j = 0; j < S3.length; j++) {
                    if (S3[j].equals(S3[i])) {
                        Count++;
                    }
                }
                hm.put(S3[i], Count);
            }
            System.out.println("Duplicate Words and their number of occurrences in String S1 : " + hm);
        }
    }

    import java.util.ArrayList;
    import java.util.HashMap;

    public class Counter {

        private static final int COMMA_AND_SPACE_PLACE = 2;

        private String mTextToCount;
        private ArrayList<String> mSeparateWordsList;

        public Counter(String mTextToCount) {
            this.mTextToCount = mTextToCount;
            mSeparateWordsList = cutStringIntoSeparateWords(mTextToCount);
        }

        // Splits the text at each comma, skipping the comma and the space after it
        private ArrayList<String> cutStringIntoSeparateWords(String text) {
            ArrayList<String> returnedArrayList = new ArrayList<>();

            if (text.indexOf(',') == -1) {
                returnedArrayList.add(text);
                return returnedArrayList;
            }

            int position1 = 0;
            int position2 = 0;

            while (position2 < text.length()) {
                if (text.charAt(position2) == ',') {
                    String tmp = text.substring(position1, position2);
                    position1 += tmp.length() + COMMA_AND_SPACE_PLACE;
                    returnedArrayList.add(tmp);
                }
                position2++;
            }

            if (position1 < position2) {
                returnedArrayList.add(text.substring(position1, position2));
            }

            return returnedArrayList;
        }

        public int[] countWords() {
            if (mSeparateWordsList == null) return null;

            HashMap<String, Integer> wordsMap = new HashMap<>();

            for (String s : mSeparateWordsList) {
                int cnt;
                if (wordsMap.containsKey(s)) {
                    cnt = wordsMap.get(s);
                    cnt++;
                } else {
                    cnt = 1;
                }
                wordsMap.put(s, cnt);
            }
            return printCounterResults(wordsMap);
        }

        // Copies the counts from the map into an int array
        private int[] printCounterResults(HashMap<String, Integer> m) {
            int index = 0;
            int[] returnedIntArray = new int[m.size()];

            for (int i : m.values()) {
                returnedIntArray[index] = i;
                index++;
            }
            return returnedIntArray;
        }
    }

    /* Count the number of words in a String using TreeMap.
       We can use HashMap also, but the words will not display in sorted order. */
    import java.util.*;

    public class Genric3 {
        public static void main(String[] args) {
            Map<String, Integer> unique = new TreeMap<String, Integer>();
            String string1 = "Ram:Ram: Dog: Dog: Dog: Dog:leela:leela:house:house:shayam";
            String string2[] = string1.split(":");
            for (int i = 0; i < string2.length; i++) {
                // the rest of the loop is missing from the original post
            }
        }
    }

    // Program to find the number of repeating words in a string
    // Developed by Rahul Lakhmara
    import java.util.*;

    public class CountWordsInString {
        public static void main(String[] args) {
            String original = "I am rahul am i sunil so i can say am i";

            // making String type of array
            String[] originalSplit = original.split(" ");

            // if word has only one occurrence
            int count = 1;

            // LinkedHashMap will store the word as key and number of occurrences as value
            Map<String, Integer> wordMap = new LinkedHashMap<String, Integer>();

            // The outer loop runs to the last word so a unique final word is stored too
            for (int i = 0; i < originalSplit.length; i++) {
                for (int j = i + 1; j < originalSplit.length; j++) {
                    if (originalSplit[i].equals(originalSplit[j])) {
                        // Increment count: it tracks how many times the word occurred
                        count++;
                    }
                }
                // if the word is already present we do not add it to the Map again
                if (!wordMap.containsKey(originalSplit[i])) {
                    wordMap.put(originalSplit[i], count);
                }
                count = 1;
            }

            // Printing
            for (Map.Entry<String, Integer> entry : wordMap.entrySet()) {
                System.out.println(entry.getKey() + " " + entry.getValue());
            }
        }
    }

    // Requires java.util.Arrays, java.util.LinkedHashSet and java.util.Set
    public static void main(String[] args) {
        String string = "elamparuthi, elam, elamparuthi";
        String[] s = string.replace(" ", "").split(",");
        // A LinkedHashSet drops duplicates and keeps first-appearance order.
        // A substring test with String.contains would wrongly treat "elam"
        // as already seen, because it occurs inside "elamparuthi".
        Set<String> unique = new LinkedHashSet<String>(Arrays.asList(s));
        String ops = String.join(", ", unique);
        System.out.println(ops);
    }

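Here are the steps to count the repeated words in a String:

  1. Create an empty HashMap of type String & Integer
  2. Split the String using space as the delimiter and assign the result to a String[]
  3. Iterate through the String[] array after splitting, using a for-each loop
  4. Note: we convert every string to lowercase before checking, so the comparison is case-insensitive
  5. Check whether a particular word is already present in the HashMap using the containsKey(k) method of the Map interface
  6. If it is, increase its count value by 1 using the put(K, V) method of Map
  7. Otherwise, insert it using the put() method of Map with a count value of 1
  8. Finally, print the Map using the keySet() or entrySet() method of the Map interface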
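The steps above can be sketched roughly as follows. This is not the linked article's program (that one reads its input from a file); the class name, method name, and sample sentence are made up, and the split uses the regex "\\s+" so that runs of whitespace count as a single delimiter.

```java
import java.util.HashMap;
import java.util.Map;

class RepeatedWordSteps {
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();      // step 1
        String[] words = text.trim().split("\\s+");         // step 2
        for (String raw : words) {                          // step 3
            String word = raw.toLowerCase();                // step 4
            if (counts.containsKey(word)) {                 // step 5
                counts.put(word, counts.get(word) + 1);     // step 6
            } else {
                counts.put(word, 1);                        // step 7
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("House house Dog dog DOG");
        // step 8: print via entrySet(); HashMap gives no ordering guarantee
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

With the sample sentence, "house" is counted twice and "dog" three times, because of the lowercase conversion in step 4.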

The complete program is a bit long because it reads the String content from a local file. You can read the article at the link pasted below.

http://www.benchresources.net/count-and-print-number-of-repeated-word-occurrences-in-a-string-in-java/

For strings without spaces, we can use the code below.

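    private static void findRecurrence(String input) {
        final Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < input.length(); ) {
            // ... (lost in the source: the code that computes startPointer and pointer)
            if (/* condition lost in the source; it ends with ">= 2" */) {
                String word = input.substring(startPointer, pointer);
                if (map.containsKey(word)) {
                    map.put(word, map.get(word) + 1);
                } else {
                    map.put(word, 1);
                }
                i = pointer;
            } else {
                i++;
            }
        }
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getKey() + " = " + (entry.getValue() + 1));
        }
    }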

Passing inputs such as "hahaha", "ba na na", or "xxxyyyzzzxxxzzz" gives the desired output.

Once you have the words from the string, it is easy. From Java 10 onwards, you can try the following code:

    import java.util.Arrays;
    import java.util.stream.Collectors;

    public class StringFrequencyMap {
        public static void main(String... args) {
            String[] wordArray = {"House", "House", "House", "Dog", "Dog", "Dog", "Dog"};
            var freq = Arrays.stream(wordArray)
                             .collect(Collectors.groupingBy(x -> x, Collectors.counting()));
            System.out.println(freq);
        }
    }

Output:

 {House=3, Dog=4} 
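The two-argument groupingBy collects into a map type that makes no promise about key order. If the order of first appearance matters, one option is the three-argument overload with LinkedHashMap::new as the map factory. The sketch below uses hypothetical class and method names and a different sample array; note that Collectors.counting() produces Long values, not Integer.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class OrderedFrequencyMap {
    // Groups equal words and counts them, keeping first-appearance order.
    static Map<String, Long> frequencies(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(
                        Function.identity(),   // the word itself is the key
                        LinkedHashMap::new,    // map factory: preserves insertion order
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        String[] wordArray = {"Dog", "House", "Dog", "House", "Dog"};
        System.out.println(frequencies(wordArray)); // {Dog=3, House=2}
    }
}
```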

Hope this helps:

    public static int countOfStringInAText(String stringToBeSearched, String masterString) {
        int count = 0;
        while (masterString.indexOf(stringToBeSearched) >= 0) {
            count = count + 1;
            // Continue one character past the start of this match, so overlapping matches are counted too
            masterString = masterString.substring(masterString.indexOf(stringToBeSearched) + 1);
        }
        return count;
    }

    package string;

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    public class DublicatewordinanArray {
        public static void main(String[] args) {
            String str = "This is Dileep Dileep Kumar Verma Verma";
            DuplicateString(str);
        }

        public static void DuplicateString(String str) {
            String word[] = str.split(" ");
            Map<String, Integer> map = new HashMap<String, Integer>();
            for (String w : word)
                if (!map.containsKey(w)) {
                    map.put(w, 1);
                } else {
                    map.put(w, map.get(w) + 1);
                }

            // Print only the words that occur more than once
            Set<Map.Entry<String, Integer>> entrySet = map.entrySet();
            for (Map.Entry<String, Integer> entry : entrySet)
                if (entry.getValue() > 1) {
                    System.out.printf("%s : %d %n", entry.getKey(), entry.getValue());
                }
        }
    }