Getting wrong scores with SentiWordNet

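I am using SentiWordNet for some sentiment analysis, following the post here that explains how to use SentiWordNet. However, no matter what input I try, I get a score of 0.0. What am I doing wrong? Thanks!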

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Set;
    import java.util.Vector;

    public class SWN3 {
        private String pathToSWN = "C:\\Users\\Malcolm\\Desktop\\SentiWordNet_3.0.0\\home\\swn\\www\\admin\\dump\\SentiWordNet_3.0.0.txt";
        // Maps "word#pos" to the weighted average score over all senses.
        private HashMap<String, Double> _dict;

        public SWN3() {
            _dict = new HashMap<String, Double>();
            // Maps "word#pos" to per-sense scores, indexed by sense rank - 1.
            HashMap<String, Vector<Double>> _temp = new HashMap<String, Vector<Double>>();
            try {
                BufferedReader csv = new BufferedReader(new FileReader(pathToSWN));
                String line = "";
                while ((line = csv.readLine()) != null) {
                    // Columns: POS, ID, PosScore, NegScore, SynsetTerms, Gloss
                    String[] data = line.split("\t");
                    Double score = Double.parseDouble(data[2]) - Double.parseDouble(data[3]);
                    String[] words = data[4].split(" ");
                    for (String w : words) {
                        String[] w_n = w.split("#");
                        w_n[0] += "#" + data[0];
                        int index = Integer.parseInt(w_n[1]) - 1;
                        if (_temp.containsKey(w_n[0])) {
                            Vector<Double> v = _temp.get(w_n[0]);
                            if (index > v.size())
                                for (int i = v.size(); i < index; i++)
                                    v.add(0.0);
                            v.add(index, score);
                            _temp.put(w_n[0], v);
                        } else {
                            Vector<Double> v = new Vector<Double>();
                            for (int i = 0; i < index; i++)
                                v.add(0.0);
                            v.add(index, score);
                            _temp.put(w_n[0], v);
                        }
                    }
                }
                Set<String> temp = _temp.keySet();
                for (Iterator<String> iterator = temp.iterator(); iterator.hasNext();) {
                    String word = iterator.next();
                    Vector<Double> v = _temp.get(word);
                    double score = 0.0;
                    double sum = 0.0;
                    // Weight sense i by 1/(i+1), then normalize by the sum of the weights.
                    for (int i = 0; i < v.size(); i++)
                        score += ((double) 1 / (double) (i + 1)) * v.get(i);
                    for (int i = 1; i <= v.size(); i++)
                        sum += (double) 1 / (double) i;
                    score /= sum;
                    // The label below is computed but never used; only the score is stored.
                    String sent = "";
                    if (score >= 0.75)
                        sent = "strong_positive";
                    else if (score > 0.25 && score <= 0.5)
                        sent = "positive";
                    else if (score > 0 && score <= 0.25)
                        sent = "weak_positive";
                    else if (score < 0 && score >= -0.25)
                        sent = "weak_negative";
                    else if (score < -0.25 && score >= -0.5)
                        sent = "negative";
                    else if (score <= -0.75)
                        sent = "strong_negative";
                    _dict.put(word, score);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        // Sums the scores of the word across noun, adjective, adverb and verb entries.
        public Double extract(String word) {
            Double total = 0.0;
            if (_dict.get(word + "#n") != null)
                total = _dict.get(word + "#n") + total;
            if (_dict.get(word + "#a") != null)
                total = _dict.get(word + "#a") + total;
            if (_dict.get(word + "#r") != null)
                total = _dict.get(word + "#r") + total;
            if (_dict.get(word + "#v") != null)
                total = _dict.get(word + "#v") + total;
            return total;
        }

        public static void main(String[] args) {
            SWN3 test = new SWN3();
            String sentence = "Hello have a Super awesome great day";
            String[] words = sentence.split("\\s+");
            double totalScore = 0;
            for (String word : words) {
                word = word.replaceAll("([^a-zA-Z\\s])", "");
                if (test.extract(word) == null)
                    continue;
                totalScore += test.extract(word);
            }
            System.out.println(totalScore);
        }
    }

Here are the first 10 lines of SentiWordNet.txt:

    a 00001740 0.125 0 able#1 (usually followed by `to') having the necessary means or skill or know-how or authority to do something; "able to swim"; "she was able to program her computer"; "we were at last able to buy a car"; "able to get a grant for the project"
    a 00002098 0 0.75 unable#1 (usually followed by `to') not having the necessary means or skill or know-how; "unable to get to town without a car"; "unable to obtain funds"
    a 00002312 0 0 dorsal#2 abaxial#1 facing away from the axis of an organ or organism; "the abaxial surface of a leaf is the underside or side facing away from the stem"
    a 00002527 0 0 ventral#2 adaxial#1 nearest to or facing toward the axis of an organ or organism; "the upper side of a leaf is known as the adaxial surface"
    a 00002730 0 0 acroscopic#1 facing or on the side toward the apex
    a 00002843 0 0 basiscopic#1 facing or on the side toward the base
    a 00002956 0 0 abducting#1 abducent#1 especially of muscles; drawing away from the midline of the body or from an adjacent part
    a 00003131 0 0 adductive#1 adducting#1 adducent#1 especially of muscles; bringing together or drawing toward the midline of the body or toward an adjacent part
    a 00003356 0 0 nascent#1 being born or beginning; "the nascent chicks"; "a nascent insurgency"
    a 00003553 0 0 emerging#2 emergent#2 coming into existence; "an emergent republic"

The SentiWordNet .txt file usually ships in a somewhat odd format.

You need to remove the first part of the file (the comments and instructions) and the last two lines:

    #
    (empty line)

The parser does not know how to handle these lines. Once you delete the header and those two extra lines, you should be fine.
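If you would rather not edit the file by hand, an alternative is to have the parser skip those lines itself. My guess at the cause is that a comment or empty line has no numeric score columns, so the parse throws, the `catch` block only prints the stack trace, and the dictionary stays empty, which would explain every score being 0.0. Below is a minimal sketch of such a filter. The class `SwnLineFilter` and its methods are hypothetical names for illustration, not part of the `SWN3` class in the question, and the sample lines in `main` are made up in the same tab-separated layout that the question's code expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the SWN3 class in the question.
final class SwnLineFilter {

    // Keeps only data lines: non-blank, not starting with '#', and with at
    // least the five tab-separated columns POS, ID, PosScore, NegScore, SynsetTerms.
    static List<String[]> dataRows(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // header comments, the trailing "#" line, empty lines
            }
            String[] cols = line.split("\t");
            if (cols.length >= 5) {
                rows.add(cols);
            }
        }
        return rows;
    }

    // Same per-synset score the question computes: PosScore minus NegScore.
    static double score(String[] cols) {
        return Double.parseDouble(cols[2]) - Double.parseDouble(cols[3]);
    }

    public static void main(String[] args) {
        // Made-up sample: a comment header, two data lines, then "#" and an empty line.
        List<String> sample = Arrays.asList(
            "# comment header",
            "a\t00001740\t0.125\t0\table#1\thaving the necessary means",
            "a\t00002098\t0\t0.75\tunable#1\tnot having the necessary means",
            "#",
            "");
        for (String[] cols : dataRows(sample)) {
            System.out.println(cols[4] + " -> " + score(cols));
        }
    }
}
```

In the question's constructor, the same check could go at the top of the `while` loop, before `line.split("\t")`, so the original file can be used unmodified.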