Tag: stanford nlp

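Silencing Stanford CoreNLP logging: First of all, Java is not a language I use often, so my knowledge of it is very basic. I need it for this particular project, so please bear with me; if I have left out any relevant information, ask and I will gladly provide it. I have been able to set up CoreNLP and it seems to work properly, but it produces many messages such as: ene 20, 2017 10:38:42 AM edu.stanford.nlp.process.PTBLexer next ADVERTENCIA: Untokenizable: 【 (U+3010, decimal: 12304) After some research (documentation, Google, other threads), I believe (sorry, I do not know how to confirm it) that CoreNLP is finding slf4j-api.jar on my classpath and logging through it. Which JVM properties can I use to set the logging level of the messages that get printed? Also, in which .properties file can I set them? (I already have a commons-logging.properties, a simplelog.properties, and a StanfordCoreNLP.properties in my project's resources folder to set properties for other packages.)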
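
The "date, class, method, localized level" layout of the message above looks like the default java.util.logging format, so one thing to try is raising the level of the edu.stanford.nlp logger. This is only a sketch under the assumption that CoreNLP really is logging through java.util.logging in this setup; if the messages are routed through another backend, it will have no effect. The class name QuietCoreNlpLogging is made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class QuietCoreNlpLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly, so a
    // configured logger could otherwise be garbage-collected and lose its level.
    private static final Logger CORENLP_LOGGER = Logger.getLogger("edu.stanford.nlp");

    // Raise the threshold for every logger under edu.stanford.nlp
    // (child loggers without their own level inherit this one).
    static Logger silence() {
        CORENLP_LOGGER.setLevel(Level.SEVERE);
        return CORENLP_LOGGER;
    }

    public static void main(String[] args) {
        silence();
        Logger lexer = Logger.getLogger("edu.stanford.nlp.process.PTBLexer");
        System.out.println("WARNING loggable: " + lexer.isLoggable(Level.WARNING));
        System.out.println("SEVERE loggable:  " + lexer.isLoggable(Level.SEVERE));
    }
}
```

The same threshold can be set without code by putting the line edu.stanford.nlp.level = SEVERE in a logging configuration file and starting the JVM with -Djava.util.logging.config.file pointing at it. Separately, the PTB tokenizer has an untokenizable option (for example tokenize.options = untokenizable=noneKeep in the CoreNLP properties) that controls whether these particular warnings are emitted at all.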

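Getting the K best parses of a sentence with the Stanford Parser: I want to get the K best parses of a sentence. I think this can be done with the ExhaustivePCFGParser class; the problem is that I do not know how to use it. More precisely, can I instantiate this class myself? The constructor is ExhaustivePCFGParser(BinaryGrammar bg, UnaryGrammar ug, Lexicon lex, Options op, Index stateIndex, Index wordIndex, Index tagIndex), but I do not know how to supply all of these arguments. Is there an easier way to do K-best parsing?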

Spark 2.0.1 write error: Caused by: java.util.NoSuchElementException: I am trying to attach a sentiment value to each message. I have downloaded all the Stanford CoreNLP jar files as dependencies: import sqlContext.implicits._ import com.databricks.spark.corenlp.functions._ import org.apache.spark.sql.functions._ val version = "3.6.0" val model = s"stanford-corenlp-$version-models-english" // val jars = sc.listJars if (!jars.exists(jar => jar.contains(model))) { import scala.sys.process._ s"wget http://repo1.maven.org/maven2/edu/stanford/nlp/stanford-corenlp/$version/$model.jar -O /tmp/$model.jar".!! sc.addJar(s"/tmp/$model.jar")} val all_messages = spark.read.parquet("/home/ubuntu/messDS.parquet") case class AllMessSent (user_id: Int, sent_at: java.sql.Timestamp, message: String) val messDS = all_messages.as[AllMess] So far everything is fine, since I can run computations and save the DS case class AllMessSentiment = […]

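Adding a new annotator to Stanford CoreNLP: I am trying to add a new annotator to Stanford CoreNLP following the instructions at http://nlp.stanford.edu/downloads/corenlp.shtml. "Adding a new annotator: StanfordCoreNLP also has the capacity to add a new annotator by reflection without altering the code in StanfordCoreNLP.java. To create a new annotator, extend the class edu.stanford.nlp.pipeline.Annotator and define a constructor with the signature (String, Properties). Then, add the property customAnnotatorClass.FOO=BAR to the properties used to create the pipeline. If FOO is then added to the list of annotators, the class BAR will be created, with the name used to create it and the properties file passed in." I have created a new class for my new annotator, but I cannot work out how to get the passed-in properties file into it. So far I have only added the new annotator to the pipeline: props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref, regexner, color"); props.setProperty("customAnnotatorClass.color", "myPackage.myPipeline"); Is there any example code that could help me?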
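
The quoted passage describes a lookup by reflection, and the JDK-only sketch below mimics that mechanism to show where the properties end up: they are handed to the (String, Properties) constructor of the class named in customAnnotatorClass.FOO. This is an illustration of the idea, not CoreNLP's actual code. ColorAnnotatorStub and CustomAnnotatorLookup are hypothetical names, and a real annotator would additionally implement edu.stanford.nlp.pipeline.Annotator.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

// Hypothetical stand-in for a custom annotator class. The important part is
// the (String, Properties) constructor that the quoted documentation requires.
class ColorAnnotatorStub {
    final String name;
    final Properties props;

    public ColorAnnotatorStub(String name, Properties props) {
        this.name = name;    // the annotator name used in the "annotators" list
        this.props = props;  // the same Properties object used to build the pipeline
    }
}

// Hypothetical loader that mimics the described lookup; not CoreNLP code.
class CustomAnnotatorLookup {
    static Object create(String annotatorName, Properties props) {
        String className = props.getProperty("customAnnotatorClass." + annotatorName);
        if (className == null) {
            throw new IllegalArgumentException(
                "No customAnnotatorClass." + annotatorName + " property set");
        }
        try {
            // Find the two-argument constructor and pass the name and properties in.
            Constructor<?> ctor =
                Class.forName(className).getConstructor(String.class, Properties.class);
            return ctor.newInstance(annotatorName, props);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, color");
        props.setProperty("customAnnotatorClass.color", ColorAnnotatorStub.class.getName());
        ColorAnnotatorStub stub = (ColorAnnotatorStub) create("color", props);
        System.out.println(stub.name + " received " + stub.props.size() + " properties");
    }
}
```

Under this reading, any setting the custom annotator needs can be added as an extra key on the same Properties object and read back inside the constructor.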

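Phrase-level dependency parser using Java, NLP: Can someone explain in detail how to get "phrase-level dependencies" using Stanford's Natural Language Processing Lexical Parser (open-source Java code)? http://svn.apache.org/repos/asf/nutch/branches/branch-1.2/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/RobotRulesParser.java http://docs.mongodb.org/manual/reference/sql-comparison/ For example, parsed dependencies such as: accident ---> happened, falling ---> as, night ---> falling, and more like these... Thanks!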

Unable to initialize CoreNLP in R: I cannot get coreNLP to work in R on a Mac running High Sierra. I am not sure what the problem is, but every time I try again to get coreNLP working I seem to hit a different error. I have JDK 9.0.4. See the code below for what I am trying to do and the error that is blocking me. In earlier attempts I was able to get initCoreNLP() to run and load some elements of the package, but it failed on others. When I then tried to run annotateString(), it threw the error: Error Must initialize with 'initCoreNLP'! I have downloaded and re-downloaded the coreNLP Java archive several times, still with no luck. See the image for the contents of the coreNLP R package folder located at /Library/Frameworks/R.framework/Versions/3.4/Resources/library/coreNLP. Do you know how I can initialize coreNLP successfully? dyn.load("/Library/Java/JavaVirtualMachines/jdk-9.0.4.jdk/Contents/Home/lib/server/libjvm.dylib") library(NLP) library(coreNLP) > downloadCoreNLP() trying URL 'http://nlp.stanford.edu/software//stanford-corenlp-full-2015-12-09.zip' Content type 'application/zip' length 403157240 bytes (384.5 MB) ================================================== downloaded 384.5 MB > initCoreNLP() [main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Searching for resource: StanfordCoreNLP.properties Error in rJava::.jnew("edu.stanford.nlp.pipeline.StanfordCoreNLP", […]

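Extracting the relation between two entities with StanfordCoreNLP: A similar question has been asked here before, but I could not find any relevant answer, so I am trying again. I can get NER tags and the dependency tree using the library. What I want now is to extract entities together with the relation between them. For example, for "flipkart invested in myntra", I should get entity1 as "flipkart", entity2 as "myntra", and "investor" as the relation, or some similar structure. I have not been able to find the right tool for this. Could someone give me some guidance on how to achieve it? Thanks in advance.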

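Stanford dependency parser: I tried the Stanford dependency parser and got the following parse tree and relations, but what I need is a dependency graph. How do I get one? Is there a way to convert the dependencies into a graph? Please help; I am new to Java and the Stanford tools. Program is a set of instruction (ROOT (S (NP (NNP Program)) (VP (VBZ is) (NP (NP (DT a) (NN set)) (PP (IN of) (NP (NN instruction))))))) nsubj(set-4, Program-1) cop(set-4, is-2) det(set-4, a-3) root(ROOT-0, set-4) prep_of(set-4, instruction-6)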
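
Each printed relation has the form relation(governor, dependent), which is already an edge list, so a graph can be built from it directly. The sketch below is library-free and only parses the printed text; the class and method names are invented for illustration. CoreNLP also has its own graph type, edu.stanford.nlp.semgraph.SemanticGraph, which is likely the better route when working inside the library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DependencyGraphSketch {
    // One labeled edge: governor --relation--> dependent.
    static class Edge {
        final String dependent;
        final String relation;
        Edge(String dependent, String relation) {
            this.dependent = dependent;
            this.relation = relation;
        }
    }

    // Matches text such as "nsubj(set-4, Program-1)".
    private static final Pattern DEP = Pattern.compile("([\\w:]+)\\((\\S+),\\s*(\\S+)\\)");

    // Builds an adjacency list keyed by governor, preserving input order.
    static Map<String, List<Edge>> build(String typedDependencies) {
        Map<String, List<Edge>> graph = new LinkedHashMap<>();
        Matcher m = DEP.matcher(typedDependencies);
        while (m.find()) {
            graph.computeIfAbsent(m.group(2), k -> new ArrayList<>())
                 .add(new Edge(m.group(3), m.group(1)));
        }
        return graph;
    }

    // Renders the graph in Graphviz DOT syntax so it can be drawn.
    static String toDot(Map<String, List<Edge>> graph) {
        StringBuilder sb = new StringBuilder("digraph deps {\n");
        for (Map.Entry<String, List<Edge>> entry : graph.entrySet()) {
            for (Edge e : entry.getValue()) {
                sb.append("  \"").append(entry.getKey()).append("\" -> \"")
                  .append(e.dependent).append("\" [label=\"")
                  .append(e.relation).append("\"];\n");
            }
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        String deps = "nsubj(set-4, Program-1) cop(set-4, is-2) det(set-4, a-3) "
                    + "root(ROOT-0, set-4) prep_of(set-4, instruction-6)";
        System.out.print(toDot(build(deps)));
    }
}
```

The DOT text printed by main can be passed to Graphviz to draw the graph, with set-4 as the hub node governing the other words.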

Stanford NLP - OpenIE runs out of memory when processing a list of files: I am trying to extract information from multiple files with the OpenIE tool in Stanford CoreNLP. When several files are passed as input, rather than just one, it fails with an out-of-memory error. All files have been queued; awaiting termination… java.lang.OutOfMemoryError: GC overhead limit exceeded at edu.stanford.nlp.graph.DirectedMultiGraph.outgoingEdgeIterator(DirectedMultiGraph.java:508) at edu.stanford.nlp.semgraph.SemanticGraph.outgoingEdgeIterator(SemanticGraph.java:165) at edu.stanford.nlp.semgraph.semgrex.GraphRelation$GOVERNER$1.advance(GraphRelation.java:267) at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.initialize(GraphRelation.java:1102) at edu.stanford.nlp.semgraph.semgrex.GraphRelation$SearchNodeIterator.<init>(GraphRelation.java:1083) at edu.stanford.nlp.semgraph.semgrex.GraphRelation$GOVERNER$1.<init>(GraphRelation.java:257) at edu.stanford.nlp.semgraph.semgrex.GraphRelation$GOVERNER.searchNodeIterator(GraphRelation.java:257) at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.resetChildIter(NodePattern.java:320) at edu.stanford.nlp.semgraph.semgrex.CoordinationPattern$CoordinationMatcher.matches(CoordinationPattern.java:211) at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.matchChild(NodePattern.java:514) at edu.stanford.nlp.semgraph.semgrex.NodePattern$NodeMatcher.matches(NodePattern.java:542) at edu.stanford.nlp.naturalli.RelationTripleSegmenter.segmentVerb(RelationTripleSegmenter.java:541) at edu.stanford.nlp.naturalli.RelationTripleSegmenter.segment(RelationTripleSegmenter.java:850) at edu.stanford.nlp.naturalli.OpenIE.relationInFragment(OpenIE.java:354) at edu.stanford.nlp.naturalli.OpenIE.lambda$relationsInFragments$2(OpenIE.java:366) at edu.stanford.nlp.naturalli.OpenIE$$Lambda$76/1438896944.apply(Unknown Source) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1540) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) […]

Stanford Core NLP: non-deterministic entity types: I built a Java parser using Stanford Core NLP, and I am having trouble getting consistent results from the CoreNLP object: I get different entity types for the same input text. This looks like a bug in CoreNLP. I would like to know whether any StanfordNLP users have run into this problem and found a workaround. This is the Service class that I instantiate and reuse. class StanfordNLPService { //private static final Logger logger = LogConfiguration.getInstance().getLogger(StanfordNLPServer.class.getName()); private StanfordCoreNLP nerPipeline; /* Initialize the nlp instances for ner and sentiments. */ public void init() { Properties nerAnnotators = new Properties(); nerAnnotators.put("annotators", "tokenize,ssplit,pos,lemma,ner"); nerPipeline = new StanfordCoreNLP(nerAnnotators); } /** * @param text Text from entities to […]