How do I write a SPARQL query that uses a similarity measure in Java code?

I'd like to know a simple way to write this SPARQL query in Java code:

    select ?input ?string (strlen(?match)/strlen(?string) as ?percent)
    where {
      values ?string { "London" "Londn" "London Fog" "Lando" "Land Ho!"
                       "concatenate" "catnap" "hat" "cat" "chat" "chart" "port" "part" }

      values (?input ?pattern ?replacement) {
        ("cat"   "^x[^cat]*([c]?)[^at]*([a]?)[^t]*([t]?).*$"                              "$1$2$3")
        ("Londn" "^x[^Londn]*([L]?)[^ondn]*([o]?)[^ndn]*([n]?)[^dn]*([d]?)[^n]*([n]?).*$" "$1$2$3$4$5")
      }

      bind( replace( concat('x',?string), ?pattern, ?replacement) as ?match )
    }
    order by ?pattern desc(?percent)

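This code comes from the discussion "To use iSPARQL to compare values using similarity measures". Its purpose is to find resources on DBpedia that are similar to a given word. The approach requires that I know the string and its length in advance. I'd like to know how to write this query in a parameterized way, so that it returns the similarity measure regardless of the word and its length.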

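Update: "ARQ - Writing Property Functions" is now part of the standard Jena documentation.

It looks like you'd like a syntactic extension to SPARQL that performs the more complex portions of the query. For example: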

    SELECT ?input ?string ?percent
    WHERE {
      VALUES ?string { "London" "Londn" "London Fog" "Lando" "Land Ho!"
                       "concatenate" "catnap" "hat" "cat" "chat" "chart" "port" "part" }

      VALUES ?input { "cat" "londn" }

      ?input <urn:ex:fn#matches> (?string ?percent) .
    }
    ORDER BY DESC(?percent)

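In this example, <urn:ex:fn#matches> is assumed to be a property function that automatically performs the matching operation and calculates the similarity.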

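The Jena documentation does a good job of explaining how to write a custom filter function, but (as of August 7, 2014) says very little about how to implement a custom property function.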

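I will assume that you can convert your answer into Java code that computes string similarity, and will focus on implementing a property function that can house that code.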
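As a rough sketch of what that conversion might look like, the class below rebuilds the regular expression from the question for an arbitrary input word and computes the same ratio, strlen(?match)/strlen(?string), in plain Java. The names SimilaritySketch, escapeInClass, buildPattern and similarity are made up for illustration, and the sketch assumes the input word contains only characters from the Basic Multilingual Plane.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimilaritySketch {

    // Escapes one character for use inside a regex character class.
    // Letters and digits are left alone; anything else gets a backslash.
    public static String escapeInClass(char c) {
        return Character.isLetterOrDigit(c) ? String.valueOf(c) : "\\" + c;
    }

    // Builds the same pattern shape as the question. For "cat" this yields
    // ^x[^cat]*([c]?)[^at]*([a]?)[^t]*([t]?).*$
    public static String buildPattern(String input) {
        StringBuilder sb = new StringBuilder("^x");
        for (int i = 0; i < input.length(); i++) {
            // Characters of the input from position i to the end.
            StringBuilder rest = new StringBuilder();
            for (int j = i; j < input.length(); j++) {
                rest.append(escapeInClass(input.charAt(j)));
            }
            sb.append("[^").append(rest).append("]*")
              .append("([").append(escapeInClass(input.charAt(i))).append("]?)");
        }
        return sb.append(".*$").toString();
    }

    // Number of captured characters divided by the candidate's length,
    // mirroring strlen(?match)/strlen(?string) in the query.
    public static double similarity(String input, String candidate) {
        if (candidate.isEmpty()) {
            return 0.0;
        }
        Matcher m = Pattern.compile(buildPattern(input), Pattern.DOTALL)
                           .matcher("x" + candidate);
        if (!m.matches()) {
            return 0.0;
        }
        int matched = 0;
        for (int g = 1; g <= m.groupCount(); g++) {
            matched += m.group(g).length();
        }
        return (double) matched / candidate.length();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"cat", "chat", "hat", "concatenate"}) {
            System.out.println(s + " -> " + similarity("cat", s));
        }
    }
}
```

Because the pattern is generated from the input, neither the word nor its length has to be known ahead of time. A method like similarity is the kind of logic that would replace the hard-coded percent value in the property function.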

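Implementing a Property Function

Every property function is associated with a particular Context. This lets you make the function available either globally or only for a particular dataset.

Assuming you have an implementation of PropertyFunctionFactory (shown later), you can register the function as follows: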

    // Global registration: the registry lives in the ARQ-wide context.
    final PropertyFunctionRegistry reg = PropertyFunctionRegistry.chooseRegistry(ARQ.getContext());
    reg.put("urn:ex:fn#matches", new MatchesPropertyFunctionFactory());
    PropertyFunctionRegistry.set(ARQ.getContext(), reg);

The only difference between global and dataset-specific registration is where the Context object comes from:

    // Dataset-specific registration: the registry lives in the dataset's own context.
    final Dataset ds = DatasetFactory.createMem();
    final PropertyFunctionRegistry reg = PropertyFunctionRegistry.chooseRegistry(ds.getContext());
    reg.put("urn:ex:fn#matches", new MatchesPropertyFunctionFactory());
    PropertyFunctionRegistry.set(ds.getContext(), reg);

MatchesPropertyFunctionFactory

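    // Imports assume Apache Jena 3.x or later package names (org.apache.jena.*);
    // Jena 2.x used the com.hp.hpl.jena prefix instead.
    import org.apache.jena.datatypes.xsd.XSDDatatype;
    import org.apache.jena.graph.Node;
    import org.apache.jena.graph.NodeFactory;
    import org.apache.jena.sparql.core.Var;
    import org.apache.jena.sparql.engine.ExecutionContext;
    import org.apache.jena.sparql.engine.QueryIterator;
    import org.apache.jena.sparql.engine.binding.Binding;
    import org.apache.jena.sparql.engine.binding.BindingFactory;
    import org.apache.jena.sparql.engine.iterator.QueryIterNullIterator;
    import org.apache.jena.sparql.engine.iterator.QueryIterSingleton;
    import org.apache.jena.sparql.pfunction.PFuncSimpleAndList;
    import org.apache.jena.sparql.pfunction.PropFuncArg;
    import org.apache.jena.sparql.pfunction.PropertyFunction;
    import org.apache.jena.sparql.pfunction.PropertyFunctionFactory;

    public class MatchesPropertyFunctionFactory implements PropertyFunctionFactory {
        @Override
        public PropertyFunction create(final String uri) {
            return new PFuncSimpleAndList() {
                @Override
                public QueryIterator execEvaluated(final Binding parent, final Node subject,
                        final Node predicate, final PropFuncArg object,
                        final ExecutionContext execCxt) {
                    /* TODO insert your stuff to perform testing. Note that you'll need
                     * to validate that things like subject/predicate/etc are bound */
                    final boolean nonzeroPercentMatch = true; // XXX example-specific kludge
                    final Double percent = 0.75;              // XXX example-specific kludge

                    if (nonzeroPercentMatch) {
                        // Bind the second list argument (?percent) to the computed value.
                        final Binding binding = BindingFactory.binding(parent,
                                Var.alloc(object.getArg(1)),
                                NodeFactory.createLiteral(percent.toString(), XSDDatatype.XSDdecimal));
                        return QueryIterSingleton.create(binding, execCxt);
                    } else {
                        // No match: contribute no solutions.
                        return QueryIterNullIterator.create(execCxt);
                    }
                }
            };
        }
    }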

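Because the property function we create takes a list as an argument, we use PFuncSimpleAndList as the abstract implementation. Aside from that, most of the magic inside these property functions is creating Bindings and QueryIterators and validating the input arguments.

Validation/Closing Notes

This should be more than enough to get you going on writing your own property function, if that is where you'd like to house your string-matching logic.

What hasn't been shown is input validation. In this answer, I assume that subject and the first list argument (object.getArg(0)) are bound (Node.isConcrete()), and that the second list argument (object.getArg(1)) is not (Node.isVariable()). If your method isn't called in this manner, things will explode. Hardening the method (adding many if-else blocks with condition checks) or supporting alternative use cases (i.e., looking up values for object.getArg(0) if it is a variable) is left to the reader, because it's tedious to demonstrate, easily testable, and readily apparent during implementation.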
