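Unpacking a Maven repository index generated by Nexus

I have downloaded the index generated for Maven Central from http://mirrors.ibiblio.org/pub/mirrors/maven2/dot-index/nexus-maven-repository-index.gz.

I would like to list the artifact information (e.g. groupId, artifactId, version) contained in these index files. I have read that there is a high-level API for this, and it seems I have to use the following Maven dependency. However, I don't know which entry point to use (which class?) or how to use it to access those files: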

    <dependency>
        <groupId>org.sonatype.nexus</groupId>
        <artifactId>nexus-indexer</artifactId>
        <version>3.0.4</version>
    </dependency>

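Take a look at the https://github.com/cstamas/maven-indexer-examples project.

In short: you don't need to download the GZ/ZIP (new/legacy format) manually; the indexer will do that for you (and, where possible, it will also handle incremental updates for you).

GZ is the "new" format. It is independent of the Lucene index format (and therefore of the Lucene version) and contains data only. ZIP is the "old" format, which is really just a plain Lucene 2.4.x index zipped up. No data content has changed so far, but changes are planned for the future.

As I said, there is no difference in data content between the two, but some fields (as you have noticed) are indexed but not stored in the index, so if you consume the ZIP format you will be able to search on them but not retrieve them.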
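Since the GZ file is an ordinary gzip stream, it can be decompressed with nothing but the JDK if you only want to look at the raw bytes. The following is a minimal sketch, not part of the indexer API: the class name `IndexGunzip` and the default file names are made up for illustration. The decompressed payload is the indexer's own data format rather than a Lucene index, so turning it into artifact records still requires the indexer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: plain gzip decompression using only the JDK.
final class IndexGunzip {

    /** Decompresses a gzip stream into target and returns the number of bytes written. */
    static long gunzip(InputStream compressed, OutputStream target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (GZIPInputStream in = new GZIPInputStream(compressed)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                target.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Default file names are assumptions; pass your own paths as arguments.
        Path source = Paths.get(args.length > 0 ? args[0] : "nexus-maven-repository-index.gz");
        Path target = Paths.get(args.length > 1 ? args[1] : "nexus-maven-repository-index.raw");
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            System.out.println("Wrote " + gunzip(in, out) + " bytes to " + target);
        }
    }
}
```

This only removes the compression layer; it does not interpret the records inside.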

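https://github.com/cstamas/maven-indexer-examples is obsolete. The build fails (the tests do not pass).

The Nexus Indexer has moved and now includes the examples: https://github.com/apache/maven-indexer/tree/master/indexer-examples

That one builds, and the code works.

If you want to roll your own, here is a simplified version:

Maven: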

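    <dependency>
        <groupId>org.apache.maven.indexer</groupId>
        <artifactId>indexer-core</artifactId>
        <version>6.0-SNAPSHOT</version>
        <scope>compile</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.maven.wagon</groupId>
        <artifactId>wagon-http-lightweight</artifactId>
        <version>2.3</version>
        <scope>compile</scope>
    </dependency>
    <dependency>
        <groupId>org.eclipse.sisu</groupId>
        <artifactId>org.eclipse.sisu.plexus</artifactId>
        <version>0.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.sonatype.sisu</groupId>
        <artifactId>sisu-guice</artifactId>
        <version>3.2.4</version>
    </dependency>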

Java:

    public IndexToGavMappingConverter(File dataDir, String id, String url)
        throws PlexusContainerException, ComponentLookupException, IOException
    {
        this.dataDir = dataDir;

        // Create Plexus container, the Maven default IoC container.
        final DefaultContainerConfiguration config = new DefaultContainerConfiguration();
        config.setClassPathScanning( PlexusConstants.SCANNING_INDEX );
        this.plexusContainer = new DefaultPlexusContainer( config );

        // Lookup the indexer components from plexus.
        this.indexer = plexusContainer.lookup( Indexer.class );
        this.indexUpdater = plexusContainer.lookup( IndexUpdater.class );
        // Lookup wagon used to remotely fetch index.
        this.httpWagon = plexusContainer.lookup( Wagon.class, "http" );

        // Files where local cache is (if any) and Lucene Index should be located
        this.centralLocalCache = new File( this.dataDir, id + "-cache" );
        this.centralIndexDir = new File( this.dataDir, id + "-index" );

        // Creators we want to use (search for fields it defines).
        // See https://maven.apache.org/maven-indexer/indexer-core/apidocs/index.html?constant-values.html
        List<IndexCreator> indexers = new ArrayList<>();
        // https://maven.apache.org/maven-indexer/apidocs/org/apache/maven/index/creator/MinimalArtifactInfoIndexCreator.html
        indexers.add( plexusContainer.lookup( IndexCreator.class, "min" ) );
        // https://maven.apache.org/maven-indexer/apidocs/org/apache/maven/index/creator/JarFileContentsIndexCreator.html
        //indexers.add( plexusContainer.lookup( IndexCreator.class, "jarContent" ) );
        // https://maven.apache.org/maven-indexer/apidocs/org/apache/maven/index/creator/MavenPluginArtifactInfoIndexCreator.html
        //indexers.add( plexusContainer.lookup( IndexCreator.class, "maven-plugin" ) );

        // Create context for central repository index.
        this.centralContext = this.indexer.createIndexingContext(
            id + "Context", id, this.centralLocalCache, this.centralIndexDir,
            url, null, true, true, indexers );
    }

    // Walk every live document in the index and write "<sha1> <g>:<a>:<v>:<classifier>".
    // 'out' is whatever Writer or StringBuilder the surrounding method provides.
    final IndexSearcher searcher = this.centralContext.acquireIndexSearcher();
    try
    {
        final IndexReader ir = searcher.getIndexReader();
        Bits liveDocs = MultiFields.getLiveDocs( ir );
        for ( int i = 0; i < ir.maxDoc(); i++ )
        {
            // liveDocs == null means the index has no deleted documents.
            if ( liveDocs == null || liveDocs.get( i ) )
            {
                final Document doc = ir.document( i );
                final ArtifactInfo ai = IndexUtils.constructArtifactInfo( doc, this.centralContext );

                // Skip non-artifact documents, entries without a valid SHA-1,
                // and javadoc/sources attachments.
                if ( ai == null ) continue;
                if ( ai.getSha1() == null ) continue;
                if ( ai.getSha1().length() != 40 ) continue;
                if ( "javadoc".equals( ai.getClassifier() ) ) continue;
                if ( "sources".equals( ai.getClassifier() ) ) continue;

                out.append( StringUtils.lowerCase( ai.getSha1() ) ).append( ' ' );
                out.append( ai.getGroupId() ).append( ":" );
                out.append( ai.getArtifactId() ).append( ":" );
                out.append( ai.getVersion() ).append( ":" );
                out.append( StringUtils.defaultString( ai.getClassifier() ) );
                out.append( '\n' );
            }
        }
    }
    finally
    {
        this.centralContext.releaseIndexSearcher( searcher );
    }

We use this in the Windup project, a JBoss migration tool.
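The filtering and output rules of that loop can be tried out without an index at hand. The sketch below restates them as a plain function using only the JDK; `GavLineFormatter`, `formatLine` and the `org.example` coordinates are hypothetical names for illustration, not part of Maven Indexer or Windup.

```java
import java.util.Locale;

// Hypothetical stand-alone restatement of the loop's skip rules and line format.
final class GavLineFormatter {

    /**
     * Returns "<sha1> <groupId>:<artifactId>:<version>:<classifier>",
     * or null when the entry would be skipped by the loop.
     */
    static String formatLine(String sha1, String groupId, String artifactId,
                             String version, String classifier) {
        // A SHA-1 digest in hex is exactly 40 characters long.
        if (sha1 == null || sha1.length() != 40) {
            return null;
        }
        // Javadoc and sources attachments are not of interest.
        if ("javadoc".equals(classifier) || "sources".equals(classifier)) {
            return null;
        }
        // A missing classifier is written as an empty string.
        String safeClassifier = classifier == null ? "" : classifier;
        return sha1.toLowerCase(Locale.ROOT) + ' '
            + groupId + ':' + artifactId + ':' + version + ':' + safeClassifier;
    }

    public static void main(String[] args) {
        String sha1 = "0123456789ABCDEF0123456789ABCDEF01234567"; // made-up digest
        System.out.println(formatLine(sha1, "org.example", "demo", "1.0", null));
        System.out.println(formatLine(sha1, "org.example", "demo", "1.0", "sources"));
    }
}
```

Keeping the rules in a pure function like this makes them easy to unit-test separately from the Lucene iteration.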

The legacy ZIP index is a plain Lucene index. I was able to open it with Luke and write some simple Lucene code to dump the field of interest ("u" in this case):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.search.IndexSearcher;

    public class Dumper {
        public static void main(String[] args) throws Exception {
            // Path to the directory holding the unzipped legacy index.
            IndexSearcher searcher = new IndexSearcher("c:/PROJECTS/Test/index");
            for (int i = 0; i < searcher.maxDoc(); i++) {
                Document doc = searcher.doc(i);
                // "u" holds the artifact coordinates.
                String metadata = doc.get("u");
                if (metadata != null) {
                    System.out.println(metadata);
                }
            }
        }
    }

Sample output...

    org.ioke|ioke-lang-lib|P-0.4.0-p11|NA
    org.jboss.weld.archetypes|jboss-javaee6-webapp|1.0.1.CR2|sources|jar
    org.jboss.weld.archetypes|jboss-javaee6-webapp|1.0.1.CR2|NA
    org.nutz|nutz|1.b.37|javadoc|jar
    org.nutz|nutz|1.b.37|sources|jar
    org.nutz|nutz|1.b.37|NA
    org.openengsb.wrapped|com.google.gdata|1.41.5.w1|NA
    org.openengsb.wrapped|openengsb-wrapped-parent|6|NA
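Each dumped value can be split back into coordinates. The sketch below is based only on the sample output above and makes two assumptions: the fourth part is the classifier, with "NA" meaning there is none, and the optional fifth part is the file extension. `UinfoParser` and `Coordinates` are hypothetical names, not indexer classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for "u" values; the field meanings are inferred from the sample output.
final class UinfoParser {

    /** Simple holder for the parts of one "u" value. */
    static final class Coordinates {
        final String groupId;
        final String artifactId;
        final String version;
        final String classifier; // null when the value says "NA"
        final String extension;  // null when no fifth part is present

        Coordinates(String groupId, String artifactId, String version,
                    String classifier, String extension) {
            this.groupId = groupId;
            this.artifactId = artifactId;
            this.version = version;
            this.classifier = classifier;
            this.extension = extension;
        }

        @Override
        public String toString() {
            return groupId + ":" + artifactId + ":" + version
                + (classifier != null ? ":" + classifier : "")
                + (extension != null ? "@" + extension : "");
        }
    }

    static Coordinates parse(String uinfo) {
        // "|" is a regex metacharacter, so it has to be escaped.
        String[] parts = uinfo.split("\\|");
        if (parts.length < 4) {
            throw new IllegalArgumentException("Expected at least 4 parts: " + uinfo);
        }
        String classifier = "NA".equals(parts[3]) ? null : parts[3];
        String extension = parts.length > 4 ? parts[4] : null;
        return new Coordinates(parts[0], parts[1], parts[2], classifier, extension);
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "org.nutz|nutz|1.b.37|sources|jar",
            "org.nutz|nutz|1.b.37|NA");
        for (String line : sample) {
            System.out.println(parse(line));
        }
    }
}
```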

There may be better ways to achieve this...
