How do I read and write a Map from/to a Parquet file in Java or Scala?

I'm looking for a concise example of how to read and write a Map from/to a Parquet file in Java or Scala.

Here is the structure I'm after, using com.fasterxml.jackson.databind.ObjectMapper as the serializer in Java (i.e. I'm looking for the equivalent using Parquet):

    // Deserialize a Map from a JSON stream.
    public static Map<String, Object> read(InputStream inputStream) throws IOException {
        ObjectMapper objectMapper = new ObjectMapper();
        return objectMapper.readValue(inputStream, new TypeReference<Map<String, Object>>() { });
    }

    // Serialize a Map to a JSON stream.
    public static void write(OutputStream outputStream, Map<String, Object> map) throws IOException {
        ObjectMapper objectMapper = new ObjectMapper();
        objectMapper.writeValue(outputStream, map);
    }

I'm not very familiar with Parquet, but here is an example taken from here:

    Schema schema = new Schema.Parser().parse(
        Resources.getResource("map.avsc").openStream());

    // Only the temp file's path is needed; delete the file itself so that
    // the writer can create it.
    File tmp = File.createTempFile(getClass().getSimpleName(), ".tmp");
    tmp.deleteOnExit();
    tmp.delete();
    Path file = new Path(tmp.getPath());

    AvroParquetWriter<GenericRecord> writer =
        new AvroParquetWriter<GenericRecord>(file, schema);

    // Write a record with an empty map.
    ImmutableMap emptyMap = new ImmutableMap.Builder().build();
    GenericData.Record record = new GenericRecordBuilder(schema)
        .set("mymap", emptyMap).build();
    writer.write(record);
    writer.close();

    // Read the record back and check the map field.
    AvroParquetReader<GenericRecord> reader =
        new AvroParquetReader<GenericRecord>(file);
    GenericRecord nextRecord = reader.read();

    assertNotNull(nextRecord);
    assertEquals(emptyMap, nextRecord.get("mymap"));
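The snippet above loads its Avro schema from a resource named map.avsc, which is not shown here. A schema along the following lines would fit the code, assuming a record with a single map field called mymap; the record name and the value type are guesses for illustration, not necessarily what the original file contains.

```json
{
  "type": "record",
  "name": "myrecord",
  "fields": [
    {
      "name": "mymap",
      "type": { "type": "map", "values": "int" }
    }
  ]
}
```

Avro map keys are always strings, so only the value type is declared in the schema.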

In your case, replace the ImmutableMap (Google Collections) with a plain Map, as below:

    Schema schema = new Schema.Parser().parse(
        Resources.getResource("map.avsc").openStream());

    // Only the temp file's path is needed; delete the file itself so that
    // the writer can create it.
    File tmp = File.createTempFile(getClass().getSimpleName(), ".tmp");
    tmp.deleteOnExit();
    tmp.delete();
    Path file = new Path(tmp.getPath());

    AvroParquetWriter<GenericRecord> writer =
        new AvroParquetWriter<GenericRecord>(file, schema);

    // Write a record with a map (the name is kept from the example above,
    // but the map is not empty any more).
    Map<String, Object> emptyMap = new HashMap<>();
    emptyMap.put("SOMETHING", new SOMETHING());
    GenericData.Record record = new GenericRecordBuilder(schema)
        .set("mymap", emptyMap).build();
    writer.write(record);
    writer.close();

    // Read the record back and check the map field.
    AvroParquetReader<GenericRecord> reader =
        new AvroParquetReader<GenericRecord>(file);
    GenericRecord nextRecord = reader.read();

    assertNotNull(nextRecord);
    assertEquals(emptyMap, nextRecord.get("mymap"));

I haven't tested this code, but give it a try.
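One thing to watch for when trying this with a non-empty map: by default, as far as I know, Avro hands strings back as org.apache.avro.util.Utf8 rather than java.lang.String, so a map read from the file may not compare equal to a HashMap with String keys. Below is a minimal sketch of a helper that normalizes such a map, assuming the keys (and possibly the values) come back as some CharSequence. The class and method names (MapKeys, toStringKeys) are hypothetical, and the helper uses only the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MapKeys {
    private MapKeys() {}

    /**
     * Copies the given map, converting every key to java.lang.String and
     * every top-level CharSequence value to java.lang.String.
     * Other values are copied unchanged.
     */
    static Map<String, Object> toStringKeys(Map<? extends CharSequence, ?> source) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<? extends CharSequence, ?> entry : source.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof CharSequence) {
                value = value.toString();
            }
            result.put(entry.getKey().toString(), value);
        }
        return result;
    }
}
```

With this in place, the comparison could be made against the converted map instead of the raw value returned by nextRecord.get("mymap"), after casting that value to a Map. Nested maps or lists are not converted; that would need a recursive version.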

Apache Drill is your answer!

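Convert to Parquet: you can use the CTAS (CREATE TABLE AS) feature in Drill. By default, Drill creates a folder containing Parquet files after executing the query below. You can substitute any query, and Drill writes the output of your query to Parquet files.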

    create table file_parquet as select * from dfs.`/data/file.json`;

Convert from Parquet: we use the CTAS feature here as well, but we ask Drill to write the output in a different format.

    alter session set `store.format`='json';
    create table file_json as select * from dfs.`/data/file.parquet`;

For more information, see http://drill.apache.org/docs/create-table-as-ctas-command/