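How do I read and write a Map from/to a Parquet file in Java or Scala?
I'm looking for a concise example of how to read and write a Map from/to a Parquet file in Java or Scala.
Here is the expected structure, using com.fasterxml.jackson.databind.ObjectMapper
as the serializer in Java (i.e. I'm looking for the equivalent using Parquet):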
public static Map read(InputStream inputStream) throws IOException {
    ObjectMapper objectMapper = new ObjectMapper();
    return objectMapper.readValue(inputStream, new TypeReference<Map>() { });
}

public static void write(OutputStream outputStream, Map map) throws IOException {
    ObjectMapper objectMapper = new ObjectMapper();
    objectMapper.writeValue(outputStream, map);
}
I'm not very experienced with Parquet, but from here:
Schema schema = new Schema.Parser().parse(Resources.getResource("map.avsc").openStream());

File tmp = File.createTempFile(getClass().getSimpleName(), ".tmp");
tmp.deleteOnExit();
tmp.delete();
Path file = new Path(tmp.getPath());

AvroParquetWriter<GenericRecord> writer = new AvroParquetWriter<GenericRecord>(file, schema);

// Write a record with an empty map.
ImmutableMap emptyMap = new ImmutableMap.Builder().build();
GenericData.Record record = new GenericRecordBuilder(schema)
    .set("mymap", emptyMap).build();
writer.write(record);
writer.close();

// Read the record back and check that the map survived the round trip.
AvroParquetReader<GenericRecord> reader = new AvroParquetReader<GenericRecord>(file);
GenericRecord nextRecord = reader.read();

assertNotNull(nextRecord);
assertEquals(emptyMap, nextRecord.get("mymap"));
In your case, replace the ImmutableMap (Google Collections) with a regular Map,
like this:
Schema schema = new Schema.Parser().parse( Resources.getResource( "map.avsc" ).openStream() );

File tmp = File.createTempFile( getClass().getSimpleName(), ".tmp" );
tmp.deleteOnExit();
tmp.delete();
Path file = new Path( tmp.getPath() );

AvroParquetWriter<GenericRecord> writer = new AvroParquetWriter<GenericRecord>( file, schema );

// Write a record with an empty map.
Map emptyMap = new HashMap();

// not empty any more
emptyMap.put( "SOMETHING", new SOMETHING() );
GenericData.Record record = new GenericRecordBuilder( schema ).set( "mymap", emptyMap ).build();
writer.write( record );
writer.close();

// Read the record back and check that the map survived the round trip.
AvroParquetReader<GenericRecord> reader = new AvroParquetReader<GenericRecord>( file );
GenericRecord nextRecord = reader.read();

assertNotNull( nextRecord );
assertEquals( emptyMap, nextRecord.get( "mymap" ) );
I haven't tested the code, but give it a try.
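The snippets above load map.avsc, which is not shown in the thread. Here is a minimal sketch of what that schema might look like, assuming a record with a single `mymap` field. The record name `myrecord`, the class name `MapSchema`, and the `int` value type are assumptions for illustration; the value type has to match whatever you actually put in the map. Keeping the schema in a Java string means it could be passed to `new Schema.Parser().parse(MapSchema.JSON)` instead of being read from a resource.

```java
public class MapSchema {
    // Assumed Avro schema: one record with a single map field named "mymap".
    // Avro map keys are always strings; "values" declares the value type
    // ("int" here is only a placeholder for your own value type).
    public static final String JSON =
        "{\n"
      + "  \"type\": \"record\",\n"
      + "  \"name\": \"myrecord\",\n"
      + "  \"fields\": [\n"
      + "    {\"name\": \"mymap\", \"type\": {\"type\": \"map\", \"values\": \"int\"}}\n"
      + "  ]\n"
      + "}";

    public static void main(String[] args) {
        // Print the schema so it can be saved as map.avsc if preferred.
        System.out.println(JSON);
    }
}
```

Since Avro maps only allow string keys, a Map with non-string keys would need to be restructured (for example as an array of key/value records) before it can be written this way.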
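Apache Drill is your answer!
Converting to Parquet: you can use the CTAS (create table as) feature in Drill. By default, Drill creates a folder containing Parquet files after executing the query below. You can substitute any query, and Drill will write that query's output to Parquet files.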
create table file_parquet as select * from dfs.`/data/file.json`;
Converting from Parquet: we use the CTAS feature here as well, but we ask Drill to write the output in a different format.
alter session set `store.format`='json';
create table file_json as select * from dfs.`/data/file.parquet`;
For more information, see http://drill.apache.org/docs/create-table-as-ctas-command/
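If you would rather run these statements from Java than from the Drill shell, something along these lines could work over JDBC. This is only a sketch: it assumes the Drill JDBC driver is on the classpath and that an embedded Drill is reachable at `jdbc:drill:zk=local`. The class `DrillCtas` and its helper methods are made-up names for illustration, not part of Drill.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class DrillCtas {
    // Builds a CTAS statement that copies a file queried through the dfs plugin into a new table.
    // No escaping is done, so only pass trusted table names and paths.
    public static String ctas(String table, String sourcePath) {
        return "create table " + table + " as select * from dfs.`" + sourcePath + "`";
    }

    // Builds the statement that switches the output format for later CTAS statements in this session.
    public static String setStoreFormat(String format) {
        return "alter session set `store.format`='" + format + "'";
    }

    public static void main(String[] args) throws SQLException {
        // Assumed connection URL for an embedded Drill; adjust it for your setup.
        try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
             Statement st = conn.createStatement()) {
            // JSON -> Parquet (Parquet is the default output format).
            st.execute(ctas("file_parquet", "/data/file.json"));
            // Parquet -> JSON, after switching the session's output format.
            st.execute(setStoreFormat("json"));
            st.execute(ctas("file_json", "/data/file.parquet"));
        }
    }
}
```

The two helpers produce exactly the statements shown above, so they can be checked without a running Drill instance.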