Reading a text file from the system into HBase with MapReduce

I need to load data from a text file using MapReduce. I have been trying for many days, but I haven't found a solution that works for me. Is there any method or class that reads a text/CSV file from the system and stores the data into an HBase table? This is really quite urgent for me. Can anyone help me with the MapReduce framework?

To read from a text file, the file first has to be in HDFS, and you need to tell the job which input and output formats to use.
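If the file only exists on the local filesystem, copy it into HDFS first, either with the hadoop fs -put shell command or programmatically. A minimal sketch using the Hadoop FileSystem API (the paths are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CopyToHdfs {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            // copy the local text file into HDFS so the job can read it
            fs.copyFromLocalFile(new Path("/local/path/input.txt"),
                                 new Path("/user/hadoop/input/input.txt"));
            fs.close();
        }
    }

With the file in HDFS, the job is configured like this: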

    Job job = new Job(conf, "example");
    FileInputFormat.addInputPath(job, new Path("PATH to text file"));
    job.setInputFormatClass(TextInputFormat.class);
    job.setMapperClass(YourMapper.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    TableMapReduceUtil.initTableReducerJob("hbase_table_name", YourReducer.class, job);
    job.waitForCompletion(true);
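Note that the job writes into an existing table; HBase will not create it for you. A sketch of pre-creating the table with the HBaseAdmin API from the same (pre-1.0) era as the code above, using the table and column family names from these snippets:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateTable {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            if (!admin.tableExists("hbase_table_name")) {
                // one column family, matching what the reducer writes to
                HTableDescriptor desc = new HTableDescriptor("hbase_table_name");
                desc.addFamily(new HColumnDescriptor("colName"));
                admin.createTable(desc);
            }
            admin.close();
        }
    }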

YourReducer should extend org.apache.hadoop.hbase.mapreduce.TableReducer.

Sample reducer code:

    import java.io.IOException;
    import java.util.Date;

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.mapreduce.TableReducer;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;

    public class YourReducer extends TableReducer<Text, IntWritable, Text> {

        private byte[] rawUpdateColumnFamily = Bytes.toBytes("colName");

        /**
         * Called once at the beginning of the task.
         */
        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            // anything that needs to happen once at the start of the reducer
        }

        @Override
        public void reduce(Text keyin, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // aggregate counts
            int valuesCount = 0;
            for (IntWritable val : values) {
                valuesCount += 1;
                // put the data in the table: the word is the row key, the value
                // lands in family/qualifier "colName" with an explicit timestamp
                Put put = new Put(keyin.toString().getBytes());
                long explicitTimeInMs = new Date().getTime();
                // Put.add is the pre-1.0 API; newer HBase versions use addColumn
                put.add(rawUpdateColumnFamily, Bytes.toBytes("colName"),
                        explicitTimeInMs, val.toString().getBytes());
                context.write(keyin, put);
            }
        }
    }
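Each context.write hands the Put to the output format, which commits it to hbase_table_name. After a run you can sanity-check a row with the client API of the same era; a small sketch ("someWord" stands in for an actual word from your input):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CheckRow {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "hbase_table_name");
            // each word from the input becomes a row key
            Get get = new Get(Bytes.toBytes("someWord"));
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("colName"), Bytes.toBytes("colName"));
            System.out.println(Bytes.toString(value));
            table.close();
        }
    }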

Sample mapper class:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class YourMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // emit every whitespace-separated token of the line with a count of 1
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }
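Putting the pieces together, a driver class could look like the sketch below. The class name LoadTextToHBase and the args-based input path are my own additions; HBaseConfiguration.create() is used so the job picks up hbase-site.xml in addition to the Hadoop configuration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class LoadTextToHBase {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = new Job(conf, "text-to-hbase");
            job.setJarByClass(LoadTextToHBase.class);

            // plain text input from HDFS
            FileInputFormat.addInputPath(job, new Path(args[0]));
            job.setInputFormatClass(TextInputFormat.class);

            job.setMapperClass(YourMapper.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

            // wires up TableOutputFormat with hbase_table_name as the target
            TableMapReduceUtil.initTableReducerJob("hbase_table_name", YourReducer.class, job);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }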