Tag: hbase

How to avoid InterruptedIOException in HBase put operations: When I put data into HBase with the HTable.put method, I occasionally get the exception below. But when I run a get for that particular rowkey, the data has actually been written. I also searched the HMaster and HRegionServer logs to identify the problem, but found nothing. Please help me fine-tune the HBase configuration to avoid the InterruptedIOException. Hadoop Distribution: Apache Version: HBase 1.2.6 Cluster size: 12 nodes java.io.InterruptedIOException: #17209, interrupted. currentNumberOfTask=1 at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1764) at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1734) at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1810) at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:240) at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:190) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1434) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1018) Please help me resolve this. Someone else ran into the same exception, but that thread does not explain which configuration needs to be checked to avoid it: https://groups.google.com/forum/#!topic/nosql-databases/UxfrmWl_ZnM
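
The stack trace above ends in a wait (waitForMaximumCurrentTasks), and an InterruptedIOException from a wait typically means the calling thread was interrupted while blocked. The following is a minimal sketch of that mechanism using only the JDK. It is not HBase code: the class and method names are hypothetical, and Thread.sleep stands in for the client waiting on in-flight writes.

```java
import java.io.InterruptedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a client that blocks until pending writes finish.
class InterruptedWaitSketch {

    // Blocks for up to maxWaitMillis; an interrupt during the wait is
    // reported to the caller as an InterruptedIOException.
    static void waitForPendingWrites(long maxWaitMillis) throws InterruptedIOException {
        try {
            Thread.sleep(maxWaitMillis); // assumption: models waiting on in-flight tasks
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag set for callers
            throw new InterruptedIOException("interrupted while waiting for pending writes");
        }
    }

    // Starts a waiting thread, interrupts it, and reports whether the
    // waiter observed an InterruptedIOException.
    static boolean interruptedWaiterSeesException() {
        AtomicBoolean seen = new AtomicBoolean(false);
        Thread writer = new Thread(() -> {
            try {
                waitForPendingWrites(10_000);
            } catch (InterruptedIOException e) {
                seen.set(true);
            }
        });
        writer.start();
        writer.interrupt(); // e.g. an executor shutdown or a timeout cancelling the task
        try {
            writer.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("waiter saw InterruptedIOException: " + interruptedWaiterSeesException());
    }
}
```

If this is what happens in the asker's application, the thing to look for is whatever interrupts the writing thread (thread-pool shutdown, task cancellation, a framework timeout), which would also be consistent with the data already being present when read back.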

HBase read performance varies abnormally: I installed HBase 0.94.0. I need to improve my read performance for scans. I inserted 100000 records at random. When I set setCache(100), scanning the 100000 records takes 16 seconds. When I set setCache(50), it takes 90 seconds. When I set setCache(10), it takes 16 seconds. public class Test { public static void main(String[] args) { long start, middle, end; HTableDescriptor descriptor = new HTableDescriptor("Student7"); descriptor.addFamily(new HColumnDescriptor("No")); descriptor.addFamily(new HColumnDescriptor("Subject")); try { HBaseConfiguration config = new HBaseConfiguration(); HBaseAdmin admin = new HBaseAdmin(config); admin.createTable(descriptor); HTable table = new HTable(config, "Student7"); System.out.println("Table created !"); start […]
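
To put the three timings in context, here is a rough back-of-the-envelope sketch of how many scanner fetches each setting implies. It assumes that the setCache call in the question corresponds to scanner caching (rows returned per fetch) and that every fetch returns a full batch; the class and method names are hypothetical and this is not HBase client code.

```java
// Rough estimate of scanner round trips for a full scan.
class ScanRpcEstimate {

    // With `caching` rows returned per fetch, scanning `rows` rows needs
    // about ceil(rows / caching) fetches.
    static long rpcCount(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 100_000; // record count from the question
        for (int caching : new int[] {10, 50, 100}) {
            System.out.println("caching=" + caching + " -> about "
                    + rpcCount(rows, caching) + " fetches");
        }
    }
}
```

On round trips alone, a setting of 50 should land between 10 and 100, so the 90-second run is probably explained by something other than the caching value, for example cold caches or other activity during that run.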

Cannot query an HBase table correctly using RowFilter: I have an HBase table (accessed from Java) and I want to query it by a list of keys. I did the following, but it does not work. mFilterFeatureIt = mFeatureSet.iterator(); FilterList filterList=new FilterList(FilterList.Operator.MUST_PASS_ONE); while (mFilterFeatureIt.hasNext()) { long myfeatureId = mFilterFeatureIt.next(); System.out.println("FeatureId:"+myfeatureId+" , "); RowFilter filter = new RowFilter(CompareOp.EQUAL,new BinaryComparator(Bytes.toBytes(myfeatureId)) ); filterList.addFilter(filter); } outputMap = HbaseUtils.getHbaseData("mytable", filterList); System.out.println("Size of outputMap map:"+ outputMap.size()); public static Map<String, Map> getHbaseData(String table, FilterList filter) { Map<String, Map> data = new HashMap<String, Map>(); HTable htable = […]
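
One thing worth checking with a filter like this is whether the bytes being compared match the bytes of the stored row key. The sketch below uses only the JDK to show what an 8-byte big-endian encoding of a long looks like (the layout Bytes.toBytes(long) produces) next to the same id written as decimal text. Whether the asker's table uses string keys is an assumption, not something the excerpt states; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Compares two possible byte encodings of the same numeric row key.
class RowKeyEncoding {

    // 8 bytes, big-endian: the binary form of a long.
    static byte[] longKey(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    // The same id written as decimal text, as a key stored from a String would be.
    static byte[] stringKey(long id) {
        return Long.toString(id).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        long id = 42L; // hypothetical feature id
        System.out.println("binary long key: " + Arrays.toString(longKey(id)));
        System.out.println("string key:      " + Arrays.toString(stringKey(id)));
        System.out.println("equal: " + Arrays.equals(longKey(id), stringKey(id)));
    }
}
```

A BinaryComparator with CompareOp.EQUAL matches only when the byte arrays are identical, so a filter built from the binary form will not match rows whose keys were written as text, and vice versa.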

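Clojure and HBase: iterating lazily over a scan: Suppose I want to print the output of an HBase table scan in Clojure. (defmulti scan (fn [table & args] (map class args))) (defmethod scan [java.lang.String java.lang.String] [table start-key end-key] (let [scan (Scan. (Bytes/toBytes start-key) (Bytes/toBytes end-key))] (let [scanner (.getScanner table scan)] (doseq [result scanner] (prn (Bytes/toString (.getRow result)) (get-to-map result)))))) get-to-map converts a result into a map. It can be run like this: (hbase.table/scan table "key000001" "key999999") But what if I want to let the user do something with the scan results? I could let them pass a function as a callback to be applied to each result. But my question is: what do I return if I want the user to be able to iterate lazily over each result (Bytes/toString (.getRow result)) (get-to-map result) without holding on to the previous results, as happens with a naive use of lazy-seq?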

Writing from Spark to HBase: The flow of my Spark program is as follows: driver -> create the HBase connection -> broadcast the HBase handle. Then, in the executors, we fetch this handle and try to write to HBase. In the driver program I create the HBase conf object and the Connection object and then broadcast the connection through the JavaSparkContext, as follows: SparkConf sparkConf = JobConfigHelper.getSparkConfig(); Configuration conf = new Configuration(); UserGroupInformation.setConfiguration(conf); jsc = new JavaStreamingContext(sparkConf, Durations.milliseconds(Long.parseLong(batchDuration))); Configuration hconf=HBaseConfiguration.create(); hconf.addResource(new Path("/etc/hbase/conf/core-site.xml")); hconf.addResource(new Path("/etc/hbase/conf/hbase-site.xml")); UserGroupInformation.setConfiguration(hconf); JavaSparkContext js = jsc.sparkContext(); Connection connection = ConnectionFactory.createConnection(hconf); connectionbroadcast=js.broadcast(connection); Inside the executor's call() method: Table table = connectionbroadcast.getValue().getTable(TableName.valueOf("gfttsdgn:FRESHHBaseRushi")) ; Put p = new Put(Bytes.toBytes("row1")); p.add(Bytes.toBytes("c1"), Bytes.toBytes("output"), Bytes.toBytes("rohan")); […]

FileNotFoundException on my cdh5.2 cluster when running an HBase MR job: My cdh5.2 cluster has a problem running HBase MR jobs. For example, I added the HBase classpath to the Hadoop classpath: vi /etc/hadoop/conf/hadoop-env.sh and added the line: export HADOOP_CLASSPATH="/usr/lib/hbase/bin/hbase classpath:$HADOOP_CLASSPATH" When I run: hadoop jar /usr/lib/hbase/hbase-server-0.98.6-cdh5.2.1.jar rowcounter "mytable" I get the following exception: 14/12/09 03:44:02 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: hdfs://clusterName/usr/lib/hbase/lib/hbase-client-0.98.6-cdh5.2.1.jar Exception in thread "main" java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hbase.mapreduce.Driver.main(Driver.java:54) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at […]

Migrating Java code from HBase 0.92 to 0.98.0-hadoop2: I have some code that was written against HBase 0.92: /** * Writes the given scan into a Base64 encoded string. * * @param scan The scan to write out. * @return The scan saved in a Base64 encoded string. * @throws IOException When writing the scan fails. */ public static String convertScanToString(Scan scan) throws IOException { ByteArrayOutputStream out = new ByteArrayOutputStream(); DataOutputStream […]

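HBase: atomic "check that the row does not exist, then create it" operation: I suspect this is a common case, but I am probably using the wrong keywords when googling. I just need to create a new table record with a completely random key. Suppose I obtain a key with good randomness (almost random). However, I cannot be 100% sure that no such row already exists. So what I need to do atomically is: given a row key, check that there is no row yet. If the row exists, reject the operation. If it does not exist, create the row. The most useful information I found on this topic is an article about HBase row locks. I see HBase row locks as a workable solution, but I would like to do this better, without explicit row locking. ICV does not look appropriate because I do want the keys to be random. CAS would be great if it could work on a "row does not exist" condition, but it looks like it cannot. Explicit row locks have drawbacks such as problems with region splits. Can someone contribute useful advice? The preferred API is Java-based, but this is really more about the concept than the implementation.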
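
The semantics being asked for can be illustrated with a small in-memory sketch. This is not HBase code: a ConcurrentHashMap stands in for the table, and the class and method names are hypothetical. It only shows the contract "create the row if and only if it is absent, as one atomic step".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory model of an atomic "create row only if it does not exist".
class CreateIfAbsentSketch {

    // Stand-in for the table: row key -> value.
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    // Returns true if the row was created, false if a row with this key
    // already existed (in which case the stored value is left untouched).
    boolean createIfAbsent(String rowKey, String value) {
        return rows.putIfAbsent(rowKey, value) == null;
    }

    String get(String rowKey) {
        return rows.get(rowKey);
    }

    public static void main(String[] args) {
        CreateIfAbsentSketch table = new CreateIfAbsentSketch();
        System.out.println(table.createIfAbsent("k1", "first"));  // created
        System.out.println(table.createIfAbsent("k1", "second")); // rejected
        System.out.println(table.get("k1"));                      // still the first value
    }
}
```

On the HBase side, the checkAndPut javadoc in later client releases describes a null expected value as a check for the column's non-existence, which would give the same contract for a known family and qualifier; whether that applies to the version the asker is using is not confirmed here.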

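Inserting rows into HBase from a Storm bolt: I would like to be able to write new entries into HBase from a distributed (non-local) Storm topology. There are a few GitHub projects that provide either HBase mappers or pre-made Storm bolts for writing tuples into HBase. These projects provide instructions for running their samples on a LocalCluster. The problem I am running into, both with these projects and with calling the HBase API directly from a bolt, is that they all require the hbase-site.xml file to be on the classpath. With the direct API approach, and probably with the GitHub projects as well, when you call HBaseConfiguration.create(); it tries to find the information it needs from entries on the classpath. How can I modify the classpath of the Storm bolts to include the HBase configuration file? Update: Using danehammer's answer, this is how I got it working. Copy the following files into the ~/.storm directory: hbase-common-0.98.0.2.1.2.0-402-hadoop2.jar, hbase-site.xml, storm.yaml. Note: if you do not copy storm.yaml into that directory, the storm jar command will not put that directory on the classpath (see the storm.py Python script to check the logic yourself; it would help if this were documented). Next, in the main method of the topology class, get the HBase configuration and serialize it: final Configuration hbaseConfig = HBaseConfiguration.create(); final DataOutputBuffer databufHbaseConfig = new DataOutputBuffer(); hbaseConfig.write(databufHbaseConfig); final byte[] baHbaseConfigSerialized = databufHbaseConfig.getData(); Pass the byte array to the spout class through its constructor. The spout class saves this byte array in a field (do not deserialize it in the constructor; I found that if the spout has a Configuration field, you get a "cannot serialize" exception when running the topology). In the spout's open method, deserialize the configuration and access the HBase table: Configuration hBaseConfiguration = new Configuration(); ByteArrayInputStream bas = new ByteArrayInputStream(baHbaseConfigSerialized); hBaseConfiguration.readFields(new DataInputStream(bas)); HTable […]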
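
The pattern described in the update (carry the configuration as a byte array in a serializable component, and rebuild the live object only in open) can be sketched with JDK classes alone. This is an illustration under stated assumptions, not Storm or HBase code: java.util.Properties stands in for the Hadoop Configuration, Java serialization stands in for shipping the component to a worker, and ConfigCarrier, open and shipToWorker are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Properties;

// Serializable component that carries its configuration as raw bytes.
class ConfigCarrier implements Serializable {
    private static final long serialVersionUID = 1L;

    private final byte[] serializedConfig; // travels with the component
    private transient Properties config;   // rebuilt on the worker, never serialized

    ConfigCarrier(Properties source) {
        this.serializedConfig = toBytes(source);
    }

    private static byte[] toBytes(Properties source) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            source.store(out, null);
            return out.toByteArray(); // exactly the bytes written, no padding
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called once on the worker, like a spout's or bolt's open/prepare hook.
    void open() {
        Properties restored = new Properties();
        try {
            restored.load(new ByteArrayInputStream(serializedConfig));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        config = restored;
    }

    String get(String key) {
        if (config == null) {
            throw new IllegalStateException("open() has not been called");
        }
        return config.getProperty(key);
    }

    // Simulates shipping the component to a worker via Java serialization.
    static ConfigCarrier shipToWorker(ConfigCarrier carrier) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(carrier);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ConfigCarrier) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A usage sketch: build a Properties object on the submitting side, wrap it in a ConfigCarrier, and call open() on the copy returned by shipToWorker before reading values. One detail that differs from the Hadoop snippet above: ByteArrayOutputStream.toByteArray() returns exactly the bytes written, whereas Hadoop's DataOutputBuffer.getData() returns the backing array, of which only the first getLength() bytes are valid.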

Loading 1 GB of data into HBase takes 1 hour: I want to load a 1 GB (10 million records) CSV file into HBase. I wrote a Map-Reduce program for it. My code works fine but takes 1 hour to complete. The last reducer takes more than half an hour. Can anyone help me? My code is as follows: Driver.java package com.cloudera.examples.hbase.bulkimport; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.Path; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.KeyValue; import org.apache.hadoop.hbase.client.HTable; import org.apache.hadoop.hbase.io.ImmutableBytesWritable; import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat; import org.apache.hadoop.mapreduce.Job; import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; /** * HBase bulk import example * Data preparation MapReduce job driver * * args[0]: HDFS input path * args[1]: HDFS output path * args[2]: HBase table name * */ public class Driver { public static void main(String[] args) throws Exception { Configuration […]