hbase-user mailing list archives

From Sean Busbey <bus...@apache.org>
Subject Re: NoSuchMethodError: org.apache.hadoop.hbase.CellComparator.getInstance() while trying to bulk load in hbase using spark
Date Tue, 08 Oct 2019 21:26:57 GMT
Please use a released version of the hbase-connectors project. The master
branch isn't meant for downstream use (or feel free to come over to
dev@hbase and help us work on the not-yet-released version).

Please be aware that HBase 2.0 is EOM (end of maintenance) and you should upgrade.

FYI, I've only seen hbase-connectors tested against HBase 2.1+, so this
may or may not work on HBase 2.0.

That error message looks like you have an older HBase jar on the classpath.
That method is present in both HBase 2.0 and 2.1.
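
For example, here's a minimal Scala sketch (assuming hbase-common is on the
classpath where you run it, e.g. in a spark-shell) that reports which jar
the class actually resolves from:

  val src = classOf[org.apache.hadoop.hbase.CellComparator]
    .getProtectionDomain.getCodeSource
  // An HBase 1.x-era jar here would explain the missing getInstance() method.
  println(Option(src).map(_.getLocation.toString).getOrElse("bootstrap classloader"))

Running the same check inside a task would show the executor side, which is
where the failure is happening.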

Can you get the full classpath out of the executor logs and see if it has
more hbase references?
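
Something like this minimal sketch (assuming a SparkContext named sc) will
dump the distinct classpaths seen by your executors; with enough partitions
it should reach all of them:

  sc.parallelize(1 to 1000, 100)
    .map(_ => System.getProperty("java.class.path"))
    .distinct()
    .collect()
    .foreach(println)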

On Tue, Oct 8, 2019, 10:03 Karthik Bikkumala <bikkumala.karthik@gmail.com>
wrote:

> Hi,
>
> I am trying to bulk load data from HDFS into HBase. I used the following
> example:
>
> https://github.com/apache/hbase-connectors/blob/master/spark/hbase-spark/src/main/java/org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseBulkLoadExample.java
>
>
> I built the module with the following command:
>
> mvn -Dspark.version=2.4.3 -Dscala.version=2.11.7
> -Dscala.binary.version=2.11 clean install
>
> When I tried to run the example using the spark-submit command, I got
> the following error:
>
> Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.CellComparator.getInstance()Lorg/apache/hadoop/hbase/CellComparator;
>   at org.apache.hadoop.hbase.regionserver.StoreFileWriter$Builder.<init>(StoreFileWriter.java:348)
>   at org.apache.hadoop.hbase.spark.HBaseContext.org$apache$hadoop$hbase$spark$HBaseContext$$getNewHFileWriter(HBaseContext.scala:928)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$2.apply(HBaseContext.scala:1023)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$2.apply(HBaseContext.scala:972)
>   at scala.collection.mutable.HashMap.getOrElseUpdate(HashMap.scala:79)
>   at org.apache.hadoop.hbase.spark.HBaseContext.org$apache$hadoop$hbase$spark$HBaseContext$$writeValueToHFile(HBaseContext.scala:972)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$bulkLoad$3$$anonfun$apply$7.apply(HBaseContext.scala:677)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$bulkLoad$3$$anonfun$apply$7.apply(HBaseContext.scala:675)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$bulkLoad$3.apply(HBaseContext.scala:675)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$bulkLoad$3.apply(HBaseContext.scala:664)
>   at org.apache.hadoop.hbase.spark.HBaseContext.org$apache$hadoop$hbase$spark$HBaseContext$$hbaseForeachPartition(HBaseContext.scala:490)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$foreachPartition$1.apply(HBaseContext.scala:106)
>   at org.apache.hadoop.hbase.spark.HBaseContext$$anonfun$foreachPartition$1.apply(HBaseContext.scala:106)
>   at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
>   at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>   at org.apache.spark.scheduler.Task.run(Task.scala:121)
>   at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
>
>
>
> Please find the code here (pom.xml, mvn dependency tree, source file):
> https://gist.github.com/bikkumala/d2e349c7bfaffc673e8a641ff3ec9d33
>
> I tried with the following versions:
> Spark: 2.4.x
> HBase: 2.0.x
> Hadoop: 2.7.x
>
>
> I created a JIRA for this issue:
> https://jira.apache.org/jira/browse/HBASE-23135
>
> I've also tried excluding hbase-common from the dependencies and explicitly
> adding it, as explained by Zheng Hu
> <https://jira.apache.org/jira/secure/ViewProfile.jspa?name=openinx> in the
> above JIRA, but I am still facing the same issue.
>
> Any help is appreciated.
>
>
> Thank you,
>
> Bikkumala.
>
