gora-user mailing list archives

From SJC Multimedia <sjcmultime...@gmail.com>
Subject Nutch + Gora + Hbase client ( BigTable )
Date Mon, 30 Oct 2017 18:08:32 GMT
Hi

I am trying out Google BigTable as a Nutch backend. There is no official
documentation saying that it is supported, but I don't see any reason why it
would not be possible, so I am giving it a shot.

I have upgraded Gora to version 0.8, with Nutch 2.3.1 and JDK 1.8.

Currently, using *bigtable-hbase-1.x-hadoop-1.0.0-pre3.jar*, the call to
BigTable fails during flushCommits as part of the inject operation. I do see
the table getting created on the BigTable side, but the table is empty.

The exception by itself is not enough to give us an answer.  The
UnsupportedOperationException is a bit strange.  I'm not sure where that's
coming from.  Here's a guide
<https://cloud.google.com/bigtable/docs/hbase-batch-exceptions> on getting
more information from a RetriesExhaustedWithDetailsException, since neither
Gora nor BigtableBufferedMutator are under our control.

This seems like a client-side thing, so this is likely some strange
interaction between BigTable library and Gora.
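
One way to follow the advice above is to print every underlying cause, along with the frame where it was created, instead of only the summary line. Below is a minimal sketch, assuming the list of causes can be obtained from the exception (in the HBase 1.x client I believe this is RetriesExhaustedWithDetailsException.getCauses(), which returns a List<Throwable>). CauseReport and describe are hypothetical names used only for illustration; they are not part of Gora, HBase, or the BigTable client.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of Gora, HBase, or the Bigtable client. */
final class CauseReport {

    private CauseReport() {}

    /** Builds a text report with one section per failed action. */
    static String describe(List<Throwable> causes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < causes.size(); i++) {
            sb.append("Action ").append(i).append(":\n");
            // Walk the cause chain so that wrapped exceptions are not hidden.
            for (Throwable t = causes.get(i); t != null; t = t.getCause()) {
                sb.append("  ").append(t.getClass().getName())
                  .append(": ").append(t.getMessage()).append('\n');
                StackTraceElement[] frames = t.getStackTrace();
                if (frames.length > 0) {
                    // The top frame shows where this exception was constructed.
                    sb.append("    at ").append(frames[0]).append('\n');
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the causes carried by a failed batch (assumed shape).
        List<Throwable> sample = Arrays.asList(
            new RuntimeException("wrapper",
                new UnsupportedOperationException("not supported")));
        System.out.print(describe(sample));
    }
}
```

Wherever the RetriesExhaustedWithDetailsException can be caught, passing its causes to CauseReport.describe(...) should show which class actually throws the UnsupportedOperationException, which is the piece missing from the log below.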

*Any suggestion on how exactly to figure out what is the issue here?*


Here is grpc session info:

2017-10-27 17:37:51,462 INFO  grpc.BigtableSession - Bigtable options:
BigtableOptions{dataHost=bigtable.googleapis.com, tableAdminHost=
bigtableadmin.googleapis.com, instanceAdminHost=bigtableadmin.googleapis.com,
projectId=xxxxxx-dev, instanceId=big-table-nutch-test,
userAgent=hbase-1.2.0-cdh5.13.0, credentialType=DefaultCredentials,
port=443, dataChannelCount=20, retryOptions=RetryOptions{retriesEnabled=true,
allowRetriesWithoutTimestamp=false, statusToRetryOn=[INTERNAL,
DEADLINE_EXCEEDED, ABORTED, UNAUTHENTICATED, UNAVAILABLE],
initialBackoffMillis=5, maxElapsedBackoffMillis=60000,
backoffMultiplier=2.0, streamingBufferSize=60,
readPartialRowTimeoutMillis=60000, maxScanTimeoutRetries=3},
bulkOptions=BulkOptions{asyncMutatorCount=2, useBulkApi=true,
bulkMaxKeyCount=25, bulkMaxRequestSize=1048576, autoflushMs=0,
maxInflightRpcs=1000, maxMemory=93218406, enableBulkMutationThrottling=false,
bulkMutationRpcTargetMs=100},
callOptionsConfig=CallOptionsConfig{useTimeout=false,
shortRpcTimeoutMs=60000, longRpcTimeoutMs=600000},
usePlaintextNegotiation=false}.

Getting following error:

2017-10-27 17:37:51,660 ERROR store.HBaseStore - Failed 1 action: UnsupportedOperationException: 1 time, servers with issues: bigtable.googleapis.com,
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: UnsupportedOperationException: 1 time, servers with issues: bigtable.googleapis.com,
at com.google.cloud.bigtable.hbase.BigtableBufferedMutator.handleExceptions(BigtableBufferedMutator.java:271)
at com.google.cloud.bigtable.hbase.BigtableBufferedMutator.mutate(BigtableBufferedMutator.java:198)
at org.apache.gora.hbase.store.HBaseTableConnection.flushCommits(HBaseTableConnection.java:115)
at org.apache.gora.hbase.store.HBaseTableConnection.close(HBaseTableConnection.java:127)
at org.apache.gora.hbase.store.HBaseStore.close(HBaseStore.java:819)
at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:56)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Thanks
Akshar
