hbase-user mailing list archives

From Artyom Shvedchikov <sho...@gmail.com>
Subject Re: HBase 0.20.1 on Ubuntu 9.04: master fails to start
Date Thu, 29 Oct 2009 12:50:08 GMT
Dear Tatsuya,

> 1. Delete the hadoop data directory
> 2. bin/hadoop namenode -format
> 3. bin/start-all.sh
>    -> the namenode will start immediately and go into service, but the
> datanode will make a long (almost seven-minute) pause in the middle of
> startup.
>
> 4. Before the datanode becomes ready, do an HDFS write operation
> (e.g. "bin/hadoop fs -put conf input"), and then the write operations
> will fail with the following error:
>

Today I tried restarting Hadoop and HBase while skipping steps #1 and #2.
First I stopped HBase, then Hadoop, then started Hadoop, waited for 10 minutes
and started HBase - it worked. No data was lost, and it was available for
reads and so on. Then I scanned the table with 6,000,000 rows several times,
and HBase hung again with the same exceptions as in my previous post (see
post at Thu, 29 Oct, 10:06).
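
For reference, the restart sequence was roughly the following (a sketch of my
shell session, not a verbatim paste; I assume the standard script locations
from the Getting Started guides):

  # Stop HBase before Hadoop, then bring Hadoop back up and give the
  # datanode time to finish its startup pause before starting HBase.
  $HBASE_HOME/bin/stop-hbase.sh
  $HADOOP_HOME/bin/stop-all.sh
  $HADOOP_HOME/bin/start-all.sh
  sleep 600   # wait ~10 minutes for the datanode
  $HBASE_HOME/bin/start-hbase.sh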

hbase(main):006:0> list
> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 127.0.0.1:57613 for region .META.,,1, row '', but failed after 5 attempts.
> Exceptions:
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
>
>     from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in `getRegionServerWithRetries'
>     from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
>     from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
>     from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in `listTables'
>     from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
>     from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>     from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>     from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>     from java/lang/reflect/Method.java:597:in `invoke'
>     from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>     from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>     from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>     from org/jruby/runtime/callsite/CachingCallSite.java:70:in `call'
>     from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
>     from org/jruby/ast/ForNode.java:104:in `interpret'
>     from org/jruby/ast/NewlineNode.java:104:in `interpret'
> ... 110 levels...
>     from hadoop/hbase/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
>     from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
>     from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>     from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
>     from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>     from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>     from hadoop/hbase/bin/$_dot_dot_/bin/hirb.rb:497:in `__file__'
>     from hadoop/hbase/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
>     from org/jruby/Ruby.java:577:in `runScript'
>     from org/jruby/Ruby.java:480:in `runNormally'
>     from org/jruby/Ruby.java:354:in `runFromMain'
>     from org/jruby/Main.java:229:in `run'
>     from org/jruby/Main.java:110:in `run'
>     from org/jruby/Main.java:94:in `main'
>     from /hadoop/hbase/bin/../bin/hirb.rb:338:in `list'
>     from (hbase):7
> hbase(main):007:0> status
> 0 servers, 0 dead, NaN average load
> hbase(main):008:0> exit
>

Full HBase and Hadoop logs can be found in my post from Thu, 29 Oct, 07:52.

The main issue for now is that HBase hangs every time after the second or
third scan of the table. By the way, this time it was enough to restart only
HBase, and it became available again for scan/get/put operations.

Table structure:

hbase(main):003:0> describe 'channel_products'
> DESCRIPTION                                                       ENABLED
>  {NAME => 'channel_products', FAMILIES => [                       true
>   {NAME => 'active', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'channel_category_id', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'channel_id', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'contract_id', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'created_at', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'shop_category_id', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'shop_id', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'shop_product_id', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>   {NAME => 'updated_at', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
>
> 1 row(s) in 0.0630 seconds
>

The table contains ~6,000,000 rows; each value is a String.
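
Just to make the column families explicit, the schema above corresponds to
roughly the following creation code in the 0.20 Java client (a minimal sketch,
not necessarily how the table was actually created; all family options are
left at their defaults, which match the describe output above):

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;

  public class CreateChannelProducts {
    public static void main(String[] args) throws Exception {
      HBaseConfiguration conf = new HBaseConfiguration();  // picks up hbase-site.xml
      HBaseAdmin admin = new HBaseAdmin(conf);

      HTableDescriptor desc = new HTableDescriptor("channel_products");
      String[] families = {"active", "channel_category_id", "channel_id",
          "contract_id", "created_at", "shop_category_id", "shop_id",
          "shop_product_id", "updated_at"};
      for (String family : families) {
        // Defaults: VERSIONS = 3, COMPRESSION = NONE, BLOCKSIZE = 65536,
        // IN_MEMORY = false, BLOCKCACHE = true
        desc.addFamily(new HColumnDescriptor(family));
      }

      if (!admin.tableExists("channel_products")) {
        admin.createTable(desc);
      }
    }
  }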

Code to scan the table:

  protected void doGet(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
    Date startDate = new Date();
    Date finishDate;
    log(startDate + ": Get activation status started");

    String shop_id = request.getParameter("shop_id");

    String[] shop_product_ids = request.getParameterValues("shop_product_ids");
    if (shop_product_ids != null && shop_product_ids.length == 1) {
      shop_product_ids = shop_product_ids[0].split(",");
    }

    String channel_id = request.getParameter("channel_id");
    String channel_category_id = request.getParameter("channel_category_id");

    String tableName = "channel_products";
    StringBuffer result = new StringBuffer("<?xml version=\"1.0\"?>");

    if (this.admin.tableExists(tableName)) {
      result.append("<result>");

      HTable table = new HTable(this.configuration, tableName);
      Scan scan = new Scan();

      // Default FilterList operator is MUST_PASS_ALL: every filter must match.
      FilterList mainFilterList = new FilterList();

      if (shop_id != null) {
        mainFilterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes("shop_id"),
            Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(shop_id)));
      }
      if (channel_id != null) {
        mainFilterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes("channel_id"),
            Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(channel_id)));
      }
      if (channel_category_id != null) {
        mainFilterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes("channel_category_id"),
            Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(channel_category_id)));
      }

      if (shop_product_ids != null && shop_product_ids.length > 0) {
        // Any one of the requested shop_product_ids may match (MUST_PASS_ONE).
        List<Filter> filterList = new ArrayList<Filter>();
        for (String shop_product_id : shop_product_ids) {
          filterList.add(new SingleColumnValueFilter(Bytes.toBytes("shop_product_id"),
              Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(shop_product_id)));
        }
        FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ONE, filterList);
        mainFilterList.addFilter(filters);
      }

      scan.setFilter(mainFilterList);
      ResultScanner scanner = null;
      try {
        scanner = table.getScanner(scan);
        for (Result item : scanner) {
          getItemXml(result, item);
        }
      } catch (Exception e) {
        logError("Error during table scan: ", e);
        result.append("<error>").append("Error during table scan: " + e).append("</error>");
      } finally {
        try {
          scanner.close();
        } catch (Exception e1) {
          // Scanner can be null here, skip
        }
        result.append("</result>");
      }
    } else {
      result.append("<result>").append("Table " + tableName + " does not exist!").append("</result>");
    }

    finishDate = new Date();
    log(finishDate + ": Get activation status finished, duration: "
        + (finishDate.getTime() - startDate.getTime()) + " ms");

    response.getOutputStream().print(result.toString());
  }
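
One more note on the code above: as far as I understand, SingleColumnValueFilter
cannot use any index, so each request is still a full scan over all ~6,000,000
rows, and with the 0.20 default of one row per RPC (hbase.client.scanner.caching = 1)
that means a lot of round trips to the single region server. I have not tried it
yet, but raising the caching on the Scan should reduce the load, something like
this rough sketch (it would replace the `new Scan()` / `setFilter(...)` lines in
doGet(); the value 500 is just a guess):

  // Rough sketch, not yet tested: same Scan as in doGet(), but with
  // client-side scanner caching raised so that each RPC returns a batch
  // of rows instead of one row at a time.
  Scan scan = new Scan();
  scan.setCaching(500);           // rows fetched per RPC; the 0.20 default is 1
  scan.setFilter(mainFilterList); // the same filter list built above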

I checked the region server logs, but the region server had not started:

> 2009-10-29 13:34:13,754 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Not starting a distinct region server because hbase.cluster.distributed is false

HBase and Hadoop were configured according to the "Getting Started" section on *
hadoop.org*. They are both started in pseudo-distributed mode.
Maybe I should set *hbase.cluster.distributed* to true?
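If I do switch it, my understanding is that hbase-site.xml would look roughly
like this for a pseudo-distributed setup (a sketch only; the hdfs:// URL must
match fs.default.name in the Hadoop configuration, and port 9000 is just an
example):

  <configuration>
    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://localhost:9000/hbase</value>
    </property>
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
  </configuration>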
I'll also try to increase the RAM capacity, and then I'll post the results here.
-------------------------------------------------
Best wishes, Artyom Shvedchikov
