spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Reading from Hbase using python
Date Wed, 12 Nov 2014 20:26:38 GMT
Can you give us a bit more detail:

which HBase release you're using
whether you can reproduce this using the hbase shell

I did the following using hbase shell against 0.98.4:

hbase(main):001:0> create 'test', 'f1'
0 row(s) in 2.9140 seconds

=> Hbase::Table - test
hbase(main):002:0> put 'test', 'row1', 'f1:1', 'value1'
0 row(s) in 0.1040 seconds

hbase(main):003:0> put 'test', 'row1', 'f1:2', 'value2'
0 row(s) in 0.0080 seconds

hbase(main):004:0> scan 'test'
ROW                                      COLUMN+CELL
 row1                                    column=f1:1, timestamp=1415823887048, value=value1
 row1                                    column=f1:2, timestamp=1415823893857, value=value2
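
For comparison on the Python side, here is a minimal sketch of what a complete read of row1 should contain. It does not touch Spark or HBase: it assumes the cells from the scan above have already been obtained as (row, column, value) tuples, and the names `cells` and `group_by_row` are hypothetical, introduced only for illustration.

```python
from collections import defaultdict

# Cells copied by hand from the scan output above, as
# (row key, column, value) tuples. This is illustrative data,
# not something returned by an HBase or Spark API.
cells = [
    ("row1", "f1:1", "value1"),
    ("row1", "f1:2", "value2"),
]


def group_by_row(cells):
    """Collect every column of a row into one dict, keyed by row key."""
    rows = defaultdict(dict)
    for row, column, value in cells:
        rows[row][column] = value
    return dict(rows)


# One line per row key, with all of that row's columns.
for row, columns in group_by_row(cells).items():
    print(row, columns)
```

If the client-side result for row1 holds only one of these two columns, the cell was probably dropped somewhere between the scan and the print loop rather than in the table itself.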

Cheers

On Wed, Nov 12, 2014 at 11:32 AM, Alan Prando <alan@scanboo.com.br> wrote:

> Hi all,
>
> I'm trying to read an HBase table using an example from GitHub (
> https://github.com/apache/spark/blob/master/examples/src/main/python/hbase_inputformat.py),
> however, I have two qualifiers in a column family.
>
> Ex.:
>
>  ROW    COLUMN+CELL
>  row1   column=f1:1, timestamp=1401883411986, value=value1
>  row1   column=f1:2, timestamp=1401883415212, value=value2
>  row2   column=f1:1, timestamp=1401883417858, value=value3
>  row3   column=f1:1, timestamp=1401883420805, value=value4
>
> When I run hbase_inputformat.py, the following loop prints row1
> just once:
>
> output = hbase_rdd.collect()
> for (k, v) in output:
>     print (k, v)
>
> Am I doing anything wrong?
>
> Thanks in advance.
>
