hbase-user mailing list archives

From Naidu MS <sanyasinaidu.malla...@gmail.com>
Subject Re: Read access pattern
Date Wed, 01 May 2013 07:25:57 GMT
Hi, I have two questions regarding HDFS and the jps utility.

I am new to Hadoop and started learning it this past week.

1. Whenever I run start-all.sh and then jps in the console, it shows the
processes that were started:

naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
22283 NameNode
23516 TaskTracker
26711 Jps
22541 DataNode
23255 JobTracker
22813 SecondaryNameNode
Could not synchronize with target

But along with the list of started processes, jps always prints "Could not
synchronize with target". What does "Could not synchronize with target"
mean? Can someone explain why this is happening?


2. Is it possible to format the namenode multiple times? When I run the
namenode -format command, it does not format the namenode and shows the
following output:

naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.

13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = naidu/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y
Format aborted in /home/naidu/dfs/namenode
13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1
************************************************************/

Can someone help me understand this? Why is it not possible to format the
namenode multiple times?


On Wed, May 1, 2013 at 10:42 AM, lars hofhansl <larsh@apache.org> wrote:

> I do not want to be rude or anything... But how often do we need to have this
> discussion?
>
> When you salt your rowkeys with, say, 10 salt values, then for each read you
> need to fork off 10 read requests, and each of them touches only 1/10th of
> the table (which works nicely with HBase's prefix scans).
>
> Obviously, if you only need point gets you wouldn't be salting; that would be
> stupid. If you mostly do range scans, then salting is quite nice.
>
> Saying that salting is bad because it does not work for point gets is
> like saying that bulldozers are bad because you cannot use them on race
> tracks. :)
>
>
> -- Lars
>
>
>
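
A minimal sketch of the salted read path described above, in plain Java. It
assumes a TreeMap as a stand-in for an HBase table (both keep row keys
sorted), a one-digit round-robin salt, and a "|" separator. The class and
method names are hypothetical and no HBase client API is used. Each range
read fans out into one sub-scan per salt value, and the results are merged
back into key order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: a TreeMap stands in for an HBase table (sorted row keys).
class SaltedScanSketch {
    static final int SALT_BUCKETS = 10; // the "10 salt values" from the message

    private final TreeMap<String, String> table = new TreeMap<>();
    private int counter = 0; // round-robin salt source (an assumption)

    // Write path: prepend the salt so consecutive keys land in different key ranges.
    void put(String key, String value) {
        int salt = counter++ % SALT_BUCKETS;
        table.put(salt + "|" + key, value);
    }

    // Read path: one sub-scan per salt bucket, start inclusive, stop exclusive.
    List<String> rangeScan(String startKey, String stopKey) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (int salt = 0; salt < SALT_BUCKETS; salt++) {
            String prefix = salt + "|";
            Map<String, String> bucket =
                    table.subMap(prefix + startKey, true, prefix + stopKey, false);
            for (Map.Entry<String, String> e : bucket.entrySet()) {
                // Strip the salt so the caller sees the original keys, in order.
                merged.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return new ArrayList<>(merged.keySet());
    }

    public static void main(String[] args) {
        SaltedScanSketch t = new SaltedScanSketch();
        for (int i = 1; i <= 20; i++) {
            t.put(String.format("row-%03d", i), "v" + i);
        }
        System.out.println(t.rangeScan("row-005", "row-010"));
    }
}
```

In this sketch a point lookup would also need up to 10 probes, because the
salt cannot be derived from the key. That is the cost the rest of the thread
argues about.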
> ________________________________
>  From: Michael Segel <michael_segel@hotmail.com>
> To: user@hbase.apache.org
> Sent: Tuesday, April 30, 2013 10:06 AM
> Subject: Re: Read access pattern
>
>
> Sure.
>
> By definition, the salt number is a random seed that is not associated
> with the underlying record.
> A simple example is a round robin counter (mod the counter by 10 yielding
> [0..9] )
>
> So you get a record, prepend your salt and you write it out to HBase. The
> salt will push the data out to a different region.
>
> But what happens when you want to read the data?
>
> So on a full table scan... no biggie, it's the same.
>
> But suppose I want to do a partial table scan. Now I have to do multiple
> partial scans because I don't know the salt.
> Or if I want to do a simple get(), I now have to do N get()s,
> where N is the number of salt values allowed. In my example that's 10.
>
> And that's the problem.
>
> You are better off hashing the record key, taking the first couple of
> bytes of the hash as a prefix, and then writing the record out.
> When you want the record, take the key, hash it using the same process, and
> you have 1 get().
>
> You're still screwed up on doing a range scan, but you can't have
> everything.
>
> THIS IS WHY I AND MANY CARDIOLOGISTS SAY NO TO SALT. The only difference
> is that they are talking about excess sodium chloride in your diet. I'm
> talking about using a salt aka 'random seed'.
>
> Does that make sense?
>
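
A minimal sketch of the hash-prefix alternative described above, again in
plain Java with a TreeMap standing in for the table. The two-byte prefix
length, the "|" separator, and all names are assumptions made for
illustration. Because the prefix is computed from the key itself, the reader
can rebuild the full row key and needs only one lookup.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.TreeMap;

// Sketch only: prefix each key with the first bytes of its own MD5 hash.
class HashPrefixKeySketch {
    static final int PREFIX_BYTES = 2; // "first couple of bytes" (an assumption)

    private final TreeMap<String, String> table = new TreeMap<>();

    // Deterministic: the same key always yields the same row key.
    static String rowKey(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < PREFIX_BYTES; i++) {
                sb.append(String.format("%02x", digest[i] & 0xff));
            }
            return sb.append('|').append(key).toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    void put(String key, String value) {
        table.put(rowKey(key), value);
    }

    // One lookup, no fan-out: the prefix is recomputed from the key.
    String get(String key) {
        return table.get(rowKey(key));
    }

    public static void main(String[] args) {
        HashPrefixKeySketch t = new HashPrefixKeySketch();
        t.put("abc", "some value");
        System.out.println(rowKey("abc") + " -> " + t.get("abc"));
    }
}
```

Neighbouring keys still end up scattered across the key space, so a range
scan over the original key order is no easier than before.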
>
> On Apr 30, 2013, at 11:17 AM, Shahab Yunus <shahab.yunus@gmail.com> wrote:
>
> > Well those are *some* words :) Anyway, can you explain in a bit more detail
> > why you feel so strongly about this design/approach? The salting here is
> > not the only option mentioned and static hashing can be used as well. Plus,
> > even in case of salting, wouldn't the distributed scan take care of it? The
> > downside that I see is the bucket_number that we have to maintain both at
> > the time of reading/writing and update in case of cluster restructuring.
> >
> > Thanks,
> > Shahab
> >
> >
> > On Tue, Apr 30, 2013 at 11:57 AM, Michael Segel
> > <michael_segel@hotmail.com> wrote:
> >
> >> Geez that's a bad article.
> >> Never salt.
> >>
> >> And yes there's a difference between using a salt and using the first 2-4
> >> bytes from your MD5 hash.
> >>
> >> (Hint: Salts are random. Your hash isn't. )
> >>
> >> Sorry to be-itch but it's a bad idea and it shouldn't be propagated.
> >>
> >> On Apr 29, 2013, at 10:17 AM, Shahab Yunus <shahab.yunus@gmail.com> wrote:
> >>
> >>> I think you cannot simply use the scanner to do a range scan here, as your
> >>> keys are not monotonically increasing. You need to apply logic to
> >>> decode/reverse the mechanism that you used to hash your keys at the
> >>> time of writing. You might want to check out the SemaText library, which
> >>> does distributed scans and seems to handle the scenarios that you want to
> >>> implement.
> >>>
> >>> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
> >>>
> >>>
> >>> On Mon, Apr 29, 2013 at 11:03 AM, <ricla@laposte.net> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I have a rowkey defined by :
> >>>>       getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> >>>> (Long.MAX_VALUE - changeDate.getTime()));
> >>>>
> >>>> How could I get the previous and next row for a given rowkey ?
> >>>> For instance, I have the following ordered keys :
> >>>>
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >>>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
> >>>>
> >>>> If I choose the rowkey :
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
> >>>> correct scan to get the previous and next key ?
> >>>> Result would be :
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >>>>
> >>>> Thank you !
> >>>> R.
> >>>>
> >>>>
> >>
> >>
>
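
A minimal sketch of the previous/next lookup asked about in the original
question, using the row keys quoted in the thread. It assumes a TreeSet as a
stand-in for the table's sorted key order and treats the first 32 characters
(the MD5 hex string) as the object prefix. The class and method names are
hypothetical and no HBase client API is used.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

// Sketch only: neighbour lookup over sorted row keys for one object.
class NeighbourRowsSketch {
    static final int OBJECT_PREFIX_LEN = 32; // length of an MD5 hex string

    // Closest key strictly before `key`, or null if it belongs to another object.
    static String previous(NavigableSet<String> rows, String key) {
        return sameObject(rows.lower(key), key);
    }

    // Closest key strictly after `key`, or null if it belongs to another object.
    static String next(NavigableSet<String> rows, String key) {
        return sameObject(rows.higher(key), key);
    }

    private static String sameObject(String candidate, String key) {
        if (candidate == null) {
            return null;
        }
        return candidate.regionMatches(0, key, 0, OBJECT_PREFIX_LEN) ? candidate : null;
    }

    public static void main(String[] args) {
        String p = "00003db1b6c1e7e7d2ece41ff2184f76*";
        NavigableSet<String> rows = new TreeSet<>(Arrays.asList(
                p + "9223370673172227807",
                p + "9223370674468022807",
                p + "9223370674468862807",
                p + "9223370674984237807",
                p + "9223370674987271807"));
        String key = p + "9223370674468862807";
        System.out.println(previous(rows, key));
        System.out.println(next(rows, key));
    }
}
```

Because the key stores Long.MAX_VALUE minus the change time, the previous key
in sort order is the more recent change. Against a real table, the next row
roughly corresponds to a forward scan that starts just after the chosen key
and stops after one row; the previous row needs a backward lookup such as a
reversed scan, which is only available in HBase releases that support it.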
