Hi,

I have two questions regarding HDFS and the jps utility. I am new to Hadoop and started learning it this past week.

1. Whenever I run start-all.sh and then jps in the console, it shows the processes that were started:

naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
22283 NameNode
23516 TaskTracker
26711 Jps
22541 DataNode
23255 JobTracker
22813 SecondaryNameNode
Could not synchronize with target

Along with the list of started processes, the jps output always shows "Could not synchronize with target". What does "Could not synchronize with target" mean? Can someone explain why this is happening?

2. Is it possible to format the namenode multiple times? When I enter the namenode -format command, it does not format the namenode and shows the following output:

naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.

13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = naidu/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y
Format aborted in /home/naidu/dfs/namenode
13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1
************************************************************/

Can someone help me understand this? Why is it not possible to format the namenode multiple times?

On Wed, May 1, 2013 at 10:42 AM, lars hofhansl wrote:

> I do not want to be rude or anything... But how often do we need to have this
> discussion?
>
> When you salt your rowkeys with, say, 10 salt values, then for each read you
> need to fork off 10 read requests, and each of them touches only 1/10th of
> the table (which works nicely with HBase's prefix scans).
>
> Obviously, if you only need point gets you wouldn't salt; that would be
> stupid. If you mostly do range scans, then salting is quite nice.
>
> Saying that salting is bad because it does not work for point gets is
> like saying that bulldozers are bad because you cannot use them on race
> tracks. :)
>
>
> -- Lars
>
>
>
> ________________________________
> From: Michael Segel
> To: user@hbase.apache.org
> Sent: Tuesday, April 30, 2013 10:06 AM
> Subject: Re: Read access pattern
>
>
> Sure.
>
> By definition, the salt number is a random seed that is not associated
> with the underlying record.
> A simple example is a round robin counter (mod the counter by 10 yielding
> [0..9] )
>
> So you get a record, prepend your salt and you write it out to HBase. The
> salt will push the data out to a different region.
>
> But what happens when you want to read the data?
>
> So on a full table scan... no biggie, it's the same.
>
> But suppose I want to do a partial table scan. Now I have to do multiple
> partial scans because I don't know the salt.
> Or if I want to do a simple get() I now have to do N number of get()s
> where N is the number of salt values allowed. In my example that's 10.
>
> And that's the problem.
>
> You are better off doing a hash of the record, using the first couple of
> bytes of the hash and then writing the record out.
> You want the record, take the key, hash it using the same process, and you
> have 1 get().
>
> You're still screwed up on doing a range scan, but you can't have
> everything.
>
> THIS IS WHY I AND MANY CARDIOLOGISTS SAY NO TO SALT. The only difference
> is that they are talking about excess sodium chloride in your diet. I'm
> talking about using a salt aka 'random seed'.
>
> Does that make sense?
>
>
> On Apr 30, 2013, at 11:17 AM, Shahab Yunus wrote:
>
> > Well those are *some* words :) Anyway, can you explain in a bit more detail
> > why you feel so strongly about this design/approach? Salting is not the
> > only option mentioned here, and static hashing can be used as well. Plus,
> > even in the case of salting, wouldn't the distributed scan take care of it?
> > The downside that I see is the bucket_number that we have to maintain both
> > at the time of reading/writing and update in case of cluster restructuring.
> >
> > Thanks,
> > Shahab
> >
> >
> > On Tue, Apr 30, 2013 at 11:57 AM, Michael Segel wrote:
> >
> >> Geez that's a bad article.
> >> Never salt.
> >>
> >> And yes there's a difference between using a salt and using the first 2-4
> >> bytes from your MD5 hash.
> >>
> >> (Hint: Salts are random. Your hash isn't.)
> >>
> >> Sorry to be-itch but it's a bad idea and it shouldn't be propagated.
> >>
> >> On Apr 29, 2013, at 10:17 AM, Shahab Yunus wrote:
> >>
> >>> I think you cannot simply use the scanner to do a range scan here, as your
> >>> keys are not monotonically increasing. You need to apply logic to
> >>> decode/reverse the mechanism that you used to hash your keys at the
> >>> time of writing. You might want to check out the Sematext library, which
> >>> does distributed scans and seems to handle the scenarios that you want to
> >>> implement.
> >>>
> >>> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
> >>>
> >>>
> >>> On Mon, Apr 29, 2013 at 11:03 AM, wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I have a rowkey defined by :
> >>>> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> >>>> (Long.MAX_VALUE - changeDate.getTime()));
> >>>>
> >>>> How could I get the previous and next row for a given rowkey ?
> >>>> For instance, I have the following ordered keys :
> >>>>
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
> >>>>
> >>>> If I choose the rowkey :
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
> >>>> correct scan to get the previous and next key ?
> >>>> Result would be :
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >>>>
> >>>> Thank you !
> >>>> R.
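
For readers following the quoted thread, the difference between a round-robin salt and a hash-derived prefix can be sketched in plain Java as below. This is only an illustration under stated assumptions: the class name RowKeyPrefixSketch, the "-" separator, the two-byte prefix length and the sample key "row-42" are all made up here, and nothing in it uses the HBase client API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of HBase or of the original posts.
public class RowKeyPrefixSketch {
    // Ten salt values [0..9], matching the example in the thread.
    static final int BUCKETS = 10;

    // Round-robin counter used as the salt (not thread-safe; sketch only).
    private static int counter = 0;

    // Salted key: the prefix comes from the counter, not from the record,
    // so a reader cannot recompute it from the record key alone.
    static String saltedKey(String recordKey) {
        int salt = counter++ % BUCKETS;
        return salt + "-" + recordKey;
    }

    // Every key a reader has to try for a point lookup on salted data:
    // one candidate per possible salt value.
    static List<String> candidateSaltedKeys(String recordKey) {
        List<String> keys = new ArrayList<>();
        for (int salt = 0; salt < BUCKETS; salt++) {
            keys.add(salt + "-" + recordKey);
        }
        return keys;
    }

    // Hash-prefixed key: the first two bytes of the MD5 of the record key,
    // rendered as four hex characters. The same input always yields the
    // same prefix, so a reader needs only one lookup.
    static String hashPrefixedKey(String recordKey) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(recordKey.getBytes(StandardCharsets.UTF_8));
            return String.format("%02x%02x-%s",
                    digest[0] & 0xff, digest[1] & 0xff, recordKey);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String recordKey = "row-42"; // made-up sample key
        System.out.println("written (salted):  " + saltedKey(recordKey));
        System.out.println("read candidates:   " + candidateSaltedKeys(recordKey));
        System.out.println("written (hashed):  " + hashPrefixedKey(recordKey));
        System.out.println("read key (hashed): " + hashPrefixedKey(recordKey));
    }
}
```

In this sketch a point read against salted keys has to consider all ten candidates, while the hash-prefixed key is recomputed from the record key in a single step. Neither prefix keeps rows in their natural order across the whole table, which is the range-scan limitation both sides of the thread acknowledge.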