sqoop-user mailing list archives

From Wei Yan <ywsk...@gmail.com>
Subject Is hash-based partition supported by Sqoop?
Date Thu, 05 Nov 2015 21:59:57 GMT
Hi, guys,

I have a question about how Sqoop imports data in parallel. As I understand
it, Sqoop first gets the min and max values of the SPLIT_BY column and then
does a range-based partition, so that each mapper consumes one range. Do we
support hash-based partitioning, where each mapper ingests the rows
satisfying a query like
"select * from table where hash(split_by) % n = i"?
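
To make the two schemes concrete, here is a minimal sketch of the WHERE
clauses each mapper would get. It is only an illustration, not Sqoop's actual
code: the function names are hypothetical, the split column is assumed to be
an integer, and hash() is a placeholder for whatever hash function the
database provides.

```python
def range_predicates(column, lo, hi, n):
    """One WHERE clause per mapper, covering [lo, hi] in n contiguous ranges.

    Assumes an integer split column; lo and hi stand for the min and max
    values fetched up front.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # n + 1 evenly spaced integer boundaries from lo to hi.
    bounds = [lo + (hi - lo) * i // n for i in range(n + 1)]
    clauses = []
    for i in range(n):
        # Only the last range includes its upper bound, so no row is read twice.
        op = "<=" if i == n - 1 else "<"
        clauses.append(f"{column} >= {bounds[i]} AND {column} {op} {bounds[i + 1]}")
    return clauses


def hash_predicates(column, n):
    """One WHERE clause per mapper, selecting rows by hash bucket.

    hash() is a placeholder, not a portable SQL function.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [f"hash({column}) % {n} = {i}" for i in range(n)]


if __name__ == "__main__":
    for clause in range_predicates("id", 0, 100, 4):
        print("select * from table where " + clause)
    for clause in hash_predicates("id", 4):
        print("select * from table where " + clause)
```

The range version needs the min and max up front and can produce uneven
splits when the values are skewed; the hash version needs neither, but the
database has to evaluate the hash for every row in every mapper's query.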

thanks,
Wei
