sqoop-user mailing list archives

From Something Something <mailinglist...@gmail.com>
Subject Re: Using BLOB created by Sqoop
Date Fri, 10 Apr 2015 04:37:08 GMT
Our blobs are very big!  I am NOT importing into Hive; I am writing to HDFS.
I am creating an EXTERNAL Hive table on top of the data by setting
'--map-column-hive' to 'columnName=BINARY', but would that read the blobs
from the '_lob' directory?  I don't think so!  How would Hive know that
Sqoop puts the blobs in the '_lob' directory?  Maybe we need to use some other
approach.

Has anyone used blobs that were copied using Sqoop?
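In case it helps, here is roughly the kind of setup I mean -- the connect
string, table, and column names below are placeholders, not our actual job:

```shell
# Import to plain HDFS text files. LOBs above the inline limit are written
# by Sqoop as separate .lob files under a _lob subdirectory of the target.
sqoop import \
  --connect jdbc:mysql://dbhost/mydb \
  --username myuser -P \
  --table documents \
  --target-dir /data/documents \
  --map-column-hive doc_blob=BINARY

# External Hive table over the imported files. Note this only reads the
# inlined column values in the record files -- Hive has no knowledge of
# the _lob directory or the .lob files inside it.
hive -e "
CREATE EXTERNAL TABLE documents (
  id BIGINT,
  doc_blob BINARY
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/documents';"
```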

On Thu, Apr 9, 2015 at 9:26 PM, Abraham Elmahrek <abe@cloudera.com> wrote:

> 1. You can define the size limit, but the default is 16 MB.
> 2. How big are your blobs? I'd map these to strings when importing into
> Hive if they're small enough.
> On Thu, Apr 9, 2015 at 9:35 AM, Something Something <
> mailinglists19@gmail.com> wrote:
>> Using Sqoop I’ve successfully imported a few rows from a table that has a
>> BLOB column. As indicated in the Sqoop documentation, it has created ‘_lob’
>> directory with files such as:
>> large_obj_attempt_201503141229_83736_m_000004_00.lob for *some* of the rows.
>> Questions:
>> 1) As per doc, only files over 16M will go in this directory, correct?
>> 2) How do I know which row this file is related to?
>> In short, how do I use these ‘lobs’ that are created? Using Hive? Pig?
>> Native MapReduce? Any sample code will be greatly appreciated.
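
(For context: the size limit mentioned above is Sqoop's --inline-lob-limit
option, given in bytes. A sketch of raising it so that blobs up to ~64 MB
stay inline in the record text instead of going to the external _lob
directory -- connection details and table name here are placeholders:)

```shell
# Raise the inline LOB threshold from the 16 MB default (16777216 bytes)
# to 64 MB, so blobs below that size are stored inline in the records.
sqoop import \
  --connect jdbc:mysql://dbhost/mydb \
  --username myuser -P \
  --table documents \
  --target-dir /data/documents \
  --inline-lob-limit 67108864
```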
