sqoop-user mailing list archives

From Venkat Ranganathan <vranganat...@hortonworks.com>
Subject Re: Complex free form queries
Date Thu, 18 Sep 2014 22:43:29 GMT
There are a few scenarios where we warn against inconsistencies: using a
character column as the split-by column, or using complex queries with a
split-by column. Either can cause the mappers to retrieve data that differs
from what was intended.

If you use the -m 1 option (a single mapper), you avoid these inconsistency
issues.
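
One way the inconsistency can arise is sketched below. This is a simulation under stated assumptions, not Sqoop's actual code: it assumes each mapper runs the free-form query with the `$CONDITIONS` token replaced by a range predicate on the split-by column, and that a single mapper gets a trivially true predicate. The `orders` table, its columns, and the helpers `split_conditions` and `run_import` are hypothetical, and SQLite stands in for the source database. Because AND binds tighter than OR in SQL, an unparenthesized OR lets one branch escape the range predicate, so every simulated mapper returns those rows.

```python
import sqlite3

# Hypothetical source table; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "open" if i % 2 else "closed", "eu" if i % 3 == 0 else "us")
     for i in range(1, 13)],
)


def split_conditions(lo, hi, num_mappers):
    """Cut [lo, hi] into contiguous id ranges, one per simulated mapper."""
    if num_mappers == 1:
        # Assumption: a single mapper needs no range restriction.
        return ["(1 = 1)"]
    step = (hi - lo + 1) / num_mappers
    bounds = [lo + round(i * step) for i in range(num_mappers)] + [hi + 1]
    return [f"id >= {a} AND id < {b}" for a, b in zip(bounds, bounds[1:])]


def run_import(query_template, num_mappers):
    """Run the query once per split and concatenate the rows, as a
    parallel import would concatenate the mappers' output."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM orders").fetchone()
    imported = []
    for cond in split_conditions(lo, hi, num_mappers):
        sql = query_template.replace("$CONDITIONS", cond)
        imported.extend(row[0] for row in conn.execute(sql))
    return sorted(imported)


# The OR branch "status = 'open'" is not restricted by the split range.
UNSAFE = "SELECT id FROM orders WHERE status = 'open' OR region = 'eu' AND $CONDITIONS"
# Parenthesizing the OR keeps the whole filter inside each split range.
SAFE = "SELECT id FROM orders WHERE (status = 'open' OR region = 'eu') AND $CONDITIONS"

expected = sorted(
    row[0]
    for row in conn.execute(
        "SELECT id FROM orders WHERE status = 'open' OR region = 'eu'"
    )
)

unsafe_parallel = run_import(UNSAFE, 4)
unsafe_single = run_import(UNSAFE, 1)
safe_parallel = run_import(SAFE, 4)

print("expected rows:        ", len(expected))
print("unsafe, 4 mappers:    ", len(unsafe_parallel))
print("unsafe, 1 mapper:     ", len(unsafe_single))
print("parenthesized, 4 maps:", len(safe_parallel))
```

In this simulation the unparenthesized query imports duplicate rows with four splits but matches the direct query with one, which is consistent with the -m 1 advice. Parenthesizing the OR (or hiding it behind a view or subquery) also removes the duplicates here, though whether that covers every case in a real import is not something this sketch establishes.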

Venkat

On Thu, Sep 18, 2014 at 2:40 PM, pratik khadloya <tispratik@gmail.com>
wrote:

> I am not facing any problem. I am trying to understand the reservations
> about supporting complex joins with OR conditions.
> I would like to know when they could cause a problem, and whether the problem
> would be solvable by using a "view" or by limiting the number of mappers to 1.
> Is the problem, if any, due to the parallelism that comes with increasing
> the number of mappers?
>
> ~Pratik
>
> On Thu, Sep 18, 2014 at 1:23 PM, Sambit Tripathy (RBEI/PJ-NBS) <
> Sambit.Tripathy@in.bosch.com> wrote:
>
>> Pratik,
>>
>> Are you facing a problem or trying to make a recommendation?
>>
>> Regards,
>> Sambit.
>>
>> *From:* pratik khadloya [mailto:tispratik@gmail.com]
>> *Sent:* Thursday, September 18, 2014 1:09 PM
>> *To:* user@sqoop.apache.org
>> *Subject:* Complex free form queries
>>
>>
>>
>> The sqoop docs say:
>>
>>
>>
>> The facility of using free-form query in the current version of Sqoop is
>> limited to simple queries where there are no ambiguous projections and no
>> OR conditions in the WHERE clause. Use of complex queries such as
>> queries that have sub-queries or joins leading to ambiguous projections can
>> lead to unexpected results.
>>
>>
>>
>> Does anyone know why such cases are not supported, and whether the
>> problem can be avoided by:
>>
>>
>>
>> a) Using only 1 mapper
>>
>> or
>>
>> b) Creating a view out of the complex query
>>
>>
>>
>> I have tested a Hive textfile import for a very complex query and
>> verified the data, and it seems to be correct. I compared the word
>> counts, line counts and file sizes of the dump from MySQL against the text
>> file imported onto HDFS by Sqoop.
>>
>> My query does have OR conditions. I have attached an obfuscated version
>> of the query, and that screenprint is still only half of the complete query.
>>
>>
>>
>> Any info on this will be helpful.
>>
>>
>>
>> Thanks,
>>
>> Pratik
>>
>
>
