sqoop-user mailing list archives

From Per Ullberg <per.ullb...@klarna.com>
Subject Re: Reply: How to limit the number of rows to export data when using sqoop to export data from hdfs to oracle?
Date Sat, 02 Dec 2017 08:02:42 GMT
You have to do a CTAS. Just look it up in the hive manual. Basically it's
just
CREATE TABLE export_sample AS SELECT * FROM big_table;

You might have to adjust the storage format depending on how you use
sqoop. Plain text usually works but can be a bit inefficient. The select
part of the CTAS can be any select, so you have many more options than just
SELECT *.
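
For the original question (export only 10 of 100 rows), the two steps could
be scripted roughly as below. This is only a sketch under assumptions: the
helper names build_ctas and build_sqoop_export, the JDBC URL, the user, the
Oracle table name and the warehouse path are placeholders made up for
illustration, and the comma delimiter is just one choice that has to match on
the Hive side and the sqoop side. Also note that LIMIT without an ORDER BY
gives you an arbitrary 10 rows, not necessarily the "first" ones.

```python
import shlex


def build_ctas(target, source, limit):
    """Return a HiveQL CTAS that materializes `limit` rows of `source` as comma-delimited text."""
    return (
        f"CREATE TABLE {target} "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        "STORED AS TEXTFILE "
        f"AS SELECT * FROM {source} LIMIT {int(limit)}"
    )


def build_sqoop_export(jdbc_url, user, oracle_table, export_dir):
    """Return the argv for a sqoop export that reads the CTAS output directory."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", user,
        "-P",  # prompt for the password instead of putting it on the command line
        "--table", oracle_table,
        "--export-dir", export_dir,
        "--input-fields-terminated-by", ",",  # must match the CTAS delimiter
    ]


if __name__ == "__main__":
    # Every value below is a hypothetical placeholder, not taken from this thread.
    print(build_ctas("export_sample", "big_table", 10) + ";")
    cmd = build_sqoop_export(
        "jdbc:oracle:thin:@//dbhost:1521/ORCL",
        "scott",
        "EXPORT_SAMPLE",
        "/user/hive/warehouse/export_sample",
    )
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The script only prints the statement and the command; you would run the first
in Hive and the second in a shell, after checking where your warehouse
directory actually is and that the Oracle table already exists.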

In contrast to a view, the CTAS result is materialized as files. Once
materialized, it has no coupling to the original data. If you want to include
new data from the source table, you have to drop the CTAS table and rerun the
statement.
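
If the snapshot behaviour is unclear, here is a small self-contained
illustration. It uses an in-memory SQLite database as a stand-in for Hive,
which is purely an assumption for the sake of a runnable example: SQLite
stores nothing in HDFS, but CREATE TABLE ... AS SELECT versus CREATE VIEW
shows the same decoupling. The function name snapshot_demo and the view name
sample_view are made up.

```python
import sqlite3


def snapshot_demo():
    """Compare a view and a CTAS table after the source table changes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE big_table (id INTEGER)")
    conn.executemany("INSERT INTO big_table VALUES (?)", [(i,) for i in range(100)])

    # Same SELECT, once as a view and once materialized with CTAS.
    conn.execute("CREATE VIEW sample_view AS SELECT * FROM big_table WHERE id < 10")
    conn.execute("CREATE TABLE export_sample AS SELECT * FROM big_table WHERE id < 10")

    # New data arrives in the source table after the CTAS has run.
    conn.execute("INSERT INTO big_table VALUES (-1)")

    view_rows = conn.execute("SELECT COUNT(*) FROM sample_view").fetchone()[0]
    ctas_rows = conn.execute("SELECT COUNT(*) FROM export_sample").fetchone()[0]

    # To pick up the new row, drop the CTAS table and run the statement again.
    conn.execute("DROP TABLE export_sample")
    conn.execute("CREATE TABLE export_sample AS SELECT * FROM big_table WHERE id < 10")
    refreshed_rows = conn.execute("SELECT COUNT(*) FROM export_sample").fetchone()[0]

    conn.close()
    return view_rows, ctas_rows, refreshed_rows


if __name__ == "__main__":
    print(snapshot_demo())  # (11, 10, 11)
```

The view sees the new row immediately, the CTAS table keeps its 10 rows until
it is dropped and recreated.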

Regards
/Pelle

On Sat, 2 Dec 2017 at 07:50, qq <987626311@qq.com> wrote:

> I created a hive view and tried to export its data to oracle through
> sqoop. The error message is as follows:
> ERROR tool.ExportTool: Encountered IOException running export
> job: java.io.IOException: java.lang.NullPointerException
> Does sqoop currently support exporting data from a hive view?
>
>
> ------------------ Original Message ------------------
> *From:* "Markus Kemper"<markus@cloudera.com>;
> *Sent:* Friday, December 1, 2017, 9:24 PM
> *To:* "user"<user@sqoop.apache.org>;
> *Subject:* Re: Reply: How to limit the number of rows to export data when using
> sqoop to export data from hdfs to oracle?
>
> Hey Team,
>
> Not sure a VIEW with export will work.  If I recall export with --hcatalog
> is not aware of HMS VIEWs, need to test.
>
> Markus Kemper
> Customer Operations Engineer
> <http://www.cloudera.com>
>
>
> On Fri, Dec 1, 2017 at 8:00 AM, Attila Szabó <maugli@apache.org> wrote:
>
>> Hey,
>>
>> IMHO it's CTAS which stands for "Create Table As Select".
>>
>> On the subject of views:
>> Whether Sqoop supports them or not is a more than valid question. I do
>> remember a problem we faced 1-1.5 years ago in connection with Hive +
>> HCatalog + Sqoop, which was not supported because of a missing Hive
>> SerDe implementation. I'm not sure if this problem exists with standard
>> Hive views and the Sqoop export command, but you should give it a try.
>>
>> What could be problematic with views:
>> A view is by design the result of a select statement, and in Hive a
>> select is typically executed as a map/reduce job. Sqoop also works with
>> map/reduce jobs and tries to read files from HDFS (because of
>> data locality and so on), so the two might clash here, but as
>> I've advised, you should give it a try.
>>
>> With CTAS:
>> It should definitely work, because in this case Hive will store the
>> filtered results in a different Hive table (== HDFS directory), and thus
>> --export-dir is your friend. :)
>>
>> Was I able to clarify everything, or do you have further questions?
>>
>> Cheers,
>> Attila
>>
>> On Dec 1, 2017 1:33 PM, "qq" <987626311@qq.com> wrote:
>>
>>> Hello:
>>>       First of all, thank you very much for your answer. I have just started
>>> using sqoop and there is a lot I do not understand. Could you explain in detail
>>> the steps for using sqoop export with views, and the steps for materialising
>>> the exact dataset to sqoop using a CAST?
>>>      Thanks.
>>>      I am looking forward to your reply!
>>>
>>>
>>> ------------------ Original Message ------------------
>>> *From:* "Per Ullberg"<per.ullberg@klarna.com>;
>>> *Sent:* Friday, December 1, 2017, 6:21 PM
>>> *To:* "user"<user@sqoop.apache.org>;
>>> *Subject:* Re: How to limit the number of rows to export data when using
>>> sqoop to export data from hdfs to oracle?
>>>
>>> Does Sqoop export work with Views? If not, you'll have to materialise
>>> the exact dataset you want to sqoop using a CTAS.
>>>
>>> regards
>>> /Pelle
>>>
>>> On Fri, Dec 1, 2017 at 11:08 AM, Attila Szabó <maugli@apache.org> wrote:
>>>
>>>> Hey,
>>>>
>>>> If you're trying to export from Hive into an RDBMS, I would suggest
>>>> creating a Hive view and exporting only the content of the view. That way
>>>> you can directly control the data quantity through the underlying HiveQL query.
>>>>
>>>> My 2cents,
>>>> Attila
>>>>
>>>>
>>>> On Dec 1, 2017 10:54 AM, "qq" <987626311@qq.com> wrote:
>>>>
>>>> Hello:
>>>>
>>>>       I have a question about sqoop export and need your help. The
>>>> problem is as follows:
>>>>       How can I limit the number of rows exported when
>>>> exporting data from hdfs to oracle using sqoop?
>>>>       For example: the hive data stored in hdfs has 100 rows and I only want
>>>> the first 10 rows exported to the oracle table. How can I
>>>> achieve this with sqoop?
>>>>       I am looking forward to your reply!
>>>>       Thanks.
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> *Per Ullberg*
>>> Datavault Tech Lead
>>> Odin (Uppsala)
>>>
>>> Klarna Bank AB (publ)
>>> Sveavägen 46, 111 34 Stockholm
>>> Tel: +46 8 120 120 00
>>> Reg no: 556737-0431
>>> klarna.com
>>>
>>>
> --

*Per Ullberg*
Datavault Tech Lead
Odin (Uppsala)

Klarna Bank AB (publ)
Sveavägen 46, 111 34 Stockholm
Tel: +46 8 120 120 00
Reg no: 556737-0431
klarna.com
