sqoop-user mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: Append to file
Date Thu, 15 Nov 2012 22:12:55 GMT
Hi Artem,
thank you for your questions. Let me comment on them:

1) To the best of my knowledge, HDFS does not support appending to existing files, so your daily
imports will need to create new, additional files. However, since all other parts of the framework
were designed to work this way, it should not be an issue. Would you mind sharing your
use case? Why do you need to append to already existing files? Also, Sqoop does not support
Hadoop archives (HAR) natively; however, you can run several imports and then convert them to
HAR manually.
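
One way to arrange "a new set of files per daily import" is to give each run its own dated target directory. The following is only a minimal sketch under stated assumptions: the JDBC URL, table name, and base directory are hypothetical placeholders, and the helper merely builds the command line rather than running Sqoop.

```python
from datetime import date


def daily_import_command(jdbc_url, table, base_dir, day):
    """Build a Sqoop import command that writes one day's data to its own directory.

    All argument values are hypothetical; adjust them to your environment.
    """
    # One directory per day, e.g. /data/imports/orders/2012-11-15
    target_dir = f"{base_dir}/{table}/{day.isoformat()}"
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
    ]


# Example with placeholder values: print the command for one day's import.
cmd = daily_import_command(
    "jdbc:mysql://db.example.com/sales",  # hypothetical JDBC URL
    "orders",                             # hypothetical table name
    "/data/imports",                      # hypothetical HDFS base directory
    date(2012, 11, 15),
)
print(" ".join(cmd))
```

Downstream jobs can then take the parent directory (here /data/imports/orders) as input and pick up all daily files together.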

2) The --as-avrodatafile parameter creates Avro container files, which are not text files. They
are not text files even when you do not use the --compress argument, so you can't read them
directly. You need to use the Avro libraries to get the data. I would recommend the official Avro
pages for more information [1]. If you need simple text files, I would recommend removing the
--as-avrodatafile parameter.
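
To illustrate why gunzip does not recognize these files: an Avro container file starts with its own four-byte header ("Obj" followed by the byte 1), and any compression is applied to the data blocks inside the container rather than to the file as a whole. The sketch below only inspects the leading bytes of a file to tell the two formats apart; the function names are made up for this example, and it does not decode any records (that still requires the Avro libraries).

```python
AVRO_MAGIC = b"Obj\x01"   # first four bytes of an Avro object container file
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream


def sniff_format(header: bytes) -> str:
    """Classify a file by its leading bytes."""
    if header.startswith(AVRO_MAGIC):
        return "avro-container"
    if header.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first four bytes of a local file and classify them."""
    with open(path, "rb") as f:
        return sniff_format(f.read(4))
```

For example, after copying one of the imported files out of HDFS, calling sniff_file on it should report an Avro container rather than gzip, whether or not --compress was used.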

Jarcec

Links:
1: http://avro.apache.org/

On Thu, Nov 15, 2012 at 07:17:34PM +0000, Artem Ervits wrote:
> Hello community,
> 
> 
> 1. I'd like to know what people are doing when they need to do a daily Sqoop import,
> incremental or not, to append the data to an existing file. One possible solution I can think
> of is Hadoop archive files. Does Sqoop support appending to an existing file, or is HAR the
> way to go in my scenario?
> 
> 2. Another question I have: if I run sqoop import using --as-avrodatafile in
> conjunction with --compress, the result is returned with an .avro extension. If I weren't using
> compression, I could cat the file. How can I view the file when it's compressed? I have three
> files with the .avro extension; I tried to run gunzip -d on them, but it won't recognize the codec.
> 
> Thank you.
> 
> Artem Ervits
> Data Analyst
> New York Presbyterian Hospital
> 
> 
> 
> --------------------
> 
> This electronic message is intended to be for the use only of the named recipient, and
may contain information that is confidential or privileged.  If you are not the intended recipient,
you are hereby notified that any disclosure, copying, distribution or use of the contents
of this message is strictly prohibited.  If you have received this message in error or are
not the named recipient, please notify us immediately by contacting the sender at the electronic
mail address noted above, and delete and destroy all copies of this message.  Thank you.
> 
