sqoop-user mailing list archives

From Cheolsoo Park <cheol...@cloudera.com>
Subject Re: loading as sequencefile and running an hadoop mapreduce job
Date Wed, 11 Apr 2012 18:52:12 GMT
Hi Marc,

It would be helpful if you could provide the complete log with the --verbose
option on.

> I believe the result in the hdfs file is a serialization of the java object
> of a class generated automatically by sqoop, the class name is the table
> name and extends SqoopRecord: let’s call it table_name.java .


The result is not a serialization of 'table_name'. The auto-generated Java
class is only used by Sqoop to interface with the DB; the output is sequence
files that contain the data.

> Now, I am trying to run a MapReduce job against this file but it is
> failing, I added the class table_name.java in my jar. But when I run the
> mapreduce job, I get “ClassNotFoundException:
> com.cloudera.sqoop.lib.SqoopRecord”. Even with the option -libjars
> sqoop-1.3.0.jar.


I am not clear what MR jobs you're running here.

1) If you're importing data, I am wondering why you have to do this
manually, since Sqoop should do it automatically: compile table_name
into a jar, load the jar into hdfs, pass the path to the jar to the import
mapper jobs, etc.

2) If you're running your own MR jobs on imported data, they don't need to
know about 'table_name' or 'SqoopRecord' since the data are already in
sequence file format, so your MR jobs should be able to understand them.
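
For case 2, one way to see which key and value classes a given sequence file
declares is to read its header. The following is a minimal sketch, not
Sqoop's or Hadoop's own code. It assumes the usual SequenceFile header layout
("SEQ" magic bytes, one version byte, then the key and value class names as
length-prefixed UTF-8 strings) and a local copy of one part file, for example
fetched with hadoop fs -get. The class name SeqHeader is hypothetical, and
only the JDK is used.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class SeqHeader {

    // Decode a Hadoop variable-length int (assumed WritableUtils vint encoding).
    static int readVInt(DataInputStream in) throws IOException {
        byte first = in.readByte();
        if (first >= -112) {
            return first;                      // value is stored in the first byte
        }
        if (first < -120) {
            throw new IOException("negative length in header");
        }
        int extra = -(first + 112);            // number of big-endian bytes that follow
        long value = 0;
        for (int i = 0; i < extra; i++) {
            value = (value << 8) | (in.readByte() & 0xff);
        }
        return (int) value;
    }

    // Read one length-prefixed UTF-8 string.
    static String readString(DataInputStream in) throws IOException {
        int len = readVInt(in);
        if (len < 0) {
            throw new IOException("negative string length: " + len);
        }
        byte[] bytes = new byte[len];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Returns {keyClassName, valueClassName} as declared in the file header.
    // Very old format versions encode the names differently; not handled here.
    static String[] readHeader(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[3];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("not a SequenceFile (missing SEQ magic)");
        }
        in.readByte();                         // format version, ignored in this sketch
        String keyClass = readString(in);
        String valueClass = readString(in);
        return new String[] {keyClass, valueClass};
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a local copy of one part file from the import directory
        try (InputStream in = new FileInputStream(args[0])) {
            String[] header = readHeader(in);
            System.out.println("key class:   " + header[0]);
            System.out.println("value class: " + header[1]);
        }
    }
}
```

The value class name printed here is the class a reader of the file will try
to load, so it indicates what the job's classpath has to provide.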

Hope this is helpful.

Thanks,
Cheolsoo

On Wed, Apr 11, 2012 at 9:28 AM, Marc Sturm <mas9161@nyp.org> wrote:

> Hi,
>
> I am new to hadoop and sqoop. So far I was able to run a single node
> hadoop cluster on my mac and I am trying to load data from sql server using
> sqoop 1.3 and Microsoft’s sqoop connector.
>
> The data is stored as varbinary column (though it is text blob) and I am
> loading it into hadoop with sqoop using the --as-sequencefile option. I
> believe the result in the hdfs file is a serialization of the java object
> of a class generated automatically by sqoop, the class name is the table
> name and extends SqoopRecord: let’s call it table_name.java . This was done
> successfully.
>
> Now, I am trying to run a MapReduce job against this file but it is
> failing, I added the class table_name.java in my jar. But when I run the
> mapreduce job, I get “ClassNotFoundException:
> com.cloudera.sqoop.lib.SqoopRecord”.
>
> Even with the option -libjars sqoop-1.3.0.jar.
>
> I hope all this makes sense to you. If you can help me understand what the
> problem is or point me to the right documentation that would be great.
>
> Thanks,
>
> Marc
>
> ------------------------------
> This electronic message is intended to be for the use only of the named
> recipient, and may contain information that is confidential or privileged.
> If you are not the intended recipient, you are hereby notified that any
> disclosure, copying, distribution or use of the contents of this message is
> strictly prohibited. If you have received this message in error or are not
> the named recipient, please notify us immediately by contacting the sender
> at the electronic mail address noted above, and delete and destroy all
> copies of this message. Thank you.
