sqoop-user mailing list archives

From "Marc Sturm" <mas9...@nyp.org>
Subject RE: loading as sequencefile and running an hadoop mapreduce job
Date Thu, 12 Apr 2012 20:03:19 GMT
Thanks for your help, Cheolsoo. I added the Sqoop jar to Hadoop's lib dir and now my MR job
runs fine.
Marc
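
[Editor's sketch, not part of the original message.] The fix above works by making the jar that contains SqoopRecord visible on Hadoop's classpath. A quick way to check which jars in a directory actually contain that class is sketched below in plain Java. The class name `FindClassInJars`, the helper `jarsContaining`, and the directory argument are hypothetical names chosen for illustration; the location of Hadoop's lib dir is site-specific and is not given in the thread.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

public class FindClassInJars {
    // Hypothetical helper: returns the jars directly under libDir
    // that contain a .class entry for the given fully qualified class name.
    static List<Path> jarsContaining(Path libDir, String className) throws IOException {
        String entry = className.replace('.', '/') + ".class";
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jf = new JarFile(jar.toFile())) {
                    if (jf.getEntry(entry) != null) {
                        hits.add(jar);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the directory to scan, e.g. Hadoop's lib dir (assumed, site-specific).
        Path libDir = Paths.get(args.length > 0 ? args[0] : ".");
        String cls = "com.cloudera.sqoop.lib.SqoopRecord";
        System.out.println(jarsContaining(libDir, cls));
    }
}
```

An empty list for the directory that the job's JVMs load jars from would be consistent with the ClassNotFoundException reported in this thread.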

From: Cheolsoo Park [mailto:cheolsoo@cloudera.com]
Sent: Wednesday, April 11, 2012 4:06 PM
To: user@sqoop.apache.org
Subject: Re: loading as sequencefile and running an hadoop mapreduce job

Hi Marc,

I was totally wrong about sequence files in my previous email. In fact, I realized that SqoopRecord
is needed by MR jobs to deserialize sequence files. Again, I am sorry for the confusion.

Thanks,
Cheolsoo
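
[Editor's sketch, not part of the original message.] A sequence file records the names of its key and value classes in its header, and a reader has to load those classes by name before it can deserialize any record. The generated table class extends SqoopRecord, so both must be loadable by the MR job. The plain-Java sketch below only imitates that lookup step with `Class.forName`; it is a simplified illustration, not Hadoop's actual reader code, and `HeaderClassLookup` and `resolve` are hypothetical names.

```java
public class HeaderClassLookup {
    // Hypothetical helper: mimics a reader that must load the class
    // named in a file header before it can deserialize records.
    static String resolve(String classNameFromHeader) {
        try {
            Class<?> c = Class.forName(classNameFromHeader);
            return "loaded " + c.getName();
        } catch (ClassNotFoundException e) {
            // Same failure mode as in the thread: the class is not on the classpath.
            return "ClassNotFoundException: " + classNameFromHeader;
        }
    }

    public static void main(String[] args) {
        // A class that is always available.
        System.out.println(resolve("java.lang.String"));
        // Fails unless the Sqoop jar is on the classpath of this JVM.
        System.out.println(resolve("com.cloudera.sqoop.lib.SqoopRecord"));
    }
}
```

Run without the Sqoop jar on the classpath, the second lookup takes the catch branch, which mirrors the error Marc reported.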
On Wed, Apr 11, 2012 at 11:52 AM, Cheolsoo Park <cheolsoo@cloudera.com> wrote:
Hi Marc,

It would be helpful if you could provide the complete log with the --verbose option on.

> I believe the result in the hdfs file is a serialization of the java object of a class generated
> automatically by sqoop, the class name is the table name and extends SqoopRecord: let's call
> it table_name.java .

A serialization of 'table_name' is not the result.  The auto-generated Java class is only
for Sqoop to interface with the DB. The result is sequence files that contain data.

> Now, I am trying to run a MapReduce job against this file but it is failing, I added the class
> table_name.java in my jar. But when I run the mapreduce job, I get "ClassNotFoundException:
> com.cloudera.sqoop.lib.SqoopRecord". Even with the option -libjars sqoop-1.3.0.jar.

I am not clear what MR jobs you're running here.

1) If you're importing data, I am wondering why you have to do this manually since it should
be automatically done by Sqoop: compile table_name into a jar, load the jar into hdfs, pass
the path to the jar to import mapper jobs, etc.

2) If you're running your own MR jobs on imported data, they don't need to know about 'table_name'
or 'SqoopRecord' since data are already in sequence file format, so your MR jobs should be
able to understand them.

Hope this is helpful.

Thanks,
Cheolsoo

On Wed, Apr 11, 2012 at 9:28 AM, Marc Sturm <mas9161@nyp.org> wrote:
Hi,

I am new to Hadoop and Sqoop. So far I was able to run a single-node Hadoop cluster on my
Mac, and I am trying to load data from SQL Server using Sqoop 1.3 and Microsoft's Sqoop connector.
The data is stored in a varbinary column (though it is a text blob) and I am loading it into Hadoop
with Sqoop using the --as-sequencefile option. I believe the result in the HDFS file is a
serialization of the Java object of a class generated automatically by Sqoop; the class name
is the table name and it extends SqoopRecord: let's call it table_name.java. This was done successfully.

Now, I am trying to run a MapReduce job against this file but it is failing. I added the class
table_name.java to my jar, but when I run the MapReduce job, I get "ClassNotFoundException:
com.cloudera.sqoop.lib.SqoopRecord", even with the option -libjars sqoop-1.3.0.jar.

I hope all this makes sense to you. If you can help me understand what the problem is or point
me to the right documentation that would be great.

Thanks,
Marc


________________________________
This electronic message is intended to be for the use only of the named recipient, and may
contain information that is confidential or privileged. If you are not the intended recipient,
you are hereby notified that any disclosure, copying, distribution or use of the contents
of this message is strictly prohibited. If you have received this message in error or are
not the named recipient, please notify us immediately by contacting the sender at the electronic
mail address noted above, and delete and destroy all copies of this message. Thank you.
