sqoop-dev mailing list archives

From Anna Szonyi <szo...@cloudera.com>
Subject Re: Import more than 10 million records from MySQL to HDFS
Date Tue, 14 Feb 2017 23:44:46 GMT
Hi Wenxing,

I tested the scenario with a simple table (10 million rows, but only 3
columns) and it seems to work perfectly for me (with super simple types:
int, varchar, timestamp).
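
For reference, my table was roughly along these lines; the name and columns
are illustrative, only the types and row count match what I describe above:

    CREATE TABLE simple_test (
        id           INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name         VARCHAR(64),
        updated_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );

    -- loaded with ~10 million generated rows, e.g. a scripted loop of:
    INSERT INTO simple_test (name) VALUES ('sample-row');
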
Could you please share the full create table statement (it doesn't have to
have the same column names, just the same types) and some sample inserts, so
we can see what causes the problem? It doesn't seem to be the row size; it
may have to do with the types or the number of columns.
Also, please share your link and job setup and the exception you got.

Thanks,
/Anna

On Mon, Jan 16, 2017 at 6:59 PM, wenxing zheng <wenxing.zheng@gmail.com>
wrote:

> Hi Szabolcs,
>
> Sorry for the late reply. From my test, it's OK for 1,000,000 rows.
>
> Thanks, Wenxing
>
> On Wed, Jan 11, 2017 at 12:42 AM, Szabolcs Vasas <vasas@cloudera.com>
> wrote:
>
> > Hi Wenxing,
> >
> > I have created a table based on the column information you sent but I
> > won't be able to do this testing in the next couple of days.
> > By the way, have you tried the import with smaller data sets? I mean,
> > have you tried to find the biggest data set you can import successfully?
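> >
> > A sketch of one way to bisect that, assuming a scratch copy of the table
> > is acceptable ('big_table' and the row count are illustrative):
> >
> >     CREATE TABLE big_table_subset AS
> >         SELECT * FROM big_table LIMIT 1000000;
> >
> > Then point the Sqoop job at big_table_subset and grow the LIMIT until
> > the import starts to fail.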
> >
> > Szabolcs
> >
> > On Wed, Jan 4, 2017 at 10:55 AM, wenxing zheng <wenxing.zheng@gmail.com>
> > wrote:
> >
> > > Hi Szabolcs,
> > >
> > > I am testing this scenario with our client's slave database, and I am
> > > sorry that I cannot share the table definition and the sample data
> > > here. But attached is a sample of the table definition with the column
> > > types.
> > >
> > > It's quite complex.
> > >
> > > Thanks, Wenxing
> > >
> > > On Wed, Jan 4, 2017 at 4:24 PM, Szabolcs Vasas <vasas@cloudera.com>
> > > wrote:
> > >
> > >> Hi Wenxing,
> > >>
> > >> I haven't tried this scenario yet but I would be happy to test it on
> > >> my side. Can you please send me the DDL statement for creating the
> > >> MySQL table and some sample data?
> > >> Also, it would be very helpful to send the details of the job you
> > >> would like to run.
> > >>
> > >> Regards,
> > >> Szabolcs
> > >>
> > >> On Wed, Jan 4, 2017 at 2:54 AM, wenxing zheng <wenxing.zheng@gmail.com>
> > >> wrote:
> > >>
> > >> > Can anyone help to advise?
> > >> >
> > >> > I also ran into a problem when I set the checkColumn to
> > >> > updated_time: currently all the updated_time values are NULL, and
> > >> > in this case Sqoop fails to start the job. I think we need to
> > >> > support this kind of case.
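> > >> >
> > >> > A sketch of why I think the NULLs matter, as far as I understand
> > >> > the incremental import (the table name and last value here are
> > >> > illustrative): the generated query is roughly of the form
> > >> >
> > >> >     SELECT * FROM my_table
> > >> >     WHERE updated_time > '2016-12-01 00:00:00';
> > >> >
> > >> > and in SQL a comparison against NULL is never true, so rows whose
> > >> > check column is NULL can never be selected.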
> > >> >
> > >> > On Thu, Dec 29, 2016 at 9:18 AM, wenxing zheng <wenxing.zheng@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Dear all,
> > >> > >
> > >> > > Has anyone already tried to import more than 10 million records
> > >> > > from MySQL to HDFS using Sqoop2?
> > >> > >
> > >> > > I always fail at the very beginning; I have tried various
> > >> > > throttling settings, but never made it work.
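> > >> > >
> > >> > > For completeness, the throttling settings I mean are the ones in
> > >> > > the job setup; from memory of the Sqoop2 shell (the exact prompts
> > >> > > and the values here are illustrative and vary by version):
> > >> > >
> > >> > >     sqoop:000> create job -f 1 -t 2
> > >> > >     ...
> > >> > >     Throttling resources
> > >> > >     Extractors: 10
> > >> > >     Loaders: 5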
> > >> > >
> > >> > > Any advice would be appreciated.
> > >> > > Thanks, Wenxing
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> Szabolcs Vasas
> > >> Software Engineer
> > >> <http://www.cloudera.com>
> > >>
> > >
> > >
> >
> >
> > --
> > Szabolcs Vasas
> > Software Engineer
> > <http://www.cloudera.com>
> >
>
