sqoop-user mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: Sqoop not picking up immediate changes
Date Fri, 15 Nov 2013 18:37:33 GMT
Hi sir,
I'm glad to hear that everything is working for you! The --split-by parameter is used to generate
splits, so it does not have to be a primary key. Any column that has sufficient
key coverage should be fine. For a composite primary key, you can use the first column of
the key.
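
As a rough sketch of the composite-key case, assume a hypothetical table `orders` whose primary key is (`order_id`, `line_no`). The table, the column names, the connection string, the account, and the target directory are all placeholders and do not come from this thread. The snippet only assembles and prints the command, so it can be reviewed before it is run against a real database.

```shell
# Hypothetical table with composite primary key (order_id, line_no).
# --split-by takes a single column, so only the first key column is passed.
TABLE="orders"
SPLIT_COL="order_id"

# Placeholder connection details; replace before running for real.
CONNECT="jdbc:oracle:thin:@//dbhost:1521/service"
DB_USER="db_user"

# Assemble the import command. -P prompts for the password on the console.
CMD="sqoop import --connect $CONNECT --username $DB_USER -P --table $TABLE --split-by $SPLIT_COL --target-dir /user/hypothetical/orders"

# Print the command instead of executing it (dry run).
echo "$CMD"
```

If the first key column has only a few distinct values, the splits may come out uneven, and another column with better coverage is probably the safer choice.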

Jarcec

On Thu, Nov 14, 2013 at 09:37:26PM -0800, burberry blues wrote:
> I got the solution. I have explicitly committed the changes in the Oracle
> database. Sqoop then identified the updates and wrote them to the
> filesystem location. I used the Sqoop merge command, which replaced
> the old file with the new file and removed the duplicates based on the
> primary key.
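
For reference, a merge along these lines might look roughly like the sketch below. It assumes col1 is the primary key, as in the imports quoted further down. The directory paths, the jar file, and the class name are hypothetical placeholders; the jar and class are the ones produced by Sqoop's code generation for the table. The snippet prints the command rather than running it.

```shell
# Output directories of the two imports (placeholder paths).
OLD_DIR="/user/hypothetical/outputdir"     # first import
NEW_DIR="/user/hypothetical/outputdir1"    # incremental import
MERGED_DIR="/user/hypothetical/merged"     # where the merged result is written

# Record class and jar from Sqoop's code generation (names assumed here).
JAR_FILE="table1.jar"
CLASS_NAME="table1"

# Rows in NEW_DIR take precedence over rows in OLD_DIR with the same col1.
CMD="sqoop merge --new-data $NEW_DIR --onto $OLD_DIR --target-dir $MERGED_DIR --jar-file $JAR_FILE --class-name $CLASS_NAME --merge-key col1"

# Print the command instead of executing it (dry run).
echo "$CMD"
```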
> 
> But one question here: the --split-by parameter in Sqoop accepts only one
> key column. What if I have a combination of columns as the primary key? I am
> getting an error when giving multiple fields in the --split-by parameter.
> Please clarify.
> 
> 
> On Thu, Nov 14, 2013 at 1:13 AM, Anas Mosaad <AMOSAAD@eg.ibm.com> wrote:
> 
> > Hi all,
> >
> > I'm not experienced with Sqoop but I'm trying to help. Is it possible to
> > see the SQL statements executed by Sqoop? I believe that if the statements are
> > logged anywhere, Blues will be able to pinpoint the issue.
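
One option that may help here is Sqoop's `--verbose` flag, which prints more information while the tool works; whether every generated SQL statement appears in that output is an assumption to verify on your version. The sketch below reuses the table and column names from this thread, while the connection string, account, and target directory are hypothetical placeholders. It prints the command rather than running it.

```shell
# Placeholder connection details; replace before running for real.
CONNECT="jdbc:oracle:thin:@//dbhost:1521/service"
DB_USER="db_user"

# --verbose asks Sqoop to print more information while it works.
# -P prompts for the password on the console.
CMD="sqoop import --verbose --connect $CONNECT --username $DB_USER -P --table table1 --incremental lastmodified --check-column col3 --last-value '2013-11-09 00.00.00.0' --split-by col1 --target-dir /user/hypothetical/outputdir1"

# Print the command instead of executing it (dry run).
echo "$CMD"
```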
> >
> >
> > Best Regards
> > Anas Mosaad
> >
> >
> >
> > From:        burberry blues <bluesburberry@gmail.com>
> > To:        user@sqoop.apache.org,
> > Date:        11/13/2013 07:02 PM
> > Subject:        Re: Sqoop not picking up immediate changes
> > ------------------------------
> >
> >
> >
> > Hi Jarek,
> > Initially, in the DB I have
> >
> > col1 col2 col3
> > 1        a    08-NOV-2013
> > 2        b    08-NOV-2013
> > 3        c    08-NOV-2013
> >
> > First time Sqoop import command
> > ===============================
> > sqoop import --connect jdbc:oracle:thin:@//url:driver/database --username <username>
> > --password <password> --table table1 --columns col1,col2,col3
> > --incremental lastmodified --check-column col3
> > --last-value "2013-11-07 00.00.00.0" --split-by col1 --target-dir <outputdir>
> >
> > When I ran the above Sqoop import, I successfully got all 3
> > records.
> >
> >
> > Now I made 2 updates in the DB:
> >
> > col1 col2 col3
> > 1        d    10-NOV-2013
> > 2        e    10-NOV-2013
> > 3        c    08-NOV-2013
> >
> > Second time Sqoop Command
> > ========================
> > I read that Sqoop currently cannot merge updated records during import, so
> > I am trying to import the updates into a new directory and then use "sqoop
> > merge" to merge the new output with the previous import output.
> >
> > So the command I ran is
> >
> > sqoop import --connect jdbc:oracle:thin:@//url:driver/database --username <username>
> > --password <password> --table table1 --columns col1,col2,col3
> > --incremental lastmodified --check-column col3
> > --last-value "2013-11-09 00.00.00.0" --split-by col1 --target-dir <outputdir1>
> >
> > This time, according to the updates, I should get the records with col1 values
> > 1 and 2, as they were updated.
> > But the second Sqoop import returned zero records in the output. (Even during the job
> > execution it reports map input records or reduce output records as 0.)
> >
> > The changes are present in the DB (I checked them by running
> > a select * query in the DB), so why can't Sqoop find them? It seems like Sqoop
> > didn't find any updates since 9th Nov. Please assist me with this issue.
> >
> > Thanks,
> > Blues.
> >
> >
> >
> >
> > On Wed, Nov 13, 2013 at 8:32 AM, Jarek Jarcec Cecho <jarcec@apache.org>
> > wrote:
> > Hi Blues,
> > would you mind sharing details about your use case? Table schemas, exact
> > commands (both on database and in command line) and associated logs?
> >
> > Wild guess - when you are changing the rows in the database, are you
> > committing the ongoing transaction? Sqoop will create a new connection with
> > new transaction, so due to ACID it won't pick up any uncommitted changes.
> >
> > Jarcec
> >
> > On Tue, Nov 12, 2013 at 10:36:10PM -0800, burberry blues wrote:
> > > Hi Team,
> > >
> > > I am having a problem with the following scenario.
> > >
> > > In the DB I update column1 of a row, and column2 gets set to the
> > > current timestamp.
> > > But when I try to import those changes through Sqoop using --incremental
> > > lastmodified --check-column column2 --last-value <less than current
> > > date>, it shows 0 records imported, even though rows were changed.
> > >
> > > There are changes in the DB, but Sqoop works as if it couldn't find the
> > > updated ones and is still pointing to the old records.
> > >
> > > I.e., before updating I have 3 records with the date 10th Nov. I asked Sqoop to
> > > import records after 9th Nov. It imports all 3 records.
> > > Now I change 1 row and its date is updated to 12th Nov. Immediately I ask Sqoop to
> > > import records after 11th Nov, but it imports 0 records. If I run the
> > > same import with the date as 9th Nov again, it works fine and also gives me
> > > duplicate records.
> > >
> > > Please help me with this issue at the earliest.
> > >
> > > Thanks,
> > > Blues
> >
> >
