sqoop-user mailing list archives

From Anas Mosaad <AMOS...@eg.ibm.com>
Subject Re: Sqoop not picking up immediate changes
Date Thu, 14 Nov 2013 09:13:36 GMT
Hi all,

I'm not experienced with Sqoop, but I'm trying to help. Is it possible to 
see the SQL statements executed by Sqoop? I believe that if the statements 
are logged anywhere, Blues will be able to pinpoint the issue. 
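
One possible way to surface those statements, as a minimal sketch: Sqoop accepts a `--verbose` option that prints more information while it works, and the resulting log usually includes the queries it builds for the check column (the exact wording of those log lines is an assumption and may differ between versions). The helper name `import_verbose` and its four arguments are hypothetical; the table and column names are taken from the commands quoted later in this thread.

```shell
# Hypothetical helper: rerun the incremental import with extra logging.
# Arguments (all placeholders): JDBC URL, username, password, target directory.
import_verbose() {
  sqoop import --verbose \
    --connect "$1" --username "$2" --password "$3" \
    --table table1 --columns col1,col2,col3 \
    --incremental lastmodified --check-column col3 \
    --last-value "2013-11-09" \
    --split-by col1 --target-dir "$4"
}
```

Calling the helper with the real connection details and piping its output through `tee` keeps a copy of the log, which can then be searched for the lines that mention col3.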

Best Regards 
Anas Mosaad

From:   burberry blues <bluesburberry@gmail.com>
To:     user@sqoop.apache.org, 
Date:   11/13/2013 07:02 PM
Subject:        Re: Sqoop not picking up immediate changes

Hi Jarek,
Initially in the DB I have 

col1  col2  col3
1     a     08-NOV-2013
2     b     08-NOV-2013
3     c     08-NOV-2013

First Sqoop import command:
sqoop import \
  --connect jdbc:oracle:thin:@//url:driver/database \
  --username <username> --password <password> \
  --table table1 --columns col1,col2,col3 \
  --incremental lastmodified --check-column col3 \
  --last-value "2013-11-07" \
  --split-by col1 --target-dir <outputdir>

When I ran the above Sqoop import, I successfully got all 3 records.

Then I made 2 updates in the DB:

col1  col2  col3
1     d     10-NOV-2013
2     e     10-NOV-2013
3     c     08-NOV-2013

Second Sqoop command
I read that Sqoop is currently unable to merge updated records, so 
I am trying to import the updates into a new directory and then use "sqoop 
merge" to merge this new output with the previous import output.

So the command I ran is 

sqoop import \
  --connect jdbc:oracle:thin:@//url:driver/database \
  --username <username> --password <password> \
  --table table1 --columns col1,col2,col3 \
  --incremental lastmodified --check-column col3 \
  --last-value "2013-11-09" \
  --split-by col1 --target-dir <outputdir1>
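
For reference, the later merge step described above might look roughly like the sketch below. It assumes col1 uniquely identifies a row (the thread never says so), and that the jar file and record class generated by the earlier import are available. The helper name `merge_updates` and all five arguments are hypothetical placeholders.

```shell
# Hypothetical helper: fold the newer import directory onto the older one.
# Arguments (all placeholders): new-data dir, older dir, merged output dir,
# jar produced by the import's code generation, record class name in that jar.
merge_updates() {
  sqoop merge \
    --new-data "$1" --onto "$2" --target-dir "$3" \
    --jar-file "$4" --class-name "$5" \
    --merge-key col1   # assumption: col1 is the row key
}
```

With a merge key, rows from the newer directory take precedence over rows with the same key in the older one, which would avoid keeping both the old and the updated version of a record.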

This time, according to the updates, I should get the records with col1 
values 1 and 2, since they were updated.
But the second Sqoop import returns zero records. (Even during job 
execution it reports map input records and reduce output records as 0.)

Even though the changes are in the DB (I checked by running a select * 
query in the DB), why can't Sqoop find them? It seems Sqoop didn't find 
any updates from 9th Nov onward. Please assist me with this issue.
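
One way to narrow this down, as a rough sketch: run the same kind of check through Sqoop's own JDBC connection with `sqoop eval`, so the result reflects what a separate session can see rather than what the session that made the updates sees. The query assumes col3 is an Oracle DATE or TIMESTAMP column, which the thread does not confirm. The helper name `check_visible_rows` and its arguments are hypothetical.

```shell
# Hypothetical helper: list rows changed on or after 9 Nov 2013 as seen by
# a fresh connection. Arguments (placeholders): JDBC URL, username, password.
check_visible_rows() {
  sqoop eval \
    --connect "$1" --username "$2" --password "$3" \
    --query "SELECT col1, col2, col3 FROM table1 WHERE col3 >= TO_DATE('2013-11-09', 'YYYY-MM-DD')"
}
```

If this returns no rows while the same query in the original database session does, the updates are probably not committed yet. If it fails with a type error, col3 may be stored as text rather than as a date.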


On Wed, Nov 13, 2013 at 8:32 AM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:
Hi Blues,
would you mind sharing details about your use case? Table schemas, exact 
commands (both on database and in command line) and associated logs?

Wild guess - when you are changing the rows in the database, are you 
committing the ongoing transaction? Sqoop will create a new connection 
with new transaction, so due to ACID it won't pick up any uncommitted 


On Tue, Nov 12, 2013 at 10:36:10PM -0800, burberry blues wrote:
> Hi Team,
> I am having a problem with the following scenario.
> In the DB I update column1 of a row, and column2 gets set to the
> current timestamp.
> But when I try to import those changes through Sqoop using --incremental
> lastmodified --check-column column2 --last-value <less than current
> date>, it shows 0 records imported, even though rows have changed.
> There are changes in the DB, but Sqoop works as if it couldn't find the
> updated ones and is still pointing to the old records.
> I.e., before updating I have 3 records dated 10th Nov; I asked Sqoop to
> import records after 9th Nov. It imports all 3 records.
> Now I change 1 row and its date is updated to 12th Nov. Immediately I ask
> Sqoop to import records after 11th Nov, but it imports 0 records. If I run
> the same import with the date as 9th Nov again, it works fine and also
> *gives me duplicate records*.
> Please help me with this issue at the earliest.
> Thanks,
> Blues
