spark-user mailing list archives

From jane thorpe <janethor...@aol.com.INVALID>
Subject Re: Spark hangs while reading from jdbc - does nothing Removing Guess work from trouble shooting
Date Tue, 14 Apr 2020 09:31:44 GMT
Here is another tool I use: Logic Analyser (7:55)
https://youtu.be/LnzuMJLZRdU

You could also take some suggestions for improving query performance.
https://dzone.com/articles/why-you-should-not-use-select-in-sql-query-1
  
Jane thorpe
janethorpe1@aol.com
 
 
-----Original Message-----
From: jane thorpe <janethorpe1@aol.com.INVALID>
To: janethorpe1 <janethorpe1@aol.com>; mich.talebzadeh <mich.talebzadeh@gmail.com>;
liruijing09 <liruijing09@gmail.com>; user <user@spark.apache.org>
Sent: Mon, 13 Apr 2020 8:32
Subject: Re: Spark hangs while reading from jdbc - does nothing Removing Guess work from trouble
shooting



This tool may be useful for troubleshooting your problems.

https://www.javacodegeeks.com/2020/04/simplifying-apm-remove-the-guesswork-from-troubleshooting.html

"APM tools typically use a waterfall-type view to show the blocking time of different components
cascading through the control flow within an application. These types of visualizations are
useful, and AppOptics has them, but they can be difficult to understand for those of us without
a PhD."
It is especially helpful if you want to understand through visualisation and you do not have a
PhD.

 
Jane thorpe
janethorpe1@aol.com
 
 
-----Original Message-----
From: jane thorpe <janethorpe1@aol.com.INVALID>
To: mich.talebzadeh <mich.talebzadeh@gmail.com>; liruijing09 <liruijing09@gmail.com>;
user <user@spark.apache.org>
CC: user <user@spark.apache.org>
Sent: Sun, 12 Apr 2020 4:35
Subject: Re: Spark hangs while reading from jdbc - does nothing

You seem to be implying the error is intermittent.
You also seem to be implying data is being ingested via JDBC, so the connection has proven itself
to be working, unless no data is arriving from the JDBC channel at all. If no data is arriving,
then the problem could be the JDBC connection. If the error is intermittent, then it is likely
that a resource involved in processing is filling to capacity. Try reducing the data ingestion
volume and see if that completes, then increase the data ingested incrementally. I assume
you have run the job on a small amount of data, so you have completed your prototype stage
successfully.
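
A minimal sketch of the "start small, then increase" idea, in case it helps. It only builds the
strings you would pass as the table argument of a JDBC read; it does not touch Spark or Oracle
itself. The helper names limited_table and ramp_up, the alias t, and the table name
MY_SCHEMA.MY_TABLE are made up for illustration and are not from the original job. It assumes an
Oracle source, where ROWNUM caps the rows returned.

```python
from typing import List


def limited_table(table: str, max_rows: int) -> str:
    """Build an Oracle subquery that returns at most max_rows rows.

    The result is meant to be passed where a JDBC table name is expected,
    so the database caps the volume before Spark ingests it.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # ROWNUM is Oracle-specific; "t" is an arbitrary alias for the subquery.
    return f"(SELECT * FROM {table} WHERE ROWNUM <= {max_rows}) t"


def ramp_up(start: int, ceiling: int, factor: int = 10) -> List[int]:
    """Row limits growing by `factor` from start, ending exactly at ceiling."""
    if start <= 0 or ceiling < start or factor < 2:
        raise ValueError("need 0 < start <= ceiling and factor >= 2")
    limits = []
    n = start
    while n < ceiling:
        limits.append(n)
        n *= factor
    limits.append(ceiling)
    return limits


# Hypothetical usage with PySpark (url and props are assumed to exist already):
# for n in ramp_up(1000, 1000000):
#     df = spark.read.jdbc(url, limited_table("MY_SCHEMA.MY_TABLE", n), properties=props)
#     print(n, df.count())  # note the limit at which the job first stalls
```

If the job completes at the small limits and stalls only at the larger ones, that points towards
a resource filling up rather than the JDBC connection itself.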

On Saturday, 11 April 2020 Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
Hi,
Have you checked your JDBC connections from Spark to Oracle? What is Oracle saying? Is it
doing anything or hanging?
set pagesize 9999
set linesize 140
set heading off
select SUBSTR(name,1,8) || ' sessions as on '||TO_CHAR(CURRENT_DATE, 'MON DD YYYY HH:MI AM')
from v$database;
set heading on
column spid heading "OS PID" format a6
column process format a13 heading "Client ProcID"
column username  format a15
column sid       format 999
column serial#   format 99999
column STATUS    format a3 HEADING 'ACT'
column last      format 9,999.99
column TotGets   format 999,999,999,999 HEADING 'Logical I/O'
column phyRds    format 999,999,999 HEADING 'Physical I/O'
column total_memory format 999,999,999 HEADING 'MEM/KB'
--
SELECT
          substr(a.username,1,15) "LOGIN"
        , substr(a.sid,1,5) || ','||substr(a.serial#,1,5) AS "SID/serial#"
        , TO_CHAR(a.logon_time, 'DD/MM HH:MI') "LOGGED IN SINCE"
        , substr(a.machine,1,10) HOST
        , substr(p.username,1,8)||'/'||substr(p.spid,1,5) "OS PID"
        , substr(a.osuser,1,8)||'/'||substr(a.process,1,5) "Client PID"
        , substr(a.program,1,15) PROGRAM
        --,ROUND((CURRENT_DATE-a.logon_time)*24) AS "Logged/Hours"
        , (
                select round(sum(ss.value)/1024) from v$sesstat ss, v$statname sn
                where ss.sid = a.sid and
                        sn.statistic# = ss.statistic# and
                        -- sn.name in ('session pga memory')
                        sn.name in ('session pga memory','session uga memory')
          ) AS total_memory
        , (b.block_gets + b.consistent_gets) TotGets
        , b.physical_reads phyRds
        , decode(a.status, 'ACTIVE', 'Y','INACTIVE', 'N') STATUS
        , CASE WHEN a.sid in (select sid from v$mystat where rownum = 1) THEN '<-- YOU' ELSE ' ' END "INFO"
FROM
         v$process p
        ,v$session a
        ,v$sess_io b
WHERE
a.paddr = p.addr
AND p.background IS NULL
--AND  a.sid NOT IN (select sid from v$mystat where rownum = 1)
AND a.sid = b.sid
AND a.username is not null
--AND (a.last_call_et < 3600 or a.status = 'ACTIVE')
--AND CURRENT_DATE - logon_time > 0
--AND a.sid NOT IN ( select sid from v$mystat where rownum=1)  -- exclude me
--AND (b.block_gets + b.consistent_gets) > 0
ORDER BY a.username;
exit

HTH

Dr Mich Talebzadeh LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com
Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or
destruction of data or any other property which may arise from relying on this email's technical content
is explicitly disclaimed. The author will in no case be liable for any monetary damages arising
from such loss, damage or destruction.

On Fri, 10 Apr 2020 at 17:37, Ruijing Li <liruijing09@gmail.com> wrote:

Hi all,
I am on Spark 2.4.4 with Scala 2.11.12, running in cluster mode on Mesos. I am ingesting
from an Oracle database using spark.read.jdbc. I am seeing a strange issue where Spark just
hangs and does nothing, not starting any new tasks. Normally this job finishes in 30 stages,
but sometimes it stops at 29 completed stages and doesn't start the last stage. The Spark
job is idling and there is no pending or active task. What could be the problem? Thanks.
--
Cheers,
Ruijing Li
