spark-user mailing list archives

From Gabor Somogyi <gabor.g.somo...@gmail.com>
Subject Re: Spark hangs while reading from jdbc - does nothing Removing Guess work from trouble shooting
Date Tue, 14 Apr 2020 11:48:49 GMT
The simplest way is to take a thread dump, which doesn't require any fancy tool
(it's available on the Spark UI).
Without a thread dump it's hard to say anything...
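
Once a dump has been captured, a quick summary of thread states can make it easier to see whether anything is still running or everything is parked. The following is only a minimal sketch: it assumes the dump was saved as text in the HotSpot `jstack` layout (a quoted thread name followed by a `java.lang.Thread.State:` line), and the sample dump and the function name `summarize_thread_dump` are made up for illustration.

```python
import re
from collections import Counter

# Matches the state line that follows each thread header in a HotSpot dump,
# e.g. "   java.lang.Thread.State: WAITING (parking)".
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)")
# Matches a thread header line, which starts with the quoted thread name.
HEADER_RE = re.compile(r'^"(?P<name>[^"]+)"')


def summarize_thread_dump(dump_text):
    """Return (Counter of states, {state: [thread names]}) for a text dump."""
    counts = Counter()
    by_state = {}
    current = None
    for line in dump_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.group("name")
            continue
        state = STATE_RE.match(line)
        if state and current is not None:
            counts[state.group(1)] += 1
            by_state.setdefault(state.group(1), []).append(current)
            current = None
    return counts, by_state


# Hypothetical, hand-written sample; a real dump would come from the
# Spark UI or from running jstack against the executor or driver JVM.
SAMPLE_DUMP = '''\
"main" #1 prio=5 os_prio=0 tid=0x01 nid=0x1 waiting on condition [0x0]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)

"Executor task launch worker for task 42" #57 daemon prio=5 os_prio=0 tid=0x02 nid=0x2 runnable [0x0]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)

"dispatcher-event-loop-0" #20 daemon prio=5 os_prio=0 tid=0x03 nid=0x3 waiting on condition [0x0]
   java.lang.Thread.State: WAITING (parking)
'''

counts, by_state = summarize_thread_dump(SAMPLE_DUMP)
for state, n in counts.most_common():
    print(state, n, by_state[state])
```

A thread that stays RUNNABLE inside a socket read across several dumps would be one hint that the JVM is still waiting on the database, but that reading is an interpretation, not something the script decides.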


On Tue, Apr 14, 2020 at 11:32 AM jane thorpe <janethorpe1@aol.com.invalid>
wrote:

> Here is another tool I use: Logic Analyser (7:55)
> https://youtu.be/LnzuMJLZRdU
>
> You could take some suggestions for improving query performance.
> https://dzone.com/articles/why-you-should-not-use-select-in-sql-query-1
>
>
> Jane thorpe
> janethorpe1@aol.com
>
>
> -----Original Message-----
> From: jane thorpe <janethorpe1@aol.com.INVALID>
> To: janethorpe1 <janethorpe1@aol.com>; mich.talebzadeh <
> mich.talebzadeh@gmail.com>; liruijing09 <liruijing09@gmail.com>; user <
> user@spark.apache.org>
> Sent: Mon, 13 Apr 2020 8:32
> Subject: Re: Spark hangs while reading from jdbc - does nothing Removing
> Guess work from trouble shooting
>
>
>
> This tool may be useful for troubleshooting your problems.
>
>
> https://www.javacodegeeks.com/2020/04/simplifying-apm-remove-the-guesswork-from-troubleshooting.html
>
>
> "APM tools typically use a waterfall-type view to show the blocking time
> of different components cascading through the control flow within an
> application.
> These types of visualizations are useful, and AppOptics has them, but they
> can be difficult to understand for those of us without a PhD."
>
> Especially helpful if you want to understand through visualisation and
> you do not have a PhD.
>
>
> Jane thorpe
> janethorpe1@aol.com
>
>
> -----Original Message-----
> From: jane thorpe <janethorpe1@aol.com.INVALID>
> To: mich.talebzadeh <mich.talebzadeh@gmail.com>; liruijing09 <
> liruijing09@gmail.com>; user <user@spark.apache.org>
> CC: user <user@spark.apache.org>
> Sent: Sun, 12 Apr 2020 4:35
> Subject: Re: Spark hangs while reading from jdbc - does nothing
>
> You seem to be implying the error is intermittent.
> You also seem to be implying data is being ingested via JDBC, so the
> connection has proven itself to be working, unless no data is arriving from
> the JDBC channel at all. If no data is arriving, then the problem could be
> the JDBC connection.
> If the error is intermittent, then it is likely that a resource involved in
> processing is filling to capacity.
> Try reducing the data ingestion volume and see if the job completes, then
> increase the data ingested incrementally.
> I assume you have run the job on a small amount of data, so you have
> completed your prototype stage successfully.
>
> ------------------------------
> On Saturday, 11 April 2020 Mich Talebzadeh <mich.talebzadeh@gmail.com>
> wrote:
> Hi,
>
> Have you checked your JDBC connections from Spark to Oracle? What is
> Oracle saying? Is it doing anything or hanging?
>
> set pagesize 9999
> set linesize 140
> set heading off
> select SUBSTR(name,1,8) || ' sessions as on '||TO_CHAR(CURRENT_DATE, 'MON DD YYYY HH:MI AM') from v$database;
> set heading on
> column spid heading "OS PID" format a6
> column process format a13 heading "Client ProcID"
> column username  format a15
> column sid       format 999
> column serial#   format 99999
> column STATUS    format a3 HEADING 'ACT'
> column last      format 9,999.99
> column TotGets   format 999,999,999,999 HEADING 'Logical I/O'
> column phyRds    format 999,999,999 HEADING 'Physical I/O'
> column total_memory format 999,999,999 HEADING 'MEM/KB'
> --
> SELECT
>           substr(a.username,1,15) "LOGIN"
>         , substr(a.sid,1,5) || ','||substr(a.serial#,1,5) AS "SID/serial#"
>         , TO_CHAR(a.logon_time, 'DD/MM HH:MI') "LOGGED IN SINCE"
>         , substr(a.machine,1,10) HOST
>         , substr(p.username,1,8)||'/'||substr(p.spid,1,5) "OS PID"
>         , substr(a.osuser,1,8)||'/'||substr(a.process,1,5) "Client PID"
>         , substr(a.program,1,15) PROGRAM
>         --,ROUND((CURRENT_DATE-a.logon_time)*24) AS "Logged/Hours"
>         , (
>                 select round(sum(ss.value)/1024) from v$sesstat ss, v$statname sn
>                 where ss.sid = a.sid and
>                         sn.statistic# = ss.statistic# and
>                         -- sn.name in ('session pga memory')
>                         sn.name in ('session pga memory','session uga memory')
>           ) AS total_memory
>         , (b.block_gets + b.consistent_gets) TotGets
>         , b.physical_reads phyRds
>         , decode(a.status, 'ACTIVE', 'Y','INACTIVE', 'N') STATUS
>         , CASE WHEN a.sid in (select sid from v$mystat where rownum = 1) THEN '<-- YOU' ELSE ' ' END "INFO"
> FROM
>          v$process p
>         ,v$session a
>         ,v$sess_io b
> WHERE
> a.paddr = p.addr
> AND p.background IS NULL
> --AND  a.sid NOT IN (select sid from v$mystat where rownum = 1)
> AND a.sid = b.sid
> AND a.username is not null
> --AND (a.last_call_et < 3600 or a.status = 'ACTIVE')
> --AND CURRENT_DATE - logon_time > 0
> --AND a.sid NOT IN ( select sid from v$mystat where rownum=1)  -- exclude me
> --AND (b.block_gets + b.consistent_gets) > 0
> ORDER BY a.username;
> exit
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On Fri, 10 Apr 2020 at 17:37, Ruijing Li <liruijing09@gmail.com> wrote:
>
> Hi all,
>
> I am on Spark 2.4.4 with Scala 2.11.12, running in cluster mode on
> Mesos. I am ingesting from an Oracle database using spark.read.jdbc. I am
> seeing a strange issue where Spark just hangs and does nothing, not
> starting any new tasks. Normally this job finishes in 30 stages, but
> sometimes it stops at 29 completed stages and doesn’t start the last stage.
> The Spark job is idling and there are no pending or active tasks. What could
> be the problem? Thanks.
> --
> Cheers,
> Ruijing Li
>
>

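The suggestion earlier in this thread, to reduce the ingestion volume and then increase it incrementally, could be approached by slicing the table on a numeric key and reading a growing number of slices per run. This is a rough sketch only: the column name `ID`, the bounds, and the helper `range_predicates` are hypothetical, and it assumes the table has a numeric key suitable for range filters.

```python
def range_predicates(column, lower, upper, step):
    """Build non-overlapping WHERE clauses covering [lower, upper)."""
    if step <= 0:
        raise ValueError("step must be positive")
    predicates = []
    start = lower
    while start < upper:
        end = min(start + step, upper)
        predicates.append(f"{column} >= {start} AND {column} < {end}")
        start = end
    return predicates


# Hypothetical numeric key column and bounds; substitute real ones.
slices = range_predicates("ID", 0, 1_000_000, 250_000)

# Grow the ingested volume one slice at a time: first 1 slice, then 2, ...
for n in range(1, len(slices) + 1):
    batch = slices[:n]
    print(f"run {n}: {len(batch)} slice(s), last = {batch[-1]}")
    # With PySpark available, each batch could be passed as the `predicates`
    # argument of spark.read.jdbc(url, table, predicates=batch, properties=props),
    # where url, table and props are placeholders for the real connection details.
```

If the job completes for the first few runs and stalls only beyond a certain number of slices, that would point toward a capacity limit rather than the JDBC connection itself.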