kudu-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zhen Zhang <zhqu...@gmail.com>
Subject Re: Scan timeout and slow sync call on TS side
Date Fri, 26 Jan 2018 09:01:31 GMT
​
​Attach iowait of all 32 cores. It seems only several cores' iowait is high.

2018-01-26 16:49 GMT+08:00 Zhen Zhang <zhquake@gmail.com>:

> Hi Kudu Users,
>
> We use Kudu1.3 and encountered the following exception yesterday:
>
> 2018-01-25,20:14:42,001 WARN org.apache.kudu.client.AsyncKuduScanner:
> 1042e82c70594e3aaca741b686aa91fa pretends to not know
> KuduScanner(table=b2c_i18n.kudu_i18n_web_mistat_data, tablet=
> 1042e82c70594e3aaca741b686aa91fa, scannerId="
> 14f926d1384f4d848f62a7e701d2d623", scanRequestTimeout=30000)
> org.apache.kudu.client.NonRecoverableException: Invalid call sequence ID
> in scan request
> at org.apache.kudu.client.TabletClient.dispatchTSErrorOrReturnExcepti
> on(TabletClient.java:518)
> at org.apache.kudu.client.TabletClient.messageReceived(
> TabletClient.java:450)
> at org.apache.kudu.client.shaded.org.jboss.netty.channel.
> SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.
> java:70)
> at org.apache.kudu.client.TabletClient.handleUpstream(
> TabletClient.java:595)
>
>
> The following seems to be the corresponding log on TS side:
>
> W0125 20:14:43.624604 69076 rpcz_store.cc:234] Call kudu.tserver.TabletServerService.Scan
> from 10.68.146.36:35782 (request call id 57) took 11635ms (client timeout
> 10000).
> W0125 20:14:43.624671 69076 rpcz_store.cc:238] Trace:
> 0125 20:14:31.989201 (+     0us) service_pool.cc:143] Inserting onto call
> queue
> 0125 20:14:31.989249 (+    48us) service_pool.cc:202] Handling call
> 0125 20:14:31.989532 (+   283us) tablet_service.cc:1669] Found scanner
> 14f926d1384f4d848f62a7e701d2d623
> 0125 20:14:43.621814 (+11632282us) tablet_service.cc:1728] Deadline
> expired - responding early
> 0125 20:14:43.624597 (+  2783us) inbound_call.cc:130] Queueing success
> response
> Metrics: {"cfile_cache_hit":21,"cfile_cache_hit_bytes":180754,"
> cfile_cache_miss":28,"cfile_cache_miss_bytes":780542,"lbm_
> read_time_us":11601328,"lbm_reads_1-10_ms":3,"lbm_reads_
> 10-100_ms":6,"lbm_reads_gt_100_ms":14,"lbm_reads_lt_1ms":
> 5,"spinlock_wait_cycles":21888,"tcmalloc_contention_cycles":220544}
>
>
> It seems kudu spend too much time processing the scan request. The CPU
> load then was about 7 ~ 10, and the machine has 36 cores.
>
> I also found some warning logs on the server side like:
>
> W0125 20:14:31.228502 68713 env_posix.cc:682] Time spent sync call for
> /home/work/app/kudu/azsgprc-xiaomi/kudu_tablet_server/hdd2/data/
> f24c68a1bd1a4ed9be6b4d612487ba42.metadata: real 1.799s   user 0.000s
>  sys 0.000s
> W0125 20:14:32.640887 68713 env_posix.cc:682] Time spent sync call for
> /home/work/app/kudu/azsgprc-xiaomi/kudu_tablet_server/hdd2/data/
> 6f29e46d493a40e3a3610e7cff8023b6.metadata: real 1.410s   user 0.000s
>  sys 0.000s
> W0125 20:14:34.116614 68716 env_posix.cc:682] Time spent sync call for
> /home/work/app/kudu/azsgprc-xiaomi/kudu_tablet_server/hdd2/tablet-meta/
> 21b28a6ca38548b7afab09324ecf559a.kudutmp.wVZgBz: real 5.792s      user
> 0.000s     sys 0.001s
> W0125 20:14:35.665977 68713 env_posix.cc:682] Time spent sync call for
> /home/work/app/kudu/azsgprc-xiaomi/kudu_tablet_server/hdd2/data/
> a1ce36a69a22427d802f7a856f1f6582.metadata: real 2.900s   user 0.000s
>  sys 0.000s
> W0125 20:14:39.278427 68713 env_posix.cc:682] Time spent sync call for
> /home/work/app/kudu/azsgprc-xiaomi/kudu_tablet_server/hdd2/data/
> 72f36af948684ca2a519e910f76ca91f.metadata: real 1.650s   user 0.000s
>  sys 0.000s
>
>
> Some sync call also seems slow, for example 21b28a6ca38548b7afab09324ecf559a
> takes more than 5 seconds to sync. I checked more about this tablet and
> found it has about 110GB on-disk data.
>
> I wonder if this huge tablet cause all the problems, including slow sync
> and slow scan, but these are tablets of two different tables, and the cpu
> load seems not high, so I'm not sure about it. What's your idea?
>
> Thanks,
> Zhen
>
>

Mime
View raw message