flink-issues mailing list archives

From "Flink Jira Bot (Jira)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-22086) Does it support rate-limiting on reading hive in flink 1.12?Read hive source lead to high loadaverage
Date Thu, 20 May 2021 10:54:03 GMT

     [ https://issues.apache.org/jira/browse/FLINK-22086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-22086:
-----------------------------------
    Labels: stale-major  (was: )

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community
manage its development. I see this issue has been marked as Major but is unassigned, and neither
it nor its Sub-Tasks have been updated for 30 days. I have gone ahead and added a "stale-major"
label to the issue. If this ticket is a Major, please either assign yourself or give an update.
Afterwards, please remove the label; otherwise the issue will be deprioritized in 7 days.


> Does it support rate-limiting on reading hive in flink 1.12?Read hive source lead to high loadaverage
> -----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-22086
>                 URL: https://issues.apache.org/jira/browse/FLINK-22086
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / Hive
>    Affects Versions: 1.12.0
>            Reporter: ying zhang
>            Priority: Major
>              Labels: stale-major
>         Attachments: image-2021-04-01-15-56-50-736.png, image-2021-04-01-15-58-18-265.png, image-2021-04-01-16-07-28-469.png, image-2021-04-01-16-08-28-442.png
>
>
> I read a Hive source with Flink SQL in batch mode, but I hit an exception like this:
>  org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
>      at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:118)
>      at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:80)
>      at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:239)
>      at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:230)
>      at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:221)
>      at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:672)
>      at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:90)
>      at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:453)
>      at sun.reflect.GeneratedMethodAccessor312.invoke(Unknown Source)
>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>      at java.lang.reflect.Method.invoke(Method.java:498)
>      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:306)
>      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:213)
>      at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
>      at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:159)
>      at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>      at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>      at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>      at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>      at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>      at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>      at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>      at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>      at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>      at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>      at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>      at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>      at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>      at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>      at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>      at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>  Caused by: org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: readAddress(..) failed: Connection reset by peer (connection to '11.69.21.53/11.69.21.53:19977')
>      at org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.exceptionCaught(CreditBasedPartitionRequestClientHandler.java:201)
>      at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
>      at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
>      at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
>      at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1377)
>      at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
>      at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
>      at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:907)
>      at org.apache.flink.shaded.netty4.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.handleReadException(AbstractEpollStreamChannel.java:728)
>      at org.apache.flink.shaded.netty4.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:818)
>      at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)
>      at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
>      at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>      at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>      at java.lang.Thread.run(Thread.java:748)
>  Caused by: org.apache.flink.shaded.netty4.io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
>   
>   
>   
>  Then I checked the monitoring of the machine:
> !image-2021-04-01-16-07-28-469.png!
>  
>  
>  
>  
> I also captured the jstack output:
> !image-2021-04-01-15-58-18-265.png!
>  
>  
> "Source Data Fetcher for Source: HiveSource-app.app_sdl_yinliu_search_query_log (164/250)#0" Id=815 cpuUsage=99.96% deltaTime=201ms time=43270ms RUNNABLE
>  at sun.nio.cs.UTF_8$Decoder.decodeLoop(UTF_8.java:412)
>  at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:579)
>  at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:802)
>  at org.apache.hadoop.io.Text.decode(Text.java:412)
>  at org.apache.hadoop.io.Text.decode(Text.java:389)
>  at org.apache.hadoop.io.Text.toString(Text.java:280)
>  at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:46)
>  at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:26)
>  at org.apache.flink.table.functions.hive.conversion.HiveInspectors.toFlinkObject(HiveInspectors.java:291)
>  at org.apache.flink.table.functions.hive.conversion.HiveInspectors.toFlinkObject(HiveInspectors.java:338)
>  at org.apache.flink.connectors.hive.read.HiveMapredSplitReader.nextRecord(HiveMapredSplitReader.java:180)
>  at org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.nextRecord(HiveBulkFormatAdapter.java:336)
>  at org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.readBatch(HiveBulkFormatAdapter.java:319)
>  at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:67)
>  at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
>  at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138)
>  at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:101)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> "Source Data Fetcher for Source: HiveSource-app.app_sdl_yinliu_search_query_log (165/250)#0" Id=817 cpuUsage=99.96% deltaTime=201ms time=44024ms RUNNABLE
>  at org.apache.hadoop.io.Text.append(Text.java:236)
>  at org.apache.hadoop.hive.ql.io.orc.DynamicByteArray.setText(DynamicByteArray.java:212)
>  at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDictionaryTreeReader.next(TreeReaderFactory.java:1724)
>  at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringTreeReader.next(TreeReaderFactory.java:1397)
>  at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$MapTreeReader.next(TreeReaderFactory.java:2274)
>  at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
>  at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1046)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:166)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:140)
>  at org.apache.flink.connectors.hive.read.HiveMapredSplitReader.reachedEnd(HiveMapredSplitReader.java:160)
>  at org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.readBatch(HiveBulkFormatAdapter.java:318)
>  at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:67)
>  at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
>  at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138)
>  at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:101)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> "Source Data Fetcher for Source: HiveSource-app.app_sdl_yinliu_search_query_log (166/250)#0" Id=816 cpuUsage=99.96% deltaTime=201ms time=43490ms RUNNABLE
>  at org.apache.flink.table.data.util.DataFormatConverters$MapConverter.toBinaryMap(DataFormatConverters.java:1279)
>  at org.apache.flink.table.data.util.DataFormatConverters$MapConverter.toInternalImpl(DataFormatConverters.java:1245)
>  at org.apache.flink.table.data.util.DataFormatConverters$MapConverter.toInternalImpl(DataFormatConverters.java:1196)
>  at org.apache.flink.table.data.util.DataFormatConverters$DataFormatConverter.toInternal(DataFormatConverters.java:406)
>  at org.apache.flink.connectors.hive.read.HiveMapredSplitReader.nextRecord(HiveMapredSplitReader.java:185)
>  at org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.nextRecord(HiveBulkFormatAdapter.java:336)
>  at org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter$HiveReader.readBatch(HiveBulkFormatAdapter.java:319)
>  at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:67)
>  at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
>  at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138)
>  at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:101)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  
> 1. Each TaskManager is configured with 10 cores and 12 GB of memory.
> 2. It looks like only 3 cores are fully used, but the load average goes up to 100, and adding cores does not help.
> 3. I ran 'netstat -anp | wc -l'; the result is 18.
> 4. !image-2021-04-01-16-08-28-442.png!
>  
>  
>  
>  
> my hive storage config:
> # Storage Information
> SerDe Library:   org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat:     org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat:    org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed:      No
> Num Buckets:     -1
> Bucket Columns:  []
> Sort Columns:    []
>  
>  
>  
>  
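For readers unfamiliar with what "rate-limiting" a read loop would mean here, the following is a minimal, standalone token-bucket sketch in Java. It is not taken from Flink or the Hive connector, and nothing in this message confirms that Flink 1.12 exposes a hook for it. The class name RecordRateLimiter, its parameters, and the injected nanosecond clock are all hypothetical and exist only for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical token-bucket limiter; standalone sketch, not a Flink or Hive API. */
final class RecordRateLimiter {
    private final double nanosPerPermit;   // time needed to earn one permit
    private final double maxPermits;       // burst size: permits that may accumulate while idle
    private final LongSupplier nanoClock;  // injectable clock, e.g. System::nanoTime
    private double available;              // may go negative: a "debt" the caller must wait out
    private long lastRefillNanos;

    RecordRateLimiter(double recordsPerSecond, double burst, LongSupplier nanoClock) {
        if (recordsPerSecond <= 0 || burst < 1) {
            throw new IllegalArgumentException("recordsPerSecond must be > 0 and burst >= 1");
        }
        this.nanosPerPermit = 1e9 / recordsPerSecond;
        this.maxPermits = burst;
        this.nanoClock = nanoClock;
        this.available = burst;
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /**
     * Reserves one permit and returns how many nanoseconds the caller
     * should wait before processing the next record (0 if none).
     */
    synchronized long reserve() {
        long now = nanoClock.getAsLong();
        // Refill according to elapsed time, capped at the burst size.
        available = Math.min(maxPermits, available + (now - lastRefillNanos) / nanosPerPermit);
        lastRefillNanos = now;
        available -= 1.0;
        return available >= 0 ? 0L : (long) Math.ceil(-available * nanosPerPermit);
    }

    /** Blocking variant: sleeps for the time returned by reserve(). */
    void acquire() throws InterruptedException {
        long waitNanos = reserve();
        if (waitNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }
}
```

A reader loop would call acquire() once per record before decoding it, constructing the limiter with System::nanoTime as the clock. This trades throughput for lower CPU pressure in the decoding paths shown in the jstack output above; whether that addresses the high load average reported here is not established in this message.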



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
