flink-issues mailing list archives

From "Martin Kiefer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1597) VertexCentricIterations create inefficient execution plans
Date Mon, 23 Feb 2015 09:49:12 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333162#comment-14333162 ]

Martin Kiefer commented on FLINK-1597:

It appears that the magic word in the execution plan is "Forward (Cached)". The UDFs are FINISHED
before the iteration begins. I will close the issue.

> VertexCentricIterations create inefficient execution plans
> ----------------------------------------------------------
>                 Key: FLINK-1597
>                 URL: https://issues.apache.org/jira/browse/FLINK-1597
>             Project: Flink
>          Issue Type: Bug
>          Components: Gelly
>    Affects Versions: master
>            Reporter: Martin Kiefer
> I experimented with optimized versions of a graph algorithm that should utilize a
secondary sort on the edges and a trade-off between the number of supersteps and I/O. To my surprise,
the optimizations barely affected the execution times. I narrowed it down to inefficient
execution plans.
> I assumed that edge sets would be partitioned once at the beginning of a VertexCentricIteration
and never be touched again, because they cannot change during the iteration. I think this
should be the desired behavior. What actually happens is that the UDFs creating the edge set are
pulled inside the iteration and executed every superstep. This harms the performance of
graph algorithms significantly.
> As a simple example, have a look at the execution plan generated for the PageRankExample:
> https://gist.github.com/martinkiefer/28a63f953477e3987b5d
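
To make the difference between the two behaviors concrete, here is a minimal plain-Java sketch. It does not use the Flink or Gelly API; the class name, `buildEdges`, and `consume` are hypothetical stand-ins for the UDF chain that produces the edge set and for a superstep that reads it. It only counts how often the edge-building step runs when it is re-executed in every superstep versus computed once and cached before the iteration starts.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopInvariantSketch {
    // Counts invocations of the hypothetical edge-building UDF.
    static int udfCalls = 0;

    // Hypothetical stand-in for the UDFs that create the edge set.
    static List<int[]> buildEdges() {
        udfCalls++;
        List<int[]> edges = new ArrayList<>();
        edges.add(new int[] {1, 2});
        edges.add(new int[] {2, 3});
        edges.add(new int[] {3, 1});
        return edges;
    }

    // Hypothetical stand-in for a superstep: it only reads the edges.
    static int consume(List<int[]> edges) {
        return edges.size();
    }

    // Behavior feared in the report: the edge set is rebuilt every superstep.
    static int recomputedEverySuperstep(int supersteps) {
        udfCalls = 0;
        for (int i = 0; i < supersteps; i++) {
            consume(buildEdges());
        }
        return udfCalls;
    }

    // Behavior described in the comment: the edge set is finished and
    // cached before the iteration begins, then only read inside it.
    static int cachedBeforeIteration(int supersteps) {
        udfCalls = 0;
        List<int[]> cached = buildEdges();
        for (int i = 0; i < supersteps; i++) {
            consume(cached);
        }
        return udfCalls;
    }

    public static void main(String[] args) {
        System.out.println("recomputed: " + recomputedEverySuperstep(10));
        System.out.println("cached:     " + cachedBeforeIteration(10));
    }
}
```

With 10 supersteps, the first variant calls the edge-building step 10 times and the second calls it once. This only illustrates the loop-invariant idea, not how the Flink optimizer decides to cache.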

