beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (BEAM-2516) User reports 4 minutes to process 1 million line CSV in DirectRunner
Date Tue, 12 Sep 2017 23:15:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163818#comment-16163818 ]

Kenneth Knowles edited comment on BEAM-2516 at 9/12/17 11:14 PM:
-----------------------------------------------------------------

Went through the commit history to be sure, and things are quick before fa3a5abbc94db629feae8d7d73a31e7dda06bf76
while they are slow afterwards, so it is isolated to the use of dehydration-insensitive APIs
in the ParDo evaluator, as suspected.


was (Author: kenn):
Went through the commit history to be sure, and things are quick before 4b355844a4920bc9faba75f7cd61008bedebaf29
while they are slow afterwards, so it is isolated to the use of dehydration-insensitive APIs
in the ParDo evaluator, as suspected.

> User reports 4 minutes to process 1 million line CSV in DirectRunner
> --------------------------------------------------------------------
>
>                 Key: BEAM-2516
>                 URL: https://issues.apache.org/jira/browse/BEAM-2516
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>            Reporter: Kenneth Knowles
>            Assignee: Kenneth Knowles
>            Priority: Minor
>             Fix For: 2.2.0
>
>
> https://stackoverflow.com/questions/44736414/simple-apache-beam-manipulations-work-very-slow
> I don't know what the expectations are here, so I wasn't ready to say this is WAI. Low
> priority since it isn't what the runner is for anyhow, but this seems like the scale of data
> that should be snappy. Worth investigating, or maybe you can quickly indicate why it is expected?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
