spark-issues mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-1415) Add a minSplits parameter to wholeTextFiles
Date Tue, 08 Apr 2014 05:06:16 GMT

    [ https://issues.apache.org/jira/browse/SPARK-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962585#comment-13962585 ]

Matei Zaharia commented on SPARK-1415:
--------------------------------------

Hey Xusen, that makes sense. I think that for consistency with our other API methods, we should
add minSplits here and compute maxSplitSize from it. Later on we can add versions of the
methods that take a maxSplitSize. With the old Hadoop API, for example, we can't easily
change this, and it seems that a maxSplitSize can always be computed from minSplits.
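
As a rough illustration of deriving maxSplitSize from minSplits, here is a minimal Python sketch. It assumes the lengths of all input files are known up front and that the goal is at least minSplits splits of roughly equal size. The function name max_split_size and its parameters are hypothetical and are not part of Spark's or Hadoop's API.

```python
def max_split_size(file_lengths, min_splits):
    """Return a maximum split size (in bytes) that yields at least
    `min_splits` splits when the input is cut into chunks of that size.

    Hypothetical helper for illustration only; not a Spark or Hadoop API.
    """
    total_len = sum(file_lengths)
    # Treat a non-positive min_splits as 1 to avoid dividing by zero.
    splits = max(min_splits, 1)
    # Integer ceiling division, so the chunks cover the whole input.
    return (total_len + splits - 1) // splits


if __name__ == "__main__":
    # Three files totalling 400 bytes with a request for at least 4 splits.
    print(max_split_size([100, 250, 50], 4))
```

Rounding up rather than down keeps the split size from reaching zero when the input is non-empty but smaller than min_splits bytes.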

> Add a minSplits parameter to wholeTextFiles
> -------------------------------------------
>
>                 Key: SPARK-1415
>                 URL: https://issues.apache.org/jira/browse/SPARK-1415
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Matei Zaharia
>            Assignee: Xusen Yin
>              Labels: Starter
>
> This probably requires adding one to newAPIHadoopFile too.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
