spark-issues mailing list archives

From "Tor Myklebust (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-1672) Support separate partitioners (and numbers of partitions) for users and products
Date Tue, 29 Apr 2014 23:28:15 GMT

     [ https://issues.apache.org/jira/browse/SPARK-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tor Myklebust updated SPARK-1672:
---------------------------------

    Component/s: MLlib

> Support separate partitioners (and numbers of partitions) for users and products
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-1672
>                 URL: https://issues.apache.org/jira/browse/SPARK-1672
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Tor Myklebust
>            Priority: Minor
>
> The user ought to be able to specify a partitioning of their data if they know a good one. It is convenient to have separate partitioners for users and products so that no awkward intermediate mapping step is needed.
> It may also be reasonable to partition the users and products into different numbers of partitions (for instance, to balance memory requirements) if the dataset is tall, thin, and very sparse.
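
To make the request concrete, here is a minimal sketch in plain Python of what "separate partitioners with different numbers of partitions" could look like. It does not use Spark and is not the MLlib implementation: `ModPartitioner`, `get_partition`, and `block_ratings` are hypothetical names invented for illustration, and the ratings are made-up sample data for a tall, thin dataset (many users, few products).

```python
from collections import defaultdict


class ModPartitioner:
    """Hypothetical partitioner: maps an integer ID to one of num_partitions blocks."""

    def __init__(self, num_partitions):
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions

    def get_partition(self, key):
        # Python's % is non-negative for a positive modulus, even for negative keys.
        return key % self.num_partitions


def block_ratings(ratings, user_partitioner, product_partitioner):
    """Group (user, product, rating) triples by (user block, product block).

    Each side uses its own partitioner, so no shared mapping step is needed
    and the two sides may have different numbers of partitions.
    """
    blocks = defaultdict(list)
    for user, product, rating in ratings:
        key = (
            user_partitioner.get_partition(user),
            product_partitioner.get_partition(product),
        )
        blocks[key].append((user, product, rating))
    return dict(blocks)


# Assumed sample data: more distinct users than products.
ratings = [(0, 0, 5.0), (1, 1, 3.0), (2, 0, 4.0), (7, 1, 1.0), (12, 0, 2.0)]

user_part = ModPartitioner(4)     # more partitions on the large (user) side
product_part = ModPartitioner(2)  # fewer partitions on the small (product) side

blocks = block_ratings(ratings, user_part, product_part)
```

Because the two partitioners are independent objects, a caller who knows a better layout for one side (say, a locality-aware assignment of users) could swap in a different partitioner there without touching the other side.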



--
This message was sent by Atlassian JIRA
(v6.2#6252)
