spark-issues mailing list archives

From "Tor Myklebust (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-1672) Support separate partitioners (and numbers of partitions) for users and products
Date Tue, 29 Apr 2014 23:28:14 GMT
Tor Myklebust created SPARK-1672:
------------------------------------

             Summary: Support separate partitioners (and numbers of partitions) for users and products
                 Key: SPARK-1672
                 URL: https://issues.apache.org/jira/browse/SPARK-1672
             Project: Spark
          Issue Type: Improvement
            Reporter: Tor Myklebust
            Priority: Minor


Users ought to be able to specify a partitioning of their data if they know a good one.  It's
convenient to have separate partitioners for users and products so that no extra mapping
step is needed to make a single partitioner fit both.

It may also be reasonable to partition the users and products into different numbers of partitions
(for instance, to balance memory requirements) if the dataset is tall, thin, and very sparse.
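
As a rough illustration of the request, here is a minimal sketch in plain Python (not Spark code, and not the proposed API). It treats a partitioner as a function from an ID to a partition index and gives users and products their own partitioner and their own partition count. The helper names, the toy ratings, and the choice of range versus modulo partitioning are all assumptions made for the example.

```python
from collections import defaultdict


def make_range_partitioner(num_partitions, max_id):
    """Map integer IDs in [0, max_id) to contiguous ranges (hypothetical helper)."""
    block = -(-max_id // num_partitions)  # ceiling division
    return lambda key: min(key // block, num_partitions - 1)


def make_mod_partitioner(num_partitions):
    """Map integer IDs to partitions by remainder (hypothetical helper)."""
    return lambda key: key % num_partitions


def group_by_block(ratings, user_part, product_part):
    """Group (user, product, rating) triples by (user partition, product partition)."""
    blocks = defaultdict(list)
    for user, product, rating in ratings:
        blocks[(user_part(user), product_part(product))].append((user, product, rating))
    return dict(blocks)


# Toy "tall, thin" data: more distinct users than products (made-up values).
ratings = [(0, 0, 5.0), (1, 1, 3.0), (7, 0, 4.0), (9, 2, 1.0), (4, 1, 2.0)]

# Users and products get different partitioners and different partition counts.
user_partitioner = make_range_partitioner(num_partitions=4, max_id=10)
product_partitioner = make_mod_partitioner(num_partitions=2)

blocks = group_by_block(ratings, user_partitioner, product_partitioner)
for block_id in sorted(blocks):
    print(block_id, blocks[block_id])
```

Because each side has its own function, user IDs and product IDs need not be forced through a shared key space, and the two partition counts can be tuned independently.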



--
This message was sent by Atlassian JIRA
(v6.2#6252)
