cassandra-commits mailing list archives

From "Alex Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6091) Better Vnode support in hadoop/pig
Date Mon, 14 Oct 2013 21:05:42 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794482#comment-13794482 ]

Alex Liu commented on CASSANDRA-6091:
-------------------------------------

The following code
{code}
   List<TokenRange> masterRangeNodes = getRangeMap(conf);
{code}
returns all the token ranges. We need to find a way to merge these token ranges into bigger
token ranges while keeping the replica locations unchanged.

Merging token ranges reduces the number of splits. How large the reduction is depends on how
randomly the token ranges are shuffled around the ring. It would help a lot if we could find a
better shuffle algorithm that maximizes the merging.
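
A minimal sketch of the merging idea, not the actual patch: it walks the ranges in ring order and joins neighbours whose replica sets are identical, so each merged range still lives on the same nodes. `SimpleRange` and `TokenRangeMerger` are hypothetical names used only for illustration. The sketch assumes numeric tokens, input already sorted in ring order, and it ignores the wrap-around range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a token range: (start, end] plus its replica endpoints.
final class SimpleRange {
    final long start;
    final long end;
    final Set<String> endpoints;

    SimpleRange(long start, long end, Set<String> endpoints) {
        this.start = start;
        this.end = end;
        this.endpoints = endpoints;
    }
}

final class TokenRangeMerger {
    // Merges ring-adjacent ranges that share exactly the same replica set.
    // Assumption: `ranges` is sorted in ring order; wrap-around is not handled.
    static List<SimpleRange> merge(List<SimpleRange> ranges) {
        List<SimpleRange> merged = new ArrayList<>();
        SimpleRange current = null;
        for (SimpleRange r : ranges) {
            boolean contiguous = current != null && current.end == r.start;
            if (contiguous && current.endpoints.equals(r.endpoints)) {
                // Same replicas and no gap: extend the current range.
                current = new SimpleRange(current.start, r.end, current.endpoints);
            } else {
                if (current != null) {
                    merged.add(current);
                }
                current = r;
            }
        }
        if (current != null) {
            merged.add(current);
        }
        return merged;
    }
}
```

With this approach the number of splits only drops when neighbouring ranges happen to share replicas, which is why the shuffle pattern around the ring matters so much.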

> Better Vnode support in hadoop/pig
> ----------------------------------
>
>                 Key: CASSANDRA-6091
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6091
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Alex Liu
>            Assignee: Alex Liu
>
> CASSANDRA-6084 shows there are some issues when running hadoop/pig jobs if vnodes are
> enabled. Also, hadoop performance on vnode-enabled nodes is bad because there are so many
> splits.
> The idea is to combine vnode splits into big pseudo splits so that it works as if vnodes
> were disabled for the hadoop/pig job.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
