cassandra-commits mailing list archives

From "Johan Oskarsson (JIRA)" <>
Subject [jira] Updated: (CASSANDRA-955) Hadoop doesn't schedule the tasks close to the data
Date Tue, 06 Apr 2010 14:38:33 GMT


Johan Oskarsson updated CASSANDRA-955:

    Attachment: CASSANDRA-955.patch

The patch looks up the hostname from the IP address so that Hadoop can schedule the tasks as close
to the data as possible. The lookup is done where the Hadoop job is scheduled, not in Cassandra.
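
As a rough illustration of the kind of lookup described above (a minimal sketch, not the contents of the attached patch), the snippet below resolves a list of endpoint IP addresses to hostnames with the standard java.net.InetAddress API. The class name SplitLocations and the method toHostnames are hypothetical and chosen only for this example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class; the name is an assumption made for illustration.
class SplitLocations {

    // Maps each endpoint address to a hostname via reverse DNS.
    // If an address cannot be parsed or resolved, the original string is kept.
    static String[] toHostnames(String[] addresses) {
        String[] hostnames = new String[addresses.length];
        for (int i = 0; i < addresses.length; i++) {
            try {
                // For a literal IP, getByName only validates the address format.
                InetAddress address = InetAddress.getByName(addresses[i]);
                // getHostName() performs the reverse lookup and falls back to
                // the textual IP address when no hostname can be found.
                hostnames[i] = address.getHostName();
            } catch (UnknownHostException e) {
                hostnames[i] = addresses[i];
            }
        }
        return hostnames;
    }

    public static void main(String[] args) {
        for (String host : toHostnames(new String[] {"127.0.0.1"})) {
            System.out.println(host);
        }
    }
}
```

The resulting hostnames are what would then be reported as the locations of an input split. Because the fallback keeps the raw address, a failed lookup leaves scheduling no worse than before.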

> Hadoop doesn't schedule the tasks close to the data
> ---------------------------------------------------
>                 Key: CASSANDRA-955
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Johan Oskarsson
>         Attachments: CASSANDRA-955.patch
> Hadoop relies on locations for data in input splits being represented as hostnames and
> not IP addresses. Currently in my testing, tasks are more often than not being scheduled on
> a node that does not contain the requested data.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
