cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Updated: (CASSANDRA-1434) ColumnFamilyOutputFormat performs blocking writes for large batches
Date Sun, 19 Sep 2010 15:45:34 GMT


Jonathan Ellis updated CASSANDRA-1434:

    Attachment: 1434-v3.txt

I squashed the patches and added code to keep CFRW (ColumnFamilyRecordWriter) from slamming
Cassandra with spikes of load: it keeps a pooled connection and sends mutations one at a time
over it.  This adds only a trivial amount of overhead compared to using a large batch, since
we're not reconnecting for each message.  (The main advantage of using a larger batch is that
it gives you an idempotent group of work to replay if necessary, which doesn't matter here.
Under the hood it takes the same code path.)
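
To illustrate the idea of reusing one connection while sending mutations individually,
here is a minimal sketch.  It is not the code in 1434-v3.txt: Connection, ConnectionFactory,
PooledWriter and PooledWriterDemo are hypothetical names invented for this example, and a
mutation is modelled as a plain String rather than a Cassandra type.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a client connection; not a Cassandra API.
interface Connection extends AutoCloseable {
    void send(String mutation) throws IOException;
    @Override void close();
}

// Hypothetical factory that performs the (expensive) connect step.
interface ConnectionFactory {
    Connection open() throws IOException;
}

// Opens the connection lazily on the first write and reuses it afterwards,
// so each mutation costs one send and no reconnect.
class PooledWriter implements AutoCloseable {
    private final ConnectionFactory factory;
    private Connection conn;

    PooledWriter(ConnectionFactory factory) {
        this.factory = factory;
    }

    void write(String mutation) throws IOException {
        if (conn == null) {
            conn = factory.open();
        }
        conn.send(mutation);
    }

    @Override
    public void close() {
        if (conn != null) {
            conn.close();
            conn = null;
        }
    }
}

class PooledWriterDemo {
    // Writes mutationCount mutations and returns how many connections were opened.
    static int connectionsOpenedFor(int mutationCount) {
        int[] opens = {0};
        List<String> sent = new ArrayList<>();
        ConnectionFactory factory = () -> {
            opens[0]++;
            return new Connection() {
                @Override public void send(String mutation) { sent.add(mutation); }
                @Override public void close() { }
            };
        };
        try (PooledWriter writer = new PooledWriter(factory)) {
            for (int i = 0; i < mutationCount; i++) {
                writer.write("mutation-" + i);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return opens[0];
    }

    public static void main(String[] args) {
        System.out.println("connections opened: " + connectionsOpenedFor(1000));
    }
}
```

In this sketch the connect cost is paid once no matter how many mutations are written,
which is the sense in which per-mutation sends stay cheap relative to one large batch.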

Also attempted to distinguish between recoverable and non-recoverable errors in the exception
handling.
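
One common way to make that distinction is to retry only the exception types considered
transient and let everything else propagate immediately.  The sketch below assumes, purely
for illustration, that IOException is recoverable and any other exception is not; which
errors the patch actually treats as recoverable is not stated in this message.
RetryingSender and RetryDemo are hypothetical names.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

class RetryingSender {
    // Retries op on IOException (assumed recoverable here) up to maxAttempts times.
    // Any other exception is treated as non-recoverable and propagates at once.
    static <T> T callWithRetry(Callable<T> op, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e; // recoverable: remember it and try again
            }
        }
        throw last; // retries exhausted
    }
}

class RetryDemo {
    // Operation fails with IOException `failures` times, then succeeds.
    // Returns the number of attempts that were made.
    static int attemptsWithTransientFailures(int failures, int maxAttempts) {
        int[] calls = {0};
        try {
            RetryingSender.callWithRetry(() -> {
                if (++calls[0] <= failures) {
                    throw new IOException("transient failure");
                }
                return "ok";
            }, maxAttempts);
        } catch (Exception e) {
            // retries exhausted; the attempt count is still reported
        }
        return calls[0];
    }

    // Operation always fails with a non-recoverable exception.
    static int attemptsWithFatalFailure(int maxAttempts) {
        int[] calls = {0};
        try {
            RetryingSender.callWithRetry(() -> {
                calls[0]++;
                throw new IllegalArgumentException("malformed mutation");
            }, maxAttempts);
        } catch (Exception e) {
            // expected: the failure is not retried
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("transient: " + attemptsWithTransientFailures(2, 5) + " attempts");
        System.out.println("fatal: " + attemptsWithFatalFailure(5) + " attempts");
    }
}
```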

> ColumnFamilyOutputFormat performs blocking writes for large batches
> -------------------------------------------------------------------
>                 Key: CASSANDRA-1434
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7 beta 2
>         Attachments: 0001-Switch-away-from-Multimap-and-fix-regression-introdu.patch,
> 0002-Improve-concurrency-and-add-basic-retries-by-attempt.patch, 0003-Switch-RingCache-back-to-multimap.patch,
> 0004-Replace-Executor-with-map-of-threads.patch, 1434-v3.txt
>
> By default, ColumnFamilyOutputFormat batches {{mapreduce.output.columnfamilyoutputformat.batch.threshold}}
> or {{Long.MAX_VALUE}} mutations, and then performs a blocking write.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
