cassandra-commits mailing list archives

From "Christian Spriegel (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-4304) Add bytes-limit clause to queries
Date Mon, 04 Jun 2012 00:29:22 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Spriegel updated CASSANDRA-4304:
------------------------------------------

    Attachment: TestImplForSlices.patch

Attached a simple implementation for slice queries.
                
> Add bytes-limit clause to queries
> ---------------------------------
>
>                 Key: CASSANDRA-4304
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4304
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Christian Spriegel
>             Fix For: 1.2
>
>         Attachments: TestImplForSlices.patch
>
>
> The idea is to add a second limit clause to (slice) queries. This would allow easy loading
> of batches, even if the content is variable-sized.
> Imagine the following use case:
> You want to load a batch of XMLs, where each is between 100 bytes and 5 MB in size.
> Currently you can load either
> - a large number of XMLs, but risk OOMs or timeouts
> or
> - a small number of XMLs, and issue too many queries, each of which usually retrieves
> very little data.
> With Cassandra being able to limit by size and not just count, we could issue a single query
> that would never OOM but always return a decent amount of data, with no extra overhead
> from multiple queries.
> A few thoughts from my side:
> - The limit should be a soft limit, not a hard limit. Therefore it will always return
> at least one row/column, even if that one is larger than the limit specifies.
> - HintedHandoffManager:303 is already doing an InMemoryCompactionLimit/averageColumnSize
> calculation to avoid OOM. It could then simply use the new limit clause :-)
> - A bytes-limit on a range or indexed query should always return a complete row.
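
The soft-limit behaviour described above could be sketched roughly as follows. This is only an illustration in Python, not the attached patch and not Cassandra code: the function name `take_with_soft_byte_limit` and its parameters are hypothetical, columns are modelled as plain byte strings, and the choice to stop *before* the column that would cross the limit (rather than after it) is an assumption, since the description only requires that at least one column is always returned.

```python
from typing import Iterable, List, Optional


def take_with_soft_byte_limit(
    columns: Iterable[bytes],
    byte_limit: int,
    count_limit: Optional[int] = None,
) -> List[bytes]:
    """Collect columns until the next one would exceed byte_limit.

    Hypothetical helper for illustration. The limit is soft: the first
    column is always returned, even if it alone is larger than byte_limit.
    An optional count_limit models the existing count-based limit.
    """
    result: List[bytes] = []
    used = 0
    for value in columns:
        # Existing behaviour: stop once the count limit is reached.
        if count_limit is not None and len(result) >= count_limit:
            break
        # New behaviour: stop once the byte budget would be exceeded,
        # but only if at least one column has already been collected.
        if result and used + len(value) > byte_limit:
            break
        result.append(value)
        used += len(value)
    return result


if __name__ == "__main__":
    small = [b"a" * 10, b"b" * 10, b"c" * 10]
    print(len(take_with_soft_byte_limit(small, byte_limit=25)))
    print(len(take_with_soft_byte_limit([b"x" * 100], byte_limit=10)))
```

Under these assumptions, a batch of many small values fills up to the byte budget, while a single oversized value is still returned on its own, so a caller paging through the data always makes progress.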

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
