cassandra-commits mailing list archives

From "Tupshin Harper (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6704) Create wide row scanners
Date Sat, 15 Feb 2014 22:46:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902558#comment-13902558 ]

Tupshin Harper commented on CASSANDRA-6704:
-------------------------------------------

Given the dissension over this issue, and given my shared interest in many of the objectives
of this ticket (over and above the overlap with CASSANDRA-6167), I'd like to propose an alternative
way forward.

What if we were to create an interface exactly analogous to triggers, but with two hooks
(instead of the single one for triggers): one to act on the query itself before it is executed,
and another to act on the result set of any query?
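
To make the shape of the two hooks concrete, here is a minimal sketch in plain Java. Everything
in it is hypothetical: SelectHook, beforeQuery, afterQuery, Limit3Hook and execute are names
invented for illustration and are not part of Cassandra or of any patch on this ticket. Queries
and rows are modeled as plain strings so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: none of these types exist in Cassandra.
public class SelectHookSketch {

    /** Assumed two-hook interface, loosely modeled on the idea of a trigger. */
    public interface SelectHook {
        /** Runs before execution; returns the (possibly rewritten) query. */
        String beforeQuery(String query);

        /** Runs on the result set; returns the (possibly transformed) rows. */
        List<String> afterQuery(List<String> rows);
    }

    /** Example hook: leaves the query unchanged and keeps at most three rows. */
    public static class Limit3Hook implements SelectHook {
        @Override
        public String beforeQuery(String query) {
            return query;
        }

        @Override
        public List<String> afterQuery(List<String> rows) {
            return new ArrayList<>(rows.subList(0, Math.min(3, rows.size())));
        }
    }

    /** Stand-in for the read path: calls both hooks around a fake executor. */
    public static List<String> execute(SelectHook hook, String query, List<String> stored) {
        String rewritten = hook.beforeQuery(query);
        // A real server would run 'rewritten' against storage; this sketch
        // ignores its content and simply returns a copy of the stored columns.
        List<String> raw = rewritten.isEmpty() ? new ArrayList<>() : new ArrayList<>(stored);
        return hook.afterQuery(raw);
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("a", "b", "c", "d", "e", "f");
        List<String> result = execute(new Limit3Hook(), "SELECT * FROM Standard1", columns);
        System.out.println(result);
    }
}
```

In this sketch the result-set hook plays the role of the Limit3 filter from the ticket
description, but it would be deployed as a compiled jar rather than as Groovy embedded in the
request.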

The result would be jar deployment of a SELECT equivalent of triggers, and would have all
the same pros and caveats as triggers.

Admin deployment and the table-level permissions to use them would be the same.

The main thing that would be sacrificed, with respect to this ticket, would be embedded Groovy
in select statements, as I believe this is the most controversial aspect. But it would provide
a mechanism around which to discuss the possibility of embedded Turing-complete scripting
in CQL in the future.

This would appear to give Ed the necessary hooks to achieve most of his goals by automating
Groovy->jar deployment outside of core Cassandra code.

> Create wide row scanners
> ------------------------
>
>                 Key: CASSANDRA-6704
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6704
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>
> The BigTable white paper demonstrates the use of scanners to iterate over rows and columns.
> http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf
> Because Cassandra does not have a primary sorting on row keys, scanning over ranges of
> row keys is less useful.
> However, we can use the scanner concept to operate on wide rows. For example, a user often
> wishes to do some custom processing inside a row without carrying the data
> across the network to do this processing.
> I have already implemented Thrift methods to compile dynamic Groovy code into Filters,
> as well as some code that uses a Filter to page through and process data on the server side.
> https://github.com/edwardcapriolo/cassandra/compare/apache:trunk...trunk
> The following is a working code snippet.
> {code}
>     @Test
>     public void test_scanner() throws Exception
>     {
>       ColumnParent cp = new ColumnParent();
>       cp.setColumn_family("Standard1");
>       ByteBuffer key = ByteBuffer.wrap("rscannerkey".getBytes());
>       for (char a='a'; a < 'g'; a++){
>         Column c1 = new Column();
>         c1.setName((a+"").getBytes());
>         c1.setValue(new byte [0]);
>         c1.setTimestamp(System.nanoTime());
>         server.insert(key, cp, c1, ConsistencyLevel.ONE);
>       }
>       
>       FilterDesc d = new FilterDesc();
>       d.setSpec("GROOVY_CLASS_LOADER");
>       d.setName("limit3");
>       d.setCode("import org.apache.cassandra.dht.* \n" +
>               "import org.apache.cassandra.thrift.* \n" +
>           "public class Limit3 implements SFilter { \n " +
>           "public FilterReturn filter(ColumnOrSuperColumn col, List<ColumnOrSuperColumn> filtered) {\n"+
>           " filtered.add(col);\n"+
>           " return filtered.size()< 3 ? FilterReturn.FILTER_MORE : FilterReturn.FILTER_DONE;\n"+
>           "} \n" +
>         "}\n");
>       server.create_filter(d);
>       
>       
>       ScannerResult res = server.create_scanner("Standard1", "limit3", key, ByteBuffer.wrap("a".getBytes()));
>       Assert.assertEquals(3, res.results.size());
>     }
> {code}
> I am going to be working on this code over the next few weeks, but I wanted to get the
> concept out early so the design can see some criticism.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
