cassandra-commits mailing list archives

From "Hari Sekhon (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-7860) csv2sstable - bulk load CSV data to SSTables similar to json2sstable
Date Mon, 01 Sep 2014 17:05:20 GMT
Hari Sekhon created CASSANDRA-7860:
--------------------------------------

             Summary: csv2sstable - bulk load CSV data to SSTables similar to json2sstable
                 Key: CASSANDRA-7860
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7860
             Project: Cassandra
          Issue Type: New Feature
         Environment: DataStax Community Edition 2.0.9
            Reporter: Hari Sekhon
            Priority: Blocker


Need a csv2sstable utility to bulk load billions of rows of CSV data. It is impractical to
have to pre-convert the CSV to JSON before bulk loading it into SSTables.

CQL COPY really is too slow: a test loading a mere 4 million row, 6GB CSV file directly took
28 minutes, while it only takes 60 secs to cat all the data off HDFS.
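
For a rough sense of the gap those figures imply, here is a minimal sketch of the arithmetic. It assumes "6GB" means 6 * 1024 MB and that both timings cover the same whole file; the function and variable names are illustrative only.

```python
# Back-of-the-envelope throughput comparison from the figures quoted above.
# Assumption: "6GB" is taken as 6 * 1024 MB; both timings cover the full file.

def throughput(rows, size_mb, seconds):
    """Return (rows per second, MB per second) for one run."""
    return rows / seconds, size_mb / seconds

ROWS = 4_000_000
SIZE_MB = 6 * 1024

COPY_SECONDS = 28 * 60   # CQL COPY test: 28 minutes
CAT_SECONDS = 60         # cat of the same data off HDFS: 60 secs

copy_rows, copy_mb = throughput(ROWS, SIZE_MB, COPY_SECONDS)
cat_rows, cat_mb = throughput(ROWS, SIZE_MB, CAT_SECONDS)
slowdown = COPY_SECONDS / CAT_SECONDS

# Extrapolate the COPY rate to one billion rows, in days.
days_per_billion = 1_000_000_000 / copy_rows / 86_400

print(f"COPY: {copy_rows:,.0f} rows/s, {copy_mb:.1f} MB/s")
print(f"cat:  {cat_rows:,.0f} rows/s, {cat_mb:.1f} MB/s")
print(f"COPY is {slowdown:.0f}x slower than the raw read")
print(f"1 billion rows at the COPY rate: {days_per_billion:.1f} days")
```

Under those assumptions COPY ran at about 3.7 MB/s against about 102 MB/s for the raw read, and a single billion rows at that rate would take several days.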



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
