beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (BEAM-4049) Improve write throughput of CassandraIO
Date Sun, 30 Sep 2018 17:53:00 GMT

     [ https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=149815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-149815 ]

ASF GitHub Bot logged work on BEAM-4049:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Sep/18 17:52
            Start Date: 30/Sep/18 17:52
    Worklog Time Spent: 10m 
      Work Description: script3r commented on issue #5112: [BEAM-4049] Improve CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-425738613
 
 
   Hello.
   
   The latest build fails with the following error:
   
   `Caused by: java.lang.NoSuchMethodError: com.datastax.driver.mapping.Mapper.saveAsync(Ljava/lang/Object;)Lorg/apache/beam/repackaged/beam_sdks_java_io_cassandra/com/google/common/util/concurrent/ListenableFuture;`
   
   This happens when running a packaged pipeline on Flink.
   Any thoughts?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 149815)
    Time Spent: 6h 40m  (was: 6.5h)

> Improve write throughput of CassandraIO
> ---------------------------------------
>
>                 Key: BEAM-4049
>                 URL: https://issues.apache.org/jira/browse/BEAM-4049
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-cassandra
>    Affects Versions: 2.4.0
>            Reporter: Alexander Dejanovski
>            Assignee: Alexander Dejanovski
>            Priority: Major
>              Labels: performance
>             Fix For: 2.5.0
>
>          Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> CassandraIO currently uses the mapper to perform writes synchronously.
> This means that writes are serialized, which is a suboptimal way of writing to Cassandra.
> The IO should use the saveAsync() method instead of save() and should wait for completion
> each time 100 queries are in flight, in order to avoid overwhelming clusters.
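
The issue description above asks for asynchronous saves that pause each time 100 queries are in flight. A minimal Java sketch of that pattern could look like the following. It is not the code from the pull request: the class name `ThrottledAsyncWriter` is hypothetical, and the `saveAsync` function passed to it is an assumed stand-in that returns a `CompletableFuture<Void>` so the sketch runs without the Cassandra driver (the DataStax `Mapper.saveAsync()` returns a Guava `ListenableFuture` instead).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.Function;

/** Hypothetical sketch: issue saves asynchronously, but never keep more than maxInFlight pending. */
final class ThrottledAsyncWriter<T> {
  // Assumed stand-in for an async save call such as Mapper.saveAsync(entity).
  private final Function<T, CompletableFuture<Void>> saveAsync;
  private final int maxInFlight;
  private final List<CompletableFuture<Void>> inFlight = new ArrayList<>();

  ThrottledAsyncWriter(Function<T, CompletableFuture<Void>> saveAsync, int maxInFlight) {
    if (maxInFlight < 1) {
      throw new IllegalArgumentException("maxInFlight must be at least 1");
    }
    this.saveAsync = saveAsync;
    this.maxInFlight = maxInFlight;
  }

  /** Starts one asynchronous save; blocks once maxInFlight saves are pending. */
  void write(T entity) throws ExecutionException, InterruptedException {
    inFlight.add(saveAsync.apply(entity));
    if (inFlight.size() >= maxInFlight) {
      flush();
    }
  }

  /** Waits for the pending saves in order and rethrows the first failure encountered. */
  void flush() throws ExecutionException, InterruptedException {
    try {
      for (CompletableFuture<Void> future : inFlight) {
        future.get();
      }
    } finally {
      // Stop tracking the batch whether it succeeded or failed.
      inFlight.clear();
    }
  }

  /** Number of saves started since the last flush. */
  int pending() {
    return inFlight.size();
  }
}
```

With `maxInFlight` set to 100, `write` blocks after every hundredth call until that batch has completed. A final `flush()` is still needed at the end of a bundle to wait for any remainder.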



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
