parquet-dev mailing list archives

From "Tham (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PARQUET-1022) [C++] Append mode in parquet-cpp
Date Tue, 12 Mar 2019 10:55:00 GMT

    [ https://issues.apache.org/jira/browse/PARQUET-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790435#comment-16790435 ]

Tham commented on PARQUET-1022:
-------------------------------

Our application would benefit from this feature. Our use case is crash recovery: when the
application crashes (admittedly, no system is perfect), the file that is open at that time
cannot be closed properly. We would like to be able to open and close the file multiple times,
opening it only when we have data to write and closing it right afterwards. That way we can
reduce the risk of losing data. Any suggestions?

> [C++] Append mode in parquet-cpp
> --------------------------------
>
>                 Key: PARQUET-1022
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1022
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-cpp
>    Affects Versions: cpp-1.1.0
>            Reporter: yugu
>            Assignee: Wes McKinney
>            Priority: Major
>
> As the title says, I am currently trying to work out an append feature for Parquet files in C++.
> (I have been searching through the repo etc. but can't find an example.)
> My current solution (assuming no schema changes) is to:
> Read in the metadata
> Update the metadata to cover the appended rows plus the original rows
> Append a new row group (or multiple row group writers)
> Write the new rows.
> ---
> The problem is that, if approached this way, the original last row group may not be completely
> filled. I was wondering if there is a fix or whether I'm using the API wrong...
> Thanks! :D
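
The append procedure quoted above can be illustrated with a small toy model. This is only a sketch of the bookkeeping, not parquet-cpp code: `FooterModel` and `append_rows` are hypothetical names invented for this example, and the "footer" here is reduced to a list of per-row-group row counts. It assumes existing row groups are never rewritten, which is why the original last row group stays partially filled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a Parquet footer: it only tracks how many
// rows each row group holds. Real footers carry much more information.
struct FooterModel {
    std::vector<std::int64_t> row_group_rows;

    // Total row count, as the updated metadata would have to report it.
    std::int64_t total_rows() const {
        return std::accumulate(row_group_rows.begin(), row_group_rows.end(),
                               std::int64_t{0});
    }
};

// Models the quoted steps: take the existing metadata, add new row groups
// for the appended rows, and return the updated metadata. Existing row
// groups are left untouched, so a partially filled last group is not topped up.
FooterModel append_rows(FooterModel footer, std::int64_t new_rows,
                        std::int64_t max_rows_per_group) {
    assert(max_rows_per_group > 0);
    while (new_rows > 0) {
        const std::int64_t n = std::min(new_rows, max_rows_per_group);
        footer.row_group_rows.push_back(n);  // one new row group per chunk
        new_rows -= n;
    }
    return footer;
}
```

For example, starting from row groups of 1000 and 400 rows and appending 1500 rows with a cap of 1000 rows per group, this model yields groups of 1000, 400, 1000 and 500 rows: the 400-row group remains as it was, which is the behaviour the reporter describes.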



