parquet-dev mailing list archives

From "Uwe L. Korn (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PARQUET-1022) [C++] Append mode in parquet-cpp
Date Tue, 12 Mar 2019 12:59:00 GMT

    [ https://issues.apache.org/jira/browse/PARQUET-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790518#comment-16790518 ]

Uwe L. Korn commented on PARQUET-1022:
--------------------------------------

[~thamha] The solution here is to write more, smaller files and combine them afterwards. Parquet
files that have no footer cannot be read. Even if it were possible to modify an existing file,
you would lose all data in your crash scenario: in the second write, the new data would be
written over the existing footer, and until a new footer is written, the data in the file
cannot be restored.
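
To illustrate why a missing footer makes a file unreadable, here is a minimal sketch that
checks whether a byte buffer has the trailer every complete Parquet file ends with: a 4-byte
little-endian footer length followed by the magic bytes "PAR1". The helper name
LooksLikeCompleteParquetFile is hypothetical and not part of parquet-cpp; it uses only the
standard library and does not parse or validate the footer metadata itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper, not a parquet-cpp API. Returns true if `bytes` starts
// with the "PAR1" magic, ends with <4-byte LE footer length>"PAR1", and the
// declared footer length fits inside the buffer. A file whose writer crashed
// before the footer was written fails this check.
bool LooksLikeCompleteParquetFile(const std::string& bytes) {
  static const char kMagic[4] = {'P', 'A', 'R', '1'};
  // Smallest layout: leading magic (4) + footer length (4) + trailing magic (4).
  const std::size_t kMinSize = 12;
  if (bytes.size() < kMinSize) return false;
  if (std::memcmp(bytes.data(), kMagic, 4) != 0) return false;
  if (std::memcmp(bytes.data() + bytes.size() - 4, kMagic, 4) != 0) return false;

  // Decode the little-endian footer length stored just before the trailing magic.
  const unsigned char* p =
      reinterpret_cast<const unsigned char*>(bytes.data() + bytes.size() - 8);
  const std::uint32_t footer_len =
      static_cast<std::uint32_t>(p[0]) |
      (static_cast<std::uint32_t>(p[1]) << 8) |
      (static_cast<std::uint32_t>(p[2]) << 16) |
      (static_cast<std::uint32_t>(p[3]) << 24);

  // The footer must fit between the leading magic and the 8-byte trailer.
  return footer_len <= bytes.size() - kMinSize;
}
```

A writer that rolls over to a new file at regular intervals could run a check like this on
each part after a crash and skip the one incomplete file before combining the rest.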

> [C++] Append mode in parquet-cpp
> --------------------------------
>
>                 Key: PARQUET-1022
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1022
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-cpp
>    Affects Versions: cpp-1.1.0
>            Reporter: yugu
>            Assignee: Wes McKinney
>            Priority: Major
>
> As the title says, I am currently trying to work out an append feature for Parquet files in C++.
> (I have been searching through the repo etc. but can't find an example.)
> My current solution (assuming no schema changes) is to:
> Read in metadata
> Change metadata based on appended rows+ original rows
> Append a new row group (or multiple row group writer)
> Write the new rows.
> ---
> The problem is that, if approached this way, the original last row group may not be completely
> filled. I was wondering if there is a fix or whether I'm using the API wrong...
> Thanks ! : D



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
