pig-user mailing list archives

From Michael Doo <michael....@verve.com>
Subject Reading partitioned Parquet data into Pig
Date Mon, 27 Aug 2018 17:18:41 GMT
Hello,

I’m trying to load partitioned Parquet data into Pig. The files are stored in S3 under
paths like s3://path/to/files/some_flag=true/part-00095-a2a6230b-9750-48e4-9cd0-b553ffc220de.c000.gz.parquet.
I’d like to load the data and have the partition keys (here, some_flag) added as columns.
I’ve read some resources suggesting HCatLoader, but so far I haven’t had success with it.

Any advice would be welcome.

~ Michael