spark-dev mailing list archives

From Patrick Woody
Subject Lazy casting with Catalyst
Date Sat, 28 Mar 2015 15:26:44 GMT
Hi all,

In my application, we take input from Parquet files where BigDecimals are
written as Strings to maintain arbitrary precision.

I was hoping to convert these back to Decimal with Unlimited precision
while still keeping Parquet column pruning (all my attempts so far seem
to read in the whole Row). Is it possible to do this lazily through
Catalyst?

Basically, I'd want Cast(col, DecimalType()) to be applied only when col is
actually referenced. Any tips on how to approach this would be appreciated.
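
To make the intended behaviour concrete, here is a toy model in plain Python.
It is not Spark or Catalyst code: StringColumnStore, LazyDecimalView,
read_column and columns_read are hypothetical names invented for this sketch,
and the column values are made-up sample data. It only illustrates the two
properties I am after: the cast happens when a column is referenced, and only
the referenced column is read.

```python
from decimal import Decimal


class StringColumnStore:
    """Hypothetical stand-in for a columnar file: one list of strings per column."""

    def __init__(self, columns):
        self._columns = columns
        self.columns_read = []  # records which columns were actually loaded

    def read_column(self, name):
        self.columns_read.append(name)
        return self._columns[name]


class LazyDecimalView:
    """Casts a string column to Decimal only when that column is referenced."""

    def __init__(self, store, decimal_columns):
        self._store = store
        self._decimal_columns = set(decimal_columns)

    def column(self, name):
        # Only the requested column is read from the store.
        raw = self._store.read_column(name)
        if name in self._decimal_columns:
            # Decimal(str) is exact, so no precision is lost in the cast.
            return [Decimal(value) for value in raw]
        return raw


# Made-up sample data: decimals stored as strings to keep full precision.
store = StringColumnStore({
    "id": ["1", "2"],
    "amount": ["0.10000000000000000000000001", "12345678901234567890.5"],
    "note": ["a", "b"],
})
view = LazyDecimalView(store, decimal_columns=["amount"])

amounts = view.column("amount")  # cast happens here, on reference
```

After the last line, store.columns_read contains only "amount", so the "id"
and "note" columns were never touched. That is the pruning behaviour I would
like to preserve.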

