spark-user mailing list archives

From matthes <>
Subject Re: Is it possible to use Parquet with Dremel encoding
Date Fri, 26 Sep 2014 15:48:08 GMT
Hi Frank,

Thanks a lot for your response, this is very helpful!

Actually, I was trying to figure out whether the current Spark version supports
repetition levels, but now it looks good to me.
It is very hard to find good information about that. Now I found this as

I wasn't sure about that because nested data can mean many different things!
If it works with SQL, finding the firstRepeatedid or secoundRepeatedid would
be awesome. But if it only works with a kind of map/reduce job, then that is
also good. The most important thing is to filter on the first or second repeated
value as fast as possible, and on both in combination as well.
I will now start to play with these things to get the best search results!
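
A minimal sketch of the map/reduce-style filtering described above, assuming the records are already available as plain Scala objects. The `NestedFilter` object, its case classes, and the sample data are hypothetical and only mirror the field names in the schema; nothing here is Spark or Parquet API. The same predicates could be passed to an RDD's `filter`.

```scala
object NestedFilter {
  // Hypothetical in-memory mirror of the nested records (field names follow the schema).
  case class Level2(value1: Long, value2: Int)
  case class Level1(secoundRepeatedid: Long, level2: Seq[Level2])
  case class NestedRow(firstRepeatedid: Int, level1: Seq[Level1])

  // Keep rows whose top-level id matches.
  def byFirstId(rows: Seq[NestedRow], id: Int): Seq[NestedRow] =
    rows.filter(_.firstRepeatedid == id)

  // Keep rows that contain at least one level1 entry with the given id.
  def bySecondId(rows: Seq[NestedRow], id: Long): Seq[NestedRow] =
    rows.filter(_.level1.exists(_.secoundRepeatedid == id))

  // Both conditions in combination.
  def byBoth(rows: Seq[NestedRow], first: Int, second: Long): Seq[NestedRow] =
    rows.filter(r => r.firstRepeatedid == first && r.level1.exists(_.secoundRepeatedid == second))

  // Made-up sample data for illustration only.
  val sample: Seq[NestedRow] = Seq(
    NestedRow(1, Seq(Level1(10L, Seq(Level2(100L, 1))), Level1(11L, Nil))),
    NestedRow(2, Seq(Level1(10L, Seq(Level2(200L, 2)))))
  )
}
```

For example, `NestedFilter.byBoth(NestedFilter.sample, 2, 10L)` keeps only the second sample row.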

My schema looks like this:

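// Parquet's message-type syntax expects a repetition keyword on every field;
// `required` on the non-repeated fields is an assumption here.
val nestedSchema =
    """message nestedRowSchema {
         required int32 firstRepeatedid;
         repeated group level1 {
           required int64 secoundRepeatedid;
           repeated group level2 {
             required int64 value1;
             required int32 value2;
           }
         }
       }"""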
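
To illustrate what repetition levels mean for a schema of this shape, here is a small plain-Scala sketch that derives the repetition level of each `value1` entry in one record (column path `level1.level2.value1`, maximum repetition level 2). The `RepetitionLevels` object and its case classes are hypothetical; definition levels are left out, and this is not how Parquet itself is implemented, only the rule from the Dremel encoding: 0 starts a new record, 1 means `level1` repeated, 2 means `level2` repeated.

```scala
object RepetitionLevels {
  // Hypothetical in-memory mirror of the nested records (field names follow the schema).
  case class Level2(value1: Long, value2: Int)
  case class Level1(secoundRepeatedid: Long, level2: Seq[Level2])
  case class NestedRow(firstRepeatedid: Int, level1: Seq[Level1])

  // (repetitionLevel, value) pairs for the column level1.level2.value1 of one record.
  // A missing value (empty list on the path) is emitted as None.
  def value1Column(row: NestedRow): Seq[(Int, Option[Long])] =
    if (row.level1.isEmpty) Seq((0, Option.empty[Long]))
    else row.level1.zipWithIndex.flatMap { case (l1, i) =>
      // The first level1 entry starts the record (0); later ones repeat at level 1.
      val outer = if (i == 0) 0 else 1
      if (l1.level2.isEmpty) Seq((outer, Option.empty[Long]))
      else l1.level2.zipWithIndex.map { case (l2, j) =>
        // The first level2 entry inherits the outer level; later ones repeat at level 2.
        val level = if (j == 0) outer else 2
        (level, Option(l2.value1))
      }
    }
}
```

For a record whose first `level1` entry holds two `level2` entries and whose second `level1` entry holds one, the three `value1` entries get repetition levels 0, 2 and 1.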