drill-dev mailing list archives

From Philip Haynes <philip.hay...@virtualnation.com.au>
Subject Sample Query Design - Task Help Request
Date Thu, 18 Oct 2012 23:35:19 GMT
Hi,

To support query performance design, I was hoping someone could help by
creating a set of sample queries that map down to a set of primitives,
which can then be modelled in both C++/Supersonic and Java.
If the queries are kept concrete and use the datasets below, they can be
developed and tested using BigQuery. For expediency I am using the data set
below, which decompresses to a 38 GB sample file. If people think other data
sets are relevant, please let me know, but I would like to keep the final
data set under 24 GB or 8 GB, since these are the maximum memory sizes of
the machines I have readily available.

When creating test cases, please keep in mind concurrency models and how
things such as SIMD will process the queries.
In the first instance I would like to keep all queries in memory, so that I
am testing primitive operations rather than the I/O performance of my hard
disk.
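
As a rough illustration of the kind of decomposition being asked for, the
sketch below breaks one aggregate query into filter, gather, group-sum and
top-k primitives over in-memory columns. It is written in Python purely for
brevity; the table contents, the column names (title, wp_namespace,
num_characters) and all function names are assumptions made for this
example, not the confirmed BigQuery schema or any Supersonic API.

```python
# Tiny in-memory columnar table. Column names and values are hypothetical,
# loosely modelled on a Wikipedia revision table.
table = {
    "title": ["A", "B", "A", "C", "B", "A"],
    "wp_namespace": [0, 0, 1, 0, 0, 0],
    "num_characters": [100, 250, 40, 75, 300, 120],
}


def filter_eq(column, value):
    """Filter primitive: row indices where column == value."""
    return [i for i, v in enumerate(column) if v == value]


def gather(column, selection):
    """Gather primitive: materialise the selected rows of one column."""
    return [column[i] for i in selection]


def group_sum(keys, values):
    """Group-by primitive: sum of values per distinct key."""
    totals = {}
    for k, v in zip(keys, values):
        totals[k] = totals.get(k, 0) + v
    return totals


def top_k(aggregates, k):
    """Sort/limit primitive: k largest groups, ties broken by key."""
    return sorted(aggregates.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def run_query(table, k):
    # Roughly: SELECT title, SUM(num_characters) FROM table
    #          WHERE wp_namespace = 0
    #          GROUP BY title ORDER BY 2 DESC LIMIT k
    selection = filter_eq(table["wp_namespace"], 0)
    titles = gather(table["title"], selection)
    chars = gather(table["num_characters"], selection)
    return top_k(group_sum(titles, chars), k)


print(run_query(table, 2))
```

Each primitive here touches whole columns in a simple loop, which is the
shape that a C++/Supersonic or Java version could later vectorise or split
across threads.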

Help appreciated,
Kind Regards,
Philip


https://code.google.com/p/supersonic/wiki/ExpressionReference
https://code.google.com/p/supersonic/wiki/OperationReference

https://developers.google.com/bigquery/docs/dataset-wikipedia
http://dumps.wikimedia.org/enwiki/20121001/enwiki-20121001-pages-articles-multistream.xml.bz2


