That is what I am doing, as the SQL dumps were too large. I was going to map the XML to tables
and columns to generate the SQL.
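
To make the mapping idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the sample XML is a hypothetical, heavily simplified stand-in for a Wikipedia dump (a real dump uses an XML namespace and many more fields), and the `page`/`revision` tables, their columns, and the helper names `xml_to_rows` and `load` are invented for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample; not the real Wikipedia export schema.
SAMPLE_XML = """
<mediawiki>
  <page>
    <title>Apache Drill</title>
    <id>1</id>
    <revision><id>10</id><text>First draft</text></revision>
    <revision><id>11</id><text>Second draft</text></revision>
  </page>
  <page>
    <title>Dremel</title>
    <id>2</id>
    <revision><id>20</id><text>Stub</text></revision>
  </page>
</mediawiki>
"""


def xml_to_rows(xml_text):
    """Flatten nested <page>/<revision> elements into two lists of row tuples."""
    root = ET.fromstring(xml_text)
    pages, revisions = [], []
    for page in root.iter("page"):
        page_id = int(page.findtext("id"))
        pages.append((page_id, page.findtext("title")))
        # Each nested revision becomes a row that points back at its page.
        for rev in page.findall("revision"):
            revisions.append((int(rev.findtext("id")), page_id, rev.findtext("text")))
    return pages, revisions


def load(conn, pages, revisions):
    """Create the two (assumed) tables and insert the rows with bound parameters."""
    conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE revision ("
        "id INTEGER PRIMARY KEY, page_id INTEGER REFERENCES page(id), body TEXT)"
    )
    conn.executemany("INSERT INTO page VALUES (?, ?)", pages)
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?)", revisions)


pages, revisions = xml_to_rows(SAMPLE_XML)
conn = sqlite3.connect(":memory:")
load(conn, pages, revisions)
```

The nesting (a page containing several revisions) is what gets lost in the relational form: it turns into a foreign key, which is exactly the kind of structure a nested-data query would keep intact.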
Stefan
Sent from my iPad
On 19.01.2013 at 23:21, "Jacques Nadeau" <jacques.drill@gmail.com> wrote:
> Stefan, one other thought. It might also be interesting to explore working
> with the XML representation of the Wikipedia data to push the nested data
> requirements.
>
> Jacques
>
> On Sat, Jan 19, 2013 at 10:51 AM, Jacques Nadeau <jacques.drill@gmail.com> wrote:
>
>>
>>> * I drew a UML diagram. I saw that there is some Gliffy support in
>>> Confluence, but the free account is pretty much useless. I used
>>> OmniGraffle to draw the diagram, but this is payware on the Mac. Is there some
>>> usable freeware alternative? Don't mention tigris :-)
>>
>> I don't have any suggestions on this.
>>> * I have some ideas on the queries, but I am not sure how I should
>>> specify them. Should I use pseudo SQL? Prose? I saw the syntax document on
>>> the server. Is it mature enough that I can attempt to use its syntax? Is there
>>> a BNF or, better, an ANTLR grammar I can use to check my syntax? Should I draw
>>> one up while I am at it?
>>
>> I suggest you target SQL2003 (including subqueries). We're looking at how
>> to use Optiq's SQL parser for Drill. Our goal is to stay as close as
>> possible to that spec but add the following extensions:
>> - Add flatten operator similar to BigQuery syntax.
>> - Support use of selection and output identifiers using dotted/bracketed
>> notation. E.g. "select person.children[0].age as
>> output.profile.firstChildAge"
>> - Support new functions that can accept nested values including
>> collections and maps. For example "select ARRAY_LENGTH(person.children)".
>>
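
To check my understanding of the three extensions listed above, here is a rough Python sketch of the semantics I would expect. This is not Drill code and not its syntax: the helper names (`parse_path`, `get_path`, `set_path`, `array_length`, `flatten`) and the sample record are made up for illustration, and the flatten behaviour (one output row per array element, with the parent fields repeated) is my assumption about what "similar to BigQuery" means.

```python
import copy
import re

# Hypothetical record; field names follow the examples in the quoted message.
record = {
    "person": {
        "name": "Ann",
        "children": [{"name": "Bo", "age": 7}, {"name": "Cy", "age": 4}],
    }
}

# A path step is either an identifier or a bracketed integer index.
_TOKEN = re.compile(r"([A-Za-z_]\w*)|\[(\d+)\]")


def parse_path(path):
    """Split 'person.children[0].age' into ['person', 'children', 0, 'age']."""
    return [name if name else int(index) for name, index in _TOKEN.findall(path)]


def get_path(rec, path):
    """Resolve a dotted/bracketed selection identifier against a nested record."""
    value = rec
    for step in parse_path(path):
        value = value[step]
    return value


def set_path(out, path, value):
    """Write a value under a dotted output identifier (map keys only)."""
    steps = parse_path(path)
    for key in steps[:-1]:
        out = out.setdefault(key, {})
    out[steps[-1]] = value


def array_length(value):
    """Stand-in for a function that accepts a nested collection."""
    return len(value)


def flatten(rec, path):
    """Return one copy of the record per element of the array found at path."""
    rows = []
    for element in get_path(rec, path):
        row = copy.deepcopy(rec)
        set_path(row, path, element)
        rows.append(row)
    return rows


# select person.children[0].age as output.profile.firstChildAge
result = {}
set_path(result, "output.profile.firstChildAge", get_path(record, "person.children[0].age"))

# select ARRAY_LENGTH(person.children)
child_count = array_length(get_path(record, "person.children"))

# flatten(person.children)
rows = flatten(record, "person.children")
```

With the sample record, `result` ends up as a nested map holding the first child's age, `child_count` is the number of children, and `rows` has one entry per child.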
>> Once you have some SQL examples, the next goal would be to manually
>> translate those into Logical Plan syntax. This syntax is still maturing, so
>> I'd take it to the SQL stage first.
>>
>>
>>
>>>
>>>
>>>
>>> Stefan
>>>
>>>
>>>
>>> On 19.01.2013, at 02:05, Jacques Nadeau <jacques.drill@gmail.com> wrote:
>>>
>>>> The wiki is up. Michael and Stefan, it would be great if you started
>>>> putting your use case thoughts there.
>>>>
>>>> Jacques
>>>>
>>>> On Sun, Jan 13, 2013 at 3:31 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>>>>
>>>>> Ahh... yes. That wiki. I will ping infra again.
>>>>>
>>>>> (I was attaching your comment to the Wikipedia use case and had confused
>>>>> myself)
>>>>>
>>>>> On Sun, Jan 13, 2013 at 2:53 PM, Michael Hausenblas <
>>>>> michael.hausenblas@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>> What do you need from me?
>>>>>>
>>>>>> Maybe I've overlooked something, in which case I apologize. I was wondering
>>>>>> if the public wiki for Drill is available, where Stefan, I and others can
>>>>>> write up the use cases and queries.
>>>>>>
>>>>>> Cheers,
>>>>>> Michael
>>>>>>
>>>>>> --
>>>>>> Michael Hausenblas
>>>>>> Ireland, Europe
>>>>>> http://mhausenblas.info/
>>>>>>
>>>>>> On 13 Jan 2013, at 14:20, Ted Dunning <ted.dunning@gmail.com> wrote:
>>>>>>
>>>>>>> What do you need from me?
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Jan 13, 2013 at 11:06 AM, Michael Hausenblas <
>>>>>>> michael.hausenblas@gmail.com> wrote:
>>>>>>>
>>>>>>>> As soon as we hear back from Ted re the wiki, we will work there.
>>