gora-user mailing list archives

From lewis john mcgibbney <lewi...@apache.org>
Subject Re: GSoC Ideas
Date Thu, 15 Mar 2018 07:26:09 GMT
I should also say, ALL of the projects below which I have named require the
Gora dependency to be upgraded.

On Thu, Mar 15, 2018 at 12:24 AM, lewis john mcgibbney <lewismc@apache.org>
wrote:

> Hi Renato,
> On Wed, Mar 14, 2018 at 3:22 PM, Renato Marroquín Mogrovejo <
> renatoj.marroquin@gmail.com> wrote:
>> Hey guys,
>> There might not be an integration or converter from Arrow to Avro (or
>> vice versa) because there are Parquet readers that can take Avro, and once
>> the data is in Parquet, Arrow can be used directly.
> Yes there might not be. I actually raised this issue [0] a wee while ago
> on the Arrow list. At that time I was told, "...The use case you outline
> makes a lot of sense for Arrow to help out with. We don't yet have an AVRO
> <> Arrow converter written but it is something that would be great to
> have." So maybe that would be something to keep in mind.
> [0] https://s.apache.org/2GwS
>> Regarding an integration of Parquet with Gora, I think it would be
>> interesting to make it easier for people to read and write Parquet files by
>> providing a higher-level API, as Gora does. However, for you @Talat,
>> who knows Gora pretty well, maybe you could take another project that
>> helps Gora more. For example, fixing the integration with Nutch. There are
>> multiple loose ends in Nutch 2.x and Gora that we have neglected as a
>> community.
>> IMHO that should be a GSoC project.
> ACK, other existing projects which consume Gora are (off the top of my
> head),
>    - Chukwa - https://s.apache.org/cW6a
>    - Giraph - https://github.com/apache/giraph/tree/trunk/giraph-gora
>    - Camel - https://camel.apache.org/gora.html
>    - Nutch 2.X - https://github.com/apache/nutch/tree/2.x
> An interesting idea I had for where Gora could be used is Hadoop
> metrics:
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Metrics.html
> This would provide a textbook use case for Gora: storing Hadoop
> metrics in some datastore, which would then be exposed for query and
> analysis.
>> I can't mentor it because I do not have enough insight into this, but
>> @Lewis and @Talat, you can probably tackle this as mentor and student. This
>> would be an awesome contribution to the project, as there are quite a lot of
>> people looking at Nutch and trying to use it with Gora.
>> Just my 2c
> Understood Renato, no biggie. Thanks for your input. I know you are
> working with Parquet a lot these days so your input is appreciated.
> Lewis

