flink-user mailing list archives

From Fabian Hueske <fhueske@apache.org>
Subject Re: Flink Mongodb
Date Tue, 04 Nov 2014 15:37:49 GMT
That would be great! :-)

2014-11-04 15:26 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:

> Ok! I hope to write a blog post by tomorrow evening!
>
> On Tue, Nov 4, 2014 at 3:16 PM, Stephan Ewen <sewen@apache.org> wrote:
>
>> Absolutely, please share the example!
>>
>> On Tue, Nov 4, 2014 at 3:02 PM, Flavio Pompermaier <pompermaier@okkam.it>
>> wrote:
>>
>>> Sorry, I was looking at the wrong MongoInputFormat; the correct one is
>>> this:
>>>
>>>
>>> https://github.com/mongodb/mongo-hadoop/blob/master/core/src/main/java/com/mongodb/hadoop/mapred/MongoInputFormat.java
>>>
>>> So now I have my working example. Would you be interested in me sharing
>>> it? It works with both Avro and Kryo as the default serializer (see
>>> GenericTypeInfo.createSerializer()).
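>>>
>>> The gist of it is something like this (a rough sketch; the Mongo URI is
>>> a placeholder, and the imports assume Flink 0.7's hadoop-compatibility
>>> module together with mongo-hadoop's mapred API):
>>>
>>> import org.apache.flink.api.java.DataSet;
>>> import org.apache.flink.api.java.ExecutionEnvironment;
>>> import org.apache.flink.api.java.tuple.Tuple2;
>>> import org.apache.flink.hadoopcompatibility.mapred.HadoopInputFormat;
>>> import org.apache.hadoop.mapred.JobConf;
>>> import org.bson.BSONObject;
>>> import com.mongodb.hadoop.mapred.MongoInputFormat;
>>>
>>> public class FlinkMongoExample {
>>>   public static void main(String[] args) throws Exception {
>>>     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>>>
>>>     // mongo-hadoop reads the source collection from "mongo.input.uri";
>>>     // the URI below is only a placeholder.
>>>     JobConf conf = new JobConf();
>>>     conf.set("mongo.input.uri", "mongodb://localhost:27017/mydb.mycollection");
>>>
>>>     // The mapred MongoInputFormat goes through Flink's mapred wrapper,
>>>     // which takes a JobConf (the mapreduce wrapper takes a Job instead).
>>>     HadoopInputFormat<Object, BSONObject> hdIf =
>>>         new HadoopInputFormat<Object, BSONObject>(
>>>             new MongoInputFormat(), Object.class, BSONObject.class, conf);
>>>
>>>     // Each record is a (document id, BSON document) pair.
>>>     DataSet<Tuple2<Object, BSONObject>> docs = env.createInput(hdIf);
>>>
>>>     docs.print();
>>>     env.execute("Read from MongoDB");
>>>   }
>>> }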
>>>
>>> On Tue, Nov 4, 2014 at 10:30 AM, Flavio Pompermaier <
>>> pompermaier@okkam.it> wrote:
>>>
>>>> I don't know if that is possible anymore...
>>>>
>>>> AzureTableInputFormat extends InputFormat<Text, WritableEntity>
>>>> while MongoInputFormat extends InputFormat<Object, BSONObject>
>>>>
>>>> and thus I cannot do the following:
>>>>
>>>> HadoopInputFormat<Object, BSONObject> hdIf =
>>>>     new HadoopInputFormat<Object, BSONObject>(
>>>>         new MongoInputFormat(), Object.class, BSONObject.class, new Job());
>>>>
>>>> Am I doing something wrong, or is this a problem with Flink?
>>>>
>>>>
>>>> On Tue, Nov 4, 2014 at 10:03 AM, Flavio Pompermaier <
>>>> pompermaier@okkam.it> wrote:
>>>>
>>>>> What do you mean by "might lack support for local split assignment"?
>>>>> Do you mean that the InputFormat is not serializable, and that this is
>>>>> not the case for MongoDB?
>>>>>
>>>>>
>>>>> On Tue, Nov 4, 2014 at 10:00 AM, Fabian Hueske <fhueske@apache.org>
>>>>> wrote:
>>>>>
>>>>>> There's a page about Hadoop Compatibility that shows how to use the
>>>>>> wrapper.
>>>>>>
>>>>>> The HBase format should work as well, but might lack support for
>>>>>> local split assignment. In that case performance would suffer a lot.
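>>>>>>
>>>>>> Untested, but it should follow the same pattern (a sketch; the table
>>>>>> name is a placeholder, and since HBase's TableInputFormat is a
>>>>>> mapreduce format, it goes through the mapreduce wrapper with a Job):
>>>>>>
>>>>>> import org.apache.flink.api.java.DataSet;
>>>>>> import org.apache.flink.api.java.ExecutionEnvironment;
>>>>>> import org.apache.flink.api.java.tuple.Tuple2;
>>>>>> import org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat;
>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>> import org.apache.hadoop.hbase.HBaseConfiguration;
>>>>>> import org.apache.hadoop.hbase.client.Result;
>>>>>> import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
>>>>>> import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
>>>>>> import org.apache.hadoop.mapreduce.Job;
>>>>>>
>>>>>> public class FlinkHBaseSketch {
>>>>>>   public static void main(String[] args) throws Exception {
>>>>>>     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>>>>>>
>>>>>>     // TableInputFormat reads the table name from
>>>>>>     // "hbase.mapreduce.inputtable" (TableInputFormat.INPUT_TABLE);
>>>>>>     // "mytable" is a placeholder.
>>>>>>     Configuration hbaseConf = HBaseConfiguration.create();
>>>>>>     hbaseConf.set(TableInputFormat.INPUT_TABLE, "mytable");
>>>>>>     Job job = Job.getInstance(hbaseConf);
>>>>>>
>>>>>>     HadoopInputFormat<ImmutableBytesWritable, Result> hbaseIf =
>>>>>>         new HadoopInputFormat<ImmutableBytesWritable, Result>(
>>>>>>             new TableInputFormat(), ImmutableBytesWritable.class,
>>>>>>             Result.class, job);
>>>>>>
>>>>>>     DataSet<Tuple2<ImmutableBytesWritable, Result>> rows = env.createInput(hbaseIf);
>>>>>>     rows.print();
>>>>>>     env.execute("Read from HBase");
>>>>>>   }
>>>>>> }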
>>>>>>
>>>>>> On Tuesday, November 4, 2014, Flavio Pompermaier wrote:
>>>>>>
>>>>>>> Should I start from
>>>>>>> http://flink.incubator.apache.org/docs/0.7-incubating/example_connectors.html
>>>>>>> ? Is it ok?
>>>>>>> Thus, in principle, HBase's TableInputFormat could also be used in a
>>>>>>> similar way, couldn't it?
>>>>>>>
>>>>>>> On Tue, Nov 4, 2014 at 9:42 AM, Fabian Hueske <fhueske@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> the blog post uses Flink's wrapper for Hadoop InputFormats.
>>>>>>>> This has been ported to the new API and is described in the
>>>>>>>> documentation.
>>>>>>>>
>>>>>>>> So you just need to take Mongo's Hadoop IF and plug it into the new
>>>>>>>> IF wrapper. :-)
>>>>>>>>
>>>>>>>> Fabian
>>>>>>>>
>>>>>>>> On Tuesday, November 4, 2014, Flavio Pompermaier wrote:
>>>>>>>>
>>>>>>>>> Hi to all,
>>>>>>>>>
>>>>>>>>> I saw this post
>>>>>>>>> https://flink.incubator.apache.org/news/2014/01/28/querying_mongodb.html
>>>>>>>>> but it uses the old APIs (HadoopDataSource instead of DataSource).
>>>>>>>>> How can I use MongoDB with the new Flink APIs?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Flavio
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>
>
