flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: Flink Mongodb
Date Tue, 04 Nov 2014 14:02:10 GMT
Sorry, I was looking at the wrong MongoInputFormat; the correct one is this:

https://github.com/mongodb/mongo-hadoop/blob/master/core/src/main/java/com/mongodb/hadoop/mapred/MongoInputFormat.java

So now I have a working example. Would you be interested in sharing it?
It works with both Avro and Kryo as the default serializer (see
GenericTypeInfo.createSerializer()).
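For reference, the working setup might look roughly like the sketch below: the mapreduce-API MongoInputFormat plugged into Flink's Hadoop compatibility wrapper, as discussed earlier in this thread. The MongoDB URI, database, and collection names are placeholders, and the exact package names follow the 0.7-incubating layout; adjust for your version.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.bson.BSONObject;

import com.mongodb.hadoop.MongoInputFormat;

public class MongoFlinkExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Configure mongo-hadoop: the input URI is a placeholder,
        // point it at your own database and collection.
        Job job = Job.getInstance();
        job.getConfiguration().set("mongo.input.uri",
                "mongodb://localhost:27017/mydb.mycollection");

        // Wrap the mapreduce-API MongoInputFormat (InputFormat<Object, BSONObject>)
        // in Flink's Hadoop compatibility wrapper.
        HadoopInputFormat<Object, BSONObject> hdIf =
                new HadoopInputFormat<Object, BSONObject>(
                        new MongoInputFormat(), Object.class, BSONObject.class, job);

        // Each record arrives as a (key, BSON document) pair.
        DataSet<Tuple2<Object, BSONObject>> input = env.createInput(hdIf);
        input.print();
    }
}
```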

On Tue, Nov 4, 2014 at 10:30 AM, Flavio Pompermaier <pompermaier@okkam.it>
wrote:

> I don't know if that's possible anymore..
>
> AzureTableInputFormat extends InputFormat<Text, WritableEntity>
> while MongoInputFormat extends InputFormat<Object, BSONObject>
>
> and thus I cannot do the following..
>
> HadoopInputFormat<Object, BSONObject> hdIf = new HadoopInputFormat<Object,
> BSONObject>(
>      new MongoInputFormat(), Object.class, BSONObject.class, new Job());
>
> Am I doing something wrong, or is this a problem with Flink?
>
>
> On Tue, Nov 4, 2014 at 10:03 AM, Flavio Pompermaier <pompermaier@okkam.it>
> wrote:
>
>> What do you mean by "might lack support for local split assignment"?
>> Do you mean that the InputFormat is not serializable? And that this is not
>> the case for MongoDB?
>>
>>
>> On Tue, Nov 4, 2014 at 10:00 AM, Fabian Hueske <fhueske@apache.org>
>> wrote:
>>
>>> There's a page about Hadoop Compatibility that shows how to use the
>>> wrapper.
>>>
>>> The HBase format should work as well, but it might lack support for local
>>> split assignment. In that case, performance would suffer a lot.
>>>
>>> On Tuesday, 4 November 2014, Flavio Pompermaier wrote:
>>>
>>>> Should I start from
>>>> http://flink.incubator.apache.org/docs/0.7-incubating/example_connectors.html
>>>> ? Is it ok?
>>>> Thus, in principle, the TableInputFormat of HBase could also be used in
>>>> a similar way, couldn't it?
>>>>
>>>> On Tue, Nov 4, 2014 at 9:42 AM, Fabian Hueske <fhueske@apache.org>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> the blog post uses Flink's wrapper for Hadoop InputFormats.
>>>>> This has been ported to the new API and is described in the
>>>>> documentation.
>>>>>
>>>>> So you just need to take Mongo's Hadoop InputFormat and plug it into the
>>>>> new wrapper. :-)
>>>>>
>>>>> Fabian
>>>>>
>>>>> On Tuesday, 4 November 2014, Flavio Pompermaier wrote:
>>>>>
>>>>> Hi to all,
>>>>>>
>>>>>> I saw this post
>>>>>> https://flink.incubator.apache.org/news/2014/01/28/querying_mongodb.html
>>>>>> but it uses the old API (HadoopDataSource instead of DataSource).
>>>>>> How can I use Mongodb with the new Flink APIs?
>>>>>>
>>>>>> Best,
>>>>>> Flavio
>>>>>>
>>>>>
>>>>
