beam-commits mailing list archives

From "nevi_me (JIRA)" <>
Subject [jira] [Commented] (BEAM-2639) Unbounded Source for MongoDB
Date Tue, 28 Nov 2017 05:26:00 GMT


nevi_me commented on BEAM-2639:

MongoDB 3.6 will include a `$changeStream` operator, enabling unbounded access to collections.
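
For illustration only, here is a minimal Python sketch of how a client might consume such a stream. It assumes a PyMongo-style collection whose `watch()` method returns an iterable change stream usable as a context manager, and a server running as a replica set; the helper name `tail_changes` is hypothetical and not part of Beam or MongoDB.

```python
from typing import Any, Dict, Iterator, Optional, Tuple


def tail_changes(
    collection: Any,
    resume_token: Optional[Dict[str, Any]] = None,
) -> Iterator[Tuple[Dict[str, Any], Dict[str, Any]]]:
    """Yield (resume_token, change_event) pairs from a change stream.

    `collection` is assumed to behave like a pymongo Collection, i.e. it
    provides `watch(resume_after=...)`. Hypothetical helper for illustration.
    """
    # Open the change stream; resume_after restarts it just after a
    # previously seen event, or from "now" when the token is None.
    with collection.watch(resume_after=resume_token) as stream:
        for change in stream:
            # Each event's `_id` field is its resume token, which a reader
            # could persist as a checkpoint.
            yield change["_id"], change
```

A caller would store the most recent token and pass it back as `resume_token` after a restart, which is roughly the role a checkpoint would play in an unbounded source.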

I'm busy preparing to upgrade MongoDB. I tail the oplog a lot and want to switch to these
change streams. I'd like to contribute an unbounded source for MongoDB, as I think it'll help
me learn more about Beam's internals. It's not a high priority, but since it would be my first
contribution to the ASF, I might need some hand-holding when the time comes.

The first thing I need to investigate is whether there are breaking changes to the way users
authenticate to databases, since some drivers have been logging deprecation warnings about
upcoming 3.6 changes.

I'll provide feedback mid-December when I have downtime.

> Unbounded Source for MongoDB
> ----------------------------
>                 Key: BEAM-2639
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>    Affects Versions: 2.0.0
>            Reporter: nevi_me
>            Assignee: Jean-Baptiste Onofré
>            Priority: Minor
> The current MongoDB source is bounded, which means that we can't build streaming pipelines
> directly from MongoDB.
> MongoDB publishes changes in each collection through the oplog. Would it be possible
> to create a connector that reads the oplog to create an unbounded source?
> As an oplog is only available through replication, this creates that dependency. We would
> need to also consider whether a polling method (using the ObjectId) could be an appropriate
> Thanks
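
The ObjectId polling idea mentioned in the issue could look roughly like the Python sketch below. This is an assumption-laden illustration, not a proposed design: `poll_once` and `in_memory_fetch` are hypothetical names, and the in-memory list stands in for a real query such as `collection.find({"_id": {"$gt": last_id}}).sort("_id", 1)` in PyMongo.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Document = Dict[str, Any]
FetchAfter = Callable[[Optional[Any]], List[Document]]


def poll_once(fetch_after: FetchAfter, last_id: Optional[Any] = None) -> Tuple[List[Document], Optional[Any]]:
    """Fetch documents newer than `last_id` and return them with the new checkpoint."""
    docs = list(fetch_after(last_id))
    if docs:
        # Results are assumed sorted ascending by _id, so the last one
        # becomes the checkpoint for the next poll.
        last_id = docs[-1]["_id"]
    return docs, last_id


def in_memory_fetch(store: List[Document]) -> FetchAfter:
    """Build a stand-in for a MongoDB query over a plain Python list (illustration only)."""
    def fetch_after(last_id: Optional[Any]) -> List[Document]:
        newer = [d for d in store if last_id is None or d["_id"] > last_id]
        return sorted(newer, key=lambda d: d["_id"])
    return fetch_after
```

Such polling would only notice inserts, not updates or deletes, and because ObjectIds are typically generated client-side with a leading timestamp, their ordering is only approximately chronological, so a slow writer could be missed.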

This message was sent by Atlassian JIRA
