spark-user mailing list archives

From Daniel Stojanov <m...@danielstojanov.com>
Subject Re: MongoDB plugin to Spark - too many open cursors
Date Tue, 27 Oct 2020 04:07:20 GMT
Hi,

Thanks.

I believe this error message comes from the MongoDB server itself. There 
are multiple instances of my application running at the same time. With a 
single application, or a small number of them, there are never any issues; 
the problem only appears once enough applications are running concurrently.

I am not sure how the MongoDB client manages its connections. For 
example, does it leave connections hanging (rather than closing them) 
after it has pulled data from MongoDB? I also do not know whether 
individual running applications can be told to limit their number of 
active connections to the database. The database instance is running on 
AWS DocumentDB, so the only way to allow additional cursors is to upgrade 
to a larger instance type. That seems unnecessary, since my concern is 
only the number of open cursors, not the performance of the hardware 
itself.
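
One thing I am planning to try from the Spark side is capping the driver's 
connection pool through the connection URI and using fewer, larger input 
partitions, since each Spark partition appears to read through its own 
cursor. A rough sketch (host, database, and collection names are 
placeholders; maxPoolSize is a standard MongoDB connection-string option, 
and the partitioner settings are documented connector options, though I 
have not verified that every partitioner works against DocumentDB):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("mongo-read")
  // maxPoolSize is a standard MongoDB connection-string option; it caps
  // the number of pooled connections this application's driver will open.
  .config("spark.mongodb.input.uri",
    "mongodb://example-host:27017/exampledb.examplecoll?maxPoolSize=10")
  // Fewer, larger partitions mean fewer concurrent cursors, since each
  // Spark partition reads through its own cursor.
  .config("spark.mongodb.input.partitioner", "MongoSamplePartitioner")
  .config("spark.mongodb.input.partitionerOptions.partitionSizeMB", "256")
  .getOrCreate()

val df = spark.read
  .format("com.mongodb.spark.sql.DefaultSource")
  .load()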


Regards,




On 26/10/20 1:52 pm, lec ssmi wrote:
> Is the connection pool configured for MongoDB full?
>
> Daniel Stojanov <mail@danielstojanov.com 
> <mailto:mail@danielstojanov.com>> wrote on Mon, 26 Oct 2020 at 10:28:
>
>     Hi,
>
>
>     I receive an error message from the MongoDB server when too many
>     Spark applications (about 3 or 4) try to access the database at the
>     same time: "Cannot open a new cursor since too many cursors are
>     already opened." I am not sure how to remedy this, and I am not sure
>     how the plugin behaves while it is pulling data.
>
>     It appears that a single running application opens many connections
>     to the database. The total number of open cursors on the database is
>     far greater than the number of read operations occurring in Spark.
>
>
>     Does the plugin keep a connection/cursor open to the database even
>     after it has pulled the data into a dataframe?
>
>     Why are there so many open cursors for a single read operation?
>
>     Does catching the exception, sleeping for a while, and then trying
>     again make sense? If cursors are kept open throughout the life of
>     the application, this approach would not help.
>
>
>     Plugin version: org.mongodb.spark:mongo-spark-connector_2.12:2.4.1
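
On the catch-sleep-retry question quoted above, something like the sketch 
below is what I had in mind. The exception class is an assumption (the 
actual type depends on the driver version and on how the connector 
surfaces the server error), and since Spark reads lazily the retry may 
need to wrap whatever action actually pulls the data, not just load():

import org.apache.spark.sql.{DataFrame, SparkSession}

def loadWithRetry(spark: SparkSession,
                  attemptsLeft: Int = 5,
                  waitMs: Long = 30000L): DataFrame =
  try {
    spark.read
      .format("com.mongodb.spark.sql.DefaultSource")
      .load()
  } catch {
    // Assumption: the "too many cursors" error arrives as a command
    // exception from the driver; adjust the class to whatever is
    // actually thrown.
    case _: com.mongodb.MongoCommandException if attemptsLeft > 1 =>
      // Back off so concurrently running applications can finish and
      // release their cursors before trying again.
      Thread.sleep(waitMs)
      loadWithRetry(spark, attemptsLeft - 1, waitMs * 2)
  }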
