spark-dev mailing list archives

From Chetan Khatri <chetan.opensou...@gmail.com>
Subject How to parallelize JDBC Read in Spark
Date Thu, 06 Sep 2018 11:17:23 GMT
Hello Dev Users,

I am struggling to parallelize a JDBC read in Spark. It uses only 1-2 tasks
to read the data, so the read takes a very long time.

Example:

val invoiceLineItemDF = spark.read.jdbc(
  url = t360jdbcURL,
  table = invoiceLineItemQuery,
  columnName = "INVOICE_LINE_ITEM_ID",
  lowerBound = 1L,
  upperBound = 1000000L,
  numPartitions = 200,
  connectionProperties = connectionProperties
)
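
For reference, here is a rough sketch of how a lowerBound/upperBound/numPartitions split can translate into one WHERE clause per partition. It is a simplified illustration only, not Spark's actual implementation: `PartitionSketch` and `partitionPredicates` are hypothetical names, and the stride arithmetic is an assumption that may differ from what Spark computes internally.

```scala
// Hypothetical helper, not part of Spark: a simplified stride-based range split.
object PartitionSketch {
  def partitionPredicates(
      column: String,
      lowerBound: Long,
      upperBound: Long,
      numPartitions: Int): Seq[String] = {
    require(numPartitions >= 2, "need at least two partitions")
    // Assumed stride formula; Spark's own arithmetic may differ slightly.
    val stride = (upperBound - lowerBound) / numPartitions
    require(stride >= 1, "range too small for the requested partition count")

    // Boundary between partition i - 1 and partition i.
    def boundary(i: Int): Long = lowerBound + i * stride

    (0 until numPartitions).map { i =>
      if (i == 0) {
        // First range is open below and also picks up NULL keys.
        s"$column < ${boundary(1)} OR $column IS NULL"
      } else if (i == numPartitions - 1) {
        // Last range is open above.
        s"$column >= ${boundary(i)}"
      } else {
        s"$column >= ${boundary(i)} AND $column < ${boundary(i + 1)}"
      }
    }
  }
}

// Small usage example: print the four ranges for keys split between 0 and 100.
object PartitionSketchDemo extends App {
  PartitionSketch.partitionPredicates("id", 0L, 100L, 4).foreach(println)
}
```

In this sketch the first and last ranges are open-ended, so the bounds only set the stride and do not filter rows. If most INVOICE_LINE_ITEM_ID values fall outside 1 to 1000000, they would all land in the first or last range, which might be one reason only 1-2 tasks end up doing the work.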


Thanks
