Hi, is the subquery a user-defined SQL statement or a table name in the DB?
If it is user-defined SQL, make sure your partition column is in the main SELECT clause of that statement.
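
Something like the following is what I mean. It is only a rough sketch: the table `my_table`, its columns, the alias `q`, and the connection values are all made up for illustration, and it assumes the Oracle JDBC driver is on the classpath. The point is just that `id` appears in the outer SELECT of the subquery, so Spark can add its own range predicates on it.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("jdbc-partitioned-read").getOrCreate()

// Hypothetical connection details; replace with your own.
val jdbcUrl = "jdbc:oracle:thin:@//dbhost:1521/SERVICE"
val user = "scott"
val pwd = "secret"

// Hypothetical subquery: wrapped in parentheses, given an alias,
// and exposing the partition column `id` in its main SELECT list.
val subquery = "(SELECT t.id, t.name FROM my_table t WHERE t.status = 'ACTIVE') q"

val df = spark.read.format("jdbc")
  .option("url", jdbcUrl)
  .option("user", user)
  .option("password", pwd)
  .option("dbtable", subquery)
  .option("partitionColumn", "id")
  .option("lowerBound", "1")
  .option("upperBound", "500000")
  .option("numPartitions", "30")
  .load()

// Quick check of how many partitions the resulting DataFrame has.
println(df.rdd.getNumPartitions)
```

If the printed partition count matches `numPartitions` but the UI still shows a single task, the bottleneck is probably somewhere other than the read options, so it is worth checking which stage is actually running with one task.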

I am trying to fetch data from an Oracle DB using a subquery and am experiencing a lot of performance issues.


Below is the query I am using (Spark 2.0.2):


val df = spark_session.read.format("jdbc")
  .option("url", jdbc_url)
  .option("user", user)
  .option("password", pwd)
  .option("dbtable", "subquery")  // stands in for the actual subquery
  .option("partitionColumn", "id")  // primary key column, uniformly distributed
  .option("lowerBound", "1")
  .option("upperBound", "500000")
  .option("numPartitions", 30)
  .load()  // needed to get a DataFrame back from the reader


The above query is configured to run with 30 partitions, but when I look at the Spark UI it is using only 1 partition to run the query.


Can anyone tell me if I am missing anything, or whether I need to do anything else to tune the performance of the query?