spark-user mailing list archives

From "Lavelle, Shawn" <Shawn.Lave...@osii.com>
Subject DataSource API v2 & Spark-SQL
Date Mon, 03 Aug 2020 12:27:18 GMT
Hello Spark community,
   I have a custom data source written against the v1 API that I'm trying to port to the v2 API, in Java.  Currently
I have a DataSource registered via catalog.createTable(name, <package>, schema, options
map).  When I try to do the same with the data source API v2, I get an error saying my class (package)
isn't a valid data source. Can you help me out?

The Spark version is 3.0.0 with Scala 2.12; the artifacts are spark-core, spark-sql, spark-hive, spark-hive-thriftserver,
and spark-catalyst.

Here's the data source definition:  public class LogTableSource implements TableProvider,
SupportsRead, DataSourceRegister, Serializable

I'm guessing that I am missing one of the required interfaces. Note, I did try naming
the LogTableSource above "DefaultSource", but the behavior is the same.  Also, I keep reading
about a DataSourceV2 marker interface, but it seems to be deprecated?

Also, I tried to add DataSourceV2ScanRelation, but that won't compile:
Output() in DataSourceV2ScanRelation cannot override Output() in QueryPlan return type Seq<AttributeReference>
is not compatible with Seq<Attribute>
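
One possible reading of this error is that Java generics are invariant: a Scala Seq is covariant in Scala, but javac treats Seq<AttributeReference> and Seq<Attribute> as unrelated types, so one cannot stand in for the other as an overriding return type. The sketch below reproduces that rule with java.util.List and stand-in classes. Attribute, AttributeReference, Plan and ScanPlan here are hypothetical illustrations, not Spark's real types, and whether this is the actual cause in Spark 3.0.0 is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-ins for illustration only; these are NOT Spark's classes.
class Attribute {
    final String name;
    Attribute(String name) { this.name = name; }
}

class AttributeReference extends Attribute {
    AttributeReference(String name) { super(name); }
}

// Plays the role of QueryPlan: output() is declared with the wider element type.
abstract class Plan {
    abstract List<Attribute> output();
}

// Plays the role of a relation that holds the narrower element type internally.
class ScanPlan extends Plan {
    private final List<AttributeReference> refs =
        Arrays.asList(new AttributeReference("ts"), new AttributeReference("msg"));

    // Declaring `List<AttributeReference> output()` here would be rejected by javac:
    // List<AttributeReference> is not a subtype of List<Attribute>, so it is not a
    // valid covariant return type for the overridden method.
    @Override
    List<Attribute> output() {
        // Copying into a list of the wider element type is allowed.
        return new ArrayList<Attribute>(refs);
    }
}

class OutputOverrideDemo {
    public static void main(String[] args) {
        Plan plan = new ScanPlan();
        for (Attribute a : plan.output()) {
            System.out.println(a.name);
        }
    }
}
```

If this reading is right, the error would come from how javac views the Scala-compiled class rather than from anything in LogTableSource itself.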

  I'm fairly stumped - everything I've read online says there's a marker interface of some
kind and yet I can't find it in my package list.

  Looking forward to hearing from you,

~ Shawn

Shawn Lavelle

Software Development

4101 Arrowhead Drive
Medina, Minnesota 55340-9457
Phone: 763 551 0559
Email: Shawn.Lavelle@osii.com
Website: https://www.osii.com
