From issues-return-177758-apmail-flink-issues-archive=flink.apache.org@flink.apache.org Sun Jul 15 08:26:03 2018
Mailing-List: contact issues-help@flink.apache.org; run by ezmlm
Reply-To: dev@flink.apache.org
Date: Sun, 15 Jul 2018 08:26:00 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@flink.apache.org
Subject: [jira] [Commented] (FLINK-8558) Add unified format interfaces and format discovery
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/FLINK-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16544464#comment-16544464 ]

ASF GitHub Bot commented on FLINK-8558:
---------------------------------------

Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/6323#discussion_r202535180

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/factories/TableFactoryService.scala ---
    @@ -18,143 +18,358 @@
     package org.apache.flink.table.factories
     
    -import java.util.{ServiceConfigurationError, ServiceLoader}
    +import java.util.{ServiceConfigurationError, ServiceLoader, Map => JMap}
     
     import org.apache.flink.table.api._
     import org.apache.flink.table.descriptors.ConnectorDescriptorValidator._
     import org.apache.flink.table.descriptors.FormatDescriptorValidator._
     import org.apache.flink.table.descriptors.MetadataValidator._
     import org.apache.flink.table.descriptors.StatisticsValidator._
    -import org.apache.flink.table.descriptors.{DescriptorProperties, TableDescriptor, TableDescriptorValidator}
    +import org.apache.flink.table.descriptors._
     import org.apache.flink.table.util.Logging
    +import org.apache.flink.util.Preconditions
     
     import _root_.scala.collection.JavaConverters._
     import _root_.scala.collection.mutable
     
     /**
    -  * Unified interface to search for TableFactoryDiscoverable of provided type and properties.
    +  * Unified interface to search for a [[TableFactory]] of provided type and properties.
       */
     object TableFactoryService extends Logging {
     
       private lazy val defaultLoader = ServiceLoader.load(classOf[TableFactory])
     
    -  def find(clz: Class[_], descriptor: TableDescriptor): TableFactory = {
    -    find(clz, descriptor, null)
    +  /**
    +    * Finds a table factory of the given class and descriptor.
    +    *
    +    * @param factoryClass desired factory class
    +    * @param descriptor descriptor describing the factory configuration
    +    * @tparam T factory class type
    +    * @return the matching factory
    +    */
    +  def find[T](factoryClass: Class[T], descriptor: Descriptor): T = {
    +    Preconditions.checkNotNull(factoryClass)
    +    Preconditions.checkNotNull(descriptor)
    +
    +    val descriptorProperties = new DescriptorProperties()
    +    descriptor.addProperties(descriptorProperties)
    +    findInternal(factoryClass, descriptorProperties.asMap, None)
       }
     
    -  def find(clz: Class[_], descriptor: TableDescriptor, classLoader: ClassLoader)
    -  : TableFactory = {
    +  /**
    +    * Finds a table factory of the given class, descriptor, and classloader.
    +    *
    +    * @param factoryClass desired factory class
    +    * @param descriptor descriptor describing the factory configuration
    +    * @param classLoader classloader for service loading
    +    * @tparam T factory class type
    +    * @return the matching factory
    +    */
    +  def find[T](factoryClass: Class[T], descriptor: Descriptor, classLoader: ClassLoader): T = {
    +    Preconditions.checkNotNull(factoryClass)
    +    Preconditions.checkNotNull(descriptor)
    +    Preconditions.checkNotNull(classLoader)
     
    -    val properties = new DescriptorProperties()
    -    descriptor.addProperties(properties)
    -    find(clz, properties.asMap.asScala.toMap, classLoader)
    +    val descriptorProperties = new DescriptorProperties()
    +    descriptor.addProperties(descriptorProperties)
    +    findInternal(factoryClass, descriptorProperties.asMap, Some(classLoader))
       }
     
    -  def find(clz: Class[_], properties: Map[String, String]): TableFactory = {
    -    find(clz: Class[_], properties, null)
    +  /**
    +    * Finds a table factory of the given class and property map.
    +    *
    +    * @param factoryClass desired factory class
    +    * @param propertyMap properties that describe the factory configuration
    +    * @tparam T factory class type
    +    * @return the matching factory
    +    */
    +  def find[T](factoryClass: Class[T], propertyMap: JMap[String, String]): T = {
    +    Preconditions.checkNotNull(factoryClass)
    +    Preconditions.checkNotNull(propertyMap)
    +
    +    findInternal(factoryClass, propertyMap, None)
       }
     
    -  def find(clz: Class[_], properties: Map[String, String],
    -      classLoader: ClassLoader): TableFactory = {
    +  /**
    +    * Finds a table factory of the given class, property map, and classloader.
    +    *
    +    * @param factoryClass desired factory class
    +    * @param propertyMap properties that describe the factory configuration
    +    * @param classLoader classloader for service loading
    +    * @tparam T factory class type
    +    * @return the matching factory
    +    */
    +  def find[T](
    +      factoryClass: Class[T],
    +      propertyMap: JMap[String, String],
    +      classLoader: ClassLoader)
    +    : T = {
    +    Preconditions.checkNotNull(factoryClass)
    +    Preconditions.checkNotNull(propertyMap)
    +    Preconditions.checkNotNull(classLoader)
    +
    +    findInternal(factoryClass, propertyMap, Some(classLoader))
    +  }
    +
    +  /**
    +    * Finds a table factory of the given class, property map, and classloader.
    +    *
    +    * @param factoryClass desired factory class
    +    * @param propertyMap properties that describe the factory configuration
    +    * @param classLoader classloader for service loading
    +    * @tparam T factory class type
    +    * @return the matching factory
    +    */
    +  private def findInternal[T](
    +      factoryClass: Class[T],
    +      propertyMap: JMap[String, String],
    +      classLoader: Option[ClassLoader])
    +    : T = {
    +
    +    val properties = propertyMap.asScala.toMap
    +
    +    // discover table factories
    +    val foundFactories = discoverFactories(classLoader)
     
    -    var matchingFactory: Option[(TableFactory, Seq[String])] = None
    +    // filter by factory class
    +    val classFactories = filterByFactoryClass(
    +      factoryClass,
    +      properties,
    +      foundFactories)
    +
    +    // find matching context
    +    val contextFactories = filterByContext(
    +      factoryClass,
    +      properties,
    +      foundFactories,
    +      classFactories)
    +
    +    // filter by supported keys
    +    filterBySupportedProperties(
    +      factoryClass,
    +      properties,
    +      foundFactories,
    +      contextFactories)
    +  }
    +
    +  /**
    +    * Searches for factories using Java service providers.
    +    *
    +    * @return all factories in the classpath
    +    */
    +  private def discoverFactories[T](classLoader: Option[ClassLoader]): Seq[TableFactory] = {
    +    val foundFactories = mutable.ArrayBuffer[TableFactory]()
         try {
    -      val iter = if (classLoader == null) {
    -        defaultLoader.iterator()
    -      } else {
    -        val customLoader = ServiceLoader.load(classOf[TableFactory], classLoader)
    -        customLoader.iterator()
    +      val iterator = classLoader match {
    +        case Some(customClassLoader) =>
    +          val customLoader = ServiceLoader.load(classOf[TableFactory], customClassLoader)
    +          customLoader.iterator()
    +        case None =>
    +          defaultLoader.iterator()
           }
    -      while (iter.hasNext) {
    -        val factory = iter.next()
    -
    -        if (clz.isAssignableFrom(factory.getClass)) {
    -          val requiredContextJava = try {
    -            factory.requiredContext()
    -          } catch {
    -            case t: Throwable =>
    -              throw new TableException(
    -                s"Table source factory '${factory.getClass.getCanonicalName}' caused an exception.",
    -                t)
    -          }
    -
    -          val requiredContext = if (requiredContextJava != null) {
    -            // normalize properties
    -            requiredContextJava.asScala.map(e => (e._1.toLowerCase, e._2))
    -          } else {
    -            Map[String, String]()
    -          }
    -
    -          val plainContext = mutable.Map[String, String]()
    -          plainContext ++= requiredContext
    -          // we remove the versions for now until we have the first backwards compatibility case
    -          // with the version we can provide mappings in case the format changes
    -          plainContext.remove(CONNECTOR_PROPERTY_VERSION)
    -          plainContext.remove(FORMAT_PROPERTY_VERSION)
    -          plainContext.remove(METADATA_PROPERTY_VERSION)
    -          plainContext.remove(STATISTICS_PROPERTY_VERSION)
    -
    -          if (plainContext.forall(e => properties.contains(e._1) && properties(e._1) == e._2)) {
    -            matchingFactory match {
    -              case Some(_) => throw new AmbiguousTableFactoryException(properties)
    -              case None => matchingFactory =
    -                Some((factory.asInstanceOf[TableFactory], requiredContext.keys.toSeq))
    -            }
    -          }
    -        }
    +
    +      while (iterator.hasNext) {
    +        val factory = iterator.next()
    +        foundFactories += factory
           }
    +
    +      foundFactories
         } catch {
           case e: ServiceConfigurationError =>
             LOG.error("Could not load service provider for table factories.", e)
             throw new TableException("Could not load service provider for table factories.", e)
         }
    +  }
    +
    +  /**
    +    * Filters for factories with matching context.
    +    *
    +    * @return all matching factories
    +    */
    +  private def filterByContext[T](
    +      factoryClass: Class[T],
    +      properties: Map[String, String],
    +      foundFactories: Seq[TableFactory],
    +      classFactories: Seq[TableFactory])
    +    : Seq[TableFactory] = {
    +
    +    val matchingFactories = mutable.ArrayBuffer[TableFactory]()
    +
    +    classFactories.foreach { factory =>
    +      val requestedContext = normalizeContext(factory)
    +
    +      val plainContext = mutable.Map[String, String]()
    +      plainContext ++= requestedContext
    +      // we remove the version for now until we have the first backwards compatibility case
    +      // with the version we can provide mappings in case the format changes
    --- End diff --

    I opened FLINK-9851 for that.


> Add unified format interfaces and format discovery
> --------------------------------------------------
>
>                 Key: FLINK-8558
>                 URL: https://issues.apache.org/jira/browse/FLINK-8558
>             Project: Flink
>          Issue Type: New Feature
>          Components: Streaming Connectors
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>            Priority: Major
>              Labels: pull-request-available
>
> In the last release, we introduced a new module {{flink-formats}}. Currently only {{flink-avro}} is located there but we will add more formats such as {{flink-json}}, {{flink-protobuf}}, and so on. For better separation of concerns we want to decouple connectors from formats: e.g., remove {{KafkaAvroTableSource}} and {{KafkaJsonTableSource}}.
> A newly introduced {{FormatFactory}} will use Java service loaders to discover available formats in the classpath (similar to how file systems are discovered now). A {{Format}} will provide a method for converting {{byte[]}} to the target record type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
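
For readers following the diff: the context-matching step it introduces can be illustrated with a small standalone sketch. This is not the code from the pull request. SimpleFactory, ContextMatching, findUnique, and the version key strings are hypothetical names chosen for illustration, and the Java service loader discovery is left out so the example runs without any META-INF/services entries. It only shows the idea visible in the diff: normalize the required context, drop the version keys, and keep the factories whose remaining context is fully contained in the given properties.

```scala
// Hypothetical stand-in for a table factory; only the required context is modeled.
trait SimpleFactory {
  def requiredContext: Map[String, String]
}

object ContextMatching {

  // Version keys ignored during matching (illustrative values, assumed here).
  private val versionKeys: Set[String] =
    Set("connector.property-version", "format.property-version")

  /** Keeps the factories whose normalized required context is contained in `properties`. */
  def filterByContext(
      properties: Map[String, String],
      factories: Seq[SimpleFactory]): Seq[SimpleFactory] =
    factories.filter { factory =>
      val plainContext = factory.requiredContext
        .map { case (key, value) => (key.toLowerCase, value) }
        .filter { case (key, _) => !versionKeys.contains(key) }
      plainContext.forall { case (key, value) => properties.get(key).contains(value) }
    }

  /** Returns the single matching factory; fails on no match or on an ambiguous match. */
  def findUnique(
      properties: Map[String, String],
      factories: Seq[SimpleFactory]): SimpleFactory =
    filterByContext(properties, factories) match {
      case Seq(single) => single
      case Seq() =>
        throw new NoSuchElementException(s"No factory matches $properties")
      case several =>
        throw new IllegalStateException(s"${several.size} factories match $properties")
    }
}
```

Under these assumptions, a factory that requires connector.type = kafka is selected for a property map containing that entry, whatever other keys the map holds. Note that a factory with an empty required context matches every property map in this sketch, which is one reason a further filtering step on supported keys, as in the diff, is useful.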