kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joe Stein <joe.st...@stealth.ly>
Subject Re: [DISCUSSION] adding the serializer api back to the new java producer
Date Tue, 25 Nov 2014 15:49:12 GMT
The serializer is an expected use of the producer/consumer now and think we
should continue that support in the new client. As far as breaking the API
it is why we released the 0.8.2-beta to help get through just these type of
blocking issues in a way that the community at large could be involved in
easier with a build/binaries to download and use from maven also.

+1 on the change now prior to the 0.8.2 release.

- Joe Stein


On Mon, Nov 24, 2014 at 11:43 PM, Sriram Subramanian <
srsubramanian@linkedin.com.invalid> wrote:

> Looked at the patch. +1 from me.
>
> On 11/24/14 8:29 PM, "Gwen Shapira" <gshapira@cloudera.com> wrote:
>
> >As one of the people who spent too much time building Avro repositories,
> >+1
> >on bringing serializer API back.
> >
> >I think it will make the new producer easier to work with.
> >
> >Gwen
> >
> >On Mon, Nov 24, 2014 at 6:13 PM, Jay Kreps <jay.kreps@gmail.com> wrote:
> >
> >> This is admittedly late in the release cycle to make a change. To add to
> >> Jun's description the motivation was that we felt it would be better to
> >> change that interface now rather than after the release if it needed to
> >> change.
> >>
> >> The motivation for wanting to make a change was the ability to really be
> >> able to develop support for Avro and other serialization formats. The
> >> current status is pretty scattered--there is a schema repository on an
> >>Avro
> >> JIRA and another fork of that on github, and a bunch of people we have
> >> talked to have done similar things for other serialization systems. It
> >> would be nice if these things could be packaged in such a way that it
> >>was
> >> possible to just change a few configs in the producer and get rich
> >>metadata
> >> support for messages.
> >>
> >> As we were thinking this through we realized that the new api we were
> >>about
> >> to introduce was kind of not very compatable with this since it was just
> >> byte[] oriented.
> >>
> >> You can always do this by adding some kind of wrapper api that wraps the
> >> producer. But this puts us back in the position of trying to document
> >>and
> >> support multiple interfaces.
> >>
> >> This also opens up the possibility of adding a MessageValidator or
> >> MessageInterceptor plug-in transparently so that you can do other custom
> >> validation on the messages you are sending which obviously requires
> >>access
> >> to the original object not the byte array.
> >>
> >> This api doesn't prevent using byte[] by configuring the
> >> ByteArraySerializer it works as it currently does.
> >>
> >> -Jay
> >>
> >> On Mon, Nov 24, 2014 at 5:58 PM, Jun Rao <junrao@gmail.com> wrote:
> >>
> >> > Hi, Everyone,
> >> >
> >> > I'd like to start a discussion on whether it makes sense to add the
> >> > serializer api back to the new java producer. Currently, the new java
> >> > producer takes a byte array for both the key and the value. While this
> >> api
> >> > is simple, it pushes the serialization logic into the application.
> >>This
> >> > makes it hard to reason about what type of data is being sent to Kafka
> >> and
> >> > also makes it hard to share an implementation of the serializer. For
> >> > example, to support Avro, the serialization logic could be quite
> >>involved
> >> > since it might need to register the Avro schema in some remote
> >>registry
> >> and
> >> > maintain a schema cache locally, etc. Without a serialization api,
> >>it's
> >> > impossible to share such an implementation so that people can easily
> >> reuse.
> >> > We sort of overlooked this implication during the initial discussion
> >>of
> >> the
> >> > producer api.
> >> >
> >> > So, I'd like to propose an api change to the new producer by adding
> >>back
> >> > the serializer api similar to what we had in the old producer.
> >>Specially,
> >> > the proposed api changes are the following.
> >> >
> >> > First, we change KafkaProducer to take generic types K and V for the
> >>key
> >> > and the value, respectively.
> >> >
> >> > public class KafkaProducer<K,V> implements Producer<K,V> {
> >> >
> >> >     public Future<RecordMetadata> send(ProducerRecord<K,V>
record,
> >> Callback
> >> > callback);
> >> >
> >> >     public Future<RecordMetadata> send(ProducerRecord<K,V>
record);
> >> > }
> >> >
> >> > Second, we add two new configs, one for the key serializer and another
> >> for
> >> > the value serializer. Both serializers will default to the byte array
> >> > implementation.
> >> >
> >> > public class ProducerConfig extends AbstractConfig {
> >> >
> >> >     .define(KEY_SERIALIZER_CLASS_CONFIG, Type.CLASS,
> >> > "org.apache.kafka.clients.producer.ByteArraySerializer",
> >>Importance.HIGH,
> >> > KEY_SERIALIZER_CLASS_DOC)
> >> >     .define(VALUE_SERIALIZER_CLASS_CONFIG, Type.CLASS,
> >> > "org.apache.kafka.clients.producer.ByteArraySerializer",
> >>Importance.HIGH,
> >> > VALUE_SERIALIZER_CLASS_DOC);
> >> > }
> >> >
> >> > Both serializers will implement the following interface.
> >> >
> >> > public interface Serializer<T> extends Configurable {
> >> >     public byte[] serialize(String topic, T data, boolean isKey);
> >> >
> >> >     public void close();
> >> > }
> >> >
> >> > This is more or less the same as what's in the old producer. The
> >>slight
> >> > differences are (1) the serializer now only requires a parameter-less
> >> > constructor; (2) the serializer has a configure() and a close() method
> >> for
> >> > initialization and cleanup, respectively; (3) the serialize() method
> >> > additionally takes the topic and an isKey indicator, both of which are
> >> > useful for things like schema registration.
> >> >
> >> > The detailed changes are included in KAFKA-1797. For completeness, I
> >>also
> >> > made the corresponding changes for the new java consumer api as well.
> >> >
> >> > Note that the proposed api changes are incompatible with what's in the
> >> > 0.8.2 branch. However, if those api changes are beneficial, it's
> >>probably
> >> > better to include them now in the 0.8.2 release, rather than later.
> >> >
> >> > I'd like to discuss mainly two things in this thread.
> >> > 1. Do people feel that the proposed api changes are reasonable?
> >> > 2. Are there any concerns of including the api changes in the 0.8.2
> >>final
> >> > release?
> >> >
> >> > Thanks,
> >> >
> >> > Jun
> >> >
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message