kafka-users mailing list archives

From Thunder Stumpges <tstump...@ntent.com>
Subject RE: Stream naming conventions?
Date Tue, 03 Mar 2015 22:43:23 GMT
Sure, these are contrived, but you'll get the idea :)

Note: the suffixes are generally an enumeration or a combination of two enumerations, so the
"domain" of values should always be bounded (so that the number of topics is also bounded).
The idea is that any time we want to use the same Avro schema but don't want the messages to be
in the same Kafka topic, we use the suffix to separate them.

As a phase of processing:
   org.ntent.addelivery.pageview-incoming
   org.ntent.addelivery.pageview-filtered
   org.ntent.addelivery.pageview-duplicate
   org.ntent.addelivery.pageview-clean

To separate "instances" of a particular kind of activity:
   org.ntent.addelivery.feedrequest-feed1
   org.ntent.addelivery.feedrequest-feed2
   org.ntent.addelivery.feedrequest-feed3

To denote the type of "statistic":
   org.ntent.addelivery.filterstats-knownoffender
   org.ntent.addelivery.filterstats-bot
   org.ntent.addelivery.filterstats-clickrate
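
For anyone who wants to see the round trip spelled out, here is a minimal sketch in Python (not our actual .NET code). The names Stage, topic_name, split_topic and all_topics are made up for illustration, and it assumes the dash never appears in a suffix, as described in the quoted message below:

```python
from enum import Enum
from typing import List, Tuple, Type


class Stage(Enum):
    """Hypothetical bounded set of processing stages used as topic suffixes."""
    INCOMING = "incoming"
    FILTERED = "filtered"
    DUPLICATE = "duplicate"
    CLEAN = "clean"


DELIMITER = "-"  # assumption: never appears inside a suffix


def topic_name(avro_class: str, suffix: Enum) -> str:
    """Join the Avro class name and a suffix into a full topic name."""
    if DELIMITER in suffix.value:
        raise ValueError(f"suffix must not contain {DELIMITER!r}: {suffix.value}")
    return f"{avro_class}{DELIMITER}{suffix.value}"


def split_topic(topic: str) -> Tuple[str, str]:
    """Split on the last delimiter to recover (avro_class, suffix)."""
    avro_class, sep, suffix = topic.rpartition(DELIMITER)
    if not sep:
        # No delimiter found: treat the topic as a bare class name, no suffix.
        return topic, ""
    return avro_class, suffix


def all_topics(avro_class: str, suffixes: Type[Enum]) -> List[str]:
    """Every topic for one schema; bounded because the enum is bounded."""
    return [topic_name(avro_class, s) for s in suffixes]


if __name__ == "__main__":
    for name in all_topics("org.ntent.addelivery.pageview", Stage):
        print(name, "->", split_topic(name))
```

Because the suffix comes from an enum, the set of topics per schema is fixed up front, and splitting on the last dash always gets you back to the Avro class name.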

Hope this helps :)


-----Original Message-----
From: Julio Castillo [mailto:jcastillo@FinancialEngines.com] 
Sent: Tuesday, March 03, 2015 10:56 AM
To: users@kafka.apache.org
Cc: tgautier@yahoo.com; kafka-users@incubator.apache.org
Subject: Re: Stream naming conventions?

Can you provide some examples of your naming patterns described below?

Thanks

** julio

On 3/3/15, 6:56 AM, "Thunder Stumpges" <tstumpges@ntent.com> wrote:

>I'm not sure who you were asking the question to, but since Gwen's was 
>not bound to any restrictions, just a guideline, I'll assume you meant 
>me :)
>
>We have a concept of a "topic suffix property": some property in 
>the data that can change dynamically. The full topic name then becomes 
>"<avro_class>-<topic_suffix>". We agreed never to use a dash in a 
>topic suffix, so we can split on the last dash to get back to the 
>class name. You could pick any delimiter not used in class names or suffixes.
>
>The topic suffix is then where we put things like processing stage 
>(incoming, cleaned, duplicate, etc) as well as any other orthogonal 
>delineation that needs to be in a different topic.
>
>We use .NET, so I'm not sure of the Java terminology, but we have 
>property attributes to declare a property as the "topic suffix 
>property" (and also the "message key property"), and we use "property 
>getters" in a partial class to compute these dynamically if necessary.
>
>A "message registry" then uses reflection to get the topic name and 
>message key for any message going out our producer. It also deals with 
>stripping the topic suffix for consumers looking for the avro type 
>given a topic name.
>
>So far this has worked great for us.
>Cheers,
>Thunder
>
>
>
>-----Original Message-----
>From: Maciej Jaśkowski [maciej.jaskowski@gmail.com]
>Received: Tuesday, 03 Mar 2015, 2:34AM
>To: users@kafka.apache.org [users@kafka.apache.org]
>CC: Taylor Gautier [tgautier@yahoo.com]; 
>kafka-users@incubator.apache.org [kafka-users@incubator.apache.org]
>Subject: Re: Stream naming conventions?
>
>This approach sounds nice at first but it would fail if you start 
>sending the same message but partitioned in different (orthogonal) 
>ways. How would you go about that?
>
>Maciej
>
>On 25 February 2015 at 05:17, Gwen Shapira <gshapira@cloudera.com> wrote:
>> Nice :) I like the idea of tying topic name to avro schemas.
>>
>> I have experience with other people's data, and until now I mostly
>> recommended:
>> <app type>.<app name>.<data set name>.<stage of processing>
>>
>> So we end up with things like:
>> etl.onlineshop.searches.validated
>>
>> Or if I have my own test dataset that I don't want to share:
>> users.gshapira.newapp.testing1
>>
>> Makes it relatively easy to share datasets across the organization,
>> and also makes white-listing and black-listing relatively simple
>> because of the hierarchy (until we add a real topic hierarchy to
>> kafka...).
>>
>> Gwen
>>
>> On Tue, Feb 24, 2015 at 1:13 PM, Thunder Stumpges 
>> <tstumpges@ntent.com>
>> wrote:
>>
>>> We have a global namespace hierarchy for topics that is exactly our
>>> Avro namespace with Class Name. The template is basically:
>>>
>>> <root_ns>.Core.<core_data_types_shared_across_company>
>>> <root_ns>.<product>.<product_specific_hierarchy>
>>>
>>> The up side of this for us is that since the topics are named based
>>> on the Avro schema namespace and type, we can look up the avro
>>> schema in the Avro Schema Repository using the topic name, and the
>>> schema ID coded into the message. Each product then also has the
>>> flexibility of defining whatever topics they find useful.
>>>
>>> Hope this helps,
>>> Thunder
>>>
>>> -----Original Message-----
>>> From: Taylor Gautier [mailto:tgautier@yahoo.com.INVALID]
>>> Sent: Tuesday, February 24, 2015 12:11 PM
>>> To: kafka-users@incubator.apache.org
>>> Subject: Stream naming conventions?
>>>
>>> Hello all,
>>> Just wondering if those with a good amount of experience using Kafka 
>>> in production with many streams have converged on any sort of naming 
>>> convention.  If so would you be willing to share?
>>> Thanks in advance,
>>> Taylor
>>>
>
>
>
>--
>
>Twitter: @mjaskowski
