kafka-users mailing list archives

From Bryan Duggan <bryan.dug...@boxever.com>
Subject Re: Problems trying to make kafka 'rack-aware'
Date Fri, 21 Sep 2018 14:10:36 GMT
Hi Eno,

Many thanks for trying that; it was very helpful.

That basic check didn't work for me, but I have since discovered what my 
issue was. Although we run a version of kafka that supports rack-awareness, 
we have been deliberately setting 'inter.broker.protocol.version' to an 
older version (due to various issues with some of our consumers). When I 
update this parameter to a later version, I can see 'rack' being 
written to zookeeper.
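
A minimal sketch of that check in Python, assuming the broker registration JSON has already been read from zookeeper as a string (fetching it is left out). The function name `rack_of` is hypothetical and not part of any Kafka tooling; the first payload is the one quoted in this thread, and the second is an invented example of what a registration carrying a rack might look like.

```python
import json
from typing import Optional


def rack_of(registration: str) -> Optional[str]:
    """Return the 'rack' value from a broker registration JSON string, or None if absent."""
    data = json.loads(registration)
    return data.get("rack")


# Payload quoted in this thread: version 2, no 'rack' field.
without_rack = (
    '{"endpoints":["PLAINTEXT://x.x.x.x.abcd:9092"],"jmx_port":-1,'
    '"host":"x.x.x.x.abc","timestamp":"1537527988341","port":9092,"version":2}'
)

# Hypothetical payload, for illustration only: the same data plus a 'rack' field.
with_rack = '{"host":"x.x.x.x.abc","port":9092,"rack":"rack1"}'

print(rack_of(without_rack))  # None
print(rack_of(with_rack))     # rack1
```

A result of None for a broker that has 'broker.rack' set in its properties file is the symptom described in this thread.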

For now I need to turn my attention to resolving the issues with my 
consumers.

Thanks again for helping out.

Bryan



On 21/09/2018 14:52, Eno Thereska wrote:
> Hi Bryan,
>
> I did a simple check with starting a broker with no rack id and then
> restarting with a rack id and I can confirm I could get the rack id from
> zookeeper after the restart. This was on trunk. Does that basic check work
> for you (i.e., without reassigning partitions)?
>
> Thanks
> Eno
>
> On Fri, Sep 21, 2018 at 2:07 PM, Bryan Duggan <bryan.duggan@boxever.com>
> wrote:
>
>> I didn't get a response to this, but I've been investigating more and can
>> now frame the problem slightly differently (hopefully, more accurately).
>>
>> According to this document
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+
>> data+structures+in+Zookeeper
>>
>> Which defines broker data structures in zookeeper, the following is the
>> broker schema (from version 0.10 onwards - I am using version 0.11)
>>
>> { "fields":
>>      [ {"name": "version", "type": "int", "doc": "version id"},
>>        {"name": "host", "type": "string", "doc": "ip address or host name
>> of the broker"},
>>        {"name": "port", "type": "int", "doc": "port of the broker"},
>>        {"name": "jmx_port", "type": "int", "doc": "port for jmx"},
>>        {"name": "endpoints", "type": "array", "items": "string", "doc":
>> "endpoints supported by the broker"},
>>        {"name": "rack", "type": "string", "doc": "Rack of the broker.
>> Optional. This will be used in rack aware replication assignment for fault
>> tolerance."}
>>      ]
>> }
>>
>> when I check my broker data in zookeeper (which has a non-null broker.rack
>> setting in the properties file), I have the following;
>>
>> {"endpoints":["PLAINTEXT://x.x.x.x.abcd:9092"],"jmx_port":-1
>> ,"host":"x.x.x.x.abc","timestamp":"1537527988341","port":9092,"version":2}
>>
>> there is no 'rack'.
>>
>> In the server.log file on my kafka broker I see;
>> ----
>> [2018-09-21 13:00:40,227] INFO KafkaConfig values:
>>      advertised.host.name = null
>>      .
>>      .
>>      broker.id = 1234567
>>      broker.rack = rack1
>>      compression.type = producer
>>      .
>> -----
>>
>> so it looks fine from the broker side. However, when I restart kafka on
>> the host, it doesn't load any rack information into zookeeper.
>>
>> Can someone please confirm: if I have rack awareness enabled, should I
>> expect to see a value for 'rack' in zookeeper? If so, do I need to do
>> something else on the broker side to get it to include the rack in the
>> metadata it writes? (As far as I can see, it writes the metadata each
>> time kafka is restarted.)
>>
>> thanks
>> Bryan
>>
>> On 20/09/2018 11:31, Bryan Duggan wrote:
>>
>>> Hi,
>>>
>>> I have a kafka cluster consisting of 3 brokers across 3 different AWS
>>> availability zones.  It hosts several topics, each of which has a
>>> replication factor of 3. The cluster is currently not 'rack-aware'.
>>>
>>> I am trying to do the following;
>>>
>>>      - add 3 additional brokers (one in each of the 3 AZs)
>>>
>>>      - make the cluster 'rack-aware'. (ie: create 3 racks on a per-AZ
>>> basis, each containing 2 brokers)
>>>
>>>      - reassign the topics with the intention of having 1 replica in each
>>> of the 3 racks.
>>>
>>> To achieve this I've added 'broker.rack' to the properties file for each
>>> broker. The rack name is the same as the AZ name each broker is in. I've
>>> restarted kafka on all brokers (in case that's required for rack-awareness
>>> to take effect).
>>>
>>> Following restart I've attempted to reassign topics across all 6 brokers
>>> by running the following;
>>>
>>>      - ./kafka-reassign-partitions.sh --zookeeper $ZK
>>> --topics-to-move-json-file topics-to-move.json --broker-list '1,2,3,4,5,6'
>>>
>>> (where topics-to-move.json is a simple json file containing the topics to
>>> reassign)
>>>
>>> The problem I am having is that, after running 'kafka-reassign-partitions.sh'
>>> with 6 brokers listed in the broker-list, it doesn't honour
>>> rack-awareness, and instead assigns 2 replicas of a partition to brokers
>>> in a single rack, with the 3rd being assigned elsewhere.
>>>
>>> The version of kafka I am using is 2.11-1.1.1.
>>>
>>> Any documentation I've read suggests the above should have achieved what
>>> I want. However, it is not working as expected.
>>>
>>> Has anyone else made their kafka cluster 'rack-aware'? If so, did you
>>> experience any issues doing so?
>>>
>>> Or, can anyone tell me if there's some step I'm missing to make this work?
>>>
>>> TIA
>>>
>>> Bryan
>>>
>>>
>>>
>>>
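
The symptom reported in the original message (two replicas of one partition placed in the same rack) can be checked mechanically against a proposed assignment. The sketch below is a minimal illustration in Python, not part of Kafka's tooling: the function name and the broker-to-rack map are hypothetical, chosen to match the layout described in the thread (six brokers, two per availability zone, rack named after the AZ).

```python
from collections import Counter
from typing import Dict, List


def racks_with_multiple_replicas(
    replicas: List[int], broker_rack: Dict[int, str]
) -> Dict[str, int]:
    """Return the racks that hold more than one replica of a single partition."""
    counts = Counter(broker_rack[b] for b in replicas)
    return {rack: n for rack, n in counts.items() if n > 1}


# Hypothetical layout: broker id -> rack (AZ) name.
broker_rack = {1: "az-a", 2: "az-b", 3: "az-c", 4: "az-a", 5: "az-b", 6: "az-c"}

# Two replicas in az-a: the placement the thread complains about.
print(racks_with_multiple_replicas([1, 4, 2], broker_rack))  # {'az-a': 2}

# One replica per rack: the intended outcome.
print(racks_with_multiple_replicas([1, 2, 3], broker_rack))  # {}
```

Running this over each partition's replica list in a proposed reassignment would flag any partition that does not have one replica per rack.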

