pulsar-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Apache Pulsar Slack" <apache.pulsar.sl...@gmail.com>
Subject Slack digest for #general - 2019-01-19
Date Sat, 19 Jan 2019 09:11:02 GMT
2019-01-18 09:19:00 UTC - David Tinker: I have 2 services publishing the same messages to a
persistent topic at more or less the same time. Another service is consuming the messages
and it is not seeing duplicates as I would expect. As far as I can tell I haven't enabled
de-duplication. Any idea whats happening? Pulsar server is 2.2.0. Producer client is 2.2.0,
consumer client is 2.3.0-SNAPSHOT.
----
2019-01-18 09:32:40 UTC - jia zhai: @David Tinker You are using 2 producer right?
In messageMetadata, it could contain `producerName`, If setting it in your message produce
code, `MessageMetadata msgMetadata = metadataBuilder.setProducerName(“test”)....
`,
then when you consume it, you could know the message is coming from which producer.
----
2019-01-18 09:42:06 UTC - Gofu: Hello guys, I am using pulsar c++ client, somebody knows if
when we are consuming a  message there is a way to tell which partition it belongs to? I am
using a  partitionated topic, and I am only managing to get the topic, not the specific partition.
----
2019-01-18 09:48:52 UTC - jia zhai: @Gofu In C++, you could also set/get PartitonKey for a
message
----
2019-01-18 09:49:57 UTC - Gofu: @jia zhai I know, that's when I am publishing, in order to
route to a given partition, the thing is how to know which partition that the message belongs
(when consuming)
----
2019-01-18 09:50:40 UTC - Gofu: actually, I am using the golang client, not the c++ directly,
but the golang client uses the c++ client
----
2019-01-18 09:50:47 UTC - jia zhai: I see.
----
2019-01-18 09:51:11 UTC - Gofu: that would be helpful in order to launch a thread for each
partition
----
2019-01-18 09:58:15 UTC - jia zhai: @Gofu what did you get from message.getTopicName?
----
2019-01-18 09:59:22 UTC - jia zhai: It should contains the partition-id part
----
2019-01-18 10:22:05 UTC - Gofu: @jia zhai so  in the go client there is a Consumer interface
in the message that has a  method Topic(), that  method doesnt return the partition-id part,
i am assuming that method is the same than message.getTopicName, or am I wront?
----
2019-01-18 10:31:02 UTC - jia zhai: @Gofu Are you using PartitionedConsumer.Topic()?
----
2019-01-18 10:34:37 UTC - jia zhai: I remember, PartitionedConsumer.Topic() will not contains
partition part, but each message is assigned with topicName get from sub-consumer that contained
in PartitionedConsumer. It should contains that part.
----
2019-01-18 10:41:59 UTC - Gofu: @jia zhai I am using this
----
2019-01-18 10:43:24 UTC - Gofu: I can't find PartitionedConsumer
----
2019-01-18 10:58:41 UTC - Bogdan BUNECI: Short version: This buggy behavior is fixed in snapshot
by PR #2969
----
2019-01-18 10:59:27 UTC - Bogdan BUNECI: i will check also presto
----
2019-01-18 11:05:15 UTC - jia zhai: @Gofu <https://github.com/apache/pulsar/pull/3346>
----
2019-01-18 11:05:31 UTC - jia zhai: Here is the PR handling getTopicName for Go client
----
2019-01-18 11:24:03 UTC - Gofu: @jia zhai Yeah, that's the method I am using
----
2019-01-18 11:24:38 UTC - Gofu: but it returns the topic and not the partition.
----
2019-01-18 11:25:21 UTC - Gofu: nevermind, I am using the one from consumer
----
2019-01-18 11:26:03 UTC - Gofu: will try to update and see what happens
----
2019-01-18 11:26:04 UTC - Gofu: thanks
----
2019-01-18 11:46:45 UTC - David Tinker: Tx. I added logging for the producer name and I am
receiving messages more or less evenly from each producer with no dups. The /admin/v2/../stats
endpoint has "deduplicationStatus" : "Disabled".
----
2019-01-18 11:49:15 UTC - Gofu: @jia zhai probably it could solve my problem, but its missing
the release, can you tell me when you intend to do it?
----
2019-01-18 11:52:25 UTC - jia zhai: The 2.3.0 release will be in 1-2 weeks.
----
2019-01-18 11:52:53 UTC - jia zhai: @Gofu It depends on a bookkeeper release, and it will
be soon
----
2019-01-18 11:53:20 UTC - Gofu: @jia zhai thank you for the feedback
----
2019-01-18 11:54:03 UTC - jia zhai: welcome
----
2019-01-18 11:56:13 UTC - Gofu: @jia zhai There is another problem I have detected, perhaps
you can help me with. In this moment there is no method to send a batch to pulsar (only sendAsync
that holds in memory a bunch of messages and then on linger or max messages config it will
send the batch, but the thing is that some messages could fail, and then the order will be
lost), so I would need to send an atomic batch instead of having this buffer thing. This way
it will either send all messages or none. Do you have plans to do that?
----
2019-01-18 12:08:47 UTC - David Tinker: You could group your messages into one "batch" message?
----
2019-01-18 12:13:28 UTC - Sijie Guo: @Gofu currently you can do this atomic-ish send with
a batch of messages (total size is not larger than 5MB).

- you can set batchingMaxPublishDelay to disable periodical flush. so that you can control
the batch.
- then you can do following:
```
sendAsync(msg1);
sendAsync(msg2);
sendAsync(msg3);

flushAsync().whenComplete(...);
```
----
2019-01-18 12:13:32 UTC - Gofu: @David Tinker We had that idea but the consumer it isnt expective
a batch, it is expecting the individual message
----
2019-01-18 12:15:00 UTC - Sijie Guo: this should work for java client. I believe cpp/go has
similar settings, but not sure if we have added `flush` method or not. If there is no flush
method, we can add one
----
2019-01-18 12:18:04 UTC - Gofu: @Sijie Guo will look into and let you know, thanks
----
2019-01-18 14:19:49 UTC - Gofu: @Sijie Guo The go client doesnt have the flush method
----
2019-01-18 15:18:24 UTC - jia zhai: @Gofu If it is needed, Would you please help open an issue
for this? we could handle it later.
----
2019-01-18 15:20:24 UTC - Gofu: sure
----
2019-01-18 15:33:31 UTC - Gofu: @jia zhai created the issue <https://github.com/apache/pulsar/issues/3388>
----
2019-01-18 23:12:50 UTC - Emma Pollum: I'm having trouble getting one of my clusters to connect
to its replication cluster.
I have two clusters, C1 and C2. C2 is connecting to C1 fine, but C1 shows only an inbound
connection and under pulsar-admin topic stats it says "connected":"false"

Where would logs for this connection attempt be, so I can diagnose why it can't connect?
----
2019-01-19 00:30:22 UTC - Emma Pollum: I'm seeing at least one topic in this namespace that
is connected to its partner cluster, but many other topics are not connected.
----
2019-01-19 02:10:46 UTC - Grant Wu: ```
02:10:10.261 [Timer-1] INFO  org.apache.pulsar.functions.runtime.ProcessRuntime - Started
process successfully
Traceback (most recent call last):
  File "/pulsar/instances/python-instance/python_instance_main.py", line 33, in &amp;lt;module&amp;gt;
    from pulsar import Authentication
  File "/pulsar/instances/python-instance/pulsar/__init__.py", line 99, in &amp;lt;module&amp;gt;
    import _pulsar
ImportError: No module named _pulsar
```
----
2019-01-19 02:11:04 UTC - Grant Wu: I’m getting this in Pulsar, on a broker deployed with
an image based off of the 2.1.1 image :thinking_face:
----
2019-01-19 02:12:00 UTC - Grant Wu: Just checked `pip freeze` - I don’t see `pulsar-client`
:thinking_face:
----
2019-01-19 02:20:04 UTC - Grant Wu: cc @Jonathan Budd
----
2019-01-19 02:24:34 UTC - Matteo Merli: @Grant Wu is that `pulsar` or `pulsar-all` image?
----
2019-01-19 02:24:41 UTC - Grant Wu: Not completely sure.
----
2019-01-19 02:26:55 UTC - Grant Wu: ```
docker run -it apachepulsar/pulsar-all:2.1.1-incubating python -c "import pulsar_client"
```
----
2019-01-19 02:26:58 UTC - Grant Wu: Gives me no module found
----
2019-01-19 02:27:12 UTC - Matteo Merli: Import pulsar
----
2019-01-19 02:27:17 UTC - Grant Wu: oh, sorry
----
2019-01-19 02:27:34 UTC - Grant Wu: ```
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
ImportError: No module named pulsar
```
----
2019-01-19 02:27:36 UTC - Grant Wu: same thing
----
2019-01-19 02:27:41 UTC - Matteo Merli: :)
----
2019-01-19 02:27:47 UTC - Matteo Merli: Ok let me check
----
2019-01-19 02:28:05 UTC - Grant Wu: Currently waiting for
```
docker run -it apachepulsar/pulsar-all:2.2.1 python -c "import pulsar"
``` to complete
----
2019-01-19 02:28:52 UTC - Grant Wu: docker images too big ;-;
----
2019-01-19 02:31:12 UTC - Matteo Merli: yes yes, part of it is that it would need to be squashed
----
2019-01-19 02:31:48 UTC - Matteo Merli: the tgz that gets unpacked in the image is actually
stored twice by docker build (untar and mv of the directory)
----
2019-01-19 02:32:06 UTC - Grant Wu: oh dear :sweat_smile:
----
2019-01-19 02:32:08 UTC - Matteo Merli: Anyway, the client should be there, per <https://github.com/apache/pulsar/blob/v2.2.1/docker/pulsar/Dockerfile#L38>
----
2019-01-19 02:32:58 UTC - Matteo Merli: So yes, it wasn’t there in 2.1 but it’s there
in 2.2 images
----
2019-01-19 02:33:04 UTC - Grant Wu: okay.
----
2019-01-19 02:33:24 UTC - Grant Wu: I’ll figure out something; we really ought to be on
2.2 anyways
----
Mime
View raw message