pulsar-users mailing list archives

From "Apache Pulsar Slack" <apache.pulsar.sl...@gmail.com>
Subject Slack digest for #general - 2019-01-23
Date Wed, 23 Jan 2019 09:11:04 GMT
2019-01-22 13:16:27 UTC - Paul van der Linden: I'm experimenting with how pulsar handles load,
but I have some surprising results so far:
- high round trip time (sending a message and receiving it back): 18ms (compared to 8ms for
some other systems), this is for the baseline test: 10 msgs/s in, 30 msgs/s out, 7kb message
size (our average msg size currently)
- already having trouble with a message throughput of 1500 out (with 500 in): throughput
unstable, round trip: 50-150ms
----
2019-01-22 13:17:06 UTC - Paul van der Linden: are there things I can tweak (like not using
sync flush)?
----
2019-01-22 13:17:29 UTC - Pratik Narode: @Pratik Narode has joined the channel
----
2019-01-22 15:30:22 UTC - Brian: any recommendations on setting MaxDirectMemorySize?
----
2019-01-22 15:30:44 UTC - Brian: the 4g default is causing us OOMs, and while I can bump it
up, that seems like a temporary fix
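For reference, the broker's direct memory limit is typically set through `PULSAR_MEM` in `conf/pulsar_env.sh`. A sketch, with illustrative sizes rather than a recommendation:

```shell
# conf/pulsar_env.sh
# Keep heap (-Xmx) plus direct memory within the container/host limit;
# these numbers are examples only, tune them to your workload.
PULSAR_MEM="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=8g"
```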
----
2019-01-22 15:55:27 UTC - Brian: also why doesn't the proxy say "Hey, I have a broker that's
OOM, lemme try another broker"
----
2019-01-22 15:57:49 UTC - Ali Ahmed: @naga you can use pulsar io to ingest the csv file
----
2019-01-22 15:58:16 UTC - Ali Ahmed: they can either use pulsar sql or a pulsar function to
run the downstream job
----
2019-01-22 16:21:20 UTC - Romain Castagnet: Hi, I have a weird bug.
I use a namespace isolation policy and region-aware placement.
While a consumer (pulsar-client consumer) and a producer (pulsar-perf produce) are running,
I stop a complete region. Consumer and producer keep working.
When I restart the previously stopped region, consumer and producer still work, but I can't
do "namespaces unload mytopic". An error 500 appears, and in journald I see an error 401.
I don't understand why...
If I stop the producer and consumer and restart the broker and bookkeeper, unload works again.
Do you have any idea?
----
2019-01-22 17:14:48 UTC - Paul van der Linden: Hi, is there a way to speed up bookkeeper or
pulsar in general? I'm seeing pretty bad latency compared to other brokers
----
2019-01-22 17:15:45 UTC - Paul van der Linden: Something like kafka maybe, where kafka doesn't
flush synchronously?
----
2019-01-22 17:16:13 UTC - Paul van der Linden: I'm already seeing slow performance on the
baseline test with 10 msgs/s in, 30 msgs/s out, 7kb message size
----
2019-01-22 17:24:48 UTC - Matteo Merli: @Paul van der Linden you can set `journalSyncData=false`
in `bookkeeper.conf` (<https://github.com/apache/pulsar/blob/master/conf/bookkeeper.conf#L304>)
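For context, that setting trades durability for latency: entries are acked before they are fsynced to the journal, so recently acked data can be lost on machine or power failure. A minimal fragment:

```shell
# conf/bookkeeper.conf
# Ack writes without waiting for fsync to the journal disk.
# Lower latency, but acked entries can be lost if the bookie host crashes.
journalSyncData=false
```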
----
2019-01-22 17:25:44 UTC - Paul van der Linden: Thanks, exactly what I couldn't find so far
----
2019-01-22 17:26:08 UTC - Matteo Merli: There are several other tunables that can be used
to adjust the performances for different conditions. Can you expand a bit what you’re trying
to achieve?
----
2019-01-22 17:26:55 UTC - Paul van der Linden: Sure
----
2019-01-22 17:27:29 UTC - Paul van der Linden: I'm basically comparing some messaging systems
to see what we can replace Kafka with
----
2019-01-22 17:28:05 UTC - Paul van der Linden: One of the tests is a performance test, to
see how the brokers handle high load
----
2019-01-22 17:29:22 UTC - Paul van der Linden: We are replacing kafka, mainly because the
python clients are quite bad and the majority of our code is python
----
2019-01-22 17:30:41 UTC - Matteo Merli: Sure, can you describe the deployment setup you’re
using, any config you changed from defaults and how you’re sending traffic and measuring
latency?
----
2019-01-22 17:31:44 UTC - Paul van der Linden: I'm deploying it to kubernetes
----
2019-01-22 17:31:52 UTC - Paul van der Linden: GKE to be exact
----
2019-01-22 17:32:34 UTC - Matteo Merli: Ok, the ~20 millis avg latency is typical on GKE
----
2019-01-22 17:32:44 UTC - Matteo Merli: (when fsyncing data)
----
2019-01-22 17:32:50 UTC - Paul van der Linden: compared to the examples directory in github
for gke:
- 3 bookies instead of 2
- running everything on 3 nodes with 15GB ram, 4 CPUs
- non-local ssd storage
----
2019-01-22 17:33:13 UTC - Paul van der Linden: ok, it shoots up quite quickly with some tests
though
----
2019-01-22 17:33:41 UTC - Paul van der Linden: with 1500 msgs/s it's struggling to even manage
the 1500 msgs/s
----
2019-01-22 17:34:12 UTC - Matteo Merli: Is it publishing synchronously?
----
2019-01-22 17:34:27 UTC - Paul van der Linden: I'm measuring latency by sending a message
with a python client, then pinging back with a "PID" queue basically
----
2019-01-22 17:34:52 UTC - Paul van der Linden: is that a setting in the client?
----
2019-01-22 17:35:20 UTC - Matteo Merli: there are 2 methods on the `Producer` class, `send()`
and `send_async()` <http://pulsar.apache.org/api/python/#pulsar.Producer.send_async>
----
2019-01-22 17:35:36 UTC - Paul van der Linden: ah I missed that one
----
2019-01-22 17:36:11 UTC - Matteo Merli: if you want to get any decent throughput you need
to use the async variant, otherwise the throughput is limited by the latency (since there
will be only 1 message in flight)
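A minimal sketch of pipelined publishing with `send_async()`, assuming a broker reachable at `pulsar://localhost:6650` (the topic name is made up):

```python
import pulsar

client = pulsar.Client('pulsar://localhost:6650')
producer = client.create_producer('my-topic')

def on_sent(result, msg):
    # invoked from the client's I/O thread once the broker acks the message
    if result != pulsar.Result.Ok:
        print('send failed:', result)

# many messages in flight at once, instead of one round trip per send()
for i in range(1000):
    producer.send_async(('msg-%d' % i).encode('utf-8'), on_sent)

producer.flush()   # wait for all pending sends before shutting down
client.close()
```

With only synchronous `send()`, throughput is capped at roughly 1/latency messages per second per producer, which matches the numbers reported above.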
----
2019-01-22 17:37:45 UTC - Matteo Merli: I’d also suggest to enable batching and to block
when producer queue is full (for easier backpressure handling):

```
producer = client.create_producer(
                'my-topic',
                block_if_queue_full=True,
                batching_enabled=True,
                batching_max_publish_delay_ms=10
            )
```
----
2019-01-22 17:38:11 UTC - Paul van der Linden: thanks, I will test that tomorrow (just finished
all of the tests)
----
2019-01-22 17:38:22 UTC - Matteo Merli: :+1:
----
2019-01-22 17:39:27 UTC - Paul van der Linden: thanks for the help, it's good to do a fair
test :slightly_smiling_face: The other systems I knew slightly better already, so it was
easier to troubleshoot these kinds of throughput issues
----
2019-01-22 19:11:04 UTC - Brian: any reason why the pulsar-admin commands result in ```Server
redirected too many times```
----
2019-01-22 20:14:47 UTC - Kendall Magesh-Davis: Hey guys, I’ve got pulsar deployed using
your helm chart. I don’t see a `pulsar-admin` container like there is with the generic k8s
deployment. Are the binaries hidden in one of these containers?
```master ~/Code/pulsar/deployment/kubernetes/generic> kubectl get pods -n pulsar --show-labels
NAME                                       READY   STATUS      RESTARTS   AGE   LABELS
foo-pulsar-autorecovery-576c97dcf4-zrdgq   1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=autorecovery,pod-template-hash=1327538790,release=foo
foo-pulsar-bastion-9658ffbf4-bnvg8         1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bastion,pod-template-hash=521499690,release=foo
foo-pulsar-bookkeeper-0                    1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bookkeeper,controller-revision-hash=foo-pulsar-bookkeeper-7b64dd9c47,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-bookkeeper-0
foo-pulsar-bookkeeper-1                    1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bookkeeper,controller-revision-hash=foo-pulsar-bookkeeper-7b64dd9c47,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-bookkeeper-1
foo-pulsar-bookkeeper-2                    1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bookkeeper,controller-revision-hash=foo-pulsar-bookkeeper-7b64dd9c47,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-bookkeeper-2
foo-pulsar-broker-74c6f7dcb-hq7x2          1/1     Running     3          1d    app=pulsar,cluster=foo-pulsar,component=broker,pod-template-hash=307293876,release=foo
foo-pulsar-broker-74c6f7dcb-kl8bb          1/1     Running     3          1d    app=pulsar,cluster=foo-pulsar,component=broker,pod-template-hash=307293876,release=foo
foo-pulsar-dashboard-5c8b94757f-4lb5l      1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=dashboard,pod-template-hash=1746503139,release=foo
foo-pulsar-grafana-77b945cdbd-g578f        1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=grafana,pod-template-hash=3365017868,release=foo
foo-pulsar-prometheus-767976df87-5w84n     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=prometheus,pod-template-hash=3235328943,release=foo
foo-pulsar-proxy-68f7576dcd-pkbz9          1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=proxy,pod-template-hash=2493132878,release=foo
foo-pulsar-zookeeper-0                     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=zookeeper,controller-revision-hash=foo-pulsar-zookeeper-5cf6ffdb4,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-zookeeper-0
foo-pulsar-zookeeper-1                     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=zookeeper,controller-revision-hash=foo-pulsar-zookeeper-5cf6ffdb4,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-zookeeper-1
foo-pulsar-zookeeper-2                     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=zookeeper,controller-revision-hash=foo-pulsar-zookeeper-5cf6ffdb4,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-zookeeper-2
foo-pulsar-zookeeper-metadata-zl4ns        0/1     Completed   0          1d    controller-uid=80248831-1db3-11e9-aaa6-02b3804acdb4,job-name=foo-pulsar-zookeeper-metadata```
----
2019-01-22 20:15:39 UTC - Sijie Guo: foo-pulsar-bastion-9658ffbf4-bnvg8 - ‘bastion’ is
the ‘pulsar-admin’ container
----
2019-01-22 20:16:02 UTC - Kendall Magesh-Davis: nice. thanks @Sijie Guo :slightly_smiling_face:
I’ll poke at that one
----
2019-01-22 23:15:16 UTC - Grant Wu: @Matteo Merli @Jerry Peng Could I get a response to <https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1547770573542400>
?
----
2019-01-22 23:16:03 UTC - Grant Wu: I’m about to wrap my entire `process` method in a catch-all
try/except to avoid my Pulsar function going down - would be great to see if there’s a better
alternative that leverages Pulsar’s existing capabilities
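A sketch of that catch-all shape, in plain Python with no SDK imports; `handle` and its payload logic are made-up placeholders, and `context.get_logger()` is the real runtime's logger accessor:

```python
class SafeFunction:
    """A Pulsar Function whose process() never lets an exception escape.

    Trade-off to note: with AT_LEAST_ONCE guarantees, an uncaught
    exception triggers redelivery; swallowing it here means the message
    is effectively acked even when handling failed, so log enough
    context to recover the failure later.
    """

    def process(self, input, context):
        try:
            return self.handle(input)
        except Exception as e:
            # context.get_logger() exists in the real runtime; context may
            # be None when exercising this sketch standalone.
            if context is not None:
                context.get_logger().error("failed on %r: %s", input, e)
            return None  # no output; the message will not be retried

    def handle(self, input):
        # illustrative handling: fails on non-numeric payloads
        return str(int(input) * 2)
```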
----
2019-01-22 23:16:18 UTC - Grant Wu: Because I know `pulsar-admin functions getstatus` is a
thing…
----
2019-01-22 23:17:27 UTC - Grant Wu: Like - if I throw an exception when processing a message,
do I reprocess the same message after my function restarts, or does it drop that message
----
2019-01-22 23:21:14 UTC - Grant Wu: Sorry, corrected a typo
----
2019-01-22 23:21:37 UTC - Jerry Peng: @Grant Wu that depends on the processing guarantee set
for the function.  If the function’s processing guarantee is set to:
1. AT_MOST_ONCE - If there is an uncaught exception in the function code, the current message
will be dropped and not submitted for reprocessing
2. AT_LEAST_ONCE - If there is an uncaught exception in the function code, the current message
will not be dropped and will be submitted for reprocessing
----
2019-01-22 23:22:08 UTC - Grant Wu: Ah, okay.  So basically “`process` function ran without
throwing exception” == success?
----
2019-01-22 23:22:33 UTC - Jerry Peng: 3. EXACTLY_ONCE - If there is an uncaught exception
in the function code, the function will fail and restart to maintain ordering
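For reference, the guarantee is chosen when the function is deployed; in the `pulsar-admin` CLI the enum values are spelled `ATMOST_ONCE`, `ATLEAST_ONCE`, and `EFFECTIVELY_ONCE` (check `pulsar-admin functions create --help` for your version). The file, class, and topic names below are illustrative:

```shell
bin/pulsar-admin functions create \
  --py my_function.py \
  --classname my_function.MyFunction \
  --inputs persistent://public/default/in \
  --output persistent://public/default/out \
  --processing-guarantees EFFECTIVELY_ONCE
```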
----
2019-01-22 23:22:42 UTC - Jerry Peng: @Grant Wu correct
----
2019-01-22 23:25:13 UTC - Grant Wu: Thanks for clarifying!
----
2019-01-22 23:25:43 UTC - Grant Wu: Not 100% sure I understand the difference between #2 and
#3 though
----
2019-01-22 23:26:50 UTC - Grant Wu: In what instances would we get processing more than once
with #2?
----
2019-01-22 23:27:02 UTC - Grant Wu: Or are the only differences in processing order
----
2019-01-22 23:37:39 UTC - Grant Wu: @Jerry Peng
----
2019-01-23 01:29:43 UTC - Jerry Peng: @Grant Wu when there is a failure, the function may
execute process on the same message more than once
----
2019-01-23 01:31:44 UTC - Jerry Peng: That will happen for both #2 and #3, but #3 will also
ensure ordering and use idempotent producing to make sure that outputs derived from a distinct
message are only written once to the output topic
----
2019-01-23 01:33:23 UTC - Grant Wu: Ah okay
----
2019-01-23 01:33:35 UTC - Grant Wu: But what if you're using context publishing?
----
2019-01-23 01:33:48 UTC - Grant Wu: And don't publish to a normal output topic at all
----
2019-01-23 06:39:38 UTC - bossbaby: i have 1 topic and many brokers, but only 1 broker serves
a given topic. if many producers and consumers connect to this topic, it will be a bottleneck.
how do i solve it?
----
2019-01-23 06:53:52 UTC - Samuel Sun: partitions ?
----
2019-01-23 06:54:49 UTC - bossbaby: i use normal topic
----
2019-01-23 06:58:08 UTC - bossbaby: i realize pulsar supports partitioned topics spread across
many brokers, but how many partitions should i use if i have 6 brokers?
----
2019-01-23 07:03:22 UTC - Samuel Sun: I think it could be 6,12,18.. ?
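Partitioned topics are created via `pulsar-admin`; the topic name below is illustrative, and a multiple of the broker count (e.g. 12 for 6 brokers) lets partitions spread evenly:

```shell
bin/pulsar-admin topics create-partitioned-topic \
  persistent://public/default/my-topic \
  --partitions 12
```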
----
2019-01-23 07:08:46 UTC - bossbaby: thank you @Samuel Sun
----
2019-01-23 07:09:09 UTC - Samuel Sun: np
----
2019-01-23 08:04:49 UTC - Samuel Sun: hi, one question about the required parameters in broker.conf:
is “brokerServicePort” a required one? and how do I know which parameters are required?
thanks
----
2019-01-23 08:06:12 UTC - Samuel Sun: this question is related to this issue : <https://github.com/apache/pulsar/issues/3390>
----
2019-01-23 08:18:32 UTC - bossbaby: why doesn't PartitionedConsumer support acknowledgeCumulative
in the pulsar c++ client?
----