pulsar-users mailing list archives

From "Apache Pulsar Slack" <apache.pulsar.sl...@gmail.com>
Subject Slack digest for #general - 2019-04-12
Date Fri, 12 Apr 2019 09:11:02 GMT
2019-04-11 10:13:14 UTC - Michael Bongartz: Hi there,
Does anyone know if there is any issue with pulsar-admin handling `307 Temporary Redirect` when
authentication is enabled?
I have a multi-broker setup with Pulsar proxy and authentication enabled, and I can only perform
operations on topics if I am lucky enough that my request is directed by Pulsar proxy to the
broker owning the topic. When it hits another broker, it seems `pulsar-admin` receives an
HTTP 307 from the broker, then tries to connect to the broker owning the topic but without
authenticating.
----
2019-04-11 10:40:36 UTC - Sijie Guo: What is the authentication method you are using? TLS
or JWT?
----
2019-04-11 12:49:16 UTC - Steve Kim: I am willing to contribute, but it will take me a while
because I am new to this code and don't have much time.
----
2019-04-11 12:49:26 UTC - Steve Kim: Do I need to sign a contributor agreement?
----
2019-04-11 14:01:19 UTC - jia zhai: no need to do that
----
2019-04-11 14:02:57 UTC - jia zhai: <https://pulsar.apache.org/en/contributing/>
----
2019-04-11 14:03:16 UTC - jia zhai: FYI. Here is a guide :slightly_smiling_face:
----
2019-04-11 15:58:33 UTC - chris: @Michael Bongartz if you are using JWT for authentication,
there is an issue with Java stripping the headers in the HTTP URL connection on a redirect.
This issue is addressed here <https://github.com/apache/pulsar/pull/3869>. The fix is
in master and will be out in 2.3.1.
----
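A minimal sketch of the redirect problem described above, assuming a client that follows the `307` by hand and sends the token again on each hop. The `send` callback, the function name and the broker hostnames are hypothetical; this is not how `pulsar-admin` or the fix in the linked PR is implemented.

```python
from urllib.parse import urljoin


def request_with_auth_redirects(send, url, token, max_redirects=5):
    """Follow 307 redirects manually, re-attaching the token on every hop.

    `send(url, headers)` is a hypothetical transport callback that returns
    a (status_code, response_headers, body) tuple.
    """
    headers = {"Authorization": f"Bearer {token}"}
    for _ in range(max_redirects + 1):
        status, resp_headers, body = send(url, headers)
        if status != 307:
            return status, body
        # The broker that owns the topic is named in the Location header.
        url = urljoin(url, resp_headers["Location"])
    raise RuntimeError("too many redirects")
```

Re-sending credentials like this is only reasonable when every redirect target is a trusted broker in the same cluster.
----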
2019-04-11 16:23:57 UTC - Grant Wu: Are Pulsar brokers supposed to crash loop while Zookeeper
is undergoing network partitions/having leader elections?  Not saying this is unreasonable
behavior, just wanted to double check this is expected
----
2019-04-11 16:26:07 UTC - Matteo Merli: It depends how long the network partition/election
lasts.
----
2019-04-11 16:27:47 UTC - Matteo Merli: Brokers have a ZK session and hold “locks” on
resources (like ownership of a group of topics) within that session.

A session is valid until it cannot be refreshed within the session timeout. By default the
session time
----
2019-04-11 16:29:11 UTC - Matteo Merli: By default, we use 30 sec for session timeout. If
a broker is not able to talk with a functioning ZK ensemble for that amount of time, it will
not be able to ensure it’s still the owner of the resources.
----
2019-04-11 16:30:15 UTC - Matteo Merli: ..Hence we bounce the broker.

We plan to improve that behavior by making sure we can keep writing to BookKeeper
while ZK is down (avoiding all attempts at metadata changes).
----
2019-04-11 16:30:52 UTC - Matteo Merli: In the meantime, you can increase/decrease the ZK session
timeout depending on your needs
----
2019-04-11 16:32:59 UTC - Matteo Merli: a long session timeout will make a broker “survive”
a long partition, at the expense of a longer time for this session to expire when a broker
crashes badly.

e.g., if a broker crashes hard, all the topics will still be seen as “owned” by that
broker until the session expires and it’s cleaned up by ZK. In the meantime, clients keep
trying to reconnect
----
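The timeout trade-off above can be sketched as simple arithmetic. This is a deliberately simplified model, not ZooKeeper's real session protocol; the function names are hypothetical and the 30-second default is the figure quoted in the thread.

```python
DEFAULT_SESSION_TIMEOUT_S = 30.0  # default quoted in the thread


def broker_keeps_session(outage_s, session_timeout_s=DEFAULT_SESSION_TIMEOUT_S):
    """True if the broker can reach ZK again before its session expires."""
    return outage_s < session_timeout_s


def max_stale_ownership_s(session_timeout_s=DEFAULT_SESSION_TIMEOUT_S):
    """Upper bound on how long a hard-crashed broker still appears to own topics."""
    return session_timeout_s


# Compare three timeouts against a 45-second partition.
for timeout in (10.0, 30.0, 120.0):
    print(timeout, broker_keeps_session(45.0, timeout), max_stale_ownership_s(timeout))
```

In this model only the 120-second timeout rides out the 45-second partition, and it also means up to 120 seconds of stale ownership after a hard crash.
----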
2019-04-11 16:34:45 UTC - Grant Wu: I see
----
2019-04-11 16:37:18 UTC - George Wilk: Message retention policy question:  if message retention
is set to keep all messages indefinitely, does it apply to all messages ever published in
scope of a namespace?  If so, would it apply had there never been any open subscriptions?
 Actual use case scenario: ServiceA (publisher) is deployed before any client services (subscribers)
come online, but when they do we need to make sure they can get all the backlogged (pardon the
misnomer) messages ever published by ServiceA.
----
2019-04-11 16:38:34 UTC - Yuvaraj Loganathan: Yes. Messages will be retained even if there
are no subscriptions
----
2019-04-11 16:39:35 UTC - Fredrick P Eisele: In the <https://pulsar.apache.org/docs/en/deploy-bare-metal/#initializing-cluster-metadata>
it says "It only needs to be written once". Is there a problem with running it more than once?
How can it be retracted?
----
2019-04-11 16:42:24 UTC - George Wilk: Thank you for this quick reply!  Quick follow-up: would
the same be true in a scenario where some subscriptions already exist when a new subscription is added?
 Existing subscriptions have already consumed and ACKED all messages - would the new subscription
be able to consume all messages from the beginning?
----
2019-04-11 16:47:59 UTC - Grant Wu: You would need to reset its cursor to the beginning
----
2019-04-11 16:48:55 UTC - Matteo Merli: Or subscribing with initialPosition: <https://pulsar.apache.org/api/client/org/apache/pulsar/client/api/ConsumerBuilder.html#subscriptionInitialPosition-org.apache.pulsar.client.api.SubscriptionInitialPosition->
----
2019-04-11 16:49:19 UTC - Matteo Merli: (or creating a Reader and starting on `MessageId.earliest`)
----
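A toy in-memory model of the answers above, assuming infinite retention: messages stay on the topic regardless of subscriptions, and a new subscription starts at the end unless it asks for the earliest position or has its cursor reset. The class and method names are hypothetical and this does not use the Pulsar client.

```python
class RetainedTopic:
    """Toy stand-in for a topic whose messages are retained indefinitely."""

    def __init__(self):
        self.messages = []  # retained whether or not anyone is subscribed
        self.cursors = {}   # subscription name -> index of next message

    def publish(self, payload):
        self.messages.append(payload)

    def subscribe(self, name, earliest=False):
        # A new subscription starts at the end unless told otherwise.
        if name not in self.cursors:
            self.cursors[name] = 0 if earliest else len(self.messages)

    def receive_all(self, name):
        start = self.cursors[name]
        self.cursors[name] = len(self.messages)  # treat everything as acked
        return self.messages[start:]

    def reset_cursor(self, name):
        # Rewind an existing subscription to the first retained message.
        self.cursors[name] = 0


topic = RetainedTopic()
for i in range(3):
    topic.publish(f"msg-{i}")  # published before any subscriber exists

topic.subscribe("late-default")
topic.subscribe("late-earliest", earliest=True)
print(topic.receive_all("late-default"))
print(topic.receive_all("late-earliest"))
```

The `late-default` subscription sees nothing until its cursor is reset, while `late-earliest` gets the full history.
----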
2019-04-11 16:49:26 UTC - Grant Wu: Ah, yes, I see that’s landed into the Java client
----
2019-04-11 16:49:36 UTC - Grant Wu: Will we be getting that in the other clients? :stuck_out_tongue:
----
2019-04-11 16:50:04 UTC - George Wilk: ty!
----
2019-04-11 17:02:25 UTC - Chris Bartholomew: There is an issue in 2.3.0 on redirects when
using JWT. See <https://github.com/apache/pulsar/pull/3869>
----
2019-04-11 17:14:50 UTC - David Kjerrumgaard: @Fredrick P Eisele The command is used to initialize
all the bookie meta-data within zookeeper. This meta-data is crucial to keeping track of where
the Pulsar topic data is stored on BookKeeper. If you re-run the command all of that information
will be overwritten, resulting in full data loss from the Pulsar perspective.
----
2019-04-11 17:17:51 UTC - Matteo Merli: @Grant Wu It’s already there in C++ and Go..
----
2019-04-11 17:18:16 UTC - Matteo Merli: In Python too.. though the pdoc publish is broken..
and the docs on the webpage are not updated :confused:
----
2019-04-11 17:18:35 UTC - Matteo Merli: Anyway: `subscribe(....., initial_position=InitialPosition.Earliest)`
----
2019-04-11 17:21:42 UTC - Fredrick P Eisele: @David Kjerrumgaard If you overwrite metadata
with exactly the same values that will look like data loss to Pulsar?
----
2019-04-11 17:29:16 UTC - David Kjerrumgaard: @Fredrick P Eisele No, but I don't think it
is possible to have a copy of the same values.  The data that is stored includes each of the
ledger ids and the bookies that they were placed on for EACH pulsar TOPIC.
----
2019-04-11 17:30:27 UTC - Grant Wu: Ah, I see
----
2019-04-11 17:30:29 UTC - Grant Wu: That’s good to hear
----
2019-04-11 17:30:36 UTC - Grant Wu: Do you know about the Websocket API :stuck_out_tongue:
----
2019-04-11 20:52:01 UTC - Devin G. Bost: We have a pulsar Kafka source that we are creating
successfully, but it's not starting. (Checking the sink's status shows that it's not running.)
In the logs, we're seeing, "Was passed main parameter but no main parameter was defined."
Any ideas?
----
2019-04-11 20:58:48 UTC - David Kjerrumgaard: What was the command you used to start it? 
The error seems to indicate that you are providing it some additional information that it
is not expecting.
----
2019-04-11 21:10:06 UTC - Devin G. Bost: It should start automatically once it is created.
----
2019-04-11 21:17:41 UTC - Thor Sigurjonsson: I have a pressing question on roles, permissions
and functions in version 2.3.0. (we're doing a deploy)
----
2019-04-11 21:18:25 UTC - Thor Sigurjonsson: We've turned on token auth. We're figuring out
the hard way what roles the functions-worker (as a java thread executor) has to have. (we
were masking that issue with having an anonymous role).
----
2019-04-11 21:18:36 UTC - Thor Sigurjonsson: We got it to deploy but no data is flowing.
----
2019-04-11 21:18:46 UTC - Thor Sigurjonsson: Does it need produce,consume permissions?
----
2019-04-11 21:18:48 UTC - Devin G. Bost: (It's related to the error I reported. We think it's
related to removal of "anonymous" permissions.)
----
2019-04-11 21:19:25 UTC - Thor Sigurjonsson: (we initially could not deploy without a super-user
role token given to the worker)
----
2019-04-11 21:21:12 UTC - Emma Pollum: Is there an option to turn off function metrics?
----
2019-04-11 21:27:28 UTC - David Kjerrumgaard: @Devin G. Bost Is the issue with one of the
built-in Pulsar connectors or one you have developed yourself?
----
2019-04-11 21:27:45 UTC - Devin G. Bost: It's using the built-in Kafka connector for Pulsar.
----
2019-04-11 21:28:40 UTC - David Kjerrumgaard: Sink or source?
----
2019-04-11 21:33:24 UTC - David Kjerrumgaard: What configs did you pass to the `pulsar-admin
source create ...`  command?  @Devin G. Bost
----
2019-04-11 21:37:35 UTC - Thor Sigurjonsson: we have both sinks and sources, but data flow
would not begin until a source kicked in..
----
2019-04-11 21:37:50 UTC - Thor Sigurjonsson: we also see 0 instances of source
----
2019-04-11 21:51:09 UTC - Ali Ahmed: @Devin G. Bost @Thor Sigurjonsson can you post the full
command you use to start the source instance
----
2019-04-11 21:55:34 UTC - Devin G. Bost: ```bin/pulsar-admin source create \
              --source-type 'kafka' \
              --destinationTopicName persistent://osp/obfuscated/log-topic \
              --sourceConfigFile /data/provisioning/obfuscated-kafka-log-topic-source.conf \
              --namespace obfuscated \
              --name kafka-log-topic-source \
              --tenant osp```

Then, the contents of the .conf file are:

```configs:
  bootstrapServers: obfuscated
  consumerConfigProperties:
    auto.offset.reset: latest
    sasl.jaas.config: com.sun.security.auth.module.Krb5LoginModule required doNotPrompt=true
      useTicketCache=false serviceName="kafka" principal="pulsar_runtime@obfuscated.com"
      useKeyTab=true keyTab="/pulsar/conf/auth/pulsar_runtime_dev.keytab" client=true;
    sasl.kerberos.service.name: kafka
    security.protocol: SASL_PLAINTEXT
  groupId: log-group
  topic: log-topic```
----
2019-04-11 21:55:57 UTC - Thor Sigurjonsson: @Ali Ahmed we've had that work before
----
2019-04-11 21:56:23 UTC - Devin G. Bost: The only difference related to the roles and the
involvement of the `anonymous` role.
----
2019-04-11 21:57:24 UTC - Thor Sigurjonsson: (we think)
----
2019-04-11 21:58:14 UTC - Ali Ahmed: you seem to be providing the topic twice in different
formats
----
2019-04-11 21:59:11 UTC - Devin G. Bost: That part has worked before.
----
2019-04-11 21:59:58 UTC - Ali Ahmed: is this the kafka topic ```topic: log-topic```
----
2019-04-11 22:00:43 UTC - Jerry Peng: @Devin G. Bost @Thor Sigurjonsson so the only thing
that changed is your “anonymousUserRole=anonymous” setting?
----
2019-04-11 22:01:00 UTC - Jerry Peng: or can you describe the changes you’ve made
----
2019-04-11 22:02:03 UTC - Thor Sigurjonsson: I just started the broker with the superuser role
being the anonymous role and the kafka source is flowing data (I can see that from a status call;
before, the status response showed zero instances).
----
2019-04-11 22:05:29 UTC - Jerry Peng: ok then there is probably an authorization issue somewhere
where the anonymous role didn’t have the correct permissions set to produce or consume in a namespace
----
2019-04-11 22:05:42 UTC - Jerry Peng: I would also check the function logs to see if there
are any errors
----
2019-04-11 22:06:05 UTC - Jerry Peng: Should see connections fail
----
2019-04-11 22:07:30 UTC - Thor Sigurjonsson: we were getting empty logs at one point
----
2019-04-11 22:07:34 UTC - Thor Sigurjonsson: and zero instances
----
2019-04-11 22:07:38 UTC - Thor Sigurjonsson: but let me look now
----
2019-04-11 22:07:55 UTC - Thor Sigurjonsson: (we have anonymous working, but want to get to
a more "secure" setup :wink: )
----
2019-04-11 22:08:35 UTC - Matteo Merli: @Thor Sigurjonsson a secure setup will involve containers
and K8S
----
2019-04-11 22:08:40 UTC - Thor Sigurjonsson: (log file is 0 bytes now)
----
2019-04-11 22:08:50 UTC - Devin G. Bost: Anonymous superusers just seems less than desirable...
----
2019-04-11 22:09:17 UTC - Thor Sigurjonsson: @Devin G. Bost "he used sarcasm" :slightly_smiling_face:
----
2019-04-11 22:09:57 UTC - Thor Sigurjonsson: @Matteo Merli what can we do with thread execution
stuff today in terms of auth?
----
2019-04-11 22:10:33 UTC - Matteo Merli: problem is mainly that if you have “untrusted”
code running as thread/process it will have access to the worker credentials
----
2019-04-11 22:12:12 UTC - Thor Sigurjonsson: We're ok waiting for a release with more auth
support as part of function support, we're just trying to get to a place first where we have
producer/consumer clients auth'd and functions/sources/sinks that work also (superuser is
ok there in the interim). We have control of the code now but later will need to treat functions
as more "untrusted".
----
2019-04-11 22:13:33 UTC - Jerry Peng: Currently, functions running in process or runtime mode
can only assume the same role as the worker/broker
----
2019-04-11 22:14:07 UTC - Thor Sigurjonsson: That's fine for our current use case.
----
2019-04-11 22:14:21 UTC - Jerry Peng: which is what is specified for clientAuthenticationParameters
and clientAuthenticationPlugin
----
2019-04-11 22:14:22 UTC - Jerry Peng: ok
----
2019-04-11 22:15:36 UTC - Thor Sigurjonsson: Yes, we set that up as the super user (was working
with anonymous role as superuser before). Then deploys worked once we had a different super
user role in there but no data was flowing.
----
2019-04-11 22:15:50 UTC - Thor Sigurjonsson: We were guessing that a produce,consume was missing.
----
2019-04-11 22:16:07 UTC - Thor Sigurjonsson: But that needs to be applied to different constructs
we're creating (tenants, namespaces, etc).
----
2019-04-11 22:16:34 UTC - Thor Sigurjonsson: For now we just need a little more clarity on
that so we can turn off the anonymous hack.
----
2019-04-11 22:17:57 UTC - Jerry Peng: @Thor Sigurjonsson are you using tokens or TLS for auth?
----
2019-04-11 22:18:09 UTC - Thor Sigurjonsson: @Jerry Peng we're using token auth
----
2019-04-11 22:18:11 UTC - Jerry Peng: and what role will the token or certificate resolve
to?
----
2019-04-11 22:18:38 UTC - Jerry Peng: that role needs to be a super user role
----
2019-04-11 22:18:58 UTC - Thor Sigurjonsson: Yes, we got that working (for deploying).
----
2019-04-11 22:20:01 UTC - Thor Sigurjonsson: we named it 'superuser' and added it to `super-user`
config in the broker and put the token for that in the `function-worker.yml`. Deploy worked.
Source would not instantiate or flow.
----
2019-04-11 22:20:51 UTC - Thor Sigurjonsson: And before that, the broker also didn't start without
that role for the function worker.
----
2019-04-11 22:21:53 UTC - Jerry Peng: you are using version 2.3.0?
----
2019-04-11 22:22:27 UTC - Devin G. Bost: Yes.
----
2019-04-11 22:25:35 UTC - Jerry Peng: can you check the assignments for the sources:
curl <BROKER_HOSTNAME>:8080/admin/v2/worker/assignments
----
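With token auth turned on, an unauthenticated call to the assignments endpoint above may be rejected. A minimal sketch of building the same request with a bearer token, assuming the token is sent in an `Authorization: Bearer` header; the helper name, hostname and token value are hypothetical.

```python
import urllib.request


def build_assignments_request(broker_host, token, port=8080):
    """Build (but do not send) a GET for the worker assignments endpoint."""
    url = f"http://{broker_host}:{port}/admin/v2/worker/assignments"
    # Assumption: the broker accepts the token as a bearer credential.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


req = build_assignments_request("broker.example", "my-token")
print(req.full_url)
# To actually send it: urllib.request.urlopen(req).read()
```
----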
2019-04-11 22:25:45 UTC - Devin G. Bost: Sure thing.
----
2019-04-11 22:27:13 UTC - Thor Sigurjonsson: we're seeing those
----
2019-04-11 22:28:01 UTC - Devin G. Bost: What should we be looking for?
----
2019-04-11 22:28:12 UTC - Jerry Peng: can you go to a machine that is running an instance
of the source and check the command line arguments that are passed in
----
2019-04-11 22:29:16 UTC - Jerry Peng: see if the parameters client_auth_plugin and client_auth_params
are properly configured
+1 : Thor Sigurjonsson
----
2019-04-11 22:29:56 UTC - Thor Sigurjonsson: checking
----
2019-04-11 22:31:19 UTC - Thor Sigurjonsson: how should we check it?  -- we used to be able
to do it with a process executor with listing processes
----
2019-04-11 22:31:54 UTC - Jerry Peng: something like: ps aux | grep pulsar
----
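For the process runtime, the check suggested above could be scripted roughly as follows. This sketch pulls the two auth parameters out of one command line taken from `ps` output. The `--` flag spelling is an assumption based on the parameter names mentioned in the thread, and the sample command line is made up.

```python
import shlex


def extract_flags(cmdline, wanted=("--client_auth_plugin", "--client_auth_params")):
    """Return {flag: value} for the wanted flags found in one command line."""
    tokens = shlex.split(cmdline)
    found = {}
    for i, tok in enumerate(tokens[:-1]):
        if tok in wanted:
            found[tok] = tokens[i + 1]  # the value follows its flag
    return found


# Hypothetical command line, shaped like one line of `ps aux` output.
sample = ("java -cp instance.jar Main "
          "--client_auth_plugin org.apache.pulsar.client.impl.auth.AuthenticationToken "
          "--client_auth_params token:REDACTED")
print(extract_flags(sample))
```

An empty result for a running instance would suggest the worker did not pass its credentials down.
----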
2019-04-11 22:31:55 UTC - Thor Sigurjonsson: this is running as a thread (we're doing that
as we needed to plumb the kerberos params) and we're not on a build with the PR that would
fix that.
----
2019-04-11 22:32:03 UTC - Jerry Peng: oh gotcha
----
2019-04-11 22:33:56 UTC - Jerry Peng: give me one sec to check something
----
2019-04-11 22:36:16 UTC - Jerry Peng: ok in thread runtime those args won’t be printed out
anywhere.  Can you check in the broker logs to see if there are any exceptions?  Like connection
exceptions or auth exceptions?
----
2019-04-11 22:37:36 UTC - Jerry Peng: You can also check which configs the function worker
is starting with by searching for the line:
```
Worker Configs:

```
----
2019-04-11 22:37:55 UTC - Jerry Peng: see if clientAuthenticationParameters and clientAuthenticationPlugin
configs are set properly
----
2019-04-11 22:44:50 UTC - Thor Sigurjonsson: those are set in the config, I've also seen them
in log files (/var/log/messages). I see logs from when I had an old token, changed to the new
one, and when I didn't have one. Right now we have the super user one in play.
----
2019-04-11 22:45:30 UTC - Thor Sigurjonsson: org.apache.pulsar.client.impl.auth.AuthenticationToken
is also there..
----
2019-04-11 22:46:12 UTC - Thor Sigurjonsson: I guess my biggest question is if a super-user
role has to also have specific consume/produce permissions.
----
2019-04-11 22:47:05 UTC - Jerry Peng: it should not
----
2019-04-11 22:47:52 UTC - Jerry Peng: a role set in “superUserRoles” should have permissions
to do everything
----
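The rule stated above can be sketched as a toy authorization check: a role listed as a superuser is allowed everything, and any other role needs an explicit grant on the namespace. The function and the data layout are hypothetical and are not Pulsar's actual implementation.

```python
def is_allowed(role, action, namespace, super_user_roles, grants):
    """Toy authorization check.

    `grants` maps namespace -> role -> set of actions, e.g. {"produce", "consume"}.
    """
    if role in super_user_roles:
        return True  # superusers bypass per-namespace grants
    return action in grants.get(namespace, {}).get(role, set())


super_users = {"superuser"}
grants = {"osp/obfuscated": {"app-role": {"produce", "consume"}}}

print(is_allowed("superuser", "produce", "osp/obfuscated", super_users, grants))
print(is_allowed("anonymous", "produce", "osp/obfuscated", super_users, grants))
print(is_allowed("app-role", "consume", "osp/obfuscated", super_users, grants))
```

For non-superuser roles, per-namespace grants of this kind are what `pulsar-admin namespaces grant-permission` manages.
----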