samza-dev mailing list archives

From Ash W Matheson <ash.mathe...@gmail.com>
Subject Re: New to Samza/Yarn and having Kafka issues
Date Sun, 22 Mar 2015 18:55:14 GMT
Ignore that last email - was reading the page stupidly.

On Sun, Mar 22, 2015 at 11:52 AM, Ash W Matheson <ash.matheson@gmail.com>
wrote:

> Is there any easy way to do that without recompiling samza?  I'm trying to
> localize that into the 'hello-samza' and looking at
> http://samza.apache.org/learn/documentation/latest/jobs/logging.html
> leads me to believe that I have to do this in the base samza project (not
> hello-samza).
>
> On Sun, Mar 22, 2015 at 11:37 AM, Ash W Matheson <ash.matheson@gmail.com>
> wrote:
>
>> Sure - I'll do that in a bit and send it up to pastebin.
>>
>> On Sun, Mar 22, 2015 at 11:35 AM, Chinmay Soman <chinmay.cerebro@gmail.com> wrote:
>>
>>> Can you please enable debug level logging and paste the log?
>>>
>>> On Sun, Mar 22, 2015 at 11:28 AM, Ash W Matheson <ash.matheson@gmail.com>
>>> wrote:
>>>
>>> > No, it's behind some corporate stuff - I just redacted it so I could
>>> > share it up.
>>> >
>>> > On Sun, Mar 22, 2015 at 11:17 AM, Chinmay Soman <chinmay.cerebro@gmail.com>
>>> > wrote:
>>> >
>>> > > Just as a sanity check, is the broker host 'redacted:9092' or
>>> > > 'redactedec:9092'?
>>> > >
>>> > > Just wanted to rule out any typos. Are the two hosts above the same?
>>> > >
>>> > > On Sun, Mar 22, 2015 at 11:08 AM, Ash W Matheson <ash.matheson@gmail.com>
>>> > > wrote:
>>> > >
>>> > > > Also, here's the producer: http://pastebin.com/qMNJabTZ
>>> > > >
>>> > > >
>>> > > > On Sun, Mar 22, 2015 at 10:57 AM, Ash W Matheson <ash.matheson@gmail.com>
>>> > > > wrote:
>>> > > >
>>> > > > > Yep, first thing I checked (got bitten by that earlier in the week
>>> > > > > with no data actually in the topic).
>>> > > > >
>>> > > > > On Sun, Mar 22, 2015 at 10:56 AM, Chinmay Soman <chinmay.cerebro@gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > >> Can you double check that you can read data from your Kafka broker?
>>> > > > >>
>>> > > > >> > ./deploy/kafka/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic myTopic
>>> > > > >> > ./deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic myTopic --from-beginning
>>> > > > >>
>>> > > > >> I've seen cases where something like this happens if the Kafka broker
>>> > > > >> isn't shut down properly.
>>> > > > >>
>>> > > > >> On Sun, Mar 22, 2015 at 10:35 AM, Ash W Matheson <ash.matheson@gmail.com>
>>> > > > >> wrote:
>>> > > > >>
>>> > > > >> > Hey all,
>>> > > > >> >
>>> > > > >> > Evaluating Samza currently and am running into some odd issues.
>>> > > > >> >
>>> > > > >> > I'm currently working off the 'hello-samza' repo and trying to parse a
>>> > > > >> > simple Kafka topic that I've produced through an external Java app
>>> > > > >> > (nothing other than a series of sentences) and it's failing pretty hard
>>> > > > >> > for me. The base 'hello-samza' set of apps works fine, but as soon as I
>>> > > > >> > change the configuration to look at a different Kafka/ZooKeeper I get the
>>> > > > >> > following in the userlogs:
>>> > > > >> >
>>> > > > >> > 2015-03-22 17:07:09 KafkaSystemAdmin [WARN] Unable to fetch last offsets
>>> > > > >> > for streams [myTopic] due to kafka.common.KafkaException: fetching topic
>>> > > > >> > metadata for topics [Set(myTopic)] from broker
>>> > > > >> > [ArrayBuffer(id:0,host:redacted,port:9092)] failed. Retrying.
>>> > > > >> >
>>> > > > >> >
>>> > > > >> > The modifications are pretty straightforward.  In the
>>> > > > >> > Wikipedia-parser.properties, I've changed the following:
>>> > > > >> > task.inputs=kafka.myTopic
>>> > > > >> > systems.kafka.consumer.zookeeper.connect=redacted:2181/
>>> > > > >> > systems.kafka.consumer.auto.offset.reset=smallest
>>> > > > >> > systems.kafka.producer.metadata.broker.list=redacted:9092
>>> > > > >> >
>>> > > > >> > and in the actual java file WikipediaParserStreamTask.java
>>> > > > >> >   public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
>>> > > > >> >     Map<String, Object> jsonObject = (Map<String, Object>) envelope.getMessage();
>>> > > > >> >     WikipediaFeedEvent event = new WikipediaFeedEvent(jsonObject);
>>> > > > >> >
>>> > > > >> >     try {
>>> > > > >> >         System.out.println(event.getRawEvent());
>>> > > > >> >
>>> > > > >> > And then following the compile/extract/run process outlined in the
>>> > > > >> > hello-samza website.
>>> > > > >> >
>>> > > > >> > Any thoughts?  I've looked online for any 'super simple' examples of
>>> > > > >> > ingesting kafka in samza with very little success.
>>> > > > >> >
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> --
>>> > > > >> Thanks and regards
>>> > > > >>
>>> > > > >> Chinmay Soman
>>> > > > >>
>>> > > > >
>>> > > > >
>>> > > >
>>> > >
>>> > >
>>> > >
>>> > > --
>>> > > Thanks and regards
>>> > >
>>> > > Chinmay Soman
>>> > >
>>> >
>>>
>>>
>>>
>>> --
>>> Thanks and regards
>>>
>>> Chinmay Soman
>>>
>>
>>
>
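
Editor's note: the WARN quoted above says that fetching topic metadata from
the broker at host 'redacted', port 9092 failed. One cheap thing to rule out,
alongside the typo check suggested in the thread, is whether that host:port
is reachable at all from the machine running the Samza container. The sketch
below is only an illustration using plain java.net; the class name
BrokerReachability and the method canConnect are hypothetical and not part of
Samza or Kafka, and localhost:9092 is a placeholder address. A successful TCP
connect does not prove that Kafka metadata requests will work; it only rules
out basic DNS and network problems.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of Samza or Kafka): plain TCP reachability check. */
public class BrokerReachability {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder address; pass the real broker as "host:port" on the command line.
        String address = args.length > 0 ? args[0] : "localhost:9092";
        int colon = address.lastIndexOf(':');
        if (colon < 0) {
            System.err.println("Expected an address of the form host:port");
            return;
        }
        String host = address.substring(0, colon);
        int port = Integer.parseInt(address.substring(colon + 1));
        System.out.println(address + " reachable: " + canConnect(host, port, 3000));
    }
}
```

Running it from the same host (or container) as the job, with the value of
systems.kafka.producer.metadata.broker.list as the argument, shows whether
the failure is at the network level or further up in Kafka itself.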
