Thanks, Juan! After experimenting with a few processors, I am getting the filtered results I wanted... almost! There is still some work to do.


Date: Sun, 1 Nov 2015 17:14:43 -0500
Subject: Re: Filtering GetTwitter with both terms and locations
From: bbende@gmail.com
To: users@nifi.apache.org

Hi David,

I re-read the Twitter API documentation [1], which says:

"The track, follow, and locations fields should be considered to be combined with an OR operator. track=foo&follow=1234 returns Tweets matching “foo” OR created by user 1234."

-Bryan
[1] https://dev.twitter.com/streaming/reference/post/statuses/filter


On Sun, Nov 1, 2015 at 5:08 PM, Juan Jose Escobar <juanjose.escobar@gmail.com> wrote:
Hello, David

In a previous post you mentioned that you are filtering by location and terms. Keep in mind how filtering works in the GetTwitter processor: in your case it will return all tweets that are associated with the specified bounding box OR that match any of the terms. Do not expect the processor's output to contain only tweets matching both conditions. To get that, you will need to implement one of the conditions separately. Unless the frequency of the terms is high, I would say the best approach is to filter only by terms in GetTwitter, and then filter by location using additional processors. There are many options here, e.g. you could use EvaluateJsonPath (write to attribute), then RouteOnAttribute.
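
To make the "terms AND location" logic concrete, here is a minimal Python sketch of the check that the downstream processors would need to perform. It is not NiFi configuration and not how GetTwitter works internally. It assumes each tweet is a JSON object with a "text" field and an optional GeoJSON "coordinates" point in [longitude, latitude] order, and it ignores tweets that only carry a "place". The function names are hypothetical, and the terms and bounding box are simply the ones from David's configuration.

```python
# Values copied from the configuration quoted in this thread (illustration only).
BBOX = (-124.476284, 32.172870, -59.437227, 48.862583)  # sw_lon, sw_lat, ne_lon, ne_lat
TERMS = ("kmeans", "kmean")


def matches_terms(tweet, terms=TERMS):
    """True if the tweet text contains any of the terms (case-insensitive substring)."""
    text = (tweet.get("text") or "").lower()
    return any(term in text for term in terms)


def in_bbox(tweet, bbox=BBOX):
    """True if the tweet has an exact point location inside the bounding box."""
    coords = tweet.get("coordinates")
    if not coords or coords.get("type") != "Point":
        return False  # assumption: tweets without an exact point are dropped
    lon, lat = coords["coordinates"]  # GeoJSON order: longitude first
    sw_lon, sw_lat, ne_lon, ne_lat = bbox
    return sw_lon <= lon <= ne_lon and sw_lat <= lat <= ne_lat


def keep(tweet):
    """AND of both conditions, unlike the OR applied by the streaming filter."""
    return matches_terms(tweet) and in_bbox(tweet)
```

In a flow, the term condition would already be handled by GetTwitter, so only the equivalent of in_bbox would have to be expressed with the extra processors.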

Hope this helps

On Sun, Nov 1, 2015 at 10:47 PM, David Klim <davidklmlg@hotmail.com> wrote:

I have tested this again with different term combinations, but it seems the filtering is being ignored. Any ideas on how to fix it?

Thanks in advance!


From: davidklmlg@hotmail.com
To: users@nifi.apache.org
Subject: RE: Filtering GetTwitter with both terms and locations
Date: Tue, 27 Oct 2015 23:46:00 +0100

Hello,

As far as I can see from my testing, terms seem to be ignored.

Here is the twitter processor configuration:

---
twitter endpoint: filter endpoint
terms to filter on: kmeans,kmean
locations to filter on: -124.476284,32.172870,-59.437227,48.862583
---
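
For readers unsure how to read the locations value above: the Twitter streaming documentation describes it as a comma-separated list of longitude,latitude pairs, with the southwest corner of each bounding box first and the northeast corner second. The Python sketch below only illustrates that reading. The helper names parse_locations and contains are hypothetical, and this is not the processor's own parsing code.

```python
def parse_locations(value):
    """Split a locations string into (sw_lon, sw_lat, ne_lon, ne_lat) tuples."""
    numbers = [float(part) for part in value.split(",")]
    if not numbers or len(numbers) % 4 != 0:
        raise ValueError("expected groups of four numbers: sw_lon,sw_lat,ne_lon,ne_lat")
    return [tuple(numbers[i:i + 4]) for i in range(0, len(numbers), 4)]


def contains(box, lon, lat):
    """True if the point (lon, lat) lies inside the bounding box."""
    sw_lon, sw_lat, ne_lon, ne_lat = box
    return sw_lon <= lon <= ne_lon and sw_lat <= lat <= ne_lat


boxes = parse_locations("-124.476284,32.172870,-59.437227,48.862583")
print(boxes)
print(contains(boxes[0], -74.0, 40.7))   # a point in the New York area
print(contains(boxes[0], -0.12, 51.5))   # a point in the London area
```

Read this way, the configured box spans roughly from the US west coast to about 59 degrees west, and from about 32 to 49 degrees north.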



Date: Wed, 21 Oct 2015 16:09:02 -0400
Subject: Re: Filtering GetTwitter with both terms and locations
From: bbende@gmail.com
To: users@nifi.apache.org

From looking at the processor code, it appears to add both the terms and the locations to the filter endpoint, so it should be able to filter on both. The processor leverages the Hosebird Client [1], so it is possible that the library is not working as expected.

Is there a specific example of terms that aren't working? Or do they never work in conjunction with locations?


On Wed, Oct 21, 2015 at 3:03 PM, David Klim <davidklmlg@hotmail.com> wrote:
Hello,

I am trying to get data from the Twitter filter endpoint using both a location (bounding box) and terms to filter on. The data I get is not being filtered by terms at all. Is there any known problem with this feature? I am not sure whether the processor behaves as I expect.

Thanks a lot!