nifi-users mailing list archives

From David Klim <davidkl...@hotmail.com>
Subject RE: Filtering GetTwitter with both terms and locations
Date Mon, 02 Nov 2015 11:22:27 GMT
Thanks Juan! After playing with some processors a little bit I get the filtered results I wanted...almost!
Still some work to do

Date: Sun, 1 Nov 2015 17:14:43 -0500
Subject: Re: Filtering GetTwitter with both terms and locations
From: bbende@gmail.com
To: users@nifi.apache.org

Hi David,
After re-reading the Twitter API documentation [1], it says:
"The track, follow, and locations fields should be considered to be combined with an OR operator.
track=foo&follow=1234 returns Tweets matching “foo” OR created by user 1234."
-Bryan
[1] https://dev.twitter.com/streaming/reference/post/statuses/filter
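
To make the OR rule quoted above concrete, here is a toy model in Python. It is only an illustration: `matches_filter` and the `text` / `user_id` fields are hypothetical names chosen for this sketch, not part of the Twitter API or of NiFi.

```python
def matches_filter(tweet, track=None, follow=None):
    """Toy model of the documented rule: the predicates are combined with OR."""
    # A term matches if it appears in the text (case-insensitive substring here;
    # the real matching rules are more detailed than this).
    text_hit = track is not None and track.lower() in tweet["text"].lower()
    # A follow predicate matches if the tweet was created by that user.
    user_hit = follow is not None and tweet["user_id"] == follow
    return text_hit or user_hit


tweets = [
    {"text": "foo is great", "user_id": 1},   # matches the term only
    {"text": "unrelated", "user_id": 1234},   # matches the user only
    {"text": "nothing here", "user_id": 99},  # matches neither
]
matched = [t for t in tweets if matches_filter(t, track="foo", follow=1234)]
```

In this model the first two tweets are returned even though neither satisfies both predicates, which is the same effect David is seeing with terms and locations.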

On Sun, Nov 1, 2015 at 5:08 PM, Juan Jose Escobar <juanjose.escobar@gmail.com> wrote:
Hello, David
In a previous post you mentioned you are filtering by location and terms. Keep in mind the
way the filtering works in the GetTwitter processor: in your case it will return all tweets
that are associated with the specified bounding box OR match any of the terms. Do not expect
the output of the processor to contain only tweets matching both conditions. You will need to
implement one of the conditions separately to do so. Unless the frequency of the terms is high,
I would say the best approach is to filter only by terms, and then filter by location
using additional processors. There are many options here, e.g. you could use EvaluateJsonPath
(write to attribute), then RouteOnAttribute.
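
The location check itself is simple. Below is a minimal sketch in Python of the logic a downstream step would apply to each tweet already filtered by terms. It assumes the tweet JSON carries a GeoJSON-style `coordinates` object with `[longitude, latitude]`; many tweets have no exact point, and this sketch simply rejects those. The function name `in_bounding_box` is hypothetical, and the box values are the ones from the configuration quoted further down in this thread.

```python
import json

# Bounding box from the processor configuration quoted in this thread, read as
# SW longitude, SW latitude, NE longitude, NE latitude.
BBOX = (-124.476284, 32.172870, -59.437227, 48.862583)


def in_bounding_box(tweet_json, bbox=BBOX):
    """Return True if the tweet has an exact point that lies inside bbox."""
    tweet = json.loads(tweet_json)
    point = tweet.get("coordinates")
    if not point:
        # No exact location on this tweet; treat it as not matching.
        return False
    lon, lat = point["coordinates"]  # GeoJSON order: longitude first
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north
```

In NiFi terms, EvaluateJsonPath would extract the longitude and latitude into attributes and RouteOnAttribute would apply the equivalent comparison.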
Hope this helps
On Sun, Nov 1, 2015 at 10:47 PM, David Klim <davidklmlg@hotmail.com> wrote:




I have tested this again with different term combinations, but it still seems to be ignoring the
term filtering. Any ideas on how to fix it?
Thanks in advance!
From: davidklmlg@hotmail.com
To: users@nifi.apache.org
Subject: RE: Filtering GetTwitter with both terms and locations
Date: Tue, 27 Oct 2015 23:46:00 +0100




Hello,
As far as I can see from my testing, terms seem to be ignored.
Here is the twitter processor configuration:
---
twitter endpoint: filter endpoint
terms to filter on: kmeans,kmean
locations to filter on: -124.476284,32.172870,-59.437227,48.862583
---

Date: Wed, 21 Oct 2015 16:09:02 -0400
Subject: Re: Filtering GetTwitter with both terms and locations
From: bbende@gmail.com
To: users@nifi.apache.org

From looking at the processor code, it looks like it adds both the terms and locations
to the filter endpoint and should be able to filter on both. The processor leverages the Hosebird
Client [1], so it is possible that the library is not working as expected.
Is there a specific example of terms that aren't working? Or do they never work in conjunction
with locations?
[1] https://github.com/twitter/hbc
On Wed, Oct 21, 2015 at 3:03 PM, David Klim <davidklmlg@hotmail.com> wrote:



Hello,
I am trying to get data from the Twitter filter endpoint using both a location (bounding box)
and terms to filter on. The data I get is not being filtered by terms at all. Is there any
known problem with this feature? I am not sure whether the processor behaves as I expect.
Thanks a lot!

 		 	   		  

 		 	   		   		 	   		  



 		 	   		  