mahout-user mailing list archives

From Federico Castanedo <fcast...@inf.uc3m.es>
Subject Re: Query
Date Sun, 03 Oct 2010 19:57:56 GMT
Hi Ted,

Could you please post the link to the first paper again?

IMHO the problem with this kind of project is *how* you obtain and
define the feature vector. That is the key to comparing the different
images successfully and returning the k most similar images for any
given image.

gagan, look at the features provided in the UCI machine learning
dataset for images; they are a good starting point and do not require
you to perform any image processing yourself. You can also obtain
those features with the algorithms implemented in the OpenCV library.
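
To make the "feature vector plus k most similar images" idea concrete,
here is a minimal sketch in plain Python. It is only an illustration
under my own assumptions: the feature is the HSV histogram Ted mentions
in the quoted message, the bin counts (8, 4, 4) are arbitrary, images
are given as lists of (r, g, b) tuples, and the function names
hsv_histogram and k_most_similar are hypothetical, not taken from
Mahout, OpenCV or the UCI dataset.

```python
import colorsys
import math


def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Normalised, flattened HSV histogram of (r, g, b) pixels in 0..255.

    The bin counts are an arbitrary assumption chosen for illustration.
    """
    hb, sb, vb = bins
    hist = [0.0] * (hb * sb * vb)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # min() keeps a component equal to 1.0 inside the last bin.
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1.0
        count += 1
    if count == 0:
        return hist
    # Divide by the pixel count so images of different sizes are comparable.
    return [x / count for x in hist]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_most_similar(query, database, k=3):
    """Return the names of the k database entries closest to the query.

    database maps an image name to its feature vector.
    """
    ranked = sorted(database, key=lambda name: euclidean(query, database[name]))
    return ranked[:k]


# Tiny synthetic "images": flat colour patches instead of real photos.
red = [(255, 0, 0)] * 16
blue = [(0, 0, 255)] * 16
almost_red = [(250, 10, 10)] * 16

database = {"red": hsv_histogram(red), "blue": hsv_histogram(blue)}
nearest = k_most_similar(hsv_histogram(almost_red), database, k=1)
```

A colour histogram only captures colour distribution, so it will not
tell a dog from a cat of the same colour; it is just the simplest
feature that lets you exercise the retrieval loop end to end.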

Cheers,
Fede

2010/10/3 Ted Dunning <ted.dunning@gmail.com>:
> This paper had some interesting references.  The problem they worked on was
> different from yours, but if you know something about the training images,
> this might work out.  The something might be the original web-site, nearby
> text or almost anything.
>
> http://www.public.asu.edu/~huanliu/.../SBP09_3-31(Baoxin%20Li%20-4).pdf
>
> This paper describes the use of Gabor transforms and histograms for image
> clustering:
>
> http://www-nlpir.nist.gov/projects/tvpubs/tv6.papers/eurecom.pdf
>
> HSV histogram clustering might be a reasonably sized effort for a student
> project.
>
> Another approach is to try a latent factor method to characterize images.
>  This paper describes an image completion task on a handwritten digit
> dataset.  I am pretty sure that clustering on these latent features would
> give very nice clustering because they inherently have a Euclidean metric
> imposed on them.
>
> http://arxiv.org/abs/1006.2156
>
> The recommendation that you use OpenCV for image extraction is a very good
> one.  You might want to use Mahout for clustering, but I doubt you will have
> enough images to make that worth-while.  Just extracting useful features
> will take a long time.
>
> On Sun, Oct 3, 2010 at 10:33 AM, gagan chhabra <gagan.13031990@gmail.com> wrote:
>
>> Hello Steven Bourke,
>>
>> The data is actually not text. The query is an image, and the database
>> again consists of images.
>>
>> I wanted to know how one can declare one image similar to another, in
>> programming terms. I mean there has to be some parameter of analysis or
>> algorithm which can solve this problem.
>>
>>
>>
>> On Sun, Oct 3, 2010 at 10:44 PM, Steven Bourke <sbourke@gmail.com> wrote:
>>
>> > Where is the semantic data coming from? I think something like Lucene
>> > would be more relevant if you are searching text based on available
>> > meta data.
>> >
>> > On Sun, Oct 3, 2010 at 6:54 PM, Sean Owen <srowen@gmail.com> wrote:
>> >
>> > > You probably want to look at Shannon's spectral clustering code?
>> > > That's the closest thing I can think of in Mahout. It doesn't have
>> > > much of anything for image processing.
>> > >
>> > > On Sun, Oct 3, 2010 at 5:02 PM, gagan chhabra <gagan.13031990@gmail.com> wrote:
>> > >
>> > > > Hello all,
>> > > >
>> > > > I am an Engineering candidate and took a project which is based on
>> > > > Machine Learning. The idea is Query-by-Image; it comes from a
>> > > > research paper by Googlers. I cannot find a point to start from.
>> > > >
>> > > > I don't know if Mahout is of any use to me, but since it is meant
>> > > > for Machine Learning I joined to know more about it.
>> > > >
>> > > > My application will go like this:
>> > > > >  User enters a query (which is an image).
>> > > >
>> > > > >  Then the application searches the database for other images
>> > > > with the same semantics.
>> > > >  For example, if the user enters an image of a dog, the app will
>> > > > retrieve other images of dogs, or if the user enters an image of a
>> > > > snowy mountain, it retrieves similar images.
>> > > >
>> > > > So I don't get how to compare images. What metric should I use to
>> > > > declare an image similar to the query image?
>> > > >
>> > > > Please suggest something... any help will make a huge difference.
>> > > >
>> > > > --
>> > > > gagan
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> gagan
>>
>
