spark-user mailing list archives

From jatinpreet <jatinpr...@gmail.com>
Subject Re: Accessing posterior probability of Naive Baye's prediction
Date Fri, 28 Nov 2014 16:34:25 GMT
Thanks Sean, it did turn out to be a simple mistake after all. I appreciate
your help.

Jatin

On Thu, Nov 27, 2014 at 7:52 PM, sowen [via Apache Spark User List] <
ml-node+s1001560n19975h65@n3.nabble.com> wrote:

> No, the feature vector is not converted. It contains the count n_i of
> how often each term t_i occurs (or a TF-IDF transformation of those
> counts). You are finding the class c such that
> P(c) * P(t_1|c)^n_1 * ... is maximized.
>
> In log space it's log(P(c)) + n_1*log(P(t_1|c)) + ...
>
> So your n_1 counts (or TF-IDF values) are used as-is and this is where
> the dot product comes from.
>
> Your bug is probably something lower-level and simple. I'd debug the
> Spark example and print exactly its values for the log priors and
> conditional probabilities, and the matrix operations, and yours too,
> and see where the difference is.
>
> On Thu, Nov 27, 2014 at 11:37 AM, jatinpreet <[hidden email]> wrote:
>
> > Hi,
> >
> > I have been running into some trouble while converting the code to
> > Java. I have done the matrix operations as directed and tried to find
> > the maximum score for each category. But the predicted category is
> > mostly different from the prediction made by MLlib.
> >
> > I am fetching iterators of pi, theta and testData to do my
> > calculations. pi and theta are in log space while my testData vector
> > is not. Could that be a problem? I didn't see an explicit conversion
> > in MLlib either.
> >
> > For example, for two categories and 5 features, I am doing the
> > following operation:
> >
> > [1,2] + [1 2 3 4 5 ] * [1,2,3,4,5]
> >         [6 7 8 9 10]
> >
> > These are simple element-wise matrix multiplication and addition
> > operators.
>
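
For anyone reading this thread later, here is a minimal Java sketch of the log-space scoring Sean describes above: for each class c, score(c) = log(P(c)) + n_1*log(P(t_1|c)) + ..., i.e. one dot product between a row of theta and the raw count vector, plus the matching entry of pi. The class and method names (NaiveBayesScore, logScores, argMax) are made up for illustration and are not MLlib API. The input values are the placeholder numbers from the example in the original question, not real log probabilities.

```java
public class NaiveBayesScore {

    /**
     * Computes pi[c] + dot(theta[c], x) for every class c.
     * pi and theta are assumed to already be in log space;
     * x holds the raw term counts (or TF-IDF values) and is used as-is.
     */
    public static double[] logScores(double[] pi, double[][] theta, double[] x) {
        double[] scores = new double[pi.length];
        for (int c = 0; c < pi.length; c++) {
            double dot = 0.0;
            for (int i = 0; i < x.length; i++) {
                dot += theta[c][i] * x[i];   // n_i * log P(t_i | c)
            }
            scores[c] = pi[c] + dot;         // add the log prior of class c
        }
        return scores;
    }

    /** Returns the index of the largest score (first one wins on ties). */
    public static int argMax(double[] scores) {
        int best = 0;
        for (int c = 1; c < scores.length; c++) {
            if (scores[c] > scores[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder values taken from the example in the question.
        double[] pi = {1, 2};
        double[][] theta = {
            {1, 2, 3, 4, 5},
            {6, 7, 8, 9, 10}
        };
        double[] x = {1, 2, 3, 4, 5};

        double[] scores = logScores(pi, theta, x);
        System.out.println("scores: " + scores[0] + ", " + scores[1]);
        System.out.println("predicted class index: " + argMax(scores));
    }
}
```

With these placeholder inputs the two scores come out as 56.0 and 132.0, so the predicted class index is 1. Note that each class gets a single scalar score: the multiplication is a matrix-vector product (one dot product per row of theta), not an element-wise product that keeps a value per feature.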

-- 
Regards,
Jatinpreet Singh