Date: Wed, 17 Jun 2015 16:42:37 -0700
Subject: Re: k-means for text mining in a streaming context
From: Xiangrui Meng
To: Ruslan Dautkhanov
Cc: user

Yes. You can apply HashingTF on your input stream and then use StreamingKMeans for training and prediction.

-Xiangrui

On Mon, Jun 8, 2015 at 11:05 AM, Ruslan Dautkhanov wrote:
> Hello,
>
> https://spark.apache.org/docs/latest/mllib-feature-extraction.html
> would Feature Extraction and Transformation work in a streaming context?
>
> Wanted to extract text features, build K-means clusters for streaming
> context
> to detect anomalies on a continuous text stream.
>
> Would it be possible?
>
>
> Best regards,
> Ruslan Dautkhanov
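
For anyone reading this in the archive, here is a rough sketch of what the suggested pipeline does conceptually. It is plain Python, not the Spark API: `hashing_tf` and `OnlineKMeans` are hypothetical names made up for this illustration, standing in for MLlib's HashingTF and StreamingKMeans. The center update uses the decayed weighted average described in the MLlib streaming k-means documentation; the CRC32 hash, the tiny feature size, and the hand-picked initial centers are assumptions chosen only to keep the example small.

```python
import zlib


def hashing_tf(tokens, num_features=16):
    """Hashing trick: map tokens to a fixed-length term-frequency vector."""
    vec = [0.0] * num_features
    for tok in tokens:
        # CRC32 is stable across runs (Python's built-in str hash is not).
        # The choice of hash is an assumption of this sketch.
        vec[zlib.crc32(tok.encode("utf-8")) % num_features] += 1.0
    return vec


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OnlineKMeans:
    """Toy streaming k-means: one decayed center update per mini-batch."""

    def __init__(self, centers, decay=1.0):
        self.centers = [list(c) for c in centers]
        self.weights = [0.0] * len(centers)  # effective points seen per cluster
        self.decay = decay  # 1.0 keeps all history, 0.0 forgets it each batch

    def predict(self, vec):
        """Return the index of the nearest center."""
        return min(range(len(self.centers)),
                   key=lambda i: sq_dist(vec, self.centers[i]))

    def update(self, batch):
        """Assign each vector in the batch, then move the centers."""
        k, dim = len(self.centers), len(self.centers[0])
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for vec in batch:
            i = self.predict(vec)
            counts[i] += 1
            for j, x in enumerate(vec):
                sums[i][j] += x
        for i in range(k):
            old_w = self.weights[i] * self.decay  # discount older batches
            new_w = old_w + counts[i]
            if counts[i] > 0:
                # c_new = (c_old * n_old * decay + sum of batch points) / new weight
                self.centers[i] = [(c * old_w + s) / new_w
                                   for c, s in zip(self.centers[i], sums[i])]
            self.weights[i] = new_w


if __name__ == "__main__":
    # Each inner list plays the role of one micro-batch of text from the stream.
    stream = [["spark streaming kmeans", "kmeans clusters for text"],
              ["detect anomalies on a text stream"]]
    seeds = [hashing_tf("kmeans text".split()), hashing_tf("anomalies".split())]
    model = OnlineKMeans(seeds, decay=0.9)
    for batch in stream:
        vectors = [hashing_tf(doc.split()) for doc in batch]
        print([model.predict(v) for v in vectors])  # predict on the batch
        model.update(vectors)                        # then train on it
```

In Spark itself the same two steps would be a map over the DStream with HashingTF's transform, followed by StreamingKMeans for training and prediction. For the anomaly-detection use case, one option (my assumption, not something stated in this thread) would be to flag documents whose distance to the nearest center is unusually large.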