spark-user mailing list archives

From "guxiaobo1982" <>
Subject How to create distributed matrixes from hive tables.
Date Sun, 18 Jan 2015 12:07:08 GMT

We have large datasets that are already in the format needed for Spark MLlib matrices, but
they are pre-computed by Hive and stored in Hive tables. My question is: can we create a
distributed matrix such as IndexedRowMatrix directly from the Hive tables, rather than reading
the data out of Hive ourselves and feeding it into an empty matrix?
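
To make the question concrete, here is a minimal sketch of one possible approach, not a
confirmed answer from the list. It assumes a PySpark version that ships
pyspark.mllib.linalg.distributed, a Hive-enabled context object exposing a .sql method
(for example a HiveContext or a SparkSession built with Hive support), and a hypothetical
table layout with a numeric column row_idx and an array column vals. The table name, the
column names, and both function names are illustrative only. Because an IndexedRowMatrix
wraps an RDD of IndexedRow objects, the rows can be mapped straight from the query result
without collecting them on the driver first.

```python
def to_indexed_row(record):
    """Normalize a (row_index, values) pair into (int, list of floats)."""
    row_index, values = record
    return int(row_index), [float(v) for v in values]


def hive_table_to_indexed_row_matrix(sql_context, table_name):
    """Build an IndexedRowMatrix from a Hive table (hypothetical schema).

    Assumes the table has columns `row_idx` (numeric) and `vals` (array of
    numbers); both names are placeholders for illustration.
    """
    # Imported lazily so to_indexed_row can be used without PySpark installed.
    from pyspark.mllib.linalg import Vectors
    from pyspark.mllib.linalg.distributed import IndexedRow, IndexedRowMatrix

    # The query is evaluated lazily and in parallel on the executors.
    df = sql_context.sql("SELECT row_idx, vals FROM " + table_name)

    # Map each Hive row to an IndexedRow; nothing is collected on the driver.
    rows = (df.rdd
              .map(lambda r: to_indexed_row((r[0], r[1])))
              .map(lambda p: IndexedRow(p[0], Vectors.dense(p[1]))))

    # The distributed matrix is a thin wrapper around the RDD of rows.
    return IndexedRowMatrix(rows)
```

A call would look like hive_table_to_indexed_row_matrix(sql_context, "my_matrix_table"),
where my_matrix_table is again a placeholder name.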
