Good idea, Ng. The UPSERT SELECT command doesn't use MR; it uses the HBase APIs. It'd be interesting to see which is fastest: the regular Phoenix APIs, our MR integration, or our Spark integration. I'm not 100% sure these integrations support UPSERT SELECT without some minor modifications. Another option would be to use the CSV bulk loader.
How about loading the data as a DataFrame or RDD, saving it to a new salted table, and then dropping the old table? I feel Spark is much faster than MR. Just my idea, though.

On 18-Aug-2015 10:42 pm, "James Taylor" <firstname.lastname@example.org> wrote:

You can use UPSERT SELECT from the old table to the new table and do this with a single statement: https://phoenix.apache.org/language/index.html#upsert_select

Make sure you set your timeouts high if the table is big.

Thanks,
James
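
For reference, here is a minimal sketch of the statements the UPSERT SELECT approach would involve. It only builds the SQL strings and does not connect to a cluster. The table names, column definitions, and bucket count are hypothetical placeholders, and `build_salt_migration` is an illustrative helper, not part of Phoenix.

```python
def build_salt_migration(old_table, new_table, column_defs, salt_buckets):
    """Return Phoenix SQL statements that copy old_table into a salted new_table.

    All arguments are illustrative; adjust them to the real schema.
    """
    # Phoenix accepts SALT_BUCKETS values from 1 to 256.
    if not 1 <= salt_buckets <= 256:
        raise ValueError("SALT_BUCKETS must be between 1 and 256")

    # 1. Create the new table with salting enabled.
    create = f"CREATE TABLE {new_table} ({column_defs}) SALT_BUCKETS = {salt_buckets}"
    # 2. Copy every row across with a single UPSERT SELECT.
    copy = f"UPSERT INTO {new_table} SELECT * FROM {old_table}"
    # 3. Drop the old table once the copy has been verified.
    drop = f"DROP TABLE {old_table}"
    return [create, copy, drop]


# Hypothetical schema, used only to show the shape of the statements.
statements = build_salt_migration(
    old_table="EVENTS",
    new_table="EVENTS_SALTED",
    column_defs="ID BIGINT NOT NULL PRIMARY KEY, PAYLOAD VARCHAR",
    salt_buckets=16,
)

for statement in statements:
    print(statement)
```

The statements would be run in order through whatever Phoenix client is in use (for example sqlline or JDBC). For a large table, the client-side timeouts mentioned above, such as `phoenix.query.timeoutMs`, would likely need to be raised before running the copy.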