spark-user mailing list archives

From Kevin Jung <>
Subject Re: What is the optimal approach to do Secondary Sort in Spark?
Date Wed, 12 Aug 2015 01:14:50 GMT
You should make the key a tuple. In your case, RDD[((id, timeStamp), value)] is the proper
way to do it: tuple keys sort lexicographically, so sorting by the key orders first by id and
then by timeStamp.
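A minimal sketch of why the composite key works, using plain Scala collections (the ids, timestamps, and values below are made up for illustration); the same tuple key carries over to an RDD, where you would call sortByKey, or repartitionAndSortWithinPartitions with a partitioner that hashes only on id if you want all records for one id in the same partition:

```scala
// Hypothetical records shaped as ((id, timeStamp), value), per the advice above.
val records = Seq(
  (("b", 3L), "v3"),
  (("a", 2L), "v2"),
  (("a", 1L), "v1"),
  (("b", 1L), "v4")
)

// Tuple keys compare lexicographically: by id first, then by timeStamp.
// In Spark the equivalent is rdd.sortByKey() on an RDD[((String, Long), String)].
val sorted = records.sortBy(_._1)

sorted.foreach(println)
```

The same ordering is what Spark applies element-wise when the RDD's key type is a tuple with an implicit Ordering, so no custom comparator is needed.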


------- Original Message -------
Sender : swetha<>
Date : 2015-08-12 09:37 (GMT+09:00)
Title : What is the optimal approach to do Secondary Sort in Spark?


What is the optimal approach to do secondary sort in Spark? I have to first
sort by an id in the key and then sort by a timeStamp that is present
in the value.

