spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Possible DR solution
Date Fri, 11 Nov 2016 17:24:03 GMT
I think it differs in that it starts streaming data through its own port as
soon as the first block has landed, so the granularity is a block.

However, think of it as Oracle GoldenGate replication or SAP replication
for databases. The only difference is that if there is corruption in a block
on HDFS, the corrupt block will be replicated as well, much like SRDF.

Whereas with Oracle or SAP it is log-based replication, which stops when it
encounters corruption.
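
The difference in how corruption travels can be sketched with a toy model.
This is only an illustration of the point above, not how WANdisco,
GoldenGate or SRDF are actually implemented; the class and function names
are made up for the example.

```python
# Toy model only: all names are hypothetical and nothing here reflects
# any vendor's real implementation.
from dataclasses import dataclass
from typing import List


@dataclass
class Unit:
    """One unit of change: an HDFS block or a database log record."""
    seq: int
    corrupt: bool = False


def block_replicate(source: List[Unit]) -> List[Unit]:
    """Block-level copy: every block is shipped as-is, corrupt or not."""
    return [Unit(u.seq, u.corrupt) for u in source]


def log_replicate(source: List[Unit]) -> List[Unit]:
    """Log-based copy: apply records in order, stop at the first corrupt one."""
    target = []
    for u in source:
        if u.corrupt:
            break  # replication halts instead of propagating the damage
        target.append(Unit(u.seq, u.corrupt))
    return target


source = [Unit(1), Unit(2), Unit(3, corrupt=True), Unit(4)]
blocks_at_dr = block_replicate(source)
log_at_dr = log_replicate(source)

print([(u.seq, u.corrupt) for u in blocks_at_dr])
print([(u.seq, u.corrupt) for u in log_at_dr])
```

In this model the block-level target ends up with all four units, including
the corrupt one, while the log-based target stops after the last good record.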

Replication works at the block level, so it can replicate Hive metadata, the
fsimage etc., but it cannot replicate the HBase memstore if HBase crashes,
since the memstore is held in memory rather than in HDFS blocks.

So that is the gist of it: streaming replication as opposed to snapshot.

Sounds familiar? Think of it as log shipping in the old Oracle days versus
GoldenGate etc.

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 11 November 2016 at 17:14, Deepak Sharma <deepakmca05@gmail.com> wrote:

> The reason being that you can set up HDFS duplication to another cluster
> on your own.
>
> On Nov 11, 2016 22:42, "Mich Talebzadeh" <mich.talebzadeh@gmail.com>
> wrote:
>
>> Reason being?
>>
>> On 11 November 2016 at 17:11, Deepak Sharma <deepakmca05@gmail.com>
>> wrote:
>>
>>> This is a waste of money, I guess.
>>>
>>> On Nov 11, 2016 22:41, "Mich Talebzadeh" <mich.talebzadeh@gmail.com>
>>> wrote:
>>>
>>>> It starts at $4,000 per node per year, all inclusive.
>>>>
>>>> With a discount it can be halved, but the charge is per node, so if
>>>> you have 5 nodes in primary and 5 nodes in DR we are already talking
>>>> about $40K per year at list price.
>>>>
>>>> HTH
>>>>
>>>> On 11 November 2016 at 16:43, Mudit Kumar <mkumar128@sapient.com>
>>>> wrote:
>>>>
>>>>> Is it feasible cost-wise?
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Mudit
>>>>>
>>>>>
>>>>>
>>>>> *From:* Mich Talebzadeh [mailto:mich.talebzadeh@gmail.com]
>>>>> *Sent:* Friday, November 11, 2016 2:56 PM
>>>>> *To:* user @spark
>>>>> *Subject:* Possible DR solution
>>>>>
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>> Has anyone had experience of using WanDisco
>>>>> <https://www.wandisco.com/> block replication to create a
>>>>> fault-tolerant DR solution in Hadoop?
>>>>>
>>>>>
>>>>>
>>>>> The product claims that it starts replicating as soon as the first
>>>>> data block lands on HDFS: it takes the block and sends it to the
>>>>> DR/replica site. The idea is that this is faster than traditional HDFS
>>>>> copy tools, which are normally batch oriented.
>>>>>
>>>>>
>>>>>
>>>>> It also claims to replicate Hive metadata.
>>>>>
>>>>>
>>>>>
>>>>> I wanted to gauge if anyone has used it or a competitor product. The
>>>>> claim is that they do not have competitors!
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>
>>>>
>>
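
The licence arithmetic quoted in the thread can be checked with a short
sketch. The function name and parameters are illustrative only; the $4,000
list price, the 5 + 5 node layout and the "halved" discount are the figures
given above, and nothing else about the vendor's pricing is assumed.

```python
def annual_licence_cost(primary_nodes, dr_nodes, price_per_node=4000, discount=0.0):
    """Yearly cost when every node, primary or DR, is charged per node."""
    nodes = primary_nodes + dr_nodes
    return nodes * price_per_node * (1 - discount)


# Figures from the thread: 5 primary nodes plus 5 DR nodes.
list_price = annual_licence_cost(5, 5)
# "With discount it can be halved": modelled here as a 50% discount.
discounted = annual_licence_cost(5, 5, discount=0.5)

print(f"list price: ${list_price:,.0f} per year")
print(f"discounted: ${discounted:,.0f} per year")
```

Because the charge is per node, the DR cluster doubles the bill for a
like-for-like standby, which is the point behind the $40K figure.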
