spark-user mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: Spark and N-tier architecture
Date Tue, 29 Mar 2016 23:22:49 GMT
Yes and no.  The idea of n-tier architecture is about 20 years older than
Spark and doesn't really apply to Spark as n-tier was originally conceived.
If the n-tier model helps you make sense of some things related to Spark,
then use it; but don't get hung up on trying to force a Spark architecture
into an outdated model.

On Tue, Mar 29, 2016 at 5:02 PM, Ashok Kumar <ashok34668@yahoo.com.invalid>
wrote:

> Thank you both.
>
> So am I correct that Spark fits within the application tier in N-tier
> architecture?
>
>
> On Tuesday, 29 March 2016, 23:50, Alexander Pivovarov <
> apivovarov@gmail.com> wrote:
>
>
> Spark is a distributed data processing engine plus a distributed in-memory /
> disk data cache.
>
> spark-jobserver provides a REST API for your Spark applications. It allows
> you to submit jobs to Spark and get results in sync or async mode.
>
> It can also create a long-running Spark context to cache RDDs in memory
> under some name (namedRDD) and then use them to serve requests from
> multiple users. Because the RDD is in memory, responses should be super
> fast (seconds).
>
> https://github.com/spark-jobserver/spark-jobserver
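
The sync and async submission described above can be sketched as plain HTTP requests. This is a minimal sketch, not a definitive client: the endpoint paths and query parameters (`/contexts/<name>`, `/jobs?appName=...&classPath=...&context=...&sync=...`, `/jobs/<jobId>`) and port 8090 follow the examples in the spark-jobserver README and may differ between versions. The application name, job class and context name are hypothetical. The code only builds the requests; nothing is sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL; 8090 is the port used in the spark-jobserver README examples.
BASE_URL = "http://localhost:8090"


def create_context_request(context_name):
    """POST /contexts/<name>: create a long-running, shared Spark context."""
    return Request(f"{BASE_URL}/contexts/{context_name}", data=b"", method="POST")


def submit_job_request(app_name, class_path, context_name, job_config, sync):
    """POST /jobs: run a job class from an uploaded jar in an existing context.

    sync=True asks the server to hold the connection and return the result;
    sync=False returns a job id that can be polled later.
    """
    query = urlencode({
        "appName": app_name,
        "classPath": class_path,
        "context": context_name,
        "sync": "true" if sync else "false",
    })
    # The request body carries the job's configuration as text.
    return Request(f"{BASE_URL}/jobs?{query}",
                   data=job_config.encode("utf-8"), method="POST")


def job_status_request(job_id):
    """GET /jobs/<jobId>: poll the status or result of an async job."""
    return Request(f"{BASE_URL}/jobs/{job_id}", method="GET")


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for req in (
        create_context_request("shared-context"),
        submit_job_request("reports", "com.example.TopCustomersJob",
                           "shared-context", "limit = 10", sync=True),
        job_status_request("some-job-id"),
    ):
        print(req.get_method(), req.full_url)
```

With a job server running, each request could be sent with `urllib.request.urlopen(req)`. Because every job names the same context, RDDs cached there by one request stay available to later requests from other users.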
>
>
> On Tue, Mar 29, 2016 at 2:50 PM, Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
> Interesting question.
>
> The most widely used application of N-tier is the traditional three-tier
> architecture that has been the backbone of client-server architecture,
> with a presentation layer, an application layer and a data layer. This is
> primarily for performance, scalability and maintenance. The most profound
> change that the Big Data space has introduced to N-tier architecture is the
> concept of horizontal scaling, as opposed to the previous tiers that relied
> on vertical scaling. HDFS is an example of horizontal scaling at the data
> tier by adding more JBODs to storage. Similarly, adding more nodes to a
> Spark cluster should result in better performance.
>
> Bear in mind that these tiers are at the logical level, which means that
> there may or may not be as many physical layers. For example, multiple
> virtual servers can be hosted on the same physical server.
>
> With regard to Spark, it is effectively a powerful query tool that sits
> in between the presentation layer (say Tableau) and HDFS or Hive, as you
> alluded. In that sense you can think of Spark as part of the application
> layer that communicates with the backend via a number of protocols,
> including standard JDBC. The picture is rather blurred as to whether
> Spark is a database or a query tool. IMO it is a query tool in the sense
> that Spark by itself does not have its own storage concept or metastore.
> Thus it relies on others to provide that service.
>
> HTH
>
>
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
> On 29 March 2016 at 22:07, Ashok Kumar <ashok34668@yahoo.com.invalid>
> wrote:
>
> Experts,
>
> One of the terms I hear used within Big Data is N-tier architecture, for
> availability, performance etc. I also hear that Spark, by means of its
> query engine and in-memory caching, fits into the middle tier (application
> layer), with HDFS and Hive maybe providing the data tier.  Can someone
> elaborate on the role of Spark here? For example, a Scala program that we
> write uses JDBC to talk to databases, so in that sense is Spark a middle
> tier application?
>
> I hope that someone can clarify this and, if so, what the best practice
> would be for using Spark as a middle tier within Big Data.
>
> Thanks
>
>
>
>
>
>
