spark-dev mailing list archives

From Chen Jin <karen...@gmail.com>
Subject JavaRDD.collect()
Date Sat, 25 Jan 2014 01:55:51 GMT
Hi All,

I have some metadata saved as a single partition on HDFS (a few
hundred bytes) and when I want to get the content of the data:

JavaRDD<String> blob = sc.textFile(path);  // path: the HDFS location of the metadata file
List<String> lines = blob.collect();

However, collect() takes at least 3 seconds, while first() takes only
about 0.1 seconds.

Could you advise on the best practice for reading small files with Spark?
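
For metadata that small, one option (a sketch, not Spark-specific) is to skip the RDD machinery entirely and read the file directly; the Hadoop FileSystem API exposes an analogous open() that returns an InputStream for HDFS paths. Shown here with plain java.nio on a local temp file for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class SmallFileRead {

    // Read every line of a small file in one shot. For an HDFS path,
    // FileSystem.get(conf).open(path) yields an InputStream that can be
    // wrapped and read the same way -- no RDD, no collect() round trip.
    static List<String> readLines(Path path) throws IOException {
        return Files.readAllLines(path);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the few-hundred-byte metadata file on HDFS.
        Path tmp = Files.createTempFile("metadata", ".txt");
        Files.write(tmp, Arrays.asList("k1=v1", "k2=v2"));
        System.out.println(readLines(tmp));   // [k1=v1, k2=v2]
        Files.deleteIfExists(tmp);
    }
}
```

This avoids the scheduler round trip that makes collect() feel slow on a trivially small input.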

-chen


On Fri, Jan 24, 2014 at 3:23 PM, Kapil Malik <kmalik@adobe.com> wrote:
> Hi Andrew,
>
>
>
> Here's the exception I get while trying to build an OSGi bundle using maven SCR plugin
-
>
>
>
> [ERROR] Failed to execute goal org.apache.felix:maven-scr-plugin:1.9.0:scr (generate-scr-scrdescriptor)
on project repo-spark: Execution generate-scr-scrdescriptor of goal org.apache.felix:maven-scr-plugin:1.9.0:scr
failed: Invalid signature file digest for Manifest main attributes -> [Help 1]
>
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.felix:maven-scr-plugin:1.9.0:scr
(generate-scr-scrdescriptor) on project repo-spark: Execution generate-scr-scrdescriptor of
goal org.apache.felix:maven-scr-plugin:1.9.0:scr failed: Invalid signature file digest for
Manifest main attributes
>
>   at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:225)
>
> ...
>
> Caused by: org.apache.maven.plugin.PluginExecutionException: Execution generate-scr-scrdescriptor
of goal org.apache.felix:maven-scr-plugin:1.9.0:scr failed: Invalid signature file digest
for Manifest main attributes
>
>   at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:110)
>
>   at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
>
>   ... 19 more
>
> Caused by: java.lang.SecurityException: Invalid signature file digest for Manifest main
attributes
>
>   at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:240)
>
> ...
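
(Editorial note: that SecurityException is the classic symptom of merging signed JARs into one uber-JAR — the signature digests under META-INF no longer match the repackaged contents. If the bundle is built with maven-shade-plugin, excluding the signature files in the plugin's configuration usually resolves it; a sketch, to be adapted to the actual build:)

```xml
<configuration>
  <filters>
    <filter>
      <artifact>*:*</artifact>
      <excludes>
        <exclude>META-INF/*.SF</exclude>
        <exclude>META-INF/*.DSA</exclude>
        <exclude>META-INF/*.RSA</exclude>
      </excludes>
    </filter>
  </filters>
</configuration>
```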
>
>
>
>
>
> Also, from Eclipse, if I build a simple main program, I can create an executable
JAR in 3 ways -
>
> a.       Extract required libraries into generated JAR ( individual classes inside my
JAR)
>
> On running the main program from this JAR –
>
> Exception in thread "main" com.typesafe.config.ConfigException$Missing: No configuration
setting found for key 'akka.remote.log-received-messages'
>
>         at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:126)
>
>         at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:146)
>
>
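
(Editorial note: the missing 'akka.remote.log-received-messages' key usually means the reference.conf files shipped inside the Akka JARs were dropped or overwritten when classes were extracted into one JAR — Typesafe Config loads its defaults from those files, and each Akka module ships its own. With maven-shade-plugin, an AppendingTransformer that concatenates all reference.conf resources is the usual fix; a sketch, assuming shade rather than the Eclipse exporter:)

```xml
<transformers>
  <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
    <resource>reference.conf</resource>
  </transformer>
</transformers>
```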
>
> b.      Package required libraries into generated JAR (all JARs inside my JAR)
>
> On running the main program from this JAR –
>
> Exception in thread "main" java.lang.reflect.InvocationTargetException
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
>         at java.lang.reflect.Method.invoke(Method.java:616)
>
>         at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
>
> Caused by: java.lang.Exception: Could not find resource path for Web UI: org/apache/spark/ui/static
>
>         at org.apache.spark.ui.JettyUtils$.createStaticHandler(JettyUtils.scala:89)
>
>         at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:40)
>
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:122)
>
>         at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:67)
>
>
>
> c.       Copy required libraries into a sub-folder next to generated JAR
>
> This works well! But the problem is that it’s not portable to a Java server.
>
>
>
> Regards,
>
>
>
> Kapil Malik | kmalik@adobe.com
>
>
>
> -----Original Message-----
> From: Andrew Ash [mailto:andrew@andrewash.com]
> Sent: 25 January 2014 04:08
> To: user@spark.incubator.apache.org
> Cc: dev@spark.incubator.apache.org
> Subject: Re: Running spark driver inside a servlet
>
>
>
> Can you paste the exception you're seeing?
>
>
>
> Sent from my mobile phone
>
> On Jan 24, 2014 2:36 PM, "Kapil Malik" <kmalik@adobe.com<mailto:kmalik@adobe.com>>
wrote:
>
>
>
>>  Hi all,
>
>>
>
>>
>
>>
>
>> Is it possible to create a Spark Context (i.e. the driver program)
>
>> from a servlet deployed on some application server?
>
>>
>
>> I am able to run the Spark Java driver successfully via Maven / standalone
>
>> (after specifying the classpath), but when I bundle spark libraries in
>
>> a JAR, along with my servlet (using the maven shade plugin), it gives me
>
>> a security exception. Any suggestions?
>
>>
>
>>
>
>>
>
>> Thanks and regards,
>
>>
>
>>
>
>>
>
>> Kapil Malik
>
>>
>
>>
>
>>
