The Artima Developer Community

Java Buzz Forum
How to create rdd in apache spark using java

0 replies on 1 page.

instanceof java

Posts: 576
Nickname: instanceof
Registered: Jan, 2015

instanceof java is a Java-related blog.
How to create rdd in apache spark using java Posted: Aug 15, 2017 5:04 AM
Reply to this message Reply

This post originated from an RSS feed registered with Java Buzz by instanceof java.
Original Post: How to create rdd in apache spark using java
Feed Title: Instance Of Java
Feed URL:
Feed Description: Instance of Java. A place where you can learn java in simple way each and every topic covered with many points and sample programs.

  • RDD is Spark's core abstraction.
  • RDD stands for Resilient Distributed Dataset.
  • An RDD is an immutable collection of objects.
  • Each RDD is split into multiple partitions, which may be computed on different machines of the cluster.
  • We can create an Apache Spark RDD in two ways:
    1. Parallelizing a collection
    2. Loading an external dataset.
  • Now we will see an example program on creating an RDD by parallelizing a collection.
  • In Apache Spark, the JavaSparkContext class provides the parallelize() method.
  • Let us see a simple example program to create an Apache Spark RDD in Java.

Program #1: Write an Apache Spark Java example program to create a simple RDD using the parallelize() method of JavaSparkContext.

package com.instanceofjava.sparkInterview;

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

/**
 * Apache spark examples: RDD in spark example program
 * @author
 */
public class SparkTest {

    public static void main(String[] args) {

        SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("InstanceofjavaAPP");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // parallelize() turns a local Java collection into an RDD
        JavaRDD<String> strRdd = sc.parallelize(Arrays.asList("apache spark element1", "apache spark element2"));
        System.out.println("apache spark rdd created: " + strRdd);

        /**
         * Return the first element in this RDD.
         */
        System.out.println(strRdd.first());

        sc.close();
    }
}


apache spark rdd created: ParallelCollectionRDD[0] at parallelize at
apache spark element1

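The second way listed above, loading an external dataset, uses the textFile() method of JavaSparkContext. Here is a minimal sketch along the same lines as Program #1; the file name "input.txt" is a placeholder for any local text file you want to load.

```java
package com.instanceofjava.sparkInterview;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

/**
 * Apache Spark example: creating an RDD by loading an external dataset.
 */
public class SparkTextFileTest {

    public static void main(String[] args) {

        SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("InstanceofjavaAPP");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // textFile() creates an RDD with one element per line of the file.
        // "input.txt" is a placeholder path; replace it with a real file.
        JavaRDD<String> lines = sc.textFile("input.txt");

        // count() is an action: it triggers computation and returns the number of lines
        System.out.println("number of lines: " + lines.count());

        sc.close();
    }
}
```

Like parallelize(), textFile() is lazy: the file is only read when an action such as count() or first() is called on the RDD.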



    Copyright © 1996-2018 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use