BoardreaderSparkRDD

Custom SparkRDD to read Boardreader REST APIs

Parameters of the RDD are as follows:

BoardreaderRDD(sc: SparkContext, url: String, numPartitions: Int, mapRow: (T) => String)

Here, url is the full Boardreader REST API request URL, including its query string.

RDD usage: download the jar file from https://github.com/Effyis/BoardreaderSparkRDD/tree/master/target/boardreader-spark-rdd-project-0.0.1-SNAPSHOT-jar-with-dependencies.jar.

To use it from the Spark Scala shell, pass the location of that jar file with --jars when launching the shell:

./spark-shell --jars /<jar location>/boardreader-spark-rdd-project-0.0.1-SNAPSHOT-jar-with-dependencies.jar

Then construct a BoardreaderRDD, passing it the Boardreader API URL:

import boardreader.spark.rdd._

val data = new BoardreaderRDD(sc, "http://api.boardreader.com/v1/Boards/Search?&offset=0&query=trump&filter_date_from=1459828800&filter_date_to=1467911528&sort_mode=default&filter_language=English&body=snippet&key=boardreaderkey&rt=json", numPartitions = 1, mapRow = {x=>x.asInstanceOf[String]})

println(data.collect().toList)
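
The request URL in the example above is long and easy to mistype, so it may help to assemble it from its parts. The following is a minimal sketch, not part of this library: the object BoardreaderUrl and the method searchUrl are hypothetical names, the parameter list is copied from the example URL, and it assumes that filter_date_from and filter_date_to are Unix epoch seconds (which the example values suggest but this README does not state).

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets
import java.time.Instant

// Hypothetical helper, not part of BoardreaderSparkRDD.
object BoardreaderUrl {
  // Endpoint copied from the README example.
  val SearchEndpoint = "http://api.boardreader.com/v1/Boards/Search"

  // Form-encode one query-string value (spaces become '+').
  private def enc(s: String): String =
    URLEncoder.encode(s, StandardCharsets.UTF_8.name())

  // Builds a search URL from the parameters seen in the README example.
  def searchUrl(
      query: String,
      key: String,
      from: Instant,
      to: Instant,
      offset: Int = 0,
      language: String = "English"
  ): String = {
    val params = Seq(
      "offset"           -> offset.toString,
      "query"            -> query,
      // Assumption: the API expects Unix epoch seconds here.
      "filter_date_from" -> from.getEpochSecond.toString,
      "filter_date_to"   -> to.getEpochSecond.toString,
      "sort_mode"        -> "default",
      "filter_language"  -> language,
      "body"             -> "snippet",
      "key"              -> key,
      "rt"               -> "json"
    )
    params
      .map { case (k, v) => s"$k=${enc(v)}" }
      .mkString(SearchEndpoint + "?", "&", "")
  }
}
```

Calling BoardreaderUrl.searchUrl("trump", "boardreaderkey", Instant.parse("2016-04-05T04:00:00Z"), Instant.ofEpochSecond(1467911528L)) yields the same parameters as the hand-written URL above, without the stray & after the ?. The resulting string can be passed as the url argument of BoardreaderRDD.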
