Maven
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.3.1</version>
</dependency>
Starting Point
Spark 2.x: create a SparkSession
SparkSession spark = SparkSession
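The snippet above is cut off after `SparkSession`. A minimal sketch of how a session is commonly built with the `SparkSession.builder()` API is given here. The class name `SparkSessionExample`, the application name, and the `local[*]` master are illustrative assumptions and do not come from the original post.

```java
import org.apache.spark.sql.SparkSession;

public class SparkSessionExample {

    // Builds (or reuses) a SparkSession.
    // Assumption: local mode using all available cores; on a cluster the
    // master is usually supplied by spark-submit instead of being hard-coded.
    public static SparkSession createSession() {
        return SparkSession.builder()
                .appName("spark-sql-example") // hypothetical application name
                .master("local[*]")
                .getOrCreate();
    }

    public static void main(String[] args) {
        SparkSession spark = createSession();
        System.out.println("Spark version: " + spark.version());
        spark.stop(); // release the underlying SparkContext
    }
}
```

`getOrCreate()` returns an already running session if one exists, so calling `createSession()` more than once in the same JVM should not start a second context.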
Create DataFrames from an existing RDD
record.json
{"key":1,"value":"enda"}
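Application
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Assumes `spark` is the SparkSession created above.
Dataset<Row> dataset = spark.read().json("./record.json");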
dataset.show();
// show
// +---+-----+
// |key|value|
// +---+-----+
// |  1| enda|
// +---+-----+
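The heading of this section mentions an existing RDD, while the example reads a JSON file directly. For comparison, here is a minimal sketch of the RDD route, assuming the same two columns as `record.json`. The class name `RddToDataFrame`, the method `fromRdd`, and the hand-written schema are illustrative assumptions, not part of the original post.

```java
import java.util.Arrays;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class RddToDataFrame {

    // Converts an in-memory RDD of rows into a DataFrame with an explicit schema.
    public static Dataset<Row> fromRdd(SparkSession spark) {
        // Wrap the session's SparkContext so Java collections can be parallelized.
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

        // One row mirroring the content of record.json (assumed data).
        JavaRDD<Row> rows = jsc.parallelize(
                Arrays.asList(RowFactory.create(1L, "enda")));

        // Schema declared by hand, since an RDD<Row> carries no column names.
        StructType schema = new StructType()
                .add("key", DataTypes.LongType)
                .add("value", DataTypes.StringType);

        return spark.createDataFrame(rows, schema);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("rdd-to-dataframe") // hypothetical application name
                .master("local[*]")          // assumption: local mode
                .getOrCreate();
        fromRdd(spark).show();
        spark.stop();
    }
}
```

The explicit `StructType` is one of two common options; inferring the schema from a JavaBean class is the other, and which one fits better depends on whether the columns are known at compile time.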
Quickly build an executable JAR
<build>