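The Spark SQL LATERAL VIEW clause is used together with generator functions such as EXPLODE, which produce a virtual table containing one or more rows. LATERAL VIEW joins those generated rows to each original output row.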
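The row-by-row behaviour described above can be modelled outside Spark. What follows is a minimal pure-Python sketch, not Spark's implementation: the function `lateral_view` and the two-row `people` sample are hypothetical names introduced only for illustration. It pairs every input row with each value the generator produces for that row.

```python
from typing import Callable, Iterable, Iterator

Row = dict  # a row is modelled as a plain dict of column name -> value


def lateral_view(rows: Iterable[Row],
                 generator: Callable[[Row], Iterable],
                 column_alias: str) -> Iterator[Row]:
    """Hypothetical model of LATERAL VIEW (without OUTER).

    For each input row, run the generator and emit one output row per
    generated value, with that value stored under column_alias.
    """
    for row in rows:
        for value in generator(row):
            yield {**row, column_alias: value}


# Illustrative sample data, a cut-down stand-in for a real table.
people = [
    {"id": 100, "name": "John"},
    {"id": 200, "name": "Mary"},
]

# Models: SELECT * FROM people LATERAL VIEW EXPLODE(ARRAY(30, 60)) t AS c_age
result = list(lateral_view(people, lambda row: [30, 60], "c_age"))
for r in result:
    print(r)
```

Because the generator receives the current row, the same model also covers generators whose argument is a column of that row rather than a constant array.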
The syntax of the LATERAL VIEW clause is:
LATERAL VIEW [ OUTER ] generator_function ( expression [ , ... ] ) [ table_alias ] AS column_alias [ , ... ]
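Parameters:
OUTER: if specified, a row with NULL in the generated columns is returned when the input array or map is empty or NULL.
generator_function: a generator function such as EXPLODE or INLINE.
table_alias: an optional alias for the output of generator_function.
column_alias: the column aliases for the output of generator_function, which can be referenced in the output rows. Several aliases may be given when generator_function produces several output columns.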
The following examples illustrate its use:
CREATE TABLE person (id INT, name STRING, age INT, class INT, address STRING);
INSERT INTO person VALUES
(100, 'John', 30, 1, 'Street 1'),
(200, 'Mary', NULL, 1, 'Street 2'),
(300, 'Mike', 80, 3, 'Street 3'),
(400, 'Dan', 50, 4, 'Street 4');
SELECT * FROM person
LATERAL VIEW EXPLODE(ARRAY(30, 60)) tableName AS c_age
LATERAL VIEW EXPLODE(ARRAY(40, 80)) AS d_age;
+------+-------+-------+--------+-----------+--------+--------+
| id | name | age | class | address | c_age | d_age |
+------+-------+-------+--------+-----------+--------+--------+
| 100 | John | 30 | 1 | Street 1 | 30 | 40 |
| 100 | John | 30 | 1 | Street 1 | 30 | 80 |
| 100 | John | 30 | 1 | Street 1 | 60 | 40 |
| 100 | John | 30 | 1 | Street 1 | 60 | 80 |
| 200 | Mary | NULL | 1 | Street 2 | 30 | 40 |
| 200 | Mary | NULL | 1 | Street 2 | 30 | 80 |
| 200 | Mary | NULL | 1 | Street 2 | 60 | 40 |
| 200 | Mary | NULL | 1 | Street 2 | 60 | 80 |
| 300 | Mike | 80 | 3 | Street 3 | 30 | 40 |
| 300 | Mike | 80 | 3 | Street 3 | 30 | 80 |
| 300 | Mike | 80 | 3 | Street 3 | 60 | 40 |
| 300 | Mike | 80 | 3 | Street 3 | 60 | 80 |
| 400 | Dan | 50 | 4 | Street 4 | 30 | 40 |
| 400 | Dan | 50 | 4 | Street 4 | 30 | 80 |
| 400 | Dan | 50 | 4 | Street 4 | 60 | 40 |
| 400 | Dan | 50 | 4 | Street 4 | 60 | 80 |
+------+-------+-------+--------+-----------+--------+--------+
SELECT c_age, COUNT(1) FROM person
LATERAL VIEW EXPLODE(ARRAY(30, 60)) AS c_age
LATERAL VIEW EXPLODE(ARRAY(40, 80)) AS d_age
GROUP BY c_age;
+--------+-----------+
| c_age | count(1) |
+--------+-----------+
| 60 | 8 |
| 30 | 8 |
+--------+-----------+
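The row counts in the two results above follow from simple multiplication: 4 people * 2 values of c_age * 2 values of d_age gives 16 rows, and each c_age value appears in 4 * 2 = 8 of them. The short Python check below reproduces that arithmetic, using `itertools.product` as a stand-in for the two chained LATERAL VIEW clauses. The variable names are illustrative only and are not part of any Spark API.

```python
from collections import Counter
from itertools import product

person_ids = [100, 200, 300, 400]   # the four rows of the person table
c_ages = [30, 60]                   # values from EXPLODE(ARRAY(30, 60))
d_ages = [40, 80]                   # values from EXPLODE(ARRAY(40, 80))

# Both generators are constant, so every combination of
# (person, c_age, d_age) appears exactly once in the output.
combined = list(product(person_ids, c_ages, d_ages))
print(len(combined))                # 16

# Models: SELECT c_age, COUNT(1) ... GROUP BY c_age
counts = Counter(c_age for _, c_age, _ in combined)
print(dict(counts))                 # {30: 8, 60: 8}
```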
SELECT * FROM person
LATERAL VIEW EXPLODE(ARRAY()) tableName AS c_age;
+-----+-------+------+--------+----------+--------+
| id | name | age | class | address | c_age |
+-----+-------+------+--------+----------+--------+
+-----+-------+------+--------+----------+--------+
SELECT * FROM person
LATERAL VIEW OUTER EXPLODE(ARRAY()) tableName AS c_age;
+------+-------+-------+--------+-----------+--------+
| id | name | age | class | address | c_age |
+------+-------+-------+--------+-----------+--------+
| 100 | John | 30 | 1 | Street 1 | NULL |
| 200 | Mary | NULL | 1 | Street 2 | NULL |
| 300 | Mike | 80 | 3 | Street 3 | NULL |
| 400 | Dan | 50 | 4 | Street 4 | NULL |
+------+-------+-------+--------+-----------+--------+
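The difference between the last two queries can be sketched in the same spirit. The Python below is an illustrative model under stated assumptions, not Spark code: `lateral_view_rows` is a hypothetical helper, and `None` stands in for SQL NULL. When the generator yields nothing for a row, the plain form drops that row, while the OUTER form keeps it and fills the generated column with NULL.

```python
def lateral_view_rows(rows, generate, column_alias, outer=False):
    """Hypothetical model of LATERAL VIEW [OUTER] for one output column."""
    out = []
    for row in rows:
        values = list(generate(row))
        if not values and outer:
            # OUTER: keep the input row, with NULL in the generated column.
            values = [None]
        for value in values:
            out.append({**row, column_alias: value})
    return out


# Cut-down copy of the person table (only two columns, for brevity).
person = [
    {"id": 100, "name": "John"},
    {"id": 200, "name": "Mary"},
    {"id": 300, "name": "Mike"},
    {"id": 400, "name": "Dan"},
]


def empty_array(row):
    """Stand-in for EXPLODE(ARRAY()): produces no values for any row."""
    return []


plain = lateral_view_rows(person, empty_array, "c_age")
outer = lateral_view_rows(person, empty_array, "c_age", outer=True)

print(len(plain))   # 0: every row is dropped
print(len(outer))   # 4: every row is kept, with c_age = None
```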
Last updated: 2021-08-29 14:48:11 Tags: sql, spark, rows to columns