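The Spark SQL WHERE clause limits the result of the FROM clause of a query or subquery based on a specified condition.
The syntax of the WHERE clause is:
WHERE boolean_expression
boolean_expression is any expression that evaluates to a boolean result type. Two or more expressions can be combined using the logical operators AND and OR. Only rows for which the expression evaluates to true are returned.
The following examples illustrate its use: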
-- Create the sample data
CREATE TABLE person (id INT, name STRING, age INT);
INSERT INTO person VALUES
    (100, 'John', 30),
    (200, 'Mary', NULL),
    (300, 'Mike', 80),
    (400, 'Dan', 50);
-- Comparison operator in the WHERE clause
SELECT * FROM person WHERE id > 200 ORDER BY id;
+---+----+---+
| id|name|age|
+---+----+---+
|300|Mike| 80|
|400| Dan| 50|
+---+----+---+
-- Comparison and logical operators in the WHERE clause
SELECT * FROM person WHERE id = 200 OR id = 300 ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|200|Mary|null|
|300|Mike|  80|
+---+----+----+
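AND binds more tightly than OR in Spark SQL, so mixing the two without parentheses can select different rows than intended. The sketch below reuses the person table created above to show the difference. The expected rows in the comments are worked out by hand from the four sample rows, not copied from a captured run.

```sql
-- Without parentheses, AND is evaluated first, so this reads as:
-- id = 100 OR (id = 300 AND age > 50)
SELECT * FROM person WHERE id = 100 OR id = 300 AND age > 50 ORDER BY id;
-- Expected rows: (100, 'John', 30) and (300, 'Mike', 80)

-- Parentheses force the OR to be evaluated first
SELECT * FROM person WHERE (id = 100 OR id = 300) AND age > 50 ORDER BY id;
-- Expected rows: (300, 'Mike', 80)
```

When a condition mixes AND and OR, writing the parentheses explicitly makes the intent clear to readers even where the default precedence would give the same result.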
-- IS NULL expression: tests whether a value is NULL
SELECT * FROM person WHERE id > 300 OR age IS NULL ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|200|Mary|null|
|400| Dan|  50|
+---+----+----+
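The explicit IS NULL test matters because an ordinary comparison against a NULL value evaluates to NULL rather than true or false, and WHERE drops such rows. The following sketch, again using the person table from above, shows how Mary's row (whose age is NULL) behaves. The expected rows in the comments are derived by hand from the sample data.

```sql
-- age > 40 evaluates to NULL for Mary, so her row is filtered out
SELECT * FROM person WHERE age > 40 ORDER BY id;
-- Expected rows: (300, 'Mike', 80) and (400, 'Dan', 50)

-- Negating the predicate does not bring Mary back: NOT NULL is still NULL
SELECT * FROM person WHERE NOT (age > 40) ORDER BY id;
-- Expected rows: (100, 'John', 30)

-- To keep rows with a missing age, test for NULL explicitly
SELECT * FROM person WHERE age > 40 OR age IS NULL ORDER BY id;
-- Expected rows: (200, 'Mary', NULL), (300, 'Mike', 80), (400, 'Dan', 50)
```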
-- Function expression
SELECT * FROM person WHERE length(name) > 3 ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|100|John|  30|
|200|Mary|null|
|300|Mike|  80|
+---+----+----+
-- BETWEEN expression: matches values within an inclusive range
SELECT * FROM person
WHERE id BETWEEN 200 AND 300
ORDER BY id;
+---+----+----+
| id|name| age|
+---+----+----+
|200|Mary|null|
|300|Mike|  80|
+---+----+----+
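Both bounds of BETWEEN are inclusive, which is why ids 200 and 300 both appear in the result. As a short sketch against the same person table, the range test can be written with two comparisons, and NOT BETWEEN selects the rows outside the range. The expected rows in the comments are derived by hand from the sample data.

```sql
-- Equivalent to: id BETWEEN 200 AND 300 (both bounds are included)
SELECT * FROM person WHERE id >= 200 AND id <= 300 ORDER BY id;
-- Expected rows: (200, 'Mary', NULL) and (300, 'Mike', 80)

-- NOT BETWEEN selects the rows outside the range
SELECT * FROM person WHERE id NOT BETWEEN 200 AND 300 ORDER BY id;
-- Expected rows: (100, 'John', 30) and (400, 'Dan', 50)
```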
-- Scalar subquery in the WHERE clause
-- (avg ignores the NULL age, so the average is (30 + 80 + 50) / 3, about 53.3)
SELECT * FROM person
WHERE age > (SELECT avg(age) FROM person);
+---+----+---+
| id|name|age|
+---+----+---+
|300|Mike| 80|
+---+----+---+
-- Correlated subquery in the WHERE clause
SELECT * FROM person AS parent
WHERE EXISTS (
    SELECT 1 FROM person AS child
    WHERE parent.id = child.id AND child.age IS NULL
);
+---+----+----+
|id |name|age |
+---+----+----+
|200|Mary|null|
+---+----+----+
Last updated: 2021-06-29 22:54:50. Tags: sql, spark, where