랜덤 샘플링은 RAND() 함수를 사용하면 된다.
N개의 행(row)을 랜덤 샘플링하는 쿼리는 다음과 같다.
--샘플 크기가 100개 행이라고 가정했을 때,
SELECT *
FROM `project.dataset.table`
WHERE RAND() < 100/(SELECT COUNT(*) FROM `project.dataset.table`)
;
RAND() 함수를 활용한 BigQuery 랜덤 샘플링
랜덤 샘플링은 RAND() 함수를 사용하면 된다.
N개의 행(row)을 랜덤 샘플링하는 쿼리는 다음과 같다.
--샘플 크기가 100개 행이라고 가정했을 때,
SELECT *
FROM `project.dataset.table`
WHERE RAND() < 100/(SELECT COUNT(*) FROM `project.dataset.table`)
;
An Oracle Database SQL Certified Associate demonstrates understanding of fundamental SQL concepts needed to undertake any database project. Candidates have illustrated a depth of knowledge of SQL and its use when working with the Oracle Database server, and a working knowledge of queries, insert, update and delete SQL statements as well as some Data Definition language and Data Control Language, the optimizer, tables and indexes, data modeling and normalization.
18 Sep 2018
No. | 문법 | 실행 |
1
2 3 4 5 6 |
SELECT
FROM WHERE GROUP BY HAVING ORDER BY |
FROM WHERE GROUP BY HAVING SELECT ORDER BY |
Outer Join (출처: JS’s Blog)
테이블에서 컬럼의 중복된 값만 추출
SELECT *
FROM 테이블
WHERE 컬럼 IN(
SELECT 컬럼
FROM (
SELECT 컬럼, ROW_NUMBER() OVER(PARTITION BY 컬럼 ORDER BY NULL) AS 행번호
FROM 테이블
WHERE 행번호=’2′
)
ORDER BY 컬럼
SELECT *
FROM 테이블
WHERE 컬럼 IN(
SELECT 컬럼
FROM 테이블
GROUP BY 컬럼
HAVING COUNT(컬럼)>1
)
ORDER BY 컬럼
성능을 고려했을 때 GROUP BY를 사용하지 않는 것이 권고된다.