Google BigQuery Glossary

Clear Definitions of Key Google BigQuery Terms
Explore the Google BigQuery glossary below. Each term is listed individually for easy reference and future expansion with examples or best practices.
Categories
1. Core Concepts
Project
Top-level container in Google Cloud that holds datasets, tables, and resources. Each project has a unique ID and billing account.
Dataset
Logical grouping of tables and views within a project. Helps organize and manage access to related data.
Table
Primary storage structure in BigQuery. Stores data in rows and columns and supports both native and external sources.
View
Virtual table defined by a SQL query. Views return results dynamically and do not store data themselves.
Partitioned Table
Table divided into segments (partitions) based on a column (e.g., date). Improves query performance and cost efficiency.
Clustered Table
Table organized by the values of one or more columns. Optimizes query performance on large datasets.
Materialized View
Precomputed view that stores query results for faster access. Automatically refreshed based on source table changes.
2. Querying & SQL
For more information on SQL, please read our SQL Glossary.
Standard SQL
Default dialect in BigQuery (called GoogleSQL in current Google documentation), compliant with ANSI SQL 2011. Supports advanced features like window functions and array handling.
Legacy SQL
Older dialect used in early BigQuery versions. Still supported but not recommended for new development.
WITH Clause (CTE)
Common Table Expressions allow temporary named result sets within a query, improving readability and modularity.
Window Function
Performs calculations across a set of table rows related to the current row. Examples include ROW_NUMBER(), RANK(), and LEAD().
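To make the semantics concrete, here is a minimal Python sketch (not BigQuery code) that mimics what ROW_NUMBER(), RANK(), and LEAD() would return over an ordered set of rows. The sample scores and the helper names are invented for illustration.

```python
def row_number(values):
    """Sequential position of each row in the ordered window: 1, 2, 3, ..."""
    return list(range(1, len(values) + 1))


def rank(values):
    """Rank with gaps: tied rows share a rank, and the next rank skips ahead."""
    ranks = []
    for i, v in enumerate(values):
        if i > 0 and v == values[i - 1]:
            ranks.append(ranks[-1])   # tie: reuse the previous rank
        else:
            ranks.append(i + 1)       # new value: rank equals row position
    return ranks


def lead(values, offset=1, default=None):
    """Value from the row `offset` positions after the current one."""
    return [values[i + offset] if i + offset < len(values) else default
            for i in range(len(values))]


# Hypothetical scores, already sorted as ORDER BY score DESC would sort them.
scores = [90, 85, 85, 70]
print(row_number(scores))  # [1, 2, 3, 4]
print(rank(scores))        # [1, 2, 2, 4]
print(lead(scores))        # [85, 85, 70, None]
```

Note that every input row still produces one output row; unlike GROUP BY, a window function does not collapse rows.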
User-Defined Function (UDF)
Custom JavaScript or SQL functions that extend BigQuery’s capabilities. Useful for reusable logic across queries.
SAFE Functions
Functions like SAFE_CAST() or SAFE_DIVIDE() that prevent errors by returning NULL instead of failing.
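The behavior can be approximated in plain Python, with None standing in for NULL. This is only a rough analogy: the helper names are hypothetical, and Python's int() does not follow BigQuery's casting rules in every case.

```python
def safe_divide(x, y):
    """Mimics SAFE_DIVIDE(x, y): returns None instead of raising on division by zero."""
    if x is None or y is None:
        return None               # NULL in, NULL out
    try:
        return x / y
    except ZeroDivisionError:
        return None


def safe_cast_int(value):
    """Loosely mimics SAFE_CAST(value AS INT64): returns None when the cast fails."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


print(safe_divide(10, 4))     # 2.5
print(safe_divide(10, 0))     # None
print(safe_cast_int("42"))    # 42
print(safe_cast_int("abc"))   # None
```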
EXCEPT / INTERSECT
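Set operations that compare the results of two queries row by row. EXCEPT returns rows from the first query that do not appear in the second; INTERSECT returns rows that appear in both. Unlike joins, which match rows on chosen key columns, these operations compare entire rows. In BigQuery they are typically written as EXCEPT DISTINCT and INTERSECT DISTINCT.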
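A small Python sketch of the same idea, treating each query result as a list of row tuples. The data and function names are made up for illustration, and the sort is there only to give a stable print order.

```python
# Each tuple stands for one row of a query result (hypothetical data).
query_a = [("alice", "UK"), ("bob", "US"), ("bob", "US"), ("carol", "DE")]
query_b = [("bob", "US"), ("dave", "FR")]


def except_distinct(left, right):
    """Distinct rows of `left` that do not appear in `right`."""
    return sorted(set(left) - set(right))


def intersect_distinct(left, right):
    """Distinct rows that appear in both `left` and `right`."""
    return sorted(set(left) & set(right))


print(except_distinct(query_a, query_b))     # [('alice', 'UK'), ('carol', 'DE')]
print(intersect_distinct(query_a, query_b))  # [('bob', 'US')]
```

Rows match only when every column is equal, and duplicates in the input collapse to a single output row.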
MERGE Statement
Combines INSERT, UPDATE, and DELETE logic in one statement to synchronize a target table with a source.
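The following Python sketch imitates that synchronization on an in-memory dictionary. The inventory data, the "deleted" flag, and the function name are assumptions chosen for the example; the comments show which MERGE clause each branch loosely corresponds to.

```python
def merge(target, source):
    """Synchronize `target` (a dict keyed by id) with `source` rows in one pass."""
    result = dict(target)                 # leave the caller's dict untouched
    for row in source:
        key = row["id"]
        if key in result:
            if row.get("deleted"):
                del result[key]           # WHEN MATCHED AND deleted THEN DELETE
            else:
                result[key] = row["qty"]  # WHEN MATCHED THEN UPDATE
        elif not row.get("deleted"):
            result[key] = row["qty"]      # WHEN NOT MATCHED THEN INSERT
    return result


# Hypothetical target table (id -> quantity) and incoming changes.
inventory = {1: 10, 2: 5, 3: 0}
updates = [
    {"id": 2, "qty": 8},          # existing id: update
    {"id": 3, "deleted": True},   # existing id flagged for removal: delete
    {"id": 4, "qty": 12},         # new id: insert
]
print(merge(inventory, updates))  # {1: 10, 2: 8, 4: 12}
```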
3. Data Types & Structures
ARRAY
Data type that holds an ordered list of values. Useful for nested and repeated data structures.
STRUCT (RECORD)
Complex data type that groups multiple fields. Enables hierarchical data modelling.
REPEATED Field
Allows multiple values in a single column, useful for denormalized or JSON-like data.
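As a rough analogy in Python, a STRUCT behaves like a nested dictionary and an ARRAY (or REPEATED field) like a list inside a row. The sketch below uses invented order data and a hypothetical helper to flatten the repeated field into one row per element, similar in spirit to joining a table with UNNEST(items).

```python
# Hypothetical rows: `customer` plays the role of a STRUCT,
# `items` the role of a REPEATED field holding STRUCTs.
orders = [
    {"order_id": 1, "customer": {"name": "Ana", "country": "PT"},
     "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]},
    {"order_id": 2, "customer": {"name": "Ben", "country": "UK"},
     "items": [{"sku": "A1", "qty": 5}]},
]


def unnest_items(rows):
    """Produce one flat output row per element of each row's `items` list."""
    return [
        {"order_id": r["order_id"], "name": r["customer"]["name"], **item}
        for r in rows
        for item in r["items"]
    ]


flat = unnest_items(orders)
print(len(flat))  # 3
print(flat[0])    # {'order_id': 1, 'name': 'Ana', 'sku': 'A1', 'qty': 2}
```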
TIMESTAMP
Represents an absolute point in time, independent of any time zone. Values are stored in UTC and can be displayed in any time zone.
DATETIME
Date and time without timezone information.
DATE
Only the calendar date (YYYY-MM-DD).
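The distinction between the three temporal types maps fairly closely onto Python's standard datetime module, which makes for a convenient illustration. The sketch below is an analogy only; the variable names and the +09:00 offset are arbitrary choices.

```python
from datetime import date, datetime, timedelta, timezone

# TIMESTAMP-like: an absolute instant, expressed here in UTC.
instant = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)

# The same instant displayed with a fixed +09:00 offset.
plus_nine = timezone(timedelta(hours=9))
same_instant = instant.astimezone(plus_nine)

# DATETIME-like: a wall-clock reading with no time zone attached.
wall_clock = datetime(2024, 3, 1, 12, 0)

# DATE-like: a calendar date only.
day = date(2024, 3, 1)

print(instant == same_instant)   # True
print(same_instant.isoformat())  # 2024-03-01T21:00:00+09:00
print(wall_clock.tzinfo)         # None
print(day.isoformat())           # 2024-03-01
```

The first two values compare equal because they name the same moment; only their display differs. The naive value cannot be placed on that timeline until a time zone is chosen for it.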
4. Storage & Performance
Columnar Storage
BigQuery stores data in a columnar format, enabling efficient compression and faster analytical queries.
Slot
Unit of computational capacity in BigQuery. Queries consume slots, and pricing is based on slot usage in some models.
Caching
BigQuery caches query results for 24 hours. Re-running the same query within this window may return results at no cost.
Query Plan Explanation
Visual breakdown of query stages and resource usage. Helps diagnose performance bottlenecks.
Bytes Processed
Indicates how much data a query scans. Key metric for cost estimation and optimization.
Table Preview
Shows a sample of a table's rows without running a query, so no query charges are incurred.
5. Access & Security
IAM (Identity and Access Management)
Controls who can access BigQuery resources and what actions they can perform. Permissions are granted at the project, dataset, or table level.
Authorized View
View that restricts access to specific columns or rows, allowing secure data sharing without exposing raw tables.
Service Account
Special Google account used by applications or services to access BigQuery programmatically.
Audit Logs
Logs that track access and changes to BigQuery resources. Useful for compliance and troubleshooting.
6. Data Loading & Export
Load Job
Imports data into BigQuery from sources like Cloud Storage, Google Sheets, or local files.
Streaming Insert
Real-time data ingestion method that allows row-by-row inserts into BigQuery tables.
Export Job
Writes BigQuery table data to external storage, typically Google Cloud Storage.
Federated Query
Queries data stored outside BigQuery (e.g., Cloud Storage, Google Sheets) without loading it into a table.
Scheduled Query
Automates query execution at defined intervals. Often used for ETL workflows or dashboard refreshes.
Data Transfer Service
Imports data from external sources like Google Ads, YouTube, or SaaS apps into BigQuery.
7. Monitoring & Metadata
INFORMATION_SCHEMA
System views that expose metadata about datasets, tables, columns, and jobs. Useful for auditing and automation.
Job
Represents a unit of work in BigQuery, such as a query, load, or export operation. Each job has a unique ID and status.
8. Integration & Ecosystem
BigQuery ML
Enables machine learning model creation and prediction directly within BigQuery using SQL.
BigQuery GIS
Adds support for geospatial data types and functions, enabling spatial analysis.
BigQuery Omni
Allows querying data across multiple clouds (AWS, Azure) using BigQuery’s interface.
Looker Studio Connector
Native integration for visualizing BigQuery data in Looker Studio dashboards.
9. Pricing & Optimisation
Query Pricing Models
BigQuery offers two primary models for query costs:
- On-demand: Pay per bytes processed (best for variable workloads).
- Capacity (formerly flat-rate): Pay for dedicated slots (best for predictable budgets).
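For the on-demand model, a back-of-the-envelope estimate is simply bytes processed multiplied by the per-terabyte rate. The Python sketch below assumes the $6.25 figure quoted in this page's FAQ and BigQuery's binary terabyte (TiB) billing unit; actual rates vary by region and change over time. The column sizes are invented.

```python
PRICE_PER_TIB_USD = 6.25   # assumption: the rate quoted in this page's FAQ
BYTES_PER_TIB = 2 ** 40    # on-demand billing is measured in binary terabytes
GIB = 2 ** 30


def on_demand_cost(bytes_processed, price_per_tib=PRICE_PER_TIB_USD):
    """Estimated on-demand charge in USD for a query scanning `bytes_processed`."""
    return bytes_processed / BYTES_PER_TIB * price_per_tib


# Hypothetical table: bytes stored per column.
column_bytes = {"user_id": 40 * GIB, "event": 60 * GIB, "payload": 900 * GIB}

select_all = sum(column_bytes.values())                       # every column
two_columns = column_bytes["user_id"] + column_bytes["event"]  # only two columns

print(on_demand_cost(2 * BYTES_PER_TIB))        # 12.5
print(round(on_demand_cost(select_all), 2))     # 6.1
print(round(on_demand_cost(two_columns), 2))    # 0.61
```

Because storage is columnar, a query that reads two small columns is billed for a fraction of what reading every column would cost.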
Google BigQuery: Frequently Asked Questions
What is the difference between BigQuery and a traditional SQL database?
While traditional databases (like SQL Server or MySQL) are designed for transactional processing (adding/editing single rows), BigQuery is a cloud-native data warehouse designed for analytical processing. It uses a columnar storage format and massively parallel processing to scan billions of rows in seconds, making it better suited to big data analytics than to simple application backends.
What is the difference between a Dataset and a Table?
In BigQuery, a Dataset is a container within a project that holds your tables and views. Think of it like a folder or a schema. A Table lives inside a dataset and contains your actual data. Permissions (who can see what) are commonly managed at the Dataset level, although they can also be granted on projects and on individual tables.
How does BigQuery's "On-Demand" pricing work?
Unlike traditional servers where you pay for "uptime," BigQuery’s default pricing is based on analysis: you are charged for the amount of data processed by each query. Generally, this is around $6.25 per terabyte (TiB) scanned, though rates vary by region and over time. This is why using SELECT * is discouraged: it costs more because you are scanning every column!
What is the difference between Partitioning and Clustering?
These are the two main ways to optimize performance and cost:
Partitioning: Divides a table into segments based on a date, timestamp, or integer column, or on ingestion time. When a query filters on the partitioning column, BigQuery scans, and bills for, only the matching segments.
Clustering: Sorts the data (within each partition, if the table is also partitioned) by specific columns such as Category or ID, so queries that filter on those columns can skip irrelevant blocks.
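The effect of both techniques on the amount of data read can be sketched with a toy model in Python. This is a simplification under stated assumptions, not a description of BigQuery's internals: the partition sizes, the block layout with minimum and maximum values, and the function names are all hypothetical.

```python
from datetime import date

# Hypothetical date-partitioned table: partition key -> bytes stored.
partitions = {
    date(2024, 1, 1): 4_000_000_000,
    date(2024, 1, 2): 6_000_000_000,
    date(2024, 1, 3): 5_000_000_000,
}


def bytes_scanned(parts, start, end):
    """Partition pruning: only partitions inside the filter range are read."""
    return sum(size for day, size in parts.items() if start <= day <= end)


# Hypothetical clustered blocks inside one partition:
# (min value, max value, bytes) for the clustering column.
blocks = [("A", "F", 2_000_000_000), ("G", "M", 2_000_000_000), ("N", "Z", 2_000_000_000)]


def blocks_read(block_list, value):
    """Block pruning: skip blocks whose value range cannot contain `value`."""
    return [b for b in block_list if b[0] <= value <= b[1]]


print(bytes_scanned(partitions, date(2024, 1, 1), date(2024, 1, 3)))  # 15000000000
print(bytes_scanned(partitions, date(2024, 1, 2), date(2024, 1, 3)))  # 11000000000
print(blocks_read(blocks, "H"))  # [('G', 'M', 2000000000)]
```

In this model, a date filter removes whole partitions from the scan, and a filter on the clustering column then narrows the read to the blocks that could hold the requested value.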
What is BigQuery ML (Machine Learning)?
BigQuery ML allows data analysts to create and execute machine learning models directly inside BigQuery using standard SQL. You don’t need to move data to a separate tool or write Python; you can train models (like linear regression or forecasting) using the same syntax you use to select data.
