Restrict data access using analysis rules
This document provides general information about analysis rules in GoogleSQL for BigQuery.
What is an analysis rule?
An analysis rule enforces policies for sharing data. A policy represents a condition that needs to be met before a query can be run. With BigQuery, you can enforce an analysis rule on a view, using a data clean room or by applying the analysis rule directly to the view. When you enforce an analysis rule, you enforce that everyone querying that view must abide by that analysis rule on the view. If the analysis rule is satisfied, the query produces output that satisfies the analysis rule. If the query doesn't satisfy the analysis rule, an error is produced.
Supported analysis rules
The following analysis rules are supported:
Aggregation threshold analysis rule: Enforces the minimum number of distinct entities that must be present in a dataset. You can enforce this rule on a view, using DDL statements or data clean rooms. This rule supports the aggregation threshold policy and the join restriction policy.
Differential privacy analysis rule: Enforces a privacy budget, which limits the data that is revealed to a subscriber when the data is protected with differential privacy. You can enforce this rule on a view, using DDL statements or data clean rooms. This rule supports the differential privacy policy and the join restriction policy.
List overlap analysis rule: Overlapping rows can only be queried after a join operation, which conforms to the rule. You can enforce this rule on a view, using DDL statements or data clean rooms. This rule supports the join restriction policy.
Aggregation threshold analysis rule
An aggregation threshold analysis rule enforces the minimum number of distinct entities that must contribute to an output row of a query, so that the output row is included in the query result.
When enforced, the aggregation threshold analysis rule groups data across dimensions, while ensuring the aggregation threshold is met. It counts the number of distinct privacy units (represented by the privacy unit column) for each group, and only outputs the groups where the distinct privacy unit count satisfies the aggregation threshold.
A view that includes this analysis rule must include the aggregation threshold policy and can optionally include the join restriction policy.
Define an aggregation threshold analysis rule for a view
You can define an aggregation threshold analysis rule for a view in a data clean room or with the CREATE VIEW
statement:
CREATE OR REPLACE VIEW VIEW_NAME OPTIONS ( privacy_policy= '''{ "aggregation_threshold_policy": { "threshold" : THRESHOLD, "privacy_unit_column": "PRIVACY_UNIT_COLUMN" }, "join_restriction_policy": { "join_condition": "JOIN_CONDITION", "join_allowed_columns": JOIN_ALLOWED_COLUMNS } }''' ) AS QUERY;
Definitions:
aggregation_threshold_policy
: The aggregation threshold policy for the aggregation threshold analysis rule.VIEW_NAME: The path and name of the view.
THRESHOLD: The minimum number of distinct privacy units that need to contribute to each row in the query results. If a potential row doesn't satisfy this threshold, that row is omitted from the query results.
PRIVACY_UNIT_COLUMN: Represents the privacy unit column. A privacy unit column is a unique identifier for a privacy unit. A privacy unit is a value from the privacy unit column that represents the entity in a set of data that is being protected.
You can use only one privacy unit column, and the data type for the privacy unit column must be groupable.
The values in the privacy unit column cannot be directly projected through a query, and you can use only analysis rule-supported aggregate functions to aggregate the data in this column.
join_restriction_policy
(optional): The optional join restriction policy for the aggregation threshold analysis rule.JOIN_CONDITION: The type of join restriction to enforce on a view. This can be one of the following values:
JOIN_ALL
: All columns injoin_allowed_columns
must be inner joined upon for this view to be queried.JOIN_ANY
: At least one column injoin_allowed_columns
must be joined upon for this view to be queried.JOIN_BLOCKED
: This view can't be joined along any column. Don't setjoin_allowed_columns
in this case.JOIN_NOT_REQUIRED
: A join is not required to query this view. If a join is used, only the columns injoin_allowed_columns
can be used.
JOIN_ALLOWED_COLUMNS: The columns that can be part of a join operation.
QUERY: The query for the view.
Example:
In the following example, an aggregation threshold analysis rule is created on a view called ExamView
. ExamView
references a table called ExamTable
:
CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"aggregation_threshold_policy": {"threshold": 3, "privacy_unit_column": "last_name"}}' ) AS ( SELECT * FROM mydataset.ExamTable );
To review the privacy_policy
syntax for CREATE VIEW
, see the OPTIONS
list in CREATE VIEW
.
Update an aggregation threshold analysis rule for a view
You can change the aggregation threshold analysis rule for a view in a data clean room or with the ALTER VIEW
statement:
ALTER VIEW VIEW_NAME SET OPTIONS ( privacy_policy= '''{ "aggregation_threshold_policy": { "threshold" : THRESHOLD, "privacy_unit_column": "PRIVACY_UNIT_COLUMN" } }''' )
For more information about the values you can set for the privacy policies in the preceding syntax, see Define an aggregation threshold analysis rule for a view.
Example:
In the following example, an aggregation threshold analysis rule is updated on a view called ExamView
.
ALTER VIEW mydataset.ExamView SET OPTIONS ( privacy_policy= '{"aggregation_threshold_policy": {"threshold": 50, "privacy_unit_column": "last_name"}}' );
To review the privacy_policy
syntax for ALTER VIEW
, see the OPTIONS
list in ALTER VIEW SET OPTIONS
.
Query an aggregation threshold analysis rule–enforced view
You can query a view that has an aggregation threshold analysis rule with the AGGREGATION_THRESHOLD
clause. The query must include aggregation functions, and you can use only aggregation threshold-supported aggregate functions in this query.
Example:
In the following example, an aggregation threshold analysis rule is queried on a view called ExamView
:
SELECT WITH AGGREGATION_THRESHOLD test_id, COUNT(DISTINCT last_name) AS student_count FROM mydataset.ExamView GROUP BY test_id; /*---------+---------------* | test_id | student_count | +---------+---------------+ | P91 | 3 | | U25 | 4 | *---------+---------------*/
The aggregation threshold analysis rule can also optionally include the join restriction policy. To learn how to use the join restriction policy with an analysis rule, see Join restriction policy in analysis rules.
To review additional examples for the AGGREGATION_THRESHOLD
clause, see AGGREGATION_THRESHOLD
clause.
Differential privacy analysis rule
The differential privacy analysis rule enforces a privacy budget, which limits the data that is revealed to a subscriber when the data is protected with differential privacy. A privacy budget prevents any subscriber from querying shared data when the sum of all queries' epsilon or delta reaches the total epsilon or total delta value. You can use this analysis rule in a view.
A view that includes this analysis rule must include the differential privacy policy and can optionally include the join restriction policy.
Define a differential privacy analysis rule for a view
You can define a differential privacy analysis rule for a view in a data clean room or with the CREATE VIEW
statement:
CREATE OR REPLACE VIEW VIEW_NAME OPTIONS ( privacy_policy= '''{ "differential_privacy_policy": { "privacy_unit_column": "PRIVACY_UNIT_COLUMN", "max_epsilon_per_query": MAX_EPSILON_PER_QUERY, "epsilon_budget": EPSILON_BUDGET, "delta_per_query": DELTA_PER_QUERY, "delta_budget": DELTA_BUDGET, "max_groups_contributed": MAX_GROUPS_CONTRIBUTED }, "join_restriction_policy": { "join_condition": "JOIN_CONDITION", "join_allowed_columns": JOIN_ALLOWED_COLUMNS } }''' ) AS QUERY;
Definitions:
differential_privacy_policy
: The differential privacy policy for the differential privacy analysis rule.PRIVACY_UNIT_COLUMN: The column that identifies the entity in a dataset that is protected using a privacy analysis rule. This value is a JSON string.
MAX_EPSILON_PER_QUERY: Determines the amount of noise added to the results per query and prevents the total epsilon from being reached by a single query. This value is a JSON number from
0.001
to1e+15
.EPSILON_BUDGET: The epsilon budget that represents the total epsilon that can be used across all differentially private queries on the view. This value must be larger than
MAX_EPSILON_PER_QUERY
, and is a JSON number from0.001
to1e+15
.DELTA_PER_QUERY: The probability that any row in the result fails to be epsilon-differentially private. This value is a JSON number from
1e-15
to1
.DELTA_BUDGET: The delta budget, which represents the total delta that can be used across all differentially private queries on the view. This value must be larger than
DELTA_PER_QUERY
, and is a JSON number from1e-15
to1000
.MAX_GROUPS_CONTRIBUTED (optional): Limits the number of groups to which an entity in a privacy unit column can contribute. This value must be a non-negative JSON integer.
join_restriction_policy
(optional): The optional join restriction policy for the differential privacy analysis rule.JOIN_CONDITION: The type of join restriction to enforce on a view. This can be one of the following values:
JOIN_ALL
: All columns injoin_allowed_columns
must be inner joined upon for this view to be queried.JOIN_ANY
: At least one column injoin_allowed_columns
must be joined upon for this view to be queried.JOIN_BLOCKED
: This view can't be joined along any column. Don't setjoin_allowed_columns
in this case.JOIN_NOT_REQUIRED
: A join is not required to query this view. If a join is used, only the columns injoin_allowed_columns
can be used.
JOIN_ALLOWED_COLUMNS: The columns that can be part of a join operation.
QUERY: The query for the view.
Example:
In the following example, a differential privacy analysis rule is created on a view called ExamView
. ExamView
references a table called ExamTable
:
CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"differential_privacy_policy": {"privacy_unit_column": "last_name", "max_epsilon_per_query": 1000.0, "epsilon_budget": 10000.1, "delta_per_query": 0.01, "delta_budget": 0.1, "max_groups_contributed": 2}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- NOTE: Delta and epsilon parameters are set very high due to the small -- dataset. In practice, these should be much smaller.
To review the privacy_policy
syntax for CREATE VIEW
, see the OPTIONS
list in CREATE VIEW
.
Update a differential privacy analysis rule for a view
You can change the differential privacy analysis rule for a view in a data clean room or with the ALTER VIEW
statement:
ALTER VIEW VIEW_NAME SET OPTIONS ( privacy_policy= '''{ "differential_privacy_policy": { "privacy_unit_column": "PRIVACY_UNIT_COLUMN", "max_epsilon_per_query": MAX_EPSILON_PER_QUERY, "epsilon_budget": EPSILON_BUDGET, "delta_per_query": DELTA_PER_QUERY, "delta_budget": DELTA_BUDGET, "max_groups_contributed": MAX_GROUPS_CONTRIBUTED } }''' )
For more information about the values you can set for the privacy policies in the preceding syntax, see Define a differential privacy analysis rule for a view.
Example:
In the following example, a differential privacy analysis rule is updated on a view called ExamView
.
ALTER VIEW mydataset.ExamView SET OPTIONS( privacy_policy= '{"differential_privacy_policy": {"privacy_unit_column": "last_name", "max_epsilon_per_query": 0.01, "epsilon_budget": 1000.0, "delta_per_query": 0.05, "delta_budget": 0.1, "max_groups_contributed": 2}}' ); -- NOTE: Delta and epsilon parameters are set very high due to the small -- dataset. In practice, these should be much smaller.
To review the privacy_policy
syntax for ALTER VIEW
, see the OPTIONS
list in ALTER VIEW SET OPTIONS
.
Query a differential privacy analysis rule–enforced view
You can query a view that has a differential privacy analysis rule with the DIFFERENTIAL_PRIVACY
clause. To review the syntax and additional examples for the DIFFERENTIAL_PRIVACY
clause, see DIFFERENTIAL_PRIVACY
clause.
Example:
In the following example, a differential privacy analysis rule is queried on a view called ExamView
. The differentially private data should be successfully returned from ExamView
because epsilon
, delta
, and max_groups_contributed
all satisfy the conditions of the differential analysis rule in ExamView
.
-- Query an analysis–rule enforced view called ExamView. SELECT WITH DIFFERENTIAL_PRIVACY test_id, AVG(test_score) AS average_test_score FROM mydataset.ExamView GROUP BY test_id; -- Results will vary. /*---------+--------------------* | test_id | average_test_score | +---------+--------------------+ | P91 | 512.627693163311 | | C83 | 506.01565971561649 | | U25 | 524.81202728847893 | *---------+--------------------*/
Block a query with an out-of-bounds epsilon
Epsilon can be used to add or remove noise. More epsilon means less noise will be added. If you want to ensure that a differentially private query has a minimal amount of noise, pay close attention to the value for max_epsilon_per_query
in your differential privacy analysis rule.
Example:
In the following query, the query is blocked with an error because epsilon
in the DIFFERENTIAL_PRIVACY
clause is higher than max_epsilon_per_query
in ExamView
:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"differential_privacy_policy": {"privacy_unit_column": "last_name", "max_epsilon_per_query": 10.01, "epsilon_budget": 1000.0, "delta_per_query": 0.01, "delta_budget": 0.1, "max_groups_contributed": 2}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- NOTE: Delta and epsilon parameters are set very high due to the small -- dataset. In practice, these should be much smaller.
After you've created your view, briefly wait, and then run the following query:
-- Error: Epsilon is too high: 1e+20, policy for table mydataset. -- ExamView allows max 10.01 SELECT WITH DIFFERENTIAL_PRIVACY OPTIONS(epsilon=1e20) test_id, AVG(test_score) AS average_test_score FROM mydataset.ExamView GROUP BY test_id;
Block queries that have exceeded an epsilon budget
Epsilon can be used to add or remove noise. Less epsilon increases noise, more epsilon reduces noise. Even when noise is high, multiple queries over the same data can eventually reveal the un-noised version of the data. To stop this from happening, you can create an epsilon budget. If you want to add an epsilon budget, review the value for epsilon_budget
in the differential privacy analysis rule for your view.
Example:
Run the following query three times. On the third time, the query is blocked because the total epsilon used is 30
, but epsilon_budget
in ExamView
only allows 25.6
:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"differential_privacy_policy": {"privacy_unit_column": "last_name", "max_epsilon_per_query": 10.01, "epsilon_budget": 25.6, "delta_per_query": 0.01, "delta_budget": 0.1, "max_groups_contributed": 2}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- NOTE: Delta and epsilon parameters are set very high due to the small -- dataset. In practice, these should be much smaller.
After you've created your view, briefly wait, and then run the following query three times:
-- Error after three query runs: Privacy budget is not sufficient for -- table 'mydataset.ExamView' in this query. SELECT WITH DIFFERENTIAL_PRIVACY OPTIONS(epsilon=10) test_id, AVG(test_score) AS average_test_score FROM mydataset.ExamView GROUP BY test_id;
List overlap analysis rule
Only overlapping rows can be queried after a join operation, which conforms to the list overlap rule. You can enforce this rule on a view, using DDL statements or data clean rooms.
A view that includes this analysis rule must only include the join restriction policy.
Define a list overlap analysis rule for a view
You can define a list overlap analysis rule for a view in a data clean room or with the CREATE VIEW
statement:
CREATE OR REPLACE VIEW VIEW_NAME OPTIONS ( privacy_policy= '''{ "join_restriction_policy": { "join_condition": "JOIN_CONDITION", "join_allowed_columns": JOIN_ALLOWED_COLUMNS } }''' ) AS QUERY;
Definitions:
join_restriction_policy
: The join restriction policy for the list overlap analysis rule.JOIN_CONDITION: The type of list overlap to enforce on a view. This can be one of the following values:
JOIN_ALL
: All columns injoin_allowed_columns
must be inner joined upon for this view to be queried.JOIN_ANY
: At least one column injoin_allowed_columns
must be joined upon for this view to be queried.
JOIN_ALLOWED_COLUMNS: The columns that can be part of a join operation.
QUERY: The query for the view.
Example:
In the following example, a list overlap analysis rule is created on a view called ExamView
. ExamView
references a table called ExamTable
:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"join_restriction_policy": {"join_condition": "JOIN_ANY", "join_allowed_columns": ["test_id", "test_score"]}}' ) AS ( SELECT * FROM mydataset.ExamTable );
Update a list overlap analysis rule for a view
You can change the list overlap analysis rule for a view with a data clean room or with the ALTER VIEW
statement:
ALTER VIEW VIEW_NAME SET OPTIONS ( privacy_policy= '''{ "join_restriction_policy": { "join_condition": "JOIN_CONDITION", "join_allowed_columns": JOIN_ALLOWED_COLUMNS } }''' )
For more information about the values you can set for the privacy policy in the preceding syntax, see Define a list overlap analysis rule for a view.
Example:
In the following example, a list overlap analysis rule is updated on a view called ExamView
.
ALTER VIEW mydataset.ExamView SET OPTIONS( privacy_policy= '{"join_restriction_policy": {"join_condition": "JOIN_ALL", "join_allowed_columns": ["test_id", "test_score"]}}' );
To review the privacy_policy
syntax for ALTER VIEW
, see the OPTIONS
list in ALTER VIEW SET OPTIONS
.
Query a list overlap analysis rule–enforced view
You can perform a join operation on a view that has a list overlap analysis rule. To review the syntax for the JOIN
operation, see Join operation.
Block a join operation with no overlap
You can block a join operation if it doesn't include at least one overlap with a required column.
Example:
In the following query, a view called ExamView
is joined with a table called StudentTable
. Because the view contains the JOIN_ANY
list overlap analysis rule, at least one overlapping row from ExamView
and StudentTable
is required. Because there is at least one overlap, the query runs successfully.
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"join_restriction_policy": {"join_condition": "JOIN_ANY", "join_allowed_columns": ["test_score", "last_name"]}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- Query a view called ExamView and a table called StudentTable. SELECT * FROM mydataset.ExamView INNER JOIN mydataset.StudentTable USING (test_score); /*------------+-----------+---------+-------------* | test_score | last_name | test_id | last_name_1 | +------------+-----------+---------+-------------+ | 490 | Ivanov | U25 | Ivanov | | 500 | Wang | U25 | Wang | | 510 | Hansen | P91 | Hansen | | 550 | Silva | U25 | Silva | | 580 | Devi | U25 | Devi | *------------+-----------+---------+-------------*/
Block an inner join operation without entire overlap
You can block a join operation if it doesn't include an overlap with all required columns.
Example:
In the following example, a join operation is attempted on a view called ExamView
and a table called StudentTable
, but the query fails. The failure occurs because the ExamView
list overlap analysis rule requires joining on all columns present in the join restriction policy. Because the table called StudentTable
doesn't contain these columns, not all rows overlap and an error is produced.
-- Create a view that includes ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"join_restriction_policy": {"join_condition": "JOIN_ALL", "join_allowed_columns": ["test_score", "last_name"]}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- Query error: Joining must occur on all of the following columns -- [test_score, last_name] on table mydataset.ExamView. SELECT * FROM mydataset.ExamView INNER JOIN mydataset.StudentTable USING (last_name);
Use a join restriction policy with another policy
The join restriction policy can be used with other policies in the aggregation threshold and differential privacy analysis rules. However, once you've used a join restriction policy with another policy, you can't change that other policy afterwards.
Example:
In the following example, a join restriction policy is used in an aggregation threshold analysis rule:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"aggregation_threshold_policy":{"threshold": 3, "privacy_unit_column": "last_name"}, "join_restriction_policy": {"join_condition": "JOIN_ANY", "join_allowed_columns": ["test_id", "test_score"]}}' ) AS ( SELECT * FROM mydataset.ExamTable );
Block a join operation with no required column
You can block a join operation if it doesn't include at least one required column. To do this include the following parts in your list overlap analysis rule:
"join_restriction_policy": { "join_condition": "JOIN_ANY", "join_allowed_columns": ["column_name", ...] }
Example:
In the following query, the query is blocked with an error because the query does not contain any join operations on the test_score
or test_id
column in ExamView
and StudentTable
:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"aggregation_threshold_policy": {"threshold": 3, "privacy_unit_column": "last_name"}, "join_restriction_policy": {"join_condition": "JOIN_ANY", "join_allowed_columns": ["test_score", "test_id"]}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- Query error: Joining must occur on at least one of the following columns -- [test_score, test_id] on table mydataset.ExamView. SELECT * FROM mydataset.ExamView INNER JOIN mydataset.StudentTable USING (last_name);
To get the preceding query to run, in the USING
clause, replace last_name
with test_score
.
Block a query with no join operation
If the query must have a join operation, you can block the query if no join operation is present by using one of the following list overlap analysis rules:
"join_restriction_policy": { "join_condition": "JOIN_NOT_REQUIRED" }
"join_restriction_policy": { "join_condition": "JOIN_NOT_REQUIRED", "join_allowed_columns": [] }
Example:
In the following query, the query is blocked because there is no join operation with ExamView
in the query:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"aggregation_threshold_policy": {"threshold": 3, "privacy_unit_column": "last_name"}, "join_restriction_policy": {"join_condition": "JOIN_NOT_REQUIRED"}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- Query error: At least one allowed column must be specified with -- join_condition = 'JOIN_NOT_REQUIRED'. SELECT * FROM mydataset.ExamView;
Block a query with no join operation and no required column
If the query must have a join operation and the join operation must have at least one required column, include the following parts in your list overlap analysis rule:
"join_restriction_policy": { "join_condition": "JOIN_NOT_REQUIRED", "join_allowed_columns": ["column_name", ...] }
Example:
In the following query, the query is blocked because the join operation does not include a column in the ExamView
join_allowed_columns
array:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"aggregation_threshold_policy": {"threshold": 3, "privacy_unit_column": "last_name"}, "join_restriction_policy": {"join_condition": "JOIN_NOT_REQUIRED", "join_allowed_columns": ["test_score"]}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- Query error: Join occurring on a restricted column. SELECT * FROM mydataset.ExamView INNER JOIN mydataset.StudentTable USING (last_name);
To get the preceding query to run, in the USING
clause, replace last_name
with test_score
.
Block all join operations
You can block all join operations. To do this, only include the following parts in your list overlap analysis rule:
"join_restriction_policy": { "join_condition": "JOIN_BLOCKED", }
Example:
In the following query, the query is blocked because there is a join operation with a view called ExamView
:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"aggregation_threshold_policy": {"threshold": 3, "privacy_unit_column": "last_name"}, "join_restriction_policy": {"join_condition": "JOIN_BLOCKED"}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- Query error: Join occurring on a restricted column. SELECT * FROM mydataset.ExamView INNER JOIN mydataset.StudentTable USING (last_name);
To get the preceding query to run, remove the INNER JOIN
operation.
Block an inner join operation without all required columns
You can block an inner join operation if it doesn't include all required columns. To do this, include the following parts in your list overlap analysis rule:
"join_restriction_policy": { "join_condition": "JOIN_ALL", "join_allowed_columns": ["column_name", ...] }
Example:
In the following query, the query is blocked with an error because the query does not includetest_score
in the join operation with the view called ExamView
:
-- Create a view that includes a table called ExamTable. CREATE OR REPLACE VIEW mydataset.ExamView OPTIONS( privacy_policy= '{"aggregation_threshold_policy": {"threshold": 3, "privacy_unit_column": "last_name"}, "join_restriction_policy": {"join_condition": "JOIN_ALL", "join_allowed_columns": ["test_score", "last_name"]}}' ) AS ( SELECT * FROM mydataset.ExamTable ); -- Query error: Joining must occur on all of the following columns -- [test_score, last_name] on table mydataset.ExamView. SELECT * FROM mydataset.ExamView INNER JOIN mydataset.StudentTable USING (last_name);
To get the preceding query to run, replace USING (last_name)
with USING (last_name, test_score)
.
Example tables
Several examples in this document reference two tables called ExamTable
and StudentTable
. ExamTable
contains a list of test scores produced by students and StudentTable
contains a list of students and their test scores.
To test the examples in this document, first add the following sample tables to your project:
-- Create a table called ExamTable. CREATE OR REPLACE TABLE mydataset.ExamTable AS ( SELECT "Hansen" AS last_name, "P91" AS test_id, 510 AS test_score UNION ALL SELECT "Wang", "U25", 500 UNION ALL SELECT "Wang", "C83", 520 UNION ALL SELECT "Wang", "U25", 460 UNION ALL SELECT "Hansen", "C83", 420 UNION ALL SELECT "Hansen", "C83", 560 UNION ALL SELECT "Devi", "U25", 580 UNION ALL SELECT "Devi", "P91", 480 UNION ALL SELECT "Ivanov", "U25", 490 UNION ALL SELECT "Ivanov", "P91", 540 UNION ALL SELECT "Silva", "U25", 550); -- Create a table called StudentTable. CREATE OR REPLACE TABLE mydataset.StudentTable AS ( SELECT "Hansen" AS last_name, 510 AS test_score UNION ALL SELECT "Wang", 500 UNION ALL SELECT "Devi", 580 UNION ALL SELECT "Ivanov", 490 UNION ALL SELECT "Silva", 550);
Limitations
Analysis rules have the following limitations:
- If you've already added an analysis rule to a view, you can't switch between aggregation threshold analysis rules and differential privacy analysis rules.
An aggregation threshold analysis rule has the following limitations:
- You can only use supported aggregate functions in a query on an aggregation threshold analysis rule–enforced view.
- You can't add an aggregation threshold analysis rule to a materialized view.
- If you use an aggregation threshold analysis rule–enforced view in an aggregation threshold query, they must both have the same value for the privacy unit column.
- If you use an aggregation threshold analysis rule–enforced view in an aggregation threshold query, the threshold in the query must be greater than or equal to the threshold in the view.
- Time travel is disabled on any view that has an aggregation threshold analysis rule.
A differential privacy analysis rule has the following limitations:
- Once a privacy budget is exhausted for a view, that view can't be used and you must create a new view.
A list overlap analysis rule has the following limitations:
- If you combine an aggregation threshold analysis rule or a differential privacy analysis rule with a list overlap analysis rule and you don't place the
privacy_unit_column
as ajoin_allowed_column
in the list overlap analysis rule, you might not be able to join any columns in certain situations.
Pricing
- There is no additional cost to attach an analysis rule to a view.
- Standard BigQuery pricing for analysis applies.