Assertions

Why Would You Use Assertions APIs?

The Assertions APIs allow you to create, schedule, run, and delete Assertions with Acryl Cloud.

Supported Assertion Types include Freshness, Volume, Column, Custom SQL, and Schema Assertions.

Goal Of This Guide

This guide will show you how to create, schedule, run and delete Assertions for a Table.

Prerequisites

The actor making API calls must have the Edit Assertions and Edit Monitors privileges for the Tables at hand.

Create Assertions

You can create new dataset Assertions in DataHub using the following APIs.
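
Each operation in this guide is a GraphQL document sent over HTTP. The following is a minimal Python sketch of preparing such a request with the standard library only. The host `https://datahub.example.com`, the `/api/graphql` path, the bearer-token header, and the helper name `build_graphql_request` are assumptions for illustration; check them against your own deployment.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP POST carrying one GraphQL operation."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/graphql",  # assumed endpoint path
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_graphql_request(
    "https://datahub.example.com",  # placeholder host
    "<personal access token>",      # placeholder token
    "query { __typename }",         # replace with any operation from this guide
)
# Sending is left to the caller, for example urllib.request.urlopen(request).
```

The request is built without being sent, so the same helper can be reused for every mutation and query below.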

Freshness Assertions

To create a new freshness assertion, use the upsertDatasetFreshnessAssertionMonitor GraphQL Mutation.

mutation upsertDatasetFreshnessAssertionMonitor {
  upsertDatasetFreshnessAssertionMonitor(
    input: {
      entityUrn: "<urn of entity being monitored>",
      schedule: {
        type: FIXED_INTERVAL,
        fixedInterval: { unit: HOUR, multiple: 8 }
      }
      evaluationSchedule: {
        timezone: "America/Los_Angeles",
        cron: "0 */8 * * *"
      }
      evaluationParameters: {
        sourceType: INFORMATION_SCHEMA
      }
      mode: ACTIVE
    }
  ) {
    urn
  }
}

For more details, see the Freshness Assertions guide.
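
In the mutation above, the fixed interval (8 hours) and the evaluation cron (`0 */8 * * *`) describe the same cadence. The sketch below derives both from a single number so they cannot drift apart. It mirrors the input object as a Python dict, on the assumption that you pass it as a GraphQL variable; the helper names `every_n_hours_cron` and `freshness_input` are hypothetical, and keeping the two schedules aligned is a design choice rather than a documented requirement.

```python
def every_n_hours_cron(hours: int) -> str:
    """Return a 5-field cron expression that fires at minute 0 every `hours` hours."""
    if hours not in (1, 2, 3, 4, 6, 8, 12):
        # Only divisors of 24 keep runs evenly spaced across midnight.
        raise ValueError("hours must be one of 1, 2, 3, 4, 6, 8, 12")
    return f"0 */{hours} * * *"


def freshness_input(entity_urn: str, hours: int, timezone: str) -> dict:
    """Mirror the freshness mutation's input object, with enum values as strings."""
    return {
        "entityUrn": entity_urn,
        "schedule": {
            "type": "FIXED_INTERVAL",
            "fixedInterval": {"unit": "HOUR", "multiple": hours},
        },
        "evaluationSchedule": {"timezone": timezone, "cron": every_n_hours_cron(hours)},
        "evaluationParameters": {"sourceType": "INFORMATION_SCHEMA"},
        "mode": "ACTIVE",
    }


example = freshness_input("<urn of entity being monitored>", 8, "America/Los_Angeles")
```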

Volume Assertions

To create a new volume assertion, use the upsertDatasetVolumeAssertionMonitor GraphQL Mutation.

mutation upsertDatasetVolumeAssertionMonitor {
  upsertDatasetVolumeAssertionMonitor(
    input: {
      entityUrn: "<urn of entity being monitored>"
      type: ROW_COUNT_TOTAL
      rowCountTotal: {
        operator: BETWEEN
        parameters: {
          minValue: {
            value: "10"
            type: NUMBER
          }
          maxValue: {
            value: "20"
            type: NUMBER
          }
        }
      }
      evaluationSchedule: {
        timezone: "America/Los_Angeles"
        cron: "0 */8 * * *"
      }
      evaluationParameters: {
        sourceType: INFORMATION_SCHEMA
      }
      mode: ACTIVE
    }
  ) {
    urn
  }
}

For more details, see the Volume Assertions guide.
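
Note that the example passes the row-count bounds as strings (`"10"`, `"20"`) tagged with `type: NUMBER`. A small sketch of building that `parameters` object from integers, with a sanity check on the range, might look like this. The helper name `between_parameters` is hypothetical, and the dict form assumes you supply the input as a GraphQL variable.

```python
def between_parameters(min_rows: int, max_rows: int) -> dict:
    """Build the `parameters` object for a BETWEEN row-count check."""
    if min_rows < 0 or max_rows < min_rows:
        raise ValueError("expected 0 <= min_rows <= max_rows")
    return {
        # Values are sent as strings, as in the mutation above.
        "minValue": {"value": str(min_rows), "type": "NUMBER"},
        "maxValue": {"value": str(max_rows), "type": "NUMBER"},
    }


row_count_total = {"operator": "BETWEEN", "parameters": between_parameters(10, 20)}
```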

Column Assertions

To create a new column assertion, use the upsertDatasetFieldAssertionMonitor GraphQL Mutation.

mutation upsertDatasetFieldAssertionMonitor {
  upsertDatasetFieldAssertionMonitor(
    input: {
      entityUrn: "<urn of entity being monitored>"
      type: FIELD_VALUES,
      fieldValuesAssertion: {
        field: {
          path: "<name of the column to be monitored>",
          type: "NUMBER",
          nativeType: "NUMBER(38,0)"
        },
        operator: GREATER_THAN,
        parameters: {
          value: {
            type: NUMBER,
            value: "10"
          }
        },
        failThreshold: {
          type: COUNT,
          value: 0
        },
        excludeNulls: true
      }
      evaluationSchedule: {
        timezone: "America/Los_Angeles"
        cron: "0 */8 * * *"
      }
      evaluationParameters: {
        sourceType: ALL_ROWS_QUERY
      }
      mode: ACTIVE
    }
  ) {
    urn
  }
}

For more details, see the Column Assertions guide.

Custom SQL Assertions

To create a new custom SQL assertion, use the upsertDatasetSqlAssertionMonitor GraphQL Mutation.

mutation upsertDatasetSqlAssertionMonitor {
  upsertDatasetSqlAssertionMonitor(
    assertionUrn: "<urn of assertion created in earlier query>"
    input: {
      entityUrn: "<urn of entity being monitored>"
      type: METRIC,
      description: "<description of the custom assertion>",
      statement: "<SQL query to be evaluated>",
      operator: GREATER_THAN_OR_EQUAL_TO,
      parameters: {
        value: {
          value: "100",
          type: NUMBER
        }
      }
      evaluationSchedule: {
        timezone: "America/Los_Angeles"
        cron: "0 */6 * * *"
      }
      mode: ACTIVE
    }
  ) {
    urn
  }
}

For more details, see the Custom SQL Assertions guide.

Schema Assertions

To create a new schema assertion, use the upsertDatasetSchemaAssertionMonitor GraphQL Mutation.

mutation upsertDatasetSchemaAssertionMonitor {
  upsertDatasetSchemaAssertionMonitor(
    assertionUrn: "urn:li:assertion:existing-assertion-id",
    input: {
      entityUrn: "<urn of the table to be monitored>",
      assertion: {
        compatibility: EXACT_MATCH,
        fields: [
          {
            path: "id",
            type: STRING
          },
          {
            path: "count",
            type: NUMBER
          },
          {
            path: "struct",
            type: STRUCT
          },
          {
            path: "struct.nestedBooleanField",
            type: BOOLEAN
          }
        ]
      },
      description: "<description of the schema assertion>",
      mode: ACTIVE
    }
  ) {
    urn
  }
}

For more details, see the Schema Assertions guide.

Get Assertions

You can use the following APIs to:

  1. Fetch existing assertion definitions + run history
  2. Fetch the assertions associated with a given table + their run history.

Get Assertions for a Table

To retrieve all the assertions for a table, you can use the following (super long) GraphQL Query.

query dataset {
dataset(urn: "urn:li:dataset:(urn:li:dataPlatform:snowflake,purchases,PROD)") {
assertions(start: 0, count: 1000) {
start
count
total
assertions {
# Fetch the last run of each associated assertion.
runEvents(status: COMPLETE, limit: 1) {
total
failed
succeeded
runEvents {
timestampMillis
status
result {
type
nativeResults {
key
value
}
}
}
}
info {
type
description
lastUpdated {
time
actor
}
datasetAssertion {
datasetUrn
scope
aggregation
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
fields {
urn
path
}
nativeType
nativeParameters {
key
value
}
logic
}
freshnessAssertion {
type
entityUrn
schedule {
type
cron {
cron
timezone
}
fixedInterval {
unit
multiple
}
}
filter {
type
sql
}
}
sqlAssertion {
type
entityUrn
statement
changeType
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
fieldAssertion {
type
entityUrn
filter {
type
sql
}
fieldValuesAssertion {
field {
path
type
nativeType
}
transform {
type
}
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
failThreshold {
type
value
}
excludeNulls
}
fieldMetricAssertion {
field {
path
type
nativeType
}
metric
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
}
volumeAssertion {
type
entityUrn
filter {
type
sql
}
rowCountTotal {
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
rowCountChange {
type
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
}
schemaAssertion {
entityUrn
compatibility
fields {
path
type
nativeType
}
schema {
fields {
fieldPath
type
nativeDataType
}
}
}
source {
type
created {
time
actor
}
}
}
}
}
}
}
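
Because the query above limits `runEvents` to the single most recent completed run, the response can be reduced to one status per assertion. The sketch below assumes the JSON response mirrors the shape of the query; the helper name `latest_results` and the sample payload are hypothetical and only illustrate the traversal.

```python
def latest_results(response: dict) -> list:
    """Return (assertion type, description, latest result type) for each assertion."""
    assertions = response["data"]["dataset"]["assertions"]["assertions"]
    summary = []
    for assertion in assertions:
        info = assertion.get("info") or {}
        runs = (assertion.get("runEvents") or {}).get("runEvents") or []
        # With limit: 1 there is at most one run; it may be absent if never run.
        result = (runs[0].get("result") or {}) if runs else {}
        summary.append((info.get("type"), info.get("description"), result.get("type")))
    return summary


# Hypothetical, heavily trimmed response used only to exercise the helper.
sample = {
    "data": {"dataset": {"assertions": {"assertions": [
        {
            "runEvents": {"runEvents": [{"status": "COMPLETE", "result": {"type": "SUCCESS"}}]},
            "info": {"type": "VOLUME", "description": "row count in range"},
        },
        {
            "runEvents": {"runEvents": []},
            "info": {"type": "FRESHNESS", "description": "updated every 8 hours"},
        },
    ]}}}
}
summary = latest_results(sample)
```

An assertion that has never completed a run yields `None` as its result type, which callers can treat separately from a failure.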

Get a single assertion

You can use the following GraphQL query to fetch a single assertion by its URN.

query getAssertion {
assertion(urn: "urn:li:assertion:assertion-id") {
# Fetch the last 10 runs for the assertion.
runEvents(status: COMPLETE, limit: 10) {
total
failed
succeeded
runEvents {
timestampMillis
status
result {
type
nativeResults {
key
value
}
}
}
}
info {
type
description
lastUpdated {
time
actor
}
datasetAssertion {
datasetUrn
scope
aggregation
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
fields {
urn
path
}
nativeType
nativeParameters {
key
value
}
logic
}
freshnessAssertion {
type
entityUrn
schedule {
type
cron {
cron
timezone
}
fixedInterval {
unit
multiple
}
}
filter {
type
sql
}
}
sqlAssertion {
type
entityUrn
statement
changeType
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
fieldAssertion {
type
entityUrn
filter {
type
sql
}
fieldValuesAssertion {
field {
path
type
nativeType
}
transform {
type
}
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
failThreshold {
type
value
}
excludeNulls
}
fieldMetricAssertion {
field {
path
type
nativeType
}
metric
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
}
volumeAssertion {
type
entityUrn
filter {
type
sql
}
rowCountTotal {
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
rowCountChange {
type
operator
parameters {
value {
value
type
}
minValue {
value
type
}
maxValue {
value
type
}
}
}
}
schemaAssertion {
entityUrn
compatibility
fields {
path
type
nativeType
}
schema {
fields {
fieldPath
type
nativeDataType
}
}
}
source {
type
created {
time
actor
}
}
}
}
}

Run Assertions

You can use the following APIs to trigger the assertions you've created to run on-demand. This is particularly useful for running assertions on a custom schedule, for example from your production data pipelines.

Run an assertion

mutation runAssertion {
  runAssertion(urn: "urn:li:assertion:your-assertion-id", saveResult: true) {
    type
    nativeResults {
      key
      value
    }
  }
}

In the response, type contains the result of the assertion run: SUCCESS, FAILURE, or ERROR.

The saveResult argument determines whether the result of the assertion will be saved to DataHub's backend, and available to view through the DataHub UI. If this is set to false, the result will NOT be stored in DataHub's backend. The value defaults to true.

If the assertion is external (not natively executed by Acryl), this API will return an error.

If running the assertion is successful, the result will be returned as follows:

{
  "data": {
    "runAssertion": {
      "type": "SUCCESS",
      "nativeResults": [
        {
          "key": "Value",
          "value": "1382"
        }
      ]
    }
  },
  "extensions": {}
}
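
When triggering an assertion from a data pipeline, the response can be turned into a simple pass/fail gate. This is a minimal sketch that assumes the response has the shape shown above and that request-level failures arrive under the standard GraphQL `errors` key; the helper name `assertion_passed` is hypothetical.

```python
def assertion_passed(response: dict) -> bool:
    """Return True only when the assertion run reported SUCCESS."""
    if response.get("errors"):
        # For example, the assertion is external and cannot be run on demand.
        raise RuntimeError(f"runAssertion request failed: {response['errors']}")
    return response["data"]["runAssertion"]["type"] == "SUCCESS"


response = {
    "data": {
        "runAssertion": {
            "type": "SUCCESS",
            "nativeResults": [{"key": "Value", "value": "1382"}],
        }
    },
    "extensions": {},
}
passed = assertion_passed(response)
```

Treating both FAILURE and ERROR as "not passed" is a conservative choice; a pipeline may prefer to retry on ERROR instead.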

Run multiple assertions

mutation runAssertions {
  runAssertions(urns: ["urn:li:assertion:your-assertion-id-1", "urn:li:assertion:your-assertion-id-2"], saveResults: true) {
    passingCount
    failingCount
    errorCount
    results {
      urn
      type
      nativeResults {
        key
        value
      }
    }
  }
}

In each result, type contains the result of that assertion run: SUCCESS, FAILURE, or ERROR.

The saveResults argument determines whether the result of the assertion will be saved to DataHub's backend, and available to view through the DataHub UI. If this is set to false, the result will NOT be stored in DataHub's backend. The value defaults to true.

If any of the assertions are external (not natively executed by Acryl), they will simply be omitted from the result set.

If running the assertions is successful, the results will be returned as follows:

{
  "data": {
    "runAssertions": {
      "passingCount": 1,
      "failingCount": 1,
      "errorCount": 0,
      "results": [
        {
          "urn": "urn:li:assertion:your-assertion-id-1",
          "type": "SUCCESS",
          "nativeResults": [
            {
              "key": "Value",
              "value": "1382"
            }
          ]
        },
        {
          "urn": "urn:li:assertion:your-assertion-id-2",
          "type": "FAILURE",
          "nativeResults": [
            {
              "key": "Value",
              "value": "12323"
            }
          ]
        }
      ]
    }
  },
  "extensions": {}
}

You should see one result object for each assertion that was run.
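
Since external assertions are omitted from the result set, it can be useful to compare the URNs you requested against the URNs that came back, in addition to collecting failures. The following sketch assumes the response shape shown above; the helper name `summarize_run` is hypothetical.

```python
def summarize_run(response: dict, requested_urns: list, field: str = "runAssertions") -> dict:
    """Split requested assertion URNs into not-passing and missing-from-results."""
    results = response["data"][field]["results"]
    returned = {r["urn"] for r in results}
    return {
        # Anything other than SUCCESS (FAILURE or ERROR) is reported here.
        "not_passing": [r["urn"] for r in results if r["type"] != "SUCCESS"],
        # Requested but absent, e.g. external assertions omitted by the API.
        "missing": [urn for urn in requested_urns if urn not in returned],
    }


response = {
    "data": {"runAssertions": {"results": [
        {"urn": "urn:li:assertion:your-assertion-id-1", "type": "SUCCESS", "nativeResults": []},
        {"urn": "urn:li:assertion:your-assertion-id-2", "type": "FAILURE", "nativeResults": []},
    ]}}
}
report = summarize_run(
    response,
    [
        "urn:li:assertion:your-assertion-id-1",
        "urn:li:assertion:your-assertion-id-2",
        "urn:li:assertion:your-assertion-id-3",
    ],
)
```

Passing `field="runAssertionsForAsset"` would apply the same summary to a response keyed by that mutation name.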

Run all assertions for a table

You can also run all assertions for a specific data asset using the runAssertionsForAsset mutation.

mutation runAssertionsForAsset {
  runAssertionsForAsset(urn: "urn:li:dataset:(urn:li:dataPlatform:snowflake,purchase_events,PROD)", saveResults: true) {
    passingCount
    failingCount
    errorCount
    results {
      urn
      type
      nativeResults {
        key
        value
      }
    }
  }
}

In each result, type contains the result of that assertion run: SUCCESS, FAILURE, or ERROR.

The saveResults argument determines whether the result of the assertion will be saved to DataHub's backend, and available to view through the DataHub UI. If this is set to false, the result will NOT be stored in DataHub's backend. The value defaults to true.

If any of the assertions are external (not natively executed by Acryl), they will simply be omitted from the result set.

If running the assertions is successful, the results will be returned as follows:

{
  "data": {
    "runAssertionsForAsset": {
      "passingCount": 1,
      "failingCount": 1,
      "errorCount": 0,
      "results": [
        {
          "urn": "urn:li:assertion:your-assertion-id-1",
          "type": "SUCCESS",
          "nativeResults": [
            {
              "key": "Value",
              "value": "1382"
            }
          ]
        },
        {
          "urn": "urn:li:assertion:your-assertion-id-2",
          "type": "FAILURE",
          "nativeResults": [
            {
              "key": "Value",
              "value": "12323"
            }
          ]
        }
      ]
    }
  },
  "extensions": {}
}

You should see one result object for each assertion that was run.

Delete Assertions

You can delete an Assertion from DataHub using the following API.

mutation deleteAssertion {
  deleteAssertion(urn: "urn:li:assertion:test")
}

If you see the following response, the operation was successful:

{
  "data": {
    "deleteAssertion": true
  },
  "extensions": {}
}
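
To delete several assertions, the mutation text can be generated per URN and each response checked for the `true` value shown above. This is a sketch only: the helper names `delete_mutation` and `was_deleted` are hypothetical, and using `json.dumps` to quote the URN assumes it contains only characters that GraphQL and JSON string literals escape the same way.

```python
import json


def delete_mutation(urn: str) -> str:
    """Render the deleteAssertion mutation for one assertion URN."""
    return "mutation deleteAssertion { deleteAssertion(urn: %s) }" % json.dumps(urn)


def was_deleted(response: dict) -> bool:
    """Return True when the response reports a successful delete."""
    data = response.get("data") or {}
    return data.get("deleteAssertion") is True


mutation_text = delete_mutation("urn:li:assertion:test")
ok = was_deleted({"data": {"deleteAssertion": True}, "extensions": {}})
```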