Assertions
Why Would You Use Assertions APIs?
The Assertions APIs allow you to create, schedule, run, and delete Assertions with Acryl Cloud.
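Supported Assertion Types include:
- Freshness Assertions
- Volume Assertions
- Column Assertions
- Custom SQL Assertions
- Schema Assertions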
Goal Of This Guide
This guide will show you how to create, schedule, run and delete Assertions for a Table.
Prerequisites
The actor making API calls must have the Edit Assertions and Edit Monitors privileges for the Tables at hand.
Create Assertions
You can create new dataset Assertions in DataHub using the following APIs.
- GraphQL
Freshness Assertion
To create a new freshness assertion, use the upsertDatasetFreshnessAssertionMonitor GraphQL Mutation.
mutation upsertDatasetFreshnessAssertionMonitor {
  upsertDatasetFreshnessAssertionMonitor(
      input: {
        entityUrn: "<urn of entity being monitored>",
        schedule: {
          type: FIXED_INTERVAL,
          fixedInterval: { unit: HOUR, multiple: 8 }
        }
        evaluationSchedule: {
          timezone: "America/Los_Angeles",
          cron: "0 */8 * * *"
        }
        evaluationParameters: {
          sourceType: INFORMATION_SCHEMA
        }
        mode: ACTIVE
      }
  ) {
      urn
    }
}
For more details, see the Freshness Assertions guide.
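The mutations in this guide can be sent from any HTTP client. The following is a minimal sketch, assuming the GraphQL endpoint is served at /api/graphql and that a personal access token is accepted as a Bearer header. The helper name build_graphql_request, the server URL, and the token are placeholders for illustration and are not part of any DataHub SDK. The helper only builds the request; nothing is sent.

```python
import json
import urllib.request


def build_graphql_request(server, token, query, variables=None):
    """Build (but do not send) a POST request carrying a GraphQL document."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    return urllib.request.Request(
        url=server.rstrip("/") + "/api/graphql",  # assumed endpoint path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


# The freshness mutation from this guide; any other mutation can be pasted instead.
FRESHNESS_MUTATION = """
mutation upsertDatasetFreshnessAssertionMonitor {
  upsertDatasetFreshnessAssertionMonitor(
    input: {
      entityUrn: "<urn of entity being monitored>"
      schedule: { type: FIXED_INTERVAL, fixedInterval: { unit: HOUR, multiple: 8 } }
      evaluationSchedule: { timezone: "America/Los_Angeles", cron: "0 */8 * * *" }
      evaluationParameters: { sourceType: INFORMATION_SCHEMA }
      mode: ACTIVE
    }
  ) {
    urn
  }
}
"""

request = build_graphql_request(
    "http://localhost:8080", "<personal-access-token>", FRESHNESS_MUTATION
)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it reaches the server.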
Volume Assertions
To create a new volume assertion, use the upsertDatasetVolumeAssertionMonitor GraphQL Mutation.
mutation upsertDatasetVolumeAssertionMonitor {
  upsertDatasetVolumeAssertionMonitor(
    input: {
      entityUrn: "<urn of entity being monitored>"
      type: ROW_COUNT_TOTAL
      rowCountTotal: {
        operator: BETWEEN
        parameters: {
          minValue: {
            value: "10"
            type: NUMBER
          }
          maxValue: {
            value: "20"
            type: NUMBER
          }
        }
      }
      evaluationSchedule: {
        timezone: "America/Los_Angeles"
        cron: "0 */8 * * *"
      }
      evaluationParameters: {
        sourceType: INFORMATION_SCHEMA
      }
      mode: ACTIVE
    }
  ) {
    urn
  }
}
For more details, see the Volume Assertions guide.
Column Assertions
To create a new column assertion, use the upsertDatasetFieldAssertionMonitor GraphQL Mutation.
mutation upsertDatasetFieldAssertionMonitor {
  upsertDatasetFieldAssertionMonitor(
    input: {
      entityUrn: "<urn of entity being monitored>"
      type: FIELD_VALUES,
      fieldValuesAssertion: {
        field: {
          path: "<name of the column to be monitored>",
          type: "NUMBER",
          nativeType: "NUMBER(38,0)"
        },
        operator: GREATER_THAN,
        parameters: {
          value: {
            type: NUMBER,
            value: "10"
          }
        },
        failThreshold: {
          type: COUNT,
          value: 0
        },
        excludeNulls: true
      }
      evaluationSchedule: {
        timezone: "America/Los_Angeles"
        cron: "0 */8 * * *"
      }
      evaluationParameters: {
        sourceType: ALL_ROWS_QUERY
      }
      mode: ACTIVE
    }
  ){
    urn
  }
}
For more details, see the Column Assertions guide.
Custom SQL Assertions
To create a new custom SQL assertion, use the upsertDatasetSqlAssertionMonitor GraphQL Mutation.
mutation upsertDatasetSqlAssertionMonitor {
  upsertDatasetSqlAssertionMonitor(
    assertionUrn: "<urn of assertion created in earlier query>"
    input: {
      entityUrn: "<urn of entity being monitored>"
      type: METRIC,
      description: "<description of the custom assertion>",
      statement: "<SQL query to be evaluated>",
      operator: GREATER_THAN_OR_EQUAL_TO,
      parameters: {
        value: {
          value: "100",
          type: NUMBER
        }
      }
      evaluationSchedule: {
        timezone: "America/Los_Angeles"
        cron: "0 */6 * * *"
      }
      mode: ACTIVE   
    }
  ) {
    urn
  }
}
For more details, see the Custom SQL Assertions guide.
Schema Assertions
To create a new schema assertion, use the upsertDatasetSchemaAssertionMonitor GraphQL Mutation.
mutation upsertDatasetSchemaAssertionMonitor {
    upsertDatasetSchemaAssertionMonitor(
        assertionUrn: "urn:li:assertion:existing-assertion-id",
        input: {
            entityUrn: "<urn of the table to be monitored>",
            assertion: {
                compatibility: EXACT_MATCH,
                fields: [
                    {
                        path: "id",
                        type: STRING
                    },
                    {
                        path: "count",
                        type: NUMBER
                    },
                    {
                        path: "struct",
                        type: STRUCT
                    },
                    {
                        path: "struct.nestedBooleanField",
                        type: BOOLEAN
                    }
                ]
            },
            description: "<description of the schema assertion>",
            mode: ACTIVE
        }
    ) {
        urn
    }
}
For more details, see the Schema Assertions guide.
Get Assertions
You can use the following APIs to:
- Fetch existing assertion definitions and their run history.
- Fetch the assertions associated with a given table and their run history.
- GraphQL
- Python
Get Assertions for a Table
To retrieve all the assertions for a table, you can use the following (super long) GraphQL Query.
query dataset {
    dataset(urn: "urn:li:dataset:(urn:li:dataPlatform:snowflake,purchases,PROD)") {
        assertions(start: 0, count: 1000) {
            start
            count
            total
            assertions {
                # Fetch the last run of each associated assertion. 
                runEvents(status: COMPLETE, limit: 1) {
                    total
                    failed
                    succeeded
                    runEvents {
                        timestampMillis
                        status
                        result {
                            type
                            nativeResults {
                                key
                                value
                            }
                        }
                    }
                }
                info {
                    type
                    description
                    lastUpdated {
                        time
                        actor
                    }
                    datasetAssertion {
                        datasetUrn
                        scope
                        aggregation
                        operator
                        parameters {
                            value {
                                value
                                type
                            }
                            minValue {
                                value
                                type
                            }
                            maxValue {
                                value
                                type
                            }
                        }
                        fields {
                            urn
                            path
                        }
                        nativeType
                        nativeParameters {
                            key
                            value
                        }
                        logic
                    }
                    freshnessAssertion {
                        type
                        entityUrn
                        schedule {
                            type
                            cron {
                                cron
                                timezone
                            }
                            fixedInterval {
                                unit
                                multiple
                            }
                        }
                        filter {
                            type
                            sql
                        }
                    }
                    sqlAssertion {
                        type
                        entityUrn
                        statement
                        changeType
                        operator
                        parameters {
                            value {
                                value
                                type
                            }
                            minValue {
                                value
                                type
                            }
                            maxValue {
                                value
                                type
                            }
                        }
                    }
                    fieldAssertion {
                        type
                        entityUrn
                        filter {
                            type
                            sql
                        }
                        fieldValuesAssertion {
                            field {
                                path
                                type
                                nativeType
                            }
                            transform {
                                type
                            }
                            operator
                            parameters {
                                value {
                                    value
                                    type
                                }
                                minValue {
                                    value
                                    type
                                }
                                maxValue {
                                    value
                                    type
                                }
                            }
                            failThreshold {
                                type
                                value
                            }
                            excludeNulls
                        }
                        fieldMetricAssertion {
                            field {
                                path
                                type
                                nativeType
                            }
                            metric
                            operator
                            parameters {
                                value {
                                    value
                                    type
                                }
                                minValue {
                                    value
                                    type
                                }
                                maxValue {
                                    value
                                    type
                                }
                            }
                        }
                    }
                    volumeAssertion {
                        type
                        entityUrn
                        filter {
                            type
                            sql
                        }
                        rowCountTotal {
                            operator
                            parameters {
                                value {
                                    value
                                    type
                                }
                                minValue {
                                    value
                                    type
                                }
                                maxValue {
                                    value
                                    type
                                }
                            }
                        }
                        rowCountChange {
                            type
                            operator
                            parameters {
                                value {
                                    value
                                    type
                                }
                                minValue {
                                    value
                                    type
                                }
                                maxValue {
                                    value
                                    type
                                }
                            }
                        }
                    }
                    schemaAssertion {
                        entityUrn
                        compatibility
                        fields {
                            path
                            type
                            nativeType
                        }
                        schema {
                            fields {
                                fieldPath
                                type
                                nativeDataType
                            }
                        }
                    }
                    source {
                        type
                        created {
                            time
                            actor
                        }
                    }
                }
            }
        }
    }
}
Get a single assertion
You can use the following GraphQL query to fetch a single assertion by its URN.
query getAssertion {
    assertion(urn: "urn:li:assertion:assertion-id") {
        # Fetch the last 10 runs for the assertion. 
        runEvents(status: COMPLETE, limit: 10) {
            total
            failed
            succeeded
            runEvents {
                timestampMillis
                status
                result {
                    type
                    nativeResults {
                        key
                        value
                    }
                }
            }
        }
        info {
            type
            description
            lastUpdated {
                time
                actor
            }
            datasetAssertion {
                datasetUrn
                scope
                aggregation
                operator
                parameters {
                    value {
                        value
                        type
                    }
                    minValue {
                        value
                        type
                    }
                    maxValue {
                        value
                        type
                    }
                }
                fields {
                    urn
                    path
                }
                nativeType
                nativeParameters {
                    key
                    value
                }
                logic
            }
            freshnessAssertion {
                type
                entityUrn
                schedule {
                    type
                    cron {
                        cron
                        timezone
                    }
                    fixedInterval {
                        unit
                        multiple
                    }
                }
                filter {
                    type
                    sql
                }
            }
            sqlAssertion {
                type
                entityUrn
                statement
                changeType
                operator
                parameters {
                    value {
                        value
                        type
                    }
                    minValue {
                        value
                        type
                    }
                    maxValue {
                        value
                        type
                    }
                }
            }
            fieldAssertion {
                type
                entityUrn
                filter {
                    type
                    sql
                }
                fieldValuesAssertion {
                    field {
                        path
                        type
                        nativeType
                    }
                    transform {
                        type
                    }
                    operator
                    parameters {
                        value {
                            value
                            type
                        }
                        minValue {
                            value
                            type
                        }
                        maxValue {
                            value
                            type
                        }
                    }
                    failThreshold {
                        type
                        value
                    }
                    excludeNulls
                }
                fieldMetricAssertion {
                    field {
                        path
                        type
                        nativeType
                    }
                    metric
                    operator
                    parameters {
                        value {
                            value
                            type
                        }
                        minValue {
                            value
                            type
                        }
                        maxValue {
                            value
                            type
                        }
                    }
                }
            }
            volumeAssertion {
                type
                entityUrn
                filter {
                    type
                    sql
                }
                rowCountTotal {
                    operator
                    parameters {
                        value {
                            value
                            type
                        }
                        minValue {
                            value
                            type
                        }
                        maxValue {
                            value
                            type
                        }
                    }
                }
                rowCountChange {
                    type
                    operator
                    parameters {
                        value {
                            value
                            type
                        }
                        minValue {
                            value
                            type
                        }
                        maxValue {
                            value
                            type
                        }
                    }
                }
            }
            schemaAssertion {
                entityUrn
                compatibility
                fields {
                    path
                    type
                    nativeType
                }
                schema {
                    fields {
                        fieldPath
                        type
                        nativeDataType
                    }
                }
            }
            source {
                type
                created {
                    time
                    actor
                }
            }
        }
    }
}
Python support coming soon!
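In the meantime, the JSON returned by the getAssertion query can be post-processed in plain Python. The sketch below assumes the response mirrors the query's selection set (data.assertion.runEvents.runEvents); the function name summarize_run_events and the sample values are hypothetical and only illustrate the shape.

```python
def summarize_run_events(response):
    """Return (latest_result_type, pass_rate) from a getAssertion response dict."""
    events = response["data"]["assertion"]["runEvents"]["runEvents"]
    if not events:
        return None, None
    # The most recent run is the one with the largest timestamp.
    latest = max(events, key=lambda e: e["timestampMillis"])
    passed = sum(1 for e in events if (e.get("result") or {}).get("type") == "SUCCESS")
    latest_type = (latest.get("result") or {}).get("type")
    return latest_type, passed / len(events)


# Hypothetical response, trimmed to the fields used here.
sample_response = {
    "data": {
        "assertion": {
            "runEvents": {
                "total": 3,
                "failed": 1,
                "succeeded": 2,
                "runEvents": [
                    {"timestampMillis": 1700000000000, "status": "COMPLETE",
                     "result": {"type": "SUCCESS"}},
                    {"timestampMillis": 1700028800000, "status": "COMPLETE",
                     "result": {"type": "FAILURE"}},
                    {"timestampMillis": 1700057600000, "status": "COMPLETE",
                     "result": {"type": "SUCCESS"}},
                ],
            }
        }
    }
}

latest_type, pass_rate = summarize_run_events(sample_response)
print(latest_type, round(pass_rate, 2))
```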
Run Assertions
You can use the following APIs to trigger the assertions you've created to run on-demand. This is particularly useful for running assertions on a custom schedule, for example from your production data pipelines.
- GraphQL
- Python
Run an assertion
mutation runAssertion {
    runAssertion(urn: "urn:li:assertion:your-assertion-id", saveResult: true) {
        type 
        nativeResults {
            key
            value
        }
    }
}
Here, type contains the result of the assertion run: SUCCESS, FAILURE, or ERROR.
The saveResult argument determines whether the result of the assertion will be saved to DataHub's backend,
and available to view through the DataHub UI. If this is set to false, the result will NOT be stored in DataHub's
backend. The value defaults to true.
If the assertion is external (not natively executed by Acryl), this API will return an error.
If running the assertion is successful, the result will be returned as follows:
{
  "data": {
    "runAssertion": {
        "type": "SUCCESS",
        "nativeResults": [
          {
            "key": "Value",
            "value": "1382"
          }
        ]
    }
  },
  "extensions": {}
}
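When runAssertion is called from a data pipeline, the response usually has to be turned into a pass/fail decision. The following is a minimal sketch of one way to do that, assuming the response has already been parsed into a dict; the function name assertion_passed is hypothetical, and the "errors" key is the standard GraphQL error envelope rather than anything specific to this API.

```python
import json


def assertion_passed(response):
    """Return True on SUCCESS and False on FAILURE; raise if the run could not be evaluated."""
    if response.get("errors"):  # standard GraphQL error envelope
        raise RuntimeError(f"GraphQL errors: {response['errors']}")
    result = response["data"]["runAssertion"]
    if result["type"] == "ERROR":
        raise RuntimeError(f"Assertion run errored: {result.get('nativeResults')}")
    return result["type"] == "SUCCESS"


# The sample response shown above.
response = json.loads("""
{
  "data": {
    "runAssertion": {
      "type": "SUCCESS",
      "nativeResults": [{"key": "Value", "value": "1382"}]
    }
  },
  "extensions": {}
}
""")

passed = assertion_passed(response)
print(passed)
```

Raising on ERROR rather than returning False keeps "the data is bad" distinct from "the check could not run", which typically call for different handling in a pipeline.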
Run multiple assertions
mutation runAssertions {
    runAssertions(urns: ["urn:li:assertion:your-assertion-id-1", "urn:li:assertion:your-assertion-id-2"], saveResults: true) {
        passingCount
        failingCount
        errorCount
        results {
            urn
            type
            nativeResults {
                key
                value
            }
        }
    }
}
Here, type contains the result of each assertion run: SUCCESS, FAILURE, or ERROR.
The saveResults argument determines whether the results of the assertions will be saved to DataHub's backend,
and available to view through the DataHub UI. If this is set to false, the results will NOT be stored in DataHub's
backend. The value defaults to true.
If any of the assertions are external (not natively executed by Acryl), they will simply be omitted from the result set.
If running the assertions is successful, the results will be returned as follows:
{
  "data": {
    "runAssertions": {
      "passingCount": 1,
      "failingCount": 1,
      "errorCount": 0,
      "results": [
        {
          "urn": "urn:li:assertion:your-assertion-id-1",
          "type": "SUCCESS",
          "nativeResults": [
            {
              "key": "Value",
              "value": "1382"
            }
          ]
        },
        {
          "urn": "urn:li:assertion:your-assertion-id-2",
          "type": "FAILURE",
          "nativeResults": [
            {
              "key": "Value",
              "value": "12323"
            }
          ]
        }
      ]
    }
  },
  "extensions": {}
}
You should see one result object for each assertion.
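A batch response is often reduced to the list of assertions that did not pass. The sketch below assumes a parsed response shaped like the one above; the function name failing_assertions and its field parameter are hypothetical. Passing field="runAssertionsForAsset" would apply the same logic to the per-asset mutation, whose response has the same shape.

```python
def failing_assertions(response, field="runAssertions"):
    """Return the URNs of every assertion whose result type is not SUCCESS."""
    results = response["data"][field]["results"]
    return [r["urn"] for r in results if r["type"] != "SUCCESS"]


# Trimmed copy of the sample response: one passing and one failing assertion.
sample_response = {
    "data": {
        "runAssertions": {
            "results": [
                {
                    "urn": "urn:li:assertion:your-assertion-id-1",
                    "type": "SUCCESS",
                    "nativeResults": [{"key": "Value", "value": "1382"}],
                },
                {
                    "urn": "urn:li:assertion:your-assertion-id-2",
                    "type": "FAILURE",
                    "nativeResults": [{"key": "Value", "value": "12323"}],
                },
            ]
        }
    }
}

failing = failing_assertions(sample_response)
print(failing)
```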
Run all assertions for table
You can also run all assertions for a specific data asset using the runAssertionsForAsset mutation.
mutation runAssertionsForAsset {
    runAssertionsForAsset(urn: "urn:li:dataset:(urn:li:dataPlatform:snowflake,purchase_events,PROD)", saveResults: true) {
        passingCount
        failingCount
        errorCount
        results {
            urn
            type
            nativeResults {
                key
                value
            }
        }
    }
}
Here, type contains the result of each assertion run: SUCCESS, FAILURE, or ERROR.
The saveResults argument determines whether the results of the assertions will be saved to DataHub's backend,
and available to view through the DataHub UI. If this is set to false, the results will NOT be stored in DataHub's
backend. The value defaults to true.
If any of the assertions are external (not natively executed by Acryl), they will simply be omitted from the result set.
If running the assertions is successful, the results will be returned as follows:
{
  "data": {
    "runAssertionsForAsset": {
      "passingCount": 1,
      "failingCount": 1,
      "errorCount": 0,
      "results": [
        {
          "urn": "urn:li:assertion:your-assertion-id-1",
          "type": "SUCCESS",
          "nativeResults": [
            {
              "key": "Value",
              "value": "1382"
            }
          ]
        },
        {
          "urn": "urn:li:assertion:your-assertion-id-2",
          "type": "FAILURE",
          "nativeResults": [
            {
              "key": "Value",
              "value": "12323"
            }
          ]
        }
      ]
    }
  },
  "extensions": {}
}
You should see one result object for each assertion.
Run assertion
# Inlined from /metadata-ingestion/examples/library/run_assertion.py
import logging
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
log = logging.getLogger(__name__)
graph = DataHubGraph(
    config=DatahubClientConfig(
        server="http://localhost:8080",
    )
)
assertion_urn = "urn:li:assertion:6e3f9e09-1483-40f9-b9cd-30e5f182694a"
# Run the assertion
assertion_result = graph.run_assertion(urn=assertion_urn, saveResult=True)
log.info(f"Assertion result (SUCCESS / FAILURE / ERROR): {assertion_result.get('type')}")
Run multiple assertions
# Inlined from /metadata-ingestion/examples/library/run_assertions.py
import logging
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
log = logging.getLogger(__name__)
graph = DataHubGraph(
    config=DatahubClientConfig(
        server="http://localhost:8080",
    )
)
assertion_urns = [
    "urn:li:assertion:6e3f9e09-1483-40f9-b9cd-30e5f182694a",
    "urn:li:assertion:9e3f9e09-1483-40f9-b9cd-30e5f182694g",
]
# Run the assertions
assertion_results = graph.run_assertions(urns=assertion_urns, saveResults=True).get("results")
assertion_result_1 = assertion_results.get("urn:li:assertion:6e3f9e09-1483-40f9-b9cd-30e5f182694a")
assertion_result_2 = assertion_results.get("urn:li:assertion:9e3f9e09-1483-40f9-b9cd-30e5f182694g")
log.info(f"Assertion results: {assertion_results}")
log.info(f"Assertion result 1 (SUCCESS / FAILURE / ERROR): {assertion_result_1.get('type')}")
log.info(f"Assertion result 2 (SUCCESS / FAILURE / ERROR): {assertion_result_2.get('type')}")
Run all assertions for table
# Inlined from /metadata-ingestion/examples/library/run_assertions_for_asset.py
import logging
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
log = logging.getLogger(__name__)
graph = DataHubGraph(
    config=DatahubClientConfig(
        server="http://localhost:8080",
    )
)
dataset_urn = "urn:li:dataset:(urn:li:dataPlatform:snowflake,purchase_events,PROD)"
# Run all native assertions for the dataset
# (assumes DataHubGraph exposes run_assertions_for_asset, as the example's file name suggests)
assertion_results = graph.run_assertions_for_asset(urn=dataset_urn).get("results")
log.info(f"Assertion results: {assertion_results}")
Delete Assertions
You can delete assertions from DataHub using the following APIs.
- GraphQL
- Python
mutation deleteAssertion {
    deleteAssertion(urn: "urn:li:assertion:test")
}
If you see the following response, the operation was successful:
{
  "data": {
    "deleteAssertion": true
  },
  "extensions": {}
}
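A script can check for this response programmatically. The following is a minimal sketch, assuming the response has been parsed into a dict; the function name delete_succeeded is hypothetical, and the "errors" key is the standard GraphQL error envelope.

```python
def delete_succeeded(response):
    """Return True only if deleteAssertion came back true and no GraphQL errors were reported."""
    if response.get("errors"):
        return False
    return (response.get("data") or {}).get("deleteAssertion") is True


# The successful response shown above.
ok_response = {"data": {"deleteAssertion": True}, "extensions": {}}
# Hypothetical failed response: no data and a GraphQL error entry.
failed_response = {"data": None, "errors": [{"message": "Unauthorized"}]}

print(delete_succeeded(ok_response), delete_succeeded(failed_response))
```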
# Inlined from /metadata-ingestion/examples/library/delete_assertion.py
import logging
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
log = logging.getLogger(__name__)
graph = DataHubGraph(
    config=DatahubClientConfig(
        server="http://localhost:8080",
    )
)
assertion_urn = "urn:li:assertion:my-assertion"
# Delete the Assertion
graph.delete_entity(urn=assertion_urn, hard=True)
log.info(f"Deleted assertion {assertion_urn}")