YAML files are a convenient format for configuration. We can also use them to keep track of information about a list of objects, as in the example below. To catch typos, invalid YAML syntax, and unsupported properties, we can write a unit test that loads each YAML file and validates it against a JSON schema with the jsonschema package.

For example, we can set up the directory as follows:

  • metadata directory contains all YAML configuration files
  • schemas directory contains JSON schemas used to validate the syntax of YAML files
  • tests directory contains unit tests
.
├── metadata
│   └── teams.yaml
├── schemas
│   └── teams.json
└── tests
    └── test_metadata.py

3 directories, 3 files
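The layout above pairs each metadata file with a schema of the same base name. As a minimal sketch of checking that pairing, the helper below lists files that have no counterpart. The function name find_unpaired and the extra services.yaml file are hypothetical and used only for illustration.

```python
import tempfile
from pathlib import Path


def find_unpaired(root: Path) -> dict:
    """Compare base names in metadata/*.yaml and schemas/*.json under root."""
    metadata = {p.stem for p in (root / "metadata").glob("*.yaml")}
    schemas = {p.stem for p in (root / "schemas").glob("*.json")}
    return {
        "missing_schema": sorted(metadata - schemas),
        "missing_metadata": sorted(schemas - metadata),
    }


# Demo against a throwaway directory that mirrors the layout above,
# plus one hypothetical file (services.yaml) that has no schema.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "metadata").mkdir()
    (root / "schemas").mkdir()
    (root / "metadata" / "teams.yaml").touch()
    (root / "schemas" / "teams.json").touch()
    (root / "metadata" / "services.yaml").touch()
    unpaired = find_unpaired(root)

print(unpaired)
```

A check like this could run as its own test so that a new metadata file cannot be added without a schema.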

Place YAML files in the metadata directory. For example, teams.yaml contains information about teams.

teams:
- name: alpha
  slack: '#team-alpha'
  email: alpha@alphabet.com
  jira:
    labels:
      - alpha
    owner: a@alphabet.com
    scrum_board: Alpha Board
- name: beta
  slack: '#team-beta'
  email: beta@alphabet.com
  jira:
    labels:
      - beta
    owner: b@alphabet.com
    scrum_board: Beta Board
- name: gamma
  email: gamma@alphabet.com
  jira:
    labels:
      - gamma
    owner: g@alphabet.com
    scrum_board: Gamma Board
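Two details of the file above are easy to miss: the slack values are quoted because an unquoted # preceded by a space starts a YAML comment, and the gamma team has no slack key at all. The sketch below, which assumes PyYAML and uses a shortened inline copy of the data, shows what each case looks like after loading.

```python
import yaml

# Shortened copy of the teams data, for illustration only.
SNIPPET = """
teams:
- name: beta
  slack: '#team-beta'
- name: gamma
  email: gamma@alphabet.com
"""

data = yaml.safe_load(SNIPPET)
beta, gamma = data["teams"]

print(beta["slack"])        # the quoted value survives as a string
print("slack" in gamma)     # False: the key is absent, not null
print(gamma.get("slack"))   # None, because .get() supplies a default

# Without the quotes, everything after " #" is a comment and the value is null.
unquoted = yaml.safe_load("slack: #team-beta")
print(unquoted)
```

This is one reason a schema may want to accept null as well as a string for slack: a forgotten pair of quotes produces null rather than a syntax error.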

Write a JSON schema for the YAML file above and save it as schemas/teams.json. Learn more about JSON schema syntax at https://json-schema.org/

{
  "type": "object",
  "properties": {
    "teams": {
      "type": "array",
      "description": "List of teams",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "Name of the team"
          },
          "email": {
            "type": "string",
            "description": "Email address of the team"
          },
          "slack": {
            "type": ["string", "null"],
            "description": "Name of the team slack channel"
          },
          "jira": {
            "type": "object",
            "properties": {
              "labels": {
                "type": "array",
                "description": "List of Jira labels assigned to the team",
                "items": {
                  "type": "string"
                },
                "minItems": 1,
                "uniqueItems": true
              },
              "owner": {
                "type": "string",
                "description": "Email address of the team owner"
              },
              "scrum_board": {
                "type": "string",
                "description": "Name of the team scrum board"
              }
            },
            "additionalProperties": false
          }
        },
        "additionalProperties": false
      },
      "minItems": 1,
      "uniqueItems": true
    }
  },
  "additionalProperties": false
}
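With "additionalProperties": false, a misspelled key is rejected because it is not among the declared properties. The schema above does not use the "required" keyword, though, so an entry that simply omits a field would still pass. The sketch below uses a trimmed-down team schema to show both checks. The "required" list is an addition for illustration and is not part of the original schema.

```python
import jsonschema

# Trimmed-down team schema. "required" is an illustrative addition,
# not something the original teams.json declares.
TEAM_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "slack": {"type": ["string", "null"]},
    },
    "required": ["name", "email"],
    "additionalProperties": False,
}

good = {"name": "alpha", "email": "alpha@alphabet.com", "slack": None}
typo = {"name": "alpha", "emial": "alpha@alphabet.com"}  # "emial" is misspelled

# A valid instance passes silently.
jsonschema.validate(instance=good, schema=TEAM_SCHEMA)

# The misspelled key violates both additionalProperties and required;
# validate() raises a single ValidationError for the instance.
caught = None
try:
    jsonschema.validate(instance=typo, schema=TEAM_SCHEMA)
except jsonschema.ValidationError as err:
    caught = err
    print(err.validator, "-", err.message)
```

Whether to mark fields as required depends on how strict the metadata should be; in the example data, slack is the only field that some teams leave out.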

Write a unit test in tests/test_metadata.py that reads each YAML file and its associated JSON schema, then calls jsonschema.validate to run the validation.

import json

import jsonschema
import pytest
import yaml


@pytest.mark.parametrize('metadata_name', [
    'teams'
])
def test_metadata(metadata_name: str) -> None:
    """
    Validate a metadata YAML file against its JSON schema.

    :param metadata_name: base name shared by the YAML file and its schema
    """
    with open(f'metadata/{metadata_name}.yaml', 'r') as _f:
        metadata = yaml.safe_load(_f)

    with open(f'schemas/{metadata_name}.json', 'r') as _f:
        schema = json.load(_f)

    jsonschema.validate(instance=metadata, schema=schema)
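jsonschema.validate raises on a single error, so a file with several mistakes has to be fixed one test run at a time. As a hedged alternative, a validator object can list every violation together with its location in the document. The sketch below uses a shortened schema and deliberately broken data; picking Draft7Validator is an assumption, since the schema above does not declare a draft.

```python
import jsonschema

# Shortened schema and deliberately broken data, for illustration only.
SCHEMA = {
    "type": "object",
    "properties": {
        "teams": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string"},
                },
                "additionalProperties": False,
            },
        }
    },
}

metadata = {
    "teams": [
        {"name": "alpha", "email": "alpha@alphabet.com"},
        {"name": 42, "email": "beta@alphabet.com"},        # wrong type
        {"name": "gamma", "emial": "gamma@alphabet.com"},  # misspelled key
    ]
}

validator = jsonschema.Draft7Validator(SCHEMA)

# Sort by document location so the report order is stable.
errors = sorted(
    validator.iter_errors(metadata),
    key=lambda e: [str(part) for part in e.absolute_path],
)

locations = []
for error in errors:
    location = "/".join(str(part) for part in error.absolute_path)
    locations.append(location)
    print(f"{location}: {error.message}")
```

In a test, the same loop could build a message and finish with an assertion that the error list is empty, so that pytest reports all problems in one run.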

Set up a virtual environment and install the required packages.

$ virtualenv .venv
$ source .venv/bin/activate
$ pip install jsonschema pytest pyyaml

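Run the unit tests from the top-level directory, since the test opens the metadata and schema files by relative path.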

$ .venv/bin/python -m pytest -v --junitxml=test_output/output.xml tests/

...
tests/test_metadata.py::test_metadata[teams] PASSED
...