Validating datasets

The BlobToolKit Validator is included alongside the Specification and can be used to validate BlobDir datasets.

To run the Validator, clone the Specification GitHub repository and ensure that the ujson and fastjsonschema Python modules are available in your environment:

mkdir -p /home/ubuntu/blobtoolkit
cd /home/ubuntu/blobtoolkit
git clone https://github.com/blobtoolkit/specification

pip install ujson fastjsonschema

Run the Validator by passing the meta.json file of your BlobDir dataset as an argument:

/home/ubuntu/blobtoolkit/specification/validate.py /path/to/BlobDir/meta.json

For a valid dataset, the Validator reports VALID and exits with a zero exit code. Otherwise, it throws an error and prints an accompanying message.
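
Because a valid dataset produces a zero exit code, the result can be checked from a script. The following is a minimal sketch rather than part of the Validator itself: the check_blobdir function and the VALIDATOR variable are hypothetical names introduced here for illustration, and the sketch assumes that a failed validation exits with a non-zero code.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Specification repository.
# VALIDATOR defaults to the install path used above; override it to match your setup.
VALIDATOR="${VALIDATOR:-/home/ubuntu/blobtoolkit/specification/validate.py}"

check_blobdir() {
    # $1 is the path to a BlobDir directory that contains meta.json
    if "$VALIDATOR" "$1/meta.json"; then
        echo "OK: $1"
    else
        # Assumption: any validation error results in a non-zero exit code
        echo "FAILED: $1" >&2
        return 1
    fi
}
```

Once the function is defined, calling check_blobdir /path/to/BlobDir prints the Validator output followed by an OK or FAILED line, and the function's own return status can be used to stop a larger pipeline when a dataset does not validate.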