Create a Vocabulary from a JSON schema

Suppose your company belongs to an industry consortium that has defined a standard format for JSON messages exchanged between suppliers and customers. The consortium may opt to define a JSON schema for those messages. A JSON schema provides a greater ability to define valid content for JSON payloads.

JSON Schema is still in the early stages of adoption. It is primarily used when different organizations need a formal definition of an agreed-upon data model. Using a JSON schema has advantages for Vocabulary generation, such as options for defining enumerated values and for transcribing comments into the Vocabulary. Be careful: some schemas are very large and contain more than you need. You may want to cut the schema down to just what you need before generating the Vocabulary.

Note: Corticon uses JSON Schema Draft-07 to infer the patterns in the given source, whether that source is a JSON payload file or a JSON schema file, and makes its best effort to set up the entire Vocabulary complete with associations. You might be using a different draft. As the specification is refined, newer drafts add improvements to the schema.

Sample JSON Schema

The following code is an example of a JSON schema:

{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "type": "object",
   "properties": {
      "BillingAddress": {
         "description": "Address to where a Customer's invoice must go",
         "type": "object",
         "properties": {
            "Zip": {
               "type": "string"
            },
            "State": {
               "type": "string"
            },
            "Address2": {
               "type": "string"
            },
            "Address1": {
               "type": "string"
            },
            "City": {
               "type": "string"
            }
         }
      },
      "CompanyName": {
         "type": "string"
      },
      "Phone": {
         "type": "string"
      },
      "ShippingAddress": {
         "description": "Address to where a Customer's product must go",
         "type": "object",
         "properties": {
            "Zip": {
               "type": "string"
            },
            "State": {
               "type": "string"
            },
            "Address2": {
               "type": "string"
            },
            "Address1": {
               "type": "string"
            },
            "City": {
               "type": "string"
            }
         }
      },
      "Notes": {
         "type": "string"
      },
      "Contact": {
         "type": "string"
      }
   }
}

To populate a Vocabulary from a JSON schema

  1. Copy the preceding JSON, and then save it in a file named CustomerSchema.json.
  2. In Corticon Studio, create a new Rule Project named CustomerSchema.
  3. In the project, create a new Rule Vocabulary named CustomerSchema.
  4. Click in the Vocabulary edit window, and then select Vocabulary > Populate Vocabulary from JSON.
  5. Select the sample file CustomerSchema.json, and then click Open.

The Vocabulary that the JSON schema generates is the following:

How Corticon generates a vocabulary from JSON

To generate a vocabulary from a JSON schema document, Corticon examines the contents of the document to identify the entities in the document, their attributes, and their associations. Where the JSON does not define data types, Corticon infers the data type of each attribute from the values present.

The process of inferring the schema is essentially as follows:

  • Entities: Entity names follow Corticon naming conventions, and the first character of each entity name is uppercased.
    • The Root entity is always generated.
    • If an existing entity has already been mapped to a JSON object, use that entity.
    • If no entity is found, then create a new entity, and set the entity name to the object name.
  • Attributes: For each attribute in an Entity:
    • If an entity has no attributes, assign it one string attribute with the name item.
    • Create a new attribute in the Entity, using the attribute name. Duplicate names are not allowed, including names that differ only in case.
    • Data type
      • For a JSON schema where a data type is specified, use that data type.
      • For a JSON instance:
        • For a number that can be successfully converted to a relevant Java Date, set its data type as DateTime.
        • For a number with a decimal point, set its data type as Decimal.
        • For a number without a decimal point, set its data type as Integer.
        • For a string that is an ISO 8601 value, set its data type as DateTime; otherwise, set it as String.
        • For an attribute whose value is null, set its data type as String.
        • For an empty array, it is a String array.
  • Associations: Association role names are auto-assigned.
    • Arrays are specified as a one-to-many with its corresponding parent entity.
    • Associations are not bidirectional.
    • Neither end is mandatory.
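
As an illustration of the JSON instance rules above, consider the following hypothetical payload. All names and values are invented for this sketch and do not come from a Corticon sample. Applying the rules as listed, CompanyName would be expected to become a String, CreditLimit a Decimal, CustomerSince a DateTime, Fax a String, and Tags a String array. EmployeeCount and OrderId would be expected to become Integers, assuming Corticon does not treat those numbers as date values. The Orders array would be expected to produce a one-to-many association from its parent entity to an entity created for the Orders objects.

```json
{
   "CompanyName": "Acme Supply",
   "CreditLimit": 2500.75,
   "EmployeeCount": 120,
   "CustomerSince": "2019-03-15T09:30:00Z",
   "Fax": null,
   "Tags": [],
   "Orders": [
      {
         "OrderId": 1001,
         "Total": 99.95
      }
   ]
}
```

Treat this as a way to read the rules rather than as a statement of exact generator output. Generate the Vocabulary from your own payload to confirm the result.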

How descriptions in your schema are handled

The JSON Schema specification has description attributes that can be used to document your data structure. The Vocabulary Generator puts the description fields in the schema into the Vocabulary's Comments tab, as shown:

How enumerations in your schema are handled

A JSON schema might contain enumerations. When the Vocabulary Generator sees an enum tag, it creates a Custom Data Type for that enumeration and uses it as the attribute's data type.
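
For illustration, a minimal Draft-07 schema with an enum might look like the following. The title, description, and enum values are borrowed from the allOf fragment later in this topic. The surrounding structure, including the City property, is an assumption made to keep the sketch self-contained.

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "type": "object",
   "properties": {
      "ShippingAddress": {
         "type": "object",
         "properties": {
            "City": {
               "type": "string"
            },
            "type": {
               "title": "Address Type Enumeration",
               "description": "Specifies if the address is a Business or Residence",
               "type": "string",
               "enum": [ "residential", "business" ]
            }
         }
      }
   }
}
```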

When a schema with an enum populates the Vocabulary, it generates a custom data type:

The type attribute is still String, but its Data Type is now the custom data type TypeEnumeration, as shown:

How references in your schema are handled

The JSON Schema specification provides for the use of $ref attributes to have a single definition of an object that can then be incorporated elsewhere in the schema. An example is an address object defined once and included as part of customer and supplier objects in the schema.

When Corticon generates a vocabulary from JSON schema, associations will be added from the referring entity to the target entity. In the example, the generated vocabulary would contain Customer, Supplier, and Address entities. Corticon then adds associations from both Customer and Supplier entities to the Address entity.
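
A minimal sketch of such a schema follows. It assumes the shared object is declared under definitions and referenced with a local JSON pointer. The property names are illustrative only and are not taken from a Corticon sample.

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "definitions": {
      "address": {
         "type": "object",
         "properties": {
            "Address1": { "type": "string" },
            "City": { "type": "string" },
            "State": { "type": "string" },
            "Zip": { "type": "string" }
         }
      }
   },
   "type": "object",
   "properties": {
      "Customer": {
         "type": "object",
         "properties": {
            "CompanyName": { "type": "string" },
            "Address": { "$ref": "#/definitions/address" }
         }
      },
      "Supplier": {
         "type": "object",
         "properties": {
            "CompanyName": { "type": "string" },
            "Address": { "$ref": "#/definitions/address" }
         }
      }
   }
}
```

Each $ref stands alone in its property. In Draft-07, other keywords placed next to a $ref are ignored, so a description is better placed on the referenced definition or wrapped with allOf.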

How to extend type definitions in your schema

The JSON Schema specification allows you to specify different validation rules through the use of oneOf, anyOf, or allOf tags. For the most part, these tags do not affect vocabulary generation except when used to extend a type definition. In the following example, the Type enumeration was added to the address definition because it is needed for ShippingAddress. However, it is not needed for other types of addresses, so does it make sense to include it, optionally, in all addresses? This is where the allOf tag comes in handy. You can use it to extend the address type only for the ShippingAddress. A schema fragment that uses allOf is shown:

...
      "ShippingAddress": {
         "description": "Address to where a Customer's product must go",
         "allOf": [
            { "$ref": "#/definitions/address" },
            { "properties":
               { "type": 
                  {
                  "title": "Address Type Enumeration",
                  "description": "Specifies if the address is a Business or Residence",
                  "enum": [ "residential", "business" ]
                  }
               }
            }
         ]
      }
...

The difference between the vocabulary generated by this schema and the previous one is that the type attribute appears only in the ShippingAddress entity and not in the BillingAddress entity.
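
To see the fragment in context, a complete minimal schema might look like the following. The address definition and the BillingAddress reference are assumptions reconstructed from the sample schema at the start of this topic. Only the ShippingAddress portion comes from the fragment above.

```json
{
   "$schema": "http://json-schema.org/draft-07/schema#",
   "definitions": {
      "address": {
         "type": "object",
         "properties": {
            "Address1": { "type": "string" },
            "Address2": { "type": "string" },
            "City": { "type": "string" },
            "State": { "type": "string" },
            "Zip": { "type": "string" }
         }
      }
   },
   "type": "object",
   "properties": {
      "BillingAddress": {
         "$ref": "#/definitions/address"
      },
      "ShippingAddress": {
         "description": "Address to where a Customer's product must go",
         "allOf": [
            { "$ref": "#/definitions/address" },
            {
               "properties": {
                  "type": {
                     "title": "Address Type Enumeration",
                     "description": "Specifies if the address is a Business or Residence",
                     "enum": [ "residential", "business" ]
                  }
               }
            }
         ]
      }
   }
}
```

Because the enum is declared inside the allOf for ShippingAddress rather than in the shared address definition, BillingAddress picks up only the common address properties.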