An Introduction to JSON Schema

Kashyap - April 5, 2014

JSON, or JavaScript Object Notation, has become the most widely used serialization and transport mechanism for information across various web services. From its initial conception, the format garnered swift and wide appreciation for being really simple and non-verbose.

Let’s say you want to consume the following JSON object via an API:

{
  "id": 3232,
  "name": "Kashyap",
  "email": "kashyap@example.com",
  "contact": {
    "id": 123,
    "address1": "Shire",
    "zipcode": "LSQ424"
  }
}

Now, let’s assume that you want to ensure that before consuming this data, email and contact.zipcode must be present in the JSON. If that data is not present, you shouldn’t be using it. The typical way is to check for presence of those fields but this whack-a-mole quickly gets tiresome.
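To make the whack-a-mole concrete, here is a minimal sketch of such hand-written checks in plain Ruby. The missing_fields helper is a hypothetical name made up for this illustration; it only covers the two fields mentioned above, and every new required field would need yet another branch.

```ruby
require 'json'

# Hand-rolled presence checks for email and contact.zipcode.
# Returns the names of the fields that are missing from the payload.
def missing_fields(payload)
  missing = []
  missing << 'email' unless payload.key?('email')

  contact = payload['contact']
  unless contact.is_a?(Hash) && contact.key?('zipcode')
    missing << 'contact.zipcode'
  end

  missing
end

complete = JSON.parse('{"id": 3232, "email": "kashyap@example.com", "contact": {"id": 123, "zipcode": "LSQ424"}}')
partial  = JSON.parse('{"id": 3232, "contact": {"id": 123}}')

puts missing_fields(complete).inspect # => []
puts missing_fields(partial).inspect  # => ["email", "contact.zipcode"]
```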

Similarly, let’s say you are an API provider and you want to let your API users know the basic structure to which the data is going to conform, so that they can automatically test the validity of the data.

If you have ever had to deal with the above two problems, you should be using JSON schemas.

What’s a Schema?

A schema is defined in Wikipedia as a way to define the structure, content, and to some extent, the semantics of XML documents; which probably is the simplest way one could explain it. For every element — or node — in a document, a rule is given to which it needs to conform. Having constraints defined at this level will make it unnecessary to handle the edge cases in the application logic. This is a pretty powerful tool. This was missing from the original JSON specification but efforts were made to design one later on.

Why do we need a Schema?

If you’re familiar with HTML, the doctype declaration on the first line is a schema declaration. (Specific to HTML 4 and below.)

HTML 4 Transitional DOCTYPE declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

This line declares that the rest of the document conforms to the directives specified at the URL http://www.w3.org/TR/html4/loose.dtd. In principle, if you declare the document as strict, then using an unknown element like <sp></sp> makes the document invalid. Browsers are forgiving with plain HTML, but with strictly parsed XHTML a typo or a forgotten closing tag means the page will not get rendered, and your users will end up with an error instead of content.

At first glance, this looks like a pain, and it actually is. That’s part of the reason why this was abandoned altogether in the newer version of HTML. However, HTML is not really a good use case for a schema. Having a well-defined schema upfront helps in validating user input at the language/protocol level rather than at the application’s implementation level. Let’s see how defining a schema makes it easy to handle user input errors.

JSON Schema

The JSON Schema specification is divided into three parts:

  1. JSON Schema Core: The JSON Schema Core specification is where the terminology for a schema is defined. Technically, this is simply the JSON spec, with the only addition being the definition of a new media type, application/schema+json. A more important contribution of this document is the $schema keyword, which is used to identify the version of the schema and the location of a resource that defines a schema. This is analogous to the DOCTYPE declaration in HTML 4.01 and older HTML versions.

    The versions of the schema separate changes in the keywords and the general structure of a schema document. The resource of a schema is usually a webpage which provides a JSON object that defines a specification. Confused? Open the URL http://www.w3.org/TR/html4/loose.dtd in a browser and go through the contents. This is the specification of the HTML 4.01 Transitional (loose) DTD. Declarations like ENTITY, ELEMENT and ATTLIST are used to define the accepted elements, entities and attributes for a valid HTML document.

    Similarly, the JSON Schema Core resource URL (downloads the schema document) defines a superset of constraints.

  2. JSON Schema Validation: The JSON Schema Validation specification is the document that defines the valid ways to define validation constraints. This document also defines a set of keywords that can be used to specify validations for a JSON API. For example, keywords like multipleOf, maxLength, minLength etc. are defined in this specification. In the examples that follow, we will be using some of these keywords.

  3. JSON Hyper-Schema: This is another extension of the JSON Schema spec, in which the hyperlink and hypermedia-related keywords are defined. For example, consider the case of a globally available avatar (or, Gravatar). Every Gravatar is composed of three different components:

    1. A Picture ID,
    2. A Link to the picture,
    3. Details of the User (name and email ID).

    When we query the API provided by Gravatar, we get a response that typically has this data encoded as JSON. This JSON response will not download the entire image but will have a link to the image. Let’s look at a JSON representation of a fake profile I’ve set up on Gravatar:

    {
      "entry":[{
        "id":"61443191",
        "hash":"756b5a91c931f6177e2ca3f3687298db",
        "requestHash":"756b5a91c931f6177e2ca3f3687298db",
        "profileUrl":"http:\/\/gravatar.com\/jsonguerilla",
        "preferredUsername":"jsonguerilla",
        "thumbnailUrl":"http:\/\/1.gravatar.com\/avatar\/756b5a91c931f6177e2ca3f3687298db",
        "photos":[{
          "value":"http:\/\/1.gravatar.com\/avatar\/756b5a91c931f6177e2ca3f3687298db",
          "type":"thumbnail"
        }],
        "name":{
          "givenName":"JSON",
          "familyName":"Schema",
          "formatted":"JSON Schema Blogpost"
        },
        "displayName":"jsonguerilla",
        "urls":[]
      }]
    }

    In this JSON response, the images are represented by hyperlinks but they are encoded as strings. Although this example is for a JSON object returned from a server, this is how traditional APIs handle input as well. This is due to the fact that JSON natively does not provide a way to handle hyperlinks; they are only Strings.

    JSON hyperschema attempts to specify a way to have a more semantic way of representing hyperlinks and images. It does this by defining keywords (as JSON properties) such as links, rel, href. Note that this specification does not try to re-define these words in general (as they are defined in HTTP protocol already) but it tries to normalize the way those keywords are used in JSON.
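To give a feel for the validation keywords mentioned in item 2 above, here is a rough sketch of what multipleOf, minLength and maxLength mean, written as plain Ruby predicates. The helper names are made up for this illustration and are not part of any library; a real validator also handles many more cases.

```ruby
# multipleOf: the number must divide evenly by the given divisor.
def multiple_of?(value, divisor)
  (value % divisor).zero?
end

# minLength / maxLength: the string length must fall inside the bounds.
def length_within?(string, min_length: 0, max_length: Float::INFINITY)
  string.length.between?(min_length, max_length)
end

puts multiple_of?(10, 5)                                   # => true
puts multiple_of?(7, 5)                                    # => false
puts length_within?('Shire', min_length: 1, max_length: 5) # => true
puts length_within?('', min_length: 1)                     # => false
```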

Drafts

The schema is still under development and the progress can be tracked by comparing the versions known as “drafts”. Currently, the schema is in the 4th version. The validation keywords can be dropped or added between versions. This article — and many more over the interwebs — refer to the 4th version of the draft.

Usage

Let’s build a basic JSON API that accepts the following data with some constraints:

  1. A post ID. This is a number and is a required parameter.
  2. Some free-form text with an attribute of body. This is a required parameter.
  3. A list of tags with an attribute of ‘tags’. Our paranoid API cannot accept more than 6 tags though. This is a required parameter.
  4. An optional list of hyperlinks with an attribute of ‘references’

Let’s face it, almost every app you have ever written must have had some constraint or the other. We end up repeating the same verification logic every time. Let’s see how we can simplify that.

We will be using Sinatra for building the API. This is the basic structure of our app.rb:

require 'sinatra'
require 'sinatra/json'
require 'json-schema'

post '/' do
end

The Gemfile:

source 'https://rubygems.org'

gem 'sinatra'
gem 'sinatra-contrib'
gem 'json-schema'

We will be using the JSON-Schema gem for the app. Let’s look at the schema that we will define in a schema.json file:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "required": [ "id", "body", "tags" ],
  "properties": {
    "id": {
      "type": "integer"
    },

    "body": {
      "type": "string"
    },

    "tags": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "maxItems": 6
    },

    "references": {
      "type": "array",
      "items": {
        "type": "string",
        "format": "uri"
      }
    }
  }
}

  1. The properties attribute holds the main chunk of the schema definition. Under this attribute, each individual API attribute is described by a schema of its own.
  2. The required attribute takes a list of strings naming the API parameters that are required. If any of these parameters is missing from the JSON input to our app, validation fails and an error is reported.
  3. The type keyword specifies the schema type for that particular block. So, at the first level, we say it’s an object (analogous to a Ruby Hash). For the body, tags and references, the types are string, array and array respectively.
  4. In case an API parameter can accept an array, the items inside that array can be described by a schema definition of their own. This is done by using an items attribute and defining how each item in the array should be validated.
  5. The format attribute selects one of the built-in formats for validation in the JSON Schema specification. This alleviates the pain of adding regexes for validating common items like uri, ipv4, ipv6, email, date-time and hostname. That’s right, no more copy-pasting URI validation regexes from StackOverflow.
  6. The $schema attribute is a non-mandatory attribute that specifies the version of the schema being used. For our example, we will be using draft 4 of the JSON Schema spec.
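To see what required, type, items and maxItems ask of the data, here is a deliberately tiny checker in plain Ruby. It is only a sketch for building intuition: it is not how the json-schema gem works internally, it ignores format and most other keywords, and names such as schema_errors and POST_SCHEMA are made up for this illustration.

```ruby
# JSON Schema type names mapped to Ruby classes (only the ones we need here).
TYPE_CLASSES = {
  'object'  => Hash,
  'array'   => Array,
  'string'  => String,
  'integer' => Integer
}.freeze

# The same constraints as schema.json, minus $schema and format.
POST_SCHEMA = {
  'type' => 'object',
  'required' => %w[id body tags],
  'properties' => {
    'id'         => { 'type' => 'integer' },
    'body'       => { 'type' => 'string' },
    'tags'       => { 'type' => 'array', 'items' => { 'type' => 'string' }, 'maxItems' => 6 },
    'references' => { 'type' => 'array', 'items' => { 'type' => 'string' } }
  }
}.freeze

# Returns a list of problems; an empty list means the data passed.
def schema_errors(schema, data, path = '#')
  type = schema['type']
  if type && !data.is_a?(TYPE_CLASSES.fetch(type))
    return ["#{path} should be of type #{type}"]
  end

  errors = []
  if data.is_a?(Hash)
    Array(schema['required']).each do |key|
      errors << "#{path} is missing required property '#{key}'" unless data.key?(key)
    end
    (schema['properties'] || {}).each do |key, subschema|
      errors.concat(schema_errors(subschema, data[key], "#{path}/#{key}")) if data.key?(key)
    end
  end

  if data.is_a?(Array)
    max = schema['maxItems']
    errors << "#{path} has more than #{max} items" if max && data.length > max
    if schema['items']
      data.each_with_index do |item, index|
        errors.concat(schema_errors(schema['items'], item, "#{path}/#{index}"))
      end
    end
  end

  errors
end

valid_post = { 'id' => 1, 'body' => 'Hello, Universe', 'tags' => %w[start first] }
puts schema_errors(POST_SCHEMA, valid_post).inspect                    # => []
puts schema_errors(POST_SCHEMA, valid_post.merge('id' => '1')).inspect # => ["#/id should be of type integer"]
```

The nested properties and items schemas are handled by the same method calling itself, which is the "schema of its own" idea from the list above.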

To use this schema in our app, we will create a helper method that validates the input against the schema we just defined. The json-schema gem provides three methods for validation: a validate method that returns either true or false, a validate! method that raises an exception when validation of an attribute fails, and a fully_validate method that builds up an array of errors similar to what Rails’ ActiveRecord#save method provides.

We will be using the JSON::Validator.fully_validate method in our app and return a nicely formatted JSON response to the user if the validation fails.

helpers do
  def validate(json_string_or_hash)
    JSON::Validator.fully_validate('schema.json', json_string_or_hash)
  end
end

Now, we can use this helper inside routes to check the validity of the input JSON like so:

post '/' do
  input = JSON.load(request.body.read)
  # fully_validate returns an array of error messages (empty when valid)
  errors = validate(input)

  if errors.empty?
    json({ message: "The blog post has been saved!" })
  else
    status 400
    json({ errors: errors })
  end
end

If the input is valid, the errors object will be empty. Otherwise, it will hold a list of errors. This object will be returned as a JSON response with the appropriate HTTP status code. For example, if we run this app and send in a request with a missing id parameter, the response will be something similar to the following:

[
  "The property '#/' did not contain a required property of 'id' in
  schema schema.json#"
]

Now let’s say we send in a request where id is a string. The errors object will hold the following:

[
  "The property '#/id' of type String did not match the following type:
  integer in schema schema.json#"
]

Last example. Let’s try sending a references parameter with a malformed URI. We will send the following request:

{
  "id": 1,
  "body": "Hello, Universe",
  "tags": ["start", "first"],
  "references": [ "data:image/svg+xml;base64 C==" ]
}

(This input is in the file not_working_wrong_uri.txt)

curl \
  -d @not_working_wrong_uri.txt \
  -H 'Content-Type: application/json' \
  http://localhost:4567

The output of this would be:

[
  "The property '#/references/0' must be a valid URI in schema
  schema.json#"
]
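For intuition about why that input is rejected, here is a small sketch of how a uri check could be approximated with Ruby’s standard URI library. This is an assumption made for illustration, not a description of what the json-schema gem does internally, and looks_like_uri? is a made-up helper name.

```ruby
require 'uri'

# True when Ruby's URI parser accepts the string and finds a scheme.
def looks_like_uri?(candidate)
  !URI.parse(candidate).scheme.nil?
rescue URI::InvalidURIError
  # Raised for strings the parser cannot accept, e.g. ones with a raw space.
  false
end

puts looks_like_uri?('http://example.com/posts/1')     # => true
puts looks_like_uri?('data:image/svg+xml;base64 C==')  # => false
```

The second string fails because of the unescaped space in the middle of it.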

Thus, with a really simple validation library and a standard that library implementers in different languages follow, we can achieve input validation with a really simple setup. One great advantage of following a schema standard is that we can rely on the same basic behaviour no matter which language implements the schema. For example, we can use the same schema.json description with a JavaScript library to validate user input in the front-end of the API we’ve just built.

Summary

The full app and some sample input files are present in this repo. The json-schema gem is not yet official and might have some unfinished components (for example, the format validations of hostname and email for a string type have not been implemented yet), and the JSON Schema specification itself is under constant revision. But that doesn’t mean it’s not ready for usage. A few of our developers use the gem in one of our projects and are pretty happy with it. Try out the gem and go through the specification to get an idea of why this would be beneficial.

More Reading

  1. Understanding JSON Schema
  2. JSON Schema Documentation
  3. This excellent article by David Walsh
  4. JSON Schema Example: This example uses more keywords that weren’t discussed in this post. For example, title and description.

Form object validations in Rails 4

Yuva - March 22, 2014

Of late, at Codemancers, we are using form objects to decouple forms in views. This also helps in cleaning up how the data filled in by the end user is consumed and persisted in the backend. So far, the results have been good.

What are form objects

This blog post assumes that you are already familiar with form objects. Railscasts has a nice screencast about form objects. Do check it out if you haven’t already.

Use case

Let’s say that there is an organization and it has several employees. We’re tasked to build a Rails app that provides an interface where an admin can select one or more employees and send them emails. A typical interface implementation might look like this:

Employee email form

After selecting employees, filling-in the subject and body, and upon clicking “Send”, the backend should send emails to the selected employees. This is done by passing the array of the ids of employees, the subject and body to the backend. The POST parameters for that request look like this:

{
  "utf8"=>"",
  "email_form"=>{"employee_ids"=>[""], "subject"=>"", "body"=>""},
  "commit"=>"Send emails to employees"
}

Mass mailer form

We will create an EmployeeMassMailerForm form object to encapsulate the validations and the actual action of sending emails. This form should accept the params sent by the form, perform validations such as checking whether all the employee ids belong to the organization, and then send the emails.

class Organization < ActiveRecord::Base
  has_many :employees

  # Returns only those employees of this organization whose ids are in the list
  def get_employees(ids)
    employees.where(id: ids)
  end
end

class EmployeeMassMailerForm
  include ActiveModel::Model

  attr_accessor :organization, :employee_ids, :subject, :body

  validates :organization, :employee_ids, :subject, :body, presence: true
  validate  :employee_ids_should_belong_to_organization

  def perform
    return false unless valid?

    @employees = organization.get_employees(employee_ids)
    @employees.each { |e| schedule_email_for(e) }
    true
  end

  private

  def employee_ids_should_belong_to_organization
    if organization.get_employees(employee_ids).length != employee_ids.length
      errors.add(:employee_ids, :invalid)
    end
  end

  def schedule_email_for(e)
    Mailer.send_email(e, subject, body)
  end
end

With Rails 4, ActiveModel ships with the Model module, which helps in assigning attributes just like you can with an ActiveRecord class, along with helpers for validations. It is no longer necessary to use other libraries for form objects. Just include ActiveModel::Model in a PORO class and you are good to go.
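If you want to play with the shape of a form object outside a Rails app, the following dependency-free sketch imitates the same idea in plain Ruby. It is not ActiveModel and does not reproduce its API: PlainMassMailerForm, FakeOrganization and the block passed to perform are all made up for this illustration.

```ruby
# A plain-Ruby stand-in: hash-based initialisation, an errors list and valid?.
class PlainMassMailerForm
  FIELDS = %i[organization employee_ids subject body].freeze

  attr_accessor(*FIELDS)
  attr_reader :errors

  def initialize(attributes = {})
    @errors = []
    attributes.each { |name, value| public_send("#{name}=", value) }
  end

  def valid?
    @errors = FIELDS.select { |field| blank?(public_send(field)) }
                    .map { |field| "#{field} can't be blank" }
    # Only run the ownership check once the presence checks have passed.
    if @errors.empty? && organization.get_employees(employee_ids).length != employee_ids.length
      @errors << 'employee_ids is invalid'
    end
    @errors.empty?
  end

  # Yields each selected employee with the subject and body; the caller decides how to send.
  def perform
    return false unless valid?

    organization.get_employees(employee_ids).each { |e| yield e, subject, body }
    true
  end

  private

  def blank?(value)
    value.nil? || (value.respond_to?(:empty?) && value.empty?)
  end
end

# A fake organization that filters an in-memory list instead of hitting a database.
FakeOrganization = Struct.new(:employees) do
  def get_employees(ids)
    employees.select { |employee| ids.include?(employee[:id]) }
  end
end

org  = FakeOrganization.new([{ id: 1, email: 'a@b.com' }, { id: 2, email: 'b@c.com' }])
form = PlainMassMailerForm.new(organization: org, employee_ids: [1, 2], subject: 'Hi', body: 'Hello')

sent = []
form.perform { |employee, _subject, _body| sent << employee[:email] }
puts sent.inspect # => ["a@b.com", "b@c.com"]
```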

Testing using rspec and shoulda

All the form objects can be broken down into 2 main sections:

  1. Validations
  2. Performing actions

Testing validations

Adding validations on forms and models is pretty straightforward. Except for database-related validations like uniqueness, all the ActiveRecord validations can be used on form objects. These validations also make it easy to display validation errors in the view.

At Codemancers, we mostly use rspec and shoulda for testing. Validations on forms can be tested like this:

describe EmployeeMassMailerForm do
  describe 'Validations' do
    it { should validate_presence_of(:organization) }
    it { should validate_presence_of(:employee_ids) }
    it { should validate_presence_of(:subject)      }
    it { should validate_presence_of(:body)         }

    context 'when employee ids belong to organization' do
      it 'validates form successfully' do
        employee_ids = [1, 2]
        organization = mock_model(Organization, get_employees: employee_ids)

        form = described_class.new(organization: organization, subject: 'Test',
                                   employee_ids: employee_ids, body: 'Test')
        expect(form).to be_valid
      end
    end

    context 'when one or more employee ids do not belong to organization' do
      it 'fails to validate the form' do
        organization = mock_model(Organization, get_employees: [])

        form = described_class.new(organization: organization, subject: 'Test',
                                   employee_ids: [1, 2, 3], body: 'Test')
        expect(form).to be_invalid
      end
    end
  end
end

Notice that while validating employee ids, we use stubs and mock models so that the tests never hit the database. Testing that a form is valid is a bit hard, because one has to stub and mock models heavily until the form becomes valid. Testing an invalid form is easier to write and often easier to maintain. Notice that we do not care what get_employees returns in that case, so we hard-coded it to an empty array whose length is 0. Always try to put as many validations as possible on the form object, so that very few exceptions are raised while performing actions.

Testing actions performed by form

Once all the validations pass, the form object will go ahead and perform the action it is supposed to do. It can be anything from sending emails to persisting objects to the database. Let’s see how we can test the perform action of the form above.

describe EmployeeMassMailerForm do
  describe '#perform' do
    let(:organization) do
      employees = [stub(email: 'a@b.com'), stub(email: 'b@c.com')]
      mock_model(Organization, get_employees: employees)
    end

    let(:form) do
      described_class.new(organization: organization, subject: 'Test',
                          employee_ids: [1, 2], body: 'Test')
    end

    before(:each) do
      described_class.any_instance.should_receive(:valid?).and_return(true)
      InvitesMailer.deliveries.clear
    end

    it 'sends emails to all employees' do
      form.perform
      expect(InvitesMailer.deliveries.length).to eq 2
    end

    it 'returns true' do
      expect(form.perform).to be_true
    end
  end
end

The trick here is to hard-code valid? to return true in the before block. Since we have already tested the validations, we can safely do this. It saves a bunch of db calls and mocks.

I hope you enjoyed this article and if you want to keep updated with latest stuff we are building or blogging about, follow us on twitter @codemancershq.

Random Ruby tips from trenches #1

Hemant - January 7, 2014

  1. Rubocop and Flycheck :

    Flycheck is an Emacs mode which gives us IDE-like warnings in Emacs. I am already using enh-ruby-mode, which helps with some of the syntax errors and stuff, but what is nice about Flycheck is that it integrates with rubocop and shows rubocop errors in-place in the editor.

    A picture is worth a thousand words, so:

    rubocop with flycheck

  2. pry --gem:

    pry --gem opens a pry session with ./lib added to $LOAD_PATH and ‘require’s the gem. A good shortcut while working on gems and you want a quick console with the gem loaded.

  3. ruby -S :

    This can be used for running binaries which are present in local directory. For example, if you are working on bundler and want to run bundle command with local version of bundler rather than one installed globally you can use:

       ruby -I ./lib/ -S ./bin/bundle install

    The advantages are:

    • ruby -S lets you ignore the #!/usr/bin/env ruby line: even if the current version of Ruby is X, you can run the command with another version of Ruby. This is especially useful for running scripts with JRuby, Rbx etc.

    • Running a script via ruby -S (or jruby -S) consults the RUBYPATH environment variable, which effectively lets the search path be modified. For example, running jruby -S gem does not run the gem that is in the current path; it runs the gem command provided by JRuby, because JRuby defines a different RUBYPATH.

  4. Faster rbenv load :

    If you are using rbenv and there is a lag while opening a new shell, consider updating the rbenv initializing line in your shell profile to:

     eval "$(rbenv init - --no-rehash)"
    

    The --no-rehash flag tells rbenv to skip automatic rehash when opening a new shell and speeds up the process. This also speeds up VIM if you are using vim-rails or vim-ruby.
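Going back to the ruby -S tip: the lookup order it describes (RUBYPATH first, then PATH) can be sketched in a few lines of Ruby. This is only an approximation for building intuition; the interpreter does the real lookup itself, and find_script is a made-up name.

```ruby
# Looks for a script in the RUBYPATH directories first, then in PATH,
# and returns the first match (or nil when nothing is found).
def find_script(name, env = ENV)
  dirs = [env['RUBYPATH'], env['PATH']]
         .compact
         .flat_map { |list| list.split(File::PATH_SEPARATOR) }
         .reject(&:empty?)

  dirs.map { |dir| File.join(dir, name) }.find { |path| File.file?(path) }
end

# Prints wherever a script called "gem" would be picked up on this machine.
puts find_script('gem').inspect
```

Passing a plain Hash instead of ENV makes it easy to try different RUBYPATH and PATH combinations without touching the real environment.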

CSS3 Box Model behaviour

Vijay Sharma - November 17, 2013

What is CSS Box Model?

The W3C defines it as follows: “The CSS box model describes the rectangular boxes that are generated for elements in the document tree and laid out according to the visual formatting model.”

It means that any HTML element on a web page can be represented using a box. Yes, a web page is full of boxes. These boxes are made up of four basic components which affect their representation on a page: Content, Padding, Border and Margin. An HTML element represented by a box can have all four components, or at least one component, which is Content. Normally a box model can be represented by the following illustration.

Box Model

A bit of history

Prior to IE6, Internet Explorer had its own box model, called the Internet Explorer box model, which was considered buggy. As per the CSS 1 specification, any width and height applied to an element via CSS apply only to the content area. Any padding, border and margin are added on top of the content area. This is how width and height are applied to element boxes, and it still holds true for all browsers; even the CSS3 spec makes this the default way of handling box models. In Internet Explorer’s model, however, any width and height applied to an element did not cover only the content area: they covered the content area including padding and border. So an element normally appeared narrower in IE compared to other browsers. For backwards compatibility, IE supported this model in quirks mode. You can trigger quirks mode in IE by using an HTML 3 or earlier DTD, or by removing the DTD completely.

Strangely, over the years we realized that the model considered buggy is actually what web authors need. CSS3 brought back this so-called buggy model, and now we have two box models which web authors can choose between as per their requirements. These two box models can be selected via CSS with the box-sizing property, which takes the values content-box | border-box | inherit.

The image below shows how the two models behave.

Compare Box Model

We are happy with what is there by default. Is it useful?

In my opinion, having the flexibility of choosing box-model of your preference has made life much easier when it comes to coding HTML. Major browsers are already supporting it with vendor prefixes like below.

div {
   -moz-box-sizing: border-box;
   -webkit-box-sizing: border-box;
   box-sizing: border-box;
}

Example:

Say we need a two-column layout: a left column with 40% width and a right column with 60% width. With the default box model, or box-sizing: content-box, we create two floated divs with 40% and 60% width respectively, without adding any padding or margin, or else it will add to the overall width and break the layout. These two divs act as wrappers for the left and right content. Any padding intended will have to be applied to an inner container div inside each wrapper.

.div40, .div60 {
  float: left;
}

.div40 {
  width: 40%;
}

.div60 {
  width: 60%;
}

/* For Padding */
.div40 > div,
.div60 > div {
  padding: 20px;
}

Using the border-box model, there’s no need for an extra div for padding, which takes us one step ahead in semantic coding. Let’s see the code.

.div40, .div60 {
  -moz-box-sizing: border-box;
  -webkit-box-sizing: border-box;
  box-sizing: border-box;
  float: left;
  padding: 20px; /* Padding on the same div is taken out of the specified width of each div. The same is true for borders. */
}

.div40 {
  width: 40%;
}

.div60 {
  width: 60%;
}
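The arithmetic behind the two layouts can be checked with a few lines of Ruby. This is just a back-of-the-envelope sketch that assumes a 1000px wide container; rendered_width is a made-up helper, and it ignores margins and the edge case where padding and border exceed the declared width.

```ruby
# Rendered width of a box in CSS pixels; padding and border are per side.
def rendered_width(width, padding: 0, border: 0, box_sizing: :content_box)
  case box_sizing
  when :content_box then width + 2 * (padding + border) # extras are added on top
  when :border_box  then width                          # extras are carved out of width
  else raise ArgumentError, "unknown box-sizing: #{box_sizing}"
  end
end

container = 1000
left      = container * 40 / 100 # 400
right     = container * 60 / 100 # 600

content_box_total = rendered_width(left, padding: 20) + rendered_width(right, padding: 20)
border_box_total  = rendered_width(left, padding: 20, box_sizing: :border_box) +
                    rendered_width(right, padding: 20, box_sizing: :border_box)

puts content_box_total # => 1080, wider than the container, so the floats wrap
puts border_box_total  # => 1000, fits exactly
```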

This example is just an illustration of CSS Box Model. You are the best judge for the kind of box-model you are going to use in your project.

Hope you have enjoyed this article, and follow us on twitter @codemancershq for all the awesome blogs, or you can use rss feeds.

Rendering images for Retina or any high DPI screens.

Vijay - November 3, 2013

Authoring HTML and CSS has come a long way. As opposed to the practices of the past, today we focus on lots of things while authoring a page. We follow standards and best practices and create semantic HTML with visual properties separated out into CSS. Here I’ll not talk about writing semantic HTML and CSS, but will focus on rendering images for the latest devices with a device pixel ratio of 2 (Retina display), 3 (Galaxy S4) and even more in the near future. Does it really need to be taken into account? Yes, if you are building a responsive site, because images created for a normal display will look pixelated and blurred on these devices.

To understand this better, let’s look at some basic terms associated with responsive coding.

dpi, device pixel ratio (dpr), dip and dppx are a few common units that we should familiarise ourselves with.

Find more on units

Example:

An image of 64px x 64px on a dpr 1 display will use 64 screen pixels horizontally and 64 vertically, whereas on a Retina display it will be rendered with 128 screen pixels horizontally and 128 vertically. This stretching makes the image look blurry. It is even worse on a display with dpr 3 (like the Galaxy S4).

Screen comparison

10px by 10px image rendered on different screens.

Coming back, how do we then render images without blurring or pixelating them?

Use svg images

Scalable vector graphics are supported by most browsers. We can create most of our graphics as .svg files and use them; they can even be used as sprites. This solves the problem only partially, as we may not be able to create an svg for every graphic, especially photographs. In that case we can use media queries.

Media Icons

The icons above are in svg format and will look perfect on all devices. Check by zooming the page.

Note: svg is not supported in IE versions below 9. The SVG Web polyfill is available for that.

CSS media queries

For example, say we have a breakpoint for responsive design at 768px (tablet) and we are displaying a background image for both normal and Retina displays. As this image cannot be converted to svg, we will use media queries. The image size is, say, 250px x 400px and the file name is image1.jpg. For the Retina display we will create the image at 500px x 800px and name it image1@2x.jpg and similarly, for devices with dpr 3, create the same image at 750px x 1200px and name it image1@3x.jpg. The @2x or @3x suffix just denotes that the image is scaled 2 times or 3 times. Any other consistent suffix works well as long as you decode it correctly.
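The naming and sizing rule above is simple multiplication, which the following Ruby sketch spells out. The retina_variants helper is hypothetical and only computes names and dimensions; it does not resize any image.

```ruby
# For each device pixel ratio, work out the file name and pixel dimensions
# of the asset that should be exported.
def retina_variants(file, width, height, ratios = [1, 2, 3])
  ext  = File.extname(file)
  base = File.basename(file, ext)

  ratios.map do |ratio|
    name = ratio == 1 ? file : "#{base}@#{ratio}x#{ext}"
    { file: name, width: width * ratio, height: height * ratio }
  end
end

retina_variants('image1.jpg', 250, 400).each do |variant|
  puts "#{variant[:file]}: #{variant[:width]}px x #{variant[:height]}px"
end
# image1.jpg: 250px x 400px
# image1@2x.jpg: 500px x 800px
# image1@3x.jpg: 750px x 1200px
```

Whatever the ratio, the CSS background-size stays at the base 250px x 400px, which is why the larger files look sharp instead of bigger.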

@media only screen and (min-width: 768px) {
  /* non-retina */
  .image1{
    background-image:url('image1.jpg');
  }
}

@media only screen and (-webkit-min-device-pixel-ratio: 2) and (min-width: 768px),
only screen and (min--moz-device-pixel-ratio: 2) and (min-width: 768px),
only screen and (-o-min-device-pixel-ratio: 2/1) and (min-width: 768px),
only screen and (min-device-pixel-ratio: 2) and (min-width: 768px),
only screen and (min-resolution: 2dppx) and (min-width: 768px) {
  /* retina display. dpr 2 */
  .image1{
      background-image:url('image1@2x.jpg');
      background-size: 250px 400px; /* same size as normal image would be or on web it will scale up 2 times */
  }
}

@media only screen and (-webkit-min-device-pixel-ratio: 3) and (min-width: 768px),
only screen and (min--moz-device-pixel-ratio: 3) and (min-width: 768px),
only screen and (-o-min-device-pixel-ratio: 3/1) and (min-width: 768px),
only screen and (min-device-pixel-ratio: 3) and (min-width: 768px),
only screen and (min-resolution: 3dppx) and (min-width: 768px) {
  /* dpr 3 */
  .image1{
      background-image:url('image1@3x.jpg');
      background-size: 250px 400px; /* same size as normal image would be or on web it will scale up 3 times */
  }
}

Hope you have enjoyed this article, and follow us on twitter @codemancershq for all the awesome blogs, or you can use rss feeds.