Datatype 

sbt-datatype is a code generation library and an sbt autoplugin that generates growable datatypes and helps developers avoid breakage of binary compatibility.

Unlike standard Scala case classes, the datatypes generated by this library allow the developer to add new fields to the defined datatypes without breaking binary compatibility while offering (almost) the same functionalities as plain case classes. The only difference is that datatype doesn’t generate unapply methods, because they would break binary compatibility.

Our plugin takes as input a datatype definition in the form of a JSON object, whose format is based on the format defined by Apache Avro, and generates the corresponding code in Java or Scala along with the boilerplate code that will allow the generated classes to remain binary compatible with previous versions of the datatype.

The source code of the library and autoplugin can be found on GitHub.

Using the plugin 

To enable the plugin for your build, put the following line in project/datatype.sbt:

addSbtPlugin("org.scala-sbt" % "sbt-datatype" % "0.0.3")

Your datatype definitions should be placed by default in src/main/datatype and src/test/datatype. This location can be changed by setting a new location in your build definition:

datatypeSource in generateDatatypes := file("some/location")

The plugin exposes two other settings for Scala code generation:

  1. datatypeScalaFileNames in generateDatatypes This setting accepts a function Definition => File which will determine the filename for every generated Scala definition.
  2. datatypeScalaSealProtocols in generateDatatypes This setting accepts a boolean value, and will determine whether protocols should be sealed or not.

Types generated 

datatype is able to generate three kinds of types:

  1. Protocols
  2. Records
  3. Enumerations

Protocols 

Protocols are mapped to Java abstract classes or Scala sealed abstract classes. They can be extended by other protocols or records.

Example: 
{
  "types": [
    {
      "name": "Person",
      "type": "protocol",
      "target": "Java",
      "fields": [
        {
          "name": "name",
          "type": "String"
        },
        {
          "name": "age",
          "type": "int"
        }
      ]
    }
  ]
}

This definition will generate the following Java class:

public abstract class Person implements java.io.Serializable {
    private String name;
    private int age;
    public Person(String _name, int _age) {
        super();
        name = _name;
        age = _age;
    }
    public String name() {
        return this.name;
    }
    public int age() {
        return this.age;
    }
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        } else if (!(obj instanceof Person)) {
            return false;
        } else {
            Person o = (Person)obj;
            return name().equals(o.name()) && (age() == o.age());
        }
    }
    public int hashCode() {
        return 37 * (37 * (17 + name().hashCode()) + (new Integer(age())).hashCode());
    }
    public String toString() {
        return "Person("  + "name: " + name() + ", " + "age: " + age() + ")";
    }
}

Or the following Scala class (after changing the target property):

abstract class Person(
  val name: String,
  val age: Int) extends Serializable {
  override def equals(o: Any): Boolean = o match {
    case x: Person => (this.name == x.name) && (this.age == x.age)
    case _ => false
  }
  override def hashCode: Int = {
    37 * (37 * (17 + name.##) + age.##)
  }
  override def toString: String = {
    "Person(" + name + ", " + age + ")"
  }
}

Records 

Records are mapped to Java or Scala classes.

Example: 
{
  "types": [
    {
      "name": "Person",
      "type": "record",
      "target": "Java"
      "fields": [
        {
          "name": "name",
          "type": "String"
        },
        {
          "name": "age",
          "type": "int"
        }
      ]
    }
  ]
}

This schema will produce the following Java class:

public final class Person implements java.io.Serializable {
    private String name;
    private int age;
    public Person(String _name, int _age) {
        super();
        name = _name;
        age = _age;
    }
    public String name() {
        return this.name;
    }
    public int age() {
        return this.age;
    }
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        } else if (!(obj instanceof Person)) {
            return false;
        } else {
            Person o = (Person)obj;
            return name().equals(o.name()) && (age() == o.age());
        }
    }
    public int hashCode() {
        return 37 * (37 * (17 + name().hashCode()) + (new Integer(age())).hashCode());
    }
    public String toString() {
        return "Person("  + "name: " + name() + ", " + "age: " + age() + ")";
    }
}

Or the following Scala code (after changing the target property):

final class Person(
  val name: String,
  val age: Int) extends Serializable {
  override def equals(o: Any): Boolean = o match {
    case x: Person => (this.name == x.name) && (this.age == x.age)
    case _ => false
  }
  override def hashCode: Int = {
    37 * (37 * (17 + name.##) + age.##)
  }
  override def toString: String = {
    "Person(" + name + ", " + age + ")"
  }
  private[this] def copy(name: String = name, age: Int = age): Person = {
    new Person(name, age)
  }
  def withName(name: String): Person = {
    copy(name = name)
  }
  def withAge(age: Int): Person = {
    copy(age = age)
  }
}
object Person {
  def apply(name: String, age: Int): Person = new Person(name, age)
}

Enumerations 

Enumerations are mapped to Java enumerations or Scala case objects.

Example: 
{
  "types": [
    {
      "name": "Weekdays",
      "type": "enumeration",
      "target": "Java",
      "types": [
        "Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"
      ]
    }
  ]
}

This schema will generate the following Java code:

public enum Weekdays {
    Monday,
    Tuesday,
    Wednesday,
    Thursday,
    Friday,
    Saturday,
    Sunday
}

Or the following Scala code (after changing the target property to):

sealed abstract class Weekdays extends Serializable
object Weekdays {
  case object Monday extends Weekdays
  case object Tuesday extends Weekdays
  case object Wednesday extends Weekdays
  case object Thursday extends Weekdays
  case object Friday extends Weekdays
  case object Saturday extends Weekdays
  case object Sunday extends Weekdays
}

Existing parameters for protocols, records, etc. 

All the elements of the schema definition accept a number of parameters that will influence the generated code. These parameters are not available for every node of the schema. Please refer to the syntax summary to see whether a parameters can be defined for a node.

name

This parameter defines the name of a field, record, field, etc.

target

This parameter determines whether the code will be generated in Java or Scala.

namespace

This parameter exists only for Definitions. It determines the package in which the code will be generated.

doc

The Javadoc that will accompany the generated element.

fields

For a protocol or a record only, it describes all the fields that compose the generated entity.

types

For a protocol, it defines the child protocols and records that extend it.

For an enumeration, it defines the values of the enumeration.

since

This parameter exists for fields only. It indicates the version in which the field has been added to its parent protocol or record.

When this parameter is defined, default must also be defined.

default

This parameter exists for fields only. It indicates what the default value should be for this field, in case it is used by a class that has been compiled against an ealier version of this datatype.

It must contain an expression which is valid in the target language of the parent protocol or record.

type for fields

It indicates what is the underlying type of the field.

Always use the type that you want to see in Scala. For instance, if your field will contain an integer value, use Int rather than Java’s int. datatype will automatically use Java’s primitive types if they are available.

For non-primitive types, it is recommended to write the fully qualified type.

type for other definitions

It simply indicates the kind of entity that you want to generate: protocol, record or enumeration.

Using datatype to retain binary compatibility 

By using the since and default parameters, it is possible to grow existing datatypes while remaining binary compatible with classes that have been compiled against an earlier version of your datatype definition.

Consider the following initial version of a datatype:

{
  "types": [
    {
      "name": "Greeting",
      "type": "record",
      "target": "Scala",
      "fields": [
        {
          "name": "message",
          "type": "String"
        }
      ]
    }
  ]
}

The generated code could be used in a Scala program using the following code:

val greeting = Greeting("hello")

Imagine now that you would like to extend your datatype to include a date to the Greetings. The datatype can be modified accordingly:

{
  "types": [
    {
      "name": "Greeting",
      "type": "record",
      "target": "Scala",
      "fields": [
        {
          "name": "message",
          "type": "String"
        },
        {
          "name": "date",
          "type": "java.util.Date"
        }
      ]
    }
  ]
}

Unfortunately, the code that used Greeting would no longer compile, and classes that have been compiled against the previous version of the datatype would crash with a NoSuchMethodError.

To circumvent this problem and allow you to grow your datatypes, it is possible to indicate the version since the field exists and a default value in the datatype definition:

{
  "types": [
    {
      "name": "Greeting",
      "type": "record",
      "target": "Scala",
      "fields": [
        {
          "name": "message",
          "type": "String"
        },
        {
          "name": "date",
          "type": "java.util.Date",
          "since": "0.2.0",
          "default": "new java.util.Date()"
        }
      ]
    }
  ]
}

Now the code that was compiled against previous definitions of the datatype will still run.

Syntax summary 

Schema           := { "types": [ Definition* ] }

Definition       := Protocol | Record | Enumeration

Protocol         := {   "name": ID,
                        "type": "protocol",
                        "target": ("Scala" | "Java")
                     (, "namespace": string constant)?
                     (, "doc": string constant)?
                     (, "fields": [ Field* ])?
                     (, "types": [ Definition* ])? }

Record           := {   "name": ID,
                        "type": "record",
                        "target": ("Scala" | "Java")
                     (, "namespace": string constant)?
                     (, "doc": string constant)?
                     (, "fields": [ Field* ])? }

Enumeration      := {   "name": ID,
                        "type": "enumeration",
                        "target": ("Scala" | "Java")
                     (, "namespace": string constant)?
                     (, "doc": string constant)?
                     (, "types": [ EnumerationValue* ])? }

EnumerationValue := ID
                  | {   "name": ID
                     (, "doc": string constant)? }

Field            := {   "name": ID,
                        "type": ID
                     (, "doc": string constant)?
                     (, "since": version number string)?
                     (, "default": string constant)? }

Contents

sbt Reference Manual
    1. Datatype