In this guide, we don't aim to teach you programming in ATS (Applied Type System), but we need to tell you enough so that a seasoned ML programmer can quickly adjust to the terminology used in ATS and start reading existing ATS code.
In ATS predecessor Dependent ML, the static world only sees a very limited presence of the dynamic world through dependent types. Each dynamic expression has a type that can be indexed with a static expression. Static constraints are collected from static expressions, and type checking involves solving these constraints to decide satisfiability.
In addition to dependent types, ATS also has proofs, which allow the static world to become more involved in dynamics by allowing programmers to write a dynamic expression and a proof term side-by-side to establish correspondence. Proofs can either be classical propositions, called props, or linear propositions (in the sense of linear logic—see margin note) called views.
To those familiar with Prolog, defining props is very much like defining a Prolog predicate. However, the type checker doesn't solve these predicates automatically, and you have to manually manipulate proofs like how you manipulate program data. Proof manipulation allows ATS to be used as a theorem prover.
Finally, you can draw the dichotomy of dynamics and statics as follows, in the form of "some kind of dynamic terms are characterized by a particular static sort."
In ATS, files can have several suffixes.
One notable difference between .dats and .sats is the declaration of function and value signatures.
However, more typically, .sats would define the type of a function to be implemented in .dats. In such case, the implementation in .dats does not need type annotation.
It is important to understand that, unlike ML, ATS type checker only performs very limited type inference. You have to explicitly annotate enough types (mostly arguments to a function) so ATS can reconstruct typing derivation.
ATS is a bit like ML. You can define type and datatype.
From the SML code, there is a predefined type int, and here we define a new type name t aliased to int. We can define aliases to types parameterized by some type variables. We also define an option datatype that is parameterized by a type variable a. These identifiers are all static identifiers, in contrast to dynamic identifiers which are run-time variables.
In ATS, notice that type variables for datatype constructors need to be explicitly applied. For more information about datatype, see Datatypes in the ATS tutorial, but you may want to read the rest of this section first.
Type definition assigns a static identifier to a static expression with the sort "type", so typedef is a special case for stadef. The following two static declarations are almost the same.
The only difference is that typedef ensures that t will end up having the "type" sort, and stadef doesn't particularly care.
A type is one of the many sorts that a static expression or static identifier can be. Other sorts are:
You can define a new sort using datasort. The first few lines of sllist.dats demonstrate how to define a sort that resembles a list. You can also alias sorts using sortdef, which also lets you piggy-back constraints to a sort called subset sort. See prelude/sortdef.sats for examples. A constraint is simply a static expression of a boolean sort.
For each of the type, view, prop, it is possible to alias, declare abstract quantities, and form algebraic constructors.
Type, view, prop, and viewtype are special cases of statics, so whenever you see typedef, viewdef, propdef and viewtypedef, you can just use stadef. Similarly, for abstract declarations, you can use sta for abstype, absview, absprop, and absviewtype. However, datasort, datatype, dataview, dataprop and dataviewtype are not interchangeable. They define algebraic constructors for a new "thing" (e.g. datatype gives you constructors for a new type, dataview gives you constructors for a new view, and so on).
View, prop, and viewtype will not be discussed now. Here we focus on types and simple static expressions.
Static expression is a first-order language that is sort checked. Here are some examples.
The static function nat_lt can be used whenever a bool sort is required. For example, if we want to constraint the range of a run-time integer argument:
We can then use the type NatLt(n) to write a function:
You can look at prelude/basics_sta.sats, which is part of ATS/Anariats, for the signatures of many built-in static operators. Notice that stadef allows many such operators to be overloaded. The fixity of many infix binary operators are defined in prelude/fixity.sats.
The first distinction is type with a fixed size, denoted simply as type, and type with an arbitrary but unknown size, denoted as t@ype. Values of type are like ML values, which has the same size as a machine pointer word. It can either be boxed or unboxed. Boxed values of type also resides on the managed (garbage collected) heap. Here are some examples of boxed values:
However, the ML types are too limited for system programming, which has to represent memory layout as a flat structure. This is where variables of t@ype would be useful. More often, memory layout of this form carries a view (linear resource) and a t@ype, and they are called viewt@ype altogether. Notice that t@ype is a subsort of viewt@ype, so anywhere viewt@ype is mentioned, we can give it just a t@ype. In fact, there is a static function sizeof(t) that has the same meaning in the C language, which has the static signature:
And the sizeof static function can be used for any t@ype and viewt@ype.
The boxed tuple, record, and list have analogy in the flat types as well.
But constructing flat values is a bit different. A flat value can reside either on the heap (like a boxed value) or on stack (which ML doesn't use for allocating data structure). This is discussed in another article, Manipulating flat memory.
Flat memory may be uninitialized. The notation "t?" represents the type of an uninitialized memory that has the same size as t. For example, uninitialized int is written as "int?" (see margin note). However, given two types t1 and t2, even though we may know that sizeof(t1) == sizeof(t2), the compiler still treats t1? and t2? as two different types.
Finally, we occasionally see annotated '+' and '-' at the end of type (as well as viewtype) and t@ype (as well as viewt@ype). The '+' means that its parameters are covariant; '-' means the parameters are contravariant. Without the annotation, parameters are nonvariant. This is useful for subsorting for types indexed by a static expression. The covariant use is more common. Contravariance is rarely used.
The anatomy of an ATS function type deserves careful examination. The simplest form, similar to ML, is simply:
Notice we can tell ATS that function_name will be called c_name when translated to C, though this is optional. This can be used to either:
This has several important consequences.
Adding guarded and assertion dependent types to the mix, we now have:
Where guards are static expressions of the bool sort written in terms of any forall quantifiers, and asserts are static expressions of the bool sort written in terms of any forall and exists quantifiers. Furthermore, t1 ... tn are static expressions of the type sort that can be written in terms of the for-all quantifiers, and t' is a static expression of the type sort that can be written in terms of any forall and exists quantifiers.
Now, we add proof terms to the mix too.
Notice that the uncurried arguments are now separated into two parts, proof variables and program variables. Proof variables will be erased once compilation finishes, leaving just the program variables in the resulting compiled program.
Finally, ATS uses decorated arrow type to keep track of side-effects as tags.
If we were to write function type as a static expression (as opposed to function signature declaration like we've seen so far), the type expression would look like this:
The arrow type in ATS looks like "-<>", and the form -<tag1, ..., tagk> is an augmented arrow type with decoration. ATS currently recognizes the following tags for distinguishing various kind of functions:
As well as the following tags to track side-effects:
What does a typical ATS function declaration look like? Here we show an
example from the Standard C library that sports the following prototype.
char *fgets(char *s, int size, FILE *stream);
From the man page, "fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. A '\0' is stored after the last character in the buffer." And that "fgets() returns s on success, and NULL on error or when end of file occurs while no characters have been read."
We now look at the ATS declaration, heavily commented and colored (orange is comment, cyan is keyword, and blue is static expression).
(* fgets_v describes the result of fgets, which can indicate either a
This example is based on libc/SATS/stdio.sats, part of ATS/Anariats, with slight modification and heavily annotated with comments. It may not be immediately apparent to the first-time reader, but there are some subtle points:
For your reference, this is what the above code looks like without comments.
dataview fgets_v (sz:int, addr, addr) =