Learning a new programming language usually consists of learning two basic things first: 1. Data structures and their APIs 2. Control logic (if/else, loops - for, while, do-while etc)
In a similar fashion, we take a look at how to construct familiar data structures first.
module Lang.dataStructures where
open import Agda.Builtin.Nat public using (Nat)
Agda is an implementation of type theory, a branch of mathematics which deals with Type
as an object and
various theorems (APIs) for working with Type
s. A Type
presents a notion of an object of a
certain kind, i.e. it assigns some extra information to an object.
Programmers might already be well familiar with the notion of Type
as it’s use is widespread in
programming.
In Type theory, we declare A
as a type:
\[ A : Type \]
and say the object x
is of type A
like:
\[ x : A \]
This, in say scala is done like:
val x:Int = 0
or in C:
int x;
in Agda:
variable
: Nat x
The Set
type is the root of all types, somewhat akin to java’s Object
or scala’s
Any
. Every other Type is a Set
, whereas Set
itself if of type
Set : Set₁
. We will look deeper into this at a later stage.
Data
keywordProgramming languages often come bundled with some primitive data types like int
, float
,
string
etc and some that combine these primitive types into more complex structures, e.g. map
or dict
(python) can be used to construct say map<string, array<string>>
or the
Either
datatype in haskell, the Option
datatype in scala, tuple
in python and so
on.
Most languages allow creating new data types using either cartesian product or cartesian sum (disjoint union) on a bunch of pre-existing data types and enclosing them inside a container.
Some examples of cartesian product of String
, Int
and another String
would
be:
val a : Tuple2[String, Tuple2[Int, String]] = ...
val b : Tuple2[Tuple2[String, Int], String] = ...
val c : Map[String, (Int, String)] = ...
and some example of sums are:
val a : Option[A] = ... // Either A or null
val a : Either[A, B] = ... // Either type A or type B
Some languages also provide the mechanism to define new data types, sometimes by alias-ing a data type with a name like in scala:
type newData = Map[String, List[Float]]
This is called type aliasing
.
Others provide the facility to define completely new data types, like haskell does with the data
keyword:
-- this states that the type `Bool` can have two values False and True
data Bool = False | True
A haskell datatype can also have constructors. For e.g. if we were to define a shape type which can either be a circle or a rectangle:
data Shape = Circle Float Float Float | Rectangle Float Float Float Float
-- uses the Circle constructor to create an object of type Shape
Circle 1.2 12.1 123.1
-- uses the Rectangle constructor to create an object of type Shape
Rectangle 1.2 12.1 123.1 1234.5
It is important to appreciate that Shape
is a new Type
, one that did not previously exist
before in the language.
The data
keyword works in a similar manner in Agda:
-- lets assume the type ℝ, or real numbers is already defined
module _ {ℝ : Set} where
data Shape : Set where
: ℝ → ℝ → ℝ → Shape
Circle : ℝ → ℝ → ℝ → ℝ → Shape Rectangle
In Type theory, a function f
that operates on values of type 𝔸
, also called domain of the
function and produces values of type 𝔹
, also called co-domain of the function, is represented as:
\[ f : 𝔸 → 𝔹 \]
A function in Agda consists of two main parts:
Type
.Type
.not : Bool → Bool
not true = false
not false = true
This pattern matching way if typically found in functional programming language and the reader is cautioned that this will be used heavily throughout this work. As Agda is implemented in haskell and its syntax shares high degrees of similarity, it is encouraged to be a bit well versed with basic haskell.
An empty type cannot be created cause it has no constructor. This is the most barebones of a data
definition.
data ⊥ : Set where
A singleton is just a type containing only one object:
data ⊤ : Set where
: ⊤ singleton
The singleton constructor singleton
creates a single object of type T
.
The boolean type has just two values:
data Bool : Set where
: Bool
true : Bool false
Natu
ral numbersNatural numbers are the integral numbers used for counting, e.g. 0,1,2,3… Natural numbers support an operation called
succ
for succession, which can be used to create new natural numbers from known ones,
e.g. 3 = 2 + 1 = 1 + 1 + 1
. Thus, all natural numbers can be created from zero
and n
successive numbers after zero
. All we need to know are:
and then, increment zero to infinity!
data ℕ : Set where
: ℕ
zero : ℕ → ℕ succ
The operations for natural numbers, addition, subtraction, multiplication and powers can be defined as functions in Agda:
_+_ : ℕ → ℕ → ℕ
= x
x + zero (succ y) = succ (x + y)
x +
_−_ : ℕ → ℕ → ℕ
= zero
zero − m = succ n
succ n − zero = n − m
succ n − succ m
_×_ : ℕ → ℕ → ℕ
= zero
x × zero (succ y) = x + (x × y)
x ×
_² : ℕ → ℕ
= x × x
x ²
_^_ : ℕ → ℕ → ℕ
= succ zero
x ^ zero (succ y) = x × (x ^ y) x ^
Examples:
= succ zero
one = succ one
two = succ two
three = succ three
four = succ four
five = succ five
six = succ six
seven = succ seven
eight = succ eight
nine = succ nine ten
and so on. Here, each member of the integer family can be derived from a smaller member by successively applying
succ
to it. Such a type is called an Inductive
type.
Also note that in the function _+_
, we used a new kind of clause:
+ (succ y) = succ (x + y) x
Here, we mean that for all inputs of the type x + (succ y)
where succ y
corresponds to a
natural number that has been constructed using a succ
i.e. it is (succ y) > 0
, return
succ (x + y)
. This pattern is called “pattern matching”. Pattern matching is heavily used all over the
functional programming world, e.g. in scala:
(1 to 100).map{
// pattern match against all values of type integer
case(y:Int) if y > 5 =>
+1
y// if above pattern does not match
case _ =>
0
}
We define a binary tree using the following definition. Note that this merely creates an empty structure of a tree, the nodes or leaves contain no information in them, except for their position in the tree:
data BinTree : Set where
: BinTree
leaf : BinTree → BinTree → BinTree node
Now let us augment the binary trees with leaves containing natural numbers in leaf nodes:
data ℕBinTree : Set where
: ℕ → ℕBinTree
leaf : ℕBinTree → ℕBinTree → ℕBinTree node
Binary trees with each node and leaf containing a natural number:
data ℕNodeBinTree : Set where
: ℕ → ℕNodeBinTree
leaf : ℕ → ℕNodeBinTree → ℕNodeBinTree → ℕNodeBinTree node
Binary trees with each node containing a natural number and each leaf containing a boolean:
data ℕMixedBinTree : Set where
: Bool → ℕMixedBinTree
leaf : ℕ → ℕMixedBinTree → ℕMixedBinTree → ℕMixedBinTree node
We define a graph with:
to
and from
verticesdata Vertex : Set
data Edge : Set
data Graph : Set
data Vertex where
: ℕ → Vertex
vertex
data Edge where
: Vertex → Vertex → Edge
edge
data Graph where
: Edge → Graph
idGraph _+|+_ : Graph → Edge → Graph
infixl 3 _+|+_
Agda supports “mixfix” operators which combine infix, prefix and postfix operator semantics. Operator arguments are indicated with underscores
_
. An example would be the infix addition operator_+_
which when applied with two parameters can be written asa + b
. Similarly, a prefix operator would be represented as_♠
, a postfix one as♠_
. It is also possible to define more complex operators likeif_then_else_
.
The infixl
operator above sets the precedence of the operator +|+
.
We can use the above definition to create a graph in the following way:
: Graph
graph = idGraph (edge (vertex zero) (vertex seven)) +|+
graph (vertex one) (vertex three) +|+
edge (vertex seven) (vertex four) +|+
edge (vertex nine) (vertex (succ six)) edge
A list containing objects of type A
can be defined inductively as having:
[]
::
data List (A : Set) : Set where
: List A
[] _::_ : A → List A → List A
infixr 5 _::_
and we create instances of lists as:
: List Bool
bunch = false :: false :: true :: false :: true :: [] bunch
A finite set can be considered as a finite bunch of numbered objects, such that each object can be uniquely
identified by an integer, thus making the set countable. This count is called the Cardinality
of the
set.
Formally, a finite set Fin
is a set for which there exists a bijection (one-to-one and onto
correspondence)
\[f : Fin → [n]\]
where \(n ∈ ℕ\) and [n]
is the set of all natural numbers from
0
to n
.
This can be represented in Agda as:
data Fin : ℕ → Set where
: (n : ℕ) → Fin (succ n)
id : (n : ℕ) → Fin n → Fin (succ n) succ
Fin n
represents the set of first n natural numbers, i.e., the set of all numbers smaller than n. We
create a finite set of four elements:
: Fin four
fourFin = succ three (succ two (succ one (id zero))) fourFin
For a more in-depth treatment of finite sets, refer Dependently Typed Programming with Finite Sets.
We now define a finite sized indexed list, also called a vector Vec
. The constructor consists of:
[]
which constructs an empty vectorcons
which inductively builds a vectordata Vec (A : Set) : ℕ → Set where
: Vec A zero
[] : (n : ℕ) → A → Vec A n → Vec A (succ n) cons
Examples of vectors :
: Vec Bool one
vec1 = cons zero true []
vec1
: Vec Bool two
vec2 = cons one false vec1
vec2
: Vec Bool three
vec3 = cons two true vec2 vec3
Note that each vector has its size encoded into it’s type. This is not to be confused with set theory based lists, where any two list of different number of elements have the same type.
For example:
val x : List[Int] = List(1,2,3,4,5)
val y : List[Int] = List(1,2,3,4,5,6,7,8,9,0)
both have the same type List[Int]
.
Another example:
A bool-indexed vector such that only one type can be stored at the same time:
data ⟂ : Set where
data BoolVec(A B : Set) : Bool → Set where
: B → BoolVec A B false
id₁ : A → BoolVec A B true
id₂
: BoolVec ⟂ ℕ false
containsB = id₁ three
containsB
: BoolVec ℕ ⟂ true
containsA = id₂ four containsA