Writing min function, part 1: The rise of Concepts

May 20, 2014

This is the first in a series of articles in which I want to transmit what I learned (or what I think I learned) from the books, papers and lectures of Alexander Stepanov.

These are the lessons that Alex gives us, and I want to show them in this series:

And… the following lesson is mine:

In this article I want to avoid using any programming language, I want to focus on the algorithms and the specifications. In subsequent articles, I will try to implement what we learned using several mainstream programming languages.

Writing min

I will try to write the function min, that is, a function that returns the minimum of two things.

At this time you may be wondering, this guy is writing an entire blog post about a two-line function, is this serious? The answer is yes. As Alex says: “Simple things are beautiful”, and believe it or not, we can learn a lot in the process of writing min.

The objective is to learn to correctly determine what are the requirements that a function must impose on types used in it.

“It is better to design our Components (algorithms and data structures) not in terms of concrete types, but in terms of requirements on types expressed as syntactic and semantic properties”

Alex calls a collection of requirements a Concept[3].

Despite having no support for Concepts in programming languages, he has been using them for decades, not in code, but in his mind and in the documentation of the components developed by him [4].

First Attempt

Well, let’s start writing the specification and then, the code:

Spec: Given two objects[5], a and b, return the smaller of both.

//Note: Naive min function in pseudo-code (contains errors).
min(a, b) {
	if (a < b) return a
	return b
}

The above function is written in pseudo-code (which looks like a mix between C and Python), it has some flaws, but we will see them later.

The most important question is… What are the requirements of min function must impose to the arguments a and b? That is… Which are the Concepts?

For some programmers, especially those advocates of duck-typing, imposing requirements to arguments may be something uninteresting. They simply use the arguments in the function body, and they hope to at least get a runtime error.

I strongly disagree!

“Even if we do not have Concepts in the language, they should exists in our mind. You have to learn to think in terms of Concepts whichever language you use.”

Forget for a while about programming languages, let’s see what the requirements are. The problem arises from this code snippet

      a < b

What does this mean?

Someone could say that the requirement is that arguments a and b must be compared using the less-than-operator.

But this is just a syntactic requirement, we have to go further in order to correctly specify our function. We need to include semantics in our type requirements! But… how to do it?

Mathematics to the rescue!

What is the less-than-operator?

It is a way for comparing two objects, returning a boolean value. Is this enough for defining min function?

No, and to ilustrate that, see what happened if the less-than-operator is defined this way:

//Pseudo-code for less-than-operator
less_than_operator(a, b) {
	if ( is_even(system_time().seconds) ) return true
	return false
}

This function returns true if the number of seconds of the system time is even, otherwise returns false. With this code I want to emphasize that the less_than_operator could be defined using a random behaviour, but we need to define an specific behaviour.

Mathematically the less-than-operator is a Relation[6]. A relation is a binary Predicate[6].

That is, a predicate that takes two parameters of the same type. “If you look of two things, is either true or false. The relation holds, or not.” The difference between the code above and a relation is that the relation is considered a FunctionalProcedure[6], that is, a function in which by replacing its inputs with equal objects results in equal output objects.

But the relation concept is too weak, we need a stronger concept: Ordering.

“What is an ordering? What do mathematicians call ordering?

The only absolute rule for ordering is the requirement of transitivity [6]. A relation is transitive if, whenever it holds between a and b, and between b and c, it holds between a and c. A transitive relation is the most basic notion of ordering, but it is still too weak for our needs.”

Let’s review what kinds of Ordering Relations exist:

(Note: There are more ordering relations, but we will see them later)

Writing min using Concepts

Well, now we know about ordering relations, let’s look at how we can use them to specify the min function.

But first, what should we use? Partial or Total Ordering?

“If our relation is the Subset relation on a set, then, _min_ and _max_ of two sets doesn’t make sense.”

Then, Partial Ordering is too weak, because the relation doesn’t hold for every pair of elements of the set.

We need to use Total Ordering for define the requirements of min, let’s do it:

// Requires:
//  The type of a is equal to the type of b, and it is called T,
//  and T is TotallyOrdered[6]
min(a, b) {
	if (a < b) return a
	return b
}
// Note: the implementations still has errors.

Note that the requirements were expressed as code comments. Later we will see what the programming languages provide us to express them as code.

Well, this is enough for a single post.

In the next articles of the series, we will:

Stay tuned!


Acknowledgments

Thanks in particular to the following for their feedback to improve this article: Mario Dal Lago, Andrzej Krzemienski, Dean Michael Berris, Javier Centurión, Alejandro Santos, Ezequiel Reyno.


The Series

   Part 1: The rise of Concepts
   Part 2: Understanding Concepts
   Part 3: Weakening the ordering
   Part 4: Const-Correctness
   Part 5: Stabilizing the algorithm


References

[1] Original definition of “Generic Programming”:
      David R. Musser and Alexander A. Stepanov: Generic Programming. ISSAC 1988, pages 13-25. http://www.stepanovpapers.com/genprog.pdf
[2] Elements of Programming of Alexander Stepanov and Paul McJones:
      http://www.elementsofprogramming.com
      Buy it at Amazon
[3] Concept definition: Stepanov and McJones [2009, page 10]
[4] SGI’s STL using Concepts in Documentation: https://www.sgi.com/tech/stl/min.html
[5] Object definition:
      The definition used in this article has nothing to do with an OOP-like definition of object [7].
      The definition used here is a practical definition of what an object is:
      “Object is a sequence of bits in memory” or
      “Object is a value residing in memory”
      See Stepanov and McJones [2009, page 4] for a complete definition.
[6] See Appendix A
[7] Object-Oriented Software Construction (2nd Ed) by Bertrand Meyer [1997, page 1198]


Appendix A: Definitions

Some of the definitions presented here are based on: http://www.elementsofprogramming.com/eop-concepts.pdf

$$Relation(\texttt{Op}) \triangleq \\ \qquad HomogeneousPredicate(\texttt{Op}) \\ \quad \land \texttt{ Arity(Op) = 2}$$

$$HomogeneousPredicate(\texttt{P}) \triangleq \\ \qquad Predicate(\texttt{P}) \\ \quad \land HomogeneousFunction(\texttt{P})$$

$$Predicate(\texttt{P}) \triangleq \\ \qquad FunctionalProcedure(\texttt{P}) \\ \quad \land \texttt{Codomain(P) = bool}$$

$$\textbf{property}\text{(R : Relation)} \\ \text{transitive : R} \\ \quad r \mapsto (\forall a, b, c \in \texttt{Domain(R)}) (r(a, b) \land r(b, c) \Rightarrow r(a, c))$$

$$\textbf{property}\text{(R : Relation)} \\ \text{refexive : R} \\ \quad r \mapsto (\forall a \in \texttt{Domain(R)}) (r(a, a))$$

$$\textbf{property}\text{(R : Relation)} \\ \text{antisymmetric : R} \\ \quad r \mapsto (\forall a, b \in \texttt{Domain(R)}) (r(a, b) \land r(b, a) \Rightarrow a = b)$$

$$\textbf{property}\text{(R : Relation)} \\ \text{irreflexive : R} \\ \quad r \mapsto (\forall a \in \texttt{Domain(R)}) (\lnot r(a, a))$$

$$\textbf{property}\text{(R : Relation)} \\ \text{asymmetric : R} \\ \quad r \mapsto (\forall a, b \in \texttt{Domain(R)}) (r(a, b) \Rightarrow \lnot r(b, a))$$

$$\textbf{property}\text{(R : Relation)} \\ \text{total : R} \\ \quad r \mapsto (\forall a, b \in \texttt{Domain(R)}) (r(a, b) \lor r(b, a))$$

$$\textbf{property}\text{(R : Relation)} \\ \text{total_ordering : R} \\ \quad \text{r} \mapsto \text{transitive(r) } \land \\ \quad \text{ (} \forall \text{ a, b} \in \texttt{Domain(R)} \text{) exactly one of the following holds: r(a, b), r(b, a), or a = b}$$

$$TotallyOrdered(\texttt{T}) \triangleq \\ \qquad \texttt{Regular(T)} \\ \quad \land <\texttt{: T x T} \rightarrow \text{bool} \\ \quad \land total\_ordering(<)$$

For definitions of: HomogeneousFunction, FunctionalProcedure, and Regular, see http://www.elementsofprogramming.com/eop-concepts.pdf [page 1]