Algorithms and specifications

Feb 19, 2020

In this article I want to talk about two topics where I think most programmers tend to fall short: algorithms and specifications.

I’m going to use a real experience, which happened some time ago:

For four years now I have maintained a multi-currency node (Bitcoin, Bitcoin Cash and Litecoin) called Knuth (a.k.a. Bitprim). In November 2017, Bitcoin Cash made its first protocol change since its birth in August of the same year. My job at that time was to update our node’s code to support the protocol changes. I have wanted to write this article ever since, but for one reason or another I did not do it then, so I am doing it now.

The most important change was in the Difficulty Adjustment Algorithm (hereafter, DAA).

Here is the description of the algorithm.

I do not want to go into detail about the concept of difficulty or the DAA. For this you can refer to: Difficulty.

What interests me are points 2 and 3 of the description of the DAA:

2. Let B_last be chosen[2] from [B_n-2, B_n-1, B_n].
3. Let B_first be chosen[2] from [B_n-146, B_n-145, B_n-144].

Both point to the footnote [2]:

2. A block is chosen via the following mechanism:

Given a list: S = [B_n-2, B_n-1, B_n]
a. If timestamp(S[0]) greater than timestamp(S[2]) then swap S[0] and S[2].
b. If timestamp(S[0]) greater than timestamp(S[1]) then swap S[0] and S[1].
c. If timestamp(S[1]) greater than timestamp(S[2]) then swap S[1] and S[2].
d. Return S[1].

See GetSuitableBlock

The specification of the algorithm points to its implementation, a function called GetSuitableBlock. Here is the code:

/**
 * To reduce the impact of timestamp manipulation, we select the block we are
 * basing our computation on via a median of 3.
 */
static const CBlockIndex *GetSuitableBlock(const CBlockIndex *pindex) {
    assert(pindex->nHeight >= 3);

    /**
    * In order to avoid a block is a very skewed timestamp to have too much
    * influence, we select the median of the 3 top most blocks as a starting
    * point.
    */
    const CBlockIndex *blocks[3];
    blocks[2] = pindex;
    blocks[1] = pindex->pprev;
    blocks[0] = blocks[1]->pprev;

    // Sorting network.
    if (blocks[0]->nTime > blocks[2]->nTime) {
        std::swap(blocks[0], blocks[2]);
    }

    if (blocks[0]->nTime > blocks[1]->nTime) {
        std::swap(blocks[0], blocks[1]);
    }

    if (blocks[1]->nTime > blocks[2]->nTime) {
        std::swap(blocks[1], blocks[2]);
    }

     // We should have our candidate in the middle now.
    return blocks[1];
}

What the algorithm does is basically create a sequence of 3 elements (an array), sort it in ascending order and return the second element.

The time complexity of this algorithm is:

Best case: 0 swaps, 3 comparisons
Worst case: 2 swaps, 3 comparisons
Average case: 7/6 swaps, 3 comparisons; assuming a uniform distribution of the input data.
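
If you do not want to take these numbers on faith, they are easy to check by brute force. The following is a minimal sketch, not code from the node: network_swaps, SwapStats and swap_stats are names made up for this illustration, and plain integers stand in for the block timestamps. It runs the same three compare-and-swap steps over the 6 possible orderings of 3 distinct values and counts the swaps.

```cpp
#include <algorithm>
#include <array>
#include <utility>

// Hypothetical helper: applies the same three compare-and-swap steps as
// GetSuitableBlock to a copy of the input and returns the number of swaps.
int network_swaps(std::array<int, 3> s) {
    int swaps = 0;
    if (s[0] > s[2]) { std::swap(s[0], s[2]); ++swaps; }
    if (s[0] > s[1]) { std::swap(s[0], s[1]); ++swaps; }
    if (s[1] > s[2]) { std::swap(s[1], s[2]); ++swaps; }
    return swaps;
}

// Hypothetical result type: best/worst/total swaps over all inputs tried.
struct SwapStats {
    int best;
    int worst;
    int total;
    int inputs;
};

// Tries every permutation of three distinct values (uniform distribution).
SwapStats swap_stats() {
    std::array<int, 3> p{1, 2, 3};
    SwapStats st{3, 0, 0, 0};
    do {
        int const n = network_swaps(p);
        st.best = std::min(st.best, n);
        st.worst = std::max(st.worst, n);
        st.total += n;
        ++st.inputs;
    } while (std::next_permutation(p.begin(), p.end()));
    return st;
}
```

Calling swap_stats() should report best = 0, worst = 2 and total = 7 over 6 inputs, which is where the 7/6 average comes from.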

Now, look again at the algorithm. It creates an array from the input data, sorts it and returns the middle element. This is a well-known algorithm called median, in this particular case, median of 3 elements.

The median is a selection algorithm. Unlike (in-place) sorting algorithms, selection algorithms should not mutate the input data; they just return one of the elements.

Here is a sketch of the median of 3 algorithm, in C++:

template <TotallyOrdered T>
T const& max(T const& a, T const& b) {
    // Returns b when a and b are equivalent, which keeps median_3 stable.
    return b < a ? a : b;
}

template <TotallyOrdered T>
T const& median_3_ab(T const& a, T const& b, T const& c) {
    // precondition: a <= b

    return ! (c < b) ? b :        // a, b, c are sorted
                       max(a, c); // b is not the median
}

// Returns a reference to one of its arguments, so the caller can take its address.
template <TotallyOrdered T>
T const& median_3(T const& a, T const& b, T const& c) {
    return b < a ? median_3_ab(b, a, c)
                 : median_3_ab(a, b, c);
}

Or if you prefer the inline version of the algorithm:

template <TotallyOrdered T>
T const& median_3(T const& a, T const& b, T const& c) {
    if (b < a) {
        if (! (c < a)) return a;  // b, a, c are sorted
        return max(b, c);         // a is not the median
    } else {    // a <= b
        if (! (c < b)) return b;  // a, b, c are sorted
        return max(a, c);         // b is not the median
    }
}

I leave the analysis of the code to the reader. For the lazy: the algorithm simply selects the element that would be in the middle if a, b and c were sorted in ascending order. It does this without mutating or reordering the input data.

The time complexity of median_3 is:

Best case: 0 swaps, 2 comparisons
Worst case: 0 swaps, 3 comparisons
Average case: 0 swaps, 8/3 comparisons; assuming a uniform distribution of the input data.
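
These figures can be checked the same way, by counting how many times the comparison is invoked. This is only a sketch under a few assumptions: max_by, ComparisonStats and comparison_stats are hypothetical names, the templates are unconstrained so that the code compiles without the concepts used above, and the comparison is passed explicitly so that it can be counted.

```cpp
#include <algorithm>
#include <array>

// Same selection logic as median_3 above, with the comparison passed in.
// max_by is a hypothetical name chosen to avoid clashing with std::max.
template <typename T, typename R>
T const& max_by(T const& a, T const& b, R r) {
    return r(b, a) ? a : b;
}

template <typename T, typename R>
T const& median_3_ab(T const& a, T const& b, T const& c, R r) {
    // precondition: !r(b, a)
    return !r(c, b) ? b : max_by(a, c, r);
}

template <typename T, typename R>
T const& median_3(T const& a, T const& b, T const& c, R r) {
    return r(b, a) ? median_3_ab(b, a, c, r) : median_3_ab(a, b, c, r);
}

// Hypothetical result type: best/worst/total comparisons over all inputs.
struct ComparisonStats {
    int best;
    int worst;
    int total;
    int inputs;
};

// Tries every permutation of three distinct values (uniform distribution).
ComparisonStats comparison_stats() {
    std::array<int, 3> p{1, 2, 3};
    ComparisonStats st{3, 0, 0, 0};
    do {
        int count = 0;
        // Every call to the comparison bumps the shared counter.
        auto counting_less = [&count](int a, int b) { ++count; return a < b; };
        median_3(p[0], p[1], p[2], counting_less);
        st.best = std::min(st.best, count);
        st.worst = std::max(st.worst, count);
        st.total += count;
        ++st.inputs;
    } while (std::next_permutation(p.begin(), p.end()));
    return st;
}
```

Calling comparison_stats() should report best = 2, worst = 3 and total = 16 over 6 inputs, that is, 16/6 = 8/3 comparisons on average.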

Now, we could use our new algorithm in the original GetSuitableBlock function:

static
CBlockIndex const* GetSuitableBlockNewVersion(CBlockIndex const* pindex) {
    assert(pindex->nHeight >= 3);
    return &median_3(*pindex->pprev->pprev, *pindex->pprev, *pindex);
}

Much shorter and easier to understand, right?

Before continuing, we have to fix something: we do not know whether the natural ordering of the CBlockIndex class is given by the block’s timestamp (the nTime attribute). We need a version of median_3 that accepts a user-specified comparison: it has to accept a strict weak ordering relation (for more information, look up the definition of strict weak ordering).

template <Regular T, StrictWeakOrdering R>
T const& max(T const& a, T const& b, R r) {
    // Returns b when a and b are equivalent under r, which keeps median_3 stable.
    return r(b, a) ? a : b;
}

template <Regular T, StrictWeakOrdering R>
T const& median_3_ab(T const& a, T const& b, T const& c, R r) {
    // precondition: !r(b, a), i.e. b does not come before a

    return ! r(c, b) ? b :           // a, b, c are sorted
                       max(a, c, r); // b is not the median
}

// Returns a reference to one of its arguments, so the caller can take its address.
template <Regular T, StrictWeakOrdering R>
T const& median_3(T const& a, T const& b, T const& c, R r) {
    return r(b, a) ? median_3_ab(b, a, c, r)
                   : median_3_ab(a, b, c, r);
}

Now, we can correctly implement GetSuitableBlockNewVersion, comparing by nTime:

static
CBlockIndex const* GetSuitableBlockNewVersion(CBlockIndex const* pindex) {
    assert(pindex->nHeight >= 3);
    return &median_3(*pindex->pprev->pprev, *pindex->pprev, *pindex, [](auto const& a, auto const& b){
        return a.nTime < b.nTime;
    });
}

We have one last problem to solve. Let’s write a small test of the original algorithm and the new one:

#include <cstddef>
#include <iostream>

// Reduced stand-in for the real class: only the fields the test needs.
struct CBlockIndex {
    std::size_t nHeight;
    std::size_t nTime;
    CBlockIndex* pprev;
};

int main() {
    CBlockIndex ba {1, 1558731500, nullptr};
    CBlockIndex bb {2, 1558731500, &ba};        // same nTime as previous
    CBlockIndex bc {3, 1558730000, &bb};

    auto r = GetSuitableBlockNewVersion(&bc);
    std::cout << "GetSuitableBlockNewVersion: " << r->nHeight << std::endl;

    r = GetSuitableBlock(&bc);
    std::cout << "GetSuitableBlock:           " << r->nHeight << std::endl;
}

The code above prints:

GetSuitableBlockNewVersion: 1
GetSuitableBlock:           2

What the previous code shows is the difference in stability between the two algorithms. Our median_3 algorithm is stable, which means that the relative order of equivalent elements is preserved (for more information, look up stable sorting algorithms).

To show it with data, we will use the previous example, where the input to both algorithms is:

s = [{1, 1558731500}, {2, 1558731500}, {3, 1558730000}]

Here, the first element of each pair is nHeight, which identifies the block, and the second is the timestamp, nTime. Note that the first 2 elements have the same nTime.

If we sort the previous sequence by nTime using a stable sorting algorithm, such as merge sort, we get:

s = [{3, 1558730000}, {1, 1558731500}, {2, 1558731500}]

Note that the middle element is the one with nHeight = 1, which is what median_3 returned. The original algorithm used in the Bitcoin Cash DAA returned the block with nHeight = 2 instead, so our algorithm behaves in a stable manner and the original one does not.
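
The stable ordering can be reproduced with std::stable_sort from the standard library, which guarantees that equivalent elements keep their relative order. A minimal sketch, where Block and heights_after_stable_sort are hypothetical names used only for this illustration:

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical stand-in for a block: just the two fields used in the example.
struct Block {
    int height;
    std::uint32_t time;
};

// Sorts the example sequence by time with a stable sort and returns the
// resulting order of heights.
std::array<int, 3> heights_after_stable_sort() {
    std::array<Block, 3> s{{{1, 1558731500}, {2, 1558731500}, {3, 1558730000}}};

    // Blocks 1 and 2 have the same time, so a stable sort keeps 1 before 2.
    std::stable_sort(s.begin(), s.end(),
                     [](Block const& a, Block const& b) { return a.time < b.time; });

    return {s[0].height, s[1].height, s[2].height};
}
```

The returned order is 3, 1, 2, so the middle element is the block with height 1, the same one median_3 selects.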

In my first implementation of the DAA in the Bitprim node I used code similar to median_3, which was also stable. Since I had not checked the code behind the specification, I had mistakenly assumed that it was stable too. This caused runtime errors in our node on a difficulty change. It did not always happen, but there was one particular case in which we could detect it. After several hours of debugging I found that the problem was that my algorithm was not compatible with the one “specified” in the DAA.

Therefore, I had to “correct” my algorithm to make it non-stable in the same way as the one in the specification.
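
To see which inputs make the two selections diverge, one option is to compare them exhaustively on small timestamps. The sketch below is an illustration, not the code that ran in the node: network_pick, stable_pick and count_mismatches are hypothetical names, and each function returns the position (0, 1 or 2) of the chosen element in the input instead of a block pointer.

```cpp
#include <array>
#include <utility>

// Position chosen by the sorting network of GetSuitableBlock.
int network_pick(std::array<int, 3> const& t) {
    std::array<int, 3> idx{0, 1, 2};
    if (t[idx[0]] > t[idx[2]]) std::swap(idx[0], idx[2]);
    if (t[idx[0]] > t[idx[1]]) std::swap(idx[0], idx[1]);
    if (t[idx[1]] > t[idx[2]]) std::swap(idx[1], idx[2]);
    return idx[1];
}

// Position chosen by the stable median of 3 (same logic as median_3).
int stable_pick(std::array<int, 3> const& t) {
    auto less = [&t](int i, int j) { return t[i] < t[j]; };
    auto max_i = [&](int i, int j) { return less(j, i) ? i : j; };
    auto median_ab = [&](int a, int b, int c) {
        return !less(c, b) ? b : max_i(a, c);
    };
    return less(1, 0) ? median_ab(1, 0, 2) : median_ab(0, 1, 2);
}

// Counts the timestamp triples over {0, 1, 2} on which the two disagree.
int count_mismatches() {
    int mismatches = 0;
    for (int a = 0; a < 3; ++a)
        for (int b = 0; b < 3; ++b)
            for (int c = 0; c < 3; ++c) {
                std::array<int, 3> t{a, b, c};
                if (network_pick(t) != stable_pick(t)) ++mismatches;
            }
    return mismatches;
}
```

Under these assumptions the two functions disagree on 6 of the 27 triples, and only in two patterns: the first two timestamps are equal and the third is smaller (the case in the test above), or the last two are equal and the first is larger. When all three timestamps are distinct they always agree, which is consistent with the problem showing up only occasionally.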

Actually, if I remember correctly, the first version of the DAA specification did not mention the GetSuitableBlock code, but said that the median of 3 elements was calculated. Since the implementation of the median was “incorrect”, they had to adapt the specification to be consistent with the code. Keep in mind that once the code of a Bitcoin node (or of any cryptocurrency) is in operation, a change in its behavior introduces incompatibilities with previous versions and produces so-called forks. So once the code is running, the goal is not to change it. This is why the specification had to be adapted instead of correcting the code.

Before finishing, let’s make a comparison of both algorithms, GetSuitableBlock vs. median_3:

median_3 never swaps; GetSuitableBlock performs 0, 7/6 or 2 swaps (best, average and worst case), unnecessarily. (Efficiency)
GetSuitableBlock creates an array, unnecessarily. (Efficiency)
median_3 performs 2, 8/3 or 3 comparisons (best, average and worst case); GetSuitableBlock always performs 3 comparisons. (Efficiency)
median_3 is stable; GetSuitableBlock is not. median_3 does what anyone expects from an algorithm that calculates the median of 3 elements. (Correctness)

To conclude:

The author of the DAA specification could have chosen a well-known, “standard” algorithm, but did not. And perhaps worst of all, the specification refers to the code. Code must never be the specification. Code must be written from a specification. If a specification refers to code, there is no specification at all.

Bye!