A safer way to get/set parts of a compact number using contracts
Introduction
It is a common memory saving technique to combine multiple numeric parts into a single integer or natural number. A good example is compact_time: INTEGER
from class TIME_VALUE which combines the seconds, minutes and hours into a single integer. This is achieved using bit-masks and some bit-arithmetic. However these techniques generally do not have adequate contracts to protect against incorrect mask values or shift-count values. The Eiffel-Loop base library now has some utility classes to make setting numeric parts, not only safer, but easier. They have no dependency outside of the Eiffel kernel classes so can easily be just copied into your project.
TIME_VALUE as an example
The following code fragment demonstrates getting and setting the hour part of the compact_time integer.
class TIME_VALUE
feature -- Access
hour: INTEGER
do
Result := (compact_time & hour_mask) |>> hour_shift
end
compact_time: INTEGER
-- Hour, minute, second coded.
feature -- Element change
set_hour (h: INTEGER)
do
compact_time := compact_time & hour_mask.bit_not
compact_time := compact_time | (h |<< hour_shift)
end
feature {NONE} -- Implementation
Hour_mask: INTEGER = 0x00FF0000
Hour_shift: INTEGER = 16
end -- class TIME_VALUE
We can identify the following possible sources of programmer error which are not protected by any precondition or postcondition.
- hour_mask might not consist of a sequence of consecutive binary ones.
- hour_mask might not be big enough to accommodate the hour value
- The value of Hour_shift might not be consistent with the number of trailing binary zeros in hour_mask
- There is no "reversible" post-condition on hour function.
Of course it is possible to add all these contracts to TIME_VALUE but it would be tedious and add a lot of extra code.
A bit-routines class to the rescue
By using the class EL_INTEGER_32_BIT_ROUTINES the above code can be rewritten as follows, and as a bonus you get contracts for checking the mentioned sources of error for free. See ancestor class EL_NUMERIC_BIT_ROUTINES
class TIME_VALUE
feature -- Access
hour: INTEGER
local
i32: EL_INTEGER_32_BIT_ROUTINES
do
Result := i32.isolated (compact_time, hour_mask)
end
compact_time: INTEGER
-- Hour, minute, second coded
feature -- Element change
set_hour (h: INTEGER)
local
i32: EL_INTEGER_32_BIT_ROUTINES
do
compact_time := i32.inserted (compact_time, hour_mask, h)
end
feature {NONE} -- Implementation
Hour_mask: INTEGER = 0x00FF0000
end -- class TIME_VALUE
You will notice that there is no longer any need to specify the hour_shift as this is implicit in the number of trailing zeros of the mask. So this possible source of error is eliminated.
Thanks to a nice little algorithm found on Stackoverflow the shift count is calculated in only 7 steps. (8 for a 64-bit number).
Benchmark
I tried caching the shift_count in a hash table but it actually made the routine slower not faster. This shows how efficient the branching algorithm is. The iterative method uses repeated right-shifting but it is very slow compared to the branching one on Stackoverflow.
Output of class BIT_SHIFT_COUNT_COMPARISONPasses over 1000 millisecs (in descending order)
NATURAL branching method : 360.6 times (100%)
NATURAL cached branching method : 343.3 times (-4.8%)
NATURAL iterative method : 169.7 times (-53.0%)
Class Hierarchy
To date there are only routines for 32-bit and 64-bit numbers.
ADDENDUM
Since writing this article I discovered two other ways to calculate the position of an LSB. Using De Bruijn sequences was suggested to me by Chat GPT. I also had the idea to look for built-in compiler functions.
gcc: __builtin_ctz
MSVC (since 2005): _BitScanReverse
As you can see, built-in functions are way faster.
Passes over 1000 millisecs (in descending order)
NATURAL gcc __built_in_ctz : 4713.0 times (100%)
NATURAL de Bruijn method : 1682.3 times (-64.3%)
NATURAL branching method : 412.8 times (-91.2%)
NATURAL cached branching method : 398.2 times (-91.6%)
NATURAL iterative method : 200.0 times (-95.8%)