Complements are also really useful for subtracting numbers, not only in the binary system but also in decimal: they let you subtract by adding.
Let's say you need to calculate
8467
-4583
The ten's complement of 4583 is 5417 (the complement is the remainder to the next power of ten, in that case to 10000). A cool property of the ten's complement in the decimal system is that the subtraction above can be rewritten as
8467
+5417
-10000
Which results in
13884
-10000
------
3884
This is much easier to calculate than the direct subtraction. It exploits the fact that calculating -x is equal to +(10000 - x) - 10000.
The same works for binary and the two's complement, where it is even easier, because calculating the two's complement is equal to inverting all digits and adding 1.
It works for any number system by using the complement of the system's base.
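If you want to check the identity mechanically, here's a minimal C sketch (it only verifies the arithmetic identity, not the borrow-free digit trick):

    #include <stdio.h>

    int main(void) {
        int a = 8467, b = 4583;
        int complement = 10000 - b;        /* ten's complement of b: 5417 */
        int diff = a + complement - 10000; /* -b rewritten as +(10000 - b) - 10000 */
        printf("%d\n", diff);              /* 3884, same as 8467 - 4583 */
        return 0;
    }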
It looks like here you are calculating 9999 - 4583 = 5416 (an easy subtraction), then adding one.
> (the complement is the remainder to the next power of ten, in that case to 10000)
Is remainder a correct term here? It's confusing because if you divide 4583/10000, the remainder is 4583. The modulo of -4583 by 10000 is 5417, though.
> This is much easier to calculate than the direct subtraction.
This is not clear to me. Is it because when you calculate 9999-yyyy, you never need to borrow? Or when you calculate 10000-yyyy, you always make a simple borrow? In any case, I feel this still takes some practice, because what was a single subtraction becomes two or more steps.
Direct subtraction with paper and pencil isn't much more difficult than direct addition anyway. If you want to do it in your head, I'd prefer a single-step method instead of having to remember the intermediate results.
They've actually combined some of the steps, which might make it easier to understand. It's important to keep in mind that we have 4 digits.
8467 - 4583 =
8467
+ 5416 (9's complement of 4583)
= 13883
+ 1 (10's complement)
= 13884
- 10000
= 3884
>> Direct subtraction with paper and pencil isn't much more difficult than direct addition anyway. If you want to do it in your head, I'd prefer a single-step method instead of having to remember the intermediate results.
This isn't really a technique for humans, but for computers. You can implement add-with-carry pretty easily with a bunch of logic gates, but what about subtraction, where you need to jump across several columns to borrow, keeping track across the entire minuend? Computers don't typically have those little tick marks beside each digit, or scrap paper. It is easy to set a "subtracting" flag though, invert the subtrahend (1's complement), and then feed the first carry-in of the ALU with the same signal as the subtract flag (add 1 for 2's complement). You could come up with your own complete subtraction logic, but it's definitely easier to reuse the same core addition with a tiny bit of extra to handle subtraction and then use a method like this.
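To make that concrete, here's a toy bit-level sketch in C of the wiring described above (the function name and the 8-bit width are my own choices, not anything canonical):

    #include <stdint.h>
    #include <stdio.h>

    /* Toy 8-bit ripple-carry add/subtract: the subtract flag both inverts
       the second operand (ones' complement) and feeds the initial carry-in
       (the +1 that completes the two's complement). */
    static uint8_t alu_addsub(uint8_t a, uint8_t b, int subtract) {
        uint8_t result = 0;
        int carry = subtract;      /* carry-in wired to the subtract flag */
        if (subtract)
            b = ~b;                /* invert the subtrahend */
        for (int i = 0; i < 8; i++) {
            int abit = (a >> i) & 1, bbit = (b >> i) & 1;
            int sum = abit ^ bbit ^ carry;                   /* full-adder sum */
            carry = (abit & bbit) | (carry & (abit ^ bbit)); /* full-adder carry-out */
            result |= (uint8_t)(sum << i);
        }
        return result;
    }

    int main(void) {
        printf("%d\n", alu_addsub(100, 42, 1)); /* 58: the same adder did a subtraction */
        return 0;
    }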
Did you calculate the 9's complement of 4583 in your head? I see that 5416 has individual digits that line up with 4583 to each equal 9, but I'm not clear if this is a coincidence that happens to work for this number, or if this is the way to calculate an arbitrary N's complement value.
I thought there was more to the calculation of N's complement. Something about adding one to the result. It doesn't seem like something you can easily perform by inspection.
Imagine you're doing your math on a car odometer that only goes up to 4 digits. When you reach 10k it wraps around. For any X it is easy to find the complement ~X, because you can just map the digits independently (0,1,2,3,4 -> 9,8,7,6,5 and vice versa) and then add 1. By construction, each digit in X + (~X - 1) will be 9, and adding the 1 back in brings it to 10000, which in odometer math wraps around to 0. So ~X is the additive inverse of X, meaning that adding ~X always brings you to the same number as subtracting X.
Or think of a clock. Subtracting X hours is always the same as adding (12-X) hours. Or any circle: turning counterclockwise X degrees is always the same as turning clockwise (360-X) degrees. The trick is being able to take the complement digit-wise in order to replace a subtraction with an addition. In binary this is just flipping 0 to 1 and vice versa. In base 10 you just need to remember 5 pairs of digits to switch out with each other.
I wish I'd been taught that as a child instead of the conventional method. I had no end of trouble with subtraction because we were taught ‘borrowing’, and being a moderately decent kid I always gave back what I borrowed.
I don't like the term 'carrying' because it refers to a superficial aspect of the method (that you carry the number from one place to another) rather than the essence of the operation.
When you add 15+6, you add 5+6 to get 11, and then you regroup the 11 ones into 1 ten and 1 one.
When you subtract 21-6, you regroup the 2 tens into 1 ten and 10 ones. With the original 1 one, you now have 11 ones. Then you subtract the 6 from the 11 ones to get 5 as the last digit.
> The same works for binary and the two's complement, where it is even easier, because calculating the two's complement is equal to inverting all digits and adding 1.
It’s the same in base 10, if by inverting a digit x you mean replacing it by 9-x.
This explanation has made me realize I can't remember the last time I tried to subtract two four-digit numbers without a calculator. I'm not even sure I know how.
Interesting cultural differences here: the way I learned it, you increment the next digit of the subtrahend (the lower number) on digit underflow, rather than borrowing from the next digit in the minuend. (So it's rather a carry than a borrow.) Anyway, explicit borrowing makes for a nice explanation.
That is interesting! I can see that making it easier to learn, since it's closer to addition if you carry instead of borrow. Though one advantage I see to the borrow is that, since you're modifying the minuend, you have room to write the modified value of the digit directly above itself.
If I recall this right (it's been a long time since), initially we marked the column where the carry went with a dot above the minuend. (I guess it was the same with addition?)
I guess that's how I would attempt it if I had to: the reverse of addition. I just can't remember when I've had to. Funny how addition and subtraction are unequally distributed in life.
How so? You still have to keep track of the overrun, but this time even twice: first when doing 10000 - 4583 and second when adding. I don't see how that would be easier.
I think you can't address two's complement without mentioning ones' complement.
So what is ones' complement? Simply all bits flipped (XOR with all ones across the word length – each bit becomes its complement). There's a simplicity and beauty to this, and the math just works for addition and subtraction. There's also the notable feature of the most significant bit, which, if excluded from the usable range of numbers, becomes the sign-bit. If it is empty (clear), it must be a positive number, if it's set, we have a negative number (since it must be a clear bit flipped).
So everything's perfect, then? Not at all. What happens if we flip all bits of zero (0000)? Well, it becomes all bits set (1111). How do we convert a number to its negative counterpart? We flip all bits and … oh, this must be negative zero. Maybe we can deal with this? Sort of. But it's somewhat nasty, because we need an extra step to traverse zero. Say we go from +1 to -1: there isn't just a zero in between, making this a difference of 2 – as it should be – but both +0 and -0. Three steps. That's odd.
Can we do something about this? Namely, can we get rid of negative zero? Well, as we've seen, we have an extra step on the negative side of things… what about just adding 1 to compensate for this? So, for -1, we wouldn't write 1110 (flipping all bits of 0001), but 1111? Just the same, for -2, we would add 1 to 1101, making it 1110, and so on. – Well, this works! We just eliminated negative zero! Welcome to two's complement.
(So, is it now perfect? Well, sort of. We now have introduced a certain asymmetry into our number system: on the side of things, where the sign-bit is clear, zero is the first number and +1 the second one, while numbers with the sign-bit set start with -1. Thus, we have an excess number on the negative side of the number range described by our word length. We just can't have it all. Well, we could switch to balanced ternary, but this is another story…)
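A quick C illustration of the difference (just a sketch poking at 8-bit patterns; the signed cast behaves as shown on two's complement machines):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t zero = 0x00;
        printf("%02X\n", (uint8_t)~zero);       /* FF: ones' complement leaves a negative zero */
        printf("%02X\n", (uint8_t)(~zero + 1)); /* 00: the extra +1 folds it back onto zero */
        printf("%d\n", (int8_t)0xFF);           /* -1: in two's complement, all bits set is -1 */
        return 0;
    }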
Why does starting with ones' complement make things easier? IMV, it makes it more confusing, because two's complement seems like an arbitrary leap that happens to work.
If you're tasked with mapping a subset of bit-states to negative numbers, it's intuitively obvious that there's already a state that yields 0 when you add 1 to it, as long as you wrap on overflow. With four bits that's 1111. That's a great representation of -1.
If you want to get 0 by adding 2, you'd have to start from 1110, and so on.
Just because the top bit can be used to identify a negative number when you use up half the states doesn't mean it should be thought of as a logical flag. That's how you invent ones' complement.
I think of ones' complement as a logical encoding, and two's complement as an arithmetic encoding, and it shows in how easy negation is in ones' complement, and how easy arithmetic is in two's complement.
The fact that everyone tries to teach them as if one is a precursor to the other is what's confusing.
It's really because that's where we came from. Early computers used ones' complement, and there had to be extra circuitry to fix the -0 issue. Two's complement was explicitly introduced to fix this. – So how do you explain a fix without mentioning the bug?
Notably, ones' complement is easy to operate on, when reading values from blinkenlights or entering them via toggle switches, especially with octal grouping in triplets of bits. Which isn't that true for two's complement representation, as it is less intuitive (and digit positions shift with negative numbers). And the introduction of two's complement roughly coincides with operator consoles and switches becoming less important.
(It's true, if two's complement does work, you must be able to express and generalize this in terms of number theory, but this doesn't mean that it really stands on its own feet in terms of raison d'être.)
Regarding the sign-bit, this is an essential feature: checking the sign-bit for branching (or skipping) is the most basic approach to Turing-complete computing. (Turing, the man himself, thought it was all there should be.) And it's also an essential bridge between numeric and logic computing, e.g., when we shift or rotate a bit vector to inspect the bit now in sign position.
There's also signed-magnitude, which Burroughs used.
Burroughs even had a unified floating point and integer representation. 48-bit floats had a sign, an exponent sign, an exponent, and a mantissa. The binary point was at the low end of the mantissa, and if both inputs had a zero exponent and the result could be represented with a zero exponent, it would be. No need for a float/integer distinction in programs.
> There's also the notable feature of the most significant bit, which, if excluded from the usable range of numbers, becomes the sign-bit. If it is empty (clear), it must be a positive number, if it's set, we have a negative number (since it must be a clear bit flipped)
There also is the problematic feature that, according to that logic, zero is a positive number :-)
> Can we do something about this? Namely, can we get rid of negative zero? Well, as we've seen, we have an extra step on the negative side of things… what about just adding 1 to compensate for this? So, for -1, we wouldn't write 1110 (flipping all bits of 0001), but 1111? Just the same, for -2, we would add 1 to 1101, making it 1110, and so on. – Well, this works! We just eliminated negative zero! Welcome to two's complement.
I think that the “flip all the bits, add one” method to flip the sign in such a number came after the idea of using that mapping between numbers and bit patterns.
I’m thinking that because the mapping naturally comes from the observation that unsigned n-bit integers do math modulo 2^n, and that, thus, the all-ones pattern stands for
{(2^n − 1) + k·2^n | k ∈ ℤ}
⇒ it's an arbitrary choice whether to pick 2^n − 1 or -1 as the representative of that set.
⇒ the hardware for implementing addition, subtraction and multiplication can be 100% identical, whatever representatives you pick for those numbers.
One thing this loses compared to the ones’ complement is that it is harder to flip the sign of a number. However, it isn’t hard to discover the “flip all the bits, add one” method.
A thing that it doesn’t really lose is that this encoding has a ‘weird’ number. To flip the sign of 1000…000, you compute 0111…111 + 1, which is 1000…000, so in some sense, that’s a second number that is its own negative in this encoding.
That isn’t much different from two’s complement. It has two zeroes, this has one zero and another number that’s equal to its own negation.
I think it may be worth keeping in mind how important it was to have an intuitively accessible representation in the era of console lights and toggle switches. Ones' complement and octal grouping (it's easy to interpret a triplet of bits) did this. – And subtraction was just addition with one of the operands inverted. On the other hand, you had to compensate for this by adding extra circuitry to handle transitions across zero (i.e., a change of the sign-bit occurring in additions or subtractions), and by checking for -0 in code as well. As operator consoles became less important, this became less worth the effort and we transitioned. (Mind how the connection between octal digit positions and bits is lost in two's complement. This would have been a major issue.)
I think it's more an aesthetic than a technical annoyance. But since there was ones' complement first (implemented in about every early computer), it may be worth mentioning that there was something lost, as well. (So, for a byte, the range is now -128…+127, which isn't that pretty, while it had been -127…+127 with ones' complement.)
You may be right about NaN representation, but I don't know for certain. Notably, negative zero lived on in scientific computing, and you can have it even in JS, if you set the language version to "1.2".
(Edit/Correction: setting the type to "javascript1.2" no longer works, all versions are now synonymous with just plain "javascript".)
I think it wasn't so much for speed as for the then-expensive extra circuitry. (On the few vintage instruction sets using ones' complement that I know, addition and subtraction still execute in the same time as any other internal ALU instruction. Back then, cycle times were still mostly owed to slow core-memory cycles, and the fastest you'd ever get was a single memory cycle.)
One way to look at it is that the highest bit is negative instead of positive. So in an unsigned 8-bit integer, the highest bit is worth 128. In a signed integer, it's worth -128.
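That view translates directly into code; a small sketch (the helper name is made up):

    #include <stdint.h>
    #include <stdio.h>

    /* Decode an 8-bit pattern with the top bit weighted -128 instead of +128. */
    static int decode_signed(uint8_t bits) {
        int value = -128 * ((bits >> 7) & 1); /* the sign bit contributes -128 */
        for (int i = 0; i < 7; i++)
            value += ((bits >> i) & 1) << i;  /* the other bits keep their positive weights */
        return value;
    }

    int main(void) {
        printf("%d\n", decode_signed(0xFE)); /* -2 = -128 + 126 */
        return 0;
    }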
I think it's best to learn some basic maths by heart and also learn the idea so that you can reconstruct the rest if you forget or begin to doubt.
The idea of two's complement to remember: addition and subtraction for signed ints works the same way on the bit level as for unsigned ints. For signed ints, the highest bit is interpreted as the negative sign. That's it. (This is on top of the idea of how unsigned ints work ie. modulo arithmetic: a carry/borrow bit that doesn't fit in the int is thrown away)
The consequences to learn by heart if you work with signed and unsigned 8-bit ints (like in this article; see the sketch after this list):
* The biggest unsigned int is 255 (2⁸-1; all 1s in binary) and the smallest is 0 (all 0s in binary). Increment and decrement jump between those in overflow/underflow (the modulo arithmetic).
* As the highest bit is needed for the negative sign, the biggest signed int is 127 (2⁷-1; all 1s except the sign bit) and smallest is -128 (-2⁷; only the sign bit set). Increment and decrement jump between those in overflow/underflow.
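A tiny C check of those jump points (the cast back to int8_t is implementation-defined before C23, but behaves as shown on every two's complement machine):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t u = 255;
        u++;                            /* unsigned overflow wraps: 255 -> 0 */
        int8_t s = 127;
        s = (int8_t)(s + 1);            /* signed overflow wraps: 127 -> -128 (bit pattern 0x80) */
        printf("%u %d\n", (unsigned)u, s); /* prints: 0 -128 */
        return 0;
    }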
I think the dial visualisations are great in this article for a mental model of the modulo arithmetic and how the overflow/underflow point is the difference between unsigned and signed arithmetic. Perhaps it would be even clearer to rotate them half a step so that the overflow point is straight up/down and the numbers with the highest bit set are all on the left side (hence 0 taking one of the slots on the positive side).
An (in my opinion) easier way to remember the bit representation is to remember that a positive number and its negative counterpart should add to 0. But zero is congruent to 2^b (where b is the number of bits). So, equivalently, a number and its negative counterpart should add up to 2^b (since nonzero bit patterns can't add up to a literal 0).
Then, suppose we take a positive number x and invert all its bits to get y. Then x + y = 1111… = 2^b - 1. So if we add 1 to one of x or y, the two become positive/negative counterparts. It's easier to keep positive numbers having the same representation as unsigned, so we choose y + 1 as the bit representation of -x.
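Checking just those two facts in C:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t x = 42;
        uint8_t y = ~x;                             /* flip all bits: x + y = 11111111 = 2^8 - 1 */
        printf("%u\n", (unsigned)(uint8_t)(x + y)); /* 255 */
        printf("%d\n", (int8_t)(y + 1));            /* -42: y + 1 is the negative counterpart of x */
        return 0;
    }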
For context: This is the author of the Chrome Linux port, of Ninja-the-build-system, and of GtkSpell. Now I feel less bad about not having understood two's complement before. :)
I feel lucky to have gotten two's complement early.
Turns out that if you truly understand two's complement, you can implement signed math in C without UB if you use unsigned types because overflow on unsigned types is defined.
Side note: this is another reason I won't touch Zig; UB on unsigned overflow is too high of a price to pay.
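For instance, a hedged sketch of what I mean (the conversion back to int32_t is implementation-defined before C23, but every mainstream compiler does the two's complement thing):

    #include <stdint.h>
    #include <stdio.h>

    /* Wrapping signed addition with no UB: do the math on unsigned,
       where overflow is defined, then reinterpret the bits as signed. */
    static int32_t add_wrap(int32_t a, int32_t b) {
        uint32_t r = (uint32_t)a + (uint32_t)b; /* defined even when it wraps */
        return (int32_t)r;
    }

    int main(void) {
        printf("%d\n", add_wrap(INT32_MAX, 1)); /* -2147483648 on two's complement targets */
        return 0;
    }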
I think this is the reason: When you negate a number in two's complement, you take (a single) two to the power of the bit width and then subtract the number. For example, negating 0011b (4 bits) means taking 10000b and subtracting 0011b, giving you 1101b.
When you negate a number in ones' complement, you take as many ones (plural) as the bit width and then subtract the number. For example, negating 0011b (4 bits) means taking 1111b and subtracting 0011b, giving you 1100b.
Yes, that’s basically the algorithm, but it took me a while to understand why it works.
The subtraction from base^n is the radix complement; the subtraction from base^n - 1 is the diminished radix complement.
Example base 9: with a fixed number width of say four, e.g. 4781, you take the radix complement by subtracting from 9^4, which is 10000 in base 9. But to take the complement, you again need subtractions with nasty borrows. But you wanted to reuse your adder and not have a subtractor to take the complement.
So you observe that the formula is the same as 10000-1-4781+1 (subtract one and add it back). Subtracting the one gives you 8888. This solves your problem, because every digit is the highest one in its base, so you will never have to borrow. This is the diminished radix complement, to which you add one to get back to the radix complement.
Back to base 2: you take the diminished radix complement to avoid borrows (which is conveniently just a bit flip in base 2) and add one to arrive at the radix complement. Thus the familiar algorithm: "flip all bits and add one for two's complement".
I hope I did a better job explaining it in my post linked above.
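Here's a little C sketch of the same recipe for an arbitrary base (the function name is mine):

    #include <stdio.h>

    /* Digit-wise diminished radix complement (no borrows needed), plus 1
       for the radix complement. Works in any base. */
    static long radix_complement(long x, int base, int width) {
        long result = 0, place = 1;
        for (int i = 0; i < width; i++) {
            long digit = (x / place) % base;
            result += ((base - 1) - digit) * place; /* each digit d becomes (base-1) - d */
            place *= base;
        }
        return result + 1; /* diminished radix complement + 1 = radix complement */
    }

    int main(void) {
        printf("%ld\n", radix_complement(3, 2, 8));     /* 253 = 2^8 - 3 ("flip bits, add 1") */
        printf("%ld\n", radix_complement(4583, 10, 4)); /* 5417 = 10^4 - 4583 */
        return 0;
    }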
To me, the clearest way to think about signed vs. unsigned integers is that different representatives of the integers modulo n are chosen.
For example, for 8-bit signed integers we choose the representatives -128, -127, ..., 127 of the residue classes -128 + 256Z, -127 + 256Z, ..., 127 + 256Z in the ring of integers modulo 256.
For 8-bit unsigned integers, we instead choose the representatives 0, 1, ..., 255.
Mathematically, I do not see how anything is "breaking" as the article claims.
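In C terms, the choice of representatives is just the cast:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t bits = 0xFF;              /* one residue class mod 256, two representatives: */
        printf("%u\n", (unsigned)bits);   /* 255 (unsigned picks 0..255) */
        printf("%d\n", (int8_t)bits);     /* -1 (signed picks -128..127) */
        return 0;
    }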
If the problem is remembering whether to add or subtract 1 after complementing, then one just needs to take decimal 0 = 0x00 as an example: complement (0xFF), then necessarily add 1 (0x00) to get -0 = 0.
"Finally got it" aka "what it really is" is always an arbitrary interpretation. Why not view it as a finite cousin of 2-adic numbers? The article doesn't even mention modular arithmetic.
Here, "finally getting it" means going from "the way computers represent negative numbers, chosen to make the math work out" to "the way computers represent negative numbers, chosen to make the math work out, but I poked around with a few examples."
Also, a monad is just a monoid in the category of endofunctors, so I’m not sure why when people are trying to learn about them they need any more than that. ;)
When you advertise your article as helping your readers "finally get" a concept, then giving them "more than that" is precisely what you should wanna do, but also giving them more than the obvious examples that are available to anyone.
Haha! I actually made the connection when viewing the recent Veritasium video on p-adic numbers. They were walking through a proof of how ...9999 = -1, because if you add one to it, it yields zero. I was like, that's just like two's complement, and how you represent -1 by filling the places with the largest representable digit!
What the author is saying about remembering the math vs the illustration took me a second to get. The discs should probably be rotated slightly to make it more obvious that the axis you're flipping over when you take the bitwise not rests between -1 and 0, not vertically through 0 itself. When you look at it like that, it's much easier to see that whenever you flip across that axis, you end up needing to take a step to the right whether you're converting negative to positive or positive to negative.
And the reason why it's the same rightward movement for going from negative to positive and from positive to negative is because a move to the right on one side is effectively a move to the left on the other side. I see it in my head as there being a phantom mirrored dot always indicating where you will end up when you flip across the axis.
Another way to understand what e.g. 0xFE means as a signed integer is to just add a number until it equals 0. So, 0xFE + 1 = 0xFF, not zero. 0xFE + 2 = 0. Therefore, 0xFE is -2.
Two's complement was described carefully in Preliminary discussion of the logical design of an electronic computing instrument §5.7 (Burks, Goldstine, von Neumann, 1946), but it's not an easy read. I suspect some early machines used signed-magnitude or one's complement simply because they didn't get it.
I’ll blame CPU makers when you can name one CPU in current use that isn’t bog standard 2’s complement that wraps around by default.
About Zig, you probably meant implementation defined, or at the very worst unspecified. In C standard parlance those aren’t undefined at all. They can’t encrypt your hard drive just because you relied on them. Undefined behaviour however, can. And does, when UB happens to enable an arbitrary code execution vulnerability.
Zig does have modular `+%` and saturating `+|` operators, as well as `@addWithOverflow()`, though. You get to choose what happens on overflow; if you don't, the compiler gets to.
Having overflow of a common loop type be undefined is extremely useful for optimization, since many kinds of unrolling and vectorization approaches only work if you can assume the loop counter cannot wrap.
Since the decimal number system is "the" system of all number systems (we are pestered with it from an early age, before our brains have developed suitable abstraction capability, until it just becomes second nature), it can be advantageous to think through how ten's complement representation works in the decimal number system.
Two's complement is a special case of p-adic numbers. A recent introductory video comes to mind: https://www.youtube.com/watch?v=tRaq4aYPzCc Veritasium - "A totally different way to do math" (33m05s, 2023-06-06)
> Suppose you have the bit pattern 0xFE and you add 0x1 to it, producing 0xFF. If you interpret that operation as signed math it's computing 254 + 1 = 255, while if you interpret it as unsigned math it was -2 + 1 = -1.
Isn't that backwards? The first one is unsigned, the second one is signed.
Actually they are neither signed nor unsigned, but modular numbers, modulo 256.
Any number modulo 256 is a representative for an infinity of integer numbers, e.g. 0xff stands for ... -257, -1, 255, 511, ... and 0x01 stands for ... -255, +1, 257, 513, ... and 0xfe stands for ... -258, -2, 254, 510, ... and 0x80 stands for ... -384, -128, +128, 384, ...
In hardware, the operations with modular numbers are implemented first, and then, in addition to them, a few Boolean functions are added, to allow the interpretation of the modular number operations as signed integer operations and/or unsigned integer operations. For the former overflow and sign are needed, while for the latter carry/borrow is needed.
Besides comparison of non-equal numbers, the most important operation that is meaningless for modular numbers, so it is defined only for signed or unsigned operands, is extension from a binary format with less bits to a format with more bits, a.k.a. sign extension for signed operands and zero extension for unsigned operands. Comparison and extension depend on overflow, sign and carry.
So normally, when attempting to understand a sequence of operations on binary numbers, they should be performed as modular operations and only the final result should be interpreted as desired, e.g. as signed or unsigned.
(For efficiency reasons, the actual implementation of two's complement multiplication and division with double-length product or dividend is a little more complex than this, because they correspond with double-length modular operations, but instead of that they are implemented by distinct algorithms operating on single-length numbers. When trying to verify such operations, it is still possible to compute them as double-length modular operations, if correct conversions are used, i.e. with sign extension)
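A sketch of that layering in C: the modular add comes first, then carry and overflow as separate Boolean outputs (names are my own):

    #include <stdint.h>
    #include <stdio.h>

    /* 8-bit modular addition plus the two interpretation flags:
       carry for the unsigned view, overflow for the signed view. */
    static uint8_t add8(uint8_t a, uint8_t b, int *carry, int *overflow) {
        uint8_t r = (uint8_t)(a + b);                /* the modular result, mod 256 */
        *carry = r < a;                              /* unsigned wraparound happened */
        *overflow = ((a ^ r) & (b ^ r) & 0x80) != 0; /* result's sign disagrees with both inputs */
        return r;
    }

    int main(void) {
        int c, v;
        uint8_t r = add8(0x7F, 0x01, &c, &v);
        printf("%02X carry=%d overflow=%d\n", r, c, v); /* 80 carry=0 overflow=1 */
        return 0;
    }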
Having started with assembly programming, I see 'signed' and 'unsigned' as just two different views on the same 'sign-agnostic' bag of bits (which of course is the whole point of 2's-complement), a bit like Schroedinger's Cat: a number is neither signed nor unsigned until you look at it.
From that point of view (that signedness is just a different view on the same bits) it's a bit weird that 'signedness' ended up in type systems of high level languages, and isn't just a printf formatting feature (the only other place where signedness matters is when extending narrow integers to wider integers, where the top-most bit either needs to be set to zero for 'unsigned' or replicated for 'signed').
I haven't done the binary math to figure it out so far, but I wonder if signed vs unsigned division is also just doing an internal sign-extension to a wider type before the actual 'sign-agnostic' operation (e.g. a special case of the general sign-extension to a wider type). But as I said, not completely sure if that's the case, or if there are two entirely different hardware blocks for signed vs unsigned division. Same for muls vs mulu btw.
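The extension point is easy to see in C (same bits, two widenings; the signed cast behaves as shown on two's complement targets):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t bits = 0xFE;                   /* the same bag of bits */
        printf("%d\n", (int16_t)(int8_t)bits); /* -2: sign extension replicates the top bit */
        printf("%d\n", (int16_t)bits);         /* 254: zero extension fills with zeros */
        return 0;
    }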
What I'm still trying to understand is two's-complement multiplication.
When you're computing the product of two n-bit numbers, the product potentially requires 2n bits of representation.
As I discovered in the RISC-V "M" instructions [0], the n least-significant bits of the result are the same, regardless of whether each operand is signed or unsigned.
However, the most-significant n-bits of the result are contingent on the signed-ness of the two operands.
Anyone know where I can find a good explanation for that?
Try multiplying signed and unsigned 16-bit integers with values in the range -128 (65408) thru +255. Signed n-bit multiplication is just multiplication with sign-extension to 2n bits.
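A concrete C check of both claims (low halves agree, high halves don't):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int8_t  sa = -2,  sb = 3;
        uint8_t ua = 254, ub = 3;          /* same 8-bit patterns as sa and sb */
        int16_t  sp = (int16_t)(sa * sb);  /* signed: promote (sign-extend), multiply, truncate */
        uint16_t up = (uint16_t)(ua * ub); /* unsigned: zero-extend, multiply, truncate */
        printf("%04X %04X\n", (uint16_t)sp, up); /* FFFA 02FA: low byte FA matches, high byte differs */
        return 0;
    }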
It's worth highlighting that this works for all bit widths, including infinite. When working with infinite representations, the theoretical analogue is p-adic numbers.
To be honest, I can understand why it's not very clear to most people. I think you need a picture instead of the paragraph.
> Given a set of all possible N-bit values,
I think they lost most readers at this point with the pseudo-mathematical language. Why think of a set (when the values are ordered)? What are bit values and what's N? ("combinations of N bits" or "N-bit binary values" would be clearer.)
> we can assign the lower (by the binary value) half to be the integers from 0 to (2**(N − 1) − 1) inclusive
Or here: what does it mean to assign the "N-bit values" to be integers, what is a lower half of a set, "N-bit values by the binary value"? Where does the expression (2**(N-1)-1) come from?
The real way to 'get' two's complement is to build the combinational logic (i.e., gates) for both ones' complement and two's complement. You'll instantly understand the value of two's complement!
I always just thought of it as, the uppermost bit is that power of two but negative. So for four bits from right to left, if they're set to one you add 1, 2, 4... -8.
How do you represent a negative number in binary? How do you represent 0 in Roman numerals? How do you represent a negative number in decimal? How do you represent a decimal fraction in decimal? Short answer: you can't, so you have to cheat. The signs − and . are not decimal numerals, just as 0 is not a Roman numeral. Balanced ternary allows you to represent negative numbers without cheating, but as long as we use binary computers, we will have to cheat, and 2's complement along with fixed-size integers is how we cheat.
The article explains how it's not cheating: you are using modulo arithmetic anyway (the dial visualisation in the article), you can pick which of the congruent numbers is the canonical one, so you choose to pick a negative one for those whose highest bit is set.