Signed to unsigned conversion in C - is it always safe?

itbloger 2020. 7. 11. 10:56


Suppose I have the following C code.

unsigned int u = 1234;
int i = -5678;

unsigned int result = u + i;

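What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)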


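Short Answer

Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).

Long Answer

According to the C99 Standard:

6.3.1.8 Usual arithmetic conversions

  1. If both operands have the same type, then no further conversion is needed.
  2. Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
  3. Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
  4. Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
  5. Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

In your case, we have one unsigned int (u) and one signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.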

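6.3.1.3 Signed and unsigned integers

  1. When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
  2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
  3. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:

6.2.5 (9)

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.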

Bonus: Arithmetic Conversion Semi-WTF

#include <stdio.h>

int main(void)
{
  unsigned int plus_one = 1;
  int minus_one = -1;

  if(plus_one < minus_one)
    printf("1 < -1");
  else
    printf("boring");

  return 0;
}

You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes

Bonus: Arithmetic Conversion Side Effect

Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, i.e.:

unsigned int umax = -1; // umax set to UINT_MAX

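This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?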


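Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.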

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.

In general, conversions in C are defined to operate on values, not on representations.

To answer the original question:

unsigned int u = 1234;
int i = -5678;

unsigned int result = u + i;

The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.

(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)


Referring to the bible:

  • Your addition operation causes the int to be converted to an unsigned int.
  • Assuming two's complement representation and equally sized types, the bit pattern does not change.
  • Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
  • The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.

When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.

So it is safe in the sense that the result might be huge and wrong, but it will never crash.


When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Numbers that were originally negative will now be interpreted as larger positive numbers.


As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.

However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, e.g.:

int iValue, iResult;
unsigned int uValue, uResult;

It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.


What implicit conversions are going on here,

i will be converted to an unsigned integer.

and is this code safe for all values of u and i?

Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).

The rules are written in typically hard-to-read standards-speak, but essentially, whatever representation was used in the signed integer, the unsigned integer will contain a 2's complement representation of the number.

Addition, subtraction and multiplication will work correctly on these numbers, resulting in another unsigned integer containing a 2's complement number representing the "real result".

Division and casting to larger unsigned integer types will have well-defined results, but those results will not be 2's complement representations of the "real result".

(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)

While conversions from signed to unsigned are defined by the standard, the reverse is implementation-defined. Both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.

https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
https://msdn.microsoft.com/en-us/library/0eex498h.aspx


Horrible Answers Galore

Ozgur Ozcitak

When you cast from signed to unsigned (and vice versa) the internal representation of the number does not change. What changes is how the compiler interprets the sign bit.

This is completely wrong.

Mats Fredriksson

When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.

This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.

smh

Your addition operation causes the int to be converted to an unsigned int.

Wrong. Maybe it does and maybe it doesn't.

Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)

Wrong. It is either undefined behavior if it causes overflow or the value is preserved.

Anonymous

The value of i is converted to unsigned int ...

Wrong. Depends on the precision of an int relative to an unsigned int.

Taylor Price

As was previously answered, you can cast back and forth between signed and unsigned without a problem.

Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.

Now I can finally answer the question.

Should the precision of int be equal to unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior but with those exact numbers you will get -4444 [1]. This value will have type int. But you are trying to store that value into an unsigned int so that will then be cast to an unsigned int and the value that result will end up having would be (UINT_MAX+1) - 4444.

Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int yielding the value (UINT_MAX+1) - 5678 which will be added to the other unsigned int 1234. Should u and i have other values, which make the expression fall outside the range {0..UINT_MAX}, the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX} and no undefined behavior will occur.

What is precision?

Integers have padding bits, sign bits, and value bits. Unsigned integers do not have a sign bit, obviously. Unsigned char is further guaranteed to not have padding bits. The number of value bits an integer has is how much precision it has.

[Gotchas]

The sizeof operator alone cannot be used to determine the precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.

[1] The overflow may occur at one of two points: before the addition (during promotion), when you have an unsigned int which is too large to fit inside an int, or after the addition, when the result overflows even though the unsigned int was within the range of an int.


On an unrelated note, I am a recent graduate student trying to find work ;)

Reference URL: https://stackoverflow.com/questions/50605/signed-to-unsigned-conversion-in-c-is-it-always-safe
