
Optimizing away a "while (1);"

itbloger 2020. 6. 12. 21:42

Optimizing away a "while (1);" in C++0x


Update, see below!

Apparently, C++0x allows a compiler to print "Hello" for the following snippet:

#include <iostream>

int main() {
  while(1) 
    ;
  std::cout << "Hello" << std::endl;
}

It apparently has something to do with threads and optimization capabilities. I think this could surprise many people, though.

Does anyone have a good explanation of why this had to be allowed? For reference, the most recent C++0x draft says in 6.5/5:

A loop that, outside of the for-init-statement in the case of a for statement,

  • makes no calls to library I/O functions, and
  • does not access or modify volatile objects, and
  • performs no synchronization operations (1.10) or atomic operations (Clause 29)

may be assumed by the implementation to terminate. [Note: This is intended to allow compiler transformations, such as removal of empty loops, even when termination cannot be proven. — end note]

Edit:

This insightful article says about that standards text:

Unfortunately, the words "undefined behavior" are not used. However, anytime the standard says "the compiler may assume P," it is implied that a program which has the property not-P has undefined semantics.

Is that correct, and is the compiler allowed to print "Bye" for the above program?


There is an even more insightful thread here, which is about an analogous change to C, started off by the guy who wrote the article linked above. Among other useful facts, they present a solution that seems to also apply to C++0x (Update: this will no longer work with n3225 — see below!):

endless:
  goto endless;

The compiler is not allowed to optimize that away, it seems, because it's not a loop but a jump. Another guy summarizes the proposed changes in C++0x and C201X:

By writing a loop, the programmer is asserting either that the loop does something with visible behavior (performs I/O, accesses volatile objects, or performs synchronization or atomic operations), or that it eventually terminates. If I violate that assumption by writing an infinite loop with no side effects, I am lying to the compiler, and my program's behavior is undefined. (If I'm lucky, the compiler might warn me about it.) The language doesn't provide (no longer provides?) a way to express an infinite loop without visible behavior.


Update on 3.1.2011 with n3225: the committee moved the text to 1.10/24 and says

The implementation may assume that any thread will eventually do one of the following:

  • terminate,
  • make a call to a library I/O function,
  • access or modify a volatile object, or
  • perform a synchronization operation or an atomic operation.

The goto trick will not work anymore!


Does anyone have a good explanation of why this had to be allowed?

Yes, Hans Boehm provides a rationale for this in N1528: Why undefined behavior for infinite loops?. Although this is a WG14 document, the rationale applies to C++ as well, and the document refers to both WG14 and WG21:

As N1509 correctly points out, the current draft essentially gives undefined behavior to infinite loops in 6.8.5p6. A major issue for doing so is that it allows code to move across a potentially non-terminating loop. For example, assume we have the following loops, where count and count2 are global variables (or otherwise have their address taken), and p is a local variable whose address has not been taken:

for (p = q; p != 0; p = p -> next) {
    ++count;
}
for (p = q; p != 0; p = p -> next) {
    ++count2;
}

Could these two loops be merged and replaced by the following loop?

for (p = q; p != 0; p = p -> next) {
        ++count;
        ++count2;
}

Without the special dispensation in 6.8.5p6 for infinite loops, this would be disallowed: if the first loop doesn't terminate because q points to a circular list, the original never writes to count2. Thus it could be run in parallel with another thread that accesses or updates count2. This is no longer safe with the transformed version, which does access count2 in spite of the infinite loop. Thus the transformation potentially introduces a data race.

In cases like this, it is very unlikely that a compiler would be able to prove loop termination; it would have to understand that q points to an acyclic list, which I believe is beyond the ability of most mainstream compilers, and often impossible without whole-program information.

The restrictions imposed by non-terminating loops are a restriction on the optimization of terminating loops for which the compiler cannot prove termination, as well as on the optimization of actually non-terminating loops. The former are much more common than the latter, and often more interesting to optimize.

There are clearly also for-loops with an integer loop variable in which it would be difficult for a compiler to prove termination, and it would thus be difficult for the compiler to restructure loops without 6.8.5p6. Even something like

for (i = 1; i != 15; i += 2)

or

for (i = 1; i <= 10; i += j)

seems nontrivial to handle. (In the former case, some basic number theory is required to prove termination, in the latter case, we need to know something about the possible values of j to do so. Wrap-around for unsigned integers may complicate some of this reasoning further.)
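As a small runnable check (a sketch, not part of the original text): the first of those loops does in fact terminate, but only because i starts odd and steps by 2, so it hits 15 exactly — the "basic number theory" mentioned above. Had the condition been i != 16, it would spin forever (modulo signed-overflow UB).

```cpp
// Counts the iterations of `for (i = 1; i != 15; i += 2)`.
// i takes the odd values 1, 3, 5, 7, 9, 11, 13 and then reaches 15,
// so the loop body runs exactly 7 times.
int iterations_until_15() {
    int n = 0;
    for (int i = 1; i != 15; i += 2) ++n;
    return n;
}
```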

This issue seems to apply to almost all loop restructuring transformations, including compiler parallelization and cache-optimization transformations, both of which are likely to gain in importance, and are already often important for numerical code. This appears likely to turn into a substantial cost for the benefit of being able to write infinite loops in the most natural way possible, especially since most of us rarely write intentionally infinite loops.

The one major difference with C is that C11 provides an exception for controlling expressions that are constant expressions which differs from C++ and makes your specific example well-defined in C11.
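Boehm's fusion example can be exercised concretely. Below is a minimal sketch (the Node type and function names are invented for illustration): for a terminating traversal the separate and fused loops are observably equivalent, which is exactly the transformation 6.8.5p6 licenses; if q pointed into a cycle, only the fused version would ever touch count2.

```cpp
#include <utility>

// Minimal singly-linked node, standing in for the list in Boehm's example.
struct Node { Node* next; };

// Original form: two separate traversals, as the programmer wrote them.
std::pair<int, int> count_separate(Node* q) {
    int count = 0, count2 = 0;
    for (Node* p = q; p != nullptr; p = p->next) ++count;
    for (Node* p = q; p != nullptr; p = p->next) ++count2;
    return {count, count2};
}

// Fused form: the transformation the compiler would like to perform.
std::pair<int, int> count_fused(Node* q) {
    int count = 0, count2 = 0;
    for (Node* p = q; p != nullptr; p = p->next) {
        ++count;
        ++count2;
    }
    return {count, count2};
}
```

For an acyclic list the two functions agree; the danger arises only when the first loop never terminates, because then count2 is never written in the original but is written in the fused version.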


To me, the relevant justification is:

This is intended to allow compiler transformations, such as removal of empty loops, even when termination cannot be proven.

Presumably, this is because proving termination mechanically is difficult, and the inability to prove termination hampers compilers which could otherwise make useful transformations, such as moving nondependent operations from before the loop to after or vice versa, performing post-loop operations in one thread while the loop executes in another, and so on. Without these transformations, a loop might block all other threads while they wait for the one thread to finish said loop. (I use "thread" loosely to mean any form of parallel processing, including separate VLIW instruction streams.)

EDIT: Dumb example:

while (complicated_condition()) {
    x = complicated_but_externally_invisible_operation(x);
}
complex_io_operation();
cout << "Results:" << endl;
cout << x << endl;

Here, it would be faster for one thread to do the complex_io_operation while the other is doing all the complex calculations in the loop. But without the clause you have quoted, the compiler has to prove two things before it can make the optimisation: 1) that complex_io_operation() doesn't depend on the results of the loop, and 2) that the loop will terminate. Proving 1) is pretty easy, proving 2) is the halting problem. With the clause, it may assume the loop terminates and get a parallelisation win.

I also imagine that the designers considered that the cases where infinite loops occur in production code are very rare and are usually things like event-driven loops which access I/O in some manner. As a result, they have pessimised the rare case (infinite loops) in favour of optimising the more common case (noninfinite, but difficult to mechanically prove noninfinite, loops).

It does, however, mean that infinite loops used in learning examples will suffer as a result, and will raise gotchas in beginner code. I can't say this is entirely a good thing.

EDIT: with respect to the insightful article you now link, I would say that "the compiler may assume X about the program" is logically equivalent to "if the program doesn't satisfy X, the behaviour is undefined". We can show this as follows: suppose there exists a program which does not satisfy property X. Where would the behaviour of this program be defined? The Standard only defines behaviour assuming property X is true. Although the Standard does not explicitly declare the behaviour undefined, it has declared it undefined by omission.

Consider a similar argument: "the compiler may assume a variable x is only assigned to at most once between sequence points" is equivalent to "assigning to x more than once between sequence points is undefined".


I think the correct interpretation is the one from your edit: empty infinite loops are undefined behavior.

I wouldn't say it's particularly intuitive behavior, but this interpretation makes more sense than the alternative one, that the compiler is arbitrarily allowed to ignore infinite loops without invoking UB.

If infinite loops are UB, it just means that non-terminating programs aren't considered meaningful: according to C++0x, they have no semantics.

That does make a certain amount of sense too. They are a special case, where a number of side effects just no longer occur (for example, nothing is ever returned from main), and a number of compiler optimizations are hampered by having to preserve infinite loops. For example, moving computations across the loop is perfectly valid if the loop has no side effects, because eventually, the computation will be performed in any case. But if the loop never terminates, we can't safely rearrange code across it, because we might just be changing which operations actually get executed before the program hangs. Unless we treat a hanging program as UB, that is.
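A hedged sketch of that point (all names invented for illustration): moving a loop-independent computation across a terminating loop is unobservable, which is precisely the freedom the rule buys even when the compiler cannot prove termination.

```cpp
// A computation that does not depend on the loop below.
int independent_work() { return 6 * 7; }

// As written: the independent computation happens after the loop.
int run_original(int n) {
    int x = 0;
    for (int i = 0; i < n; ++i) x += i;  // terminates for any finite n
    return x + independent_work();
}

// As a compiler might reorder it: the computation is hoisted above the
// loop. For a terminating loop the result is identical; for a hanging
// loop, the reordering changes which work happens before the hang --
// which is only permissible if the hang is undefined behavior.
int run_hoisted(int n) {
    int later = independent_work();
    int x = 0;
    for (int i = 0; i < n; ++i) x += i;
    return x + later;
}
```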


I think this is along the lines of this type of question, which references another thread. Optimization can occasionally remove empty loops.


The relevant issue is that the compiler is allowed to reorder code whose side effects do not conflict. The surprising order of execution could occur even if the compiler produced non-terminating machine code for the infinite loop.

I believe this is the right approach. The language spec defines ways to enforce order of execution. If you want an infinite loop that cannot be reordered around, write this:

volatile int dummy_side_effect;

while (1) {
    dummy_side_effect = 0;
}

printf("Never prints.\n");

I think the issue could perhaps best be stated as: "If a later piece of code does not depend on an earlier piece of code, and the earlier piece of code has no side-effects on any other part of the system, the compiler's output may execute the later piece of code before, after, or intermixed with the execution of the former, even if the former contains loops, without regard for when or whether the former code would actually complete." For example, the compiler could rewrite:

void testfermat(int n)
{
  int a=1,b=1,c=1;
  while(pow(a,n)+pow(b,n) != pow(c,n))
  {
    if (b > a) a++; else if (c > b) {a=1; b++;} else {a=1; b=1; c++;}
  }
  printf("The result is ");
  printf("%d/%d/%d", a,b,c);
}

as

void testfermat(int n)
{
  if (fork_is_first_thread())
  {
    int a=1,b=1,c=1;
    while(pow(a,n)+pow(b,n) != pow(c,n))
    {
      if (b > a) a++; else if (c > b) {a=1; b++;} else {a=1; b=1; c++;}
    }
    signal_other_thread_and_die();
  }
  else // Second thread
  {
    printf("The result is ");
    wait_for_other_thread();
  }
  printf("%d/%d/%d", a,b,c);
}

Generally not unreasonable, though I might worry that:

  int total=0;
  for (i=0; num_reps > i; i++)
  {
    update_progress_bar(i);
    total+=do_something_slow_with_no_side_effects(i);
  }
  show_result(total);

would become

  int total=0;
  if (fork_is_first_thread())
  {
    for (i=0; num_reps > i; i++)
      total+=do_something_slow_with_no_side_effects(i);
    signal_other_thread_and_die();
  }
  else
  {
    for (i=0; num_reps > i; i++)
      update_progress_bar(i);
    wait_for_other_thread();
  }
  show_result(total);

By having one CPU handle the calculations and another handle the progress bar updates, the rewrite would improve efficiency. Unfortunately, it would make the progress bar updates rather less useful than they should be.


Whether a loop is infinite at all is, in non-trivial cases, not decidable for the compiler.

In some cases, the optimiser may even reach a better complexity class for your code (e.g. it was O(n^2) and you get O(n) or O(1) after optimisation).

So, including a rule in the C++ standard that disallows removing an infinite loop would make many optimisations impossible. And most people don't want this. I think this pretty much answers your question.
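For instance (a sketch, not from the original answer), compilers such as GCC and Clang can collapse a linear summation loop into a closed form via induction-variable analysis, turning O(n) into O(1):

```cpp
// O(n) as written: sums 0 .. n-1 one element at a time.
long long sum_loop(long long n) {
    long long total = 0;
    for (long long i = 0; i < n; ++i) total += i;
    return total;
}

// O(1): the closed form the optimizer effectively substitutes.
long long sum_closed_form(long long n) {
    return n * (n - 1) / 2;
}
```

Such a rewrite is only valid because the compiler may assume the loop terminates; with a mandatory "preserve infinite loops" rule, it would first have to prove termination.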


Another thing: I never have seen any valid example where you need an infinite loop which does nothing.

The one example I have heard about was an ugly hack that really should be solved otherwise: It was about embedded systems where the only way to trigger a reset was to freeze the device so that the watchdog restarts it automatically.

If you know any valid/good example where you need an infinite loop which does nothing, please tell me.


I think it's worth pointing out that loops which would be infinite except for the fact that they interact with other threads via non-volatile, non-synchronised variables can now yield incorrect behaviour with a new compiler.

In other words, make your globals volatile -- as well as arguments passed into such a loop via pointer/reference.

Reference URL: https://stackoverflow.com/questions/3592557/optimizing-away-a-while1-in-c0x
