Fine-tuning enabled transformations

At each optimization level, you can disable some of the transformations individually. To disable a transformation, use the appropriate command line option (for instance ‑‑no_inline), its equivalent in the IDE (for instance Function inlining), or the #pragma optimize directive. These transformations can be disabled individually:

Common subexpression elimination
Loop unrolling
Function inlining
Code motion
Type-based alias analysis
Static clustering
Instruction scheduling
Vectorization
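
As a rough illustration of the per-function route, the sketch below places a #pragma optimize directive in front of a single function. The parameter name no_unroll is an assumption modeled on the ‑‑no_unroll command line option, and checksum is a hypothetical function; check the description of the optimize pragma directive for the exact parameters your compiler version accepts. The __ICCARM__ guard keeps other compilers from seeing the pragma.

```c
/* Assumption: no_unroll is accepted as a parameter to #pragma optimize,
   mirroring the --no_unroll command line option. */
#ifdef __ICCARM__
#pragma optimize=no_unroll
#endif
int checksum(const unsigned char *data, int len)
{
  int sum = 0;

  /* With the pragma in effect, this loop is meant to stay rolled. */
  for (int i = 0; i < len; i++)
    sum += data[i];

  return sum;
}
```

The command line option applies to the whole compilation unit, whereas a directive like this is intended to affect only the function that follows it.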

Common subexpression elimination

Redundant re-evaluation of common subexpressions is by default eliminated at optimization levels Medium and High. This optimization normally reduces both code size and execution time. However, the resulting code might be difficult to debug.
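
A minimal sketch of what this means at source level, using hypothetical function names: the first function evaluates a * b twice, and the second shows the form the optimizer can reduce it to. The compiler performs this rewrite internally; you do not need to do it by hand.

```c
/* As written: the subexpression (a * b) appears twice. */
int diff_of_squares_like(int a, int b, int c)
{
  int x = (a * b) + c;
  int y = (a * b) - c;
  return x * y;
}

/* Conceptual result of common subexpression elimination:
   (a * b) is evaluated once and the value is reused. */
int diff_of_squares_like_cse(int a, int b, int c)
{
  int t = a * b;
  int x = t + c;
  int y = t - c;
  return x * y;
}
```

Because the temporary t has no counterpart in the original source, stepping through the optimized code in a debugger can be harder to follow.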

Note

This option has no effect at optimization levels None and Low.

For more information about the command line option, see ‑‑no_cse.

Loop unrolling

Loop unrolling means that the code body of a loop, whose number of iterations can be determined at compile time, is duplicated. Loop unrolling reduces the loop overhead by amortizing it over several iterations.

This optimization is most efficient for smaller loops, where the loop overhead can be a substantial part of the total loop body.

Loop unrolling, which can be performed at optimization level High, normally reduces execution time, but increases code size. The resulting code might also be difficult to debug.

The compiler heuristically decides which loops to unroll. Only relatively small loops where the loop overhead reduction is noticeable will be unrolled. Different heuristics are used when optimizing for speed, size, or when balancing between size and speed.
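
The following sketch shows the idea with a hand-unrolled loop. The function names and the unroll factor of 4 are illustrative assumptions; the compiler chooses its own factor, if it unrolls at all.

```c
#define N 8  /* iteration count known at compile time */

/* Rolled loop: one compare-and-branch per element. */
int sum_rolled(const int *v)
{
  int sum = 0;
  for (int i = 0; i < N; i++)
    sum += v[i];
  return sum;
}

/* Conceptually unrolled by 4: one compare-and-branch per four elements,
   at the cost of a larger loop body. */
int sum_unrolled(const int *v)
{
  int sum = 0;
  for (int i = 0; i < N; i += 4) {
    sum += v[i];
    sum += v[i + 1];
    sum += v[i + 2];
    sum += v[i + 3];
  }
  return sum;
}
```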

Note

This option has no effect at optimization levels None, Low, and Medium.

To disable loop unrolling, use the command line option ‑‑no_unroll. For more information, see ‑‑no_unroll.

Function inlining

Function inlining means that a function, whose definition is known at compile time, is integrated into the body of its caller to eliminate the overhead of the call. This optimization normally reduces execution time, but might increase the code size.

For more information, see Inlining functions.
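
As a small sketch with hypothetical function names, the second function below shows what the first can look like after the calls to square have been integrated into the caller.

```c
/* Small function whose definition is visible to the compiler. */
static int square(int x)
{
  return x * x;
}

/* As written: two calls, each with call and return overhead. */
int sum_of_squares(int a, int b)
{
  return square(a) + square(b);
}

/* Conceptual result of inlining: the body of square replaces each call. */
int sum_of_squares_inlined(int a, int b)
{
  return (a * a) + (b * b);
}
```

The code size can grow because the body of the inlined function is duplicated at every call site.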

To disable function inlining, use the command line option ‑‑no_inline. For more information, see ‑‑no_inline.

Code motion

Evaluation of loop-invariant expressions and common subexpressions is moved to avoid redundant re-evaluation. This optimization, which is performed at optimization level Medium and above, normally reduces code size and execution time. However, the resulting code might be difficult to debug.
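
A minimal sketch of a loop-invariant expression being moved out of a loop, with hypothetical function and parameter names. The second function shows the conceptual result; the compiler does this on its own internal representation.

```c
/* As written: (gain + offset) does not change inside the loop,
   but is evaluated on every iteration. */
void scale_plain(int *dst, const int *src, int n, int gain, int offset)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[i] * (gain + offset);
}

/* Conceptual result of code motion: the invariant expression is
   evaluated once, before the loop. */
void scale_hoisted(int *dst, const int *src, int n, int gain, int offset)
{
  int factor = gain + offset;
  for (int i = 0; i < n; i++)
    dst[i] = src[i] * factor;
}
```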

Note

This option has no effect at optimization levels below Medium.

For more information about the command line option, see ‑‑no_code_motion.

Type-based alias analysis

When two or more pointers reference the same memory location, these pointers are said to be aliases for each other. The existence of aliases makes optimization more difficult because it is not necessarily known at compile time whether a particular value is being changed.

Type-based alias analysis optimization assumes that all accesses to an object are performed using its declared type or as a char type. This assumption lets the compiler detect whether pointers can reference the same memory location or not.

Type-based alias analysis is performed at optimization level High. For application code that conforms to standard C or C++, this optimization can reduce code size and execution time. However, non-standard C or C++ code might result in the compiler producing code that leads to unexpected behavior. Therefore, it is possible to turn this optimization off.

Note

This option has no effect at optimization levels None, Low, and Medium.

For more information about the command line option, see ‑‑no_tbaa.

Example

short F(short *p1, long *p2)
{
  *p2 = 0;
  *p1 = 1;
  return *p2;
}

With type-based alias analysis, it is assumed that a write access to the short pointed to by p1 cannot affect the long value that p2 points to. Therefore, it is known at compile time that this function returns 0. However, in non-standard-conforming C or C++ code these pointers could overlap each other by being part of the same union. If you use explicit casts, you can also force pointers of different pointer types to point to the same memory location.
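
If the goal of such a cast is to reinterpret the bytes of an object, one alternative that stays within the rules described above is to copy the bytes with memcpy, which accesses them as a char type. The sketch below is an illustration only: float_bits is a hypothetical helper, and it assumes that float and uint32_t have the same size on the target.

```c
#include <stdint.h>
#include <string.h>

/* Non-conforming variant, shown only as a comment:
 *   uint32_t bits = *(uint32_t *)&f;
 * It reads a float object through a pointer to an unrelated type. */

/* Conforming variant: copy the object representation byte by byte.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits;
}
```

Code written this way does not depend on type-based alias analysis being turned off.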

Static clustering

When static clustering is enabled, static and global variables that are defined within the same module are arranged so that variables that are accessed in the same function are stored close to each other. This makes it possible for the compiler to use the same base pointer for several accesses.
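
The sketch below shows the kind of code that can benefit, using hypothetical variable and function names. The three static variables are defined in the same module and accessed in the same functions, so placing them next to each other can let the compiler address all of them relative to one base pointer.

```c
/* Static variables defined in the same module. */
static int rx_count;
static int tx_count;
static int error_count;

/* All three variables are accessed here, which makes them
   candidates for being stored close to each other. */
void record_transfer(int rx, int tx, int failed)
{
  rx_count += rx;
  tx_count += tx;
  if (failed)
    error_count++;
}

int total_events(void)
{
  return rx_count + tx_count + error_count;
}
```

The actual placement is decided by the compiler and is not visible in the source code.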

Note

This option has no effect at optimization levels None and Low.

For more information about the command line option, see ‑‑no_clustering.

Instruction scheduling

The compiler features an instruction scheduler to increase the performance of the generated code. To achieve that goal, the scheduler rearranges the instructions to minimize the number of pipeline stalls emanating from resource conflicts within the microprocessor.

For more information about the command line option, see ‑‑no_scheduling.

Vectorization

Vectorization transforms sequential loops into NEON vector operations, without the need to write assembler code or use intrinsic functions. This enhances portability. Loops will only be vectorized if the target processor has NEON capability and auto-vectorization is enabled. Auto-vectorization is not supported in 64-bit mode.

Vectorization, which can be performed at optimization level High, favoring Speed, normally reduces execution time, but increases code size. The resulting code might also be difficult to debug.
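
A loop of the following shape is a typical candidate, shown here as a sketch with a hypothetical function name. Each iteration is independent of the others, and the restrict qualifiers tell the compiler that the arrays do not overlap. Whether the loop is actually vectorized depends on the target processor, the optimization settings, and the compiler's own analysis.

```c
/* Element-wise addition: iterations are independent, so several
   elements can in principle be processed by one vector operation. */
void add_arrays(float *restrict dst,
                const float *restrict a,
                const float *restrict b,
                int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = a[i] + b[i];
}
```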

Note

This option has no effect at optimization levels None, Low, and Medium, or for High Balanced or High Size. To disable vectorization for individual functions, use one of the pragma directives optimize or vectorize, see optimize and vectorize.

For information about the command line option, see ‑‑vectorize.

IAR Embedded Workbench for Arm 9.70.x
