Basic data types—floating-point types

In the IAR C/C++ Compiler for RL78, floating-point values are represented in standard IEC 60559 format. The sizes for the different floating-point types are:

Type	Size if double=32	Size if double=64
`float`	32 bits	32 bits
`double`	32 bits (default)	64 bits
`long double`	32 bits	64 bits

Table 77. Floating-point types

Note

The size of double and long double depends on the --double={32|64} option, see --double. The type long double uses the same precision as double.
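As a minimal sketch of how a program could confirm which sizes are in effect, the following helpers report the width of each type using only the standard `<limits.h>` and `<float.h>` macros. The function names are hypothetical and not part of any IAR library.

```c
#include <limits.h>  /* CHAR_BIT */
#include <float.h>   /* DBL_MANT_DIG */

/* Width in bits of each floating-point type, as seen by the compiler. */
unsigned float_bits(void)       { return (unsigned)(sizeof(float) * CHAR_BIT); }
unsigned double_bits(void)      { return (unsigned)(sizeof(double) * CHAR_BIT); }
unsigned long_double_bits(void) { return (unsigned)(sizeof(long double) * CHAR_BIT); }

/* Returns 1 if double uses the 64-bit format (53 significant bits,
   counting the implicit leading 1), 0 otherwise. */
int double_is_64_bit(void)
{
    return DBL_MANT_DIG == 53;
}
```

With the default setting described in the table, `double_bits()` would be expected to return 32 and `double_is_64_bit()` to return 0.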

Floating-point environment

Exception flags are not supported. The feraiseexcept function does not raise any exceptions.

32-bit floating-point format

The representation of a 32-bit floating-point number as an integer is: bit 31 is the sign bit (S), bits 30 to 23 hold the exponent, and bits 22 to 0 hold the mantissa.

The exponent is 8 bits, and the mantissa is 23 bits.

The value of the number is:

(-1)^S * 2^(Exponent-127) * 1.Mantissa

The range of the number is at least:

±1.18E-38 to ±3.39E+38

The precision of the float operators (+, -, *, and /) is approximately 7 decimal digits.
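The 32-bit layout and value formula can be illustrated with a short sketch that splits a float into its fields and rebuilds a normal value from them. It assumes the sign is in bit 31, the exponent in bits 30 to 23, and the mantissa in bits 22 to 0. The type and function names are hypothetical, introduced only for illustration.

```c
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Hypothetical helper type: the three fields of a 32-bit float. */
typedef struct {
    uint32_t sign;      /* 1 bit:   bit 31 */
    uint32_t exponent;  /* 8 bits:  bits 30..23, biased by 127 */
    uint32_t mantissa;  /* 23 bits: bits 22..0 */
} f32_fields;

f32_fields f32_decode(float f)
{
    uint32_t bits;
    f32_fields r;
    memcpy(&bits, &f, sizeof bits);  /* read the bit pattern without aliasing issues */
    r.sign     = bits >> 31;
    r.exponent = (bits >> 23) & 0xFFu;
    r.mantissa = bits & 0x7FFFFFu;
    return r;
}

/* Rebuilds (-1)^S * 2^(Exponent-127) * 1.Mantissa.
   Only meaningful for normal numbers, that is 1 <= exponent <= 254. */
float f32_normal_value(f32_fields x)
{
    float v = 1.0f + (float)x.mantissa / 8388608.0f;  /* 1.Mantissa; 8388608 = 2^23 */
    int e = (int)x.exponent - 127;
    for (; e > 0; --e) v *= 2.0f;   /* scale up by 2^e */
    for (; e < 0; ++e) v *= 0.5f;   /* or scale down by 2^-e */
    return x.sign ? -v : v;
}
```

For example, `f32_decode(-6.5f)` yields sign 1, exponent 129, and mantissa 0x500000, because 6.5 is 1.625 * 2^2.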

64-bit floating-point format

The representation of a 64-bit floating-point number as an integer is: bit 63 is the sign bit (S), bits 62 to 52 hold the exponent, and bits 51 to 0 hold the mantissa.

The exponent is 11 bits, and the mantissa is 52 bits.

The value of the number is:

(-1)^S * 2^(Exponent-1023) * 1.Mantissa

The range of the number is at least:

±2.23E-308 to ±1.79E+308

The precision of the float operators (+, -, *, and /) is approximately 15 decimal digits.
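A similar sketch applies to the 64-bit format, assuming the sign is in bit 63, the exponent in bits 62 to 52, and the mantissa in bits 51 to 0. Because double is only 64 bits wide when 64-bit doubles are selected, the hypothetical helper `f64_bits` reports failure instead of copying when the sizes do not match. All names here are illustrative.

```c
#include <stdint.h>
#include <string.h>  /* memcpy */

/* Copies the bit pattern of d into *bits.
   Returns 0 without copying if double is not 64 bits wide. */
int f64_bits(double d, uint64_t *bits)
{
    if (sizeof d != sizeof *bits)
        return 0;
    memcpy(bits, &d, sizeof *bits);
    return 1;
}

/* Field extraction from a 64-bit pattern. */
unsigned f64_sign(uint64_t bits)     { return (unsigned)(bits >> 63); }
unsigned f64_exponent(uint64_t bits) { return (unsigned)((bits >> 52) & 0x7FFu); }
uint64_t f64_mantissa(uint64_t bits) { return bits & UINT64_C(0xFFFFFFFFFFFFF); }

/* Exponent with the bias of 1023 removed (normal numbers only). */
int f64_unbiased_exponent(uint64_t bits)
{
    return (int)f64_exponent(bits) - 1023;
}
```

For the pattern 0x3FF0000000000000, which encodes 1.0, the exponent field is 1023 and the mantissa is 0, so the unbiased exponent is 0.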

Representation of special floating-point numbers

This list describes the representation of special floating-point numbers:

Zero is represented by zero mantissa and exponent. The sign bit signifies positive or negative zero.
Infinity is represented by setting the exponent to the highest value and the mantissa to zero. The sign bit signifies positive or negative infinity.
Not a number (NaN) is represented by setting the exponent to the highest positive value and the mantissa to a non-zero value. The value of the sign bit is ignored. The uppermost bits must be set.
Subnormal numbers are used for representing values smaller than what can be represented by normal values. The drawback is that the precision will decrease with smaller values. The exponent is set to 0 to signify that the number is subnormal, even though the number is treated as if the exponent was 1. Unlike normal numbers, subnormal numbers do not have an implicit 1 as the most significant bit (the MSB) of the mantissa. The value of a subnormal number is:
```
(-1)^S * 2^(1-BIAS) * 0.Mantissa
```
where BIAS is 127 for 32-bit and 1023 for 64-bit floating-point values.
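The rules in this list can be sketched as a classifier over 32-bit patterns, together with a function that evaluates the subnormal formula for BIAS = 127. This is an illustration under the layout assumption of sign in bit 31, exponent in bits 30 to 23, and mantissa in bits 22 to 0. The enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    F32_ZERO, F32_SUBNORMAL, F32_NORMAL, F32_INFINITY, F32_NAN
} f32_class;

/* Classifies a 32-bit pattern by its exponent and mantissa fields. */
f32_class f32_classify(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t mantissa = bits & 0x7FFFFFu;

    if (exponent == 0xFFu)  /* highest exponent: infinity or NaN */
        return mantissa == 0 ? F32_INFINITY : F32_NAN;
    if (exponent == 0)      /* zero exponent: zero or subnormal */
        return mantissa == 0 ? F32_ZERO : F32_SUBNORMAL;
    return F32_NORMAL;
}

/* Evaluates (-1)^S * 2^(1-127) * 0.Mantissa for a subnormal pattern.
   0.Mantissa is mantissa / 2^23, so the value is mantissa * 2^-149. */
float f32_subnormal_value(uint32_t bits)
{
    float v = (float)(bits & 0x7FFFFFu);
    int i;
    for (i = 0; i < 149; ++i)
        v *= 0.5f;          /* halve 149 times to apply 2^-149 */
    return (bits >> 31) ? -v : v;
}
```

For instance, the pattern 0x7F800000 classifies as infinity, 0x7FC00000 as NaN, and 0x00000001 as the smallest positive subnormal.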

IAR Embedded Workbench for RL78 5.20