Basic data types—floating-point types
In the IAR C/C++ Compiler for RISC-V, floating-point values are represented in standard IEC 60559 format. The sizes for the different floating-point types are:
| Type | Size | Range (+/-) | Decimals | Exponent | Mantissa | Alignment |
|---|---|---|---|---|---|---|
| float | 32 bits | ±1.18E-38 to ±3.40E+38 | 7 | 8 bits | 23 bits | 4 |
| double | 64 bits | ±2.23E-308 to ±1.79E+308 | 15 | 11 bits | 52 bits | 8 |
| long double | 64 bits | ±2.23E-308 to ±1.79E+308 | 15 | 11 bits | 52 bits | 8 |
Floating-point environment
The feraiseexcept function does not raise any exceptions; it only sets the corresponding exception flags.
Exception flags for floating-point values are supported for operations performed by the FPU. For devices with a 64-bit FPU, they are defined in the fenv.h file.
32-bit floating-point format
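The representation of a 32-bit floating-point number as an integer is, from the most significant bit: the sign bit S (bit 31), the exponent (bits 30 to 23), and the mantissa (bits 22 to 0).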
The exponent is 8 bits, and the mantissa is 23 bits.
The value of the number is:
$(-1)^{S} \times 2^{(\mathrm{Exponent}-127)} \times 1.\mathrm{Mantissa}$
The range of the number is at least:
±1.18E-38 to ±3.40E+38
The precision of the float operators (+, -, *, and /) is approximately 7 decimal digits.
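The 32-bit layout can be inspected by copying the bytes of a float into an integer and masking out the fields. This is a minimal sketch that assumes the standard IEC 60559 bit positions (sign in bit 31, exponent in bits 30 to 23, mantissa in bits 22 to 0). The type float_fields and the function split_float are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a 32-bit value. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t mantissa;  /* 23 bits, without the implicit leading 1 */
} float_fields;

float_fields split_float(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the bytes without aliasing issues */

    float_fields out;
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For example, split_float(1.0f) yields sign 0, exponent 127, and mantissa 0, which matches the value formula: (-1)^0 * 2^(127-127) * 1.0.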
64-bit floating-point format
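The representation of a 64-bit floating-point number as an integer is, from the most significant bit: the sign bit S (bit 63), the exponent (bits 62 to 52), and the mantissa (bits 51 to 0).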
The exponent is 11 bits, and the mantissa is 52 bits.
The value of the number is:
$(-1)^{S} \times 2^{(\mathrm{Exponent}-1023)} \times 1.\mathrm{Mantissa}$
The range of the number is at least:
±2.23E-308 to ±1.79E+308
The precision of the floating-point operators (+, -, *, and /) is approximately 15 decimal digits.
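The value formula for the 64-bit format can be checked in the opposite direction by assembling a double from its three fields. This sketch assumes the standard IEC 60559 bit positions (sign in bit 63, exponent in bits 62 to 52, mantissa in bits 51 to 0). The function name pack_double is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: builds a double from sign, biased exponent, and mantissa. */
double pack_double(uint64_t sign, uint64_t exponent, uint64_t mantissa)
{
    uint64_t bits = ((sign & UINT64_C(1)) << 63)
                  | ((exponent & UINT64_C(0x7FF)) << 52)
                  | (mantissa & UINT64_C(0xFFFFFFFFFFFFF));   /* low 52 bits */

    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);      /* reinterpret the bytes as a double */
    return d;
}
```

With a biased exponent of 1023 and a zero mantissa, the result is 1.0. With sign 1, exponent 1024, and only mantissa bit 50 set (a fraction of 0.25), the result is -(1.25 * 2), that is, -2.5.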
Representation of special floating-point numbers
This list describes the representation of special floating-point numbers:
Zero is represented by zero mantissa and exponent. The sign bit signifies positive or negative zero.
Infinity is represented by setting the exponent to the highest value and the mantissa to zero. The sign bit signifies positive or negative infinity.
Not a number (NaN) is represented by setting the exponent to the highest positive value and the mantissa to a non-zero value. The value of the sign bit is ignored.
Subnormal numbers are used for representing values smaller than what can be represented by normal values. The drawback is that the precision will decrease with smaller values. The exponent is set to 0 to signify that the number is subnormal, even though the number is treated as if the exponent was 1. Unlike normal numbers, subnormal numbers do not have an implicit 1 as the most significant bit (the MSB) of the mantissa. The value of a subnormal number is:
$(-1)^{S} \times 2^{(1-\mathrm{BIAS})} \times 0.\mathrm{Mantissa}$
where BIAS is 127 and 1023 for 32-bit and 64-bit floating-point values, respectively.
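The rules for special values can be expressed as a small classifier over the raw 32-bit pattern. This is a sketch under the same bit-layout assumption as the standard 32-bit format (exponent in bits 30 to 23, mantissa in bits 22 to 0). The enum fp_kind and the function classify_bits32 are hypothetical names, not part of any library.

```c
#include <stdint.h>

/* Hypothetical categories matching the list of special representations. */
typedef enum {
    KIND_ZERO, KIND_SUBNORMAL, KIND_NORMAL, KIND_INFINITY, KIND_NAN
} fp_kind;

fp_kind classify_bits32(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;   /* 8-bit biased exponent */
    uint32_t mantissa = bits & 0x7FFFFFu;       /* 23-bit mantissa */

    if (exponent == 0u)                         /* zero or subnormal */
        return (mantissa == 0u) ? KIND_ZERO : KIND_SUBNORMAL;
    if (exponent == 0xFFu)                      /* highest exponent: infinity or NaN */
        return (mantissa == 0u) ? KIND_INFINITY : KIND_NAN;
    return KIND_NORMAL;                         /* the sign bit does not affect the category */
}
```

For instance, the pattern 0x00000001 has a zero exponent and a non-zero mantissa, so it is classified as subnormal, while 0x7F800000 is positive infinity.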