Compiler support for fp128 is currently confusing. `__float128` is a GCC extension with inconsistent support (e.g., LoongArch64 doesn't support `Q`-suffix literals), while `_Float128` is standardized but is not supported by every compiler (e.g., Clang lacks the C23 `_Float128` type and the `f128` suffix; see llvm/llvm-project#80195).
We might implement dual support for compatibility: detect `_Float128` first, then fall back to `__float128` if it is unavailable.
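A minimal sketch of what that detection order could look like in C. The names `fp128_t`, `FP128_LIT`, `FP128_KIND`, `fp128_selfcheck` and `fp128_report` are hypothetical and only for illustration. It assumes that `__FLT128_MANT_DIG__` is predefined when `_Float128` is available (GCC does this) and that `__SIZEOF_FLOAT128__` is predefined when `__float128` is available (true for GCC and Clang on x86-64, possibly not on every target), so a real implementation would probably need a configure-time check instead.

```c
#include <assert.h>
#include <stdio.h>

/* All names below (fp128_t, FP128_LIT, FP128_KIND, ...) are hypothetical. */
#if defined(__FLT128_MANT_DIG__)
  /* Preferred: the standardized type with its f128 literal suffix.
     Assumption: this macro is only predefined when _Float128 exists. */
  typedef _Float128 fp128_t;
  #define FP128_LIT(x) x##f128
  #define FP128_KIND "_Float128"
#elif defined(__SIZEOF_FLOAT128__)
  /* Fallback: the GCC extension. The literal goes through long double to
     avoid the Q suffix, so digits beyond long double precision are lost. */
  typedef __float128 fp128_t;
  #define FP128_LIT(x) ((fp128_t)x##L)
  #define FP128_KIND "__float128"
#else
  /* Last resort: long double, which is not necessarily a 128-bit format. */
  typedef long double fp128_t;
  #define FP128_LIT(x) x##L
  #define FP128_KIND "long double (fallback)"
#endif

/* Returns 1 when basic arithmetic on the selected type behaves as expected.
   0.5, 0.25 and 0.75 are exactly representable in any binary format. */
int fp128_selfcheck(void) {
    fp128_t a = FP128_LIT(0.5);
    fp128_t b = FP128_LIT(0.25);
    return (a + b) == FP128_LIT(0.75);
}

/* Prints which branch of the detection was taken. */
void fp128_report(void) {
    assert(fp128_selfcheck());
    printf("fp128_t is %s, %zu bits of storage\n",
           FP128_KIND, sizeof(fp128_t) * 8);
}
```

Routing the `__float128` literal through a cast sidesteps the missing `Q` suffix at the cost of precision, which may or may not be acceptable depending on how the constants are used.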
See also:
- https://en.cppreference.com/w/cpp/types/floating-point
- https://gcc.gnu.org/onlinedocs/gcc/Floating-Types.html
- Related issue on GCC: GCC Bugzilla: LoongArch: Q Suffix for __float128 Literals Not Supported
- Related issue on Clang: [clang] missing support for _Float128 (C23) llvm/llvm-project#80195
- The support status table on cppreference:
| Type (defined in header `<stdfloat>`) | Literal suffix | Predefined macro | C language type | Bits of storage | Bits of precision | Bits of exponent | Max exponent |
|---|---|---|---|---|---|---|---|
| float16_t | f16 or F16 | `__STDCPP_FLOAT16_T__` | _Float16 | 16 | 11 | 5 | 15 |
| float32_t | f32 or F32 | `__STDCPP_FLOAT32_T__` | _Float32 | 32 | 24 | 8 | 127 |
| float64_t | f64 or F64 | `__STDCPP_FLOAT64_T__` | _Float64 | 64 | 53 | 11 | 1023 |
| float128_t | f128 or F128 | `__STDCPP_FLOAT128_T__` | _Float128 | 128 | 113 | 15 | 16383 |
| bfloat16_t | bf16 or BF16 | `__STDCPP_BFLOAT16_T__` | (N/A) | 16 | 8 | 8 | 127 |
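The precision and exponent columns can be cross-checked against what a C compiler reports. The sketch below is only illustrative: the function name `table_matches_float_h` is hypothetical, it assumes `float` and `double` are IEEE binary32 and binary64, and it relies on GCC's predefined `__FLT128_MANT_DIG__` and `__FLT128_MAX_EXP__` macros, which other compilers may not provide. Note that `<float.h>` counts the maximum exponent one higher than the value listed in the table.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if the compiler's floating-point
   parameters agree with the rows of the table above. */
int table_matches_float_h(void) {
    int ok = 1;

    /* *_MAX_EXP in <float.h> is one more than the table's max exponent. */
    ok &= (FLT_MANT_DIG == 24) && (FLT_MAX_EXP - 1 == 127);
    ok &= (DBL_MANT_DIG == 53) && (DBL_MAX_EXP - 1 == 1023);

#if defined(__FLT128_MANT_DIG__) && defined(__FLT128_MAX_EXP__)
    /* Only checked where the compiler predefines the _Float128 macros. */
    ok &= (__FLT128_MANT_DIG__ == 113) && (__FLT128_MAX_EXP__ - 1 == 16383);
#endif

    return ok;
}
```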