feat(python): add bfloat16 and bfloat16_array support by asadjan4611 · Pull Request #3329 · apache/fory

asadjan4611 · 2026-02-12T06:39:37Z

Why?

This PR implements bfloat16 (Brain Float 16) and bfloat16_array support for the Fory Python runtime and codegen, addressing issue #3289. It enables using bfloat16 in FDL to reduce payload size while keeping a wide exponent range, which is common in ML/AI workflows.

What does this PR do?

This PR adds comprehensive bfloat16 support to Fory Python:

Core Implementation

- BFloat16 Type: Cython implementation with IEEE 754 compliant float32↔bfloat16 conversions (round-to-nearest, ties-to-even)
- BFloat16Array: Python-visible array type backed by array.array('H') for packed contiguous storage
- Serializers: Both scalar (BFloat16Serializer) and array (BFloat16ArraySerializer) serializers
- Type Registration: Registered with TypeId.BFLOAT16 (18) and TypeId.BFLOAT16_ARRAY (54)
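
The conversion named in the first item can be illustrated in plain Python. This is only a sketch of round-to-nearest, ties-to-even truncation from float32 to bfloat16, not the PR's Cython code; the function names are hypothetical, and the NaN branch is an assumption about how NaN inputs should be kept as NaN.

```python
import struct


def float32_to_bfloat16_bits(value: float) -> int:
    """Round a float (taken as float32) to the nearest bfloat16 bit pattern."""
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    # Note: struct raises OverflowError for finite values outside float32 range.
    (f32_bits,) = struct.unpack("<I", struct.pack("<f", value))
    # NaN: keep the top 16 bits and force a mantissa bit so it stays NaN.
    if (f32_bits & 0x7FFFFFFF) > 0x7F800000:
        return (f32_bits >> 16) | 0x0040
    # Round to nearest, ties to even: add 0x7FFF plus the lowest kept bit,
    # then drop the low 16 bits.
    rounding_bias = 0x7FFF + ((f32_bits >> 16) & 1)
    return ((f32_bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float32(bits: int) -> float:
    """Widen a bfloat16 bit pattern to float32 by zero-filling the low 16 bits."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))
    return value
```

For instance, 1 + 2^-8 sits exactly halfway between two bfloat16 values and rounds down to the even pattern 0x3F80, while 1 + 3·2^-8 is also a tie and rounds up to the even pattern 0x3F82.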

Integration Points

- Buffer Operations: Added write_bfloat16() and read_bfloat16() methods
- Codegen Support: Added bfloat16 to codegen type mapping
- Row Format: Added bfloat16() factory function (temporarily maps to float16 until C++ row format supports it)
- Type System: Fully integrated into Fory type resolver

Testing

11 comprehensive test cases covering:
- Basic operations and conversions
- Special values (NaN, ±Inf, ±0)
- Serialization round-trips
- Array operations
- Integration with dataclasses, lists, and maps
- Type registration verification

Code Quality

- Follows existing float16 implementation patterns
- Matches project code standards and style
- All files include proper Apache 2.0 license headers
- No linter errors

Related issues

Fixes [Python] add bfloat16 and bfloat16_array (Cython, no numpy) #3289

Does this PR introduce any user-facing change?

Does this PR introduce any public API change?
- Yes: Adds BFloat16, BFloat16Array types and bfloat16() factory function
Does this PR introduce any binary protocol compatibility change?
- No: Uses existing TypeId.BFLOAT16 (18) already defined in protocol spec

Implementation Details

Wire Format

- Encodes bfloat16 as 2 bytes representing raw IEEE 754 bfloat16 bit pattern
- Little endian byte order (matches existing float32/float64 behavior)
- NaN/Inf/±0/subnormal values round-trip correctly at bit level
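
As a rough illustration of the wire format described above, the following sketch writes and reads a raw 16-bit pattern as 2 little-endian bytes using only the standard library. The helper names are hypothetical stand-ins and are not the write_bfloat16()/read_bfloat16() methods added to Fory's Buffer.

```python
import struct


def write_bfloat16_bits(out: bytearray, bits: int) -> None:
    """Append a raw bfloat16 bit pattern as 2 little-endian bytes."""
    out.extend(struct.pack("<H", bits & 0xFFFF))


def read_bfloat16_bits(data, offset: int = 0) -> int:
    """Read 2 little-endian bytes at `offset` as a bfloat16 bit pattern."""
    (bits,) = struct.unpack_from("<H", data, offset)
    return bits


# Round-trip a few special patterns: 1.0, quiet NaN, +Inf, -Inf, +0, -0,
# and the smallest subnormal.
patterns = [0x3F80, 0x7FC0, 0x7F80, 0xFF80, 0x0000, 0x8000, 0x0001]
payload = bytearray()
for pattern in patterns:
    write_bfloat16_bits(payload, pattern)
decoded = [read_bfloat16_bits(payload, 2 * i) for i in range(len(patterns))]
```

Because only the bit pattern is stored, no floating-point arithmetic touches the value on the way through, which is what lets NaN payloads and signed zeros survive unchanged.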

Type System

- Type ID: 18 (BFLOAT16) - already defined in xlang serialization spec
- Array Type ID: 54 (BFLOAT16_ARRAY)
- Protocol compliant with existing xlang serialization format

Performance

- Uses Cython for performance-critical conversion operations
- Zero-copy array operations using array.array('H')
- Follows same optimization patterns as existing float16 implementation
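
To make the array.array('H') storage idea concrete, here is a minimal sketch of a packed container for bfloat16 bit patterns. The class and method names are hypothetical and do not reflect the PR's BFloat16Array API; the sketch also assumes the platform's 'H' item size is 2 bytes, and it copies on serialization rather than attempting zero-copy.

```python
import array
import sys


class PackedBFloat16Bits:
    """Hypothetical stand-in: stores bfloat16 bit patterns in array.array('H')."""

    def __init__(self, bit_patterns=()):
        self._data = array.array("H", bit_patterns)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[index]

    def to_wire_bytes(self) -> bytes:
        # array.array uses native byte order; swap a copy on big-endian hosts
        # so the payload is always little-endian.
        data = self._data
        if sys.byteorder == "big":
            data = array.array("H", data)
            data.byteswap()
        return data.tobytes()

    @classmethod
    def from_wire_bytes(cls, payload: bytes) -> "PackedBFloat16Bits":
        data = array.array("H")
        data.frombytes(payload)  # raises ValueError on an odd-length payload
        if sys.byteorder == "big":
            data.byteswap()
        result = cls()
        result._data = data
        return result
```

On little-endian hosts the in-memory layout already matches the wire layout, so a memoryview over the underlying array could expose the bytes without the copy that tobytes() makes here.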

- Add BFloat16 Cython type with IEEE 754 compliant conversions
- Add BFloat16Array class backed by array.array('H')
- Implement serializers for scalar and array types
- Register types in type resolver (TypeId.BFLOAT16 = 18, TypeId.BFLOAT16_ARRAY = 54)
- Add buffer read/write methods for bfloat16
- Add codegen support for bfloat16
- Add row format support (with temporary float16 mapping until C++ support)
- Add comprehensive test suite with 11 test cases covering all edge cases
- Follow existing float16 implementation patterns

Fixes apache#3289

- Change single quotes to double quotes (ruff format requirement)
- Remove trailing whitespace
- Add blank lines after imports (PEP 8)
- Remove unused import (pyfory)
- Fix closing parenthesis alignment

…de style

- Remove invalid Cython type casts (<BFloat16>) in serialization.pyx and primitive.pxi
- Use isinstance() check instead of type casting for Python classes
- Fix bfloat16() function to use float16() as temporary workaround until C++ support is added
- Comment out bfloat16() declaration in libformat.pxd with TODO for future C++ implementation

Replace unsafe pointer casts with memcpy to ensure cross-platform compatibility across all OS versions (Windows, Linux, macOS) and architectures (x86_64, ARM). This fixes strict aliasing violations that cause compilation failures on ARM and newer compilers.

Replace sizeof(float) with explicit constant 4 in memcpy calls to ensure cross-platform compatibility, especially on ARM architectures where sizeof() may cause compilation issues. This matches the project's pattern of using explicit size constants (as seen in types.py).

Fixes build failures on:
- ubuntu-24.04-arm (aarch64)
- macos-arm64 (Apple Silicon)
- ubuntu-24.04-arm with Python 3.13

…lizers

asadjan4611 · 2026-02-13T16:04:28Z

@chaokunyang please review my PR. This is a very interesting project, and I learned a lot from working on this issue.

asadjan4611 · 2026-02-21T07:45:46Z

@komamitsu @chaokunyang
please review my PR. I would like to work on another issue, but I am still waiting for a reviewer to check this one.

chaokunyang · 2026-02-21T12:00:02Z

python/pyfory/registry.py

        )
        register(float, type_id=TypeId.FLOAT64, serializer=Float64Serializer)
+        # BFloat16 is optional if the extension module is unavailable.
+        try:


This should always be available. Could you remove the try/except clause?

chaokunyang · 2026-02-21T12:00:14Z

python/pyfory/registry.py

                serializer=PyArraySerializer(self.fory, ftype, typeid),
            )
+        # BFloat16Array is optional if the extension module is unavailable.
+        try:


chaokunyang · 2026-02-21T12:02:53Z

python/pyfory/serialization.pyx

+cpdef inline read_nullable_bfloat16(Buffer buffer):
+    if buffer.read_int8() == NOT_NULL_VALUE_FLAG:
+        from pyfory.bfloat16 import BFloat16
+        return BFloat16.from_bits(buffer.read_bfloat16())


Could you create a bfloat.pxd, so we can import it in buffer.pyx and make buffer.read_bfloat16() return BFloat16 directly?

And could you rename BFloat16 to bfloat16? This is a primitive type, and a lowercase name makes it look like a builtin.

chaokunyang · 2026-02-21T12:04:43Z

python/pyfory/serialization.pyx

        return False


+cdef class XlangCompatibleSerializer(Serializer):


Could you merge the main branch? We removed the xwrite/xread API and unified the API in #3348.

This XlangCompatibleSerializer is not needed anymore.

chaokunyang · 2026-02-21T12:06:00Z

python/pyfory/serializer.py

+        self.type_id = type_id
+        self.itemsize = 2
+
+    def xwrite(self, buffer, value):


Ditto for xwrite/xread; we don't have that API anymore.

chaokunyang · 2026-02-21T14:44:29Z

python/pyfory/_serializer.py

        return False


+class XlangCompatibleSerializer(Serializer):


Please remove this; we removed it in #3348.

…ffer reads

…issues

…llection.pxi Cython cpdef functions do not support keyword arguments when called from C code. Changed all read_no_ref(buffer, serializer=...) calls to use positional arguments read_no_ref(buffer, serializer) instead.

…ruct.py and collection.py Cython cpdef functions do not support keyword arguments when called from C code. Changed all xwrite_ref, xread_ref, write_no_ref, and read_no_ref calls to use positional arguments instead of keyword arguments (serializer=...).

asadjan4611 · 2026-02-24T07:38:17Z

@chaokunyang I am facing an issue: some checks are not passing. How can I handle it?

Zakir032002 · 2026-02-24T10:23:13Z

hey @asadjan4611, went through bfloat16.pyx — two bugs here

NaN becomes Infinity

for a signaling NaN like 0x7F800001:

bf16_bits = 0x7F800001 >> 16  →  0x7F80
truncated = 0x7F800001 & 0xFFFF  →  0x0001
0x0001 > 0x8000 → false, no rounding fires
returns 0x7F80  →  Infinity, not NaN

fix — add NaN passthrough at the top (also note: cdef inside if block is illegal in Cython, all declarations go at function top):

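from libc.stdint cimport uint16_t, uint32_t
from libc.string cimport memcpy

cdef inline uint16_t float32_to_bfloat16_bits(float value) nogil:
    cdef uint32_t f32_bits
    cdef uint16_t bf16_bits
    cdef uint16_t truncated
    # Copy the raw float32 bits into an integer.
    memcpy(&f32_bits, &value, 4)
    # NaN passthrough: keep the top 16 bits and set a mantissa bit so the result stays NaN.
    if (f32_bits & 0x7FFFFFFF) > 0x7F800000:
        return (<uint16_t>(f32_bits >> 16)) | 0x0040
    bf16_bits = <uint16_t>(f32_bits >> 16)
    truncated = <uint16_t>(f32_bits & 0xFFFF)
    # Round to nearest; on an exact tie, round to the even bfloat16 value.
    if truncated > 0x8000:
        bf16_bits += 1
    elif truncated == 0x8000 and (bf16_bits & 1):
        bf16_bits += 1
    return bf16_bits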
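
The trace and the proposed fix can be checked side by side in plain Python. This is a sketch that works directly on integer bit patterns; naive_bits follows the rounding steps in the trace above and is an assumption about the pre-fix behavior, since the original bfloat16.pyx source is not quoted here. All function names are hypothetical.

```python
def naive_bits(f32_bits: int) -> int:
    """Rounding without a NaN check, following the trace above."""
    bf16_bits = (f32_bits >> 16) & 0xFFFF
    truncated = f32_bits & 0xFFFF
    if truncated > 0x8000 or (truncated == 0x8000 and (bf16_bits & 1)):
        bf16_bits = (bf16_bits + 1) & 0xFFFF
    return bf16_bits


def fixed_bits(f32_bits: int) -> int:
    """Same rounding, with the NaN passthrough from the proposed fix."""
    if (f32_bits & 0x7FFFFFFF) > 0x7F800000:
        return ((f32_bits >> 16) & 0xFFFF) | 0x0040
    return naive_bits(f32_bits)


def is_nan_bf16(bits: int) -> bool:
    # NaN: all exponent bits set and a non-zero 7-bit mantissa.
    return (bits & 0x7F80) == 0x7F80 and (bits & 0x007F) != 0


signaling_nan = 0x7F800001
before = naive_bits(signaling_nan)  # 0x7F80, the +Inf pattern
after = fixed_bits(signaling_nan)   # 0x7FC0, a quiet NaN pattern
```

Setting bit 0x0040 guarantees a non-zero mantissa, so any float32 NaN maps to a bfloat16 NaN even when all of its payload bits sit in the discarded low half.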

__hash__ breaks the eq contract for ±0

__eq__ returns True for +0 == -0, but hash(0x0000) != hash(0x8000). Python requires equal objects to have equal hashes, so {bfloat16(0.0), bfloat16(-0.0)} silently gives 2 elements instead of 1.

def __hash__(self):
    if (self._bits & 0x7FFF) == 0:
        return hash(0)
    return hash(self._bits)
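
A small pure-Python stand-in shows the effect of that __hash__ on set membership. The BF16Bits class is hypothetical, and its __eq__ is an assumption modeled on float semantics (signed zeros equal, NaN unequal to everything), not the PR's actual implementation.

```python
class BF16Bits:
    """Hypothetical minimal wrapper around a 16-bit bfloat16 pattern."""

    def __init__(self, bits: int):
        self._bits = bits & 0xFFFF

    def _is_nan(self) -> bool:
        return (self._bits & 0x7F80) == 0x7F80 and (self._bits & 0x007F) != 0

    def __eq__(self, other):
        if not isinstance(other, BF16Bits):
            return NotImplemented
        if self._is_nan() or other._is_nan():
            return False
        # +0 and -0 compare equal, as with Python floats.
        if (self._bits & 0x7FFF) == 0 and (other._bits & 0x7FFF) == 0:
            return True
        return self._bits == other._bits

    def __hash__(self):
        # Both zero patterns share one hash so equal objects hash equally.
        if (self._bits & 0x7FFF) == 0:
            return hash(0)
        return hash(self._bits)


zeros = {BF16Bits(0x0000), BF16Bits(0x8000)}
```

With the two zero patterns hashing to the same value, the set lookup reaches __eq__ and collapses them into a single element.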

Happy to discuss if I'm misreading the flow here.

asadjan4611 requested a review from chaokunyang as a code owner February 12, 2026 06:39

asadjan4611 added 9 commits February 12, 2026 11:59

style(python): fix code formatting for bfloat16 implementation

c89d86e

fix(python): remove trailing newline in bfloat16_array.py to match co…

b2bb7c6

…de style

fix(python): correct row schema Arrow conversion type ids

71241ba

fix(python): build and export bfloat16 python module

8066ace

docs(python): document bfloat16 support and stabilize pure mode seria…

8ed6f57

…lizers

fix(python): configure bazel shell for windows editable builds

9fcb74d

asadjan4611 added 3 commits February 21, 2026 00:12

Merge branch 'main' into feat/python-bfloat16-support

8866639

fix(python): restore xlang serializer base after conflict merge

102acfb

fix(python): restore xlang serializer base in cython runtime

bedb82e

chaokunyang reviewed Feb 21, 2026

asadjan4611 added 8 commits February 21, 2026 19:57

fix(python): align bfloat16 serializers with unified API and typed bu…

2f2b6b4

…ffer reads

fix(python): resolve bfloat16 cython build and serializer API drift

c9e2e12

style(python): align bfloat16 files with ci formatter

56f45d3

fix(python): include bfloat16 pxd in format cython target

efb0100

fix(python): resolve bfloat16 cython redeclaration and static method …

f1a3f42

…issues

fix(python): remove obsolete xlang arg for dataclass serializers

9d74ae2

fix(python): accept legacy xlang kwarg in dataclass serializers

ec59341

fix(python): repair unsigned xlang dataclass refs and retry bazel fetch

848a1cd

asadjan4611 added 3 commits February 21, 2026 21:38

style(python): format setup retry log line

66dbf19

asadjan4611 wants to merge 24 commits into apache:main from asadjan4611:feat/python-bfloat16-support

3 participants
