diff --git a/source/bson-binary-vector/bson-binary-vector.md b/source/bson-binary-vector/bson-binary-vector.md
index d46f6681a9..445fd12ed8 100644
--- a/source/bson-binary-vector/bson-binary-vector.md
+++ b/source/bson-binary-vector/bson-binary-vector.md
@@ -7,246 +7,266 @@ ______________________________________________________________________
## Abstract
-This document describes the subtype of the Binary BSON type used for efficient storage and retrieval of vectors. Vectors
-here refer to densely packed arrays of numbers, all of the same type.
+This document describes a new *Vector* subtype (9) for BSON Binary items, used to compactly represent ordered
+collections of uniformly-typed elements. A framework is presented for future type extensibility, but to limit adoption
+complexity only a restricted set of element types is supported at first:
-## Motivation
+- 1-bit unsigned integers
+- 8-bit signed integers
+- 32-bit floating point
-These representations correspond to the numeric types supported by popular numerical libraries for vector processing,
-such as NumPy, PyTorch, TensorFlow and Apache Arrow. Storing and retrieving vector data using the same densely packed
-format used by these libraries can result in significant memory savings and processing efficiency.
-
-### META
+## Meta
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).
+Hexadecimal values are shown here with a `0x` prefix.
+
+Bit strings are grouped with insignificant whitespace for readability.
+
+## Terms
+
+*BSON Array* - Arrays are a fundamental container type in BSON for ordered sequences, implemented as item type `4`. Each
+element can have an arbitrary data type. The encoding is relatively high-overhead, due to both the non-uniform types and
+the required element name strings.
+
+*BSON Binary* - BSON Binary items (type `5`) are a container for a variable-length byte sequence with extensible
+interpretation, according to an 8-bit *subtype*.
+
+*BSON Binary Vector* - A BSON Binary item of subtype `9`. Also referred to here as a Vector.
+
+## Motivation for Change
+
+BSON does not on its own provide a densely packed encoding for numeric data of uniform data type. Numbers stored in a
+BSON Array have high space overhead, owing to the item name and type included with each value. This specification offers
+an alternative collection type with improved performance and limited complexity.
+
+### Goals
+
+- Vectors provide improved resource efficiency compared to BSON Arrays.
+- Every Vector is guaranteed to represent a sequence of elements with uniform type and size.
+- Vectors may be reliably compared for equality by comparing their encoded BSON Binary representation.
+- Implementation complexity should be minimal.
+
+### Non-Goals
+
+- No changes to Extended JSON representation are defined. Vectors will serialize to generic Binary items with base64
+ encoding: `{"$binary": {"base64": ... , "subType": "9" }}`.
+- The Vector is a 1-dimensional container. Applications may implement multi-dimensional arrays efficiently by bundling a
+ Vector with additional metadata, but this usage is not standardized here.
+- Comprehensive support for all possible data types and bit/byte ordering is not a goal. This specification prefers to
+ reduce complexity by limiting the set of allowed types and providing no unnecessary data formatting options.
+- Vectors within a BSON document are NOT designed for "zero copy" access by direct architecture-specific load or store.
+ Typically multi-byte values will not be aligned as required, and they may need byte order conversion. Internal
+ padding for alignment is not supported, as this would impact comparison stability.
+- Vectors do not include any data compression features. Applications may see benefit from careful choice of an external
+ compression algorithm.
+- Vectors do not provide any new comparison methods beyond byte-equality. Vectors are never equal to Arrays, even when
+ they represent the same numeric elements. Vectors of different element types are not comparable.
+- Vectors do not guarantee that element types defined in the future will always be scalar numbers, only that elements of
+ a Vector always have identical type and size.
+
## Specification
-This specification introduces a new BSON binary subtype, the vector, with value `9`.
+### Scope
+
+- This specification defines the meaning of the data bytes in BSON Binary items of subtype `9`.
+- The first two data bytes form a header, with meaning defined here.
+- This specification defines validity criteria for accepting or rejecting byte strings.
+- This specification includes JSON tests with valid documents, invalid documents, and expected conversion results.
+- Drivers SHOULD provide low-overhead APIs for producing and consuming Vector data in the closest compatible language
+ types, without conversions more expensive than copying or byte-swapping. These APIs are not standardized across
+ languages.
+- Drivers MAY provide facilities for converting between BSON Binary Vector and BSON Array representations. When they
+ choose to do so, they MUST ensure compliance using the provided tests. Drivers MUST NOT automatically convert
+ between representations.
-Drivers SHOULD provide idiomatic APIs to translate between arrays of numbers and this BSON Binary specification.
-
-### Data Types (dtypes)
+### Header Format
-Each vector can take one of multiple data types (dtypes). The following table lists the dtypes implemented.
-
-| Vector data type | Alias | Bits per vector element | [Arrow Data Type](https://arrow.apache.org/docs/cpp/api/datatype.html) (for illustration) |
-| ---------------- | ---------- | ----------------------- | ----------------------------------------------------------------------------------------- |
-| `0x03` | INT8 | 8 | INT8 |
-| `0x27` | FLOAT32 | 32 | FLOAT |
-| `0x10` | PACKED_BIT | 1 `*` | BOOL |
-
-`*` A Binary Quantized (PACKED_BIT) Vector is a vector of 0s and 1s (bits), but it is represented in memory as a list of
-integers in \[0, 255\]. So, for example, the vector `[0, 255]` would be shorthand for the 16-bit vector
-`[0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1]`. The idea is that each number (a uint8) can be stored as a single byte. Of course,
-some languages, Python for one, do not have an uint8 type, so must be represented as an int in memory, but not on disk.
-
-### Byte padding
-
-As not all data types have a bit length equal to a multiple of 8, and hence do not fit squarely into a certain number of
-bytes, a second piece of metadata, the "padding" is included. This instructs the driver of the number of bits in the
-final byte that are to be ignored. The least-significant bits are ignored.
-
-### Binary structure
-
-Following the binary subtype `9`, a two-element byte array of metadata precedes the packed numbers.
-
-- The first byte (dtype) describes its data type. The table above shows those that MUST be implemented. This table may
- increase. dtype is an unsigned integer.
-
-- The second byte (padding) prescribes the number of bits to ignore in the final byte of the value. It is a non-negative
- integer. It must be present, even in cases where it is not applicable, and set to zero.
-
-- The remainder contains the actual vector elements packed according to dtype.
-
-All values use the little-endian format.
-
-#### Example
-
-Let's take a vector `[238, 224]` of dtype PACKED_BIT (`\x10`) with a padding of `4`.
-
-In hex, it looks like this: `b"\x10\x04\xee\xe0"`: 1 byte for dtype, 1 for padding, and 1 for each uint8.
-
-We can visualize the binary representation like so:
-
-
-
- | 1st byte: dtype (from list in previous table) |
- 2nd byte: padding (values in [0,7]) |
- 1st uint8: 238 |
- 2nd uint8: 224 |
-
-
- | 0 |
- 0 |
- 0 |
- 1 |
- 0 |
- 0 |
- 0 |
- 0 |
- 0 |
- 0 |
- 0 |
- 0 |
- 0 |
- 1 |
- 0 |
- 0 |
- 1 |
- 1 |
- 1 |
- 0 |
- 1 |
- 1 |
- 1 |
- 0 |
- 1 |
- 1 |
- 1 |
- 0 |
- 0 |
- 0 |
- 0 |
- 0 |
-
-
+Every valid Vector begins with one of the following 2-byte header patterns:
-Finally, after we remove the last 4 bits of padding, the actual bit vector has a length of 12 and looks like this!
+| Header bytes | Alias | Description |
+| ------------ | ----------- | ------------------------------------------------------------------------------- |
+| `0x03 0x00` | INT8 | signed bytes |
+| `0x27 0x00` | FLOAT32 | single precision (32-bit) floating point, least significant byte first |
+| `0x10 0x00` | PACKED_BITS | single-bit integers, most significant bit first, exact multiple of 8 bits total |
+| `0x10 0x01` | PACKED_BITS | as above, final 1 bit ignored |
+| `0x10` ... | PACKED_BITS | ... |
+| `0x10 0x07` | PACKED_BITS | as above, final 7 bits ignored |
-| 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-
-## API Guidance
+Drivers MAY choose to interpret the header bytes as a structure with internal fields:
-Drivers MUST implement methods for explicit encoding and decoding that adhere to the pattern described below while
-following idioms of the language of the driver.
-
-### Encoding
-
-```
-Function from_vector(vector: Iterable, dtype: DtypeEnum, padding: Integer = 0) -> Binary
- # Converts a numeric vector into a binary representation based on the specified dtype and padding.
-
- # :param vector: A sequence or iterable of numbers (either float or int)
- # :param dtype: Data type for binary conversion (from DtypeEnum)
- # :param padding: Optional integer specifying how many bits to ignore in the final byte
- # :return: A binary representation of the vector
-
- Declare binary_data as Binary
-
- # Process each number in vector and convert according to dtype
- For each number in vector
- binary_element = convert_to_binary(number, dtype)
- binary_data.append(binary_element)
- End For
-
- # Apply padding to the binary data if needed
- If padding > 0
- apply_padding(binary_data, padding)
- End If
-
- Return binary_data
-End Function
-```
-
-Note: If a driver chooses to implement a `Vector` type (or numerous) like that suggested in the Data Structure
-subsection below, they MAY decide that `from_vector` that has a single argument, a Vector.
-
-### Decoding
-
-```
-Function as_vector() -> Vector
- # Unpacks binary data (BSON or similar) into a Vector structure.
- # This process involves extracting numeric values, the data type, and padding information.
-
- # :return: A BinaryVector containing the unpacked numeric values, dtype, and padding.
-
- Declare binary_vector as BinaryVector # Struct to hold the unpacked data
-
- # Extract dtype (data type) from the binary data
- binary_vector.dtype = extract_dtype_from_binary()
-
- # Extract padding from the binary data
- binary_vector.padding = extract_padding_from_binary()
-
- # Unpack the actual numeric values from the binary data according to the dtype
- binary_vector.data = unpack_numeric_values(binary_vector.dtype)
-
- Return binary_vector
-End Function
-```
-
-#### Validation
-
-Drivers MUST validate vector metadata and raise an error if any invariant is violated:
-
-- Padding MUST be 0 for all dtypes where padding doesn’t apply, and MUST be within \[0, 7\] for PACKED_BIT.
-- A PACKED_BIT vector MUST NOT be empty if padding is in the range \[1, 7\].
-- When unpacking binary data into a FLOAT32 Vector structure, the length of the binary data following the dtype and
- padding MUST be a multiple of 4 bytes.
-
-Drivers MUST perform this validation when a numeric vector and padding are provided through the API, and when unpacking
-binary data (BSON or similar) into a Vector structure.
-
-#### Data Structures
-
-Drivers MAY find the following structures to represent the dtype and vector structure useful.
-
-```
-Enum Dtype
- # Enum for data types (dtype)
+| Size | Location | Description |
+| ------ | ----------------------------------- | ----------- |
+| 4 bits | First byte, most significant half | Type code |
+| 4 bits | First byte, least significant half | Size code |
+| 5 bits | Second byte, most significant part | (reserved) |
+| 3 bits | Second byte, least significant part | Padding |
- # FLOAT32: Represents packing of list of floats as float32
- # Value: 0x27 (hexadecimal byte value)
+Reserved bits MUST be zero.
- # INT8: Represents packing of list of signed integers in the range [-128, 127] as signed int8
- # Value: 0x03 (hexadecimal byte value)
+In general, Padding is the number of elements to ignore at the end of the encoded data, regardless of element size and
+bit order.
- # PACKED_BIT: Special case where vector values are 0 or 1, packed as unsigned uint8 in range [0, 255]
- # Packed into groups of 8 (a byte)
- # Value: 0x10 (hexadecimal byte value)
-
- # Documentation:
- # Each value is a byte (length of one), a convenient choice for decoding.
-End Enum
+| Type code | Description |
+| --------- | ----------------------------------------------- |
+| 0 | Signed integer, two's complement representation |
+| 1 | Unsigned integer |
+| 2 | Floating point, IEEE 754 representation |
+| 3 .. 15 | (reserved) |
-Struct Vector
- # Numeric vector with metadata for binary interoperability
+| Size code | Bits per element |
+| --------- | ------------------ |
+| 0 | 1 |
+| 1 | (reserved for 2) |
+| 2 | (reserved for 4) |
+| 3 | 8 |
+| 4 | (reserved for 12) |
+| 5 | (reserved for 16) |
+| 6 | (reserved for 24) |
+| 7 | 32 |
+| 8 | (reserved for 48) |
+| 9 | (reserved for 64) |
+| 10 | (reserved for 96) |
+| 11 | (reserved for 128) |
+| 12 | (reserved for 192) |
+| 13 | (reserved for 256) |
+| 14 | (reserved for 384) |
+| 15 | (reserved for 512) |
- # Fields:
- # data: Sequence of numeric values (either float or int)
- # dtype: Data type of vector (from enum BinaryVectorDtype)
- # padding: Number of bits to ignore in the final byte for alignment
+Reserved type and size codes MUST NOT be used.
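The structured view of the header can be sketched in Python as follows. This is only an illustration of the field layout in the tables above; the names `VectorHeader`, `split_header`, and `BITS_PER_ELEMENT` are hypothetical and are not defined by this specification.

```python
from typing import NamedTuple

# Bits per element for the size codes that are not reserved (from the table above).
BITS_PER_ELEMENT = {0: 1, 3: 8, 7: 32}


class VectorHeader(NamedTuple):  # hypothetical name, not part of the specification
    type_code: int  # most significant half of the first byte
    size_code: int  # least significant half of the first byte
    reserved: int   # most significant 5 bits of the second byte, expected to be zero
    padding: int    # least significant 3 bits of the second byte


def split_header(first: int, second: int) -> VectorHeader:
    """Split the two header bytes into the fields of the structured view."""
    return VectorHeader(
        type_code=first >> 4,
        size_code=first & 0x0F,
        reserved=second >> 3,
        padding=second & 0x07,
    )
```

For example, `split_header(0x27, 0x00)` yields type code 2 (floating point) and size code 7 (32 bits), which is the FLOAT32 header.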
- data # Sequence of float or int
- dtype # Type: DtypeEnum
- padding # Integer: Number of padding bits
- End Struct
-```
+### Validity Criteria
-## Reference Implementation
+To be valid, a Vector MUST be 2 bytes long or longer. Its header MUST be one of the valid bit patterns above. In
+particular, the second byte MUST be zero except in non-empty PACKED_BITS Vectors, where it holds a Padding value from 0
+to 7. Vectors with no elements MUST have a Padding value of 0.
-- PYTHON (PYTHON-4577)
+Drivers MUST reject Vectors with invalid header bytes.
-## Test Plan
+Drivers SHOULD reject Vectors with any unused bits in the final byte set to `1`.
-See the [README](tests/README.md) for tests.
+Drivers SHOULD reject Vectors with extra bytes after the last complete multi-byte element.
+
+Drivers MUST NOT generate Vectors with extra bytes after the last complete element, or with unused bits in the final
+byte set to `1`.
+
+The contents of individual elements MUST NOT be considered when checking the validity of a Vector. Unused bits in the
+final byte are not considered part of any element.
+
+Drivers MUST validate Vector metadata when provided through the API, to avoid generating byte strings that any
+conforming implementation would consider invalid. For example, if a PACKED_BITS Vector is constructed from a byte array
+paired with a Padding value:
+
+- The driver MUST ensure Padding is zero if the byte array is empty.
+- The driver MUST ensure the unused bits in the final byte are zero.
+- If the API allows Padding values outside the valid range of 0..7 inclusive, these MUST be rejected at runtime.
+
+Drivers MUST validate Vector byte strings when creating an API representation from a stored BSON Binary item. A
+PACKED_BITS value would have its Padding and length validated as above, and SHOULD have its unused bits checked for zero.
+A FLOAT32 Vector MUST be rejected for a nonzero second header byte, and it SHOULD be rejected for a length that isn't 2
+plus a multiple of 4.
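A minimal sketch of these checks in Python, assuming the payload is the data of a subtype `9` Binary item passed as `bytes`. The function name `validate_vector` and the choice to raise `ValueError` are assumptions of this example. It also enforces the two SHOULD-level checks, which a driver could choose to relax.

```python
VALID_DTYPES = {0x03: "INT8", 0x27: "FLOAT32", 0x10: "PACKED_BITS"}


def validate_vector(payload: bytes) -> str:
    """Return the dtype alias of a subtype 9 payload, or raise ValueError."""
    if len(payload) < 2:
        raise ValueError("a Vector needs at least the 2-byte header")
    dtype, second = payload[0], payload[1]
    data = payload[2:]
    if dtype not in VALID_DTYPES:
        raise ValueError(f"unsupported or reserved dtype byte 0x{dtype:02x}")
    alias = VALID_DTYPES[dtype]
    if alias == "PACKED_BITS":
        if second > 7:
            raise ValueError("reserved bits set in the second header byte")
        if second and not data:
            raise ValueError("an empty Vector must have Padding 0")
        # SHOULD-level check: unused low bits of the final byte are zero.
        if second and data[-1] & ((1 << second) - 1):
            raise ValueError("unused bits in the final byte are not zero")
    else:
        if second != 0:
            raise ValueError("second header byte must be zero for this dtype")
        # SHOULD-level check: no bytes after the last complete element.
        if alias == "FLOAT32" and len(data) % 4:
            raise ValueError("FLOAT32 data length is not a multiple of 4")
    return alias
```

Element contents are never inspected here, so a FLOAT32 NaN or any INT8 byte value passes.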
+
+### Type Conversions
+
+Type conversion is an optional feature.
+
+Drivers MAY provide conversions between BSON Array and BSON Binary Vector representations. Drivers MUST only perform
+this conversion as requested, not automatically.
+
+#### Packing
+
+PACKED_BITS values MAY be losslessly unpacked to a wider data type of the driver's choosing, for more
+convenient access. Drivers MUST provide a way to access PACKED_BITS without unpacking. In languages with compile-time
+abstraction, drivers SHOULD provide an abstract data type for manipulating elements in PACKED_BITS without unpacking. If
+abstraction is not practical, drivers can instead provide direct access to the byte array and Padding value.
+
+#### Integer Values
+
+INT8 and PACKED_BITS values may be losslessly represented as BSON int32 elements.
+
+When converting BSON int32 or int64 elements to INT8 or PACKED_BITS, out-of-range values MUST cause conversion to fail.
-## FAQ
+There is no defined conversion from floating point to integer. Conversion from BSON double to an integer Vector MUST
+fail.
-- What MongoDB Server version does this apply to?
- - Files in the "specifications" repository have no version scheme. They are not tied to a MongoDB server version.
-- In PACKED_BIT, why would one choose to use integers in \[0, 256)?
- - This follows a well-established precedent for packing binary-valued arrays into bytes (8 bits), This technique is
- widely used across different fields, such as data compression, communication protocols, and file formats, where
- you want to store or transmit binary data more efficiently by grouping 8 bits into a single byte (uint8). For an
- example in Python, see
- [numpy.unpackbits](https://numpy.org/doc/2.0/reference/generated/numpy.unpackbits.html#numpy.unpackbits).
+#### Floating Point Values
+
+There is no defined conversion from integer to floating point. Conversion from BSON int32 or int64 to a FLOAT32 Vector
+MUST fail.
+
+When converting BSON double elements to FLOAT32, the driver MUST round to the nearest representable values.
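The conversion rules above can be sketched in Python, where BSON int32 and int64 values are assumed to arrive as `int` and BSON doubles as `float`. The function names are hypothetical, and the standard `struct` module is used for packing.

```python
import struct


def int8_vector_from_array(values: list) -> bytes:
    """Build an INT8 Vector payload; floats and out-of-range integers fail."""
    for v in values:
        # bool is excluded because it is not a BSON integer type.
        if isinstance(v, bool) or not isinstance(v, int):
            raise TypeError("only integers convert to an integer Vector")
        if not -128 <= v <= 127:
            raise ValueError(f"{v} is out of range for INT8")
    return bytes([0x03, 0x00]) + struct.pack(f"<{len(values)}b", *values)


def float32_vector_from_array(values: list) -> bytes:
    """Build a FLOAT32 Vector payload; integers fail, doubles are rounded."""
    for v in values:
        if not isinstance(v, float):
            raise TypeError("only doubles convert to a FLOAT32 Vector")
    # The "<f" format stores IEEE 754 binary32, least significant byte first.
    return bytes([0x27, 0x00]) + struct.pack(f"<{len(values)}f", *values)
```

Note that `struct.pack` raises `OverflowError` for finite doubles beyond the `binary32` range. The text above does not say how such values are handled, so that behavior is a property of this sketch only.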
+
+### Data Formats
+
+#### INT8 (`0x03 0x00`)
+
+Signed 1-byte integers in two's complement encoding, representing values from -128 to 127 inclusive.
+
+#### FLOAT32 (`0x27 0x00`)
+
+Single-precision floating point values in the IEEE 754 `binary32` format. 4 bytes, least significant byte first.
+
+#### PACKED_BITS (`0x10 0x00` .. `0x10 0x07`)
+
+Integers 0 and 1 represented by individual bits packed into bytes, most significant bit first.
+
+Padding indicates how many of the least significant bits from the last byte do not encode any element. Drivers MUST
+always set these non-encoding bits in the last byte to zero. Drivers SHOULD ensure these bits are zero when checking a
+Vector for validity. Vectors with no data bytes MUST have a Padding of zero.
+
+Note that the bit order and byte order in this specification are opposite. Byte order is "little-endian" to match common
+CPU architectures, whereas bit order is "big-endian" for left-to-right readability.
+
+Implementations may choose to implement accessors for packed bits using machine words larger than 8 bits for performance
+reasons. If so, they MUST NOT impose any additional constraints on data length or alignment.
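As an illustration of access without unpacking, the following Python sketch reads a single element directly from the data bytes. It assumes the 2-byte header has already been removed and validated; `packed_bits_len` and `packed_bits_get` are hypothetical names.

```python
def packed_bits_len(data: bytes, padding: int) -> int:
    """Number of 1-bit elements in the data bytes (header already removed)."""
    return len(data) * 8 - padding


def packed_bits_get(data: bytes, padding: int, index: int) -> int:
    """Read element `index` without unpacking the whole Vector."""
    if not 0 <= index < packed_bits_len(data, padding):
        raise IndexError("bit index out of range")
    byte = data[index // 8]
    # Element 0 is the most significant bit of the first byte.
    return (byte >> (7 - index % 8)) & 1
```

This byte-at-a-time accessor places no requirement on data length or alignment, which is the property a word-based implementation would also need to preserve.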
+
+### Examples
+
+- `0x10 0x04 0xee 0xe0`
+
+ - Header: PACKED_BITS, Padding=4
+ - Data bytes: `0xee 0xe0`
+ - The same bytes in binary, most-significant bit first: `1110 1110 1110 0000`
+ - Discarding Padding (4) bits from the end, which SHOULD be zero: `1110 1110 1110`
+ - Unpacked representation, 12 elements: `[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0]`
+
+- `0x10 0x07 0x80`
+
+ - Header: PACKED_BITS, Padding=7
+ - Data byte: `0x80`
+ - Unpacked representation, 1 element: `[1]`
+
+- `0x10 0x00 0xf0 0x42`
+
+ - Header: PACKED_BITS, Padding=0
+ - Data bytes: `0xf0 0x42`
+ - The same bytes in binary, most-significant bit first: `1111 0000 0100 0010`
+ - Unpacked representation, 16 elements: `[1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]`
+
+- `0x03 0x00 0xff 0x00 0x01`
+
+ - Header: INT8
+ - Data bytes: `0xff 0x00 0x01`
+ - Integer elements: `[-1, 0, 1]`
+
+- `0x27 0x00 0x00 0x00 0x80 0x3f 0x34 0x12 0x80 0x7f`
+
+ - Header: FLOAT32
+ - Data bytes: `0x00 0x00 0x80 0x3f 0x34 0x12 0x80 0x7f`
+ - The same bytes as two 32-bit words, least significant byte first: `0x3f800000 0x7f801234`
+ - The same 32-bit words interpreted as IEEE 754 `binary32`: `1.0 NaN(0x001234)`
+ - Floating point elements: `[1.0, NaN]`
+ - Converted to Array, represented as Relaxed Extended JSON: `[1.0, {"$numberDouble": "NaN"}]`
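The examples above can be reproduced with a short Python sketch. It assumes the payload has already passed the validity checks, and `decode_vector` is a hypothetical helper rather than a driver API.

```python
import math
import struct


def decode_vector(payload: bytes) -> list:
    """Decode an already validated subtype 9 payload into a list of numbers."""
    dtype, padding, data = payload[0], payload[1], payload[2:]
    if dtype == 0x03:  # INT8: signed bytes
        return list(struct.unpack(f"<{len(data)}b", data))
    if dtype == 0x27:  # FLOAT32: binary32, least significant byte first
        return list(struct.unpack(f"<{len(data) // 4}f", data))
    if dtype == 0x10:  # PACKED_BITS: most significant bit first
        bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
        return bits[: len(bits) - padding]
    raise ValueError("unsupported dtype")


packed = decode_vector(bytes([0x10, 0x04, 0xEE, 0xE0]))
ints = decode_vector(bytes([0x03, 0x00, 0xFF, 0x00, 0x01]))
floats = decode_vector(bytes([0x27, 0x00, 0x00, 0x00, 0x80, 0x3F, 0x34, 0x12, 0x80, 0x7F]))
print(packed, ints, floats[0], math.isnan(floats[1]))
```

Python widens each `binary32` value to a double on unpacking, so the NaN payload bits shown in the example are not necessarily preserved by this sketch.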
+
+## Test Plan
+
+See the [README](tests/README.md) for tests.
## Changelog
+- 2025-02-05: Text clarifications, no technical change.
+
- 2025-02-04: Update validation for decoding into a FLOAT32 vector.
- 2024-11-01: BSON Binary Subtype 9 accepted DRIVERS-2926 (#1708)