Skip to content

Add field type for Binary vectors#2854

Merged
GromNaN merged 1 commit intodoctrine:2.13.xfrom
GromNaN:PHPORM-384
Oct 16, 2025
Merged

Add field type for Binary vectors#2854
GromNaN merged 1 commit intodoctrine:2.13.xfrom
GromNaN:PHPORM-384

Conversation

@GromNaN
Copy link
Member

@GromNaN GromNaN commented Oct 13, 2025

Q A
Type feature
BC Break no
Fixed issues PHPORM-384

Summary

Vector can be compressed into a binary field thanks to the new MongoDB\BSON\Binary::TYPE_VECTOR introduced in MongoDB PHP Extension v2.2.0

@GromNaN GromNaN added this to the 2.13.0 milestone Oct 13, 2025
@GromNaN GromNaN requested a review from Copilot October 13, 2025 19:26
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for MongoDB binary vector field types, leveraging the new MongoDB\BSON\Binary::TYPE_VECTOR introduced in MongoDB PHP Extension v2.2.0. It enables compression of vector data into binary fields for improved performance.

Key changes:

  • Introduces three new vector types: vector_float32, vector_int8, and vector_packed_bit
  • Implements type conversion between PHP arrays and MongoDB binary vector format
  • Adds comprehensive test coverage for the new vector types

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
lib/Doctrine/ODM/MongoDB/Types/Type.php Adds constants and type mappings for the three vector types
lib/Doctrine/ODM/MongoDB/Types/AbstractVectorType.php Base implementation for vector type conversion and validation
lib/Doctrine/ODM/MongoDB/Types/VectorFloat32Type.php Concrete implementation for Float32 vector type
lib/Doctrine/ODM/MongoDB/Types/VectorInt8Type.php Concrete implementation for Int8 vector type
lib/Doctrine/ODM/MongoDB/Types/VectorPackedBitType.php Concrete implementation for PackedBit vector type
tests/Doctrine/ODM/MongoDB/Tests/Types/VectorTypeTest.php Comprehensive test suite for all vector types
tests/Doctrine/ODM/MongoDB/Tests/Aggregation/Stage/VectorSearchTest.php Updates vector search test to use proper binary vector format
docs/en/reference/basic-mapping.rst Documents the new vector types and requirements
phpstan-baseline.neon Adds PHPStan suppressions for MongoDB extension 2.2+ features

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@GromNaN GromNaN requested a review from jmikola October 13, 2025 19:42
@GromNaN GromNaN force-pushed the PHPORM-384 branch 3 times, most recently from e143a37 to 7a1f26b Compare October 13, 2025 19:52
return $value->toArray();
}

public function closureToMongo(): string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these methods for? Might be helpful to document. At least to me it's unclear what they are

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, addressed your request in #2855

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, in all my years of working with ODM (on and off) I don't think I've ever fully understood why we needed all of these methods. The answer probably rests in what PHP 5.3 was capable of back in the day.

throw new InvalidArgumentException(sprintf('Invalid binary vector data of vector type %s received for vector field, expected vector type %s', $value->getVectorType()->name, $this->getVectorType()->name));
}

return $value->toArray();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the design, the property type is considered to be an array, so an array is always returned. However, it would be much more useful to know the type of the property in order to return a Binary if the property allows it, thus avoiding the need to decode the vector when it's not actually used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something not possible with the current design of the type, since we don't have any context for the class property to which we are associated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly.

throw new InvalidArgumentException(sprintf('Invalid binary vector data of vector type %s received for vector field, expected vector type %s', $value->getVectorType()->name, $this->getVectorType()->name));
}

return $value->toArray();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something not possible with the current design of the type, since we don't have any context for the class property to which we are associated?

use PHPUnit\Framework\TestCase;

/** Binary Vector type added in ext-mongodb 2.2+ */
#[RequiresMethod(Binary::class, 'fromVector')]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very handy. I don't get why Sebastian rejected RequiresClass: sebastianbergmann/phpunit#4792

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have used RequiresPhpExtension('mongodb', '>= 2.2')

Copy link
Member

@jmikola jmikola Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could folks just use RequiresMethod(MyClass::class, '__construct()') since all classes have a constructor?

Edit: No they cannot. Not all classes have a constructor (even implicit): https://3v4l.org/Y5OdE#v8.4.13

Copy link
Member

@jmikola jmikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to block this on an ext-mongodb 2.2.0 release? I'll defer to you.

I assume no changes to composer.json or the CI workflow are required.

PHP);
}

abstract protected function getVectorType(): VectorType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the PHPStan error here just due to the class not being available?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've skipped in PHPStan all errors related to binary vectors.

@GromNaN GromNaN force-pushed the PHPORM-384 branch 3 times, most recently from e1b52dc to 82a1b5c Compare October 16, 2025 14:01
"phpstan/phpstan-deprecation-rules": "^2.0",
"phpstan/phpstan-phpunit": "^2.0",
"phpunit/phpunit": "^10.4",
"phpunit/phpunit": "^10.5.58",
Copy link
Member Author

@GromNaN GromNaN Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the minimum version of PHPUnit as the version 10.4.0 load the data providers even if #[RequiresPhpExtension('mongodb', '>= 2.2')] is not matched.

@GromNaN GromNaN merged commit fa8affc into doctrine:2.13.x Oct 16, 2025
24 checks passed
@GromNaN GromNaN deleted the PHPORM-384 branch October 16, 2025 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants