Skip to content

Conversation

@mpdude
Copy link
Contributor

@mpdude mpdude commented Oct 8, 2025

Motivation

There are a few issues about differences in behaviour when using the collection filtering API (the Selectable interface) against collections that are database-backed (in ORM, these are PersistentCollections) vs. memory-based collections using ArrayCollection from this package here.

For example:

Database-based matching can work on the raw field values only, as those values are persisted to the database and there is no PHP code involved when filtering at the database level.

Memory-based matching currently tries a series of access methods on the objects.

The effects of this may be surprising. For example, with the ORM, it may be fine to filter entities based on the value of private or protected fields that have no getters. This works as long as a persistent collection is uninitialized. But as soon as it gets initialized, the ArrayCollection will require a getter method to be available.

Another (more rare) example is a getter method that does some type of type conversion, like having a string field with values like 'y'|'n' internally but returning a bool value from the getter; or, more generally, every type mismatch between the return value of the getter and the field value. Yet another example may be getters that cause side effects 🙈.

Proposed solution

I discussed with @beberlei at the Doctrine hackathon that the primary use case for this library here was supporting the ORM/ODM use cases. This can be seen in places as ClosureExpressionVisitor::getObjectFieldValue() that take a $field parameter.

So, although this library here has nothing to do with ORM/ODM mapping, I want to add a migration path here that moves the doctrine/collection behaviour closer to the implementation realities of ORM/ODM. This means to ultimately use direct (reflection-based) field access only. This is also what @Ocramius already suggested in this comment.

For reference, here is a list of discussions around which style of accessors, getters, issers, public access etc. should be used or not used – in the future, the answer would be "only direct state (raw property value) matters".

Migration path

This feature is opt-in and will be activated by passing accessRawFieldValues: true to the Criteria constructor. The Criteria object is what is typically constructed by users in preparation for calling the Selectable API, so it seems to be a good fit.

By opting in through this flag, memory-based comparisons and sorting will use direct field access only. Not activating the feature triggers a deprecation notice. In the next major version, direct field access will be the only (default) behaviour.

The $accessRawFieldValues can be removed in the next major version (or, possibly, go through another round of deprecations in case when it is still passed, before being eventually removed).

Caveat

As explained in more detail in #487 (references here after for more better visibility), accessing fields through reflection may bypass initialization of proxy objects in the ORM/ODM. See #487 for more details.

Remaining edge case

Given an inheritance hierarchy of classes where a multiple classes feature a private field of the same name, the downmost field will be picked.

This may differ from Doctrine ORM behaviour when this field is not mapped at all in the ORM and another field (higher up the class hierarchy) is used as the mapped field instead.

We cannot solve this without having access to ORM/ODM mapping metadata at hand, which is not possible from within an ArrayCollection that is typically created as a newable type. We rather plan to discourage or even prevent this kind of setup (entity class hierarchy with different classes having fields of the same name) at some point when loading and validating metadata.

TODO

@mpdude mpdude force-pushed the access-fields-directly branch 2 times, most recently from c5653c0 to 87770cf Compare October 8, 2025 15:12
@mpdude mpdude changed the title Add a switch to make the ClosureExpressionVisitor use direct field access through reflection only Transition to using reflection-based, direct field access only Oct 8, 2025
Copy link
Member

@stof stof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it weird that so few tests are updated to use the non-deprecated usage, while also having no deprecations reported in the PHPUnit output. I think we need to improve the CI setup regarding deprecations.

The only tests that should trigger our own deprecations should be test covering legacy behaviors (basically tests that would be removed in the next major version when removing legacy code, without regressing code coverage)

@mpdude mpdude force-pushed the access-fields-directly branch 4 times, most recently from 53d2495 to 657b65c Compare October 8, 2025 23:10
@mpdude
Copy link
Contributor Author

mpdude commented Oct 8, 2025

This PR is now based on #473, so it includes deprecation checks in PHPUnit. That was merged, so I rebased.

Please advise about the coding standard issue.

@greg0ire
Copy link
Member

greg0ire commented Oct 9, 2025

Please advise about the coding standard issue.

I think in both cases, you are facing bugs, right? If yes, let's report them (if not reported by somebody else already) and ignore them with a comment.

@mpdude
Copy link
Contributor Author

mpdude commented Oct 9, 2025

PHPCSStandards/PHP_CodeSniffer#734 - maybe no need to report anything?

@mpdude mpdude force-pushed the access-fields-directly branch from 897f378 to 1a7598d Compare October 9, 2025 07:59
@mpdude mpdude changed the base branch from 2.4.x to 2.3.x October 9, 2025 08:00
@mpdude mpdude changed the base branch from 2.3.x to 2.4.x October 9, 2025 08:00
@mpdude mpdude changed the base branch from 2.4.x to 2.3.x October 9, 2025 08:02
@mpdude mpdude requested a review from stof October 9, 2025 08:39
@mpdude
Copy link
Contributor Author

mpdude commented Oct 21, 2025

@stof Could you give me another round of review here?

Thx!

@mpdude mpdude force-pushed the access-fields-directly branch from 1a7598d to 64d3bd6 Compare October 22, 2025 11:36
@mpdude mpdude requested a review from stof October 22, 2025 16:29

use Doctrine\Deprecations\Deprecation;

Deprecation::withoutDeduplication();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I'm working on making this easier: doctrine/deprecations#94

@mpdude
Copy link
Contributor Author

mpdude commented Oct 23, 2025

Feedback and deprecations addressed

@greg0ire
Copy link
Member

Since this introduces deprecations, you should target 2.4.x

src/Criteria.php Outdated
* @return static
*/
public static function create()
public static function create(bool $accessRawFieldValues = false): self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change, you should use func_get_args() IMO

https://3v4l.org/Z4ZnM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted! fixed

@mpdude mpdude force-pushed the access-fields-directly branch from 1334749 to c319d1a Compare October 24, 2025 14:26
@mpdude mpdude dismissed beberlei’s stale review October 24, 2025 18:30

The merge-base changed after approval.

@beberlei
Copy link
Member

Great work cleaning this up

# Motivation

There are a few issues about differences in behaviour when using the collection filtering API (the `Selectable` interface)  against collections that are database-backed (in ORM, these are `PersistentCollections`) vs. memory-based collections using `ArrayCollection` from this package here.

For example:

* doctrine/orm#11160
* doctrine#170
* doctrine/orm#3591
* Maybe doctrine/orm#11021

Database-based matching can work on the raw field values only, as those values are persisted to the database and there is no PHP code involved when filtering at the database level.

Memory-based matching currently tries a [series of access methods on the objects](https://www.doctrine-project.org/projects/doctrine-collections/en/2.3/index.html#selectable-methods:~:text=For%20collections%20that%20contain%20objects).

The effects of this may be surprising. For example, with the ORM, it may be fine to filter entities based on the value of `private` or `protected` fields that have no getters. This works as long as a persistent collection is uninitialized. But as soon as it gets initialized, the `ArrayCollection` will require a getter method to be available.

Another (more rare) example is a getter method that does some type of type conversion, like having a `string` field with values like `'y'|'n'` internally but returning a `bool` value from the getter; or, more generally, every type mismatch between the return value of the getter and the field value. Yet another example may be getters that cause side effects 🙈.

# Proposed solution

I discussed with @beberlei at the Doctrine hackathon that the primary use case for this library here was supporting the ORM/ODM use cases. This can be seen in places as `ClosureExpressionVisitor::getObjectFieldValue()` that take a `$field` parameter.

So, although this library here has nothing to do with ORM/ODM mapping, I want to add a migration path here that moves the `doctrine/collection` behaviour closer to the implementation realities of ORM/ODM. This means to ultimately use direct (reflection-based) field access only. This is also what @Ocramius already suggested in [this comment](doctrine#149 (comment)).

For reference, here is a list of discussions around which style of accessors, getters, issers, public access etc. should be used or not used – in the future, the answer would be "only direct state (raw property value) matters".

* doctrine#276
* doctrine#263
* doctrine#149
* doctrine#135
* doctrine#134
* doctrine#95
* doctrine#62

# Migration path

This feature is opt-in and will be activated by passing `accessRawFieldValues: true` to the `Criteria` constructor. The `Criteria` object is what is typically constructed by users in preparation for calling the `Selectable` API, so it seems to be a good fit.

By opting in through this flag, memory-based comparisons and sorting will use direct field access only. Not activating the feature triggers a deprecation notice. In the next major version, direct field access will be the only (default) behaviour.

The `$accessRawFieldValues` can be removed in the next major version (or, possibly, go through another round of deprecations in case when it is still passed, before being eventually removed).

# Remaining edge case

Given an inheritance hierarchy of classes where a multiple classes feature a `private` field of the same name, the downmost field will be picked.

This may differ from Doctrine ORM behaviour when this field is not mapped at all in the ORM and another field (higher up the class hierarchy) is used as the mapped field instead.

We cannot solve this without having access to ORM/ODM mapping metadata at hand, which is not possible from within an `ArrayCollection` that is typically created as [a newable type](https://testing.googleblog.com/2008/10/to-new-or-not-to-new.html). We rather plan to discourage or even prevent this kind of setup (entity class hierarchy with different classes having fields of the same name) at some point when loading and validating metadata.
@mpdude mpdude force-pushed the access-fields-directly branch from 73b4ab7 to dd8efc7 Compare October 24, 2025 19:00
@mpdude
Copy link
Contributor Author

mpdude commented Oct 24, 2025

Now rebased against the up-to-date 2.4.x branch

@greg0ire greg0ire added this to the 2.4.0 milestone Oct 25, 2025
@greg0ire greg0ire merged commit 9acfeea into doctrine:2.4.x Oct 25, 2025
11 checks passed
@greg0ire
Copy link
Member

Thanks @mpdude !

@mpdude mpdude deleted the access-fields-directly branch October 26, 2025 10:25
mpdude added a commit to mpdude/doctrine-collections that referenced this pull request Oct 26, 2025
This removes the `accessRawFieldValues` parameter that was added in doctrine#472 in the 3.0.0 branch.

Since this major version, fields values are always as raw values by means of reflection, so the parameter was a no-op.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Oct 26, 2025
This updates the code to avoid triggering the deprecation introduced in doctrine/collections#472.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Oct 26, 2025
This updates the code to avoid triggering the deprecation introduced in doctrine/collections#472.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Oct 26, 2025
This updates the code to avoid triggering the deprecation introduced in doctrine/collections#472.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Oct 26, 2025
This updates the code to avoid triggering the deprecation introduced in doctrine/collections#472.
W0rma added a commit to W0rma/module-doctrine that referenced this pull request Oct 28, 2025
Fixes a deprecation warning when using doctrine/collections:2.4
See doctrine/collections#472
W0rma added a commit to W0rma/module-doctrine that referenced this pull request Oct 28, 2025
Fixes a deprecation warning when using doctrine/collections:2.4
See doctrine/collections#472
mpdude added a commit to mpdude/doctrine-collections that referenced this pull request Oct 28, 2025
greg0ire added a commit that referenced this pull request Oct 29, 2025
Add a note that #472 may leave proxies uninitialized
}

if ($firstResult === null && func_num_args() > 2) {
Deprecation::trigger(
Copy link

@W0rma W0rma Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that this deprecation will always be triggered if f.e. new Criteria(accessRawFieldValues: true) is used.

That's because of the nature of func_num_args() as explained in https://www.php.net/manual/en/function.func-num-args.php:

As of PHP 8.0.0, the func_*() family of functions is intended to be mostly transparent with regard to named arguments, by treating the arguments as if they were all passed positionally, and missing arguments are replaced with their defaults. This function ignores the collection of unknown named variadic arguments. Unknown named arguments which are collected can only be accessed through the variadic parameter.

See also https://3v4l.org/5X4Xu#v8.4.14

Users can work around that by either using new Criteria(firstResult: 0, accessRawFieldValues: true) or Criteria::create(true).

Just FYI if other users might get confused about that as well.

TavoNiievez pushed a commit to Codeception/module-doctrine that referenced this pull request Nov 12, 2025
* Enable native lazy objects whenever possible.

The LazyGhostTrait will be removed from symfony/var-exporter:8.

* Fix usage of deprecated phpstan config option

* Set $accessRawFieldValues explicitly

Fixes a deprecation warning when using doctrine/collections:2.4
See doctrine/collections#472
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants