What are the maintainers' opinions on using php-ds as a basis for new data structures in php-src(core)? #156
Comments
Hi @TysonAndre 👋 My long-term intention has been to not merge this extension into php-src. I would like to see it become available as a default extension at the distribution level. Unfortunately I have no influence or understanding of that process. Having an independent release and development cycle is a good thing, in my opinion.
If those plans change, I would like to hold off until a 2.0 release - I've learnt a lot over the last 4 years and would like to revisit some of the design decisions I made then, such as a significant reduction of the interfaces or perhaps more interfaces with greater specificity. Functions like
I have been working on a research project to design persistent data structures for immutability, so there is a lot of work that I have set for myself for this project over the next 6 months or so. I have no intention to push for distribution changes in the short-term but I am open to the suggestion.
Thanks for the response
Do you mean OS distribution level (Windows, Ubuntu, CentOS, HomeBrew for mac, etc.)? I agree on the benefits of flexibility and being able to port improvements back to older php versions, but adoption rates would be lower
So for
He meant distribution with PHP core (on all platforms where PHP is available).
Most likely he means libraries such as https://github.com/nikic/iter.
Whichever is more viable - simply not merged into core, but distributed and enabled by default alongside it.
Correct. You could create a new instance of anything that can be constructed from an iterable, so something like
Unfortunately, performance takes a hit because internal iteration is much, much faster (no need to pass through the Iterator methods). I hope to support both lazy collections and internal iteration, but I haven't drafted any plans for that yet.
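To illustrate the "construct from an iterable" idea above, here is a minimal sketch in plain PHP, so it runs without the extension installed. `lazyMap` and `TinyVector` are hypothetical names invented for this example and are not part of php-ds or php-src. Every element passes through userland iteration here, which is the overhead the comment contrasts with internal iteration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (an assumption, not a php-ds function):
 * lazily applies $fn to each value of any iterable.
 */
function lazyMap(callable $fn, iterable $values): Generator
{
    foreach ($values as $value) {
        yield $fn($value);
    }
}

/**
 * Hypothetical stand-in for a collection that can be built from any iterable.
 */
final class TinyVector implements IteratorAggregate, Countable
{
    /** @var array<int, mixed> */
    private array $items = [];

    public function __construct(iterable $values = [])
    {
        // Copy values one by one; keys of the source are discarded.
        foreach ($values as $value) {
            $this->items[] = $value;
        }
    }

    public function toArray(): array
    {
        return $this->items;
    }

    public function count(): int
    {
        return count($this->items);
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->items);
    }
}

$source  = new TinyVector([1, 2, 3]);
// A "map" expressed as: wrap the source lazily, then build a new instance from it.
$doubled = new TinyVector(lazyMap(fn (int $x): int => $x * 2, $source));
```

The generator keeps the transformation lazy until the new collection consumes it, so no intermediate array is built, but each value still goes through the generator and `foreach` machinery rather than a tight internal loop.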
I'm going to close this discussion because I would rather start from scratch with a much smaller footprint and a focus on persistence.
I just discovered this thread from a message by @TysonAndre on internals today: @rtheunissen, I would kindly suggest to think twice about 2 of your decisions above:
I understand some of the benefits of such a decision; however, as @TysonAndre said, adoption will be much slower, and people typically will not use data structures in their library or open-source app, because portability will suffer a lot. On the contrary, if data structures come into core, there will be a strong announcement effect, and people will surely start moving some of their arrays to specific data structures. Libraries will start using them internally, because they'll know that they can rely on it being always available. I do understand that you want to improve your first implementation (which is already excellent, by my standards!), but building on your past "mistakes", can't you build v2 with integration into core in mind? Even if that's not in the next minor release, but having a long-term goal of stabilization of the API and integration into core would be great.
Again, I understand the rationale behind this decision, like reducing duplication and keeping only the core functionality in DS. However, sometimes you have to take into consideration ease of use vs purity of the code.
Thank you for your work on DS anyway, I already use the extension in my closed-source project, in particular
I totally agree with @BenMorel and I believe this issue should be revisited since a new RFC came up: There's a specific section on it that mentions this extension. Maybe if @rtheunissen reached out to @TysonAndre they could work on this matter together so PHP would have data structures in the core (which I think is really important) and the new improvements that would come in a v2 are implemented. Here is the discussion thread:
@BenMorel puts the reasons why I think php should have better quality datastructures always-on in core pretty well (always on rather than in an external extension, even if some of the numerous projects that package php and distribute it also enable some version of php-ds: docker images, Linux distributions, third party repos for Linux distributions offering alternate binary builds, automation tools such as chef/puppet, homebrew, Windows PHP team, etc.).
NOTE: I'm currently working on https://wiki.php.net/rfc/deque instead after the RFC discussion on I mentioned that the vector rfc was on hold on a different mailing list thread on
I still had a few more minor changes planned for the
From the response to my original question - #156 (comment) - I've been assuming that they'd be opposed to using any of their work for a competing RFC, especially if future proposals would be significantly different from their plans for v2 (e.g. Vector::filter, map, etc). So I felt like reimplementing data structures independently in my spare time using only alternative sources (such as php-src's SplFixedArray, ArrayObject, SplObjectStorage,
In the event that @rtheunissen had changed their mind (on merging php-ds into core) and had plans of their own to create a competing RFC, I would be willing to postpone the (Right now, it'll take at least 3 weeks to start a vote on the
Hi everyone, I am happy to see this discussion and I thank you all for taking part. My reservation to merge ds into core has always been because I wanted to make sure we get it right before we do that and the intention behind the mythical v2 was to achieve that, based on learnings from v1 and feedback from the community. I have no personal attachment to this project, I only want what is best for PHP and the community.
I would love to see a dedicated, super-lean vec data structure in core that has native iteration and all the other same internal benefits as arrays. In my opinion, the API should be very minimal and potentially compatible with all the non-assoc array functions. An OO interface can easily be designed around that. I'm imagining something similar to Golang's slices.
As for the future of ds itself, I think these can co-exist and ds can remain external. I've been researching and designing immutable data structures over the last 4 years and I still hope to develop a v2 that simplifies the interfaces and introduces immutable structures. Attempting to implement a suite of structures in core or an OO vector would take a lot of work and might be difficult to reach consensus on with the API. I don't think we should attempt to merge ds into core at any time.
I am currently traveling and have not followed this discussion in detail on the mailing list. I'd be happy to assist in any way I can and will catch up as soon as I am home again this week. Feel free to quote this response on the mailing list as well.
That's great, many in the PHP community and internals will be happy to hear that, and may speed up the timeline for the inclusion of some datastructures.
I would also want a native
Some questions for when you get back:
@TysonAndre Could you add \Ds\Vector to the benchmark? Considering the extension is mentioned in the RFC I think it would make sense to have it in the comparison as well. I'm quite surprised that arrays are so fast. Also I noticed that in the benchmark for the new Vector class you're using array access instead of methods. I wonder if methods would be faster.
Method calls are significantly slower than the array operator shorthands. That is unsurprising: it's widely known that php method/function calls are slow and have high overhead (compared to other programming languages) despite optimization efforts and investigations. Arrays are highly optimized in PHP because their high frequency of use has made them a heavy optimization focus, and because opcache can make many inferences about them to avoid dead code, unnecessary cleanup, etc.
https://externals.io/message/116048#116077 and the limitations of that benchmark are mentioned in my response to that in https://externals.io/message/116048#116080 for the thread on
This benchmark focuses on the improvements that new datastructures in core would have over the options already available in core, for use cases that would benefit from having this functionality in core (e.g. required for portability reasons). Mentioning an option that is not in core in the benchmarking would send a mixed message if people skimmed over the RFC and failed to realize
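For readers who want to check the method-call overhead claim on their own machine, a rough sketch of a micro-benchmark follows. It is not the benchmark from the RFC: `Box`, `sumByIndex` and `sumByMethod` are hypothetical names made up for this example, and the timings depend heavily on PHP version, opcache and JIT settings, so no expected numbers are given.

```php
<?php
declare(strict_types=1);

/** Hypothetical userland wrapper whose only job is to add a method call per read. */
final class Box
{
    /** @var array<int, int> */
    private array $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    public function get(int $index): int
    {
        return $this->items[$index];
    }
}

/** Sums the first $n elements using the native array operator. */
function sumByIndex(array $data, int $n): int
{
    $total = 0;
    for ($i = 0; $i < $n; $i++) {
        $total += $data[$i];
    }
    return $total;
}

/** Sums the first $n elements through a method call per element. */
function sumByMethod(Box $box, int $n): int
{
    $total = 0;
    for ($i = 0; $i < $n; $i++) {
        $total += $box->get($i);
    }
    return $total;
}

$n    = 10000;
$data = range(1, $n);
$box  = new Box($data);

$start     = hrtime(true);          // monotonic clock, nanoseconds
$viaIndex  = sumByIndex($data, $n);
$indexNs   = hrtime(true) - $start;

$start     = hrtime(true);
$viaMethod = sumByMethod($box, $n);
$methodNs  = hrtime(true) - $start;

printf("array operator: %d ns, method call: %d ns\n", $indexNs, $methodNs);
```

Both loops compute the same sum, so any difference in the printed timings comes from the access path rather than the work done. A single run like this is noisy; repeating it several times and comparing medians gives a fairer picture.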
php-ds is mentioned in the RFC because the question of why we should work on this functionality in core comes up frequently in proposals to improve datastructures in php itself and needed to be answered for people asking those questions. As the maintainer of php-ds said, improvements to php's core can coexist with PECL libraries outside of core. If there is a competing RFC, I will update the RFC with those benchmarks.
My position currently is that I think we should start from scratch and borrow whatever is good from php-ds that we want to keep to implement something natively in core. In the 4 or 5 years since I wrote this extension I've been studying persistent data structures in-depth and there are a lot of decisions that I made then that I would do differently now. I would like to be a part of the design and implementation of the data structures themselves, but I do not have the understanding or capacity to be involved in work relating to the engine or its integration. It seems unlikely that a 2.0 of this extension will come about; I'm not convinced that a complete rework would be a good investment of our time. Would anyone be interested in a call sometime to discuss some hopes and dreams for all of this?
I didn't find any similar questions after a quick search on the repo's issue tracker,
and couldn't be certain from linked discussions from the blog post.
This was suggested on the php-internals mailing list a few years ago, but never made it past the idea phase (e.g. a comment wondering if new classes should be enhanced with a wider method set for functional programming, but that seems to have been largely added for many classes? https://externals.io/message/93301#93347 ).
(e.g. creating an RFC proposing moving DS\Vector to \Vector or \PHP\Vector in php 8.1, or an RFC proposing moving the extension as-is into php-src/ext/ds keeping the \DS namespace and moving development to php-src)
@author tags, phpinfo() attribution, input on initial RFCs, etc.