-
Notifications
You must be signed in to change notification settings - Fork 403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENHANCEMENT] Argilla SDK: Updating record fields and vectors #5026
Conversation
…tainers (api_model method)
@frascuchon @davidberenstein1957 Looks nice, just some in line comments and queries. |
def __init__(self, metadata: Optional[Dict[str, MetadataValue]] = None) -> None: | ||
super().__init__(metadata or {}) | ||
|
||
def __getattr__(self, item: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we definitely going to support both of these strategies?
record.vectors["new-vector"] = [1.0, 2.0, 3.0]
record.vectors.vector = [1.0, 2.0, 3.0]
record.metadata["new-key"] = "new_value"
record.metadata.key = "new_value"
Personally, I'm not 100% sure about the attribute based approach because I don't like how the IDE behaves. Is there a way to improve the experience there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is clearer to only support one, because otherwise, we might end up with these too fragmented approaches for doing the same thing again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I agree with this. I've aligned fields and vectors with metadata, supporting both approaches. Working with dict is fine for me, and is clearest to users since in the doc they are described as dict. Same feeling with IDE warnings when using attributes. I would say to keep the dict-like access and raise an error when accessing using attributes (I think the default behavior is do-nothing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question here, should we do the same for responses and suggestions? Now, we support:
record.responses.label
record.suggestions.text
which have the same IDE problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. So we'll open a new PR for suggestions and responses, where we'll use a dictionary style access as well? Nice.
return [Vector(name=name, values=value).api_model() for name, value in self.items()] | ||
|
||
def to_dict(self) -> Dict[str, List[float]]: | ||
return {key: value for key, value in self.items()} | ||
|
||
|
||
class RecordResponses(Iterable[Response]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will this class be used when users want to:
- add a first response after retrieving records?
- add subsequent responses after retrieving records?
Ideally, these should just be the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, both cases should be something like this:
record.responses.add(rg.Response(...)..)
# or
record.responses.add[rg.Response(...), rg.Response(...)])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments.
def to_dict(self) -> Dict[str, Union[str, None]]: | ||
return self.__fields | ||
def to_dict(self) -> dict: | ||
return {key: value for key, value in self.items()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return {key: value for key, value in self.items()} | |
return dict(self.items()) |
def __init__(self, metadata: Optional[Dict[str, MetadataValue]] = None) -> None: | ||
super().__init__(metadata or {}) | ||
|
||
def __getattr__(self, item: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is clearer to only support one, because otherwise, we might end up with these too fragmented approaches for doing the same thing again.
self[key] = value | ||
|
||
def to_dict(self) -> dict: | ||
return {key: value for key, value in self.items()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return {key: value for key, value in self.items()} | |
return dict(self.items()) |
return [Vector(name=name, values=value).api_model() for name, value in self.items()] | ||
|
||
def to_dict(self) -> Dict[str, List[float]]: | ||
return {key: value for key, value in self.items()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return {key: value for key, value in self.items()} | |
return dict(self.items()) |
vectors=[rg.Vector("text", [0, 0, 0])], | ||
id=str(uuid.uuid4()), | ||
vectors={"text": [0, 0, 0]}, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<!-- Thanks for your contribution! As part of our Community Growers initiative 🌱, we're donating Justdiggit bunds in your name to reforest sub-Saharan Africa. To claim your Community Growers certificate, please contact David Berenstein in our Slack community or fill in this form https://tally.so/r/n9XrxK once your PR has been merged. --> # Description This PR addresses [discussion](#5026 (comment)) from PR #5026 and removes attribute-like access for record fields, vectors and metadata. **Type of change** (Please delete options that are not relevant. Remember to title the PR according to the type of change) - [ ] New feature (non-breaking change which adds functionality) - [ ] Refactor (change restructuring the codebase without changing functionality) - [ ] Improvement (change adding some improvement to an existing functionality) **How Has This Been Tested** (Please describe the tests that you ran to verify your changes. And ideally, reference `tests`) - [ ] Test A - [ ] Test B **Checklist** - [ ] I added relevant documentation - [ ] I followed the style guidelines of this project - [ ] I did a self-review of my code - [ ] I made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above) - [ ] I have added relevant notes to the `CHANGELOG.md` file (See https://keepachangelog.com/)
Description
This PR reviews the record attributes and normalizes how to work with fields, vectors, and metadata. Now, all are treated as dictionaries and users can update in the same way that working with dictionaries or creating new attributes:
Once this approach is approved, I will create a new PR changing the docs.
Type of change
(Please delete options that are not relevant. Remember to title the PR according to the type of change)
How Has This Been Tested
(Please describe the tests that you ran to verify your changes. And ideally, reference
tests
)Checklist
CHANGELOG.md
file (See https://keepachangelog.com/)