Skip to content

Conversation

@kannwism
Copy link
Contributor

No description provided.

@kannwism
Copy link
Contributor Author

kannwism commented Aug 2, 2024

@Makuh17
This should be in a somewhat functional state. In the examples folder, there is a jupyter notebook with two examples. If you find the time, could you have a look and let me know what you think?

The logic for re fetching an expired token isn't there yet, so not ready to merge yet

@kannwism kannwism marked this pull request as ready for review August 13, 2024 08:24
@kannwism kannwism requested a review from Makuh17 August 13, 2024 09:03
Copy link
Contributor

@Makuh17 Makuh17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few questions and comments. Mostly nitpicking, or confusions from my side

"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mException\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[3], line 17\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01msrc\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mrapidata_client\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01morder\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mreferee\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mnaive_referee\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m NaiveReferee\n\u001b[1;32m 4\u001b[0m \u001b[38;5;66;03m# Configure order\u001b[39;00m\n\u001b[1;32m 5\u001b[0m order \u001b[38;5;241m=\u001b[39m (\n\u001b[1;32m 6\u001b[0m \u001b[43mrapi\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnew_order\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 7\u001b[0m \u001b[43m \u001b[49m\u001b[43mname\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mExample Classify Order\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 8\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 9\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mworkflow\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 10\u001b[0m \u001b[43m \u001b[49m\u001b[43mClassifyWorkflow\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 11\u001b[0m \u001b[43m \u001b[49m\u001b[43mquestion\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mWho should be president?\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 12\u001b[0m \u001b[43m \u001b[49m\u001b[43mcategories\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mKamala Harris\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mDonald Trump\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 13\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 14\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mreferee\u001b[49m\u001b[43m(\u001b[49m\u001b[43mNaiveReferee\u001b[49m\u001b[43m(\u001b[49m\u001b[43mrequired_guesses\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m15\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 15\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfeature_flags\u001b[49m\u001b[43m(\u001b[49m\u001b[43mFeatureFlags\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43malert_on_fast_response\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m3\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 16\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[0;32m---> 17\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcreate\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 18\u001b[0m )\n\u001b[1;32m 20\u001b[0m \u001b[38;5;66;03m# Add data\u001b[39;00m\n\u001b[1;32m 21\u001b[0m order\u001b[38;5;241m.\u001b[39mdataset\u001b[38;5;241m.\u001b[39madd_images_from_paths([\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mexamples/data/kamala_trump.jpg\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess ideally we do not want to push the outputs?

" criteria=\"Who should be president?\",\n",
" )\n",
" .matches_until_completed(5)\n",
" .match_size(2)\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you do not set e.g. match size? I guess it defaults to two?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, defaults to two

"GB",
"US",
]
GERMAN_SPEAKING = (["AT", "DE"],)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol at CH not being included xD

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:D we should do language targeting in the future anyway


def alert_on_fast_response(self, value: int):
self._flags["alertOnFastResponse"] = str(value)
return self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the return self achieve here? I just returns the whole FF object again for convienience?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the idea of these "fluent" apis, where you can chain the calls. Not quite sure if we like them or not ^^

self._flags["alertOnFastResponse"] = str(value)
return self

def disable_translation(self, value: bool):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to default to True? Would we ever use these types of FFs without setting them true?

client_secret: str,
endpoint: str = "https://api.rapidata.ai",
):
self._rapidata_service = RapidataService(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you differentiate between a Client and and Service? And where is the client used and when are the more low level classes used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was that all http requests are contained in some sort of service.

  • Should make it possible to write unit tests for the client by mocking the rapidataService, not the requests package itself
  • Layer of separation between the client, that deals with all sorts of things and the actual handling of the http stuff

)

def upload_text_sources(self, dataset_id: str, text_sources: list[str]):
url = f"{self.endpoint}/Dataset/UploadTextSourcesToDataset"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a new endpoint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not super new I think, but that's the one used for text classification a la multilingual fine web


response = self._post(url, json=payload)

return response
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to raise errors if the request does not go through? Or is this done somewhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ._post() method contains some basic checking logic that raises an error

Comment on lines 28 to 35
images_bytes: list[bytes] = [
ImageUtils.convert_PIL_image_to_bytes(image) for image in images
]

files = [
("files", (image_name, image_bytes))
for image_name, image_bytes in zip(image_names, images_bytes)
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need PIL and image utils and such? is it to allow multiple images? This is how I have uploaded files so far:
files = {"files": (file_name, open(file_path, "rb"), file_type)}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah maybe we don't 😅 But it was a bit of a pain to get it to work with PIL, so I'd leave it in there in case I need it at some point since I do tend to work with PIL images in python scripts. But I could switch it back to using the BuffereReader, like it works for uploading videos.

@kannwism kannwism merged commit a954782 into main Aug 13, 2024
@kannwism kannwism deleted the mari/first-version branch August 13, 2024 15:35
This was referenced Nov 4, 2025
@claude claude bot mentioned this pull request Dec 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants