Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Multimodal Input Support (Image, Audio, Video) to App-UI in MS-Swift Library #2469

Open
SushantGautam opened this issue Nov 18, 2024 · 0 comments

Comments

@SushantGautam
Copy link

The MS-Swift library currently supports models capable of processing multimodal input (image, audio, video) via the web-UI. However, this functionality is not available in the app UI. We request the inclusion of multimodal input support in the app-UI to enable seamless integration and usage of models with multimodal capabilities, aligning it with the web UI's features.

Adding this feature will enhance the MS-Swift library's usability in mobile or desktop application development, ensuring consistent multimodal support across platforms. This could involve creating APIs for uploading and processing different data modalities and providing developers with examples or templates for implementation. Such an update would broaden the library’s applicability in real-world scenarios, such as multimedia content analysis, accessibility tools, and creative applications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant