Commit b3f0aa8

0.1.4.6
1 parent 968669b commit b3f0aa8

3 files changed

+13
-26
lines changed

README.md

+13-26
@@ -61,43 +61,30 @@ But remains a great solution for users with minimal technical knowledge or exper

 ### Tested on Windows 10+ and Nvidia GPU-based cards

-### Update [0.1.4]
+### Update [0.1.4.6]
+<img src="media/covers/SEAIT_anim.gif">

-Added
+Added:

-### bark-gui and openai whisper-ui both tested on GTX 970 4GB and worked great.
+- VisionCrafter

-[Play man](media/preview/0.1.4/final_23-25-37.wav)
+A tool that can generates animations and music from text,
+ideal for producing short videos and GIFs, as well as creating brief cinematic scenes.

-[Play woman](media/preview/0.1.4/final_23-33-38.wav)
+https://github.com/diStyApps/VisionCrafter

-- bark-gui [text-to-speech and voice cloning]
+- VisualClipPicker

-https://github.com/suno-ai/bark
+Trimming Clips by Face Recognition

-Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints ready for inference.
+https://github.com/diStyApps/VisualClipPicker

-https://github.com/C0untFloyd/bark-gui
+Changed:

-bark-gui is This is a simple Web UI for an extended Bark Version using Gradio, meant to be run locally.
-
-- whisper-ui [speech-to-text]
-
-https://github.com/openai/whisper
-
-Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
-
-A bit old maybe there new GUIs for whisper but i used this one.
-
-https://github.com/hayabhay/whisper-ui
-
-whisper-ui is a simple Streamlit UI for OpenAI's Whisper speech-to-text model. It let's you download and transcribe media from YouTube videos, playlists, or local files. You can then browse, filter, and search through your saved audio files.
-
-I have also have an old fork of this project with some differences that let chose gpu or cpu but its older then this one i might added later if requested.
-
-Minor fixes and changes to the code.
+Vladiffsuion to SD.Next

 ### Spread the word; don't only keep it to yourself.
+<img src="media/preview/0.1.4.6/1.png">
 <img src="media/preview/0.1.4/1_0.1.4.jpg">
 <img src="media/preview/0.1.2/1_0.1.2.jpg">
 <img src="media/preview/0.1.3/1_0.1.3.jpg">

media/covers/SEAIT_anim.gif

6.3 MB

media/preview/0.1.4.6/1.png

128 KB
