So many examples but no example for punctuate #11
Comments
Hi, you can refer to You just need to modify the code in
Above content's output:
@FerdinandZhong thank you very much for the example. Your output is pretty decent and would make my job much easier. Do you have a pip install so that I can use your example code directly in Visual Studio? Or do you know how I can add your project to a Python virtual environment in Visual Studio on Windows 10? I am a C# programmer, so I have very little knowledge of Python. E.g. I want to use it like this
Hi, you can install the package by Then you can use it as shown in the example. |
Hi @FurkanGozukara, may I know if you have managed to use the punctuator? If yes, I will close the issue |
Thank you so much for the follow-up. I get the error below; any ideas? I have an RTX 3060 and I run some other models with CUDA fine. I copy-pasted your example code.
CUDA error: invalid device ordinal
The inference arguments from debugging are as follows:
InferenceArguments(model_name_or_path='Qishuai/distilbert_punctuator_en', tokenizer_name='Qishuai/distilbert_punctuator_en', tag2punctuator={'O': ('', False), 'COMMA': (',', False), 'PERIOD': ('.', True), 'QUESTIONMARK': ('?', True), 'EXLAMATIONMARK': ('!', True)}, tag2id_storage_path=None, gpu_device=1)
Hi @FurkanGozukara, unfortunately, as the official torch documentation mentions, multiprocessing with torch tensors is currently not supported on Windows, as shown in the image below. I will look into how to work around this issue and release a newer version soon. I will keep this issue open for now.
@FerdinandZhong awesome, thank you very much. I use my GPU with Whisper and it works great, as I show in the videos below:
How Good is RTX 3060 for ML AI Deep Learning Tasks and Comparison With GTX 1050 Ti and i7 10700F CPU
How to do Free Speech-to-Text Transcription Better Than Google Premium API with OpenAI Whisper Model
By the way, can I currently use it with the CPU instead of the GPU?
Hi @FurkanGozukara glad to hear it works for your video. Yes, you can also use CPU instead, however the inference speed might be a little bit slower. |
@FerdinandZhong so how do we use the CPU instead of the GPU? I tried several parameters, but all of them failed with a CUDA error.
Simply run it on a machine without any GPU cards. The current version checks whether CUDA is available, as shown below:

    if torch.cuda.is_available():
        self.device = torch.device(f"cuda:{inference_arguments.gpu_device}")
        logger.info(f"device type: {self.device.type}")
    else:
        self.device = torch.device("cpu")

This behaviour could also be improved by adding CPU as an option in the inference arguments.
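A minimal sketch of how such an option might look, written as a pure function so it can be tried without a GPU. The names `resolve_device` and `force_cpu` are hypothetical and not part of the package. CUDA numbers devices from 0, so on a machine with a single GPU an index of 1 does not exist, which is one plausible cause of an "invalid device ordinal" error; the sketch rejects such an index with a clearer message.

```python
def resolve_device(gpu_device, cuda_available, device_count, force_cpu=False):
    """Return a torch-style device string such as "cuda:0" or "cpu".

    gpu_device:     requested GPU index (0-based, as CUDA numbers devices)
    cuda_available: whether CUDA can be used at all
    device_count:   number of visible CUDA devices
    force_cpu:      hypothetical option to select the CPU even if a GPU exists
    """
    if force_cpu or not cuda_available:
        return "cpu"
    if not 0 <= gpu_device < device_count:
        # Fail early with a readable message instead of a raw CUDA error.
        raise ValueError(
            f"gpu_device={gpu_device} is out of range; "
            f"valid indices are 0..{device_count - 1}"
        )
    return f"cuda:{gpu_device}"
```

With PyTorch installed, the inputs could come from `torch.cuda.is_available()` and `torch.cuda.device_count()`, and the returned string can be passed to `torch.device(...)`.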
@FerdinandZhong I removed that code and it works really fast on the CPU as well, but you should support a CPU argument too. The result is printed on the screen, but how can I save it to a text file?
You can simply assign the output to a string and save (append) the string to a file, which is very simple in Python.
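A minimal sketch of that suggestion, assuming the punctuated output is already available as a plain string. The helper name `append_text` is hypothetical and only uses the Python standard library.

```python
from pathlib import Path


def append_text(path, text):
    """Append text to a UTF-8 file, ending it with exactly one newline."""
    # Mode "a" creates the file if it does not exist and never truncates it.
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(text.rstrip("\n") + "\n")
```

For example, `append_text("punctuated.txt", punctuated_text)` would add one line per call, where `punctuated_text` stands for whatever string the punctuator returned.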
Hello. I want to punctuate a big chunk of text, e.g. like the one below. How can I do that? Thank you.
Could you write some simple Python code to punctuate the text below?
The text is from my lecture video (https://www.youtube.com/watch?v=_nKwisL8dTs), for which I am trying to generate subtitles. Whisper does very well but fails to punctuate some parts.
okay sorry about this confusion what I did is when I have forgotten to unpause the video is simply I have coded a test button and the test button is using our original static file cmd and gif file cmd and I also fixed something in gif file cmd which is I have removed the loses command because it was giving an error now they are working I am using a wait for exists so let me show you how it works okay okay let me start test so the first process is started it is taking some time because that image is pretty big then it is starting the other one and now they are generated okay so you see original file is 820 kilobytes and let's see how much did we gain okay so 820 minus 572 over 820 you see 30 percent gain we have in this file it is significant and it has zero difference how can I be so sure about that we can be sure about that with a comparison okay so I am going to only make a single line of single pixel of difference here on this web p file and I will save it as a test on my desktop here as a png so I will name it as test to png okay and then I will save my original file as test png on the desktop here then I will use online comparison website let me show you compare image difference okay there are several pages for that so first try with diff checker diff checker is awesome website believe me okay so when I see check the difference there is a single line of difference here on this image so how they achieve this I wonder yeah so here when I hover and when I zoom in okay like this you see there is a single line a single pixel of difference here and no other differences it is exactly same and let's compare with another website okay online diff so first image and the second image so I will make the fuzziness zero and it will show as a red color okay so on this image there is a single pixel difference here which is what I have made and there is no other red dot okay so I can copy this image to zoom in so you see there is no other red dot because they are exactly same 
except the single line single pixel that I have made myself so basically we gain 35 percent 30 percent size in this image and on this gift image we gained from minus to 26.9 over this 35 percent you see with on the gift image we gain 35 percent and let's test if they are working or not so this is our WebP GIF and this is our iponic GIF this is original GIF file and this is WebP file they are looking pretty much same to me we can also use some online websites online GIF to WebP there is one website which I have found working very well this one or yeah let's try this I think it was this one so let's open our debug test so here our GIF upload it then you see there is losing compression mixed compression I unmark them and convert the WebP so this website generated a little bit higher kilobyte because probably it is not using the best compression and that's it okay so we are able to properly convert GIF and static PNG and probably GPX as well we haven't tested GPX so let's also test the GPX for example yeah this wallpaper it's pretty big so it will probably take a lot of time okay let's copy and paste this okay so I will remove this probably we don't even need it right now what is the file name it is this I am not sure if it if it can produce better than GPX because GPX is already losing compression as you know okay let's try it so all processes started at the same time because we are not waiting them and they are running right now as they get completed it will close the window and why it takes so long is that we are using the best possible algorithm and let's see the output okay so yes the WebP file is bigger than the original GPX it is because GPX is already losing and when I save this GPX as a PNG let's see the size okay size of the PNG is this we can of course optimize it a little bit more with PNG out win and I am pretty sure there will be still significant difference between PNG version and WebP version this is a software that I have purchased to optimize my PNG 
files previously but it is not anymore necessary because now we can use WebP format which is much better format okay so this software is single threaded on a single image so it is taking some time it has so many passes okay so the optimized PNG file is 2.53 megabytes and minus 1.52 megabytes over or not this one actually since GPX files are already losing we shouldn't convert them to WebP probably we we cannot we cannot achieve same quality I wonder if there is an losing but no point of converting GPX into WebP let me check that first okay okay same quality for GPX I think we need to have some losing compression probably for GPX compression we need to use some other methodology so let's see which which options we can use okay let's see okay so there is version loses near loses int so we can use near loses for GPX I think okay so which which option should we use I'm not sure I think I will try near loses yeah let's try it with so for that I'm going to have another file it will be for GPX for GPX I'm going to remove loses and change it with near loses with zero and I think I have to remove z9 as well so yeah I have to remove z9 okay let's try this way for GPX okay and this is the file name okay let's test GPX SR or a GPX and let's comment out this is and let's make it like this yeah okay let's see what kind of results we are going to get with GPX command okay so it is done oh wow now we have a better result than original GPX so let's compare two images quality of course I am not expecting them to be same yeah I can see the difference there is already some difference but I am not sure if we have lost some quality or not yeah we have lost some quality as you can see definitely and it is not small as well okay I wonder if it is possible to compress GPX losing quality is this even possible I'm not sure compress GPX okay okay
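One possible approach to a long transcript like the one above, sketched under assumptions: split the text into fixed-size word windows and punctuate each window separately. The `punctuate` argument is a hypothetical callable that takes an unpunctuated string and returns a punctuated one; it would need to wrap whatever call the punctuator package actually exposes. The default of 200 words is an arbitrary choice, not a limit taken from the package.

```python
def chunk_words(text, max_words=200):
    """Yield consecutive pieces of text containing at most max_words words."""
    words = text.split()
    for start in range(0, len(words), max_words):
        yield " ".join(words[start:start + max_words])


def punctuate_long_text(text, punctuate, max_words=200):
    """Punctuate each chunk with the supplied callable and rejoin the results.

    punctuate is a hypothetical wrapper: str in, punctuated str out.
    """
    return " ".join(punctuate(chunk) for chunk in chunk_words(text, max_words))
```

Because the windows are cut at fixed word counts, a boundary can fall in the middle of a sentence, so punctuation near chunk edges may be less reliable than in the middle of a chunk.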