Merged
44 commits
7c309ec
ruby : Bump version to 1.3.6
KitaitiMakoto Jan 21, 2026
49b3f10
Fix code in example
KitaitiMakoto Jan 21, 2026
fd6c912
Add sample code to transcribe from MemoryView
KitaitiMakoto Jan 28, 2026
28b97fe
Define GetVADContext macro
KitaitiMakoto Jan 28, 2026
101df90
Use GetVADContext
KitaitiMakoto Jan 28, 2026
47729a4
Extract parse_full_args function
KitaitiMakoto Jan 28, 2026
8be0ed8
Use parse_full_args in ruby_whisper_full_parallel
KitaitiMakoto Jan 28, 2026
1d63ed4
Free samples after use
KitaitiMakoto Jan 28, 2026
b7068be
Check return value of parse_full_args()
KitaitiMakoto Jan 28, 2026
4fecfd6
Define GetVADParams macro
KitaitiMakoto Jan 28, 2026
875f204
Add VAD::Context#segments_from_samples
KitaitiMakoto Jan 28, 2026
eb03a9d
Add tests for VAD::Context#segments_from_samples
KitaitiMakoto Jan 28, 2026
540d48e
Add signature for VAD::Context#segments_from_samples
KitaitiMakoto Jan 28, 2026
aa2f792
Add sample code for VAD::Context#segments_from_samples
KitaitiMakoto Jan 28, 2026
4e23821
Add test for Whisper::Context#transcribe with Pathname
KitaitiMakoto Jan 28, 2026
4b573c9
Make Whisper::Context#transcribe and Whisper::VAD::Context#detect acc…
KitaitiMakoto Jan 28, 2026
50420fc
Update signature of Whisper::Context#transcribe
KitaitiMakoto Jan 28, 2026
6c8a20c
Fix variable name
KitaitiMakoto Jan 28, 2026
a825c01
Don't free memory view
KitaitiMakoto Jan 28, 2026
d247d5a
Make parse_full_args return struct
KitaitiMakoto Jan 28, 2026
bea2bec
Fallback when failed to get MemoryView
KitaitiMakoto Jan 28, 2026
d63f441
Add num of samples when too long
KitaitiMakoto Jan 28, 2026
f8164f3
Check members of MemoryView
KitaitiMakoto Jan 28, 2026
841734b
Fix a typo
KitaitiMakoto Jan 28, 2026
8250f1b
Remove unnecessary include
KitaitiMakoto Jan 28, 2026
256005d
Fix a typo
KitaitiMakoto Jan 28, 2026
56335dd
Fix a typo
KitaitiMakoto Jan 28, 2026
94b90c8
Care the case of MemoryView doesn't fit spec
KitaitiMakoto Jan 28, 2026
4cb862d
Add TODO comment
KitaitiMakoto Jan 28, 2026
93a49da
Add optimazation option to compiler flags
KitaitiMakoto Jan 28, 2026
98bb5a9
Use ALLOC_N instead of malloc
KitaitiMakoto Jan 28, 2026
afd8deb
Add description to sample code
KitaitiMakoto Jan 28, 2026
f32a33d
Rename and change args: parse_full_args -> parse_samples
KitaitiMakoto Jan 29, 2026
e135c54
Free samples when exception raised
KitaitiMakoto Jan 29, 2026
dc11257
Assign type check result to a variable
KitaitiMakoto Jan 29, 2026
635cadc
Define wrapper function of whisper_full
KitaitiMakoto Jan 29, 2026
adfeb10
Change signature of parse_samples for rb_ensure
KitaitiMakoto Jan 29, 2026
d2ba091
Ensure release MemoryView
KitaitiMakoto Jan 29, 2026
22a4809
Extract fill_samples function
KitaitiMakoto Jan 29, 2026
55e4954
Free samples memory when filling it failed
KitaitiMakoto Jan 29, 2026
58fb46e
Free samples memory when transcription failed
KitaitiMakoto Jan 29, 2026
7938864
Prepare transcription in wrapper funciton
KitaitiMakoto Jan 29, 2026
f63d6f4
Change function name
KitaitiMakoto Jan 29, 2026
ac13945
Simplify function boundary
KitaitiMakoto Jan 29, 2026
35 changes: 33 additions & 2 deletions bindings/ruby/README.md
@@ -323,7 +323,24 @@ whisper
end
```

-The second argument `samples` may be an array, an object with `length` and `each` method, or a MemoryView. If you can prepare audio data as C array and export it as a MemoryView, whispercpp accepts and works with it with zero copy.
+The second argument `samples` may be an array, an object that responds to `length` and `each`, or a MemoryView.
+
+If you can prepare the audio data as a C array and export it as a MemoryView, whispercpp accepts it and works on it with zero copy.

```ruby
require "torchaudio"
require "arrow-numo-narray"
require "whisper"

waveform, sample_rate = TorchAudio.load("test/fixtures/jfk.wav")
# Convert Torch::Tensor to Arrow::Array via Numo::NArray
samples = waveform.squeeze.numo.to_arrow.to_arrow_array

whisper = Whisper::Context.new("base")
whisper
  # Arrow::Array exports MemoryView
  .full(Whisper::Params.new, samples)
```
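
As a minimal sketch of the second accepted form (an object with `length` and `each`), the hypothetical `SineSamples` class below produces its samples lazily. Neither the class nor the `collect_samples` helper is part of whispercpp; the helper is only an assumption about how a consumer could read such an object, not the extension's actual code.

```ruby
# Hypothetical sample source: not part of whispercpp.
# It exposes only `length` and `each`, the two methods the README names.
class SineSamples
  include Enumerable

  attr_reader :length

  def initialize(frequency:, seconds:, sample_rate: 16_000)
    @frequency = frequency
    @sample_rate = sample_rate
    @length = (seconds * sample_rate).round
  end

  # Yields one Float in [-1.0, 1.0] per sample.
  def each
    return enum_for(:each) { @length } unless block_given?

    @length.times do |i|
      yield Math.sin(2 * Math::PI * @frequency * i / @sample_rate)
    end
  end
end

# Illustrative stand-in for a consumer (an assumption, not the extension's code):
# it reads `length` first, then fills a preallocated buffer through `each`.
def collect_samples(source)
  buffer = Array.new(source.length, 0.0)
  source.each.with_index { |sample, i| buffer[i] = sample }
  buffer
end

samples = SineSamples.new(frequency: 440, seconds: 0.01)
buffer = collect_samples(samples)
puts buffer.length
```

An object shaped like this could presumably be passed as the `samples` argument of `full`, though every sample then has to be read out one by one, which is the copy the MemoryView path avoids.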

Using VAD separately from ASR
-----------------------------
@@ -334,13 +351,27 @@ VAD feature itself is useful. You can use it separately from ASR:
vad = Whisper::VAD::Context.new("silero-v6.2.0")
vad
  .detect("path/to/audio.wav", Whisper::VAD::Params.new)
-  .each_with_index do |segment, index|
+  .each.with_index do |segment, index|
    segment => {start_time: st, end_time: ed} # `Segment` responds to `#deconstruct_keys`

    puts "[%{nth}: %{st} --> %{ed}]" % {nth: index + 1, st:, ed:}
  end
```
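
The `segment => {...}` line above relies on Ruby's hash pattern matching, which calls `#deconstruct_keys` on the matched object. The sketch below shows that mechanism in isolation. `FakeSegment` and its time values are hypothetical stand-ins, since the real segment class is defined in the C extension.

```ruby
# Hypothetical stand-in for a VAD segment; the real class lives in the C extension.
class FakeSegment
  def initialize(start_time, end_time)
    @start_time = start_time
    @end_time = end_time
  end

  # Ruby calls this when the object is matched against a hash pattern.
  # `keys` lists the requested keys, or is nil when all keys are wanted.
  def deconstruct_keys(keys)
    {start_time: @start_time, end_time: @end_time}
  end
end

# Made-up values, for illustration only.
segments = [FakeSegment.new(0.3, 2.1), FakeSegment.new(3.35, 7.9)]

lines = segments.each.with_index.map do |segment, index|
  # Rightward pattern match: binds `st` and `ed` from the returned hash.
  segment => {start_time: st, end_time: ed}
  "[%{nth}: %{st} --> %{ed}]" % {nth: index + 1, st:, ed:}
end

puts lines
```

The shorthand `st:, ed:` hash syntax needs Ruby 3.1 or later, as does the README example itself.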

You may also use the low-level API `Whisper::VAD::Context#segments_from_samples`, which accepts samples in the same way as `Whisper::Context#full`:

```ruby
# Ruby Array
reader = WaveFile::Reader.new("path/to/audio.wav", WaveFile::Format.new(:mono, :float, 16000))
samples = reader.enum_for(:each_buffer).map(&:samples).flatten

# Or, object which exports MemoryView
waveform, sample_rate = TorchAudio.load("test/fixtures/jfk.wav")
samples = waveform.squeeze.numo.to_arrow.to_arrow_array

segments = vad.segments_from_samples(Whisper::VAD::Params.new, samples)
```
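
Both forms above pass floating-point samples. If the audio is available only as 16-bit signed PCM integers, a conversion along the following lines yields a plain Ruby Array of floats. The `pcm16_to_float` helper is hypothetical and not part of whispercpp, and the choice of mono float input is an assumption taken from the `WaveFile::Format.new(:mono, :float, 16000)` call in the example.

```ruby
# Hypothetical helper, not part of whispercpp: scales 16-bit signed PCM
# integers (-32768..32767) to floats in [-1.0, 1.0).
def pcm16_to_float(pcm)
  pcm.map { |sample| sample / 32768.0 }
end

# Raw little-endian 16-bit mono data, as it might appear in a file body.
raw = [0, 16384, -32768, 32767].pack("s<*")
pcm = raw.unpack("s<*")
samples = pcm16_to_float(pcm)

p samples
```

This sketch does no resampling, so the input is assumed to already be at the sample rate the model expects.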

Development
-----------

1 change: 1 addition & 0 deletions bindings/ruby/ext/extconf.rb
@@ -7,6 +7,7 @@
have_library("gomp") rescue nil
libs = Dependencies.new(cmake, options).to_s

$CFLAGS << " -O3 -march=native"
$INCFLAGS << " -Isources/include -Isources/ggml/include -Isources/examples"
$LOCAL_LIBS << " #{libs}"
$cleanfiles << " build #{libs}"
2 changes: 0 additions & 2 deletions bindings/ruby/ext/ruby_whisper.c
@@ -1,5 +1,3 @@
-#include <ruby.h>
-#include <ruby/memory_view.h>
#include "ruby_whisper.h"

VALUE mWhisper;
20 changes: 20 additions & 0 deletions bindings/ruby/ext/ruby_whisper.h
@@ -1,6 +1,8 @@
#ifndef RUBY_WHISPER_H
#define RUBY_WHISPER_H

#include <ruby.h>
#include <ruby/memory_view.h>
#include "whisper.h"

typedef struct {
@@ -55,6 +57,13 @@ typedef struct {
  struct whisper_vad_context *context;
} ruby_whisper_vad_context;

typedef struct parsed_samples_t {
  float *samples;
  int n_samples;
  rb_memory_view_t memview;
  bool memview_exported;
} parsed_samples_t;

#define GetContext(obj, rw) do { \
  TypedData_Get_Struct((obj), ruby_whisper, &ruby_whisper_type, (rw)); \
  if ((rw)->context == NULL) { \
@@ -69,6 +78,17 @@ typedef struct {
  } \
} while (0)

#define GetVADContext(obj, rwvc) do { \
  TypedData_Get_Struct((obj), ruby_whisper_vad_context, &ruby_whisper_vad_context_type, (rwvc)); \
  if ((rwvc)->context == NULL) { \
    rb_raise(rb_eRuntimeError, "Not initialized"); \
  } \
} while (0)

#define GetVADParams(obj, rwvp) do { \
  TypedData_Get_Struct((obj), ruby_whisper_vad_params, &ruby_whisper_vad_params_type, (rwvp)); \
} while (0)

#define GetVADSegments(obj, rwvss) do { \
  TypedData_Get_Struct((obj), ruby_whisper_vad_segments, &ruby_whisper_vad_segments_type, (rwvss)); \
  if ((rwvss)->segments == NULL) { \