@@ -11,93 +11,14 @@ options:
-m FNAME, --model FNAME [models/ggml-base.en.bin] model path
-di, --diarize [false ] stereo audio diarization
```
- ## service
+ ## whisper_http_server_base_httplib

- Simple http service. WAV Files are passed to the inference model via http requests.
+ Simple HTTP service. WAV, MP4, and M4A files are passed to the inference model via HTTP requests.

```
- ./cmake-build-debug/service -m models/ggml-base.en.bin
- ```
-
- ``` shell
- whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en.bin'
- whisper_model_load: loading model
- whisper_model_load: n_vocab = 51864
- whisper_model_load: n_audio_ctx = 1500
- whisper_model_load: n_audio_state = 512
- whisper_model_load: n_audio_head = 8
- whisper_model_load: n_audio_layer = 6
- whisper_model_load: n_text_ctx = 448
- whisper_model_load: n_text_state = 512
- whisper_model_load: n_text_head = 8
- whisper_model_load: n_text_layer = 6
- whisper_model_load: n_mels = 80
- whisper_model_load: ftype = 1
- whisper_model_load: qntvr = 0
- whisper_model_load: type = 2 (base)
- whisper_model_load: adding 1607 extra tokens
- whisper_model_load: n_langs = 99
- whisper_backend_init: using Metal backend
- ggml_metal_init: allocating
- ggml_metal_init: found device: Apple M2
- ggml_metal_init: picking default device: Apple M2
- ggml_metal_init: default.metallib not found, loading from source
- ggml_metal_init: error: could not use bundle path to find ggml-metal.metal, falling back to trying cwd
- ggml_metal_init: loading 'ggml-metal.metal'
- ggml_metal_init: GPU name: Apple M2
- ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
- ggml_metal_init: hasUnifiedMemory = true
- ggml_metal_init: recommendedMaxWorkingSetSize = 11453.25 MB
- ggml_metal_init: maxTransferRate = built-in GPU
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 156.68 MB, ( 157.20 / 11453.25)
- whisper_model_load: Metal buffer size = 156.67 MB
- whisper_model_load: model size = 156.58 MB
- whisper_backend_init: using Metal backend
- ggml_metal_init: allocating
- ggml_metal_init: found device: Apple M2
- ggml_metal_init: picking default device: Apple M2
- ggml_metal_init: default.metallib not found, loading from source
- ggml_metal_init: error: could not use bundle path to find ggml-metal.metal, falling back to trying cwd
- ggml_metal_init: loading 'ggml-metal.metal'
- ggml_metal_init: GPU name: Apple M2
- ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
- ggml_metal_init: hasUnifiedMemory = true
- ggml_metal_init: recommendedMaxWorkingSetSize = 11453.25 MB
- ggml_metal_init: maxTransferRate = built-in GPU
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 16.52 MB, ( 173.72 / 11453.25)
- whisper_init_state: kv self size = 16.52 MB
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 18.43 MB, ( 192.15 / 11453.25)
- whisper_init_state: kv cross size = 18.43 MB
- whisper_init_state: loading Core ML model from 'models/ggml-base.en-encoder.mlmodelc'
- whisper_init_state: first run on a device may take a while ...
- whisper_init_state: Core ML model loaded
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 0.02 MB, ( 196.51 / 11453.25)
- whisper_init_state: compute buffer (conv) = 5.67 MB
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 0.02 MB, ( 196.53 / 11453.25)
- whisper_init_state: compute buffer (cross) = 4.71 MB
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 0.02 MB, ( 196.54 / 11453.25)
- whisper_init_state: compute buffer (decode) = 96.41 MB
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 4.05 MB, ( 200.59 / 11453.25)
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 3.08 MB, ( 203.67 / 11453.25)
- ggml_metal_add_buffer: allocated 'backend' buffer, size = 94.78 MB, ( 298.45 / 11453.25)
-
- whisper service listening at http://0.0.0.0:8080
-
- Received request: jfk.wav
- Successfully loaded jfk.wav
-
- system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 1 | OPENVINO = 0 |
-
- handleInference: processing 'jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
-
- Running whisper.cpp inference on jfk.wav
-
- [00:00:00.000 --> 00:00:11.000] And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
- ` ` `
- ` ` `
- ./service -h
+ ./whisper_http_server_base_httplib -h

- usage: ./bin/service [options]
+ usage: ./bin/whisper_http_server_base_httplib [options]

options:
-h, --help [default] show this help message and exit
@@ -131,7 +52,12 @@ options:
--host HOST, [127.0.0.1] Hostname/ip-address for the service
--port PORT, [8080 ] Port number for the service
```
-
+ ## start whisper_http_server_base_httplib
+ ```
+ ./cmake-build-debug/whisper_http_server_base_httplib -m models/ggml-base.en.bin
+ ```
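+
+ To bind the service to a different interface or port, pass the `--host` and `--port` options listed above. A minimal sketch, assuming the same build layout as the command above:
+ ```
+ # bind to all interfaces on port 8080
+ ./cmake-build-debug/whisper_http_server_base_httplib -m models/ggml-base.en.bin --host 0.0.0.0 --port 8080
+ ```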
+ Test the server: see the request doc in [doc](doc)
## request examples

**/inference**
@@ -140,11 +66,21 @@ curl --location --request POST http://127.0.0.1:8080/inference \
--form file=@"./samples/jfk.wav" \
--form temperature="0.2" \
- --form response-format="json"
+ --form response-format="json" \
+ --form audio_format="wav"
```
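+
+ Since the service also accepts MP4 and M4A input, the same request shape should work for those containers. A sketch, assuming `audio_format` takes the container name and that `./samples/sample.mp4` is a stand-in for a real file:
+ ```
+ # sample.mp4 is a hypothetical input file; audio_format names its container
+ curl --location --request POST http://127.0.0.1:8080/inference \
+ --form file=@"./samples/sample.mp4" \
+ --form temperature="0.2" \
+ --form response-format="json" \
+ --form audio_format="mp4"
+ ```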

**/load**
```
curl 127.0.0.1:8080/load \
-H "Content-Type: multipart/form-data" \
-F model="<path-to-model-file>"
- ` ` `
+ ```
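+
+ For example, to point the running server at the base English model used above (an illustrative path; any ggml model file should work):
+ ```
+ # illustrative model path
+ curl 127.0.0.1:8080/load \
+ -H "Content-Type: multipart/form-data" \
+ -F model="models/ggml-base.en.bin"
+ ```
+ Subsequent /inference requests should then be served by the newly loaded model.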
+
+ ## whisper_server_base_on_uwebsockets
+ WebSocket server.
+ Start the server:
+ ```
+ ./cmake-build-debug/whisper_server_base_on_uwebsockets -m models/ggml-base.en.bin
+ ```
+ Test the server: see the Python [client](client)