feat(vad): implement native voice activity detection for Linux #846
Summary
Implements native voice activity detection for Linux using CPAL audio capture and the Silero VAD model, providing better platform integration and reliability compared to browser-based audio processing.
Motivation
While the existing web-based VAD works well, browser audio APIs can have limitations on Linux systems. This native implementation provides:
Implementation Details
Architecture
NativeVadService alongside existing VadService with identical interface
(recording.vad.useNative)
src-tauri/src/recorder/vad.rs
Key Features
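The dual-service arrangement listed under Architecture (NativeVadService alongside VadService with an identical interface, selected by recording.vad.useNative) could look roughly like the sketch below. This is only an illustration: the `VadLike` interface, its method names, the two stub classes, and `selectVad` are hypothetical and are not taken from this PR's code. Only the setting key comes from the description.

```typescript
// Hypothetical shared interface; the real VadService API in this PR may differ.
interface VadLike {
  readonly kind: "web" | "native";
  start(onSpeechEnd: (audio: Float32Array) => void): Promise<void>;
  stop(): Promise<void>;
  isListening(): boolean;
}

// Stand-in for the existing browser-based service (assumed shape).
class WebVadStub implements VadLike {
  readonly kind = "web" as const;
  private listening = false;

  async start(_onSpeechEnd: (audio: Float32Array) => void): Promise<void> {
    // A real implementation would open the browser microphone stream here.
    this.listening = true;
  }

  async stop(): Promise<void> {
    this.listening = false;
  }

  isListening(): boolean {
    return this.listening;
  }
}

// Stand-in for the native service that would call into the Rust backend (assumed shape).
class NativeVadStub implements VadLike {
  readonly kind = "native" as const;
  private listening = false;

  async start(_onSpeechEnd: (audio: Float32Array) => void): Promise<void> {
    // A real implementation would ask the native recorder to begin capture here.
    this.listening = true;
  }

  async stop(): Promise<void> {
    this.listening = false;
  }

  isListening(): boolean {
    return this.listening;
  }
}

// Only the setting key below comes from the PR description.
type VadSettings = { "recording.vad.useNative": boolean };

// Pick an implementation from the setting; callers only ever see VadLike.
function selectVad(settings: VadSettings): VadLike {
  return settings["recording.vad.useNative"]
    ? new NativeVadStub()
    : new WebVadStub();
}
```

Because both classes satisfy the same interface, calling code does not need to branch on the platform or the setting after `selectVad` returns, which is one way a purely additive change can leave existing call sites untouched.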
Technical Implementation
User Experience
Settings Integration
Testing
Breaking Changes
None. This is purely additive:
Dependencies
Added
voice_activity_detector = "0.2.1" to provide Silero VAD model integration.
Files Changed
src-tauri/src/recorder/vad.rs - New native VAD implementation
src/lib/services/native-vad.ts - TypeScript service wrapper
src/lib/settings/settings.ts - Added VAD configuration options
src/routes/(config)/settings/recording/+page.svelte - UI controls and descriptions
Future Considerations
This implementation maintains Epicenter's local-first philosophy while providing Linux users with improved audio processing reliability through native platform integration.