Sal is a cross-platform conversational AI application featuring real-time speech recognition, text-to-speech synthesis, and large language model (LLM) chat capabilities. It leverages WhisperKit for speech-to-text, eSpeak-NG for text-to-speech, and Llama.cpp for LLM inference. The app is built with SwiftUI and supports both macOS and iOS.
*Sal in action*
- Real-Time Speech Recognition: Uses WhisperKit for accurate, on-device transcription.
- Text-to-Speech Synthesis: Integrates eSpeak-NG for fast, multi-language voice synthesis.
- Conversational LLM: Employs Llama.cpp for local, private AI chat.
- Chat History: Stores and manages previous conversations.
- Customizable Models: Download, select, and manage Whisper and Llama models.
- Modern UI: Neumorphic SwiftUI interface with animated controls and visual feedback.
- Cross-Platform: Runs on both macOS and iOS.
```
Sal/
    SalApp.swift          # App entry point
    ContentView.swift     # Main UI
    backend/              # Core logic for Whisper, Llama, eSpeak, chat, and managers
    top_card/             # Main controls, TV view, settings, and Whisper UI
    bottom_card/          # Message board UI
    neumorphic/           # Custom SwiftUI styles and effects
    Resources/            # App resources
    Assets.xcassets/      # App icons and color assets
    ...
ESpeakExtension/
    SynthAudioUnit.swift  # eSpeak-NG AudioUnit implementation
    ...
SalTests/
SalUITests/
```
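The `neumorphic/` folder holds the custom SwiftUI styles referenced above. Purely as an illustration of the technique (the type name and values below are invented for this README, not taken from the project), a neumorphic control typically pairs a dark shadow on one side with a light shadow on the other:

```swift
import SwiftUI

// Illustrative sketch of a neumorphic button style; names and values are
// hypothetical, not the actual code under neumorphic/.
struct SoftButtonStyle: ButtonStyle {
    func makeBody(configuration: Configuration) -> some View {
        configuration.label
            .padding(20)
            .background(
                RoundedRectangle(cornerRadius: 16)
                    .fill(Color(white: 0.92))
                    // Dark shadow toward the bottom-right, light shadow toward
                    // the top-left, which produces the soft "embossed" look.
                    .shadow(color: .black.opacity(configuration.isPressed ? 0.1 : 0.2),
                            radius: 6, x: 6, y: 6)
                    .shadow(color: .white.opacity(0.9),
                            radius: 6, x: -6, y: -6)
            )
            .scaleEffect(configuration.isPressed ? 0.98 : 1.0)
            .animation(.easeOut(duration: 0.15), value: configuration.isPressed)
    }
}
```

Applying such a style is just `Button("Record") { ... }.buttonStyle(SoftButtonStyle())`.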
- Xcode 15+
- Swift 5.9+
- macOS 14.5+ or iOS 17.5+
- llama.cpp (via Swift Package)
- WhisperKit (via Swift Package)
- espeak-ng-spm (via Swift Package)
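The packages are added through Xcode, so there is no `Package.swift` to edit, but for reference the equivalent manifest declarations would look roughly like the sketch below. The URLs, version requirements, and product names are assumptions inferred from the package names above; check Xcode's resolved packages for the real values.

```swift
// swift-tools-version:5.9
// Sketch only: how the three dependencies could be declared in a SwiftPM
// manifest. URLs, versions, and product names are assumptions, not values
// taken from the project.
import PackageDescription

let package = Package(
    name: "Sal",
    platforms: [.macOS("14.5"), .iOS("17.5")],
    dependencies: [
        .package(url: "https://github.com/argmaxinc/WhisperKit.git", from: "0.9.0"),      // assumed URL and version
        .package(url: "https://github.com/ggerganov/llama.cpp.git", branch: "master"),    // assumed URL and branch
        .package(url: "https://github.com/espeak-ng/espeak-ng-spm.git", branch: "master") // placeholder URL
    ],
    targets: [
        .target(
            name: "Sal",
            dependencies: [
                .product(name: "WhisperKit", package: "WhisperKit"),
                .product(name: "llama", package: "llama.cpp"),        // assumed product name
                .product(name: "ESpeakNG", package: "espeak-ng-spm")  // assumed product name
            ]
        )
    ]
)
```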
1. Clone the repository:

   ```sh
   git clone https://github.com/ThyOwen/sal.git
   cd sal
   ```

2. Open in Xcode:

   - Open `Sal.xcodeproj` in Xcode.

3. Resolve Swift Packages:

   - Xcode should automatically fetch dependencies. If not, go to File > Packages > Resolve Package Versions.

4. Build and Run:

   - Select your target platform (macOS or iOS) and click Run.
*Sal in Light and Dark Modes*
- Start the app: On first launch, load or download the desired Whisper and Llama models.
- Microphone Controls: Use the main controls to start/stop recording and interact with the AI.
- Chat History: Switch between active chats and history using the controls.
- Settings: Adjust model, language, and sensitivity in the settings panel.
- `ContentView`: Main application view and layout.
- `ChatViewModel`: Central state and logic manager.
- `Whisper`: Handles speech-to-text.
- `ESpeak`: Handles text-to-speech.
- `Llama`: Handles LLM inference.
- `MessageBoardManager`: Manages status and temporary messages.
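The flow between these pieces is: Whisper turns microphone audio into text, ChatViewModel passes the text to Llama, and the reply is both appended to the chat history and spoken through ESpeak. The sketch below shows that wiring in simplified form; the protocols and method signatures are hypothetical stand-ins, not the actual APIs in `backend/`.

```swift
import SwiftUI

// Simplified sketch of the pipeline described above. The protocols and
// method names are hypothetical stand-ins for the real types in backend/.
protocol SpeechRecognizer  { func transcribe() async throws -> String }             // Whisper
protocol LanguageModel     { func reply(to prompt: String) async throws -> String } // Llama
protocol SpeechSynthesizer { func speak(_ text: String) async throws }              // ESpeak

struct ChatMessage: Identifiable {
    let id = UUID()
    let role: String   // "user" or "assistant"
    let text: String
}

@MainActor
final class ChatViewModelSketch: ObservableObject {
    @Published var messages: [ChatMessage] = []

    private let whisper: SpeechRecognizer
    private let llama: LanguageModel
    private let espeak: SpeechSynthesizer

    init(whisper: SpeechRecognizer, llama: LanguageModel, espeak: SpeechSynthesizer) {
        self.whisper = whisper
        self.llama = llama
        self.espeak = espeak
    }

    /// One turn of conversation: listen, think, speak.
    func handleUtterance() async {
        do {
            let userText = try await whisper.transcribe()
            messages.append(ChatMessage(role: "user", text: userText))

            let replyText = try await llama.reply(to: userText)
            messages.append(ChatMessage(role: "assistant", text: replyText))

            try await espeak.speak(replyText)
        } catch {
            // In the app, errors would surface via MessageBoardManager;
            // here we just print them.
            print("Pipeline error: \(error)")
        }
    }
}
```

ContentView would own such a view model as an `@StateObject` and call `handleUtterance()` from the microphone controls.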
This project includes code under the GPL (eSpeak-NG) and other open-source licenses. See individual package licenses for details.

