Sal is a cross-platform conversational AI application featuring real-time speech recognition, text-to-speech synthesis, and large language model (LLM) chat capabilities. It leverages WhisperKit for speech-to-text, eSpeak-NG for text-to-speech, and Llama.cpp for LLM inference. The app is built with SwiftUI and supports both macOS and iOS.
*Sal in action*
- Real-Time Speech Recognition: Uses WhisperKit for accurate, on-device transcription.
- Text-to-Speech Synthesis: Integrates eSpeak-NG for fast, multi-language voice synthesis.
- Conversational LLM: Employs Llama.cpp for local, private AI chat.
- Chat History: Stores and manages previous conversations.
- Customizable Models: Download, select, and manage Whisper and Llama models.
- Modern UI: Neumorphic SwiftUI interface with animated controls and visual feedback.
- Cross-Platform: Runs on both macOS and iOS.
```
Sal/
    SalApp.swift          # App entry point
    ContentView.swift     # Main UI
    backend/              # Core logic for Whisper, Llama, eSpeak, chat, and managers
    top_card/             # Main controls, TV view, settings, and Whisper UI
    bottom_card/          # Message board UI
    neumorphic/           # Custom SwiftUI styles and effects
    Resources/            # App resources
    Assets.xcassets/      # App icons and color assets
    ...
ESpeakExtension/
    SynthAudioUnit.swift  # eSpeak-NG AudioUnit implementation
    ...
SalTests/
SalUITests/
```
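The `neumorphic/` folder holds the custom SwiftUI styles referenced above. Purely as an illustration of the technique (the type name and values below are invented for this README, not taken from the project), a neumorphic control typically pairs a dark shadow on one side with a light shadow on the other:

```swift
import SwiftUI

// Illustrative sketch of a neumorphic button style; names and values are
// hypothetical, not the actual code under neumorphic/.
struct SoftButtonStyle: ButtonStyle {
    func makeBody(configuration: Configuration) -> some View {
        configuration.label
            .padding(20)
            .background(
                RoundedRectangle(cornerRadius: 16)
                    .fill(Color(white: 0.92))
                    // Dark shadow toward the bottom-right, light shadow toward
                    // the top-left, which produces the soft "embossed" look.
                    .shadow(color: .black.opacity(configuration.isPressed ? 0.1 : 0.2),
                            radius: 6, x: 6, y: 6)
                    .shadow(color: .white.opacity(0.9),
                            radius: 6, x: -6, y: -6)
            )
            .scaleEffect(configuration.isPressed ? 0.98 : 1.0)
            .animation(.easeOut(duration: 0.15), value: configuration.isPressed)
    }
}
```

Applying such a style is just `Button("Record") { ... }.buttonStyle(SoftButtonStyle())`.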
- Xcode 15+
- Swift 5.9+
- macOS 14.5+ or iOS 17.5+
- llama.cpp (via Swift Package)
- WhisperKit (via Swift Package)
- espeak-ng-spm (via Swift Package)
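The packages are added through Xcode, so there is no `Package.swift` to edit, but for reference the equivalent manifest declarations would look roughly like the sketch below. The URLs, version requirements, and product names are assumptions inferred from the package names above; check Xcode's resolved packages for the real values.

```swift
// swift-tools-version:5.9
// Sketch only: how the three dependencies could be declared in a SwiftPM
// manifest. URLs, versions, and product names are assumptions, not values
// taken from the project.
import PackageDescription

let package = Package(
    name: "Sal",
    platforms: [.macOS("14.5"), .iOS("17.5")],
    dependencies: [
        .package(url: "https://github.com/argmaxinc/WhisperKit.git", from: "0.9.0"),      // assumed URL and version
        .package(url: "https://github.com/ggerganov/llama.cpp.git", branch: "master"),    // assumed URL and branch
        .package(url: "https://github.com/espeak-ng/espeak-ng-spm.git", branch: "master") // placeholder URL
    ],
    targets: [
        .target(
            name: "Sal",
            dependencies: [
                .product(name: "WhisperKit", package: "WhisperKit"),
                .product(name: "llama", package: "llama.cpp"),        // assumed product name
                .product(name: "ESpeakNG", package: "espeak-ng-spm")  // assumed product name
            ]
        )
    ]
)
```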
1. Clone the repository:

   ```sh
   git clone https://github.com/ThyOwen/sal.git
   cd sal
   ```

2. Open in Xcode:

   - Open `Sal.xcodeproj` in Xcode.

3. Resolve Swift Packages:

   - Xcode should automatically fetch dependencies. If not, go to File > Packages > Resolve Package Versions.

4. Build and Run:

   - Select your target platform (macOS or iOS) and click Run.
*Sal in Light and Dark Modes*
- Start the app: On first launch, load or download the desired Whisper and Llama models.
- Microphone Controls: Use the main controls to start/stop recording and interact with the AI.
- Chat History: Switch between active chats and history using the controls.
- Settings: Adjust model, language, and sensitivity in the settings panel.
- `ContentView`: Main application view and layout.
- `ChatViewModel`: Central state and logic manager.
- `Whisper`: Handles speech-to-text.
- `ESpeak`: Handles text-to-speech.
- `Llama`: Handles LLM inference.
- `MessageBoardManager`: Manages status and temporary messages.
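The flow between these pieces is: Whisper turns microphone audio into text, ChatViewModel passes the text to Llama, and the reply is both appended to the chat history and spoken through ESpeak. The sketch below shows that wiring in simplified form; the protocols and method signatures are hypothetical stand-ins, not the actual APIs in `backend/`.

```swift
import SwiftUI

// Simplified sketch of the pipeline described above. The protocols and
// method names are hypothetical stand-ins for the real types in backend/.
protocol SpeechRecognizer  { func transcribe() async throws -> String }             // Whisper
protocol LanguageModel     { func reply(to prompt: String) async throws -> String } // Llama
protocol SpeechSynthesizer { func speak(_ text: String) async throws }              // ESpeak

struct ChatMessage: Identifiable {
    let id = UUID()
    let role: String   // "user" or "assistant"
    let text: String
}

@MainActor
final class ChatViewModelSketch: ObservableObject {
    @Published var messages: [ChatMessage] = []

    private let whisper: SpeechRecognizer
    private let llama: LanguageModel
    private let espeak: SpeechSynthesizer

    init(whisper: SpeechRecognizer, llama: LanguageModel, espeak: SpeechSynthesizer) {
        self.whisper = whisper
        self.llama = llama
        self.espeak = espeak
    }

    /// One turn of conversation: listen, think, speak.
    func handleUtterance() async {
        do {
            let userText = try await whisper.transcribe()
            messages.append(ChatMessage(role: "user", text: userText))

            let replyText = try await llama.reply(to: userText)
            messages.append(ChatMessage(role: "assistant", text: replyText))

            try await espeak.speak(replyText)
        } catch {
            // In the app, errors would surface via MessageBoardManager;
            // here we just print them.
            print("Pipeline error: \(error)")
        }
    }
}
```

ContentView would own such a view model as an `@StateObject` and call `handleUtterance()` from the microphone controls.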
This project includes code under the GPL (eSpeak-NG) and other open-source licenses. See individual package licenses for details.

