SkyEye: AI Powered GCI Bot for DCS

SkyEye is a Ground Controlled Intercept (GCI) bot for the flight simulator Digital Combat Simulator (DCS). It is an advanced replacement for the in-game E-2, E-3 and A-50 AI aircraft.

SkyEye is a substantial improvement over the DCS AWACS:

  1. SkyEye offers modern voice recognition using a current-generation AI model. Keyboard input is also supported.
  2. SkyEye has natural sounding voices, instead of robotically clipping together samples. On Windows and Linux, SkyEye uses a neural network to speak in a human-like voice. On macOS, SkyEye speaks using Siri's voice.
  3. SkyEye adheres more closely to real-world brevity and procedures instead of the incorrect brevity used by the in-game AWACS.
  4. SkyEye supports a larger number of commands, including PICTURE, BOGEY DOPE, DECLARE, SNAPLOCK, SPIKED, and ALPHA CHECK.
  5. SkyEye intelligently monitors the battlespace, providing automatic THREAT, MERGED and FADED callouts to improve situational awareness.

SkyEye uses Speech-To-Text and Text-To-Speech technology which can run locally on the same computer as SkyEye. No cloud APIs are required, although cloud APIs are optionally supported. It works with any DCS mission, singleplayer or multiplayer. No special scripting or mission editor setup is required. You can run it for less than a nickel per hour on a cloud server, or run it on a computer in your home running Windows, Linux or macOS.

SkyEye is production-ready software. It is used by a few public servers and many private squadrons. Based on download statistics, I estimate that over 100 communities are using SkyEye.

SkyEye is free software. It is free as in beer; you can download and run it for free. It is also free as in freedom; the source code is available for you to study and modify to fit your needs.

Getting Started

Demonstration

See it in action! Jump to 7:24 in this demo video by DCS ANZUS.

FAQ

Where can I try SkyEye?

You can try SkyEye on the Flashpoint Levant server. No installation is required, just connect to their DCS and SRS server and tune to one of these radio frequencies:

  • 136.0 AM
  • 255.0 AM
  • 40.0 FM

See https://limakilo.net for server details.

Where do I download SkyEye?

On Windows and Linux, SkyEye can be downloaded from GitHub Releases.

On Linux, SkyEye is also available as a container: ghcr.io/dharmab/skyeye:latest. Note this container won't work on Windows or macOS.

On macOS, SkyEye can be installed using Homebrew:

brew tap dharmab/skyeye
brew install dharmab/skyeye/skyeye

See the admin guide for detailed instructions on installing, configuring and running SkyEye.

What do I need to run SkyEye?

There are a few different ways to run SkyEye. In order from best to least recommended:

  1. On an Apple Silicon Mac networked to your DCS server, using local speech recognition. This offers the fastest speech recognition and the highest quality AI voice.
  2. On your DCS server, using the OpenAI API for speech recognition. This offers fast speech recognition and good quality AI voices, but requires a credit card accepted by OpenAI to purchase API credits from OpenAI. At current pricing, $1 of OpenAI credit pays to recognize more than 1000 transmissions over SRS.
  3. On a separate Windows or Linux computer networked to your DCS server, using local speech recognition. This offers good-enough speech recognition performance and good quality AI voices without requiring a credit card. This also works with rented cloud servers, some of which accept payment methods that OpenAI does not.

Running SkyEye on the same computer as DCS, using local speech recognition, is not recommended and no support can be provided for that configuration. Use a separate computer or OpenAI's API instead.

What kind of hardware does it require?

Generally, local speech recognition requires one of:

  • Any Apple Silicon Mac, such as a Mac Mini or MacBook Air/Pro.
  • A Windows or Linux computer with a fast quad-core CPU from the last 2-3 CPU generations.

Cloud speech recognition requirements are quite modest.

See the Hardware section of the admin guide for more details, including a table of benchmarks.

Can I train the speech recognition on my voice/accent?

Since the software runs 100% locally, the speech recognition model is a local file. Server operators can provide a trained model as an alternative to the off-the-shelf model. See this blog post for an example.

I don't plan to provide a mechanism for players to submit their voice recordings to the main repository due to data privacy concerns.

Does this use Line-Of-Sight restrictions?

Not at this time. I am working on a solution for this, but it will take me a while.

If this is a critical feature for you, consider using MOOSE's AWACS module instead. It supports Line-Of-Sight and datalink simulation, at the tradeoff of requiring some special setup in the Mission Editor.

OverlordBot also optionally supports this feature, although less than 1% of users used it.

Will this work with DCS' built-in VoIP?

As of this writing, DCS' built-in VoIP does not support external clients. SkyEye therefore requires SRS to function.

Could this use a Large Language Model? (llama, mistral, etc.)

SkyEye uses an embedded LLM for speech-to-text, but I deliberately chose not to use an LLM for SkyEye's language parsing or decision-making logic.

Within the domain of air combat communication, these problems are less linguistic and more mathematical in nature. Air combat communication uses a limited, highly specific vocabulary and a low-context grammar that can be parsed quickly with traditional programming methods. The workflow for the tactical controller is a straightforward decision tree mostly based on tables of aircraft data, some middle school geometry and a few statistical methods. These workflows can be implemented in a few hundred lines of code and run in a few milliseconds. An LLM would have worse performance, no guarantee of consistency and much larger CPU and memory requirements, and would introduce a large surface area of ML-specific issues such as privacy of training data sets, debugging hallucinations, and a much more difficult testing and validation process.

While working on this software I spoke to a number of people who thought it would be as easy as feeding a bunch of PDFs to an LLM and it would magically learn how to be a competent tactical controller. This could not be further from the truth!
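To make the "limited vocabulary plus middle school geometry" point concrete, here is a minimal Go sketch of that style of workflow. It is not SkyEye's actual code. The `Contact` type, the `knownRequests` table, the function names, the sample callsigns and the flat grid measured in nautical miles are all assumptions made for illustration; a real implementation would work with geographic coordinates and magnetic bearings.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Contact is a hypothetical aircraft position on a flat grid.
type Contact struct {
	Name    string
	East    float64 // nautical miles east of an arbitrary origin
	North   float64 // nautical miles north of an arbitrary origin
	Hostile bool
}

// knownRequests is an illustrative subset of request phrases.
var knownRequests = []string{"bogey dope", "alpha check", "picture"}

// ParseRequest lowercases the text, collapses whitespace, and returns the
// first known request phrase it contains.
func ParseRequest(text string) (string, bool) {
	normalized := strings.Join(strings.Fields(strings.ToLower(text)), " ")
	for _, request := range knownRequests {
		if strings.Contains(normalized, request) {
			return request, true
		}
	}
	return "", false
}

// BearingRange returns the bearing (whole degrees, 0 to 359, measured
// clockwise from grid north) and the range (whole nautical miles) from one
// contact to another.
func BearingRange(from, to Contact) (int, int) {
	east, north := to.East-from.East, to.North-from.North
	degrees := math.Atan2(east, north) * 180 / math.Pi
	bearing := (int(math.Round(degrees)) + 360) % 360
	return bearing, int(math.Round(math.Hypot(east, north)))
}

// NearestHostile returns the hostile contact closest to from, if any.
func NearestHostile(from Contact, contacts []Contact) (Contact, bool) {
	var nearest Contact
	found := false
	shortest := math.Inf(1)
	for _, c := range contacts {
		if !c.Hostile {
			continue
		}
		if d := math.Hypot(c.East-from.East, c.North-from.North); d < shortest {
			nearest, shortest, found = c, d, true
		}
	}
	return nearest, found
}

func main() {
	player := Contact{Name: "Eagle 1"}
	contacts := []Contact{
		{Name: "Flanker", East: 30, North: 40, Hostile: true},
		{Name: "Fulcrum", East: -60, North: 80, Hostile: true},
		{Name: "Hornet", East: 5, North: 5},
	}
	request, ok := ParseRequest("Anyface, Eagle 1, bogey dope")
	if !ok || request != "bogey dope" {
		fmt.Println("request not understood")
		return
	}
	if target, found := NearestHostile(player, contacts); found {
		bearing, distance := BearingRange(player, target)
		// Prints: Eagle 1: nearest hostile Flanker, bearing 037, range 50
		fmt.Printf("%s: nearest hostile %s, bearing %03d, range %d\n",
			player.Name, target.Name, bearing, distance)
	}
}
```

Every step is deterministic: the same transmission and the same picture always produce the same answer, which is the consistency property the paragraph above contrasts with an LLM.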

Could this provide ATC services?

I have no plans to attempt an ATC bot due to limitations within DCS.

AI aircraft in DCS cannot be directly commanded through scripting or external software and are incapable of safely operating in controlled airspace. For example, AI aircraft in DCS do not sequence for landing, and will only begin an approach if the entire approach and runway are clear. AI aircraft also cannot execute a hold or a missed approach, and they make no effort to maintain separation from other aircraft.

While working on this software I spoke to a number of people who thought it would be as easy as feeding a bunch of PDFs to an LLM and it would magically become a capable Air Traffic Controller. This could not be further from the truth! See this post by a startup working on AI for ATC on the challenges involved.

Are there options for different voices?

SkyEye can be used with one of these voices:

  1. Jenny, a feminine Irish English voice available on Windows and Linux.
  2. Alan, a masculine British English voice available on Windows and Linux.
  3. Samantha, a feminine US English voice available on macOS. This is the older version of Siri's voice from the iPhone 4s, iPhone 5 and iPhone 6.
  4. Siri's voices are available on macOS. Additional download and setup steps are required to use them.

I have chosen these voices because they meet the following criteria:

  • Permissive licensing
  • Source data was recorded with consent
  • Correct and unambiguous pronunciation, especially of numeric values, NATO reporting names and the Core Information Format
  • Able to run fully offline on modest hardware in near-realtime
  • Easily redistributable without requiring complex additional software to be installed
  • Sound the same regardless of the make and model of CPU or GPU used to generate it
  • Likely to remain functional many years into the future, including on future OS versions

I have investigated a number of alternative AI voices including ElevenLabs, OpenAI, Kokoro, Sherpa, Coqui, and others. I have not found voices that better meet these criteria. I continue to follow the state of the art and watch for new developments.

Can you add an option to do [insert feature here]?

I'm happy to hear your ideas, but I am very selective about what I choose to implement.

I develop SkyEye at no monetary cost to the user; therefore, one of my priorities is to keep the complexity of the software close to the minimum necessary level to ease the maintenance burden. I'm focusing only on features that are useful to most players. I avoid adding features that are gated by configuration options, because each one multiplies the permutations that need to be tested and debugged. See this video.

SkyEye is open source software. If you want a feature that I don't want to maintain, you have the right to fork the project and add it yourself (or hire a programmer to add it for you).

Technology

SkyEye would not be possible without these people and projects, for whom I am deeply appreciative:

  • DCS-SRS by @ciribob. Ciribob also patiently answered many of my questions on SRS internals and provided helpful debugging tips whenever I ran into a block in the SRS integration.
  • Tacview - specifically, ACMI real time telemetry - provides the data feed from DCS World.
  • @rurounijones's OverlordBot was a useful reference to compare SkyEye against during early development, and Jones himself was also patient with my questions on Discord.
  • OpenAI's Whisper provides speech-to-text. @ggerganov's ggml and whisper.cpp allow Whisper to be used locally without requiring cloud services or complex external software.
  • @rodaine's numwords module is invaluable for parsing numeric quantities from voice input.
  • Piper by the Rhasspy voice assistant project is used for text-to-speech on Windows and Linux.
  • The Jenny dataset by Dioco provides the feminine voice for SkyEye on Windows and Linux.
  • @popey's dataset provides the masculine voice for SkyEye on Windows and Linux.
  • @amitybell's embedded Piper module makes distribution and implementation of Piper a breeze. @nabbl improved this module.
  • Apple's Speech Synthesis Manager is used for text-to-speech on macOS.
  • @mattetti's go-audio project is used for decoding AIFF audio.
  • The Opus codec and the hraban/opus module provide audio compression for the SRS protocol.
  • @hbollon's go-edlib module provides algorithms to help SkyEye understand a callsign or command when SkyEye slightly mishears it or the user slightly misspeaks it over the radio.
  • @lithammer's shortuuid module provides a GUID implementation compatible with the SRS protocols.
  • @zaf's resample module helps with audio format conversion between Piper and SRS.
  • @martinlindhe's unit module provides easy angular, length, speed and frequency unit conversion.
  • @paulmach's orb module provides a simple, flexible GIS library for analyzing the geometric relationships between aircraft.
  • @proway's go-igrf module implements the International Geomagnetic Reference Field used to correct for magnetic declination.
  • @rsc and @jba's omap module provides a data structure used as part of SkyEye's algorithm for combining player callsigns.
  • Cobra is used for the CLI frontend, including configuration flags, help and examples. Viper is used to load configuration from a file/environment variables.
  • MSYS2 provides a Windows build environment.
  • @bwmarrin's discordgo module provides the Discord tracing integration.
  • @pasztorpisti's go-crc module provides algorithms for negotiating handshakes with Tacview telemetry sources.
  • Oto was helpful for debugging audio format conversion problems.
  • zerolog is helpful for general logging and printf debugging.
  • testify is used in unit tests.
  • flock, maintained by the Gofrs, provides optional concurrency controls for running multiple instances of SkyEye on a single CPU.
  • Multiple DCS communities provide invaluable feedback and morale-boosting energy.
  • The Ace Combat series by PROJECT ACES/Bandai Namco and Project Wingman by Sector D2 are massive influences on my interest in GCI/AWACS, and aviation in general. This project would not exist without the impact of Ace Combat 04: Shattered Skies.
  • And of course, DCS World is produced by Eagle Dynamics.
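
As an illustration of the fuzzy matching mentioned in the go-edlib entry above, here is a minimal Go sketch of matching a misheard callsign against a list of known callsigns using Levenshtein edit distance. It uses only the standard library and is not go-edlib's API or SkyEye's actual matching logic. The function names, the sample callsigns and the distance threshold of 2 are assumptions chosen for illustration.

```go
package main

import "fmt"

// Levenshtein returns the minimum number of single-character insertions,
// deletions and substitutions needed to turn a into b.
func Levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			best := prev[j-1] + cost // substitution or match
			if d := prev[j] + 1; d < best { // deletion from a
				best = d
			}
			if d := curr[j-1] + 1; d < best { // insertion into a
				best = d
			}
			curr[j] = best
		}
		prev = curr
	}
	return prev[len(rb)]
}

// ClosestCallsign returns the known callsign with the smallest edit distance
// to heard, provided that distance does not exceed maxDistance.
func ClosestCallsign(heard string, known []string, maxDistance int) (string, bool) {
	best, bestDistance := "", maxDistance+1
	for _, callsign := range known {
		if d := Levenshtein(heard, callsign); d < bestDistance {
			best, bestDistance = callsign, d
		}
	}
	return best, bestDistance <= maxDistance
}

func main() {
	known := []string{"mobius one", "wardog two"}
	for _, heard := range []string{"moebius one", "razgriz four"} {
		if match, ok := ClosestCallsign(heard, known, 2); ok {
			fmt.Printf("%q matched %q\n", heard, match)
		} else {
			fmt.Printf("%q matched nothing\n", heard)
		}
	}
}
```

The threshold is the main design choice: too low and small transcription errors are rejected, too high and one callsign can be mistaken for another.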