
Utility classes

Alex Andres edited this page Dec 3, 2021 · 7 revisions

Capture audio frames with AudioRecorder

The AudioRecorder allows you to record audio from an audio input device, e.g. a microphone. To enumerate the available devices, use MediaDevices.getAudioCaptureDevices(). For more details on devices, see the MediaDevices section.

Note: Each sample is assumed to be a 16-bit PCM sample.

AudioDevice device = getAudioDevice("...");

AudioSink sink = new AudioSink() {

	@Override
	public void onRecordedData(byte[] audioSamples, int nSamples,
			int nBytesPerSample, int nChannels, int samplesPerSec,
			int totalDelayMS, int clockDrift) {
		System.out.printf("%d - %d, %d, %d, %d, %d, %d %n",
			audioSamples.length, nSamples, nBytesPerSample,
			nChannels, samplesPerSec, totalDelayMS, clockDrift);

		// write(audioSamples, 0, nSamples * nBytesPerSample);
	}
};

AudioRecorder recorder = new AudioRecorder();
recorder.setAudioDevice(device);
recorder.setAudioSink(sink);
recorder.start();

// Wait while audio is being recorded...

recorder.stop();
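
The commented-out `write(...)` call in the sink above is left undefined on this page. As a minimal sketch of one way to fill it in, the hypothetical `PcmWavWriter` below (not part of the library) buffers the recorded bytes in memory and saves them as a WAV file using the standard `javax.sound.sampled` API. It assumes the samples are signed, little-endian, interleaved 16-bit PCM; only the 16-bit part is stated above, the rest is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper, not part of the library.
final class PcmWavWriter {

	private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

	// Synchronized because the sink callback may run on a different
	// thread than the one that calls saveWav().
	synchronized void write(byte[] data, int offset, int length) {
		buffer.write(data, offset, length);
	}

	// Writes everything buffered so far to a WAV file and returns
	// the number of bytes reported by AudioSystem.write().
	synchronized int saveWav(File file, int sampleRate, int channels) {
		byte[] pcm = buffer.toByteArray();
		// Assumption: signed, little-endian 16-bit PCM.
		AudioFormat format = new AudioFormat(sampleRate, 16, channels, true, false);
		long frames = pcm.length / format.getFrameSize();

		try (AudioInputStream stream = new AudioInputStream(
				new ByteArrayInputStream(pcm), format, frames)) {
			return AudioSystem.write(stream, AudioFileFormat.Type.WAVE, file);
		}
		catch (IOException e) {
			throw new UncheckedIOException(e);
		}
	}
}
```

With this helper, the sink would call `write(audioSamples, 0, nSamples * nBytesPerSample)` on a `PcmWavWriter` instance, and `saveWav(...)` would be called once after `recorder.stop()`. Buffering in memory keeps the sketch short; for long recordings, streaming to disk is likely the better choice.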

Convert audio sampling rate with AudioResampler

Suppose we want to convert audio recorded from a microphone to a different sampling rate. For simplicity, we extend the AudioSink from the previous example.

// Convert from the sampling rate of 48 kHz to 32 kHz with two channels (stereo).
AudioResampler resampler = new AudioResampler(48000, 32000, 2);

AudioSink sink = new AudioSink() {

	@Override
	public void onRecordedData(byte[] audioSamples, int nSamples,
			int nBytesPerSample, int nChannels, int samplesPerSec,
			int totalDelayMS, int clockDrift) {
		byte[] dst = new byte[nSamples * nBytesPerSample];

		int nResampled = resampler.resample(audioSamples, nSamples * nChannels, dst);

		// write(dst, 0, nResampled * (nBytesPerSample / nChannels));
	}
};
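
Buffer sizes for resampling follow from simple arithmetic. The sketch below uses a hypothetical `FrameMath` helper (not part of the library) and assumes 10 ms frames of interleaved 16-bit PCM to show how many bytes one frame occupies before and after converting 48 kHz stereo to 32 kHz stereo. Whether AudioResampler delivers exactly these counts is not confirmed here.

```java
// Hypothetical helper, not part of the library.
final class FrameMath {

	static final int BYTES_PER_SAMPLE = 2; // 16-bit PCM

	// Samples per channel in one 10 ms frame (assumed frame duration).
	static int samplesPerChannel(int sampleRate) {
		return sampleRate / 100;
	}

	// Total bytes of one interleaved 10 ms frame.
	static int frameBytes(int sampleRate, int channels) {
		return samplesPerChannel(sampleRate) * channels * BYTES_PER_SAMPLE;
	}

	public static void main(String[] args) {
		System.out.println(frameBytes(48000, 2)); // 1920
		System.out.println(frameBytes(32000, 2)); // 1280
	}
}
```

Under these assumptions, a 1920-byte input frame shrinks to 1280 bytes after downsampling, so a destination buffer as large as the input leaves room to spare.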

Audio resampling and remixing with AudioConverter

The AudioConverter extends the AudioResampler with additional support for channel remixing.

// Convert audio with 48 kHz stereo to 44.1 kHz mono.
AudioConverter converter = new AudioConverter(48000, 2, 44100, 1);

// 48000 / 100 (10 ms frame) * 2 (channels) * 2 (bytes per 16-bit PCM sample)
byte[] input = new byte[48000 / 100 * 2 * 2];
// Fill the input buffer...

byte[] output = new byte[converter.getTargetBufferSize()];

// 'nConverted' represents the number of samples in the output buffer.
int nConverted = converter.convert(input, output);

// write(output, 0, nConverted * 2); // 2 bytes per 16-bit PCM sample

converter.dispose();
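
To illustrate what remixing from stereo to mono means, here is a minimal sketch that averages the left and right channels of interleaved 16-bit PCM. The `Downmix` class is hypothetical and the little-endian byte order is an assumption. Averaging is a common downmix approach, not necessarily what AudioConverter does internally.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the library.
final class Downmix {

	// Converts interleaved stereo 16-bit PCM to mono by averaging
	// the left and right sample of each frame.
	static byte[] stereoToMono(byte[] stereo) {
		ByteBuffer in = ByteBuffer.wrap(stereo).order(ByteOrder.LITTLE_ENDIAN);
		// One 2-byte mono sample per complete 4-byte stereo frame.
		ByteBuffer out = ByteBuffer.allocate(stereo.length / 4 * 2)
				.order(ByteOrder.LITTLE_ENDIAN);

		while (in.remaining() >= 4) {
			int left = in.getShort();
			int right = in.getShort();
			out.putShort((short) ((left + right) / 2));
		}
		return out.array();
	}
}
```

The mono result is half the size of the stereo input, which is why the output buffer in the example above is smaller than the input buffer even before the sampling rate changes.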

Enhance audio quality with AudioProcessing

With AudioProcessing you can enable audio filters that enhance audio quality, e.g. suppressing noise and echo or removing low frequencies. In addition, AudioProcessingStats can be gathered, e.g. to retrieve the audio signal level or to check whether voice has been detected in an audio section.

Below we extend the AudioSink from the first example.

// Configure audio filters you want to enable.
AudioProcessingConfig config = new AudioProcessingConfig();
config.echoCanceller.enabled = true;
config.echoCanceller.enforceHighPassFiltering = true;
config.gainControl.enabled = true;
config.highPassFilter.enabled = true;
config.noiseSuppression.enabled = true;
config.noiseSuppression.level = NoiseSuppression.Level.HIGH;
config.residualEchoDetector.enabled = true;
config.transientSuppression.enabled = true;
config.voiceDetection.enabled = true;

AudioProcessing audioProcessing = new AudioProcessing();
audioProcessing.applyConfig(config);

// Process 48 kHz input with one channel (mono). Use 2 for stereo, etc.,
// depending on your device/stream configuration.
AudioProcessingStreamConfig streamConfig = new AudioProcessingStreamConfig(48000, 1);

AudioSink sink = new AudioSink() {

	@Override
	public void onRecordedData(byte[] audioSamples, int nSamples,
			int nBytesPerSample, int nChannels, int samplesPerSec,
			int totalDelayMS, int clockDrift) {
		// Perform in-place analysis and processing:
		// - The result is written back into 'audioSamples'.
		// - Alternatively, pass your own target buffer.
		audioProcessing.processStream(audioSamples, streamConfig, streamConfig, audioSamples);

		// Show some statistics.
		AudioProcessingStats stats = audioProcessing.getStatistics();

		System.out.printf("%d, %d, %b %n", stats.delayMs,
				stats.outputRmsDbfs, stats.voiceDetected);

		// Save the enhanced audio. Not required if you're only interested in statistics.
		// write(audioSamples, 0, nSamples * nBytesPerSample);
	}
};
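
For intuition about what an RMS level in dBFS expresses, the sketch below computes it directly from a buffer of 16-bit PCM. `PcmLevel` is a hypothetical helper and the little-endian byte order is assumed. The scale and sign convention of `stats.outputRmsDbfs` are not specified on this page, so treat this as an illustration of the concept rather than the library's computation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the library.
final class PcmLevel {

	// Returns the RMS level relative to full scale (32768) in decibels.
	// 0 dBFS is the maximum; silence yields negative infinity.
	static double rmsDbfs(byte[] pcm) {
		int count = pcm.length / 2;
		if (count == 0) {
			return Double.NEGATIVE_INFINITY;
		}

		ByteBuffer in = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
		double sumSquares = 0;

		for (int i = 0; i < count; i++) {
			// Normalize each sample to the range [-1, 1).
			double sample = in.getShort() / 32768.0;
			sumSquares += sample * sample;
		}

		double rms = Math.sqrt(sumSquares / count);
		return 20.0 * Math.log10(rms);
	}
}
```

A buffer whose samples all sit at half of full scale comes out at roughly -6 dBFS, since each halving of the amplitude lowers the level by about 6 dB.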