Audio Processing Laboratory

Past course
Please visit the education page for information on current courses.


  • Instructor: Prof. Dr. Meinard Müller
  • Tutors:
    • Soumitro Chakrabarty, Maja Taseska
    • Stefan Balke, Jonathan Driedger, Thomas Prätzlich
    • Prof. Dr. Tom Bäckström, Johannes Fischer
    • Fabian-Robert Stöter, Michael Schöffler
  • Language: English
  • Place: Am Wolfsmantel 33, Erlangen-Tennenlohe
  • Credits: 2.5 ECTS

Relation to previous Lab:

This lab has been significantly extended and revised based on the feedback we received. It now includes new experiments covering basic audio and music processing as well as spatial (3D) audio. The time management of the lab experiments has also been improved so that each session should take no longer than the estimated four hours. However, we assume that participants carefully read the documents issued for each lab experiment beforehand. To support the participants, we also offer a mini test as well as an additional MATLAB introduction.

Schedule

The lab consists of:

  • one preliminary meeting (2 hours),
  • a second meeting (2 hours, including a short mini test) and
  • five units (4 hours each).

Note: Attendance is mandatory for all meetings and labs.

  • First meeting: 09.04.2014, 14:00-16:00, Room 3R4.04.
  • Second meeting + mini test: 14.05.2014, 14:00-16:00, Room 3R4.04.

All following lab courses take place in Room 3R3.06 (P1, LIKE):

  • Lab 1: Short-Time Fourier Transform and Chroma Features (Meinard Müller, Stefan Balke) Wed 21.05.2014, 14:00-19:00
  • Lab 2: Statistical Methods for Audio Experiments (Michael Schöffler, Fabian-Robert Stöter) Wed 28.05.2014, 14:00-19:00
  • Lab 3: Harmonic Percussive Source Separation (Jonathan Driedger, Thomas Prätzlich) Wed 04.06.2014, 14:00-19:00
  • Lab 4: Speech Analysis (Tom Bäckström, Johannes Fischer) Wed 11.06.2014, 14:00-19:00
  • Lab 5: Beamforming for Speech Enhancement (Soumitro Chakrabarty, Maja Taseska) Wed 18.06.2014, 14:00-19:00

Enrollment


If you want to take this lab course, please register before 9 April via StudOn. For questions, please contact Thomas Prätzlich.

Objectives and Format

The objective of this lab course is to give students hands-on experience in audio processing. The lab is organised as follows:

  • First meeting in first semester week. Assignment of topics and groups. Further instructions.
  • Short mini test at the beginning of May covering the basics from the provided course material. Every student has to understand these basics (checked by the mini test) prior to participating in the lab sessions.
  • Every group (each consisting of two participants) has to pass all lab courses.
  • The lab course material will be made available in April. The handouts cover theoretical as well as practical aspects of the labs. They also include homework exercises, which must be prepared before the lab starts.

The lab courses will be held weekly for each group and will be supervised by a member of the AudioLabs team.

Topics and Sources

Lab 0: Basics on MATLAB

This document gives a short introduction to MATLAB. Rather than being comprehensive, it only introduces some basic functions that are needed in the subsequent lab courses.

Lab 1: Short-Time Fourier Transform and Chroma Features

The Fourier transform, which is used to convert a time-dependent signal to a frequency-dependent signal, is one of the most important mathematical tools in audio signal processing. Applying the Fourier transform to local sections of an audio signal, one obtains the short-time Fourier transform (STFT). In this lab course, we study a discrete version of the STFT. To work with the discrete STFT in practice, one needs to correctly interpret the discrete time and frequency parameters. Using MATLAB, we compute a discrete STFT and visualize its magnitude in the form of a spectrogram representation. Then, we derive from the STFT various audio features that are useful for analyzing music signals. In particular, we develop a log-frequency spectrogram, where the frequency axis is converted into an axis corresponding to musical pitches. From this, we derive a chroma representation, which is a useful tool for capturing harmonic information of music.
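As a rough illustration of how the discrete time and frequency parameters can be interpreted, the following sketch maps a frame index and a frequency index to seconds and Hz, and then pools spectral energy into twelve chroma bands. The lab itself uses MATLAB; plain Python is used here purely for illustration. All function names, the convention that frame m starts at sample m times the hop size, the 440 Hz reference for MIDI pitch 69, and the pooling of squared magnitudes are assumptions of this sketch and are not taken from the lab handout.

```python
import math

def frame_time(m, hop_size, sample_rate):
    """Time in seconds at which frame m starts (assumed convention)."""
    return m * hop_size / sample_rate

def bin_frequency(k, window_size, sample_rate):
    """Center frequency in Hz of frequency index k of a length-N DFT."""
    return k * sample_rate / window_size

def midi_pitch(freq_hz, ref_hz=440.0):
    """Nearest MIDI pitch number, assuming pitch 69 (A4) lies at ref_hz."""
    return round(69 + 12 * math.log2(freq_hz / ref_hz))

def chroma_from_magnitudes(mags, window_size, sample_rate):
    """Pool squared magnitudes of one STFT frame into 12 chroma bands.

    Index 0 corresponds to C and index 9 to A. No pitch range limit is
    applied, which a practical implementation would typically add.
    """
    chroma = [0.0] * 12
    for k in range(1, len(mags)):  # skip k = 0, since 0 Hz has no pitch
        pitch = midi_pitch(bin_frequency(k, window_size, sample_rate))
        chroma[pitch % 12] += mags[k] ** 2
    return chroma

# Toy frame: sample rate 4400 Hz and window size 440 give 10 Hz per bin,
# so bin 44 lies exactly at 440 Hz (the note A).
mags = [0.0] * 221
mags[44] = 2.0
chroma = chroma_from_magnitudes(mags, window_size=440, sample_rate=4400)
```

In this toy frame all energy ends up in the chroma band for A, regardless of the octave in which the sinusoid would be placed.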

Lab 2: Statistical Methods for Audio Experiments

This course teaches students the basics of experimental statistics as used for evaluating auditory experiments. Listening tests or experiments are a crucial part of assessing the quality of audio systems. There is currently no system that allows researchers and developers to evaluate the quality of audio systems fully objectively. In fact, the best evaluation instrument is the human ear. Since only fair and unbiased comparisons between codecs can establish that a new development is preferred over the previous system, it is important to bring fundamental knowledge of statistics into the evaluation process. This helps to address the main problems of experimental tests, such as uncontrolled environments, subpar headphones or loudspeaker reproduction systems, and listeners who have no experience with listening tests.

Lab 3: Harmonic Percussive Source Separation

Sounds can broadly be classified into two classes. Harmonic sound, on the one hand, is what we perceive as pitched sound and what makes us hear melodies and chords. Percussive sound, on the other hand, is noise-like and usually stems from instrument onsets like the hit on a drum or from consonants in speech. The goal of harmonic-percussive source separation (HPSS) is to decompose an input audio signal into a signal consisting of all harmonic sounds and a signal consisting of all percussive sounds. In this lab course, we study an HPSS algorithm and implement it in MATLAB. Exploiting knowledge about the spectral structure of harmonic and percussive sounds, this algorithm decomposes the spectrogram of the given input signal into two spectrograms, one for the harmonic and one for the percussive component. Afterwards, two waveforms are reconstructed from these spectrograms, which form the desired signals. Additionally, we describe the application of HPSS for enhancing chroma feature extraction and onset detection. The techniques used in this lab cover median filtering, spectral masking and the inversion of the short-time Fourier transform.
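The median filtering and spectral masking steps can be sketched on a toy spectrogram as follows. This is one common formulation (median filtering along time to enhance harmonic structures, along frequency to enhance percussive structures, followed by binary masks) and is not necessarily the exact algorithm of the lab handout. The lab uses MATLAB; plain Python is used here only for illustration. The function names, the filter lengths, the truncated windows at the borders, and the tie-breaking rule in favour of the harmonic mask are assumptions of this sketch. The STFT and its inversion are omitted.

```python
from statistics import median

def median_filter_1d(values, length):
    """Median filter with an odd window length; windows are truncated at the borders."""
    half = length // 2
    n = len(values)
    return [median(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def hpss_binary_masks(spec, len_time=3, len_freq=3):
    """Return (mask_h, mask_p) for spec[k][m] (frequency bin k, frame m)."""
    num_bins, num_frames = len(spec), len(spec[0])
    # Filtering each row along time keeps horizontal (harmonic) structures.
    harm = [median_filter_1d(row, len_time) for row in spec]
    # Filtering each column along frequency keeps vertical (percussive) structures.
    cols = [median_filter_1d([spec[k][m] for k in range(num_bins)], len_freq)
            for m in range(num_frames)]
    perc = [[cols[m][k] for m in range(num_frames)] for k in range(num_bins)]
    # Binary masks: each time-frequency bin is assigned to exactly one component.
    mask_h = [[1 if harm[k][m] >= perc[k][m] else 0 for m in range(num_frames)]
              for k in range(num_bins)]
    mask_p = [[1 - v for v in row] for row in mask_h]
    return mask_h, mask_p

def apply_mask(spec, mask):
    """Elementwise product of a spectrogram and a mask."""
    return [[s * w for s, w in zip(srow, mrow)] for srow, mrow in zip(spec, mask)]

# Toy spectrogram: a horizontal line (sustained tone) in bin 2
# and a vertical line (click) in frame 2.
spec = [[1.0 if (k == 2 or m == 2) else 0.0 for m in range(5)] for k in range(5)]
mask_h, mask_p = hpss_binary_masks(spec)
harmonic = apply_mask(spec, mask_h)
percussive = apply_mask(spec, mask_p)
```

In this toy case the horizontal line ends up in the harmonic spectrogram and the vertical line in the percussive one, and the two masked spectrograms add up to the original.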

Lab 4: Speech Analysis

This experiment is designed to give you a brief overview of the physiology of the production of speech. Moreover, it will give a descriptive introduction to the tools of speech coding, their functionality and their strengths but also their shortcomings.

Lab 5: Beamforming for Speech Enhancement

This module is designed to give the students a practical understanding of beamforming for speech enhancement and to demonstrate the difference in performance between fixed and signal-dependent beamformers. The module is closely related to the lecture “Speech Enhancement” given by Prof. Dr. ir. Emanuel Habets. In this exercise, the students are expected to implement a fixed (signal-independent) beamformer known as the delay-and-sum beamformer and a signal-dependent beamformer known as the minimum variance distortionless response (MVDR) beamformer for a noise reduction and interference rejection task. Their performance is then compared via objective measures to demonstrate the improved performance of the signal-dependent beamformer.

  • Tutors: Soumitro Chakrabarty, Maja Taseska
  • Instructions: tba
  • Sources: tba
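The idea behind the delay-and-sum beamformer mentioned above can be sketched in the time domain: each microphone signal is shifted so that the desired source is time-aligned across channels, and the aligned signals are averaged. The lab uses MATLAB; plain Python is used here only for illustration. The function name, the restriction to known integer sample delays, and the zero padding at the signal borders are simplifying assumptions of this sketch and are not taken from the lab material. The MVDR beamformer is not covered here.

```python
def delay_and_sum(channels, delays):
    """Average the channels after compensating integer sample delays.

    channels: list of equally long lists of samples, one per microphone.
    delays:   delays[i] is the number of samples by which channel i lags
              the reference channel (assumed known in this sketch).
    """
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for t in range(n):
            src = t + delay  # advance the lagging channel by its delay
            if 0 <= src < n:  # samples outside the recording count as zero
                out[t] += signal[src]
    return [v / len(channels) for v in out]

# Toy example: the second microphone receives the same pulse two samples later.
source = [0, 1, 0, -1, 0, 0, 0, 0]
mic0 = source
mic1 = [0, 0] + source[:-2]

steered = delay_and_sum([mic0, mic1], delays=[0, 2])     # aligned to the source
unsteered = delay_and_sum([mic0, mic1], delays=[0, 0])   # no alignment
```

With the correct delays the pulse adds up coherently and is recovered at full amplitude, whereas without alignment the averaged pulse is smeared and its peak amplitude is halved.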

Course pre-requisites and assessment criteria

Requirements are a solid mathematical background, a good understanding of the fundamentals of digital signal processing, as well as a general background and personal interest in audio. Furthermore, the students are required to have experience with MATLAB. The statistics lab (Lab 2) will use the R programming language.

To pass the lab course you need to pass all five individual labs.