Subjective Evaluation of Blind Audio Source Separation Database: SEBASS-DB

The SEBASS-DB is a collection of subjective ratings of the perceived quality of separated audio source signals. It contains the results of five listening tests assessing the Basic Audio Quality of such signals. The audio signals graded by the listeners are publicly available and were taken from community-based Signal Separation Evaluation Campaigns and from a study on subjective and objective quality assessment of audio source separation. The subjective ratings and audio signals can be accessed as described below.

Updates

  • September 2022: A dedicated paper on the SEBASS-DB, presented at IWAENC 2022, is now available [12]. Please reference this publication when using the database.

  • October 2021: A non-intrusive DNN-based quality estimate for controlling the remixing of separated dialogue is presented, using an adapted 2f-model as the underlying quality measure [11].

  • April 2021: The 2f-model is shown to be one of the best state-of-the-art objective measures of perceptual audio quality in terms of correlation with human perceptual scores across different domains [10].

  • April 2020: Supplement to the 2f-model [3] for estimating subjective quality of separated audio source signals. An additional set of parameters is available, optimised for usage with PQevalAudio, an open source implementation of the PEAQ Basic measurement scheme by P. Kabal.

Collection of the subjective ratings

The subjective ratings were collected using MUSHRA-based listening tests. The webMUSHRA software, which provides a graphical user interface for the participants, was used for some of the listening tests.

The test procedure can be explained as follows:

Method of presentation

In each trial of a listening test, the participants blindly rated the following test signals in comparison to the known (ideal) reference source signal:

  • separated versions of the source signal from different separation algorithms
  • the original source signal (hidden reference)
  • the signal mixture (as anchor signal)

The subjects could switch instantaneously between the presented signals and set playback loops. The items were presented via headphones in a quiet listening room.

Method of quantification

The listeners had to grade the unknown signals on the MUSHRA scale (0..100). The numerical scale is divided into five equal segments that are semantically annotated ("bad", "poor", "fair", "good", "excellent"). The question put to the listeners was: "Grade the Basic Audio Quality of the items under test with respect to the reference signal. Any perceived differences between the reference and the other signal must be interpreted as an impairment."
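The mapping from a numerical grade to its semantic segment can be sketched as follows. This is only an illustration: the function name is hypothetical, and assigning a grade that lies exactly on a segment boundary (e.g. 20) to the upper segment is an assumption, not something specified by the listening tests.

```python
# Semantic annotations of the five equal segments of the 0..100 scale.
LABELS = ("bad", "poor", "fair", "good", "excellent")


def mushra_label(score):
    """Return the semantic annotation of the segment containing `score`."""
    if not 0 <= score <= 100:
        raise ValueError("MUSHRA grades lie between 0 and 100")
    # Each segment is 20 points wide. Boundary grades go to the upper
    # segment (assumption); a grade of 100 stays in the top segment.
    return LABELS[min(int(score // 20), 4)]
```

For example, `mushra_label(59.9)` falls into the "fair" segment under these assumptions.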

License

Because the audio signals and the subjective ratings have different origins, different licenses apply to these data.

  • The subjective grades are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
  • The license of the audio signals differs between the individual audio datasets. Links to the licensing terms are given in the corresponding section on each dataset.
  • When using this database, the following paper must be referenced: "The SEBASS-DB: A Consolidated Public Data Base of Listening Test Results for Perceptual Evaluation of BSS Quality Measures" (in-press) by Thorsten Kastner and Jürgen Herre, International Workshop on Acoustic Signal Enhancement (IWAENC 2022), Bamberg, Germany [12].

The SEBASS-DB Datasets

The SEBASS-DB consists of five sets of listening test results, listed below. For four of these datasets (SASSEC, SiSEC08, SAOC, PEASS-DB), test signals and listener ratings can be downloaded together as a single file archive. For the SiSEC18 dataset, only the listener ratings are provided. Its audio signals are not included and must be obtained from the MUSDB18 website; to get the test signals, contact the provider of the MUSDB18 corpus (see below for details).

The number of listeners who participated in a listening test varied between 7 and 19. Experienced listeners took part in all tests except those for the SiSEC18 dataset, in which both experienced and naive listeners took part.

All audio signals were converted to 2-channel stereo PCM with 16-bit resolution and a 48 kHz sampling rate. The audio format of the original audio signals may differ.

The signals presented to the listeners have a maximum duration of 10 s. Where the original audio signal is longer, its first 10 s were used for the listening tests.

The subjective ratings are provided as CSV files with the fields Testname, Listener name, Test Trial, Test Condition and Rating score.
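A downloaded signal could be checked against the stated format with a short script like the one below. It is a sketch under assumptions: the text above only states PCM, so the WAV container is assumed here, and all function names are illustrative. The silent test signal is generated in memory purely for demonstration.

```python
import io
import wave


def wav_properties(fileobj):
    """Return (channels, bits per sample, sampling rate in Hz, duration in s)."""
    with wave.open(fileobj, "rb") as wav:
        rate = wav.getframerate()
        return (wav.getnchannels(), wav.getsampwidth() * 8, rate,
                wav.getnframes() / rate)


def matches_stated_format(fileobj, max_seconds=10.0):
    """Check for 2-channel, 16-bit, 48 kHz PCM of at most `max_seconds`."""
    channels, bits, rate, duration = wav_properties(fileobj)
    return (channels, bits, rate) == (2, 16, 48000) and duration <= max_seconds


def silent_wav(seconds, channels=2, rate=48000):
    """Build an in-memory 16-bit PCM WAV file of silence (demo data only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (int(seconds * rate) * channels * 2))
    buffer.seek(0)
    return buffer


print(matches_stated_format(silent_wav(1.0)))
```

`wave.open` also accepts a file path, so the same check can be applied to a file on disk.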
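As an illustration of working with these files, the sketch below averages the ratings per test condition. Several things are assumed rather than taken from the database: that the files are comma-separated, have no header row, and store the fields in the order listed above. The short field names and the sample rows are invented for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Assumed column order, following the field list above (names are illustrative).
FIELDS = ["testname", "listener", "trial", "condition", "rating"]


def mean_rating_per_condition(csv_file):
    """Average the rating scores over listeners for each (test, trial, condition)."""
    reader = csv.DictReader(csv_file, fieldnames=FIELDS)
    scores = defaultdict(list)
    for row in reader:
        key = (row["testname"], row["trial"], row["condition"])
        scores[key].append(float(row["rating"]))
    return {key: mean(values) for key, values in scores.items()}


# Invented sample rows, not taken from the SEBASS-DB.
sample = io.StringIO(
    "demo_test,listener_1,trial_1,hidden_reference,100\n"
    "demo_test,listener_2,trial_1,hidden_reference,96\n"
    "demo_test,listener_1,trial_1,anchor,20\n"
    "demo_test,listener_2,trial_1,anchor,30\n"
)
means = mean_rating_per_condition(sample)
```

If the actual files contain a header row, drop the `fieldnames` argument and adapt the keys to the header names.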

a) SASSEC Dataset

b) SiSEC08 Dataset

c) SAOC Dataset

d) PEASS-DB

e) SiSEC18

Supplement

The SEBASS-DB was used to train and test an efficient model for estimating the subjective quality of separated audio source signals [3]. This model uses two Model Output Variables (MOVs) from the ITU-R BS.1387 PEAQ [9] measurement scheme as input for computing a subjective quality score.

An additional parameter set for this so-called '2f-model' is now available.

We observed differences in the computed MOVs between our internal implementation of PEAQ used for calibrating the 2f-model and publicly available implementations of PEAQ.
These differences lead to noticeable deviations in the output of the 2f-model, depending on the PEAQ version used to calculate the MOVs for driving the 2f-model.

We therefore provide an additional parameter set for the 2f-model, tailored to one specific publicly available implementation of PEAQ. It achieves the best prediction performance in combination with this PEAQ implementation and allows users to verify their own implementations of the 2f-model.

To derive this additional parameter set, the 2f-model was calibrated in the same way and on the same dataset of separated source signals as described in [3]. The only difference is that the MOVs used as input for the 2f-model were computed with PQevalAudio, a publicly available MATLAB implementation of the PEAQ Basic measurement scheme by P. Kabal [8] of McGill University.

The adapted 2f-model calibrated on PQevalAudio is:

$$\mathrm{MMS}_{\mathrm{est}} = \frac{56.1345}{1 + \left( -0.0282 \cdot \mathrm{AvgModDiff1} - 0.8628 \right)^2} - 27.1451 \cdot \mathrm{ADB} + 86.3515$$

Note: The model output must be limited to the value range from 0 to 100.
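The formula, including the limitation of the output range, can be sketched in Python as follows. The function and argument names are illustrative only. The two MOVs, AvgModDiff1 and ADB, are assumed to have been computed beforehand with PQevalAudio.

```python
def two_f_model_pqevalaudio(avg_mod_diff1, adb):
    """Estimate the quality score from the two PEAQ MOVs.

    Uses the parameter set given above for MOVs computed with PQevalAudio.
    """
    score = (
        56.1345 / (1.0 + (-0.0282 * avg_mod_diff1 - 0.8628) ** 2)
        - 27.1451 * adb
        + 86.3515
    )
    # The model output must be limited to the value range from 0 to 100.
    return min(max(score, 0.0), 100.0)
```

Without the final limiting step, small MOV values produce scores above 100 and large ADB values produce negative scores, so the clamp is part of the model rather than an optional safeguard.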

A set of test signals, reference signals and the corresponding model outputs can be downloaded to verify your implementation:

Literature

  1. Thorsten Kastner
    Evaluating physical measures for predicting the perceived quality of blindly separated audio source signals
    In Audio Engineering Society Convention 127, 2009.
    @conference{ksr_measures,
    Address = {New York},
    Author = {Thorsten Kastner},
    Booktitle = {Audio Engineering Society Convention 127},
    Month = {Oct},
    Number = {7824},
    Title = {Evaluating physical measures for predicting the perceived quality of blindly separated audio source signals},
    Url = {http://www.aes.org/e-lib/browse.cfm?elib=15020},
    Year = {2009}
    }
  2. Thorsten Kastner
    The Influence of the Rendering Architecture on the Subjective Performance of Blind Source Separation Algorithms
    In Audio Engineering Society Convention 127, 2009.
    @conference{ksr_BSS_SAOC,
    Address = {New York},
    Author = {Thorsten Kastner},
    Booktitle = {Audio Engineering Society Convention 127},
    Url = {http://www.aes.org/e-lib/browse.cfm?elib=15093},
    Month = {Oct},
    Number = {7898},
    Title = {The Influence of the Rendering Architecture on the Subjective Performance of Blind Source Separation Algorithms},
    Year = {2009}
    }
  3. Thorsten Kastner and Jürgen Herre
    An Efficient Model for Estimating Subjective Quality of Separated Audio Source Signals
    In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'19), 2019. DOI
    @conference{ksrWaspaa19,
    Address = {New Paltz, New York, USA},
    Author = {Thorsten Kastner and Jürgen Herre},
    Booktitle = {{IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'19)}},
    Doi = {10.1109/WASPAA.2019.8937179},
    Month = {October},
    Title = {An Efficient Model for Estimating Subjective Quality of Separated Audio Source Signals},
    Year = {2019}}
  4. Emmanuel Vincent, Hiroshi Sawada, Pau Bofill, Shoji Makino, and Justinian P. Rosca
    First Stereo Audio Source Separation Evaluation Campaign: Data, Algorithms and Results
    In 7th International Conference on Independent Component Analysis and Signal Separation (ICA07): 552–559, 2007. DOI
    @conference{sourceSep_ICA07,
    Address = {London, United Kingdom},
    Author = {Emmanuel Vincent and Hiroshi Sawada and Pau Bofill and Shoji Makino and Justinian P. Rosca},
    Booktitle = {7th International Conference on Independent Component Analysis and Signal Separation {(ICA07)}},
    Doi = {10.1007/978-3-540-74494-8_69},
    Month = {Sep},
    Pages = {552-559},
    Title = {First Stereo Audio Source Separation Evaluation Campaign: Data, Algorithms and Results},
    Year = {2007}
    }
  5. Emmanuel Vincent, Shoko Araki, and Pau Bofill
    The 2008 Signal Separation Evaluation Campaign: A Community-Based Approach to Large-Scale Evaluation
    In 8th International Conference on Independent Component Analysis and Signal Separation (ICA): 734–741, 2009. DOI
    @conference{sourceSep_ICA09,
    Address = {Paraty, Brazil},
    Author = {Emmanuel Vincent and Shoko Araki and Pau Bofill},
    Booktitle = {8th International Conference on Independent Component Analysis and Signal Separation {(ICA)}},
    Doi = {10.1007/978-3-642-00599-2_92},
    Month = {Mar},
    Pages = {734--741},
    Title = {The 2008 Signal Separation Evaluation Campaign: A Community-Based Approach to Large-Scale Evaluation},
    Year = {2009}
    }
  6. Valentin Emiya, Emmanuel Vincent, Niklas Harlander, and Volker Hohmann
    Subjective and objective quality assessment of audio source separation
    In Transactions on Audio, Speech, and Language Processing: 2046–2057, 2011. DOI
    @inproceedings{emiya:2011:inria-00567152:1,
    Author = {Valentin Emiya and Emmanuel Vincent and Niklas Harlander and Volker Hohmann},
    Booktitle = {Transactions on Audio, Speech, and Language Processing},
    Journal = {IEEE Transactions on Audio, Speech and Language Processing},
    Doi= {10.1109/TASL.2011.2109381},
    Organization = {IEEE},
    Pages = {2046 - 2057},
    Title = {Subjective and objective quality assessment of audio source separation},
    Volume = {19 (7)},
    Year = {2011}
    }
  7. Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis, and Rachel Bittner
    The MUSDB18 corpus for music separation
    2017. DOI
    @misc{rafii_zafar_2017_1117372,
    Author = {Rafii, Zafar and Liutkus, Antoine and Fabian-Robert St{\"o}ter and Mimilakis, Stylianos Ioannis and Bittner, Rachel},
    Doi = {10.5281/zenodo.1117372},
    Month = {Dec},
    Title = {The MUSDB18 corpus for music separation},
    Url = {https://doi.org/10.5281/zenodo.1117372},
    Year = {2017}
    }
  8. P. Kabal
    An Examination and Interpretation of ITU-R BS.1387: Perceptual Evaluation of Audio Quality
    Technical Report, McGill University, 2002.
    @techreport{Kabal,
    Author = {P. Kabal},
    Institution = {McGill University},
    Title = {An Examination and Interpretation of ITU-R BS.1387: Perceptual Evaluation of Audio Quality},
    Year = {2002}
    }
  9. Thilo Thiede, William C. Treurniet, Roland Bitto, Christian Schmidmer, Thomas Sporer, John G. Beerends, and Catherine Colomes
    PEAQ - The ITU Standard for Objective Measurement of Perceived Audio Quality
    J. Audio Eng. Soc, 48(1/2): 3–29, 2000.
    @article{thiede2000peaq,
    Author = {Thilo Thiede and William C. Treurniet and Roland Bitto and Christian Schmidmer and Thomas Sporer and John G. Beerends and Catherine Colomes},
    Journal = {J. Audio Eng. Soc},
    Number = {1/2},
    Pages = {3--29},
    Title = {{PEAQ} - The {ITU} Standard for Objective Measurement of Perceived Audio Quality},
    Url = {http://www.aes.org/e-lib/browse.cfm?elib=12078},
    Volume = {48},
    Year = {2000},
    }
  10. Matteo Torcoli, Thorsten Kastner, and Jürgen Herre
    Objective Measures of Perceptual Audio Quality Reviewed: An Evaluation of Their Application Domain Dependence
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29: 1530–1541, 2021. DOI
    @article{9388867,
    author={Matteo Torcoli and Thorsten Kastner and Jürgen Herre},
    journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    title={Objective Measures of Perceptual Audio Quality Reviewed: An Evaluation of Their Application Domain Dependence},
    year={2021},
    volume={29},
    number={},
    pages={1530-1541},
    doi={10.1109/TASLP.2021.3069302}
    }
  11. Matteo Torcoli, Jouni Paulus, Thorsten Kastner, and Christian Uhle
    Controlling the Remixing of Separated Dialogue with a Non-Intrusive Quality Estimate
    In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'21), 2021. DOI
    @conference{torWaspaa21,
    Address = {New Paltz, New York, USA},
    Author = {Matteo Torcoli and Jouni Paulus and Thorsten Kastner and Christian Uhle},
    Booktitle = {{IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'21)}},
    Month = {October},
    Title = {Controlling the Remixing of Separated Dialogue with a Non-Intrusive Quality Estimate},
    Year = {2021},
    doi={10.1109/WASPAA52581.2021.9632756}
    }
  12. Thorsten Kastner and Jürgen Herre
    The SEBASS-DB: A Consolidated Public Data Base of Listening Test Results for Perceptual Evaluation of BSS Quality Measures (in-press)
    In IEEE International Workshop on Acoustic Signal Enhancement (IWAENC'22), 2022.
    @conference{sebass-db-paper,
    author = {Thorsten Kastner and J\"urgen Herre},
    booktitle = {{IEEE  International Workshop on Acoustic Signal Enhancement  (IWAENC'22)}},
    date-added = {2022-07-13 13:09:09 +0200},
    date-modified = {2022-07-13 13:12:49 +0200},
    note = {in-press},
    title = {The SEBASS-DB: A Consolidated Public Data Base of Listening Test Results for Perceptual Evaluation of BSS Quality Measures},
    year = {2022}}

version 1.2