This appendix presents several third-party audio analysis libraries and methodologies, covering various programming languages (including MATLAB-based software). Furthermore, non-audio libraries and packages from the fields of pattern recognition, signal processing, etc., are presented. Although our primary focus is on MATLAB-based code, we also provide a flavour of Python and C/C++ resources.
Keywords
Software libraries
Software packages
MATLAB
Python
C++
Pattern recognition
Audio analysis
Data mining
Signal processing
In this appendix we present a number of audio analysis libraries and methodologies for MATLAB and other programming languages. In addition, we present related (non-audio) libraries and packages that could be used in the context of intelligent signal analysis, e.g. numerical analysis, general signal processing, multimedia file I/O, pattern recognition, and data mining resources. Although we have primarily focused on MATLAB-related libraries, we also give an idea of relevant resources in Python and C/C++.
For the MATLAB environment, we present the audio-specific and the general pattern recognition libraries separately, while Python and C/C++ are each covered by a single list. We have chosen to include Python-related approaches because of the language’s similarities to MATLAB and because of its wide acceptance in the scientific community. C/C++, in turn, is a lower-level (and harder to handle) solution for signal processing, but it generally leads to faster programs than MATLAB or Python.
Table B.1 presents a list of MATLAB libraries on audio and speech analysis that are available on the Web. Note that two of these libraries are focused on music information retrieval, one is speech-oriented and only one covers generic audio analysis.
Table B.1. MATLAB Libraries: Audio and Speech
Name | Description |
Auditory Toolbox, Version 2, by Malcolm Slaney | A MATLAB library that focuses on representing stages of the human auditory analysis system, https://engineering.purdue.edu/~malcolm/interval/1998-010/ |
MIRtoolbox | A MATLAB library that deals with the extraction of musical features such as tonality, rhythm, etc., https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/mirtoolbox |
VOICEBOX | A speech processing toolbox for MATLAB, http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html |
MA Toolbox | A MATLAB toolbox for music analysis, http://www.pampalk.at/ma/ |
Furthermore, in this book we presented a number of pattern recognition methods that reside in MATLAB toolboxes, e.g. support vector machines, decision trees, etc. However, plenty of related libraries are also available on the Web. In Table B.2 we present a short list of MATLAB libraries related to pattern recognition and machine learning techniques.
Table B.2. MATLAB Libraries: Pattern Recognition and Machine Learning
Name | Description |
Pattern Recognition Toolbox (PRT) for MATLAB | Wide range of pattern recognition techniques (classification, clustering, outlier removal, regression, etc.). MIT License. Parts of the PRT may require a C/C++ compiler to be installed, http://www.newfolderconsulting.com/prt/download |
LibSVM | Integrated software for support vector classification, regression, and pdf estimation. It supports multi-class classification. It is not written in MATLAB; however, it provides interfaces for other languages and packages (MATLAB, Python, Weka, R, Octave, Java, etc.), [146], http://www.csie.ntu.edu.tw/~cjlin/libsvm/ |
Hidden Markov Model (HMM) Toolbox for Matlab, MIT | Supports inference and learning for HMMs that emit discrete observations or continuous observations based on Gaussian pdfs and mixtures of Gaussian pdfs, http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html |
Pattern Recognition with Matlab—Online companion material of the book [40] | Consists of a set of functions that cover various stages of the design of a pattern recognition system. |
Python is a high-level programming language that has been attracting increasing interest in the scientific community during the past few years. A wide range of available packages can be used to obtain MATLAB-like functionality in Python. The main reason that Python is an attractive programming language for signal processing applications is that it balances high-level and low-level programming features. In particular, one can write algorithms in fewer lines of code than in C/C++, while the initial problem of low speed is partly solved by applying optimized operations to higher-level objects. For example, it is possible to speed up the execution of Python code by vectorizing the respective algorithm, i.e. by replacing explicit loops with operations on whole arrays. Another great advantage of Python is the impressive number of libraries that provide functionality related to scientific programming.
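As a rough illustration of the vectorization point above, the following sketch computes the short-term energy of a signal in two ways: with explicit Python loops and with a single vectorized NumPy expression. The function names, the frame length, and the test signal are assumptions chosen for illustration only; they are not taken from any of the libraries listed in this appendix.

```python
import numpy as np


def short_term_energy_loop(signal, frame_len):
    """Mean energy per non-overlapping frame, using explicit Python loops."""
    n_frames = len(signal) // frame_len
    energies = []
    for i in range(n_frames):
        total = 0.0
        for j in range(frame_len):
            sample = signal[i * frame_len + j]
            total += sample * sample
        energies.append(total / frame_len)
    return np.array(energies)


def short_term_energy_vectorized(signal, frame_len):
    """Same quantity: reshape into a (frames x samples) matrix and reduce rows."""
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    return np.mean(frames ** 2, axis=1)


# Hypothetical test signal: one second of a 440 Hz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)

e_loop = short_term_energy_loop(x, frame_len=400)
e_vec = short_term_energy_vectorized(x, frame_len=400)
```

Both functions should return the same values. The vectorized version typically runs much faster, because the per-sample arithmetic is carried out inside NumPy's compiled routines rather than in the Python interpreter.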
Unlike MATLAB, Python is non-commercial and emphasizes portability. MATLAB does provide the MATLAB Component Runtime for deploying applications, but this can be a more complicated solution compared to Python’s cross-platform nature. On the other hand, MATLAB is easier to learn than Python (especially for non-programmers), and it provides an easy-to-use integrated development environment (IDE) and easier plotting functionality. In addition, MATLAB is common in almost every scientific field. Therefore, there exist plenty of MATLAB-compatible software resources that address the needs of a large scientific community.
Table B.3 presents a list of Python packages and libraries that can be used for the development of audio analysis applications.
Table B.3. Python Packages and Libraries that can be Used for Audio Analysis and Pattern Recognition Applications
Name | Description |
NumPy | A library for scientific computing, with convenient and efficient N-dimensional array manipulation. http://www.numpy.org/ |
SciPy | A library for mathematics, science, and engineering, based on NumPy. Also used for the I/O of WAVE files. http://www.scipy.org/ |
MLPy | A library for machine learning, built on top of NumPy and SciPy. http://mlpy.sourceforge.net/ |
matplotlib | A library for 2-D plotting in Python. http://matplotlib.org/ |
ALSAaudio | A library of wrappers for accessing the ALSA API from Python, used for handling audio input/output. http://pyalsaaudio.sourceforge.net/ |
Yaafe | A Python library for audio feature extraction. http://yaafe.sourceforge.net/ |
Ubuntu users will find most of the Python packages in the Ubuntu repository. For example, NumPy can be installed by simply typing the following command in a terminal: apt-get install python-numpy.
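To give an idea of how the NumPy and SciPy entries of Table B.3 fit together, the sketch below writes a short tone to a WAVE file and reads it back using scipy.io.wavfile. The tone parameters, the temporary file name, and the normalization step are assumptions made for the sake of a self-contained example; in practice one would read an existing recording instead.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Hypothetical signal: 0.5 s of a 440 Hz tone at 16 kHz, stored as 16-bit PCM.
fs = 16000
t = np.arange(int(0.5 * fs)) / fs
tone = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

# Write to a temporary file so the example does not rely on existing audio.
wav_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
wavfile.write(wav_path, fs, tone)

# Read it back: the call returns the sampling rate and a NumPy array of samples.
fs_read, samples = wavfile.read(wav_path)

# Scale the 16-bit integers to floating point values in [-1, 1).
x = samples.astype(np.float64) / 32768.0
duration = len(x) / fs_read  # signal length in seconds
```

Once the samples are available as a NumPy array, they can be passed directly to feature extraction or plotting code, for example with matplotlib.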
C and C++ are well-established programming languages for computationally demanding signal processing applications. For real-time signal processing applications that analyze voluminous data, C/C++ is usually the most suitable solution. Of course, both languages demand more experienced programmers and entail additional development costs compared to the Python or MATLAB approaches. Nevertheless, C/C++ have been widely used in signal processing and machine learning applications during the last 20 years. In Table B.4 we present a list of representative C/C++ packages and libraries that can be used for building audio analysis applications.
Table B.4. Representative Audio Analysis and Pattern Recognition Libraries and Packages Written in C/C++
Name | Description |
CLAM (C++ Library for Audio and Music) | A framework for research/development in the audio and music domain. Provides the means to perform complex audio signal analysis, transformations, and synthesis. Can be used as a library and a graphical tool. http://clam-project.org/ |
MARSYAS (Music Analysis, Retrieval, and Synthesis for Audio Signals) | An open-source library for audio analysis, mostly focused on Music Information Retrieval [147,148]. In existence since 1998, it has been used for a variety of academic and industrial projects. Written in C++, but also ported to Java. Can also be installed with Python bindings. http://marsyas.info/ |
aubio | A tool written in C for basic audio analysis: pitch tracking, onset detection, extraction of MFCCs, beat and meter tracking, etc. Provides wrappers for Python. http://aubio.org/ |
Maaate | A C++ toolkit to parse and analyze audio data in the compressed/frequency domain. http://maaate.sourceforge.net/ |
Synthesis ToolKit in C++ (STK) | Audio signal processing and algorithmic synthesis methods written in C++. Focuses on music synthesis functionality. https://ccrma.stanford.edu/software/stk/ |
Vamp plugin system | A set of audio analysis plugins. Provides a C++ API but also a Python wrapper and an interface that permits Java applications to run native Vamp plugins. For example, a Vamp plugin can be used inside an audio editor (e.g. Audacity) to enhance the visualization of information. http://www.vamp-plugins.org/ |
dlib | A cross-platform C++ library that covers a wide range of machine learning, image processing, linear algebra, and general-purpose algorithms [149]. Licensed under the Boost Software License. http://dlib.net/ |