Cepstrum (complex)

Overview

This analyser performs cepstral analysis on each window of the audio file. The complex cepstrum is the inverse Fourier transform of the complex logarithm of the Fourier transform of a signal. It is closely related to auto-correlation, and can be used for pitch analysis and echo detection. Cepstral analysis is widely used in voice analysis because it provides a simple way to separate the formants (due to filtering in the vocal tract) from the vocal source.
Unlike the real cepstrum, the complex cepstrum is invertible, which makes liftering possible (filtering in the cepstrum domain before transforming back to the spectrum). Hence this analyser outputs either the complex cepstrum values or the liftered spectrum. It may be advisable to examine the cepstrum results before selecting the lifter parameters.
Cepstrum output is given as a function of ‘quefrency’, which is very similar to the lag time of auto-correlation. The output level is ‘gamnitude’ (given in dB of dB). Calibration of the cepstrum is not particularly meaningful because of the non-linear nature of the transformation, but the same calibration offset as the FFT Spectrum analyser is used. This makes the liftered spectrum values similar to (but not the same as) spectral values from the FFT Spectrum analyser. Note that gamnitude is real rather than complex, and takes both positive and negative values, so an alternative would be to avoid the dB transformation; the disadvantage of this is the large range of values in typical analyses.
The complex cepstrum analyser is implemented using MATLAB’s cceps(x) function.
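
The definition above (inverse transform of the complex logarithm of the transform) can be illustrated with a minimal Python sketch. This is not the analyser's code: the function names are hypothetical, a naive DFT is used so that only the standard library is needed, and the linear-phase (delay) correction that implementations such as cceps typically apply is omitted, so it is only suitable for short, well-behaved test signals.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out

def unwrap(phases):
    """Remove 2*pi jumps between successive phase values."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        d -= 2 * math.pi * round(d / (2 * math.pi))
        out.append(out[-1] + d)
    return out

def complex_cepstrum(x):
    """Inverse DFT of log|X| + j*unwrapped_phase(X).

    Simplification: no linear-phase removal, and the spectrum must
    contain no zeros (log(0) is undefined).
    """
    spectrum = dft(x)
    log_mag = [math.log(abs(v)) for v in spectrum]
    phase = unwrap([cmath.phase(v) for v in spectrum])
    log_spectrum = [complex(m, p) for m, p in zip(log_mag, phase)]
    # For a real input the result is real; discard rounding residue.
    return [v.real for v in dft(log_spectrum, inverse=True)]

# A direct sound followed one sample later by a half-amplitude echo.
signal = [1.0, 0.5] + [0.0] * 62
cepstrum = complex_cepstrum(signal)
```

For this test signal the cepstrum shows a value close to 0.5 at the quefrency of the echo delay (index 1), followed by smaller alternating-sign terms at its multiples, which is the behaviour that makes the cepstrum useful for echo detection.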

User Controls

Overlap

This sets the overlap of windows in terms of percentage, milliseconds, seconds or number of samples.
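
As an illustration of how the four overlap units relate, the following Python sketch converts an overlap setting to samples and derives the hop between successive windows. The function names and unit strings are hypothetical and do not come from the analyser; rounding to the nearest sample is also an assumption.

```python
def overlap_in_samples(value, unit, window_size, fs):
    """Convert an overlap setting to a whole number of samples."""
    if unit == "percent":
        samples = window_size * value / 100.0
    elif unit == "ms":
        samples = fs * value / 1000.0
    elif unit == "s":
        samples = fs * value
    elif unit == "samples":
        samples = value
    else:
        raise ValueError("unknown overlap unit: " + unit)
    samples = int(round(samples))
    if not 0 <= samples < window_size:
        raise ValueError("overlap must be non-negative and smaller than the window")
    return samples

def hop_size(window_size, overlap_samples):
    """Number of samples between the starts of successive windows."""
    return window_size - overlap_samples

# 75 % overlap of a 1024-sample window at 44.1 kHz.
overlap = overlap_in_samples(75, "percent", 1024, 44100)
hop = hop_size(1024, overlap)
```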

Window Size

This sets the size of the window in samples, which must be a power of 2. Values from 2^7 (128 samples) to 2^20 (1048576 samples) are supported (corresponding to window durations of 2.9 ms to 23.8 s for a sampling rate of 44.1 kHz). Longer window durations could be achieved by downsampling the waveform prior to analysis.
A larger window size increases the frequency resolution, but reduces the time resolution. A larger window size is computationally more efficient, but requires greater memory. A non-rectangular windowing function could be thought of as reducing the effective window length.
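
The window durations quoted above, and the trade-off between time and frequency resolution, follow from simple arithmetic. A short Python sketch (function names are hypothetical):

```python
def window_duration_s(window_size, fs):
    """Duration of one analysis window in seconds."""
    return window_size / fs

def bin_spacing_hz(window_size, fs):
    """Spacing between adjacent FFT frequency bins in hertz."""
    return fs / window_size

fs = 44100
shortest = window_duration_s(2 ** 7, fs)    # about 2.9 ms
longest = window_duration_s(2 ** 20, fs)    # about 23.8 s
coarse = bin_spacing_hz(2 ** 7, fs)         # wide bins, fine time resolution
fine = bin_spacing_hz(2 ** 20, fs)          # narrow bins, coarse time resolution
```

Doubling the window size doubles the window duration and halves the bin spacing.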

Windowing Function

This selects the windowing function to be applied to the wave prior to the analysis. Windowing functions provide a ‘fade-in’ and ‘fade-out’ of the windowed waveform, which is helpful in the analysis of arbitrary waves. Select the rectangular window for no effect.
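
The ‘fade-in’ and ‘fade-out’ effect can be seen in a minimal Python sketch using a Hann window. The Hann window is used here only as a familiar example; which windowing functions the analyser offers is not stated on this page, and the function names are hypothetical.

```python
import math

def hann_window(n):
    """Symmetric Hann window: rises from 0 to 1 and falls back to 0."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def apply_window(samples, window):
    """Multiply each sample by the corresponding window value."""
    return [s * w for s, w in zip(samples, window)]

window = hann_window(9)
# A constant signal is faded in and out by the window.
faded = apply_window([1.0] * 9, window)
# A rectangular window (all ones) leaves the signal unchanged.
unchanged = apply_window([1.0] * 9, [1.0] * 9)
```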

Liftered Spectrum

Selecting this option will allow you to choose the cutoff quefrencies of a rectangular lifter, and to obtain output in the spectrum rather than cepstrum domain. One use for this is to separate the envelope of the sound (which is seen as a steeply descending peak around a quefrency of 0 s) from harmonic content (which is the content immediately above this, perhaps extending to 5 or 10 ms).
Note that while you can enter any upper liftering value, if it exceeds half the window length it will be automatically reduced to that value (which is the maximum possible).
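
A rectangular lifter with the clamping described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the analyser's code: the function names are hypothetical, cutoffs are rounded to the nearest sample, and only the non-negative quefrency indices of the cepstrum are considered.

```python
def lifter_band(window_size, fs, low_s, high_s):
    """Convert cutoff quefrencies (seconds) to sample indices.

    The upper cutoff is reduced to half the window length if it exceeds it.
    """
    low_idx = int(round(low_s * fs))
    high_idx = min(int(round(high_s * fs)), window_size // 2)
    return low_idx, high_idx

def apply_rectangular_lifter(cepstrum, low_idx, high_idx):
    """Keep cepstrum values inside [low_idx, high_idx]; zero the rest."""
    return [c if low_idx <= n <= high_idx else 0.0 for n, c in enumerate(cepstrum)]

# A 16-sample window at 1 kHz; the requested upper cutoff of 1 s is far too large.
low_idx, high_idx = lifter_band(16, 1000, 0.002, 1.0)
liftered = apply_rectangular_lifter([1.0] * 16, low_idx, high_idx)
```

Here the upper cutoff is reduced to index 8, half of the 16-sample window.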

lifter0-2ms.jpg

Example of a liftered spectrogram (quefrency band of 0-2 ms of a solo tenor singer). This spectrogram emphasises the voice formants.

lifter2-8ms.jpg

Example of a liftered spectrogram (quefrency band of 2-8 ms of a solo tenor singer). This spectrogram emphasises the voice harmonics.

Outputs

Cepstrogram

This is the magnitude of the cepstrum (expressed in decibels) as a function of quefrency and time. The actual units are decibels of decibels.

Average Power Cepstrum

This is the power average (over time) of the cepstra as a function of quefrency, expressed in decibels.

Cepstral Moments

Standardized and non-standardized moments of the power cepstrum are given as time series.

Liftered Spectrogram

This is the magnitude of the liftered spectrum (expressed in decibels) as a function of frequency and time.

Average Liftered Power Spectrum

This is the power average (over time) of the liftered spectra as a function of frequency, expressed in decibels.
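
A power average of decibel values is taken in the power domain rather than by averaging the decibel values directly. A minimal Python sketch, assuming each frame is a list of levels in dB (one per frequency bin) and using a hypothetical function name:

```python
import math

def power_average_db(frames_db):
    """Average spectra over time in the power domain; return dB per bin."""
    n_frames = len(frames_db)
    n_bins = len(frames_db[0])
    averaged = []
    for b in range(n_bins):
        # Convert dB to power, average across frames, convert back to dB.
        mean_power = sum(10.0 ** (frame[b] / 10.0) for frame in frames_db) / n_frames
        averaged.append(10.0 * math.log10(mean_power))
    return averaged

# Two frames, two frequency bins.
average = power_average_db([[0.0, 10.0], [0.0, 20.0]])
```

In the second bin the result is higher than the arithmetic mean of 10 dB and 20 dB, because the louder frame dominates a power average.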

Level

This is the power sum level of the liftered power spectrum.
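
A power sum level adds the power in every frequency bin and expresses the total in decibels. A minimal Python sketch, assuming one frame of bin levels in dB and a hypothetical function name (calibration offsets are ignored):

```python
import math

def power_sum_level_db(spectrum_db):
    """Sum the power across all bins of one spectrum; return the level in dB."""
    total_power = sum(10.0 ** (v / 10.0) for v in spectrum_db)
    return 10.0 * math.log10(total_power)

# Two bins of equal level sum to about 3 dB above either one.
level = power_sum_level_db([0.0, 0.0])
```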

Liftered Spectral Moments

Standardized and non-standardized moments of the liftered power spectrum are given as time series. The most commonly used of these is the spectral centroid.
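
The spectral centroid is the power-weighted mean frequency, and higher moments are defined relative to it. The Python sketch below shows one common set of definitions; the exact moments and normalisation the analyser reports are not stated on this page, so the choice of spread, skewness and kurtosis here is an assumption, as is the function name. Power values must be linear, not in decibels.

```python
import math

def spectral_moments(freqs_hz, power):
    """Centroid, spread, skewness and kurtosis of a power spectrum."""
    total = sum(power)
    centroid = sum(f * p for f, p in zip(freqs_hz, power)) / total

    def central_moment(order):
        # Power-weighted moment about the centroid.
        return sum(((f - centroid) ** order) * p for f, p in zip(freqs_hz, power)) / total

    spread = math.sqrt(central_moment(2))
    return {
        "centroid": centroid,                             # non-standardized, in Hz
        "spread": spread,                                 # non-standardized, in Hz
        "skewness": central_moment(3) / spread ** 3,      # standardized
        "kurtosis": central_moment(4) / spread ** 4,      # standardized
    }

# A symmetric three-bin spectrum centred on 200 Hz.
moments = spectral_moments([100.0, 200.0, 300.0], [1.0, 2.0, 1.0])
```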

Verification

No verification yet.

Code Authors

This analyser was written by the PsySound3 team.

Key References

Digital signal processing and spectral analysis books often have information on cepstral analysis. For example:
Oppenheim, A.V., and R.W. Schafer. Discrete-Time Signal Processing. Englewood Cliffs, NJ: Prentice Hall, 1989.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License