SWIPE Pitch Estimation

Overview

This analyser implements the SWIPE pitch estimation algorithm of Arturo Camacho. The following text describes the algorithm in his words (based on a post to the Auditory List).

The algorithm is similar to autocorrelation, in the sense that it performs an integral transform of the spectrum using a cosine as kernel (recall the Wiener-Khinchin Theorem). However, instead of using the square of the magnitude of the spectrum, it uses its square root. Also, it introduces some modifications to the cosine kernel to avoid some of the problems of autocorrelation. First, it zeroes the first quarter of the first cycle of the cosine to avoid the maximum that autocorrelation has at zero lag. Second, it multiplies the kernel by an envelope that decays as 1/f to avoid the periodicity of the autocorrelation function for periodic signals. Third, it normalizes the kernel and uses a pitch-dependent window size to make the width of the main spectral lobes match the width of the positive cosine lobes. This last step is done to avoid the tendency that autocorrelation has to give higher values to periodic signals with high F0 than to periodic signals with low F0. It can be shown that the signals that maximize the inner product between the spectrum and the kernel are periodic signals whose spectral envelope decays as 1/f (e.g., sawtooth waveforms).
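
As a rough illustration of the kernel described above, the following Python sketch builds a cosine kernel with the first quarter cycle zeroed, a 1/f envelope and unit-norm scaling, then scores a synthetic sawtooth-like spectrum. It is a simplified sketch and not Camacho's Matlab code: the function names, the uniform 1 Hz frequency grid, the idealised line spectrum and the short candidate list are all assumptions made for the example, and the pitch-dependent window size is left out.

```python
import math

def swipe_like_kernel(freqs_hz, f0_hz):
    """Unit-norm cosine kernel for one pitch candidate (illustrative only)."""
    kernel = []
    for f in freqs_hz:
        q = f / f0_hz  # frequency measured in multiples of the candidate pitch
        if q < 0.25:
            kernel.append(0.0)  # zero the first quarter cycle of the cosine
        else:
            kernel.append(math.cos(2.0 * math.pi * q) / q)  # cosine with 1/f decay
    norm = math.sqrt(sum(k * k for k in kernel))
    return [k / norm for k in kernel]

def pitch_strength(magnitude, freqs_hz, f0_hz):
    """Inner product of the kernel with the square root of the magnitude spectrum."""
    kernel = swipe_like_kernel(freqs_hz, f0_hz)
    return sum(k * math.sqrt(m) for k, m in zip(kernel, magnitude))

# Idealised sawtooth-like spectrum: harmonics of 200 Hz with amplitudes 1/k.
freqs_hz = [float(f) for f in range(0, 2001)]  # assumed 1 Hz grid up to 2 kHz
magnitude = [0.0] * len(freqs_hz)
for k in range(1, 11):
    magnitude[200 * k] = 1.0 / k

candidates_hz = [100.0, 150.0, 200.0, 300.0, 400.0]  # hypothetical candidate list
strengths = {f0: pitch_strength(magnitude, freqs_hz, f0) for f0 in candidates_hz}
best_f0 = max(strengths, key=strengths.get)
print(best_f0, round(strengths[best_f0], 3))
```

In this toy setup the 200 Hz candidate scores highest. The octave-above candidate (400 Hz) scores negative because its negative cosine lobes fall on the odd harmonics, and the subharmonic candidate (100 Hz) scores lower because of the decay envelope and the normalization.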

User Settings

Currently there are no user settings for this analyser - the default input values are used:

  • Pitch limits are 30 Hz - 5 kHz
  • Time step is 10 ms
  • Pitch search resolution 1/96 semitone
  • No minimum pitch strength threshold
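
To make the default settings above more concrete, the sketch below lays out an analysis grid in Python: frame times spaced by the 10 ms time step, and pitch candidates spaced logarithmically between the 30 Hz and 5 kHz limits. The function names are hypothetical. The candidate spacing is left as a parameter and set to a deliberately coarse one-semitone step for readability, which is not the analyser's default resolution.

```python
import math

def frame_times(duration_s, step_s=0.010):
    """Analysis times from 0 to duration_s, one every step_s seconds."""
    n = int(math.floor(duration_s / step_s + 1e-9)) + 1
    return [i * step_s for i in range(n)]

def pitch_candidates(f_min_hz=30.0, f_max_hz=5000.0, step_octaves=1.0 / 12.0):
    """Log-spaced pitch candidates; step_octaves here is illustrative only."""
    span_octaves = math.log2(f_max_hz / f_min_hz)
    n = int(math.floor(span_octaves / step_octaves + 1e-9)) + 1
    return [f_min_hz * 2.0 ** (i * step_octaves) for i in range(n)]

times = frame_times(1.0)         # one second of audio
candidates = pitch_candidates()  # coarse grid between the pitch limits
print(len(times), len(candidates))
```

A finer candidate spacing increases the number of candidates in inverse proportion to the step, so it also increases the cost of the search.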

Outputs

  • Pitch (in Hz) time series
  • Pitch strength time series
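
Because no minimum pitch strength threshold is applied, every frame receives a pitch value, including frames where the pitch strength is low. The Python sketch below shows one way a threshold could be applied to the two output series after analysis. The function name, the sample values and the use of NaN to mark rejected frames are assumptions for the example and are not part of PsySound3.

```python
import math

def apply_strength_threshold(pitch_hz, strength, threshold=-math.inf):
    """Replace pitch values with NaN where the pitch strength is below threshold."""
    if len(pitch_hz) != len(strength):
        raise ValueError("pitch and strength series must have the same length")
    return [p if s >= threshold else math.nan
            for p, s in zip(pitch_hz, strength)]

# Hypothetical output series for three frames.
pitch_hz = [220.0, 221.0, 95.0]
strength = [0.80, 0.75, 0.10]

unfiltered = apply_strength_threshold(pitch_hz, strength)     # default: keep everything
filtered = apply_strength_threshold(pitch_hz, strength, 0.5)  # drop weak frames
print(unfiltered, filtered)
```

With the default threshold of negative infinity the pitch series is returned unchanged, which matches the behaviour described under User Settings.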

Key Reference

A. Camacho, SWIPE: A Sawtooth Waveform Inspired Pitch Estimator for Speech and Music, PhD dissertation, University of Florida, 2007.
The dissertation is available online. One of its appendices contains a Matlab implementation of the algorithm, which is used in the PsySound3 code without modification.

Usage Note

Processing time estimation does not currently work with this analyser, because the PsySound3 process method is bypassed so that the original code can be used without modification.
