Background: The throughput of electrophysiological recording is growing rapidly, allowing thousands of simultaneous channels, and there is a growing variety of spike sorting algorithms designed to extract neural firing events from such data. This creates an urgent need for standardized, automatic evaluation of the quality of neural units output by such algorithms. New method: We introduce a suite of validation metrics that assess the credibility of a given automatic spike sorting algorithm applied to a given dataset. By rerunning the spike sorter two or more times, the metrics measure stability under various perturbations consistent with variations in the data itself, making no assumptions about the internal workings of the algorithm, and minimal assumptions about the noise. Results: We illustrate the new metrics on standard sorting algorithms applied to both in vivo and ex vivo recordings, including a time series with overlapping spikes. We compare the metrics to existing quality measures, and to ground-truth accuracy in simulated time series. We provide a software implementation. Comparison with existing methods: Metrics have until now relied on ground-truth, simulated data, internal algorithm variables (e.g. cluster separation), or refractory violations. By contrast, by standardizing the interface, our metrics assess the reliability of any automatic algorithm without reference to internal variables (e.g. feature space) or physiological criteria. Conclusions: Stability is a prerequisite for reproducibility of results. Such metrics could reduce the significant human labor currently spent on validation, and should form an essential part of large-scale automated spike sorting and systematic benchmarking of algorithms.
- Spike sorting
ASJC Scopus subject areas