Spike sorting

Post-publication activity

Curator: Rodrigo Quian Quiroga

Figure 1: Extracellular recordings are usually done by inserting microwires in the brain. The signal from the microwire is amplified and band-pass filtered and the firing of the nearby neurons appears as spikes on top of background activity. Spikes are detected using an amplitude threshold and then sorted according to their shapes. For neurons close to the electrode tip -about 50 to 100 microns (Gerstein and Clark, 1964; Buzsaki, 2004)- the signal-to-noise ratio is good enough to distinguish the activity of each single unit (inner circle). For more distant neurons (outer circle) -up to about 150 microns- spikes can be detected but the difference in their shapes is masked by the noise (multi-unit activity). Spikes from neurons further apart cannot be detected and they contribute to the background noise activity.

Spike sorting is the grouping of spikes into clusters based on the similarity of their shapes. Given that, in principle, each neuron tends to fire spikes of a particular shape, the resulting clusters correspond to the activity of different putative neurons. The end result of spike sorting is the determination of which spike corresponds to which of these neurons.

Single cell recordings

A large amount of research in Neuroscience is based on the study of the activity of neurons recorded extracellularly with very thin electrodes implanted in animals’ brains. These microwires ‘listen’ to a few neurons close-by the electrode tip that fire action potentials or ‘spikes’. Each neuron has spikes of a characteristic shape, which is mainly determined by the morphology of their dendritic trees and the distance and orientation relative to the recording electrode (Gold et al., 2006).

Spike sorting –i.e. the classification of which spike corresponds to which neuron- is a very challenging problem. Then, before tackling mathematical details and technical issues, it is important to discuss why we need to do such a job, rather than just detecting the spikes for each channel without caring from which neuron they come.

It is already well established that complex brain processes are reflected by the activity of large neural populations and that the study of single-cells in isolation gives only a very limited view of the whole picture (Brown et al., 2004; Buzsaki, 2004). Therefore, progress in Neuroscience relies to a large extent on the ability to record simultaneously from large populations of cells. The implementation of optimal spike sorting algorithms is a critical step forwards in this direction, since it can allow the analysis of the activity of a few close-by neurons from each recording electrode. This opens a whole spectrum of new possibilities. For example, it is possible to study connectivity patterns of close-by neurons (Buzsaki, 2004; Harris, 2005), or to study the topographical organization of a given area and discriminate the responses of nearby units (Quian Quiroga et al., 2005; Quian Quiroga et al., 2007). It is also possible to have access to the activity of sparsely firing neurons, whose responses may be completely masked by other neurons with high firing rates, if not properly sorted. Separating sparsely firing neurons from a background of large multi-unit activity is not an easy job, but this type of neurons can show striking responses (Hahnloser et al., 2002; Perez-Orive et al., 2002; Quian Quiroga et al., 2005).

This is not aimed to be a comprehensive review of spike sorting. For such reviews the reader is referred to (Abeles, 1977; Schmidt, 1984; Lewicki, 1998). In the following the basic methods for spike sorting and the main steps of most of these algorithms are described.

Basic spike sorting methods

In principle, the easiest way to separate spikes corresponding to different neurons is to use an amplitude discriminator. This classification can be very fast and simple to implement on-line. However, sometimes spikes from different neurons may have the same peak amplitude but different shapes, as shown in the example of Figure 2. Then, a relatively straightforward improvement is to use ‘windows discriminators’, which assign the spikes crossing one or several windows to the same neuron. This method is implemented in commercial acquisition systems and it is still one of the most preferred ways to do spike sorting. Window discriminators can be implemented on-line, but have the main disadvantage that they require a manual setting of the windows by the user, which may need readjustment during the experiment. For this reason it is in practice not possible to sort spikes of more than a few channels simultaneously with window discriminators1. Another major drawback of this approach is that in many cases spike shapes overlap and it is very difficult to set up windows that will discriminate them. This, of course, introduces a lot of subjectivity in the clustering procedure. Moreover, it is possible that sparsely firing neurons may be missed, especially if the particular input (or a particular behavior) that elicits the firing of the neuron is not present while the windows are set.

Figure 2: Window discriminators. Spike shapes are plotted superimposed and they are clustered using time-amplitude windows manually set by the user. In this case, the two red windows determine the cluster of spikes in red and the blue windows determine the cluster of spikes in blue.

Another simple strategy to do spike sorting is to select a characteristic spike shape for each cluster (e.g. the red and the blue cluster of spikes in Figure 2) and then assign the remaining spikes using template matching. This method was pioneered by Gerstein and Clark (Gerstein and Clark, 1964), who implemented an algorithm in which the user selects the templates and the spikes are assigned based on a mean square distance metric. This procedure can be also implemented on-line, but, as the window discriminator, it has a few drawbacks. First, it requires user intervention, which makes it not practical for large number of channels. Second, the templates may have to be adjusted during the experiment2. Third, when the spike shapes overlap it may not be straightforward to choose the templates or to decide how many templates should be taken. Fourth, as for the window discriminator, it may miss sparsely firing neurons.

Current acquisition systems allow the simultaneous recording of up to hundreds of channels simultaneously. This opens the fascinating opportunity to study large cell populations to understand how they encode sensory processing and behavior. The reliability of these data critically depends on accurately identifying the activity of the individual neurons with spike sorting. To deal with large number of channels, supervised methods as the ones described in this section are highly time consuming, subjective, and nearly impossible to be used in the course of an experiment. It is therefore clear that there is a need for new methods to deal with recordings from multiple electrodes. Surprisingly, the development of such methods has been lagging far behind the capabilities of current hardware acquisition systems. There are three main characteristics that these methods should have: i) They should give significant improvements (in terms of reliability or automatization) over a simple window discriminator or a template matching algorithm. Otherwise, their use seems not justified. ii) They should be unsupervised or at least they should offer a hybrid approach where most of the calculations are done automatically and the user intervention may be required only for a final step of manual supervision or validation. iii) They should be fast enough to give results in a reasonable time and, eventually, it should be possible to implement them on-line.

A few algorithms have been proposed to cope with these requirements. Besides differences in their implementations, most of them follow the same processing steps, as detailed in the next section.

Spike sorting step by step

In general, spike sorting algorithms have four major steps and each of these steps should be carefully considered since its implementation can dramatically change the results. Figure 3 summarizes the basic procedure, starting with the continuous data and finishing with the classified spike shapes.

Figure 3: Basic steps for spike sorting. Step i) The continuous raw data is band-pass filtered, e.g. between 300 Hz and 3000 Hz. Step ii) Spikes are detected, usually using an amplitude threshold. Step iii) Relevant features of the spike shapes are extracted, thus giving a dimensionality reduction. Step iv) These features are the input of a clustering algorithm that performs the classification. Adapted from Quian Quiroga et al. (2004).

Step i) Filtering

Figure 4: Examples of filtering distortions. Note the appearance of a spurious rebound in the spike shapes with causal filtering4. Filtering distortions can also change the shape of the artifacts and make them look similar to real spikes.

The first step when processing continuously recorded data is to apply a band pass filter in order to avoid low frequency activity and visualize the spikes. This step is usually overlooked in the literature, but its implementation can dramatically change the spike shapes. In the example of Figure 3, the continuous data was filtered with a noncausal band pass filter between 300 and 3000 Hz. Frequencies below 300 Hz are filtered to delete the slow components of the raw data seen in the top of Figure 3. The upper cutoff frequency of the filter is to diminish the noisy appearance of the spike shapes. As it is usually the case with filtering, a compromise has to be taken. On the one hand, one would like to have a narrow filter band to better visualize the spikes, but on the other hand, if the band is too narrow, the filter may hinder different features of the spike shapes. Another important issue is that the filter should preferably be noncausal. This can be easily implemented off-line, for example, using the matlab command ‘filtfilt’, which filters the signal in the forward and reverse directions. Causal filters (any recursive or IIR filter) usually introduce phase distortions which may seriously change the spike shapes, as shown in Figure 4. Many acquisition systems already do the filtering on-line by hardware using recursive (i.e. causal) filters, such as Butterworth3. Then, the user gets only the filtered data or the already distorted spikes after setting manually an amplitude threshold for spike detection.

Step ii) Spike Detection

Figure 5: Estimation of noise level used for determining the amplitude threshold. Note how the conventional estimation based on the standard deviation of the signal increases with the firing rate, whereas the improved estimation from equation (1) remains close to the real value. Adapted from Quian Quiroga et al. (2004).

From the filtered data, spikes are usually detected using an amplitude threshold. The choice of the threshold is a compromise between: i) missing spikes if a high threshold is used (Type II error) and ii) getting false positives due to noise crossing a low threshold (Type I error). An adequate threshold can be set manually, as done in most systems with on-line spike detection. However, an automatic threshold is preferable, especially when processing large number of channels. This can be set, for example, as a multiple of the standard deviation of the signal (Pouzat et al., 2002). However, taking the standard deviation of the signal (including the spikes) could lead to very high threshold values, especially in cases with high firing rates and large spike amplitudes. To overcome this limitation it has been proposed to use a threshold (Thr) automatically set to Quian Quiroga et al. (2004):

$\tag{1} Thr = 5 \sigma_n ~ ~ ~ ~ ~ ~ \sigma_n = median \{ {\frac{ {|x|} } {0.6745} } \}$

where $$x$$ is the bandpass filtered signal and $$\sigma_n$$ is an estimate of the standard deviation of the background noise5. Note that using the estimation based on the median diminishes the interference of the spikes (under the reasonable assumption that spikes amount to a small fraction of all samples). To illustrate this, Figure 5 shows the case of 10 sec of simulated background noise with unit standard deviation, where -in successive runs- spikes of a simulated neuron with increasing firing rate were added. For the noise alone (i.e. zero firing rate), both the ‘conventional’ estimate using the standard deviation and the improved one using equation (1) are equal. However, as the firing rate increases, the standard deviation of the signal gives an increasingly erroneous estimate of the noise level, whereas the improved estimate of equation (1) remains close to the real value.

To further illustrate this point with real data, in Figure 6 we show a 10 sec recording with a tetrode in a locust6. Channel 4 has nearly no spikes and both estimates give the same threshold value. The situation is very different for Channel 1, which contains high amplitude spikes. Note in this case that the threshold based on the standard deviation of the signal is too high and misses low amplitude spikes, whereas the improved threshold remains locked to the noise level, as expected.

Figure 6: Ten second continuous recordings with a tetrode. Channel 1 contains high amplitude spikes and the ‘standard’ threshold estimation is too high. The improved estimation using equation (1) is locked to the noise level and it remains at an optimal value. Data kindly provided by Ofer Mazor and Gilles Laurent.

Once spikes are detected, they have to be stored for clustering. There are two issues concerning spike storage that need a brief description. The first one is how many datapoints to store. This of course depends on the sampling frequency and ideally one would like to store the whole spike shape; i.e. about 2 ms of data. With a sampling frequency of 30 KHz, this corresponds to 60 datapoints. Some methods for feature extraction, such as wavelets (using a multiresolution decomposition implementation), require that the number of datapoints is a power of 2. In this case, with 30 KHz sampling, 64 datapoints would be optimal. The second issue has to do with the alignment of the spike shapes. Spikes can be aligned to their maximum. But due to insufficient sampling the maximum can be at different points of the spike shape. To avoid such misalignments, which could lead to overclustering, the spike shapes can be oversampled using interpolated waveforms, for example, using cubic splines. Then, the interpolated shapes can be aligned and later decimated to the original sampling rate.

Step iii) Feature extraction

The third step for spike sorting is to extract features of the spike shapes. This step gives a dimensionality reduction, going from a space of dimension m (with m the number of datapoints per spike) to a low dimensional space of a few features. Ideally, one wants to extract those features that best separate the different clusters of spikes and get rid of all the dimensions dominated by noise. This step saves computational time and it is mandatory for some clustering algorithms that can not handle too many inputs in a reasonable time, or that perform a classification by selecting the cluster boundaries by hand, where no more than 2 or 3 features are visible in the same plot. Furthermore, eliminating inputs dominated by noise can certainly improve clustering outcomes. The major challenge is how to select which are the best features!

A first choice would be to take basic characteristics of the spikes, such as their peak (or peak to peak) amplitude, their width and energy (the square of the signal). However, it has been shown that such features are not optimal for differentiating spike shapes in general (Quian Quiroga et al., 2004).

By far, the most used method for feature extraction is to take the first 2 or 3 principal components (Glaser and Marks, 1968; Abeles, 1977), usually containing more than 80% of the energy of the signal. However, principal component analysis (PCA) selects the directions of maximum variance of the data, which are not necessarily the directions of best separation7. In other words, it may well be that the information for separating the clusters is represented in one (or a combination) of principal components with low eigenvalues, which are usually disregarded.

Figure 7: A wavelet coefficient giving a good (bottom) and bad (top) separation of clusters. The best wavelet coefficients can be selected as the ones having the least Gaussian distribution, using a Kolmogorov-Smirnoff test of Normality.

More recently it has been proposed to use wavelets for feature extraction (Letelier and Weber, 2000; Hulata et al., 2002; Quian Quiroga et al., 2004). The wavelet transform gives a time-frequency (or properly speaking time-scale) decomposition of the signal with optimal resolution both in the time and frequency domains. The advantage of using wavelets for feature extraction is that very localized shape differences can be discerned because wavelet coefficients are localized in time. Moreover, the information about the shape of the spikes is distributed in several coefficients, whereas with PCA most of the information about the spike shapes is captured only by the first 3 principal components, which are not necessarily optimal for cluster identification. In agreement with these considerations, a better performance of wavelets in comparison with PCA was shown for several simulated examples generated with different noise levels (Quian Quiroga et al., 2004).

As previously mentioned, a critical step is how to select the features that best separate the different clusters of spike shapes. A wavelet coefficient -or any other feature- that is good for distinguishing different spike shapes should have a multimodal distribution (see Figure 7), unless there is only one cluster. The selection of these features can be done automatically using a Kolmogorov-Smirnov test for Normality (Quian Quiroga et al., 2004). Note that this criterion does not rely on any particular distribution of the data and it rather looks for a deviation from normality as a sign of a multimodal distribution.

Step iv) Clustering

Figure 8: Superparamagnetic clustering correctly classifies most of the points into 3 clusters. Note that the example is very challenging for any clustering algorithm since the clusters have no Gaussian shapes, the centers are outside the clusters, and the distance within a cluster can be larger than the distance between clusters. Adapted from Quian Quiroga et al. (2004).

The fourth and final step of spike sorting is to group spikes with similar features into clusters, corresponding to the different neurons. A straightforward solution is to delimit clusters manually by drawing polygons in 2-dimensional projections of the spike features (Gray et al., 1995). However, besides being a very time-consuming task, manual clustering introduces errors due to the limited dimensionality of the cluster cutting space and due to human biases (Harris et al., 2000). In fact, in many cases clusters overlap and the manual setting of a boundary is very subjective.

More refined solutions have been proposed using Bayesian classification (Lewicki, 1994) and Expectation Maximization procedures (Harris et al., 2000; Pouzat et al., 2002). These approaches assume a Gaussian distribution of the clusters, based on the claim that for a given cluster the spike variability is determined only by additive and Gaussian stationary background noise. Although this assumption may be plausible in many conditions, it has been argued that in general the background noise cannot be represented as a stationary Gaussian random process (Fee et al., 1996a). In fact, there are several experimental conditions that lead to clusters with non-Gaussian shapes. This can be due to: i) electrode drifts during the recording, ii) variation in the spike shape due to bursting, iii) presence of overlapping spikes, iv) correlations between spikes and local field potentials, v) multi-unit activity, vi) non-stationary background noise, vii) misalignments in the spike detection. To overcome the assumption of Gaussian clusters, a hierarchical clustering algorithm has been proposed, which first sorts the data into an overly large number of clusters and then merges these clusters according to spike shape similarities and the statistics of the ISI distributions (Fee et al., 1996b).

Another approach to avoid the assumption of Gaussian distributions is to use clustering algorithms based on nearest neighbours interactions, which basically group together contiguous set of points given that the local density is larger than a certain value. One of such methods, super-paramagnetic clustering (SPC) (Blatt et al., 1996), has been recently introduced for spike sorting (Quian Quiroga et al., 2004). SPC is a stochastic algorithm that does not assume any particular distribution of the data and groups the spikes into clusters as a function of a single parameter, the temperature. In analogy with statistical mechanics, for low temperatures all the data is grouped into a single cluster and for high temperatures the data is split into many clusters with few members each. There is, however, a middle range of temperatures corresponding to the super-paramagnetic regime where the data is split into relatively large size clusters corresponding to the optimal sorting.

Figure 8 illustrates the performance of SPC with a 2-dimensional example in which 2400 2-dimensional points were distributed in 3 different clusters. This is a very challenging problem for most classifiers since the clusters partially overlap, they have a large variance and the centers fall outside the clusters. In particular, the distance between arbitrarily chosen points of the same cluster can be much larger than the distance between points in different clusters. In spite of all these difficulties, SPC correctly identifies the three clusters and only 102 out of 2400 data points (4%) were not classified because they were near the boundaries of the clusters.

Performance evaluation

Figure 9: Wave-Clus. The upper subplot shows the continuous data and the threshold used for spike detection. The lower subplots show the different spike shapes with their corresponding ISI distributions and the value of the first two wavelet coefficients chosen by the algorithm for all spikes. Clearly there are 4 different clusters. The plot of the bottom left corner shows the size of each cluster as a function of the temperature, which is the main parameter that can be varied if the automatic solution is not optimal.
Figure 10: A simulated recording where 3 different spike shapes are inserted at random positions in background noise. Since the time and identity of each spike is known, the performance of spike detection and sorting algorithms can be quantified and compared. Adapted from (Quian Quiroga et al. (2004)).

Real data usually does not offer ground truth of what is right or wrong and it is therefore very difficult and subjective to compare the performance of different spike sorting methods. One remarkable exception is when simultaneous intra- and extra-cellular recordings are performed and the outcome of spike sorting of the extra-cellular recordings can be compared with the exact information given by the intra-cellular data (Wehr et al., 1999; Harris et al., 2000). But, in general, the availability of simultaneous intra- and extra-cellular recordings is very limited.

A typical question faced when doing spike sorting is whether 2 clusters correspond to 2 different neurons or whether they just appear due to over-clustering of a single unit. Figure 9 shows the automatic output of a spike sorting algorithm "Wave-Clus" with real data. Both the code and dataset are available at http://www.le.ac.uk/neuroengineering. Should, for example, the spikes in red be merged with the ones in light blue? In this case there seems to be a difference in the raising slope of the spike, but is there any way to quantify this observation?

There are some indications that may show when 2 different neurons are merged in a single cluster. For example, if this is the case one can have spikes with inter-spike-intervals below the minimum refractory period of a neuron (about 2 or 3 ms). Moreover, it is likely that a non-uniform variance of the average spike shape will be obtained due to the mixing of 2 different clusters (Pouzat et al., 2002). It is also possible to introduce distance measures to quantify the separations between clusters of spikes and eventually decide whether they should be merged or not. However, all these indications are far from an exact criterion and in most cases the final sorting outcome is given by a subjective decision by the user, with all the consequent biases that this implies (Harris et al., 2000).

In general, the best scenario for a quantitative comparison between different spike sorting methods is to use simulated datasets, where the identity of each spike is known. The challenge in this case is to create datasets that mimic as accurately as possible the characteristics of real recordings in terms of noise distributions, spike characteristics, etc. For example, Figure 10 shows a segment of simulated data in which 3 different spike shapes where introduced at random times. The noise was generated by averaging millions of spikes at random times and with random (but relatively smaller) amplitudes, simulating the activity of neurons far away. This dataset, as well as many others with different spike shapes and noise levels, is available from http://www.le.ac.uk/neuroengineering.

Other issues

Tetrodes

Tetrodes are four close-by micro-electrodes that allow the visualization of single neurons from different positions (see Figure 6). Since the amplitude and shape of the spikes depend on the position of the neuron with respect to the electrode, the fact that the neuronal firing is seen from 4 different sites can dramatically improve spike sorting outcomes (Gray et al., 1995; Harris et al., 2000). A single tetrode allows the identification of up to a couple dozen close-by neurons, which is extremely useful, for example, for the study of connectivity patterns in cell assemblies (Harris, 2005). Moreover, tetrodes usually offer a better sorting quality, since ambiguous results from one channel can be disentangled using the information of the other channels.

In terms of spike sorting, the procedure is very similar as the one described for single channels. Spikes of the 4 channels can be concatenated and taken as a single waveform for sorting purposes. Alternatively, features of the spike shapes of the 4 channels can be extracted separately and then combined -eventually with some weighting given that some channels may have better signal than others- to form the inputs to the clustering algorithm.

With tetrodes, a standard way to visualize the clustering quality is to plot the peak amplitude of the spikes in the 4 channels against each other (Gray et al., 1995). These ‘footprints’ can be also used to determine the stability of the recordings across days, which is very difficult to asses with single electrodes.

Overlapping spikes

If two close-by neurons fire in synchrony or with a small delay, then it is possible to get overlapping spikes; i.e. a spike shape generated by the sum of the spikes of the two neurons. It is relatively easy to identify overlapping spikes when a small delay is present due to the appearance of double peaks. Without a delay, overlapping spikes may look like the firing of a third neuron, which is clearly a much more difficult situation.

How to deal with overlapping spikes is one of the most challenging issues in spike sorting. A few solutions have been proposed, the most standard one being to reproduce the shape of overlapping spikes by using a linear superposition of the spike shapes of the other identified neurons. However, this procedure is still far from optimal and new techniques may hopefully show better results in the future.

It should be stressed the importance of dealing accurately with overlapping spikes, since these may give extraordinary information –or biased results- about the functional connectivity in a given area.

Bursting cells

A burst is the firing of a fast sequence of spikes by a neuron. Bursts can have a variable number of spikes, which appear as concatenated and with decreasing amplitude; i.e.: the first spike is larger than the second, which is larger than the third and so on. Burst can be easily identified visually (see bursting) and with the help of inter-spike-interval histograms due to the appearance of a peak at about 2ms (and multiples of it, if the burst is composed of several spikes).

Concerning spike sorting, one should be careful that spikes in a burst are not taken as separate clusters given that the amplitude of the individual spikes usually differ. It is however possible to rule out this case with the help of cross-correlograms.

Non-Gaussian clusters

Several spike sorting algorithms assume a Gaussian distribution of the clusters (see section Clustering ). These algorithms may usually work quite well, but, as mentioned before, there are several technical and physiological issues that make the assumption of Gaussian clusters not plausible in general. This can be basically due to:

• Electrode drifts during the recording
• Variation in the spike shape due to bursting
• Presence of overlapping spikes
• Correlations between spikes and local field potentials
• Multi-unit activity
• Non-stationary background noise
• Misalignments in the spike detection

These features tend to produce elongated -Non-Gaussian- clusters and methods assuming Gaussian distributions will tend to over-cluster by fitting several multivariate Gaussians. One solution to this problem is to use methods that do not assume any distribution of the data, as the one based on nearest neighbor interactions described in Clustering. Alternatively, Gaussian clusters can be merged by the user in a final processing step (Fee et al., 1996b; Harris et al., 2000). Another solution for the particular issue of non-stationary recordings is to cluster relatively small and quasi-stationary segments of data. However, it may be very complicated to establish the equivalence of the clusters in the consecutive segments or to decide when in a given segment a new cluster appears or another one disappears. This is particularly troublesome for sparsely firing neurons that may be active in only few of these segments.

Conclusions

Spike sorting is a very challenging mathematical problem that has attracted the attention of scientists from different fields. It is indeed an interesting problem for researchers working on signal processing, especially those dealing with pattern recognition and machine learning techniques. It is also crucial for neurophysiologists, since an optimal spike sorting can dramatically increase the number of identified neurons and may allow the study of very sparsely firing neurons, which are hard to find with basic sorting approaches.

Given the extraordinary capabilities of current recording systems – allowing the simultaneous recording from dozens or even hundreds of channels - there is an urgent need to develop and optimize methods to deal with the resulting massive amounts of data. The reliable identification of the activity of hundreds of simultaneously recorded neurons will play a major role in future developments in Neuroscience. In this article we gave a brief description of how to tackle the main issues of spike sorting. However, there are still many open problems, like the sorting of overlapping spikes, the identification of bursting cells and of nearly silent neurons, the development of robust and completely unsupervised methods, how to deal with non-stationary conditions, for example, due to drifting of the electrodes, how to quantify the accuracy of spike sorting outcomes, how to automatically distinguish single-units from multi-units, etc. One of the biggest problems for developing optimal spike sorting algorithms is that we usually do not have access to the "ground truth". In other words, we do not have the exact information of how many neurons we are recordings from and which spike correspond to which neuron. The challenge is then to come up with realistic simulations and clever experiments -as the ones described in the previous section- that allow the exact quantification of performance and comparison of different spike sorting methods.

Notes

• 1 Interestingly, some laboratories use to have a few PhD students supervising an experimental session, each of them taking care of a few channels.
• 2 But it is possible to use simple rules to update the templates, for example, by averaging each new classified spike with the template to track slow shifts.
• 3 However, it should in principle be possible to design an acquisition system with a short buffer to do noncausal filtering nearly on-line.
• 4 Interestingly, this looks like a typical spike shape found in many articles and textbooks.
• 5 The use of these thresholds is standard in denoising algorithms based on the wavelet transform (Donoho, 1995).
• 6 This data was kindly provided by Ofer Mazor and Gilles Laurent.
• 7 Consider 2 clusters in the X-Y plane that have their centers at (x=0, y=0) and (x=1, y=0), respectively. Consider also that both clusters have little variance in the x-axis and are stretched in the y-axis. Then, the direction of maximum variance is in the y-axis, but the direction separating both clusters is in the x-axis.

Software and data

A few authors have made their codes and some data available in the web. Here are the links to some of them.

• Wave_Clus by Rodrigo Quian Quiroga. Unsupervised spike detection and sorting using wavelets and superparamagnetic clustering (includes datasets).
• KlustaKwick by Ken Harris. Unsupervised classification of multidimensional continuous data.
• SpikeOmatic by Christophe Pouzat. Spike sorting using a Gaussian Mixture Model or a Dynamic Hidden Markov Model.
• MClust by David Redish. Manual spike sorting.
• OSort by Uli Rutishauser. On-line spike sorting algorithm.

References

• Abeles, M and Goldstein Jr, M (1977) Multispike train analysis. Proceedings of the IEEE 65:762-773.
• Blatt M, Wiseman S, Domany E (1996) Superparamagnetic Clustering of Data. Physical Review Letters 76:3251-3254.
• Brown EN, Kass RE, Mitra PP (2004) Multiple neural spike train data analysis: state-of-the-art and future challenges. Nature Neuroscience 7:456-461.
• Buzsaki G (2004) Large-scale recording of neuronal ensembles. Nature Neuroscience 7:446-451.
• Donoho D (1995) De-noising by soft-thresholding. IEEE Trans. Infrm. Theory 41: 613-627.
• Fee MS, Mitra PP, Kleinfeld D (1996a) Variability of extracellular spike waveforms of cortical neurons. J Neurophysiol 76:3823-3833.
• Fee MS, Mitra PP, Kleinfeld D (1996b) Automatic sorting of multiple unit neuronal signals in the presence of anisotropic and non-Gaussian variability. Journal of Neuroscience Methods 69:175-188.
• Gerstein GL, Clark WA (1964) Simultaneous Studies of Firing Patterns in Several Neurons. Science 143:1325-1327.
• Glaser EM, Marks WB (1968) On-line separation of interleaved neuronal pulse sequences. Data Acquisition Process Biol Med 5:137-156.
• Gold C, Henze DA, Koch C, Buzsaki G (2006) On the Origin of the Extracellular Action Potential Waveform: A Modeling Study. J Neurophysiol 95:3113-3128.
• Gray CM, Maldonado PE, Wilson M, McNaughton B (1995) Tetrodes markedly improve the reliability and yield of multiple single-unit isolation from multi-unit recordings in cat striate cortex. Journal of Neuroscience Methods 63:43-54.
• Hahnloser RHR, Kozhevnikov AA, Fee MS (2002) An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature 419:65-70.
• Harris KD (2005) Neural signatures of cell assembly organization. Nat Rev Neurosci 6:399-407.
• Harris KD, Henze DA, Csicsvari J, Hirase H, Buzsaki G (2000) Accuracy of Tetrode Spike Separation as Determined by Simultaneous Intracellular and Extracellular Measurements. J Neurophysiol 84:401-414.
• Hulata E, Segev R, Ben-Jacob E (2002) A method for spike sorting and detection based on wavelet packets and Shannon's mutual information. Journal of Neuroscience Methods 117:1-12.
• Letelier JC, Weber PP (2000) Spike sorting based on discrete wavelet transform coefficients. Journal of Neuroscience Methods 101:93-106.
• Lewicki M (1994) Bayesian modeling and classification of neural signals. Neural Comp 6:1005-1030.
• Lewicki M (1998) A review of methods for spike sorting: the detection and classification of neural action potentials. Network: Computation in Neural Systems 9:R53-R78.
• Perez-Orive J, Mazor O, Turner GC, Cassenaer S, Wilson RI, Laurent G (2002) Oscillations and Sparsening of Odor Representations in the Mushroom Body. Science 297:359-365.
• Pouzat C, Mazor O, Laurent G (2002) Using noise signature to optimize spike-sorting and to assess neuronal classification quality. Journal of Neuroscience Methods 122:43-57.
• Quian Quiroga R, Nadasdy Z, Ben-Shaul Y (2004) Unsupervised Spike Detection and Sorting with Wavelets and Superparamagnetic Clustering. Neural Comp 16:1661-1687.
• Quian Quiroga R, Reddy L, Kreiman G, Koch C, Fried I (2005) Invariant visual representation by single neurons in the human brain. Nature 435:1102-1107.
• Quian Quiroga R, Reddy, L., Koch, C., and Fried, I. (2007) Decoding visual inputs from multiple neurons in the human temporal lobe. Journal of Neurophysiology 98: 1997-2007.
• Schmidt EM (1984) Computer separation of multi-unit neuroelectric data: a review. Journal of Neuroscience Methods 12:95-111.
• Wehr M, Pezaris J and Sahani M (1999) Simultaneous paired intracellular and tetrode recordings for evaluating the performance of spike sorting algorithms. Neurocomputing 26: 1061-1068.

Internal references

• Jan A. Sanders (2006) Averaging. Scholarpedia, 1(11):1760.
• Valentino Braitenberg (2007) Brain. Scholarpedia, 2(11):2918.
• Eugene M. Izhikevich (2006) Bursting. Scholarpedia, 1(3):1300.
• Rob Schreiber (2007) MATLAB. Scholarpedia, 2(7):2929.
• Philip Holmes and Eric T. Shea-Brown (2006) Stability. Scholarpedia, 1(10):1838.
• Arkady Pikovsky and Michael Rosenblum (2007) Synchronization. Scholarpedia, 2(12):1459.