# Talk:Signal-to-noise ratio in neuroscience

Don Johnson: I think the article is a little skimpy. You almost imply that SNR is useful only if it is related to MI. Why not give guidelines for what high and low SNR values might be? Relation to d' in a detection scenario might be one way of doing this.

More importantly, I don't think the distinction between "across-stimulus" variability and "response" variability works well. For example, a neuron's spontaneous rate may be a constant, but the number of spikes due to background activity varies from trial to trial.

I also don't know exactly how you compute SNR when signal and noise are usually added together; how would you experimentally measure them? You suggest the mean square value divided by the variance, a quantity guaranteed to be greater than one. SNR can clearly be less than one in engineering problems, but not in neuroscience?

Simon: Thanks for all the very pertinent comments, Don, I believe they are very helpful indeed. I had certainly considered including the relationship to d', and will now do so; this should provide some additional insight for neurophysiologists as to what SNR really means (the main point of the article). I will make detailed changes based on the main text over the coming days, but in the meantime would like to comment on one point above:

"More importantly, I don't think the distinction between "across-stimulus" variability and "response" variability works well. For example, a neuron's spontaneous rate may be a constant, but the number of spikes due to background activity varies from trial to trial."

The point I was trying to make here (while trying to be polite) is that a whole line of work seems to have started in which all they do is measure that *average* spontaneous rate, call it the noise, and calculate a signal-to-noise ratio. I fully accept that with a constant spontaneous rate you may have trial-to-trial variability; my point is simply that it is this variability which should be measured to characterise the noise, as opposed to the average (for one average rate there may be many possible variances, and vice versa). I tried to be very careful about how I phrased this, out of a desire not to be polemical, but there are a number of results throughout this literature (starting with Sato, as far as I can tell) where they do this, and these results are being cited elsewhere in the literature.
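This distinction between the average spontaneous rate and the trial-to-trial variability is easy to demonstrate with a toy simulation (my own illustrative sketch, not from the article): two model neurons can have exactly the same mean spontaneous count yet very different noise, so the mean alone cannot characterise the noise.

```python
import numpy as np

rng = np.random.default_rng(0)
rate, n_trials = 10.0, 20000  # hypothetical spontaneous count per trial, and trial count

# Two model neurons with the SAME mean spontaneous count but different variability:
poisson_counts = rng.poisson(rate, n_trials)            # Fano factor ~ 1
regular_counts = rng.binomial(20, rate / 20, n_trials)  # sub-Poisson, Fano factor ~ 0.5

# The mean rate is identical for both (~10)...
print(poisson_counts.mean(), regular_counts.mean())
# ...but the trial-to-trial variance, which is the actual noise, differs (~10 vs ~5):
print(poisson_counts.var(), regular_counts.var())
```

Calling the average rate of either neuron "the noise" would assign them the same noise level, even though one is twice as variable as the other.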

With regard to measuring SNR when signal and noise are added together, the usual way is to obtain the signal power separately by repeating a signal waveform over many trials, averaging out the (assumed i.i.d.) noise, and then taking the power of the average signal waveform. Thus SNR can certainly be less than one. SNR should of course be exactly the same in engineering and in neuroscience problems; the issue is just how to write the problem down correctly. I can see that I have been unclear in several places in the text, and made a couple of errors; I will attempt to fix this.
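A minimal sketch of that measurement procedure (my own toy example, with an assumed weak sinusoidal signal waveform and unit-power Gaussian noise): the trial average recovers the signal, the residuals give the noise power, and the resulting SNR comes out well below one.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_samples = 500, 1000
t = np.arange(n_samples)

signal = 0.3 * np.sin(2 * np.pi * t / 100)   # weak repeated signal waveform (power 0.045)
noise_sd = 1.0                               # i.i.d. Gaussian noise, power 1.0
trials = signal + noise_sd * rng.standard_normal((n_trials, n_samples))

# Averaging over trials suppresses the i.i.d. noise, leaving the signal waveform
signal_est = trials.mean(axis=0)
signal_power = np.mean(signal_est ** 2)

# Noise power from the residuals about the trial average
noise_power = np.mean((trials - signal_est) ** 2)

snr = signal_power / noise_power
print(snr)  # well below one (~0.05), unlike mean-square-over-variance
```

Note that the mean-square-over-variance quantity criticised above could never produce this sub-unity value.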

Next reviewer: I agree that it would be very useful to spell out the relation of SNR to \(d'\) in this context. It would also be good to mention that the SNR values as discussed here all depend on the choice of input signal presented by the experimenter. In other words, they characterize not a property of the system itself, but the combination of system plus external stimulus. (For example, as long as the system is linear to a good approximation, one may define a normalized SNR, where instead of the physical signal power one uses the ratio of signal power to input power. I'm not sure you want to go into all of that detail.)

**Section "Discrete Stimuli"**

I find the introduction of the averaged SNR (starting with "If the latter…") a bit confusing. If you want to keep it you should explicitly mention that \(\rm{P(s)}\) is the probability of each stimulus, as the reader may confuse it with the \(\rm{P_S}\) used earlier. But my main problem is that it breaks the flow, and does not really add much to the discussion.

**Some comments on "Relationship to mutual information":**

I think this section would work better if the order were reversed. In other words, start with the discrete case, where \(I=(1/2) \cdot \rm{log_2}(1+SNR)\) is the information per sample *(not an information rate, as the present version suggests)*. In discussing the discrete case it may be worth mentioning explicitly that the \(I=(1/2) \cdot \rm{log_2}(1+SNR)\) result can be derived directly as the difference between the entropy of the signal-plus-noise distribution with variance
\( \sigma_{tot}^2 = \sigma_s^2 + \sigma_n^2\) (i.e. the total entropy) and that of the noise distribution with variance \( \sigma_n^2 \) (the noise entropy). You could then introduce the notion of an information rate through the toy example of receiving an independent sample every \(\Delta t\) seconds, so that you get \((1/2) \cdot \rm{log_2}(1+SNR)/\Delta t \) bits/second. This then raises the question of what happens when the samples are not strictly independent and discrete. That is where the full Shannon result for Gaussian signal and Gaussian noise comes in *(by the way, both signal and noise should be Gaussian, but neither has to be spectrally white, as the present version suggests for the noise)*. I am not sure how much detail you want here, but one can argue that the power spectrum of a Gaussian process is an ordered list of variances of frequency components. The Shannon result then naturally follows the form for information per symbol (where a "symbol" is now a particular frequency), summed over all independent symbols and normalized to express it as an information rate, as is accomplished by integrating over all frequencies.
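For what it's worth, the entropy-difference argument is a one-liner once you recall that a Gaussian of variance \(\sigma^2\) has differential entropy \(h = \tfrac{1}{2}\log_2(2\pi e\,\sigma^2)\):

```latex
I \;=\; h(\text{signal+noise}) - h(\text{noise})
  \;=\; \tfrac{1}{2}\log_2\!\left(2\pi e\,\sigma_{tot}^2\right)
      - \tfrac{1}{2}\log_2\!\left(2\pi e\,\sigma_{n}^2\right)
  \;=\; \tfrac{1}{2}\log_2\!\frac{\sigma_s^2 + \sigma_n^2}{\sigma_n^2}
  \;=\; \tfrac{1}{2}\log_2\!\left(1 + \mathrm{SNR}\right).
```

Treating each frequency as an independent "symbol" and integrating then yields the Shannon rate \( R = \int_0^{\infty} \log_2\!\bigl(1 + S(f)/N(f)\bigr)\, df \) bits/second for Gaussian signal and noise with power spectra \(S(f)\) and \(N(f)\).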

I think you are on the right track. I had not appreciated the "sensitivity" to people doing erroneous things in the past. Maybe you can fix them!!

Do look at my inline comments as well.

Don Johnson

Simon: OK, I have completed another pass over the article. I have gone through each of the comments made by both reviewers in this section and in the main text, and addressed them. I thank both of the reviewers for their extremely helpful reviews, which I think have improved the article greatly. Reviewer 2: I have unashamedly borrowed a number of things from your review (I hope that's OK), in particular the argument connecting the discrete and continuous Shannon results, which I think makes the connection in a nice intuitive way that can easily be appreciated.

I have also added two figures, based on a simple simulation I did of a "toy version" of the RdeR and SBL 96 paper. I thought that an illustration might help make things much more concrete, and I will place the MATLAB code for these figures on my website, with a link where it currently says "source code available from curator", so that those who want to play, can. (I am killing two birds with one stone, as I'll also use that code for a lab in our final-year computational neuroscience course for bioengineers.)

I hope the article is not too complicated now! The main thing according to Eugene's instructions is that people should be able to skim and get the information they need quickly, and then get drawn in more deeply according to their time and inclination. I believe that should still work.

OK, back to the reviewers.

--- Three comments by reviewer A addressed. Citation to Rozell and Johnston added, along with the associated caveat. The only thing I quibbled with was putting in too much detail on Zohary et al. - to do it properly would take much more space, as I'd want to define some quantities slightly differently than they did, for consistency. I think the place for that is a review on the role of correlations in population coding, which I intend to write one day. Cheers, Simon.

The article is very much improved. My only comments concern the new section on d'. I would elaborate a little more, mentioning that d' is used in psychophysics and where the Gaussian formula comes from (right now its appearance on the stage is very mysterious). You don't define P_C, and erf() may be mysterious to some (there must be an article about the Gaussian somewhere).

Onward!!

Done. I note that in the process, I found what I believe to be an error in Spikes (p. 241: the result should be \(\Phi(d'/2)\)). The correct calculation is in the revised main article here.
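The \(\Phi(d'/2)\) result is easy to check by simulation (a sketch under the standard assumptions: equal priors, unit-variance Gaussians at \(0\) and \(d'\), criterion at the midpoint); the Monte Carlo estimate of the probability correct agrees with the erf-based formula.

```python
import math
import random

random.seed(0)
d_prime = 1.5     # illustrative separation between the two unit-variance Gaussians
n = 200000

# Equal-prior yes/no task: respond "signal present" when the observation
# exceeds the midpoint criterion d'/2
correct = 0
for _ in range(n):
    if random.random() < 0.5:                      # signal absent: mean 0
        correct += random.gauss(0, 1) < d_prime / 2
    else:                                          # signal present: mean d'
        correct += random.gauss(d_prime, 1) > d_prime / 2

pc_mc = correct / n
pc_formula = 0.5 * (1 + math.erf(d_prime / (2 * math.sqrt(2))))  # = Phi(d'/2)
print(pc_mc, pc_formula)  # both ~0.77
```

The last line also shows how erf() relates to the Gaussian CDF: \(\Phi(x) = \tfrac{1}{2}\bigl(1 + \mathrm{erf}(x/\sqrt{2})\bigr)\).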

I think the article is in fine shape, and should become available.

A few minor suggestions:

In Section "Stimuli continuously varying in time":

The mean over trials of the output of this model is the average response shown by the solid line in Fig. 1B -> Change to: ... Fig. 1C

The use of the word "noise" in the term "white noise stimulus" can be confusing to someone not used to the jargon. I would prefer "fluctuating stimulus waveform" or something of that kind and keep the term "noise" specifically for the uncontrollable parts.

In Section "Relationship to discriminability" :

1. Mention that Pc is computed for the case of equal a priori probabilities of signal present or absent, and that the prefactors of the two integrals in the first expression for Pc are therefore each 1/2.

2. The factors of 2*pi in the denominators of that same equation must each be under a square root.
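For reference, applying both points above (equal priors of 1/2, square roots on the \(2\pi\) factors), and assuming unit-variance Gaussians centred at \(0\) and \(d'\) with the criterion at the midpoint \(d'/2\) (an assumption for illustration; the equation itself is in the main article), the corrected expression would read:

```latex
P_c \;=\; \frac{1}{2}\int_{-\infty}^{d'/2} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx
      \;+\; \frac{1}{2}\int_{d'/2}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(x-d')^2/2}\,dx
      \;=\; \Phi\!\left(\frac{d'}{2}\right),
```

consistent with the \(\Phi(d'/2)\) correction noted earlier on this page.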