Multiple Sclerosis Research: Assessing Animal Data. Is it Pants or Not?

EAE #MSresearch #MSBlog
This week ProfB presented to a group of students on EAE and left his slides on the blog. The students also had a session on experimental design. Central to this is how to analyse data. Someone commented on this.

I must admit we are all pretty bad at statistics, but it seems that some EAEologists are worse than others.

Someone I know had a grant sent back because the referees said that they had to analyse their EAE data in a certain way, a way that was... statistically wrong.

It is amazing how many people do their EAE analysis in the wrong way.

They often use a test called a Student's t test to compare the severity of neurological scores between groups.

Here clinical signs are given an arbitrary score, say 0-5 (see the example below). The t test assumes a few things, such as that the measurement is continuous (height is an example of parametric data: you could be 1m tall or 2m tall but also everything in between), and a few other things too, such as the data being roughly normally distributed. If those assumptions are not met you need to use non-parametric analysis such as the Wilcoxon/Mann-Whitney U tests.

It is amazing that 65% of the EAE papers published in Nature/Science/Cell journals assume neurological scores are parametric, and of those a whopping 67% of studies use a t test to analyse their data. This is, in my humble opinion, not correct.

In the picture you can see the crux of the problem. In EAE you get T cells infiltrating the spinal cord, and in the picture the cells are red. Look at the extra amount of red between 0 = normal and 1 = limp tail. Now look at the difference between 3 = paresis (partial paralysis) and 4 = hindlimb paralysis. The amount of extra red between 0 and 1 and between 3 and 4 is not the same, yet each step gets an arbitrary score difference of 1. All you can really say is that there is more red in 3 than in 1 and more red in 4 than in 3. So the neurological score is non-linear: it tells you the order of severity, not the size of the gaps, meaning that non-parametric statistics should be used. These are based on ranking the scores from the smallest to the largest. Does this make a difference?
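
To make the ranking idea concrete, here is a minimal sketch in plain Python. The scores, the group sizes and the "stretched" relabelling are all invented for illustration, and the helper names (midranks, mann_whitney_u) are made up for this example. It shows that a rank-based statistic only cares about the order of the scores, so it does not change if you relabel the scale, whereas the difference in means (which is what a t test works from) does.

```python
def midranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Mann-Whitney U for group x: its rank sum in the pooled data minus the minimum possible."""
    ranks = midranks(list(x) + list(y))
    n1 = len(x)
    return sum(ranks[:n1]) - n1 * (n1 + 1) / 2


def mean_difference(x, y):
    return sum(x) / len(x) - sum(y) / len(y)


# Invented scores on the usual 0-4 scale (NOT data from any paper).
treated = [0, 1, 3, 0, 2]
control = [3, 4, 4, 2, 3]

# Hypothetical relabelling that keeps the order but stretches the upper steps,
# as if a step from 3 to 4 meant "more" than a step from 0 to 1.
stretched = {0: 0, 1: 1, 2: 3, 3: 6, 4: 10}
treated_s = [stretched[s] for s in treated]
control_s = [stretched[s] for s in control]

print("U, original scale: ", mann_whitney_u(treated, control))
print("U, stretched scale:", mann_whitney_u(treated_s, control_s))
print("mean difference, original scale: ", mean_difference(treated, control))
print("mean difference, stretched scale:", mean_difference(treated_s, control_s))
```

The two U values are identical because the relabelling preserves the order of the animals, while the mean difference depends entirely on which arbitrary numbers you happened to pick for the scale.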

It can do. This is a real example that we use to teach. It was a Nature paper and the animal experiment was the culmination of the work. In the study they used a t test and showed the drug gave a significant inhibition (P=0.029)... Yippee, they said. But if only the referees had asked them to do it properly. Look at the graph and the drug drops the neurological score by about a half.

Or does it?

They seem to show the actual scores of individual animals, and yes, if you do a t test it is P=0.029. But looking at the data, 5 animals have no disease and the drug does essentially nothing in the six animals that got disease.

However, if you do non-parametric statistics the result is P=0.082, which is not significant, so there is no evidence that the drug does anything and the value of the Nature paper is flushed down the loo.
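
If you want to try this kind of comparison yourself, here is a rough sketch in plain Python that runs both tests on the same scores. Big caveats: the scores below are invented (they only mimic the pattern of some animals with no disease and the rest looking like controls) and will not reproduce the P values quoted above; the function names are made up for this example; the t test is the pooled-variance Student's t with the P value obtained by numerical integration; and the Mann-Whitney P value uses the normal approximation with a tie correction and no continuity correction, so a stats package running an exact test may give slightly different numbers.

```python
import math
from collections import Counter
from statistics import NormalDist, mean, variance


def midranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P for Student's t, integrating the t density with Simpson's rule."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return c * (1 + v * v / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


def student_t_p(x, y):
    """Pooled-variance Student's t test; returns (t, two-sided P)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, t_two_sided_p(t, df)


def mann_whitney_p(x, y):
    """Mann-Whitney U with a normal-approximation, tie-corrected two-sided P.

    Assumes the pooled scores are not all identical (otherwise the variance is zero).
    """
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    u = sum(midranks(pooled)[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(c ** 3 - c for c in Counter(pooled).values()) / (n * (n - 1))
    sd = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u - n1 * n2 / 2) / sd
    return u, 2 * NormalDist().cdf(-abs(z))


# Invented scores (NOT the data from the paper): 5 treated animals with no
# disease, 6 treated animals that look much like the controls.
treated = [0.0, 0.0, 0.0, 0.0, 0.0, 3.5, 4.0, 4.0, 4.0, 4.0, 3.5]
control = [4.0, 4.0, 3.5, 4.0, 4.0, 3.5, 4.0, 4.0, 4.0, 3.5, 4.0]

t_stat, p_t = student_t_p(treated, control)
u_stat, p_u = mann_whitney_p(treated, control)
print(f"Student's t test: t = {t_stat:.3f}, P = {p_t:.4f}")
print(f"Mann-Whitney U:   U = {u_stat:.1f}, P = {p_u:.4f}")
```

The point of running both side by side is simply to see that the two P values need not agree on the same animals. Which one you report should be decided by the type of data, not by which one gets you under 0.05.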

So the paper may get a mention not because of the idea but as a teaching example of data analysis.

Was it a fluke, where it just happened that 5 animals failed to get disease? We will never know because the work was not repeated. So it looks like there was no quality control, which is a problem with some EAE studies.

So when you read EAE papers, look out for the way the data are analysed. Does it pass the "smack you in the eye" test, or are the results pants?

Pharma is having to deposit trial data with the regulators so that it can be re-analysed by others requesting the info. It is probably only a matter of time before they make people deposit raw data when they publish (people are asking about this), so the data can be re-analysed. This will be fun and games. How many other studies will fail?

Should we not do these posts and pretend it is all great?
It clearly isn't.