Correlation for schoolchildren

A few comments on MEI’s draft “Critical Maths” curriculum. They list:

Glossary of terms which students are expected to know and be able to use […]

Association: A tendency for two events to occur together.

Correlation: An association between two variables which is approximately linear.

This definition of correlation seems rather odd. If y = x^2, aren’t x and y correlated? What does “an association” mean here? The suggested definition of association given above is for events, not “variables”. Presumably the authors have in mind random variables.
There is a serious problem here in the use of language. It needs to be made clear whether the notion being described is an intuitive one or a mathematical definition. I am not a statistician, but it seems to me that there are (at least) three distinct common usages of the word “correlation”, none of which is captured by the “definition” proposed:
(1) The vernacular usage. The Merriam-Webster dictionary gives
  “a relation existing between phenomena or things or between mathematical or statistical variables which tend to vary, be associated, or occur together in a way not expected on the basis of chance alone”
which seems to me a reasonable description of the vernacular, or intuitive non-mathematical, meaning of the term. This is clearly much broader than the meaning suggested above.
(2) The intended meaning proposed seems to correspond most closely to the use of the (Pearson) correlation coefficient in statistics, although even then it is not accurate, since the correlation coefficient is not always a reliable indicator of the existence of a linear relationship. This is the meaning that tends to be used by a large class of people who have had some minimal exposure to statistics.
(3) More generally, correlation can be used to indicate a variety of mathematical measures of probabilistic interdependence (e.g. mutual information).
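To make the y = x^2 question and point (2) concrete, here is a minimal sketch in Python (standard library only; the helper name pearson is purely illustrative and not part of the MEI draft). It computes the sample Pearson coefficient for y = x^2 twice: once with x symmetric about zero, and once with x restricted to positive values.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# x symmetric about 0: y = x^2 is completely determined by x,
# yet the linear (Pearson) correlation is essentially zero.
xs_sym = [i / 10 for i in range(-50, 51)]
r_sym = pearson(xs_sym, [x * x for x in xs_sym])

# x restricted to positive values: the same functional relation,
# but now the coefficient is close to 1 although nothing is linear.
xs_pos = [i / 10 for i in range(1, 101)]
r_pos = pearson(xs_pos, [x * x for x in xs_pos])

print(f"symmetric x: r = {r_sym:.3f}")
print(f"positive x:  r = {r_pos:.3f}")
```

In the first case the coefficient is zero up to rounding error even though y depends entirely on x; in the second it is large even though the relation is not linear. Neither outcome matches the “approximately linear association” reading.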
On a separate point, the very heavy concentration on statistical reasoning to the exclusion of other mathematics (including perhaps more elementary logical reasoning such as manipulation of quantifiers and logical connectives) rather worries me, since it may encourage the idea that almost the only practical applications of mathematics are statistical.
Another serious danger, in my opinion, is that statistics at this level tends to be more like cookery than mathematics, and it would have to be extremely well taught by a gifted and highly educated teacher if conceptual precision is not going to be completely lost. The danger is partially raised by Gowers in Objection 5 listed in his blog (though he doesn’t mention cookery), but I think his own answer is rather optimistic.
Somewhat in this connection, there is an interesting passage in “Noam Chomsky on Where Artificial Intelligence Went Wrong”, an article in which Chomsky is interviewed on various topics concerning science, in particular AI and cognitive science, and on what he clearly regards as a modern deviation from the classical scientific method, indirectly caused by the power of modern computers. The article is quite long, but I found his example of “how to justify the abolition of physics departments” very nice; it could equally well be used to justify closing down everything in mathematics departments except statistics.

British Academy: “Society Counts”

The British Academy today published the report Society Counts: Quantitative Skills in the Social Sciences and Humanities (link to full text). A quote:

Statistical literacy for UK graduates

20. The British Academy has frequently emphasised the need for well-rounded graduates, equipped with core skills, if the UK is to retain its status in research and higher education.
21. These core skills start with quantitative methods. The skills standardly deployed, for example, in the natural sciences and engineering are no longer synonymous with or restricted to particular subjects; these skills are now relevant and necessary well beyond traditional science, technology, engineering and mathematics (STEM) subjects. The changes required to develop these skills in graduates is relevant across the university curriculum. We must therefore seek to apply some of the methods and thinking that are being used to bring about curriculum change through the STEM initiative.

(Link to full text of the Report)

School maths is failing children – a US and Australian perspective

A post by Jon Borwein and David H. Bailey in The Conversation. A quote:

Pedagogy and mathematics

It is undeniably important that mathematics teachers have mastered the topics they need to teach. The new Australian national curriculum is misguidedly increasing the amount of “statistics” of the school mathematics curriculum from less than 10% to as much as 40%. Many teachers are far from ready for the change.

But more often than not, the problem is not the mathematical expertise of the teachers. Pedagogical narrowness is a greater problem. Telling that there is a correct idea in a wrong solution to a problem on fractions requires unpacking of elementary concepts in a way that even an expert mathematician is not usually trained to do.

One of us – Jon – learned this only too well when he first taught future elementary school teachers their final university mathematics course.

Australian teachers at an elite private school could not understand one of Jon’s daughter’s Canadian long-division method nor her solution techniques for many advanced school topics. She got mediocre marks during the year because of this.

Distribution of abilities

From the abstract of the paper “The best and the rest: revisiting the norm of normality of individual performance” by Ernest O’Boyle Jr. and Herman Aguinis:

We revisit a long-held assumption in human resource management, organizational behavior, and industrial and organizational psychology that individual performance follows a Gaussian (normal) distribution. We conducted 5 studies involving 198 samples including 633,263 researchers, entertainers, politicians, and amateur and professional athletes. Results are remarkably consistent across industries, types of jobs, types of performance measures, and time frames and indicate that individual performance is not normally distributed; instead, it follows a Paretian (power law) distribution. Assuming normality of individual performance can lead to misspecified theories and misleading practices. Thus, our results have implications for all theories and applications that directly or indirectly address the performance of individual workers including performance measurement and management, utility analysis in preemployment testing and training and development, personnel selection, leadership, and the prediction of performance, among others.


I am busy marking exams just now and am not able to look into the paper carefully.

Some thoughts, though.

The general thesis of non-normality is fine, but it is like claiming that the Earth orbits the Sun.

Calibrated measures, such as IQ and many psychological tests, are almost by definition normally distributed in populations (in fact, in the population for which they were developed). There are underlying assumptions, of course, but the point is that the measures are tailored to be such. Also, exam performance may be calibrated to look normal by an appropriate choice of the set of questions and allocation of marks.

Looking at Study 1, for example, it is ridiculous to talk about a normal distribution. Firstly, before looking at the data we know that they are non-negative and generally have small values. The methodology of selecting “leading” journals makes the numbers even smaller. I should think more before criticising, but a selective procedure invalidates the basic assumptions behind the validity of a normal approximation.

The histograms shown towards the end of Study 1 clearly show this. These histograms should have been put at the beginning of the study. The mean and the standard deviation are almost meaningless for this kind of data (counts, with the maximum at low values). If anything, the histograms suggest starting with an exponential or gamma distribution and discarding the normal outright. Pareto is fine as well.
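A rough sketch of why the normal is a poor starting point for such data. It assumes, purely for illustration, synthetic exponential values rather than the paper's publication counts. A normal distribution with the same mean and standard deviation puts a sizeable share of its probability on negative values, which the data cannot take.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(0)  # synthetic data for illustration only; not the data from the paper
sample = [random.expovariate(1.0) for _ in range(5000)]

m, s = mean(sample), stdev(sample)

# Probability that a normal with the same mean and SD assigns to
# (impossible) negative values.
p_negative = NormalDist(m, s).cdf(0.0)

print(f"mean = {m:.2f}, sd = {s:.2f}, normal P(X < 0) = {p_negative:.3f}")
```

For exponential data the mean and standard deviation are roughly equal, so the fitted normal places probability of about Φ(-1), roughly 0.16, below zero.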

Also, using χ² to evaluate the quality of the fit is primitive.

By the way, the fact that Pareto is better than normal does not show that it is any good. Any of the distributions mentioned above will be better than normal.

As it happens, I recently set a second-year assignment for Practical Statistics in which students do this kind of thing (fitting exponential distributions, evaluating the fit). They would not have got good marks for using a χ² test. QQ-plots and Kolmogorov–Smirnov-type tests are much better.

Students certainly do not get good marks if they simply fit a distribution and do not evaluate the quality of the fit.
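As an illustration of evaluating a fit along these lines, here is a minimal sketch using only the Python standard library. The data are synthetic (a seeded exponential sample), not taken from the paper or from any assignment, and the helper name ks_statistic is hypothetical. It fits an exponential and a normal distribution to the same right-skewed sample, computes the Kolmogorov–Smirnov distance for each, and builds the coordinate pairs one would plot for a QQ-plot of the exponential fit.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF of sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

random.seed(1)  # synthetic, right-skewed data; not from the paper
sample = [random.expovariate(0.5) for _ in range(2000)]

# Fit both candidates: exponential rate by maximum likelihood (1 / mean),
# normal by the sample mean and standard deviation.
rate = 1.0 / mean(sample)
normal = NormalDist(mean(sample), stdev(sample))

d_exp = ks_statistic(sample, lambda x: 1.0 - math.exp(-rate * x))
d_norm = ks_statistic(sample, normal.cdf)

# QQ-plot coordinates for the exponential fit:
# (theoretical quantile, observed order statistic).
xs = sorted(sample)
n = len(xs)
qq_pairs = [(-math.log(1.0 - (i + 0.5) / n) / rate, x) for i, x in enumerate(xs)]

print(f"KS distance, exponential fit: {d_exp:.3f}")
print(f"KS distance, normal fit:      {d_norm:.3f}")
```

If the fit is good, the points in qq_pairs lie close to the line y = x. One caveat: because the parameters are estimated from the same sample, the standard Kolmogorov–Smirnov critical values do not apply directly, so the distances here are best read as a comparison between candidate fits rather than as a formal test.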