Feb 16 2018

Today is the birthday (1822) of Sir Francis Galton, FRS, an English statistician and polymath whose mathematical investigations of human variables lie at the heart of quantitative analysis in social science to this day. I apologize for the preponderance of studies of biological variability in recent days. I promise to move on after Galton. Galton created or popularized the statistical concepts of correlation and regression toward the mean, was the first to apply statistical methods to the study of human differences and the inheritance of intelligence, and introduced the use of questionnaires and surveys for collecting data on human communities. He also popularized the phrase “nature versus nurture.” He founded psychometrics (the science of measuring mental faculties), differential psychology, and the lexical hypothesis of personality. He devised a method for classifying fingerprints that proved useful in forensic science. He also conducted research on the power of prayer, concluding it had none by its null effects on the longevity of those prayed for. His quest for the scientific principles of diverse phenomena extended even to the optimal method for making tea. Galton devised the first weather map, proposed a theory of anticyclones, and was the first to establish a complete record of short-term climatic phenomena on a European scale. He also invented the Galton Whistle for testing differential hearing ability. I can’t cover it all, so I’ll make some selections.

Galton was born at “The Larches”, a large house in the Sparkbrook area of Birmingham, built on the site of “Fair Hill”, the former home of Joseph Priestley, which the botanist William Withering had renamed. He was Charles Darwin’s half-cousin, sharing the common grandparent Erasmus Darwin. His father was Samuel Tertius Galton, son of Samuel “John” Galton. The Galtons were famous and highly successful Quaker gun-manufacturers and bankers, while the Darwins were distinguished in medicine and science.

Galton was by many accounts a child prodigy – he was reading by the age of two; at age five he knew some Greek, Latin and long division, and by the age of six he had moved on to adult books, including Shakespeare for pleasure, and poetry, which he quoted at length. Later in life, Galton would propose a connection between genius and insanity based on his own experience. He stated:

Men who leave their mark on the world are very often those who, being gifted and full of nervous power, are at the same time haunted and driven by a dominant idea, and are therefore within a measurable distance of insanity.

Galton attended King Edward’s School, Birmingham, but chafed at the narrow classical curriculum and left at 16. His parents pressed him to enter the medical profession, and he studied for two years at Birmingham General Hospital and King’s College London Medical School. He followed this up with mathematical studies at Trinity College, University of Cambridge, from 1840 to early 1844. A severe nervous breakdown altered Galton’s original intention to try for honours. He elected instead to take a “poll” (pass) B.A. degree, like his half-cousin Charles Darwin. Following the Cambridge custom, he was awarded an M.A. without further study, in 1847. He then briefly resumed his medical studies. The death of his father in 1844 had left him financially independent but emotionally damaged, and he terminated his medical studies entirely, turning to foreign travel, sport, and technical invention.

In 1850 he joined the Royal Geographical Society, and over the next two years mounted a long and difficult expedition into then little-known South West Africa (now Namibia). He wrote a successful book on his experience, Narrative of an Explorer in Tropical South Africa. He was awarded the Royal Geographical Society’s Founder’s Gold Medal in 1853 and the Silver Medal of the French Geographical Society for his pioneering cartographic survey of the region. This established his reputation as a geographer and explorer. He proceeded to write the best-selling The Art of Travel, a handbook of practical advice for the Victorian on the move, which went through many editions and is still in print.

In 1888, Galton established a lab in the science galleries of the South Kensington Museum. In Galton’s lab, participants could be measured to gain knowledge of their strengths and weaknesses. Galton also used these data for his own research. He would typically charge people a small fee for his services. Earlier, in 1873, Galton had written a controversial letter to The Times titled “Africa for the Chinese.” This letter sets the stage for my general opinion of Galton, namely, he was a brilliant mathematician whose work on the quantitative aspects of human populations is unrivalled, but his social theories themselves are hopelessly inadequate because they are driven by a warped English Victorian colonial mentality. In “Africa for the Chinese” he makes the case for solving China’s overpopulation problem by having its surplus population emigrate to Africa and displace the indigenous Africans because the Chinese are a superior race. They are inferior to the English, of course, but their current degeneracy was caused by the failures of Chinese dynasties, not their inherent tendencies. With room to move (sound familiar?) they would prosper. He was following the anthropology of the time, notably the work of E. B. Tylor, that saw all cultures as evolving inexorably through three stages: savagery, barbarism, and civilization. The Chinese were barbarians who were stunted in their attempts to become civilized by former governments, so why not have them displace some savages and thereby flourish? No one is going to miss a few savages. The African slave trade itself had been abolished by this time, but slavery was still very much alive and well in the Americas.

Galton devoted much of the rest of his life to exploring variation in human populations and its implications. In so doing, he established a research program which embraced multiple aspects of human variation, from mental characteristics to height; from facial images to fingerprint patterns. This required inventing novel measures of traits, devising large-scale collection of data using those measures, and in the end, the discovery of new statistical techniques for describing and understanding the data. Many of his actual metrics are deeply flawed. For example, there is no statistically valid correlation between skull size and intelligence, yet he ploughed on in this direction regardless, including using inappropriate ways of measuring skulls.

Galton was interested at first in the question of whether human ability was hereditary, and proposed to count the number of the relatives, of various degrees, of eminent men (not women). If the qualities were hereditary, he reasoned, there should be more eminent men among the relatives than among the general population. To test this, he invented the methods of historiometry. Galton obtained extensive data from a broad range of biographical sources which he tabulated and compared in various ways. This pioneering work was described in detail in his book Hereditary Genius in 1869. Here he showed, among other things, that the numbers of eminent relatives dropped off when going from the first degree to the second degree relatives, and from the second degree to the third. He took this as evidence of the inheritance of abilities. The flaw is obvious, bringing up the phrase that he himself popularized: “nature versus nurture.” [He did not coin the phrase, but used it widely.] Take famously musical families, such as the Bachs, Mozarts, and Mendelssohns. Is there a musical genius gene that they all passed on from generation to generation, or were they nurtured in musical households that fostered interest in, and training in, music at a young age? Galton knew nothing about genetics, so his views on inheritability of characteristics were crudely speculative.

Galton recognized some of the limitations of his methods and believed that some nature versus nurture questions could be better studied by comparisons of twins. His method envisaged testing to see if twins who were similar at birth diverged in dissimilar environments, and whether twins dissimilar at birth converged when reared in similar environments. He used the method of questionnaires to gather various sorts of data, which were tabulated and described in a paper “The history of twins” in 1875. In so doing he anticipated the modern field of behavior genetics, which relies heavily on twin studies. He concluded that the evidence favored nature rather than nurture. He also proposed adoption studies, including trans-racial adoption studies, to separate the effects of heredity and environment.

Galton invented the term eugenics in 1883 and set down many of his observations and conclusions in a book, Inquiries into Human Faculty and Its Development. He believed that a scheme of ‘marks’ for family merit should be defined, and early marriage between families of high rank be encouraged by provision of monetary incentives. He pointed out some of the tendencies in British society, such as the late marriages of eminent people, and the paucity of their children, which he thought were dysgenic. He advocated encouraging eugenic marriages by supplying able couples with incentives to have children. On 29 October 1901, Galton chose to address eugenic issues when he delivered the second Huxley lecture at the Royal Anthropological Institute.

Galton’s eugenics needed a firm foundation in understanding the mechanism of the inheritability of traits. Mendel’s work on genetics was available, but buried in obscurity because when he hit upon the gene theory of inheritance there was no use for it. Mendel published in 1866, a few years after Darwin’s On the Origin of Species, and his results largely contradicted the prevailing view that offspring are blends of their parents. Mendel showed that certain traits in garden peas were either one thing or another, never blends. A seed was either smooth or wrinkled, never slightly wrinkled, for example. That is because he chose traits that are represented by single genes that are either dominant or recessive. Galton experimented with sweet peas using traits that are represented by multiple genes and can also be influenced by environmental factors. Height is an obvious example. A tall and a short parent will likely produce middle height children because height is represented by several genes. It is also influenced by diet in childhood. Galton was particularly interested in why traits, like height, which can be represented by a normal (bell-shaped) curve, remained stable in populations over time. He devised all manner of physical experiments using variously shaped containers through which he passed lead shot, and ultimately came up with a statistical model we call “regression to the mean.” I’ll leave you to explore the details if you are interested.
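If you do want to explore, here is a minimal sketch of regression to the mean in Python. It is not Galton's own procedure, and every number in it is an assumption chosen for illustration: an average height of 170 cm, a spread of 7 cm, and a parent-child correlation of 0.5. Each simulated child inherits only part of the parent's deviation from the average, plus some random noise.

```python
import random
import statistics


def simulate_heights(n=20000, mean=170.0, sd=7.0, r=0.5, seed=1):
    """Return n (parent, child) height pairs with correlation of roughly r.

    All parameter values are illustrative assumptions, not Galton's data.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        parent = rng.gauss(mean, sd)
        # The child keeps a fraction r of the parent's deviation from the
        # mean. The noise is scaled so that children end up with the same
        # overall spread (sd) as their parents.
        noise = rng.gauss(0.0, sd * (1 - r * r) ** 0.5)
        child = mean + r * (parent - mean) + noise
        pairs.append((parent, child))
    return pairs


pairs = simulate_heights()

# Look only at families where the parent is unusually tall.
tall = [(p, c) for p, c in pairs if p > 180.0]
tall_parent_mean = statistics.mean([p for p, _ in tall])
tall_child_mean = statistics.mean([c for _, c in tall])

# Spread of the whole generation of parents and of children.
parent_sd = statistics.pstdev([p for p, _ in pairs])
child_sd = statistics.pstdev([c for _, c in pairs])

print(f"tall parents average {tall_parent_mean:.1f} cm")
print(f"their children average {tall_child_mean:.1f} cm")
print(f"spread: parents {parent_sd:.2f} cm, children {child_sd:.2f} cm")
```

Under these assumptions the children of very tall parents are still taller than average, but less extreme than their parents, while the spread of heights in the whole population stays about the same from one generation to the next. That is the stability Galton was trying to explain.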

Galton’s accumulation of data on thousands of humans allowed him to observe correlations between, for example, forearm length and height, head width and head breadth, and head length and height. With these observations he was able to write “Co-relations and their Measurements, chiefly from Anthropometric Data.” In this paper, Galton defined “co-relation” as “the variation of the one [variable] is accompanied on the average by more or less variation of the other, and in the same direction.” The use of co-relation (spelled, now, “correlation”) is invaluable in quantitative social science, as long as you remember that correlation is not the same as causation. If I demonstrate that people who exercise daily are smarter than people who do not, have I shown that exercising regularly makes you smarter, or that being smarter makes you exercise daily?
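For the curious, here is a small sketch of how a correlation is usually calculated today. It uses the product-moment coefficient that Karl Pearson later built on Galton's idea, so it is not the calculation in Galton's paper, and the forearm and height figures are made up for illustration. The result runs from -1 (perfectly opposite) through 0 (unrelated) to +1 (perfectly in step).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables move together around their means...
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...relative to how much each one varies on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_variation / sqrt(var_x * var_y)


# Hypothetical measurements in centimetres, invented for this example.
forearm_cm = [24.0, 25.5, 26.0, 27.5, 28.0]
height_cm = [160.0, 168.0, 171.0, 178.0, 183.0]

r = pearson_r(forearm_cm, height_cm)
print(f"r = {r:.3f}")
```

A value close to +1 says only that long forearms and tall bodies tend to go together in this sample. The number is the same whichever list you put first, which is one way to see that it cannot tell you what causes what.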

The method used in Hereditary Genius has been described as the first example of historiometry. To bolster these results, and to attempt to make a distinction between ‘nature’ and ‘nurture’ he devised a questionnaire that he sent out to 190 Fellows of the Royal Society. He tabulated characteristics of their families, such as birth order and the occupation and race of their parents. He attempted to discover whether their interest in science was ‘innate’ or due to the encouragements of others. The studies were published as a book, English Men of Science: Their Nature and Nurture, in 1874. In the end, it promoted the nature versus nurture question, though it did not settle it. It is settled now. NOTHING IS EITHER ONE OR THE OTHER!!

In an effort to reach a wider audience, Galton worked on a novel entitled The Eugenic College of Kantsaywhere from May until December 1910. The novel described a utopia organized by a eugenic religion, designed to breed fitter and smarter humans. His unpublished notebooks show that this was an expansion of material he had been composing since at least 1901. He offered it to Methuen for publication, but they showed little enthusiasm. Galton wrote to his niece that it should be either “smothered or superseded”. His niece appears to have burnt most of the novel, offended by the love scenes, but large fragments survived, and it was published online by University College London.

In 1906 Galton proposed a method of cutting a cake so that there were never any exposed surfaces to dry out when it was stored. Here are diagrams:

Galton’s method of cutting is economical so I will pair it with Mrs Beeton’s idea of an economical cake:


  1. INGREDIENTS.—1 lb. of flour, 1/4 lb. of sugar, 1/4 lb. of butter or lard, 1/2 lb. of currants, 1 teaspoonful of carbonate of soda, the whites of 4 eggs, 1/2 pint of milk.

Mode.—In making many sweet dishes, the whites of eggs are not required, and if well beaten and added to the above ingredients, make an excellent cake, with or without currants. Beat the butter to a cream, well whisk the whites of the eggs, and stir all the ingredients together but the soda, which must not be added until all is well mixed, and the cake is ready to be put into the oven. When the mixture has been well beaten, stir in the soda, put the cake into a buttered mould, and bake it in a moderate oven for 1-1/2 hour.

Time.—1-1/2 hour. Average cost, 1s. 3d.

Oct 20 2015


Today is World Statistics Day, celebrated for the first time on 20 October 2010 worldwide in accordance with a declaration by the United Nations Statistical Commission. The Royal Statistical Society in the UK launched its getstats “statistical literacy” campaign to open the celebrations at 20:10 on 20.10.2010. I don’t want to get too technical here; I have plenty of experience watching people’s eyes glaze over when I start talking mathematically. The fact is that statistical analysis can be extremely complex, but the foundational ideas are really easy to grasp. For a set of funny and informative videos I suggest you go here: https://worldstatisticsday.org/. Here I’d like to do a couple of things, namely talk about the handling and presentation of statistics, and have a little fun.

Like it or not, statistics rule a big chunk of our lives. I’m a social scientist so statistics are a big part of my professional life, even though a lot of my writing is math-free. “Proof” of assertions concerning social life hinges on good statistical data. You may think something about social life is obviously true, but you need statistical data at your back. Three things are important – (1) data do not speak for themselves, (2) proper presentation of data is vital and (3) data are only as good as their method of collection.


The first point ought to be self-evident, but often is not. You cannot show me some data and assume I will see in them what you see. Suppose you show me a graph of rising fuel costs over the past decade. What should I do with it? Does it matter to me? If it does matter to me, how does it matter? Is it a good thing or a bad thing, for me, or in general? It does not speak for itself. Maybe I own a factory and rising fuel costs are eating into my profits. Maybe I am a worker whose salary increases have not kept pace with inflation, so I am having to cut back on non-essentials. Maybe I am a hermit living in a remote cave with no need to buy fuel. Context matters in interpreting statistics.

The second point can also be overlooked. In the 19th century Florence Nightingale discovered that in military hospitals in the Crimea and elsewhere, a great many more soldiers died from preventable diseases than from war wounds. She believed that better sanitation in the hospitals was the answer but she needed to convince bone-headed politicians to vote for increased funding. She felt that if the data were presented graphically they would be more understandable than tables and spreadsheets. So she created a type of pie chart now sometimes called the Nightingale Rose, shown here:


It was effective, although it’s debatable whether this chart was more effective than a simple bar graph as shown here:


You decide. At the very least you understand the importance of method of presentation.

The third point can also be overlooked very easily. Probably everyone understands that when you are conducting a survey, the size of your sample and the nature of people in the sample are critical issues. You can’t get a meaningful picture of racism in the U.S. by polling a small group of white people all living in one state. You have to have a large, widely distributed sample of people from all walks of life and all ethnicities. But the quality of the data also depends on the questions asked and the responses allowed. Obviously you can’t just bluntly ask, “are you a racist?” You have to decide what questions will get at the heart of the matter, and that is far from easy. You also have to contend with the fact that many people who answer surveys answer according to their ideal self image, and not necessarily according to the truth.

Here now is a little gallery of amusing statistical charts:


As long-time readers know, I like to cook by the seat of my pants most of the time, and it’s something of a strain to come up with precise measurements and instructions. So here is my heuristic/statistical recipe for a pear and passionfruit crumble I made yesterday using percentages. It’s pretty close to how I actually think when I cook.


© Pear and Passionfruit Crumble

With a fruit crumble the correct ratio of fruit to crumble topping is very important. By eye I would say my crumble is 35% topping and 65% filling. Some people may like to have more fruit. The 35% topping is divided thus: 10% rolled oats, 10% plain flour, 10% granulated sugar, and 5% butter, or a ratio of 2:2:2:1. Put the oats, flour, and sugar in a mixing bowl and stir a little until they are mixed. Make sure the butter is very cold and cut it into the smallest pieces you can. Rub the butter into the dry ingredients with your fingers so that the mixture is reasonably homogeneous. Set aside.

For the 65% fruit mix, use about 60% pears and 5% passionfruit. These days when I make crumbles I don’t peel the fruit. They have an earthier taste unpeeled. Cut the tops and tails off the pears, then slice downwards to separate the meat from the core. Discard the core and slice the meat thickly. Put the pear slices into a baking dish and sprinkle with sugar. Cut the passionfruit in half and scrape the inside pulp on to the pears. Toss with a wooden spoon.

Pour the crumble topping over the fruit and spread it evenly. Tamp down the top to compress the crumble a little, but not too firmly. Bake in a 400°F oven for about 45 minutes, or until the top is mottled golden-brown. Serve hot or cold, plain or with custard, whipped cream, or ice cream.
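If you would rather weigh than eyeball, the percentages in this recipe can be turned into amounts with a few lines of Python. This is only a sketch. I judge the proportions by eye, so treating them as shares of total weight is an assumption, and the 1200 gram total is an arbitrary example size.

```python
# Percentage shares taken from the recipe above (they add up to 100).
SHARES = {
    "rolled oats": 10,
    "plain flour": 10,
    "granulated sugar": 10,
    "butter": 5,
    "pears": 60,
    "passionfruit": 5,
}


def scale_recipe(total_grams, shares=SHARES):
    """Split a total weight across ingredients according to their shares."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    return {name: total_grams * pct / 100 for name, pct in shares.items()}


# Assumed total weight for a medium baking dish.
amounts = scale_recipe(1200)
for name, grams in amounts.items():
    print(f"{name}: {grams:.0f} g")
```

Change the total to suit your dish, or nudge the shares if you like more fruit, as long as they still add up to 100.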