Paul McLellan

MLK Off-topic: The Lady with the Polar Chart

18 Jan 2019 • 5 minute read

It's Martin Luther King Day on Monday, and Cadence is off. I think that this is the first time I have ever got it as a holiday, rather than moving it to somewhere like July 3rd. Breakfast Bytes will not appear on Monday.

So, as is traditional, the day before the holiday is time for an off-topic post, although this one is not completely off topic. In yesterday's post, IEDM Short Course: EUV, the Road to HVM and Beyond, I talked about stochastics. That's actually just a grand word for random. In the context of semiconductor manufacturing, huge amounts of data are generated and then statistical techniques are used to work out the contributions of various possible factors and thus what might need to be tweaked to increase yield.

Statistics

A lot of statistical techniques were developed before there were computers. They were a way of reducing large amounts of data (large for the time, we're not talking zettabytes here) to a few parameters, so that analysis could be done with the reduced numbers rather than the raw data, or of displaying the data in a way that was easily comprehensible. Excel will draw you all sorts of charts based on large amounts of data, but those graphical techniques had to be invented.

Of course, now we have computers, it often makes more sense to re-process the raw data rather than relying on the summary that things like the mean and standard deviation provide. A lot of "obvious" things done with the summary parameters give the wrong answer. For more details on some of that, see my post Labor Day Off-Topic: Almost Everyone Has More Than the Average Number of Legs. You might assume that if David Justice had a better batting average than Derek Jeter in both 1995 and 1996, then if you went back to the original data and calculated their batting averages for 1995 and 1996 combined, Justice would have the better average over those two years. But you'd be wrong. This is known as Simpson's Paradox.
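If you want to check the Justice/Jeter reversal for yourself, here is a minimal Python sketch. The hit and at-bat counts are the figures commonly quoted for this example, not numbers taken from this post, so treat them as an assumption. The function names are just illustrative.

```python
from fractions import Fraction

# (hits, at-bats) per season. These are the commonly cited figures for this
# example; they are an assumption here, not data from the post itself.
SEASONS = {
    "Derek Jeter": {1995: (12, 48), 1996: (183, 582)},
    "David Justice": {1995: (104, 411), 1996: (45, 140)},
}


def season_average(player, year):
    """Batting average for one player in one season, as an exact fraction."""
    hits, at_bats = SEASONS[player][year]
    return Fraction(hits, at_bats)


def combined_average(player):
    """Batting average over all seasons, recomputed from the raw counts."""
    total_hits = sum(hits for hits, _ in SEASONS[player].values())
    total_at_bats = sum(at_bats for _, at_bats in SEASONS[player].values())
    return Fraction(total_hits, total_at_bats)


def better_hitter(year=None):
    """Name of the player with the higher average for a year (or combined)."""
    if year is None:
        return max(SEASONS, key=combined_average)
    return max(SEASONS, key=lambda player: season_average(player, year))


if __name__ == "__main__":
    for year in (1995, 1996):
        print(year, "->", better_hitter(year))
    print("combined ->", better_hitter())
```

The reversal happens because the two players' at-bats are split very unevenly between the two seasons, so the combined average weights each season differently for each player. That is exactly the information a per-season summary throws away.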

Talking of weird sporting statistics, did you know Wayne Gretzky was such a great hockey player, he sets records even in retirement? His points-per-game average of 1.921 when he retired was second only to Mario Lemieux at 2.005. But Mario later came out of retirement and lowered his own average enough that Gretzky became #1.

A lot of the techniques used in mathematical statistics were developed by the British statistician Ronald Fisher in the early part of the 20th century. In Anders Hald's book A History of Mathematical Statistics, he is described as:

a genius who almost single-handedly created the foundations for modern statistical science

He initially worked on agricultural crop yields and developed the analysis of variance (ANOVA) that every statistics course spends a lot of time on. He then moved on to basically create population genetics and is known for Fisher's Principle, which demonstrates why the sex ratio in a species is almost always 1:1 (and the much more interestingly named "sexy son hypothesis"). If you ever heard that Mendel's original work on peas was too good to be true, it was Fisher who did the analysis that showed it. He is also the person who originated the phrase "correlation is not the same as causation."

So if Fisher was "the single most important figure in 20th-century statistics" then who was the most important in the 19th century?

Well, one person with a strong claim is the first woman member of the Royal Statistical Society in London, Florence Nightingale.

Wait...what?

Yes, that Florence Nightingale, the lady with the lamp. Or in this case with the polar chart.

Florence Nightingale

Florence was called Florence since she was born in Florence, Italy (which is one of those places that has a different name in English than its real name, Firenze). The part of her story that everyone knows is that she was the lady with the lamp during the Crimean War. But even then she was using statistics to work out what was going on.

She invented many new statistical techniques. For example, the polar diagram above shows the causes of death over two years, with each segment representing a month. The blue areas represent deaths from preventable diseases, red from battlefield wounds, and black from all other causes. The overwhelming dominance of the blue area showed that more people were being killed by preventable diseases caused by unsanitary conditions than were killed in battle. The circle on the right covers 1854-5, before her changes; the one on the left covers the period after her changes. She reduced the death rate from 42% to 2%. This diagram, novel in using graphics to convey data, was very influential, especially in countering the prevailing belief that most men died in battle, not from disease.
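The key idea of a polar area diagram is that every wedge spans the same angle and the value is carried by the wedge's area, so the radius has to grow with the square root of the value rather than the value itself. Here is a minimal Python sketch of that calculation. The monthly counts are made up purely for illustration (they are not Nightingale's data), and the function names are my own.

```python
import math


def wedge_radii(values):
    """Radius for each equal-angle wedge so that its AREA equals the value.

    A circular sector of angle theta and radius r has area 0.5 * r**2 * theta,
    so r = sqrt(2 * value / theta).
    """
    if not values:
        return []
    theta = 2 * math.pi / len(values)  # every wedge gets the same angle
    return [math.sqrt(2 * value / theta) for value in values]


def wedge_area(radius, n_wedges):
    """Area of one wedge, used here only to check the round trip."""
    theta = 2 * math.pi / n_wedges
    return 0.5 * radius ** 2 * theta


if __name__ == "__main__":
    # Hypothetical monthly death counts for one year, NOT historical figures.
    monthly = [100, 400, 900, 1600, 900, 400, 100, 50, 25, 25, 50, 100]
    for count, radius in zip(monthly, wedge_radii(monthly)):
        print(f"count={count:5d}  radius={radius:7.2f}")
```

Note that a month with four times as many deaths gets a wedge with only twice the radius. Drawing the radius proportional to the count instead would exaggerate the large months, since the eye tends to read the area.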

From 1857, she was largely bedridden, although she continued to write and do hospital planning. She was apparently consulted a lot during the US Civil War about how best to manage field hospitals.

She wrote the book on modern nursing. Literally. In 1859, she wrote Notes on Nursing, and the same year she collaborated with author Harriet Martineau on a book about her findings in Crimea, England and Her Soldiers. This so shamed the army establishment that they censored it and would not allow it to be distributed in the libraries in barracks. Then, with her nursing cap on, she established the Nightingale Training School for Nurses at St Thomas’ Hospital (the school still exists, as the Florence Nightingale School of Nursing and Midwifery, part of King's College London, in turn part of the University of London).

She not only invented much of modern nursing but also many ideas in statistics that are still influential today.

XKCD

I mentioned above the saying "correlation does not equal causation", and as always there's an XKCD for everything. Have a good long weekend (if you get the holiday) and Breakfast Bytes will be back on Tuesday.

 
