1. The validity and accuracy of statistical methods (the Normal Distribution, for example) are entirely dependent upon choosing a random sample from the population of interest. (Not always people.)
If I wanted to determine the most common type of tree within a large, bounded area (New York State in the US, for example), it would not be feasible to make a visual inspection of every square foot of the state, record every tree by type (but only within forests and not on residential or business properties, to avoid being shot or arrested for trespassing), tally the data, and declare a winner. This is where a random sample from the state theoretically becomes the only feasible way to conduct the survey. Or is it?
Conceiving a method of achieving a random sample for this project would be a most difficult task and my head hurts just thinking about it. Can anyone here devise a method and be confident that ALL types of tree were counted? If the goal is to determine the most common tree, then an ACTUAL random sampling of the entire state might overlook a tree type or two, but we would be confident that their numbers are small, that they are uncommon. BUT, any missing types would be a problem if the goal was to take INVENTORY, a total accounting, of all tree types. And what if researchers are trying to find which species of tree is carrying a known tree disease? The entire project may fail for the reason described above.
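To make the "overlook a tree type or two" point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the four tree types and their shares are invented, not survey data, and the sketch pretends we can already draw trees truly at random, which is the very thing in question. Under those assumptions it shows that a sample can name the most common type with ease while still having a real chance of never seeing a rare one.

```python
import random
from collections import Counter

# Hypothetical shares of each tree type in the forest; invented for
# illustration only, not real survey data.
SHARES = {"type A": 0.60, "type B": 0.25, "type C": 0.149, "rare type D": 0.001}


def draw_sample(shares, n, rng):
    """Tally n trees drawn independently at random according to `shares`."""
    names = list(shares)
    weights = [shares[name] for name in names]
    return Counter(rng.choices(names, weights=weights, k=n))


def chance_of_missing(share, n):
    """Probability that a type with the given share never appears in n draws."""
    return (1.0 - share) ** n


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the run can be repeated
    tally = draw_sample(SHARES, 1000, rng)
    print("most common:", tally.most_common(1)[0][0])
    print("types never seen:", sorted(set(SHARES) - set(tally)))
    print("chance of missing the rare type:", chance_of_missing(0.001, 1000))
```

With these made-up shares, a type that makes up one tree in a thousand is missed by a 1,000-tree sample more than a third of the time. That is harmless for finding the winner and fatal for an inventory.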
Mercifully, this entry is NOT about this narrow subject. I’ve merely used this context to introduce the concept “random,” as used in statistics (and probability). It’s not an exaggeration to ask whether ACTUAL random samples are being achieved when cranking out statistics. I wouldn’t bet the farm on it, yet governmental policies are built on such statistics.
2. But probability and statistics weren’t my initial academic exposure to “random,” which first came in my introductory programming language courses. It was in these that I learned of the concept of random as a problem. Professors tended to give the homework assignment of writing “random number generator” programs. The task, as I remember it, was to accept a positive number (N) as input and generate a string of 10 numbers no greater than N. The program was deemed correct IF a) the output for run R1 showed no pattern; and b) invocations R1, R2, … RX showed no repetition when compared to each other. I would test my program by running it 5 times with N=500 and comparing the 5 strings of 10 numbers. While the coding was easy, the algorithm was not trivial. (I think something called a “seed” is crucial? It was over 30 years ago.)
Since run R can NOT know which seed was used in run R-1, then, of course, the process used to select the seed must itself be random. Otherwise, there would be a repetition of the first number in each successive run, which was a common first error for students.
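For readers who never had that homework, here is a rough sketch of the idea in Python. The algorithm is my assumption, not a recollection of the coursework: it is a linear congruential generator using one well-known set of 32-bit constants, the output range of 1 to N is assumed, and the names `random_string` and `fresh_seed` are made up for this example. It is meant only to show why the seed matters: the same seed reproduces the same ten numbers, and a seed pulled from the operating system's entropy source is one common way to avoid that.

```python
import os

# Multiplier, increment, and modulus of one well-known 32-bit linear
# congruential generator (the "Numerical Recipes" constants). The choice of
# algorithm is an assumption; the coursework algorithm is not known.
A, C, M = 1664525, 1013904223, 2**32


def random_string(n, seed, count=10):
    """Return `count` pseudo-random integers in 1..n (range assumed)."""
    state = seed % M
    numbers = []
    for _ in range(count):
        state = (A * state + C) % M         # advance the generator one step
        numbers.append(state * n // M + 1)  # scale to 1..n using the high bits
    return numbers


def fresh_seed():
    """Pick a seed unrelated to any earlier run: 4 bytes of OS entropy."""
    return int.from_bytes(os.urandom(4), "big")


if __name__ == "__main__":
    print(random_string(500, seed=42))        # same seed ...
    print(random_string(500, seed=42))        # ... same ten numbers again
    print(random_string(500, fresh_seed()))   # new seed, new numbers
```

The first two lines printed are identical, which is exactly the student error described above. Note that this only moves the problem: the program is as "random" as its seed, no more.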
Relevant to the random number generator, what is one difference between the 32-bit operating system (OS) and the 64-bit OS? Hint: “A man’s got to know his limitations.” — Dirty Harry
3. Choosing an item by sight from a list at random. My proposition: Humans will always fail.
Some here will agree entirely and don’t require persuasive evidence. (No, not “proof.”) For the reflexively intractable and rebellion-addicts, I’ll give my argument, the evidence — adjective, “persuasive,” omitted for you antipodeans, who, apparently, may be constitutionally incapable of agreeing with anyone, except possibly, your mommies.
The scenario, given: (I do not have a random number generator; this is about unaided human attempts to achieve “random.”) I’m in the lobby of a medical building, where there is a single-column listing (vertical list) hanging on the wall. It contains names of primary care doctors, what we used to call “general practitioners.” I need to choose one to become my new doctor.
I want to make a random choice.
Having had many doctors, I know that I have a bias for one of the opposite sex (don’t ask). I want to avoid this bias and choose from the entire list, not a subset by preferred sex. I know that I shouldn’t get close enough to read any names. The list has length L. If I actually could, then I would choose a random number between 1 and L. I don’t know how to do that. Does anyone? People claim to do it at every turn. I decide to write the word “doctor” L times, vertically, on a piece of paper.
Before I attempt to make a “random” choice from this list, I now leave this fictional scenario to tell of personal experience. I’ve attempted to make random choices many dozens of times throughout my life. Here is what I consider evidence that humans can’t do it: A few years ago, I realized that I never once chose the FIRST item in any “random” attempt. Obviously, any random choice from a list MIGHT result in item 1, the first.
My next realization is that I’ve never chosen the LAST item either. I had a bias against the top (first) and bottom (last) items in all those many dozens of lists. I don’t consider myself UNIQUELY biased or part of some small group “3 standard deviations” out from the mean of the bell curve. Therefore, my conclusion: There is a natural tendency to avoid choosing the first and last items in such lists.
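How unlikely is that record if the choices had really been random? A back-of-the-envelope sketch in Python, under assumptions that are mine and not measured: every list has the same length, every pick is independent and uniform, and the figures used (10 items, 36 attempts as a conservative stand-in for "many dozens") are hypothetical. The function name is made up for this example.

```python
def chance_of_avoiding_ends(list_length, attempts):
    """Probability that `attempts` independent uniform picks from a list of
    `list_length` items never land on the first or the last item."""
    if list_length < 3:
        raise ValueError("need at least 3 items so that a middle item exists")
    # Each pick misses both ends with probability (L - 2) / L.
    return ((list_length - 2) / list_length) ** attempts


if __name__ == "__main__":
    # Hypothetical figures: 36 attempts on lists of 10 items each.
    print(chance_of_avoiding_ends(10, 36))
```

With those assumed figures the chance comes out well below one in a thousand. Longer lists raise it and more attempts lower it, so the exact number depends on details I did not record.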
I can hear it now: My account is anecdotal and, hence, not evidence of anything! Nice try. The charge of inconclusive solely for the reason of being a personal story is valid if and only if the anecdote is NOT representative of a pervasive human tendency. I’d prefer to take it further, that all humans will fail due to myriad biases, but there’s no need for me to do that. I won’t for this reason: My story is illegitimate if any human can achieve a random choice.
Those of you who think it possible, convince us all by ANY means of your choosing. I bet you can’t. You will win the keys to the kingdom if you do.
(Full disclosure: The story about the list of doctors in the medical building is based on an actual, famous personal account from 1935, telling us that “random” has been misunderstood for at least that long. The list was located in a hotel lobby and contained local churches; the claim in the story is that the man chose one “at random.” I doubt very much that he chose either the first or last church in the list. That list may have been saved for historical reasons, as well as the name of the church chosen. I should find out to add another “anecdote” as evidence.)