1: Collecting Data in Reasonable Ways
To begin a statistical study, you must first ask a question about a population, then collect data
Collecting Data
Types of Studies
Experimental
Experimental studies consist of experiments. The researcher controls the different conditions and assigns them to participants, so participants typically don’t have the *freedom* to choose their own condition
Observational
Examples of observational studies include polls, surveys, research, and simulations. The researcher exercises no control over conditions and is just *observing*, so participants keep a sense of freedom
Sampling
Census
Counts everything. Take the U.S. Census, for example, which (ideally) gathers statistics for every U.S. resident. The drawback of a census is that it is resource intensive and often not cost effective
Samples
Therefore, if a census cannot be done, you want to choose a smaller subset of the population, known as a sample. Note that the sample must be representative of the population.
Parameter and Statistics
Population Characteristic (Parameter): A number that describes the entire population.
Statistic: A number that describes the sample
Tip
Knowing a parameter exactly requires a census, while a statistic only requires a sample
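A minimal sketch of the difference, assuming a made-up population of 1,000 heights (the data, the sample size of 50, and the variable names are all hypothetical and only for illustration):

```python
import random
import statistics

# Hypothetical population: 1,000 heights in cm, generated only for illustration.
rng = random.Random(0)
population = [rng.gauss(170, 10) for _ in range(1000)]

# Parameter: computed from every member of the population (needs a census).
parameter_mean = statistics.mean(population)

# Statistic: computed from a sample of 50 members drawn without replacement.
sample = rng.sample(population, 50)
statistic_mean = statistics.mean(sample)

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
```

The two numbers will usually be close but not equal. The statistic is an estimate of the parameter.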
Methods for Sampling
Simple Random Sample (SRS)
Example: Using a computer random number generator, drawing names from a bucket, etc.
Cons: By chance, you may get a sample of similar (homogeneous) people, or repeats if sampling with replacement
Warning
Check whether you need to specify “without replacement” when describing how the sample is drawn
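A short sketch of an SRS using Python's standard `random` module, assuming a hypothetical list of names. `random.sample` draws without replacement (no repeats), while `random.choices` draws with replacement (repeats possible):

```python
import random

# Hypothetical population of names, for illustration only.
names = ["Ana", "Ben", "Chloe", "Dev", "Eli", "Fay", "Gus", "Hana"]
rng = random.Random(42)  # fixed seed so the draw can be reproduced

# Without replacement: each person can be chosen at most once.
without_replacement = rng.sample(names, k=4)

# With replacement: the same person may be chosen more than once.
with_replacement = rng.choices(names, k=4)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```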
Stratified Random Sample
Group people/objects into “strata” that are homogeneous, then take an SRS from each stratum.
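A minimal sketch of stratified sampling, assuming a hypothetical roster of 40 students with grade level as the stratum. The helper name `stratified_sample` and its parameters are made up for this example:

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, rng):
    """Group units into strata, then take an SRS of per_stratum from each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for label in sorted(strata):
        # SRS (without replacement) inside each stratum.
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample

# Hypothetical roster: 40 students, 10 in each of grades 9 to 12.
students = [(f"student{i}", 9 + i % 4) for i in range(40)]

picked = stratified_sample(students, lambda s: s[1], 3, random.Random(1))
print(picked)
```

Every grade is guaranteed to appear in the sample, which a plain SRS cannot promise.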
Cluster Sampling
Group people/objects into clusters that are non-homogeneous (choose groups where the people/objects inside may inherently be different from one another), then take an SRS of the clusters and census everyone within each selected cluster
Examples of clusters would include classrooms or communities
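A minimal sketch of cluster sampling, assuming six hypothetical classrooms of five students each. The helper name `cluster_sample` is made up for this example:

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Take an SRS of clusters, then include every member of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members

# Hypothetical clusters: 6 classrooms with 5 students each.
classrooms = {
    f"room{r}": [f"room{r}-student{s}" for s in range(5)]
    for r in range(6)
}

chosen_rooms, sampled_students = cluster_sample(classrooms, 2, random.Random(7))
print(chosen_rooms)
print(sampled_students)
```

Note the contrast with stratified sampling: here the randomness is in which groups get picked, and everyone inside a picked group is included.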
Convenience Sample
A very easy way to sample, but a terrible one. Consists of using whoever or whatever is the most accessible, which is unlikely to be representative
Systematic Sample
Systematically select people (e.g., every 10th person)
Isn’t ideal, as you could over-represent some people if the list has a pattern that lines up with the selection interval
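A minimal sketch of systematic sampling, assuming a hypothetical roster numbered 1 to 100 and an interval of 10. Picking the starting point at random is a common choice and is an assumption here, not something these notes specify:

```python
import random

def systematic_sample(units, k, rng):
    """Pick a random start among the first k units, then take every k-th unit."""
    start = rng.randrange(k)
    return units[start::k]

# Hypothetical roster of 100 people, identified by number.
roster = list(range(1, 101))

picked = systematic_sample(roster, 10, random.Random(3))
print(picked)
```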
Biases
Voluntary Response Bias
A type of bias that arises when people volunteer to take part. The problem with this is that it will often result in samples dominated by strongly opinionated people
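A toy illustration of how this can distort a result, assuming made-up opinion scores and the (deliberately extreme) assumption that only people with the strongest opinions volunteer:

```python
# Hypothetical opinion scores (1 = strongly against, 5 = strongly for), 100 people.
population = [1] * 30 + [2] * 10 + [3] * 40 + [4] * 15 + [5] * 5

# Assumption for illustration: only people with an extreme opinion volunteer.
volunteers = [score for score in population if score in (1, 5)]

population_mean = sum(population) / len(population)
volunteer_mean = sum(volunteers) / len(volunteers)

print(f"population mean: {population_mean:.2f}")  # 2.55
print(f"volunteer mean:  {volunteer_mean:.2f}")   # 1.57
```

The volunteers make the population look far more opposed than it is, because the moderate majority never responds.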
Non-response Bias
People who don’t respond may differ from those who do, which can have an effect on the results
Undercoverage Bias (Selection Bias)
The way of choosing people leaves out part of the population, whether intentionally or not
Response Bias (Question Wording)
Twisting the truth, or framing the question in such a manner that it pushes respondents towards specific response(s)
Experimental Studies
Experimental Design
Terms
Treatments: Experimental Conditions
Explanatory Variable: Variable that is manipulated
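Response Variable: Variable that is measured and may be affected by the explanatory variable
Experimental Unit: Person/object receiving a treatment
Replication: Applying each treatment to many experimental units
Confounding Variable: A variable other than the explanatory variable that might affect the result, so its effect cannot be separated from the effect of the treatment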
Direct Control
Keep the variables you can control as close to the same as possible across all treatment groups
Variables you cannot control are handled by random assignment of units to treatments
Methods for Assigning Treatments
Completely Randomized Design: Every experimental unit is assigned to a treatment at random
Randomized Block Design: Group units into blocks, then randomly assign treatments to the units within each block
Tip
Blocks are groups that are homogeneous, like strata
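A minimal sketch of both assignment methods, assuming eight hypothetical units, two treatments, and age group as the blocking variable. The function names are made up for this example, and shuffling then dealing out treatments in turn is just one simple way to randomize with equal group sizes:

```python
import random

def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal treatments out in turn (equal group sizes)."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

def randomized_block(blocks, treatments, rng):
    """Randomize separately inside each block so every block gets every treatment."""
    assignment = {}
    for block_units in blocks.values():
        assignment.update(completely_randomized(block_units, treatments, rng))
    return assignment

# Hypothetical units, blocked by age group.
blocks = {
    "younger": ["u1", "u2", "u3", "u4"],
    "older": ["u5", "u6", "u7", "u8"],
}
treatments = ["drug", "placebo"]
rng = random.Random(11)

all_units = [unit for block_units in blocks.values() for unit in block_units]
crd = completely_randomized(all_units, treatments, rng)
rbd = randomized_block(blocks, treatments, rng)

print("completely randomized:", crd)
print("randomized block:     ", rbd)
```

In the completely randomized design, one age group could end up mostly on one treatment by chance. The block design rules that out by balancing treatments inside each block.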
Placebo
A treatment that looks real, but should do nothing. It is often, but not always, what the control group receives, and is typically used within medicine
Blinding
Single Blind: Participants do not know which treatment they received
Double Blind: Neither the participants nor the people/person giving the treatments know which treatment was given
Tip
Not all experiments can be blinded
Study Conclusions
Experimental Study Conclusion: Can support a cause-and-effect relationship, otherwise known as causation
Observational Study Conclusion: Can only support an association (correlation), not causation