
1: Collecting Data in Reasonable Ways

To begin a statistical study, you must first ask a question about a population, then collect data.

Collecting Data

Types of Studies

Experimental

Experimental studies consist of experiments. Typical features of an experimental study are control over the different conditions and, typically, no *freedom* for the participants to choose their own condition

Observational

Examples of observational studies include polling, surveying, research, or simulations. Typical features of an observational study are no control over conditions (just *observing*) and a sense of freedom for the participants

Sampling

Census

Counts everything. Take the U.S. Census, for example, which (ideally) gathers statistics for every American. The main cons of this method are that it is too resource-intensive and not cost-effective

Samples

Therefore, if a census cannot be done, you want to choose a smaller set, known as a sample. Note that it must be representative.

Parameter and Statistics

Population Characteristic (Parameter): A number that describes the entire population (the true value for the population).

Statistic: A number that describes the sample

Tip

Finding a parameter exactly requires a census, while a statistic requires only a sample
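To make the distinction concrete, here is a minimal sketch. The population of heights is made up for illustration (it is an assumption, not data from these notes), and the seed is fixed only so the run is repeatable.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 1,000 people (invented data)
population = [rng.gauss(170, 10) for _ in range(1000)]

# Parameter: computed from every member of the population (a census)
parameter = mean(population)

# Statistic: computed from a sample of only 50 members
sample = rng.sample(population, 50)
statistic = mean(sample)

print(f"parameter (census mean): {parameter:.2f}")
print(f"statistic (sample mean): {statistic:.2f}")
```

The two numbers should be close but not identical: the statistic is only an estimate of the parameter.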

Methods for Sampling

Simple Random Sample (SRS)

Example: Using a computer as a random number generator, names in a bucket, etc.

Cons: By chance, you may end up with a sample of similar (homogeneous) people, or with repeats if you sample with replacement

Warning

Check whether you need to specify “without replacement” (each person/object can be chosen only once)
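A small sketch of an SRS using a computer as the random number generator, with and without replacement. The list of names is hypothetical; `random.sample` and `random.choices` are from Python's standard library.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of names ("names in a bucket")
names = ["Ana", "Ben", "Cam", "Dee", "Eli", "Fay", "Gus", "Hoa"]

# Without replacement: each name can be drawn at most once
without_replacement = rng.sample(names, k=4)

# With replacement: the same name may be drawn more than once
with_replacement = rng.choices(names, k=4)

print(without_replacement)
print(with_replacement)
```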

Stratified Random Sample

Group people/objects into “strata” that are homogeneous, then SRS each group.
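As a rough sketch of the idea, the function below groups units into strata and then takes an SRS from each one. The helper name `stratified_sample`, the student list, and the choice of grade level as the stratum are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, rng):
    """Group units into strata, then take an SRS of per_stratum from each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))  # SRS within a stratum
    return sample

# Hypothetical population: 25 students in each of four grade levels
students = [(f"g{grade}-s{i}", grade) for grade in (9, 10, 11, 12) for i in range(25)]

# Strata are grade levels; take 5 students from each
picked = stratified_sample(students, lambda s: s[1], 5, random.Random(2))
print(len(picked))
```

Every stratum is guaranteed to appear in the sample, which a plain SRS cannot promise.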

Cluster Sampling

Group people/objects into clusters that are non-homogeneous (choose groups where people/objects may inherently be different), then SRS the clusters and census the people within each chosen cluster

Examples would include classroom or community
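The two steps (SRS of clusters, then a census within each chosen cluster) might look like the sketch below. The classroom names, student labels, and the helper `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters by SRS, then include everyone in each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)   # SRS of cluster names
    members = [m for name in chosen for m in clusters[name]]  # census within
    return chosen, members

# Hypothetical clusters: classrooms and the students in them
classrooms = {
    "Room A": ["a1", "a2", "a3"],
    "Room B": ["b1", "b2", "b3", "b4"],
    "Room C": ["c1", "c2"],
    "Room D": ["d1", "d2", "d3"],
}

chosen, members = cluster_sample(classrooms, 2, random.Random(3))
print(chosen, members)
```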

Convenience Sample

A very easy way to sample, but a terrible one. It consists of using whoever or whatever is most accessible, so the sample is unlikely to be representative

Systematic Sample

Systematically select people (e.g., every 10th person)

Isn’t ideal, as you could over-represent certain people, for instance when the list has a repeating pattern that lines up with the interval
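A minimal sketch of picking every k-th unit from an ordered list. The random starting point is an assumption added here (a common convention, not something these notes specify), and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(units, k, rng):
    """Take every k-th unit, starting at a random position among the first k."""
    start = rng.randrange(k)  # assumption: random start within the first k units
    return units[start::k]

# Hypothetical ordered list of 100 people, identified by number
people = list(range(1, 101))

picked = systematic_sample(people, 10, random.Random(4))
print(picked)
```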

Biases

Voluntary Response Bias

A type of bias that arises when people volunteer to take part. The problem is that it often results in samples dominated by strongly opinionated people

Non-response Bias

People who don’t respond might differ from those who do, which can have an effect on the results

Undercoverage Bias (Selection Bias)

The way of choosing people leaves out part of the population, whether intentionally or not

Response Bias (Question Wording)

Twisting the truth, or framing the question in such a manner that it biases people towards specific response(s)

Experimental Studies

Experimental Design

Terms

Treatments: Experimental Conditions

Explanatory Variable: Variable that is manipulated

Response Variable: Variable which is affected

Experimental Unit: Person/object receiving treatment

Replication: Applying each treatment to many experimental units

Confounding Variable: A variable other than the explanatory variable that might affect the result

Direct Control

Keep variables as similar as possible across treatments (this applies to the variables you can control)

Variables you cannot control are handled by randomly assigning units to treatments

Methods for Assigning Treatments

Completely Randomized Design: Every experimental unit is assigned to a treatment completely at random

Randomized Block Design: Split the units into blocks, then randomly assign treatments to the units within each block

Tip

Blocks are groups that are homogeneous, like strata
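The two designs can be sketched as follows. The function names, the participants, the age-group blocks, and the treatment labels are all hypothetical; dealing treatments out in turn after a shuffle is one simple way to randomize, not the only one.

```python
import random
from collections import defaultdict

def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal treatments out in turn (equal group sizes)."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {u: treatments[i % len(treatments)] for i, u in enumerate(shuffled)}

def randomized_block(units, block_of, treatments, rng):
    """Form homogeneous blocks, then randomize treatments within each block."""
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments, rng))
    return assignment

# Hypothetical participants, blocked by age group
participants = [(f"p{i}", "young") for i in range(8)] + \
               [(f"p{i}", "old") for i in range(8, 16)]

assignment = randomized_block(
    participants, lambda p: p[1], ["drug", "placebo"], random.Random(5)
)
for unit, treatment in sorted(assignment.items()):
    print(unit, treatment)
```

Blocking guarantees each age group is split evenly between the treatments, so age cannot end up lopsided between them by chance.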

Placebo

Treatment that looks real, but should do nothing. This is often, but not always, a form of control group, and is typically used within medicine

Blinding

Single Blind: People do not know which treatment they received

Double Blind: Neither the people receiving the treatments nor the people giving them know which treatment was received

Tip

Not all experiments can be blinded

Study Conclusions

Experimental Study Conclusion: Cause and Effect relationship, otherwise known as causation

Observational Study Conclusion: Association (correlation)