Introduction to Data Collection and Observational Studies

Unit 3, Topics 3.1–3.2: Introduction to Data Collection and Observational Studies

Overview

This lesson introduces how data are collected to answer statistical questions and distinguishes observational studies (recording variables as they naturally occur, without changing anything) from experiments (deliberately imposing a treatment to test its effect). Observational studies can reveal associations, but they cannot establish causation on their own because of confounding: a hidden factor related to both variables may produce the pattern. Context, such as the population studied, matters because it determines how far the results can be generalized (e.g., surveying only your friends misses the views of the broader population).

Data collection starts with a question. Observational studies draw on data that already exist or can be recorded without intervention, such as surveys and records, but limitations like bias can lead to misleading conclusions.
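
The idea of confounding can be made concrete with a small simulation. The sketch below is a toy model, not real data: it assumes a hypothetical yes/no "stress" variable that raises screen time and lowers sleep, while screen time itself has no effect on sleep at all. The function names and all numbers are invented for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate_students(n=2000, seed=0):
    """Return (stressed, screen_hours, sleep_hours) rows for n made-up students.

    Assumption: stress is the only thing linking the two variables.
    Screen time never enters the formula for sleep.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        stressed = rng.random() < 0.5
        screen = 3 + (3 if stressed else 0) + rng.gauss(0, 1)
        sleep = 8 - (2 if stressed else 0) + rng.gauss(0, 1)
        rows.append((stressed, screen, sleep))
    return rows


def correlation(rows):
    """Correlation between screen time and sleep for the given rows."""
    return pearson_r([r[1] for r in rows], [r[2] for r in rows])


if __name__ == "__main__":
    rows = simulate_students()
    print("all students:  ", round(correlation(rows), 2))
    print("stressed only: ", round(correlation([r for r in rows if r[0]]), 2))
    print("not stressed:  ", round(correlation([r for r in rows if not r[0]]), 2))
```

In this toy model the correlation across all students is clearly negative, yet within each stress group it is close to zero. An observational study that never recorded stress would see only the overall association and could wrongly attribute it to phone use.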

Assignment

Part 1: Guided Practice Activity

Work on your own. Use the example below to practice identifying questions and describing observational studies.

Example: Survey 50 students about phone use and sleep quality. (Data collected: hours slept per night vs. screen time per day.)

Tasks:

  1. Identifying Questions
    • Write one statistical question that requires data collection.
      Example: "Does more phone use relate to less sleep?"

    • Distinguish between an observational study and an experiment.
      Example:

      Observational: Record students' current phone and sleep habits without changing them.
      Experiment: Randomly assign students to groups with different phone-time limits, then compare their sleep.

    • Write 1-2 sentences explaining associations vs. causation.
      Example: "An association between phone use and less sleep may appear, but causation requires ruling out confounders like stress via an experiment."

    • Extra Practice: For data on smoking and health, write a statistical question. Then, distinguish between study types.

  2. Describing Observational Studies
    • Describe what an observational study is (observing relationships without manipulating any variable) and note its limitations.
      Example: "It suggests links but can't control for confounders like age that may affect results."

    • Provide a real-world example.
      Example: "Coffee drinkers often report more energy, but diet may confound this."

    • Write 1-2 sentences about data sources and their limitations.
      Example: "Surveys are simple but can have voluntary response bias, skewing toward extreme views."

    • Extra Practice: For your smoking example, describe specific limitations.

  3. Reflection
    • Write 2-3 sentences interpreting findings in context.
      Example: "Among the teens surveyed, an association between phone use and sleep suggests that phone habits may affect sleep, but causation remains unclear because confounders were not controlled."
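
The voluntary response bias mentioned in Task 2 can also be illustrated with a short simulation. This is a minimal sketch under invented assumptions: opinions are scored 1 to 5, and people holding the extreme views (1 or 5) are assumed to respond with probability 0.6 while everyone else responds with probability 0.1. None of these numbers come from a real survey.

```python
import random


def simulate_voluntary_response(n_population=10_000, seed=0):
    """Return (population, respondents) as lists of opinion scores from 1 to 5."""
    rng = random.Random(seed)
    population = [rng.randint(1, 5) for _ in range(n_population)]
    respondents = []
    for opinion in population:
        # Assumed response rates: strong opinions are far more likely to reply.
        p_respond = 0.6 if opinion in (1, 5) else 0.1
        if rng.random() < p_respond:
            respondents.append(opinion)
    return population, respondents


def share_extreme(opinions):
    """Fraction of opinions that are at either extreme (1 or 5)."""
    return sum(1 for o in opinions if o in (1, 5)) / len(opinions)


if __name__ == "__main__":
    population, respondents = simulate_voluntary_response()
    print("extreme share, whole population:", round(share_extreme(population), 2))
    print("extreme share, respondents only:", round(share_extreme(respondents), 2))
```

Under these assumptions the respondents look much more polarized than the population they came from, which is why results from opt-in surveys should not be generalized without caution.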

Part 2: Independent Practice

Scenario: Poll 100 adults about exercise and weight loss. (Data collected: weekly exercise minutes vs. pounds lost.)

Tasks:

  • Write a statistical question. Then, distinguish observational vs. experimental design.
  • Describe the observational approach. Note limits on associations and causation.
    Example: "Observe current routines; diet may confound weight loss claims."
  • Write 2-3 sentences on data sources/limitations and interpreting findings in context.
    Example: "Online polls rely on self-reported data and may suffer from response bias; among busy adults, more exercise may be associated with modest weight loss, but the poll cannot establish causation."

  • Extra Activity: Invent a scenario (e.g., music and mood). Write a question, describe the study, and interpret findings with context.
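
To contrast the two designs in the exercise scenario: an experiment would not poll existing routines but would randomly assign adults to groups. The sketch below shows one simple way random assignment might be done. The participant IDs and the function name `randomly_assign` are hypothetical, chosen only for illustration.

```python
import random


def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle participants and deal them into n_groups groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal like cards: first person to group 0, next to group 1, and so on.
    return [shuffled[i::n_groups] for i in range(n_groups)]


if __name__ == "__main__":
    # Hypothetical IDs 1..100 standing in for the 100 adults in the scenario.
    adults = list(range(1, 101))
    exercise_group, control_group = randomly_assign(adults, n_groups=2, seed=42)
    print(len(exercise_group), len(control_group))
```

Because chance alone decides who lands in each group, confounders such as diet tend to balance out between the groups on average. That balance is what allows an experiment, unlike the poll, to support a causal conclusion.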

Homework Assignment

    • Choose a real-world question (e.g., social media and happiness).
    • Describe an observational study for it.
    • Note limitations and confounding factors.
    • Interpret potential findings in context. Be ready to share for class discussion.
Last modified: Sunday, 9 November 2025, 7:10 PM