Math 1431H

Spring 2014 Project Overview

Complete this project in groups of 2-3 students. Decide upon a research question about the difference between two populations or subpopulations. Gather sample data, use t-tests to test for differences in the means, and find linear regression models and co-variate models for the data. Each group will present its results to the class (5-10 minutes). The division of tasks is left up to the team, but each team member should be involved in some way at every phase of the project.

Overall Project Tasks

- Research Ideas: Pick a research topic that can be stratified into two populations.
- Data Collection: Identify a quantitative dataset that relates to the research topic in which you are interested.
- Hypothesis Tests: For each of the quantitative response variables, use R to perform a 2-sample t-test to test for a difference in the means of the two populations.
- Models: Use R to construct linear regression models and co-variate models for each of the two populations.
- Presentation: Each group must prepare a PowerPoint presentation to present in class.

Specific Project Tasks

a.-b. Research Ideas and Data Collection

Your group must select two populations (e.g., males/females or pitchers/outfielders) with at least two (2) quantitative response variables that you believe may be linked to each other and may vary between your two populations (e.g., height/weight, games per season/salary, price/weight or volume) and at least five (5) quantitative explanatory variables. Your subjects do not necessarily need to be people. You should use the internet to gather your data. There are plenty of interesting research questions out there, so use your IMAGINATIONS and be ORIGINAL.

Your data must include at least 100 subjects within each of your two populations.

d. Hypothesis Tests

For each of the quantitative response variables, use a 2-sample t-test (or matched-pairs t-test) procedure to test for a difference in the means of the two populations.
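
As one possible illustration of this step, the sketch below runs a 2-sample t-test in R. The data are simulated purely for demonstration, and the names `heights`, `group`, and `height` are hypothetical placeholders rather than part of the assignment; substitute your own dataset and variables.

```r
# Simulated data for illustration only: 100 subjects in each population.
set.seed(1)
heights <- data.frame(
  group  = rep(c("A", "B"), each = 100),
  height = c(rnorm(100, mean = 170, sd = 8),
             rnorm(100, mean = 165, sd = 8))
)

# Welch 2-sample t-test (R's default; it does not assume equal variances).
# H0: the two population means are equal. Ha: the means differ.
result <- t.test(height ~ group, data = heights)

result$statistic  # t statistic
result$p.value    # two-sided p-value
result$conf.int   # 95% confidence interval for the difference in means
```

For matched-pairs data, the usual call is `t.test(x, y, paired = TRUE)` on two equal-length vectors `x` and `y`, where each position holds the two measurements from one pair.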

e. Regression

Your group should select quantitative variables that you believe may be linked (correlated). One of those variables should be regarded as an explanatory variable and the other as the response variable. You need to explain how you decide which variables are the response and explanatory variables.

Create several scatterplots showing the relationship between the response and explanatory variables for each population. I do NOT want you to construct a scatterplot for all possible combinations of response and explanatory variables. Instead, you need to choose several pairs of response and explanatory variables. You need to explain why you chose each particular pair of response and explanatory variables to graph (and why you did not choose another pair). You can choose a pair of response and explanatory variables to show there is NO relationship (e.g., showing that stolen bases are not correlated with the number of runs in baseball is an important result). You will construct a linear model for each scatterplot. You will need to explicitly state the model, calculate the Pearson correlation coefficient, and calculate the coefficient of determination.
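
A minimal sketch of the scatterplot, linear model, r, and r^{2} workflow in R follows. It uses simulated data, and the names `players`, `height`, and `weight` are hypothetical; your own variables and population subsets would take their place.

```r
# Simulated data for illustration only: one population of 100 subjects.
set.seed(2)
players <- data.frame(height = rnorm(100, mean = 170, sd = 8))
players$weight <- 0.9 * players$height - 85 + rnorm(100, sd = 6)

# Scatterplot with the response (weight) on the vertical axis.
plot(weight ~ height, data = players)

# Least-squares linear model: weight = b0 + b1 * height.
fit <- lm(weight ~ height, data = players)
abline(fit)  # draw the fitted line on the scatterplot
coef(fit)    # b0 (intercept) and b1 (slope), needed to state the model

# Pearson correlation coefficient and coefficient of determination.
r <- cor(players$height, players$weight)
r_squared <- summary(fit)$r.squared
```

For a model with a single explanatory variable, `r_squared` equals `r^2`, which is a quick way to check your numbers before putting them on a slide.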

Find several co-variate models between the response variable and two or more co-variates (i.e., explanatory variables) for each population. I do NOT want you to construct a co-variate model for all possible combinations of response and explanatory variables. Instead, you need to choose the response variables and the co-variates. You need to explain why you chose each particular set of variables. You will need to calculate the coefficient of determination for each co-variate model.
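
One way a co-variate model might be fit in R is sketched below, assuming it means a linear model with two or more explanatory variables. The data are simulated, and the names `vehicles`, `price`, `weight`, and `horsepower` are hypothetical stand-ins for your own response variable and co-variates.

```r
# Simulated data for illustration only: one population of 100 subjects.
set.seed(3)
n <- 100
vehicles <- data.frame(
  weight     = rnorm(n, mean = 1500, sd = 200),
  horsepower = rnorm(n, mean = 120, sd = 25)
)
vehicles$price <- 5 * vehicles$weight + 60 * vehicles$horsepower +
  rnorm(n, sd = 1500)

# Single-variable model, kept for comparison.
fit_one <- lm(price ~ weight, data = vehicles)

# Co-variate model: the response modeled with two explanatory variables.
fit_two <- lm(price ~ weight + horsepower, data = vehicles)
coef(fit_two)  # intercept and one coefficient per co-variate

# Coefficient of determination for each model.
summary(fit_one)$r.squared
summary(fit_two)$r.squared
summary(fit_two)$adj.r.squared  # penalizes adding extra co-variates
```

Adding a co-variate never lowers r^{2}, so a small increase by itself is weak evidence that the extra variable matters; the adjusted value is one common way to account for this.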

f. Presentation

The in-class presentation is expected to be a PowerPoint presentation. It should include at least the following:

- Title sheet, names of group members
- Introduction, research questions
- Hypothesis tests
  - Null and alternative hypotheses
  - Results, p-values
- Regression
  - Scatterplots
  - Linear models for each scatterplot
  - r-values for each scatterplot
  - Interpretation of r-values
  - r^{2}-values for each model
  - Interpretation of r^{2}-values for each model
  - Co-variate models
    - r^{2}-values for each model
    - Interpretation of r^{2}-values for each model
- Conclusion
  - Summary of significant results
  - What did you learn from the project

The following shows how this project will be graded.

NAME_____________________________________________________________

Portion | Up to one third | One-third to half | Full Points | Possible Points | Earned Points |
---|---|---|---|---|---|
Preparation | | | | | |
Initial Proposal | Proposal missing or missing significant parts. | Proposal turned in half complete. | Both populations and both variables well defined. | 5 | |
Formal Proposal | Proposal missing or missing significant parts. | Proposal turned in half complete. | Both populations and both variables well defined. | 5 | |
Presentation | If you are absent during your group's presentation, then you will NOT receive credit for any part of the project listed below. | | | | |
Introduction and Overview of Research | No statement of research question, variables not defined, hypothesis not stated or explained. | Research question, variables, and hypothesis explained adequately. | Research question, variables, and hypothesis explained exceptionally well and with relevance. | 5 | |
Data Collection | Poor design, description, or implementation of sampling methods; population not defined. | Good design, description, and execution of design and sampling methods; population defined. | Exceptional design, description, and execution of survey/sampling; population well defined. | 10 | |
Statistical Analysis: All Variables | No descriptive statistics provided for demographic and research variables. | Basic descriptive statistics provided for all variables (5-number summary, mean, standard deviation, etc.). | Thorough analysis of each research variable with appropriate graphs and discussion. | 10 | |
t-test | Neither test statistics nor interpretation of p-values correct. | Either test statistics or interpretation of p-values correct. | Test statistic correctly calculated, correct interpretation of p-values. | 10 | |
Scatterplots | Failed to display any scatterplots. | Displayed only one scatterplot. | Displayed all three scatterplots. | 10 | |
Regression | Failed to explain scatterplot, r, or line of best fit correctly. | Explained scatterplot, r, and line of best fit accurately. | In addition, explained r^{2} and gave an example with the prediction equation. | 10 | |
Conclusions | Report demonstrates no real-world understanding of the statistical results; no adequate explanation of findings. | Report demonstrates basic understanding of results and makes reasonable attempt to explain findings. | Report demonstrates thorough understanding of results and offers insightful explanation of findings. | 20 | |
Overview, Clarity, Poise, & Timeliness | Did not explain hypothesis or real-world context of results; spoke hesitantly, lacked poise, or did not finish in time allotted. | Adequately described hypothesis and real-world context of results; well-spoken, poised, and within time allotted. | Explained hypothesis and real-world context of results well and with relevance; polished and professional presentation. | 5 | |
Technical | Trouble with PPT file or graphics, distracting colors or sounds, lacked coordination with speaker. | Good PPT slides, nice graphics and layout, good teamwork with speaker. | Beautiful PPT colors, themes, graphics, layout, and teamwork; a pleasant and seamless visual experience. | 5 | |
Collegiality | Missing form or average score less than 3. | Average score between 3 and 4. | Average score of 5. | 5 | |
Total | | | | 100 | |

Your name(s):
______________________________

Stats Project Worksheet

Initial Proposal

MATH 1431H

Complete this form (typed) and turn it in at the beginning of class on 6 February 2014.

I only want 1 per group.

1. List the members of the group. Did you exchange contact info (cell numbers and/or email addresses - do NOT include these on this form)?

2. What subpopulations are you going to compare? (Examples: males vs. females, NBA salaries vs. NFL salaries)

3. What quantitative variables are you going to use? Have you found a dataset? If yes, then send the dataset to Dr. Weber by email. If no, have you started searching for data? Where have you searched? List the URL (web address) of the sites you plan to use.

Your name(s):
______________________________

Stats Project Worksheet

Formal Proposal

MATH 1431H

Complete this form (typed) and turn it in at the beginning of class on 6 March 2014.

I only want 1 per group.

1. List your group members.

2. What is your research question?

a) What two populations are you studying?

b) What are the quantitative variables of interest you are studying?

3. Send your dataset to Dr. Weber by email.

Your name:
______________________________

Stats Project Worksheet

Collegiality Form

Complete this form (clearly handwritten) and submit it at the beginning of class on 10 April 2014.

I want a separate form from each person in the group.

Consider the following questions.

Did each member of the group contribute equally...

- ...to the initial proposal and formal proposal?
- ...to the data collection process?
- ...to the analysis of the data?
- ...to preparing the presentation?
- ...to the OVERALL project?

List the OTHER members of your group and beside each name give them a score from 0 to 5 to reflect how much you believe that person contributed to the project OVERALL (0 = no participation, 5 = equal/satisfactory participation).

Self___________________________ Grade: ______

Name_________________________ Grade: ______

Name_________________________ Grade: ______

Your responses will be kept confidential. Recall that the grading rubric for this project bases each person's Collegiality Score on the average of the scores given to that person by the OTHER members of the group. However, if you fail to return this form, your Collegiality Score will be 0.