Complete this project in groups of 2-3 students. Decide upon a research question about the difference between populations or subpopulations. Gather sample data, use t-tests to test for differences in the means, and find linear regression models and co-variate models for the data. Each group will present their results to the class (5-10 minutes). The division of tasks is left up to the team, but each team member should be involved in some way at every phase of the project.
Your group must select two populations (e.g., males/females or pitchers/outfielders) with at least two (2) quantitative response variables that you believe may be linked to each other and may vary between your two populations (e.g., height/weight, games per season/salary, price/weight or volume) and at least five (5) quantitative explanatory variables. Your subjects do not necessarily need to be people. You should use the internet to gather your data. There are plenty of interesting research questions out there, so use your IMAGINATIONS and be ORIGINAL.
Your data must include at least 100 subjects within each of your two populations.
For each of the quantitative response variables, use a 2-sample t-test (or matched-pairs t-test) procedure to test for a difference in the means of the two populations.
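As a rough illustration of the two-sample procedure described above, here is a minimal Python sketch using only the standard library. It assumes the unpooled (Welch) form of the test, which the assignment does not specify, and it approximates the p-value with the normal distribution, which is only reasonable for large samples such as the 100 or more subjects per population required here. The function name `welch_t_test` and the data values are hypothetical and made up for illustration; the eight values per group are far too few for the normal approximation and only keep the example short.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Return (t, df, approx_p) for a two-sided Welch two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    # Squared standard error contributed by each sample.
    se2_a, se2_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))

    # ASSUMPTION: with large df the t distribution is close to the standard
    # normal, so the two-sided p-value is approximated with the normal CDF.
    approx_p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return t, df, approx_p


# Hypothetical, made-up measurements (illustration only, not real data).
heights_group_1 = [70.1, 68.4, 71.3, 69.8, 72.0, 67.9, 70.5, 69.2]
heights_group_2 = [64.2, 65.9, 63.8, 66.1, 64.7, 65.3, 63.5, 66.4]

t, df, p = welch_t_test(heights_group_1, heights_group_2)
print(f"t = {t:.3f}, df = {df:.1f}, approximate p = {p:.4g}")
```

For an exact p-value, look up the t statistic and degrees of freedom in a t table or use statistical software. A matched-pairs test would instead apply a one-sample t procedure to the within-pair differences.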
Your group should select quantitative variables that you believe may be linked (correlated). One of those variables should be regarded as an explanatory variable and the other as the response variable. You need to explain how you decide which variables are the response and explanatory variables.
Create several scatterplots showing the relationship between the response and explanatory variables for each population. I do NOT want you to construct a scatterplot for all possible combinations of response and explanatory variables. Instead, you need to choose several pairs of response and explanatory variables. You need to explain why you chose each particular pair of response and explanatory variables to graph (and why you did not choose another pair). You can choose a pair of response and explanatory variables to show there is NO relationship (e.g., finding that stolen bases are not correlated with the number of runs in baseball is an important result). You will construct a linear model for each scatterplot. You will need to explicitly state the model, calculate the Pearson correlation coefficient, and calculate the coefficient of determination.
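The three quantities requested for each scatterplot (the fitted line, the Pearson correlation coefficient r, and the coefficient of determination r^2) can be computed from the same sums of squares. The following Python sketch shows one way to do this with the standard library only. The function name `simple_linear_fit` and the height/weight values are hypothetical and made up for illustration.

```python
import math
import statistics


def simple_linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, plus Pearson r and r^2."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)

    # Sums of squares and cross-products about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r = s_xy / math.sqrt(s_xx * s_yy)  # Pearson correlation coefficient
    return slope, intercept, r, r ** 2


# Hypothetical, made-up data: explanatory = height (in), response = weight (lb).
height = [63, 65, 66, 68, 70, 71, 73, 74]
weight = [128, 140, 139, 155, 168, 171, 186, 190]

slope, intercept, r, r_squared = simple_linear_fit(height, weight)
print(f"model: weight = {intercept:.2f} + {slope:.2f} * height")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

In a simple linear model with one explanatory variable, the coefficient of determination is just the square of r, which is why the function returns `r ** 2` rather than computing it separately.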
Find several co-variate models between the response variable and two or more co-variates (i.e., explanatory variables) for each population. I do NOT want you to construct a co-variate model for all possible combinations of response and explanatory variables. Instead, you need to choose the response variables and the co-variates. You need to explain why you chose each particular set of variables. You will need to calculate the coefficient of determination for each co-variate model.
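A co-variate model with two or more explanatory variables can be fit by ordinary least squares. The Python sketch below is one possible approach using only the standard library: it builds the normal equations and solves them by Gaussian elimination, then reports the coefficient of determination. The function names `solve_linear_system` and `fit_covariate_model` and all data values are hypothetical and made up for illustration. In practice a statistics package would normally do this step.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_covariate_model(covariates, response):
    """Fit y = b0 + b1*x1 + ... + bk*xk; return (coefficients, r_squared).

    covariates: list of rows, one row of co-variate values per subject.
    """
    # Design matrix with a leading column of ones for the intercept.
    design = [[1.0] + list(row) for row in covariates]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    coeffs = solve_linear_system(xtx, xty)

    fitted = [sum(c * v for c, v in zip(coeffs, r)) for r in design]
    y_bar = sum(response) / len(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in response)
    return coeffs, 1 - ss_res / ss_tot


# Hypothetical, made-up data: co-variates = (height in inches, age in years),
# response = weight in pounds.
height_age = [(68, 25), (70, 31), (65, 22), (72, 40), (66, 28), (71, 35)]
weight = [160, 175, 140, 195, 150, 185]

coeffs, r_squared = fit_covariate_model(height_age, weight)
print("intercept, b_height, b_age =", [round(c, 3) for c in coeffs])
print(f"coefficient of determination = {r_squared:.3f}")
```

The sketch assumes the co-variates are not perfectly collinear; if one co-variate is an exact linear combination of the others, the elimination step divides by zero and the model has no unique solution.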
The in-class presentation is expected to be a PowerPoint presentation. It should include at least the following.
The following shows how this project will be graded.
|Portion||Up to one third||One-third to half||Full Points||Possible Points||Earned Points|
|Initial Proposal||Proposal missing or missing significant parts.||Proposal turned in half complete.||Both populations and both variables well defined||5|
|Formal Proposal||Proposal missing or missing significant parts.||Proposal turned in half complete.||Both populations and both variables well defined||5|
|Presentation||If you are absent during your group's presentation, then you will NOT receive credit for any part of the project listed below.|
|Introduction and Overview of Research||No statement of research question, variables not defined, hypothesis not stated or explained.||Research question, variables, and hypothesis explained adequately.||Research question, variables, and hypothesis explained exceptionally well and with relevance.||5|
|Data Collection||Poor design, description, or implementation of sampling methods; population not defined.||Good design, description, execution of design and sampling methods; population defined.||Exceptional design, description, & execution of survey/sampling; population well-defined.||10|
|Statistical Analysis: All Variables||No descriptive statistics provided for demographic and research variables.||Basic descriptive statistics provided for all variables (5 number summary, mean, standard deviation, etc.).||Thorough analysis of each research variable with appropriate graphs and discussion.||10|
|t-test||Neither test statistics correct nor interpretation of p-values.||Either test statistics correct or interpretation of p-values.||Test statistic correctly calculated, correct interpretation of p-values.||10|
|Scatterplots||Failed to display any scatterplots.||Displayed only one scatterplot.||Displayed all three scatterplots.||10|
|Regression||Failed to explain scatter-plot, r, or line of best fit correctly.||Explained scatter-plot, r, and line of best fit accurately.||In addition, explained r^2, and an example with prediction equation.||10|
|Conclusions||Report demonstrates no real-world understanding of the statistical results; no adequate explanation of findings.||Report demonstrates basic understanding of results and makes reasonable attempt to explain findings.||Report demonstrates thorough understanding of results and offers insightful explanation of findings.||20|
|Overview, Clarity, Poise, & Timeliness||Did not explain hypothesis or real world context of results; spoke hesitantly, lacked poise, or did not finish in time allotted.||Adequately described hypothesis and real world context of results; well-spoken, poised, and within time allotted.||Explained hypothesis and real world context of results well and with relevance; polished and professional presentation.||5|
|Technical||Trouble with PPT file or graphics, distracting colors or sounds, lacked coordination with speaker.||Good PPT slides, nice graphics and layout, good teamwork with speaker.||Beautiful PPT colors, themes, graphics, layout, and teamwork; a pleasant and seamless visual experience.||5|
|Collegiality||Missing form or average score less than 3.||Average score between 3 and 4.||Average score of 5.||5|
1. List the members of the group. Did you exchange contact info (cell numbers and/or email addresses - do NOT include these on this form)?
2. What subpopulations are you going to compare? (Examples: males vs. females, NBA salaries vs. NFL salaries)
3. What quantitative variables are you going to use? Have you found a dataset? If yes, then send the dataset to Dr. Weber by email. If no, have you started searching for data? Where have you searched? List the URL (web address) of the sites you plan to use.
1. List your group members.
2. What is your research question?
a) What two populations are you studying?
b) What are the quantitative variables of interest you are studying?
3. Send your dataset to Dr. Weber by email.
Stats Project Worksheet
Consider the following questions.
Did each member of the group contribute equally...
List the OTHER members of your group and beside each name give them a score from 0 to 5 to reflect how much you believe that person contributed to the project OVERALL (0 = no participation, 5 = equal/satisfactory participation).
Self___________________________ Grade: ______
Name_________________________ Grade: ______
Name_________________________ Grade: ______
Your responses will be kept confidential. Recall that the grading rubric for this project indicates that each person's Collegiality Score is the average of the scores given to that person by the OTHER members of the group. The exception is if you fail to return this form, in which case your Collegiality Score will be 0.