Purpose

The purpose of the final paper is to summarize your results for two interesting questions using a combination of figures, tables, and modeling techniques. Written communication is an integral part of data science. This is your opportunity to develop a high-quality blog post or article that could be published on the web and used in future job interviews.

Requirements

After the exploratory data analysis, your group should have two questions that are interesting, relevant, and worth sharing with the world. These questions should involve multiple variables and should be answerable by predictive modeling techniques. Be innovative and creative. Do not try to answer questions that have obvious solutions or have been extensively studied. Pick questions that would prompt a reader or fellow researcher to ask more questions or engage in discussion.

The final paper consists of four sections: Introduction, Data, Results, and Conclusion. Each section will be graded separately according to a rubric that combines objective and subjective requirements. A simple R Markdown template with predefined headings is provided on the course website. The template also contains requirements and suggestions for each of the four sections.

The Deliverer is responsible for compiling all the information into the R Markdown template provided on the course website. This document should be carefully proofread and submitted as an HTML file via Canvas by the due date. A penalty of at least 2 points will be applied if this document is submitted late. This penalty applies to your entire group.

In the final HTML document, there should be absolutely no R code. The writing and proofreading of the document should be shared by all members of the group. In each section, points are deducted for spelling and grammatical errors. All figures should have appropriate legends, titles, and colors. All tables should be displayed in HTML format using the xtable package or the kable function from the knitr package. Each required figure and table is worth 2 points: the first point is for how appropriate the figure or table is for the situation, and the second is for aesthetics. You are encouraged to use Markdown syntax for subsections, bold, italics, hyperlinks, tables, etc.
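One possible way to meet the "no R code" and HTML table requirements is sketched below. This is only an illustration, not part of the course template: the built-in mtcars dataset, the column labels, and the caption are placeholders, and your own setup may differ. The first call would normally sit in a setup chunk near the top of the R Markdown file; the second shows a table rendered with kable.

```r
# Hide R source, messages, and warnings in every chunk of the knitted HTML.
# (Assumed to be placed in a setup chunk; not taken from the course template.)
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)

# Placeholder data: mean fuel economy by cylinder count from the built-in
# mtcars dataset. Substitute your own summary data frame here.
summary_table <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Render the table as HTML with readable column names and a caption.
knitr::kable(
  summary_table,
  format = "html",
  digits = 1,
  col.names = c("Cylinders", "Mean MPG"),
  caption = "Mean fuel economy by cylinder count (placeholder data)."
)
```

With echo = FALSE set globally, only the output of each chunk (figures and tables) appears in the HTML file, so individual chunks do not need their own options.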

In addition to the rubric items for each individual section, you will be graded on the following aspects:

  • Professionalism: There should be no R code anywhere in your final document. All figures and tables should use the same colors for consistency. Format variable names in bold or italics. If the data were used in other articles, or if you drew on ideas from other authors, include references or acknowledgments. The writing should be of high quality, so proofread and edit across multiple drafts.

  • Difficulty, Creativity, Ambition: How does your group compare with what other groups did? Was there any attempt at web scraping or pulling in outside data? Did you attempt modeling techniques not used in class? Did you focus on interesting questions or look at obvious ones? Did you show an ability to create graphics that were not covered in class? What was the balance between the difficulty of your data and the difficulty of your analysis?

  • Followed Template: You can add subheadings using proper formatting, but you should follow the instructions in the template.

  • Spelling and Grammar: Every spelling or grammatical error will result in you losing points. This paper should be proofread for errors.

Rubric

Requirement                                                     Points
Introduction: 2 Questions Clearly Defined                       2
Introduction: Am I Interested?                                  2
Data: Adequately Describes Data                                 2
Data: 1 Descriptive Table                                       2
Data: 1 Descriptive Figure                                      2
Results: Appropriate Methods Using Multiple Models              3
Results: Adequately Explains Results                            3
Results: 4 Figures and/or Tables                                8
Conclusion: Summarize Questions with Results                    2
Conclusion: Do I Want to Learn More?                            2
Overall: Professionalism (No R Code, Formatting, Well-Written)  3
Overall: Difficulty, Creativity, and Ambition                   3
Overall: Followed Template                                      2
Overall: Spelling and Grammatical Errors                        4
Total                                                           40