(Python) Setting Students Up For Success: Exam Score Analysis
- csgreene9
- Apr 9, 2022
Updated: Apr 17, 2022
Catherine Greene | April 2022

Every year, students at School A are required to take exams in math, reading, and writing. The principal of this school has provided the students with optional test preparation courses and would like to see if they were helpful.
The analysis aims to answer:
What are the average reading scores for students with/without the test preparation course?
What are the average scores for the different parental education levels?
Do kids who perform well on one subject also score well on others?
Is there a correlation between scores?
Within each subgroup, how do average scores differ between students with/without the test preparation course across parental education levels?
About the Data
The dataset contains the student information and their test scores. There are 1000 records and 8 fields. It was retrieved from DataCamp.
Sample of the Data

Cleaning the Data
The data was quite clean, with zero nulls. Even so, one female student immediately stands out as having no usable math score. This could be because she either did not take the exam or failed it completely. As it is unclear whether this student's scores are correct, and there is no way to verify them at this time, she will NOT be removed.
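A score can look missing even when a null check comes back clean, for example if it is stored as 0. The sketch below shows both checks on a tiny made-up sample. It assumes the gap shows up as a zero, and the column names and values are placeholders rather than the real DataCamp file. It uses only the standard library, which may not be how the original analysis was done.

```python
import csv
import io

# Toy stand-in for the real file. Column names and values are assumptions.
SAMPLE_CSV = """gender,test_prep_course,math,reading,writing
female,none,0,17,10
male,completed,72,70,68
female,completed,88,95,92
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Count empty cells per column (the CSV analogue of a null check).
null_counts = {
    col: sum(1 for r in rows if r[col].strip() == "") for col in rows[0]
}

# Flag rows whose math score is zero: present in the file, but suspicious.
zero_math = [r for r in rows if int(r["math"]) == 0]

print(null_counts)
print(zero_math)
```

Flagging the row instead of dropping it matches the decision above to keep the student in the data.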
Explore the Data

The scores for the three subjects all have a slightly left-skewed distribution. This means that, for each subject, most scores cluster toward the higher end, with a longer tail of low scores pulling the mean slightly below the median.
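Skew can also be checked numerically rather than by eye. The sketch below computes the moment coefficient of skewness for a made-up list of scores; a negative value indicates a longer left tail. The `skewness` helper and the sample values are illustrative only and are not taken from the dataset.

```python
from statistics import mean, median


def skewness(values):
    """Moment coefficient of skewness: negative means a longer left tail."""
    n = len(values)
    mu = mean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Toy scores with one low outlier, not real exam results.
scores = [35, 58, 62, 66, 70, 72, 75, 78, 80, 84]

print("skewness:", round(skewness(scores), 2))
print("mean:", mean(scores), "median:", median(scores))
```

For left-skewed data like this sample, the mean sits below the median, which is a quick sanity check on the sign of the coefficient.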

64% of students did not take the test prep course, while 36% did. The average reading score was 73.89 for students who took the prep course and 66.53 for those who did not.
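The two figures above (share of students and mean reading score per group) come from the same grouping step. A minimal sketch of that step, using toy records and assumed field names (`test_prep_course`, `reading`) instead of the real data:

```python
from collections import defaultdict
from statistics import mean

# Toy records. Field names and values are assumptions for illustration.
students = [
    {"test_prep_course": "completed", "reading": 78},
    {"test_prep_course": "none", "reading": 64},
    {"test_prep_course": "none", "reading": 70},
    {"test_prep_course": "completed", "reading": 74},
    {"test_prep_course": "none", "reading": 66},
]

# Collect reading scores under each prep status.
reading_by_prep = defaultdict(list)
for s in students:
    reading_by_prep[s["test_prep_course"]].append(s["reading"])

for status, scores in reading_by_prep.items():
    share = len(scores) / len(students)
    print(f"{status}: {share:.0%} of students, mean reading {mean(scores):.2f}")
```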

The plot above indicates that the higher the parent's education level, the greater the student's average reading score.
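Parental education is an ordered category, so the averages are easier to read when reported from lowest to highest level and not alphabetically. The sketch below does that with the six levels named in this post; the score pairs are made up, and the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Education levels in ascending order, as listed in this post.
EDUCATION_ORDER = [
    "some high school",
    "high school",
    "some college",
    "associate's degree",
    "bachelor's degree",
    "master's degree",
]

# Toy (parent education, reading score) pairs, not real records.
records = [
    ("master's degree", 80),
    ("high school", 62),
    ("some college", 70),
    ("high school", 66),
    ("master's degree", 76),
    ("some high school", 63),
]

scores_by_level = defaultdict(list)
for level, reading in records:
    scores_by_level[level].append(reading)

# Report in ordinal order, skipping levels absent from the toy data.
ordered_means = [
    (level, mean(scores_by_level[level]))
    for level in EDUCATION_ORDER
    if level in scores_by_level
]
for level, avg in ordered_means:
    print(f"{level:<20} {avg:.2f}")
```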

There is a correlation between the different subject scores. The strongest correlation is between the reading and writing subjects. If students do well in one subject, they are more likely to do well in the others.
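For reference, this is roughly how a pairwise correlation between subjects can be computed. The sketch implements the Pearson coefficient by hand on made-up scores; whether the original analysis used Pearson or another measure is an assumption, and the `pearson` helper is illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy scores for five students, not real exam results.
scores = {
    "math": [60, 72, 45, 88, 67],
    "reading": [66, 70, 52, 90, 75],
    "writing": [64, 71, 50, 92, 74],
}

# Print the coefficient for every pair of subjects.
for a, b in combinations(scores, 2):
    print(f"{a} vs {b}: {pearson(scores[a], scores[b]):.2f}")
```

Values close to 1 mean students who score high in one subject tend to score high in the other; values near 0 mean no linear relationship.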

The table above shows the distribution of grades by race/ethnicity and parent education level. The lowest average scores belonged to group A students whose parent had some high school education and who did not take the prep course. The highest average scores belonged to group E students whose parent had a master's degree and who DID take the prep course. Group A had the lowest scores overall, aside from the outlier in group C.
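A table like this amounts to grouping on three keys at once and averaging within each combination. The sketch below shows the idea on a handful of invented rows. The tuple layout and all values are assumptions, and it averages the three subjects into a single number per student for brevity, which the original table may not have done.

```python
from collections import defaultdict
from statistics import mean

# Toy rows: (group, parent education, prep status, math, reading, writing).
rows = [
    ("group A", "some high school", "none", 50, 55, 52),
    ("group A", "some high school", "none", 54, 57, 50),
    ("group E", "master's degree", "completed", 85, 88, 90),
    ("group C", "high school", "completed", 70, 72, 71),
]

# Average the three subjects per student, grouped by the three attributes.
per_subgroup = defaultdict(list)
for group, education, prep, *subject_scores in rows:
    per_subgroup[(group, education, prep)].append(mean(subject_scores))

subgroup_means = {key: mean(vals) for key, vals in per_subgroup.items()}

lowest = min(subgroup_means, key=subgroup_means.get)
highest = max(subgroup_means, key=subgroup_means.get)
print("lowest: ", lowest, round(subgroup_means[lowest], 2))
print("highest:", highest, round(subgroup_means[highest], 2))
```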
Conclusion
Many observations have been made regarding student scores and student characteristics. To answer the questions posed by the principal of School A: the average reading score was 66.53% for students who DID NOT take the test prep course and 73.89% for students who DID take it. That is a difference of 7.36 percentage points.
Average student scores are generally higher the higher the parent's education level. The average scores were 66.9% for students whose parents had some high school education, 64.7% for high school, 69.46% for some college, 70.93% for an associate's degree, 73% for a bachelor's degree, and 75.37% for a master's degree.
There is a correlation between the three subjects: students who perform well in one subject area tend to perform better in the others.
Finally, group A students whose parents had some high school education and who did not take the prep course had the lowest average scores in all subject areas. Group E students whose parents had a master's degree and who did take the prep course had the highest scores in all subject areas.
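The gap between the two reading averages quoted in the conclusion can be stated two ways: as an absolute gap in percentage points, or as a relative improvement over the no-prep average. The short calculation below uses the two averages reported in this post; the variable names are illustrative.

```python
# Average reading scores reported in this post.
with_prep = 73.89
without_prep = 66.53

# Absolute gap, in percentage points.
gap_points = round(with_prep - without_prep, 2)

# Relative improvement over the no-prep average.
relative_gain = gap_points / without_prep

print(f"{gap_points} percentage points")       # 7.36 percentage points
print(f"{relative_gain:.1%} relative improvement")  # 11.1% relative improvement
```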
Next Steps
A beneficial next step could be to explore the exact difference in scores between test prep and non-test prep students, grouped by race/ethnicity and parental education level, to see which groups should be prioritized when offering test prep. Similarly, checking for a correlation between lunch type and scores, as well as between lunch type and the percentage of students who took the test prep course, could shed light on the type of student who takes the course. Students who receive free lunches may be less likely to take the test prep for financial reasons, i.e., they might not have the means to stay (or come) for the course, they might have job obligations during that time, etc. Having this insight could allow School A to offer the course at a more convenient time, or alter the course altogether so that the material is accessible at any time.
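As a starting point for the lunch-type question, the share of students who completed the prep course can be computed per lunch type. The sketch below uses invented pairs; the labels `free/reduced`, `standard`, `completed`, and `none` are assumed category names, and nothing here reflects actual results.

```python
from collections import Counter

# Toy (lunch type, prep status) pairs. Labels and values are assumptions.
students = [
    ("free/reduced", "none"),
    ("free/reduced", "completed"),
    ("standard", "completed"),
    ("standard", "none"),
    ("free/reduced", "none"),
    ("standard", "completed"),
]

# Students per lunch type, and how many of them completed the course.
totals = Counter(lunch for lunch, _ in students)
completed = Counter(lunch for lunch, prep in students if prep == "completed")

prep_rate = {lunch: completed[lunch] / totals[lunch] for lunch in totals}
for lunch, rate in prep_rate.items():
    print(f"{lunch}: {rate:.0%} completed the prep course")
```

A large gap between the two rates in the real data would support looking more closely at when and how the course is offered.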