Lance Inimgba

Logo

Data science enthusiast who loves getting lost in Jupyter notebooks šŸ˜‹
Love travel and foodie actvities!

View My LinkedIn Profile

View My GitHub Profile

Massachusetts School Project

IMPORTANT NOTE!! You may right click on any of the images and click ā€œOpen image in new tabā€ to enlarge any of the images below

In this project, I will be assuming the role of a new hire on the Massachusetts Department of Education data team. Iā€™ve been asked with doing the following by a fictional superintendent:

This project is something that hits somewhat close to me since I also worked as an educator overseas. Therefore, I was quite curious about the contents of this dataset and the relationships between the columns that could possibly be discovered.

This time, Iā€™ll be using Tableau to work with our already cleaned dataset and will provide the answer to the questions that Iā€™ve been assigned to uncover. Stay tuned!

Data Analysis

Now weā€™re going to look at our data and begin our analysis of it by answering the first of the questions that we were assigned:

What schools are struggling the most?

To answer this question weā€™ll need to look at the ā€œ% Graduatedā€ column and organize our rows by the ā€œSchool Nameā€. Since we know that only the high schools will have non-null values in this column, we then filter out the null values and sort the values in ascending order. Weā€™ll just include the bottom 10 high schools in our analysis. These schools will be highlighted by the blue rectangle in the image below.

Looking at our selection of schools, since the ā€œ% Graduatedā€ column denotes the graduation rate percentage for a given school, we can probably conclude that some of these schools are worth taking a closer look at the values. Upon looking up these schools on Google, the vast majority of these schools are alternative charter schools or low-income schools and in my opinion, shouldnā€™t be viewed with the same lenses as traditional public high schools. To account for these possible misleading red flags I decided to filter the above output by the ā€œTOTAL Enrollment ā€˜ā€™ column and set it so that ā€œTOTAL Enrollment ā€˜ā€™ was equal or greater than 200 students. This was the result as displayed below:

While itā€™s not perfect, I think it portrays a somewhat more realistic picture than our first plot display. But in a real job setting, I think that it would be wise to communicate closely with the superintendent, the officials in the DoE as well as individual school administration staff, so that we actually ensure that weā€™re choosing the right schools when determining which schools are ā€œstrugglingā€ the most.


What are the top math schools in the state?

To answer this question, I have been tasked by a superintendent to focus on 4th Grade level math scores as they felt that particular grade level is a pretty good indicator on whether students would be able to keep on having success in higher grade levels.

In addition, they also want to know which districts are above the desired threshold of 50 percent and would like to collect the names of these districts, so that these teachers can be invited to lead workshops to train the rest of the teachers across the state. So we will create a bar graph and use the aggregated average of the ā€œ% MCAS 4th Math Grade Pā€ and group all of our rows by ā€œDistrict Nameā€ to find out what the highest performing districts are.

According to our findings, these 4 districts met our above 50 percent threshold:


#### How does class size affect college admission?

For our final question, the superintendent has tasked me to see if we can find a relationship between class sizes and college. If we can successfully establish a connection between these two variables, the DoE will consider investing in building more schools with the intent of lowering class sizes and increasing college enrollment. Now letā€™s take the ā€œ% Attending Collegeā€ and ā€œAverage Class Sizeā€ columns and make ourselves a nice scatter plot! As a bonus, weā€™ll also include the average of the ā€œ% Economically Disadvantagedā€ column as another way to distinguish the plot points.

Analyzing this plot, I have to conclude that there isnā€™t really a relationship between smaller class sizes and college attendance. But seeing all the light circles at the top of the graph shows that schools that have a lower percentage of economically disadvantaged students have a definitively higher chance of going on to college.

So at this point, I would inform the superintendent that we should focus on building better schools or improve existing schools for economically disadvantaged students rather than building more schools.

The full dashboard of our data visualizations

Recap/Conclusion

Once again to recap what we were instructed to do:

I feel that I did answer these questions to the best of my ability in a limited scope of time and we can all come away with a much better understanding of the overall picture for the public school system in Massachusetts. But I feel that this is only the tip of the iceberg in insights that we can uncover from just these 3 questions. Given more time and some deeper data-diving, Iā€™m confident that we can find even clearer insights as well. But with what I was able to do, I have a stronger sense of conviction that there is absolutely a correlation between schools that have a large amount of economically disadvantaged students and academic performance and that is my own main personal takeaway from this report.

Thank you so much for your time and reading all of this! This was an absolute pleasure to do and if you have any questions, please comment below or please contact me at lance.inimgba@gmail.com or on LinkedIn!

Iā€™m actively looking for new opportunities in the data science field, so please donā€™t hesitate to contact me if you know of anything out there!