COVID-19 Data Research

Over the past year, I've worked on a few coding projects. Before starting, I took courses on Python for Data Science and Machine Learning and Python for Web Development. I then used what I learned to compile a comprehensive COVID-19 data set that combines COVID-19 statistics and demographics at the county level. Using the data, I performed analysis and published articles detailing key takeaways. Finally, I used the data set as the backend for a Python-based web application that sheds light on the current state of COVID-19 across the 3,000+ counties in the US.

County-Level COVID-19 Data Set and Analytical Trends

In August 2020, I set out to see whether an area's demographics (minority composition, education, population density, etc.) were related to its COVID-19 infection rates. To do so, I first needed county-level data. Using Python, I gathered data from the following sources:


  1. COVID-19 Cases and Deaths, Population: USAFacts

  2. Mask/Social-Distancing Policies: AARP

  3. Race Demographics: US Census Bureau

  4. Unemployment, Income, and Education: USDA

  5. Calculated Population Density: USAFacts and ArcGIS

I then combined the data to produce a data set covering 3,000+ counties, with the information above for each. I also added derived columns such as infection rates and mortality rates, calculated from the case, death, and population tallies.
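
A combining step like this could be sketched as follows. This is a minimal illustration, not the code used for the project: the county FIPS code as the join key, the field names, the sample values, and the rate definitions (cases per resident, deaths per case) are all assumptions made for the example.

```python
# Hypothetical per-source records keyed by county FIPS code.
# All identifiers and numbers below are placeholders, not real data.
covid = {
    "00001": {"cases": 100, "deaths": 2, "population": 1000},
    "00002": {"cases": 50, "deaths": 1, "population": 2000},
    "00003": {"cases": 10, "deaths": 0, "population": 500},
}
census = {
    "00001": {"pct_minority": 40.0},
    "00002": {"pct_minority": 15.0},
}
usda = {
    "00001": {"unemployment_rate": 6.5},
    "00002": {"unemployment_rate": 4.0},
    "00003": {"unemployment_rate": 5.0},
}


def merge_sources(*sources):
    """Join dicts keyed by FIPS, keeping only counties present in every source."""
    shared = set(sources[0]).intersection(*sources[1:])
    merged = []
    for fips in sorted(shared):
        row = {"fips": fips}
        for source in sources:
            row.update(source[fips])
        merged.append(row)
    return merged


def add_rates(row):
    """Add derived columns (assumed definitions, see the note above)."""
    row["infection_rate"] = row["cases"] / row["population"]
    # Guard against counties with zero recorded cases.
    row["mortality_rate"] = row["deaths"] / row["cases"] if row["cases"] else 0.0
    return row


counties = [add_rates(row) for row in merge_sources(covid, census, usda)]
```

Keeping only the counties found in every source (an inner join) avoids rows with missing attributes, at the cost of dropping counties that any one source omits, such as the third placeholder county here.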

Then, I sorted counties by their infection rates and began exploring the characteristics of the counties with the highest and lowest infections.
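
One way to make that comparison concrete is to rank counties by infection rate and average a characteristic over the top and bottom groups. The sketch below assumes hypothetical field names (`infection_rate`, `pct_bachelors`) and placeholder values; it illustrates the idea rather than reproducing the original analysis.

```python
from statistics import mean

# Placeholder rows; names and values are invented for illustration.
sample_counties = [
    {"name": "County A", "infection_rate": 0.08, "pct_bachelors": 10.0},
    {"name": "County B", "infection_rate": 0.06, "pct_bachelors": 20.0},
    {"name": "County C", "infection_rate": 0.02, "pct_bachelors": 30.0},
    {"name": "County D", "infection_rate": 0.01, "pct_bachelors": 40.0},
]


def compare_extremes(rows, rate_key, attribute, n):
    """Average `attribute` over the n highest and n lowest rows by `rate_key`."""
    ranked = sorted(rows, key=lambda row: row[rate_key], reverse=True)
    highest, lowest = ranked[:n], ranked[-n:]
    return {
        "highest": mean(row[attribute] for row in highest),
        "lowest": mean(row[attribute] for row in lowest),
    }


summary = compare_extremes(sample_counties, "infection_rate", "pct_bachelors", n=2)
```

A gap between the two averages only hints at a pattern; it does not by itself show that the characteristic drives infection rates.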

I summarized the findings of that analysis in an article published in Towards Data Science on Medium. Click the image to the right to check it out.

Live County-Level COVID-19 Web Application

My next goal was to make the data set available to everyone by providing quick access to information on a county's infection risk, mandates, and vaccination status. I began by adding vaccination data from the CDC. I then set up an AWS instance to host my data, where it would be refreshed every hour with the latest data from each source.

To build the web application, I learned Flask, a web framework that pairs a Python backend with an HTML frontend. I created multiple pages, each with its own purpose:

  1. An individual county's situation (Data)

  2. The entire country's situation visually represented (Stats)

  3. An opportunity for correlation exploration between two county-level attributes (Explore)
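
As an illustration of what an Explore-style page might compute, the sketch below calculates the Pearson correlation coefficient between two county-level attributes. The choice of Pearson's coefficient and the function name `pearson` are assumptions for this example; the app itself may measure correlation differently.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / (spread_x * spread_y)


# Placeholder values for two attributes across three counties.
density = [10.0, 20.0, 30.0]
infection_rate = [0.02, 0.04, 0.06]
r = pearson(density, infection_rate)
```

A value near 1 or -1 indicates a strong linear relationship between the two attributes, while a value near 0 indicates little linear relationship.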

I connected my data set with my framework and deployed the application on Heroku. Check out the application here or use the embedded version below.

Article on COVID-19 Web Application

In addition to the app itself, I published an article describing the process behind making the app and its use cases. It includes an in-depth walkthrough of the app's three pages with real outputs from the app. Check it out by clicking the image to the right.