26 January 2022: Prevalence Rates Versus Raw Numbers


Today’s blog post covers a fundamental mistake in the visual representation of data, one unfortunately made by my local Santa Monica Malibu Unified School District. In an attempt to provide more transparency about case rates in their schools (note: the district performs weekly PCR surveillance testing, which has its own pitfalls but is outside the scope of this discussion), the district created a helpful dashboard. It can be accessed at https://www.smmusd.org/Dashboard.

The primary piece of visual data presented is a bar graph of case numbers stratified by school and by week. This is a screenshot from this morning (data current as of 1/21/2022).

Looking at these data, one would reasonably assume that case rates are decreasing rather dramatically across the district with each subsequent week. This appears to hold for nearly every school. Further, Samohi appears to be the most affected.

Unfortunately, the school district has fallen into an Epidemiology 101 pitfall: looking at unadjusted (raw) case counts rather than accounting for the size of the population they come from. Samohi has a much larger student population than, say, the middle schools or the elementary schools in the area, so it will tend to report more cases even if its students are no more likely to test positive. A more accurate way to look at these data is to divide each school's case count by its enrollment, thereby arriving at a prevalence rate.
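To make the adjustment concrete, here is a minimal sketch in Python. The school names and numbers are hypothetical and chosen only for illustration; they are not the SMMUSD figures, and the function name `prevalence_pct` is my own.

```python
def prevalence_pct(cases: int, enrollment: int) -> float:
    """Return cases per 100 enrolled students (a percentage)."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return 100.0 * cases / enrollment


# Hypothetical (cases, enrollment) pairs, NOT the district's actual data.
schools = {
    "Large high school": (60, 3000),
    "Small middle school": (30, 600),
}

# Adjust each raw count by the size of the school it came from.
rates = {name: prevalence_pct(c, n) for name, (c, n) in schools.items()}

for name, (cases, enrollment) in schools.items():
    print(f"{name}: {cases} cases / {enrollment} students = {rates[name]:.1f}%")
```

In this made-up example the large school reports twice as many cases (60 versus 30), yet its prevalence is 2.0% compared with 5.0% at the small school. A bar chart of raw counts would point at the wrong school.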

[Note: In the graphic below, I have removed the District/Itinerant category because it significantly skews the graph axis, with prevalence rates of 41.7%, 16.7% and 25.0% among its 12 students in the respective weeks studied.]
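A quick sketch of why such a small group distorts the axis. The case counts of 5, 2 and 3 below are my inference from the percentages (they are the counts that reproduce 41.7, 16.7 and 25.0 with 12 students); they are not reported figures.

```python
enrollment = 12  # District/Itinerant students, from the note above

# Inferred weekly case counts (an assumption, back-calculated from the rates).
inferred_cases = [5, 2, 3]

weekly_pct = [round(100.0 * c / enrollment, 1) for c in inferred_cases]
print(weekly_pct)

# With only 12 students, a single additional case moves the rate this much:
one_case_swing = round(100.0 * 1 / enrollment, 1)
print(one_case_swing)
```

With a denominator this small, each individual case shifts the rate by roughly 8 percentage points, so the category swings far more than any full-size school would.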

This graph looks quite different indeed, with John Adams Middle School far outpacing its counterparts, followed by McKinley Elementary School. Samohi still has high rates, but they are in line with the general population of schools. SMASH consistently has the lowest rates. Interestingly, Malibu Middle School went from a prevalence rate of 7.4% for the week of January 1-7 to zero the following week and then 0.4% in the most recent week, which suggests either extraordinarily effective case identification, contact tracing and quarantine/isolation, or a problem with the testing. Either could be true.

This error is all too common, and pointing it out is not meant as a criticism of our local school district in any way, as they are trying and are making these data available. You will see similar errors in very reputable news outlets (CNN, MSNBC, NPR) as well as in less reputable ones. For those interested in learning more, I highly recommend Edward Tufte’s book “The Visual Display of Quantitative Information.” Link: https://www.edwardtufte.com/tufte/books_vdqi

This Post Has One Comment

  1. Nancy

    I have been hating on the SMMUSD district data presentation since this began. I commented on your Twitter post to say thank you for this. I made my own spreadsheet with schools I am connected with work and family and percent positive based on enrollment with a further adjustment because enrollment was low in the first two weeks back. Malibu Middle looks better because they did a test to return because of an outbreak before winter break. You may see some jumping at other sites that are trying out rapid antigen tests to return for unvaccinated exposed students after 5 days. There is also the issue of not getting back weekly PCR tests in a timely manner where the 5 days of quarantine has passed before the test results return so students have been exposed to the virus unknowingly for longer and are not subject to quarantine.
