From Database to Data Interpretation – Behind the Scenes

Every month, ALLARM’s Stream Team volunteers collect baseline water quality data and upload it to the Chesapeake Data Explorer, where the data log is kept. This data, while available at all times and a very helpful resource, downloads as a massive Excel sheet that can be daunting to look at and interpret. To make the data more accessible when data interpretation rolls around in the monitoring cycle, we at ALLARM developed a streamlined process that sets our volunteers up for success when pulling the narratives out of their data. Several steps happen in the “background” before volunteers are provided with their own data packets, and this process can be split into four steps: raw data, summarized data, graphs, and data packets.

The first step in making the data interpretation process more accessible is handling the raw data. After our volunteers collect their data and upload it to the Chesapeake Data Explorer, ALLARM goes in and looks for any data entry errors, as the database is not set up to catch these automatically. For example, if a temperature reading of 170 degrees Celsius is in the spreadsheet, ALLARM can check the corresponding volunteer-submitted datasheet, verify the number, and correct it in the system. In doing so, ALLARM helps to avoid potential errors in the data that our volunteers will be interpreting. Once the raw data has been checked, we move on to summarizing it. We calculate summary statistics (maximum, minimum, and mean) from the checked data to give volunteers a starting point for interpreting their data. After the data is summarized, we visualize it in graphs. Graphs make the data more accessible because their visual format shows trends and outliers more readily than a set of numbers does. Data visualization in graphs also goes hand in hand with maps created using platforms such as ArcGIS Pro. These maps add context that is helpful for data interpretation, such as land use or geology.
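For readers curious what the raw data check and the summary statistics might look like in code, here is a minimal Python sketch. It is an illustration only, not ALLARM's actual workflow: the function names, the plausible temperature range, and the sample readings are all made up for the example. Flagged readings are set aside so they can be checked against the volunteer-submitted datasheet, and the summary is calculated from the remaining values.

```python
from statistics import mean

# Assumed plausible range for stream water temperature, in degrees Celsius.
# These thresholds are illustrative, not ALLARM's actual criteria.
TEMP_RANGE_C = (0.0, 40.0)


def flag_suspect(readings, low, high):
    """Return the (date, value) readings that fall outside [low, high]."""
    return [(date, value) for date, value in readings if not low <= value <= high]


def summarize(values):
    """Return the maximum, minimum, and mean of a list of numbers."""
    return {"max": max(values), "min": min(values), "mean": mean(values)}


# Made-up monthly readings: (sampling date, temperature in degrees Celsius).
readings = [
    ("2023-01-15", 3.5),
    ("2023-02-12", 170.0),  # likely a data entry error
    ("2023-03-19", 8.0),
    ("2023-04-16", 12.5),
]

low, high = TEMP_RANGE_C

# Readings to verify against the paper datasheet before they are corrected.
suspect = flag_suspect(readings, low, high)

# Summarize only the values that passed the range check.
checked = [value for _, value in readings if low <= value <= high]
summary = summarize(checked)

print("Needs verification:", suspect)
print("Summary:", summary)
```

In practice a flagged reading would be corrected rather than dropped, so the summary would be recalculated once the verified value is back in the dataset.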

All the previous steps lead to our final product, the Stream Team data packets, which are sorted by county and then by monitoring site. Each site our volunteers monitor (with a minimum of one year of data collected) has a packet that includes an individualized map and site information, bar graphs, summary statistics, and a consolidated raw data table. Each county group also receives supplemental maps that show each site in relation to those around it. After the volunteer data packets are sent out, volunteers can attend data exploration meetings, office hours, and open meetings to discuss and present their data findings!