Late-Breaking Work, ACM Computer-Human Interaction (CHI) 2025

Bridging Modeling and Domain Expertise Through Visualization: A Case Study on Bread-Making with Bayesian Networks

Omi H. Johnson, Melanie Munch, Kamal Kansou, Cedric Baudit, Anastasia Bezerianos, Nadia Boukhelifa

What makes a decision support tool effective? How can we ensure that users of a support tool trust the model behind it? In this research, I collaborated with researchers at the INRAE and LISN labs within Université Paris-Saclay to develop a visualization platform for a Bayesian Network model. We focused our research around the concept of "explainability": enabling non-technical users of the network to understand how decisions are made and how accurate the model is. Explainability is critical for establishing trust in decision support tools and encouraging their use, especially when they are built on complex structures like Bayesian Networks.

This research was conducted in Spring 2024 as an M1-level internship. I developed a JavaScript-based web visualization platform that uses D3.js visualizations and customization toggles to present a Bayesian Network to domain experts. The research used an EVAGRAIN Bayesian Network hosted within the pyAgrum library. The EVAGRAIN network aims to model the behavior of bread dough as a proxy for French wheat quality and climate resilience. To develop the web platform, I conducted an extensive literature review; hosted multiple design sessions with domain expert proxies; and finally led iterative evaluations of the platform with 5 domain experts and 1 Bayesian Network specialist. The resulting platform was well received by experts and uses familiar, interactive visualizations to convey evidence propagation and dataset information. We conclude that the visualization increased explainability for experts and fostered cross-disciplinary dialogue, with unique considerations for dataset access and complexity.

Decision support tools based on Bayesian Networks (BN) are capable of simply representing complex relationships within data. We use a BN to model bread-dough behavior, reflecting the bread-making process. Such a network must be validated using domain knowledge, but domain experts often lack the necessary statistical background to understand BNs. We propose a visualization to enable domain experts without technical background to explain and critically analyze the network. Our platform uses familiar visualizations and focuses on interactive evidence propagation, view customization, and access to the underlying dataset as key features to facilitate understanding. Design workshops and a preliminary evaluation with domain experts and BN specialists showed its potential for exploring and validating the model. We report on lessons learned in creating a visualization for making complex models accessible to domain experts, and how its design can foster interdisciplinary dialogue between modelers and domain experts.
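To give a feel for what "evidence propagation" means, here is a minimal sketch in plain Python. It is not the platform's code and does not use pyAgrum; the two-node network, the variable names (`flour`, `dough`), and every probability are invented for illustration and are not taken from the EVAGRAIN model. It only shows how observing one variable updates belief about another via Bayes' rule.

```python
# Toy two-node network: flour -> dough.
# All names and numbers are hypothetical, chosen only for illustration.
prior_flour = {"strong": 0.6, "weak": 0.4}
p_dough_given_flour = {
    "strong": {"elastic": 0.8, "sticky": 0.2},
    "weak": {"elastic": 0.3, "sticky": 0.7},
}


def posterior_flour(observed_dough):
    """Return P(flour | dough = observed_dough) by enumerating the joint."""
    joint = {
        flour: prior_flour[flour] * p_dough_given_flour[flour][observed_dough]
        for flour in prior_flour
    }
    total = sum(joint.values())  # P(dough = observed_dough)
    return {flour: p / total for flour, p in joint.items()}


# Before any evidence, belief about flour is just the prior.
print("prior:", prior_flour)

# After observing sticky dough, belief shifts toward weak flour.
print("posterior given sticky dough:", posterior_flour("sticky"))
```

In an interactive view, setting evidence on a node corresponds to calling something like `posterior_flour` and redrawing the updated distributions; a real network with many nodes would rely on a dedicated inference engine rather than enumeration.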

View the official landing page at the CHI website.

Since this conference is upcoming, the manuscript is not yet available.


In press at Elementa: Science of the Anthropocene

Suitability of foraging habitat for Eubalaena glacialis under future climate scenarios in the Northwest Atlantic

Omi H. Johnson, Stéphane Plourde, Caroline Lehoux, Camille H. Ross, Benjamin Tupper, Christopher D. Orphanides, Harvey J. Walsh, Nicholas R. Record

Machine learning can be a powerful tool to anticipate and counteract rapid phenomena like climate change. In the case of the critically endangered North Atlantic right whale (Eubalaena glacialis), recent climate shifts in the Gulf Stream have disrupted historical foraging patterns and challenged conservation tactics. In this research, conducted within the Bigelow Laboratory for Ocean Sciences, we used boosted regression trees and modeled environmental covariates to predict E. glacialis foraging suitability for 2055 and 2075. This research is unique in that it covers the entire, cross-boundary foraging range of E. glacialis at a fine 1/12° resolution. The research also uses a bioenergetics-threshold approach to deliver more insightful predictions based on multiple prey species. This research can be used to inform survey effort and help protect E. glacialis from anthropogenic threats such as ship strikes and entanglement in fishing gear.

This research began as an internship project in Spring 2022 and continued part-time through Spring 2025. To reach the finished paper, I built 100+ experimental models in R using a variety of modeling algorithms such as GAMs, Neural Networks, Random Forests, and GLMs to determine an optimal algorithm. I also developed a comprehensive, modular modeling pipeline to streamline version creation and plot generation. The final forecasts are based on two independent, 100-fold boosted regression tree models representing the two most abundant prey species in the region, Calanus finmarchicus and C. hyperboreus. The completed models displayed high performance, with AUCs > 0.9. The research manuscript, now in press at Elementa: Science of the Anthropocene, details results such as model performance, caveats like overfitting and spatial autocorrelation, final forecast results, and implications for right whale conservation.
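For readers unfamiliar with the AUC metric mentioned above, the following is a minimal sketch of how it can be computed for a presence/absence model. The actual pipeline was written in R; this Python version, the function name `auc`, and the toy labels and scores are all illustrative assumptions, not the project's code or data. AUC is computed here as the probability that a randomly chosen presence receives a higher score than a randomly chosen absence, with ties counted as half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for presence, 0 for absence.
    scores: predicted suitability, higher means more suitable.
    """
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")
    # Count presence/absence pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > a else 0.5 if p == a else 0.0
        for p in presences
        for a in absences
    )
    return wins / (len(presences) * len(absences))


# Hypothetical toy data, not results from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc(toy_labels, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in the number of samples, so large datasets would normally use a rank-based implementation from a statistics library.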

This work has been presented at the 2022 Ecological Forecasting Initiative Conference; the 2024 Northeastern University RISE Conference; and the 2024 North Atlantic Right Whale Consortium.

The critically endangered North Atlantic right whale (Eubalaena glacialis) faces significant anthropogenic mortality. Recent climatic shifts in traditional habitats have caused abrupt changes in right whale distributions, challenging traditional conservation strategies. Tools that can help anticipate new areas where E. glacialis might forage could inform proactive management. In this study, we trained boosted regression tree algorithms with fine-resolution modeled environmental covariates to build prey copepod (Calanus) species-specific models of historical and future distributions of E. glacialis foraging habitat on the northwest Atlantic shelf, from the Mid-Atlantic Bight to the Labrador Shelf. We determined foraging suitability using E. glacialis foraging thresholds for Calanus spp. adjusted by a bathymetry-dependent bioenergetic correction factor based on known foraging behavior constraints. Models were then projected to 2046-2065 and 2066-2085 modeled climatologies for representative concentration pathway scenarios RCP 4.5 and RCP 8.5 with the goal of identifying potential shifts in foraging habitat. The models had generally high performance (area under the receiver operating characteristic curve > 0.9) and indicated ocean bottom conditions and bathymetry as important covariates. Historical (1990-2015) projections aligned with known areas of high foraging habitat suitability as well as potential suitable areas on the Labrador Shelf. Future projections suggested that the suitability of potential foraging habitat would decrease in parts of the Gulf of Maine and southwestern Gulf of Saint Lawrence, while potential habitat would be maintained or improved on the western Scotian Shelf, in the Bay of Fundy, on the Newfoundland and Labrador shelves, and at some locations along the continental shelf breaks. Overall, suitable habitat is projected to decline. Directing some survey efforts towards emerging potential foraging habitats can enable conservation management to anticipate the type of distribution shifts that have led to high mortality in the past.


Related Projects

Tidymodels Tutorial Website

A website detailing the capabilities of tidymodels, a set of R modeling packages designed to increase modularity and align modeling paradigms with "tidy" data principles. These packages were instrumental in the development of my modeling pipeline. The walkthrough is built using R Markdown and is specialized for ecological forecasting, including thorough coding examples and links to outside resources.

Saving the North Atlantic Right Whale

What does it mean to save the North Atlantic right whale? How does reducing ship strikes and fishing gear entanglements complicate the work of other industries in the Gulf of Maine? In this ArcGIS StoryMap, I use peer-reviewed sources and current news coverage to construct a narrative about effective strategies for right whale conservation, recent protests from the American lobster industry, and technological innovations driving the story forward. Forecasts like my published research are just one piece of the puzzle.

Brickman Data Walkthrough

My research relies on modeled environmental covariates sourced from Brickman et al. 2021, which features high-resolution (1/12°) projections of key covariates across the North Atlantic. This dataset has plenty of potential beyond my own work for use in other ecological forecasting scenarios. This walkthrough presents a set of prepackaged functions and clear explanations for how anyone with a presence/absence dataset can build their own high-resolution forecasts using the Brickman covariates.