
Advancing Data-driven and Hybrid Modeling for Environmental Biotechnology: Principles and Practices

Cheng, Zhang
DOI
https://doi.org/10.34944/c1fr-r881
Abstract
Environmental biotechnologies are of paramount importance for the removal of contaminants and the recovery of resources in engineered processes and environmental management. However, the bioprocesses involved are difficult to operate because the variability and complexity of microbial populations introduce operational uncertainties. Mathematical models, including mechanistic and data-driven approaches, are indispensable tools for optimizing these processes. While mechanistic models offer insights into microbial kinetics, they are constrained by challenges associated with parameter calibration and limited applicability to novel conditions. This study introduces a data-driven model, capable of handling diverse datasets, for identifying phytopathogenic fungal conidia in agricultural settings. The integration of Raman spectroscopy with machine learning enabled rapid classification of the data. Among the evaluated models, XGBoost demonstrated superior performance, achieving 95% accuracy and addressing key limitations of conventional dimensionality-reduction methods for fungal pathogen identification, such as loss of significant features and compromised interpretability. However, data-driven models face challenges in extrapolation and often lack interpretability. To address these limitations, hybrid models have been developed to combine the strengths of both approaches and deliver robust and interpretable predictions. Hybrid modeling strategies were devised to illustrate microbial dynamics in bioelectrochemical systems (BES) and activated sludge (AS). For BES, Bayesian networks combined with a mechanistic framework achieved a 72% Bray-Curtis similarity for microbial community predictions and an RMSE of 0.8 for current production. For AS, a sliding-window transformation was applied to expand the dataset.
The core populations were classified into guilds using Bayesian networks, and the ANN validators demonstrated an improvement in community structure predictions following the incorporation of kinetic parameter inputs (Bray-Curtis similarity: 0.70 vs. 0.66). The modeling strategy was then adapted for real-time BES performance prediction by integrating mechanistic and artificial neural network models, and achieved the following: internal resistance (RMSE=3.9%), organic removal (R²=0.90), and current production (R²=0.94). A guideline was finally proposed to standardize the modeling processes, promoting applications of robust hybrid strategies in environmental biotechnologies.
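For readers unfamiliar with the metrics and the data-expansion step named in the abstract, the following is a minimal sketch of how Bray-Curtis similarity, RMSE, and a sliding-window transformation are commonly computed. It is not taken from the dissertation: the function names, the stride of 1, and the convention that two empty communities count as identical are all assumptions made for illustration, and the author's actual window size and preprocessing are not stated in the abstract.

```python
import numpy as np


def bray_curtis_similarity(p, q):
    """Return 1 - Bray-Curtis dissimilarity for two non-negative abundance vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    total = p.sum() + q.sum()
    if total == 0:
        # Assumed convention: two empty communities are treated as identical.
        return 1.0
    return 1.0 - np.abs(p - q).sum() / total


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def sliding_windows(series, width):
    """Stack overlapping windows of `width` consecutive samples (stride 1, assumed).

    A series of n samples yields n - width + 1 windows, which is one common
    way to enlarge a short time series into more training examples.
    """
    series = np.asarray(series)
    if width < 1 or width > len(series):
        raise ValueError("width must be between 1 and len(series)")
    return np.stack([series[i:i + width] for i in range(len(series) - width + 1)])
```

For example, two communities with relative abundances [0.5, 0.3, 0.2] and [0.4, 0.4, 0.2] differ by 0.2 in total absolute abundance out of a combined total of 2.0, giving a similarity of 0.9.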
Description
Accompanied by 6 .xlsx files: 1) Table S3.xlsx 2) Table S4.xlsx 3) Table S5.xlsx 4) Table S6.xlsx 5) Table S8.xlsx 6) Table S9.xlsx.
ADA compliance
For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu