Our lab has developed many data analysis workflows that adapt and integrate sophisticated statistical methods to evaluate the complex molecular datasets we obtain with MS technologies. We are particularly interested in using logistic regression with regularization techniques such as the least absolute shrinkage and selection operator (Lasso) to develop statistical classifiers and identify predictive molecules within our data.
The Lasso offers two key advantages: it selects the molecular features that are indicative of disease state, and it builds sparse classification models that are predictive of cancer and can thus be used for tissue diagnosis.
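To illustrate how an L1 penalty yields a sparse classifier, the sketch below fits a Lasso-penalized logistic regression by proximal gradient descent on synthetic data. It is a minimal illustration under stated assumptions, not our production pipeline: the toy data, the function names (`fit_lasso_logistic`, `make_toy_data`), and the penalty strength `lam` are all hypothetical, and in practice an established package would normally be used and the penalty chosen by cross-validation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink toward zero, set small values to exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam=0.1, step=0.2, n_iter=400):
    """Minimise mean logistic loss + lam * sum(|w_j|) by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_data(n_samples=80, n_features=8, shift=1.5, seed=0):
    """Hypothetical data: features 0 and 1 differ between classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        centre = shift if label == 1 else -shift
        X.append([rng.gauss(centre if j < 2 else 0.0, 1.0) for j in range(n_features)])
        y.append(label)
    return X, y


X, y = make_toy_data()
weights, intercept = fit_lasso_logistic(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
predictions = [
    1 if sigmoid(intercept + sum(wj * xj for wj, xj in zip(weights, row))) >= 0.5 else 0
    for row in X
]
accuracy = sum(int(p == t) for p, t in zip(predictions, y)) / len(y)
print("selected features:", selected)
print("training accuracy:", accuracy)
```

Features whose weights are driven exactly to zero drop out of the model, so the surviving nonzero weights serve as the candidate predictive molecules. Increasing `lam` gives a sparser model at the cost of fit.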
We have also adapted the significance analysis of microarrays (SAM) statistical technique to identify significant molecular markers in MS imaging datasets. Using SAM, we can identify specific markers that are significantly altered in biological tissues. These approaches have been broadly employed by the scientific community to evaluate MS imaging data, and we continue to explore and develop new methods to improve rigor and robustness in our data analysis.
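The core idea of SAM can be sketched as follows: each feature gets a relative-difference score d = (mean difference) / (standard error + s0), and the sorted scores are compared with the scores expected under random permutation of the class labels. This is a simplified sketch, not our actual workflow: the data are synthetic, the names (`diff_and_se`, `sam_call`, `delta`) are hypothetical, and the fudge factor `s0` is taken as the median standard error as a simplifying assumption rather than tuned as in the full SAM procedure.

```python
import random


def diff_and_se(values, labels):
    """Mean difference between the two groups and its pooled standard error."""
    a = [v for v, g in zip(values, labels) if g == 1]
    b = [v for v, g in zip(values, labels) if g == 0]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss = sum((v - mean_a) ** 2 for v in a) + sum((v - mean_b) ** 2 for v in b)
    se = ((1 / len(a) + 1 / len(b)) * ss / (len(a) + len(b) - 2)) ** 0.5
    return mean_a - mean_b, se


def d_statistics(features, labels, s0):
    """SAM relative difference d = diff / (se + s0) for every feature."""
    return [diff / (se + s0) for diff, se in (diff_and_se(f, labels) for f in features)]


def sam_call(features, labels, delta=1.0, n_perm=100, seed=0):
    rng = random.Random(seed)
    n_feat = len(features)
    ses = sorted(se for _, se in (diff_and_se(f, labels) for f in features))
    s0 = ses[n_feat // 2]  # assumption: median standard error as the fudge factor
    observed = d_statistics(features, labels, s0)
    # Expected order statistics: average the sorted scores over label permutations.
    expected = [0.0] * n_feat
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        for k, d in enumerate(sorted(d_statistics(features, shuffled, s0))):
            expected[k] += d / n_perm
    # Call features whose observed score departs from expectation by more than delta.
    order = sorted(range(n_feat), key=lambda j: observed[j])
    up = [k for k in range(n_feat)
          if expected[k] >= 0 and observed[order[k]] - expected[k] > delta]
    down = [k for k in range(n_feat)
            if expected[k] < 0 and expected[k] - observed[order[k]] > delta]
    called = set()
    if up:
        called.update(order[min(up):])
    if down:
        called.update(order[:max(down) + 1])
    return observed, called


# Hypothetical data: 30 features x 20 samples; feature 0 is raised and
# feature 1 is lowered in group 1, all other features are pure noise.
rng = random.Random(1)
labels = [0] * 10 + [1] * 10
features = [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(30)]
features[0] = [v + 4.0 * g for v, g in zip(features[0], labels)]
features[1] = [v - 4.0 * g for v, g in zip(features[1], labels)]

observed, called = sam_call(features, labels)
print("features called significant:", sorted(called))
```

A larger `delta` calls fewer features. The full SAM procedure also estimates a false discovery rate for each `delta` from the same permutations, which this sketch omits.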