A Model for Teaching Machine Learning, Deep Learning, and Research Computing to Domain Scientists on HPC Resources
A week-long workshop trained biologists in HPC, ML, and DL, and survey data showed improved participant skills in applying these techniques to scientific discovery. The curriculum was application-driven and relied on accessible HPC technologies such as Jupyter Notebooks and containers.
This paper presents outcomes and insights from a one-week workshop designed to teach biologists essential skills in high-performance computing (HPC), machine learning (ML), and deep learning (DL). Participants with little or no prior HPC experience learned how to navigate file systems via a command-line interface, launch jobs with SLURM, and apply ML and DL techniques to real-world biological datasets. Hands-on activities were delivered with accessible technologies such as Jupyter Notebooks, graphical desktop interfaces (DCV), and software containers, all deployed on HPC systems with minimal user setup required. We propose this workshop model as an adaptable framework for training domain scientists to use HPC resources effectively to advance scientific discovery, and we present survey data demonstrating its effectiveness in improving participant skills.
Key Takeaways
Our comprehensive workshop delivered quantifiable improvements in participant skills and understanding, demonstrating the immediate benefits of focused training.
Deep Analysis & Enterprise Applications
The workshop effectively taught core HPC concepts through hands-on ML/DL exercises. Participants learned to navigate file systems, launch jobs with SLURM, and use containerized environments on HPC systems like Frontera and Vista. This practical approach demystified HPC for domain scientists.
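As an illustration of what launching a job with SLURM involves, the sketch below builds the text of a minimal batch script in Python. It is not taken from the workshop materials: the job name, partition name, container image (`coral.sif`), and training script (`train.py`) are all hypothetical placeholders, and the right values depend on the system and allocation in use.

```python
def render_slurm_script(job_name, partition, nodes, tasks, walltime, command):
    """Return the text of a SLURM batch script that runs a single command."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",        # name shown in the queue
        f"#SBATCH --partition={partition}",      # queue to submit to
        f"#SBATCH --nodes={nodes}",              # number of nodes requested
        f"#SBATCH --ntasks={tasks}",             # total number of tasks
        f"#SBATCH --time={walltime}",            # wall-clock limit, HH:MM:SS
        f"#SBATCH --output={job_name}.%j.out",   # %j expands to the job ID
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# All values below are hypothetical placeholders for illustration only.
script = render_slurm_script(
    job_name="coral-train",
    partition="development",
    nodes=1,
    tasks=1,
    walltime="00:30:00",
    command="apptainer exec coral.sif python3 train.py",
)
print(script)
```

Written to a file such as `job.slurm`, a script like this would typically be submitted with `sbatch job.slurm` and monitored with `squeue -u $USER`.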
The workshop curriculum was designed to equip biologists to apply modern ML/DL techniques to real-world research questions. Hands-on activities covered supervised classification, linear regression, k-nearest neighbors, logistic regression, and decision trees, using pre-configured Jupyter environments and real biological datasets.
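To make one of the listed methods concrete, here is a minimal sketch of k-nearest neighbors classification using only the Python standard library. It is an illustration under stated assumptions, not the workshop's own notebook code: the measurements and the labels `species_a` and `species_b` are invented toy data, and a real analysis would more likely use an established ML library.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k nearest neighbors is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two invented measurements per specimen.
train_points = [
    (1.4, 0.2), (1.5, 0.3), (1.3, 0.2),   # specimens labeled species_a
    (4.7, 1.4), (4.5, 1.5), (4.9, 1.6),   # specimens labeled species_b
]
train_labels = ["species_a"] * 3 + ["species_b"] * 3

print(knn_predict(train_points, train_labels, (1.6, 0.4)))
print(knn_predict(train_points, train_labels, (4.6, 1.5)))
```

The choice of `k` trades off sensitivity to noise (small `k`) against blurring of class boundaries (large `k`); `k=3` here is arbitrary.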
[Table: Traditional Approach vs. Workshop Approach, compared on Environment Setup, Data Handling, and Computational Resources]
The workshop used real-world biological datasets, such as coral images for CNN classification, and explored applications of Large Language Models (LLMs) in biology for hypothesis generation and protein design. This practical relevance fostered strong engagement and highlighted the power of HPC for life sciences research.
Coral Species Classification
Workshop attendees trained a Convolutional Neural Network (CNN) to classify images of three coral species (Acropora cervicornis, Colpophyllia natans, and Montastraea cavernosa). Running on HPC resources, participants' models reached an overall accuracy of 87%. This hands-on exercise showed how DL applies to real biological research and drew positive feedback.
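For readers unfamiliar with the metric, overall accuracy is the fraction of images whose predicted species matches the true species. The sketch below computes it, along with per-class recall, in plain Python. The label lists are invented for illustration and do not reproduce the workshop's data or its reported result.

```python
from collections import Counter


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_recall(true_labels, predicted_labels):
    """For each true class, the fraction of its samples predicted correctly."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical predictions for ten images (not the workshop's data).
A, C, M = "A. cervicornis", "C. natans", "M. cavernosa"
true_labels = [A, A, A, A, C, C, C, M, M, M]
predicted_labels = [A, A, A, C, C, C, C, M, M, A]

print(overall_accuracy(true_labels, predicted_labels))
print(per_class_recall(true_labels, predicted_labels))
```

Per-class recall is worth reporting alongside overall accuracy because a single headline number can hide a species that the model classifies poorly.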
Your AI Implementation Roadmap
A structured approach to integrating machine learning and deep learning into your research workflow.
Initial HPC Setup & CLI Fundamentals
Connect to TACC systems, navigate file systems, and submit basic jobs.
Exploratory Data Analysis & ML Basics
Analyze biological datasets with Python (pandas) and implement supervised ML models.
Deep Learning Model Development
Train ANNs and CNNs using TensorFlow/Keras on real-world biological image data.
Containerization & HPC Integration
Deploy ML/DL models at scale using Docker/Apptainer on HPC clusters.
Advanced Topics & Custom Solutions
Explore LLMs for biological applications and discuss specific research needs.