Enterprise AI Analysis: A Model for Teaching Machine Learning, Deep Learning, and Research Computing to Domain Scientists on HPC Resources

A week-long workshop successfully trained biologists in HPC, ML, and DL, enhancing their skills and understanding of advanced computational techniques for scientific discovery. The curriculum's application-driven approach, leveraging accessible HPC technologies like Jupyter Notebooks and containers, proved highly effective.

This paper presents outcomes and insights from a one-week workshop designed to teach biologists essential skills in high-performance computing (HPC), machine learning (ML), and deep learning (DL). Participants with little or no prior HPC experience learned to navigate file systems through a command-line interface, launch jobs with SLURM, and apply ML and DL techniques to real-world biological datasets. Hands-on activities used accessible technologies such as Jupyter Notebooks, graphical desktop interfaces (DCV), and software containers, all deployed on HPC systems with minimal user setup. We propose this workshop model as an adaptable framework for training domain scientists to use HPC resources effectively to advance scientific discovery, and we present survey data demonstrating its effectiveness in improving participant skills.

Key Takeaways

Our comprehensive workshop delivered quantifiable improvements in participant skills and understanding, demonstrating the immediate benefits of focused training.

  • Increased ML/DL understanding
  • Improved DL skills post-workshop
  • Improved HPC system usage

Deep Analysis & Enterprise Applications

The workshop effectively taught core HPC concepts through hands-on ML/DL exercises. Participants learned to navigate file systems, launch jobs with SLURM, and use containerized environments on HPC systems like Frontera and Vista. This practical approach demystified HPC for domain scientists.

Enterprise Process Flow

Connect to HPC
Navigate File Systems
Launch Jobs (SLURM/idev)
Manage Software (Containers)
Execute ML/DL Workloads
Survey metric: participants reporting 'good' or 'great' improvement in HPC system usage.
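
The job-launch and container steps in the flow above can be sketched as a small Python helper that writes out a SLURM batch script. This is a minimal illustration, not the workshop's actual script: the function name make_slurm_script, the partition name gpu-partition, the image name tf.sif, and the script name train.py are all hypothetical placeholders. The #SBATCH options used (-J job name, -o output file, -p partition, -N nodes, -n tasks, -t wall time) are standard SLURM options, and --nv is Apptainer's flag for exposing NVIDIA GPUs inside the container.

```python
from textwrap import dedent


def make_slurm_script(job_name, partition, nodes, tasks, walltime, image, command):
    """Return SLURM batch-script text that runs `command` inside a container image.

    -J sets the job name, -o the output file (%j expands to the job ID),
    -p the partition (queue), -N the node count, -n the task count,
    and -t the wall-time limit in HH:MM:SS.
    """
    return dedent(f"""\
        #!/bin/bash
        #SBATCH -J {job_name}
        #SBATCH -o {job_name}.o%j
        #SBATCH -p {partition}
        #SBATCH -N {nodes}
        #SBATCH -n {tasks}
        #SBATCH -t {walltime}

        # Load the site's container module here if one is required.
        apptainer exec --nv {image} {command}
        """)


# Hypothetical values for illustration; partition and file names are placeholders.
script = make_slurm_script(
    job_name="coral-cnn",
    partition="gpu-partition",
    nodes=1,
    tasks=1,
    walltime="01:00:00",
    image="tf.sif",
    command="python train.py",
)
print(script)
```

The printed text could be saved to a file and submitted with sbatch. Generating the script from one function keeps the resource request and the container command together, which makes it easier to rerun the same job with different settings.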

The workshop curriculum was designed to equip biologists to apply modern ML/DL techniques to real-world research questions. Hands-on activities covered supervised classification, linear regression, k-nearest neighbors, logistic regression, and decision trees, using pre-configured Jupyter environments and real biological datasets.
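
To make one of the listed techniques concrete, here is a minimal sketch of k-nearest neighbors classification written with only the Python standard library. It is not the workshop's notebook code: the function name knn_predict and the tiny two-feature dataset are made up for illustration, and in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two numeric features per sample, two made-up classes.
train_points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (1.0, 1.0)))  # prints A
print(knn_predict(train_points, train_labels, (3.0, 3.0)))  # prints B
```

The method stores the training data and defers all work to prediction time, so it needs no training step, but each prediction scans the whole training set.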

Traditional Approach vs. Workshop Approach

Environment Setup
  Traditional approach:
  • Complex dependency management
  • Inconsistent reproducibility
  • Manual library installation
  Workshop approach:
  • Pre-configured Jupyter environments
  • Containerization for reproducibility
  • Hassle-free experience

Data Handling
  Traditional approach:
  • Limited by local resources
  • Manual data preparation
  Workshop approach:
  • Leverages parallel file systems ($SCRATCH, $HOME)
  • Tools for exploratory data analysis (EDA)

Computational Resources
  Traditional approach:
  • CPU-bound processing
  • Limited scalability
  Workshop approach:
  • Access to GPUs (Frontera, Vista)
  • Scalable job submission (SLURM)
  • Interactive and batch jobs

Survey metric: participants with 'good' or 'great' understanding of ML concepts after the workshop.

The workshop used real-world biological datasets, such as coral images for CNN classification, and explored applications of Large Language Models (LLMs) in biology for hypothesis generation and protein design. This practical relevance fostered strong engagement and highlighted the power of HPC for life sciences research.

Coral Species Classification

Workshop attendees successfully trained a Convolutional Neural Network (CNN) to classify images of three coral species (Acropora cervicornis, Colpophyllia natans, and Montastraea cavernosa). Using advanced HPC resources, participants achieved an overall accuracy rate of 87%. This hands-on experience demonstrated the practical application of DL to real biological research, garnering positive feedback.
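
An overall accuracy figure like the one above is the number of correctly classified images divided by the total number of images evaluated. The sketch below shows that arithmetic for a three-class confusion matrix. The counts are invented purely for illustration and are not the workshop's results; the function name overall_accuracy is likewise hypothetical.

```python
def overall_accuracy(confusion):
    """Fraction of correct predictions: diagonal sum divided by the total count."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts for illustration only; these are NOT the workshop's results.
# Rows are the true species, columns are the predicted species, in this order.
species = ["Acropora cervicornis", "Colpophyllia natans", "Montastraea cavernosa"]
confusion = [
    [8, 1, 1],
    [1, 7, 2],
    [0, 2, 8],
]

# 23 of the 30 made-up images lie on the diagonal, so this prints 0.77.
print(f"overall accuracy: {overall_accuracy(confusion):.2f}")
```

Overall accuracy summarizes all classes in one number, so it can hide a class that is classified poorly; looking at each row of the matrix shows per-species performance.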

87% Accuracy Rate
3 Species Classified
Survey metric: participants interested in learning about Large Language Models (LLMs).

Your AI Implementation Roadmap

A structured approach to integrating machine learning and deep learning into your research workflow.

Initial HPC Setup & CLI Fundamentals

Connect to TACC systems, navigate file systems, and submit basic jobs.

Exploratory Data Analysis & ML Basics

Analyze biological datasets with Python (pandas) and implement supervised ML models.

Deep Learning Model Development

Train ANNs and CNNs using TensorFlow/Keras on real-world biological image data.

Containerization & HPC Integration

Deploy ML/DL models at scale using Docker/Apptainer on HPC clusters.

Advanced Topics & Custom Solutions

Explore LLMs for biological applications and discuss specific research needs.
