Scientific Computing Training Series

Cornell University's Center for Advanced Computing together with Weill Cornell Medicine's Scientific Computing Unit, ITS, and Clinical and Translational Science Center are pleased to offer a new Scientific Computing Training Series.

Training sessions are held via Zoom. They are open to all workforce members and students of Cornell University, WCM, WCM-Q, and Cornell Tech. They are free of charge and there is no need to apply. Click the title of each course below to find course information.

Register for one or more topics with this Zoom registration link

All sessions are on Wednesdays
All sessions are 60 minutes, followed by an optional 30 minutes of discussion time
All sessions begin at 9am ET except for the March 12 session at noon ET

Download a Scientific Computing Training Series flyer:

Winter/Spring 2025

Winter/Spring 2025

Feb. 5: Five Ways to Get Started with GPUs

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Kaleb Smith, PhD, NVIDIA

Description: This introduction to GPU acceleration outlines the 5 ways to accelerate computationally intensive code using GPUs. This session is a great starting point for those who would like to begin leveraging the benefits of accelerated computing. We offer a variety of easy methods to get started and also touch on the more advanced methods.

Level: Introductory

Target audience: Any researcher/developer looking to leverage accelerated computing capabilities

Slides: Five Ways to Get Started with GPUs (WCM/WCM-Q users)
Slides: Five Ways to Get Started with GPUs (Cornell users)
Recording: Five Ways to Get Started with GPUs

Feb. 12: Introduction to Python

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Chris Myers, CAC

Description: This lecture will introduce the Python programming language and the Python software ecosystem, with which one can create software pipelines for data analysis, simulation, and scientific computing in a wide variety of application areas. The talk will introduce some key concepts in computer programming, and how those concepts are implemented in Python. Some prior programming experience would be helpful, but this lecture is intended to provide some useful information to those just getting started, those who have done some programming but are new to Python, and those who are looking to learn more about how to leverage the Python language and ecosystem.

Slides: Introduction to Python (WCM/WCM-Q users)
Slides: Introduction to Python (Cornell users)
Recording: Introduction to Python

Feb. 26: Data Transfer for HPC

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Ben Trumbore, CAC

Description: This session provides an introduction to several tools that can transfer data between computers and into cloud storage. Different tools offer varying transfer speeds, security, and ease of use. Some tools also automatically recover from failures or allow syncing data between systems. Tools covered in this session include SCP, rsync and Globus. Examples of using these tools with Red Cloud and the Cayuga HPC cluster will be provided.

Slides: Data Transfer for HPC (WCM/WCM-Q users)
Slides: Data Transfer for HPC (Cornell users)
Recording: Data Transfer for HPC

Mar. 5: An Overview of AI

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Chris Myers, CAC

Description: Artificial Intelligence (AI) is an interdisciplinary area of study that connects computer science, neuroscience, engineering, mathematics, physics, philosophy, psychology, and other fields, with broad aims that include: investigating the underpinnings of human intelligence and cognition; simulating human intelligence in computers and other machines; engineering software and/or hardware systems that can "learn" from their experiences in the world around them; and building systems that can perform as well as or even better than humans on a variety of tasks. While the field of AI has been around for decades, the products of AI have recently generated both a lot of excitement and a lot of concern. This lecture will present an overview of some of the key concepts and elements associated with recent developments in AI, as well as some of the connections between AI applications and the machine learning and deep learning techniques that are used to implement those applications.

Slides: An Overview of AI (WCM/WCM-Q users)
Slides: An Overview of AI (Cornell users)
Recording: An Overview of AI

Mar. 12: Accelerated General Data Science in Medicine with RAPIDS, CuPy and Numba

Time: 12pm-1pm ET, with 30 mins optional discussion time immediately afterward.

Instructor: Huiwen Ju, PhD, NVIDIA

Description: We will discuss how to accelerate general medicine data analytics pipelines and tackle their common bottlenecks (e.g., memory management, data loading, data format conversion across different frameworks) by using GPU-accelerated Python libraries like RAPIDS. Additionally, we will demo an end-to-end GPU-accelerated pipeline for an electrocardiogram AI application.

Slides: Accelerated General Data Science in Medicine with RAPIDS, CuPy, and Numba (WCM/WCM-Q users)
Slides: Accelerated General Data Science in Medicine with RAPIDS, CuPy, and Numba (Cornell users)
Recording: Accelerated General Data Science in Medicine with RAPIDS, CuPy, and Numba

Mar. 19: Getting Started with R

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Zilu Wang, CAC

Description: RStudio is a common R interface used by researchers. This overview is designed to familiarize new users with the interface, features, and best practices so they are ready to conduct their own analyses. New content includes an overview of Quarto, an evolution of R Markdown, for documenting and sharing data analysis and research.

Slides: Getting Started with R (WCM/WCM-Q users)
Slides: Getting Started with R (Cornell users)
Recording: Getting Started with R

Mar. 26: Data Analysis with R

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Christopher Cameron, CAC

Description: This presentation includes several examples of data analysis and visualization in R intended to help researchers determine if learning R is a good investment for their research. Prospective attendees are welcome to submit their own examples for consideration and possible inclusion.

Slides: Data Analysis with R (WCM/WCM-Q users)
Slides: Data Analysis with R (Cornell users)
Recording: Data Analysis with R

April 9: SCU HPC Job Optimization Methods and Techniques

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructors: Jonathan Parziale and Eugene Dedits, WCM SCU

Description: SCU HPC clusters are powerful tools. To help you take full advantage of their computational power, we will present techniques and methods for analyzing job requirements and for properly scoping and defining your job parameters in Slurm, so that you make efficient use of the SCU HPC clusters. While tailored for WCM SCU systems, the presentation is applicable to other Slurm clusters.

Slides: SCU HPC Job Optimization (WCM/WCM-Q users)
Slides: SCU HPC Job Optimization (Cornell users)
Recording: SCU HPC Job Optimization

Fall 2024

Oct. 2: Introduction to Python

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Chris Myers, Cornell University Center for Advanced Computing

Description: This lecture will introduce the Python programming language, the Python software ecosystem, some key concepts in computer programming, and how those concepts are implemented in Python. The Python ecosystem contains a rich set of packages and tools to support research and data analysis in several different application areas. The ability to use Python to customize computing workflows enhances researcher productivity and capability. The material is intended both for people new to programming or new to Python who want to get started, and for more experienced Python programmers who would like to get a different perspective on how Python supports a variety of programming tasks.

Introduction to Python Slides (WCM/WCM-Q users)
Introduction to Python Slides (Cornell users)
Introduction to Python Recording

Oct. 9: Introduction to JupyterLab for Python/R

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Christopher Cameron, Cornell University Center for Advanced Computing

Description: Jupyter Notebooks are a popular format for scientific communication that intermingle descriptive text, code, statistical analysis, results, and visualizations in a single document. This workshop showcases the features of Jupyter notebooks, demonstrates how to use and share notebooks effectively, and explains how to address common pain points. This workshop is useful for people who send, receive or use Jupyter notebooks.

Introduction to JupyterLab for Python/R Slides (WCM/WCM-Q users)
Introduction to JupyterLab for Python/R Slides (Cornell users)
Introduction to JupyterLab for Python/R Recording

Oct. 23: Python for Computational Science, Data Science, and Machine Learning

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Chris Myers, Cornell University Center for Advanced Computing

Description: The Python software ecosystem provides a rich collection of tools and packages, developed by many programmers with expertise in different areas, that support computation in many different application areas. The Python ecosystem is especially useful in empowering researchers and developers in the areas of computational science, data science, and machine learning, and has long been embraced by people working on such problems. This lecture will provide a broad overview of some of the key components of the Python software stack that provide support for numerical computing, scientific algorithms, data processing, data visualization, and machine learning, along with some examples of their use.

Level: Intermediate

Prereqs: Some familiarity with the Python programming language or other languages used for scientific computing (e.g., R, MATLAB) would be useful, but is not required.

Slides: Python for Computational Science, Data Science, and Machine Learning (WCM/WCM-Q users)
Slides: Python for Computational Science, Data Science, and Machine Learning (Cornell users)
Recording: Python for Computational Science, Data Science, and Machine Learning

Oct. 30: Getting Started with R

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Zilu Wang, Cornell University Center for Advanced Computing

Level: Introductory

Prereqs: None

Slides: Getting Started with R (WCM/WCM-Q users)
Slides: Getting Started with R (Cornell users)
Recording: Getting Started with R

Nov. 6: Data Analysis with R

Time: 9am-10am ET, with 30 mins optional discussion time immediately afterward.

Instructor: Christopher Cameron, Cornell University Center for Advanced Computing

Level: Introductory/Intermediate

Prereqs: Some familiarity with R

Slides: Data Analysis with R (WCM/WCM-Q users)
Slides: Data Analysis with R (Cornell users)
Recording: Data Analysis with R

Winter/Spring 2024

Feb. 6: Introduction to Python

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: Ben Trumbore, Cornell University Center for Advanced Computing

Description: This lecture will introduce the Python programming language, the Python software ecosystem, some key concepts in computer programming, and how those concepts are implemented in Python. The Python ecosystem contains a rich set of packages and tools to support research and data analysis in several different application areas; being able to use the Python programming language to customize computing workflows that leverage those tools enhances researcher productivity and capability. The material is intended both for people new to programming or new to Python who want to get started, and for more experienced Python programmers who would like to get a different perspective on how Python supports a variety of programming tasks. This will be an encore presentation of material from a previous SCTS talk.

Level: Introductory

Prereqs: None

Introduction to Python Slides (Requires WCM CWID to access)
Introduction to Python Slides (Requires CU NetID to access)
Introduction to Python Recording

Feb. 13: Intermediate Applied R

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: Chris Cameron, Cornell University Center for Advanced Computing

Description: Practical R techniques and best practices for researchers. Topics include R as a programming language, performance and optimization, and parallelizing R code.

Level: Intermediate

Prereqs: Some familiarity with R

Intermediate Applied R Slides (Requires WCM CWID to access)
Intermediate Applied R Slides (Requires CU NetID to access)
Intermediate Applied R Recording

Feb. 20: Intermediate Python

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: Chris Myers, Cornell University Center for Advanced Computing

Description: Python is both a programming language and a software ecosystem that is widely used to support many tasks. Programmers new to Python are able to begin working productively for simple tasks and for well-documented pipelines. This lecture will address some more advanced features of both the language and the ecosystem that are useful for tackling more complex and numerically intensive computations that arise in scientific computing.

Intermediate Python Slides (Requires WCM CWID to access)
Intermediate Python Slides (Requires CU NetID to access)
Intermediate Python Recording

March 5: Python Use Cases

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: Chris Cameron, Cornell University Center for Advanced Computing

Description: A collection of use cases for Python based on examples solicited from WCM researchers and CAC involvement in research projects.

Python Use Cases Slides (Requires WCM CWID to access)
Python Use Cases Slides (Requires CU NetID to access)
Python Use Cases Recording

March 12: AI, Machine Learning, and Deep Learning with Python

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: Chris Myers, Cornell University Center for Advanced Computing

Description: Artificial Intelligence, Machine Learning, and Deep Learning with neural networks comprise a powerful set of tools and techniques to enable computer programs to learn from data. Many popular packages supporting these approaches are available for use with the Python programming language, due in part to Python's expressive syntax, rich ecosystem, and ability to link to code written in other languages. This lecture will provide an overview of Machine Learning and Deep Learning, an introduction to some key Python packages supporting work in those areas, and some examples of their use.

Level: Intermediate

Prereqs: Some familiarity with Python

AI, Machine Learning, and Deep Learning with Python Slides (Requires WCM CWID to access)
AI, Machine Learning, and Deep Learning with Python Slides (Requires CU NetID to access)
AI, Machine Learning, and Deep Learning with Python Recording

March 19: Deep Learning and Generative AI Use at Research Hospitals

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: Bennett Wineholt, Cornell University Center for Advanced Computing

Description: Hospitals are already deploying deep learning and generative AI technologies to benefit patients. How are they deriving benefits and safeguarding quality of patient care while deploying these cutting-edge technologies? We will focus on two specific use cases which illustrate how to deploy and supervise deep learning techniques and large language models in health care settings. Image segmentation of multi-channel brain MR images can assist radiologists in identifying areas of significant medical interest. Hospital patient readmission prediction from reading providers' unstructured notes can save patient lives by keeping needful patients in the hospital, while allowing lower risk patients the convenience and time savings to head home.

Level: Intermediate

Prereqs: Familiarity with Python and Linux

Deep Learning and Generative AI Use at Research Hospitals Slides (Requires WCM CWID to access)
Deep Learning and Generative AI Use at Research Hospitals Slides (Requires CU NetID to access)
Deep Learning and Generative AI Use at Research Hospitals Recording

April 9: AWS 101

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: AWS Solutions Architect Team

Description: In this session, we will cover the AWS services which are most commonly used in research: EC2 and AWS storage.

1. EC2 - What is EC2? What types of EC2 instances are there? How can you save costs when using EC2?
2. Storage on AWS - What are S3, EBS, and EFS? When should each be used? What features do they offer?
3. The Cornell team will cover the process and policy for provisioning AWS accounts.

Slides

CU Net ID: AWS 101 slides
WCM: AWS 101 slides

Recording

AWS 101

April 16: SageMaker 101 (AI/ML on AWS)

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: AWS Solutions Architect Team

Description: Amazon SageMaker is the AI/ML service used to build, train and deploy models. It encompasses a broad set of tools like Studio, Canvas, notebook instances, debuggers, profilers, pipelines, MLOps, and more. In this session, we will cover the below:

1. Introduction to AI/ML on AWS
2. Introduction to SageMaker and its capabilities.

Level: Introductory

Prereqs: None

Slides

AWS SageMaker (Requires Cornell University Net ID to access)
AWS SageMaker (Requires WCM CWID to access)

Recording

AWS SageMaker

April 23: SageMaker Studio

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: AWS Solutions Architect Team

Description: Amazon SageMaker Studio, a part of the SageMaker umbrella of services, is an IDE for ML that provides a single unified interface for all the tools you need, including Jupyter notebooks and RStudio, to take your models from experimentation to production and boost your productivity. SM Studio provides access to all ML resources in one place.

In this session, we will provide an overview of SM Studio and its features: data preparation and feature engineering using Data Wrangler, building ML models using Jupyter notebooks, and training and deploying models.

Level: Introductory

Prereqs: None

Slides

AWS SageMaker Studio (Requires Cornell University Net ID to access)
AWS SageMaker Studio (Requires WCM CWID to access)

Recording

AWS SageMaker Studio

April 30: GenAI on AWS

Time: 9am-10am ET, with optional discussion time immediately afterward.

Instructor: AWS Solutions Architect Team

Description: In this session, we will introduce you to the Generative AI landscape on AWS. We will discuss the AWS services used to build applications for summarization, image generation, chatbots, and fundamentals of Retrieval Augmented Generation for Generative AI.

1. Introduction to Bedrock, Knowledge Bases, and Agents, and how to use them to build GenAI applications.
2. Introduction to Amazon Q for Business to build a virtual assistant.

Level: Introductory

Prereqs: None

Slides

GenAI on AWS (Requires Cornell University Net ID to access)
GenAI on AWS (Requires WCM CWID to access)

Recording

GenAI on AWS

Fall 2023

Oct. 4: Introduction to Python

Instructor: Chris Myers

Description: This lecture will introduce the Python programming language, the Python software ecosystem, some key concepts in computer programming, and how those concepts are implemented in Python. The Python ecosystem contains a rich set of packages and tools to support research and data analysis in several different application areas; being able to use the Python programming language to customize computing workflows that leverage those tools enhances researcher productivity and capability. The material is intended both for people new to programming or new to Python who want to get started, and for more experienced Python programmers who would like to get a different perspective on how Python supports a variety of programming tasks.

Level: Introductory

Prereqs: None

Introduction to Python Slides (Requires WCM CWID to access)
Introduction to Python Slides (Requires CU NetID to access)
Introduction to Python Recording

Oct. 18: JupyterLab (in the Cloud) for Python

Instructor: Christopher Cameron

Description: Jupyter is the most common Python interface used by researchers. Cloud computing providers, like Amazon and Google, offer their own Jupyter-alike interfaces. This lecture provides an overview of the JupyterLab interface and cloud-based derivatives. It is designed to familiarize new users with common Jupyter-like interfaces, features and best practices.

Level: Introductory

JupyterLab (in the Cloud) for Python Slides (Requires WCM CWID to access)
JupyterLab (in the Cloud) for Python Slides (Requires CU NetID to access)
JupyterLab (in the Cloud) for Python Recording

Nov. 1: Scientific Computing with Python (with hands-on)

Instructor: Chris Myers

Description: This lecture will provide an overview of select core components of the Python software ecosystem for scientific computing and data science, with a particular focus on numpy, scipy, pandas, and matplotlib. The lecture will include both descriptions of the overall design and structure of those packages and their key components, and numerous code examples that demonstrate some of the important functionality. Opportunities for live, hands-on exercises using these packages will be integrated throughout the lecture, all as part of a Jupyter notebook that will include both the lecture content and the hands-on exercises. The Python ecosystem for scientific computing and data science enables researchers to use proven and widely used tools that are easily customized for specific problems using the Python programming language.

Level: Intermediate

Prereqs: Some familiarity with the Python programming language or other languages used for scientific computing (e.g., R, MATLAB) would be useful, but is not required. The hands-on exercises will be coordinated through the use of an online cloud environment providing support for running Jupyter notebooks, although participants should feel free to use their own local machines if they are familiar with running Jupyter and installing whatever additional packages might be necessary. Instructions about these details will be circulated in advance of the lecture, but participants should be prepared to set up and/or sign up for accounts in those environments before the lecture so that they are ready to run the hands-on exercises during the allotted time.

Scientific Computing with Python (hands-on) Slides (Requires WCM CWID to access)
Scientific Computing with Python (hands-on) Slides (Requires CU NetID to access)
Scientific Computing with Python (hands-on) Recording

Nov. 15: Getting Started with R

Instructor: Christopher Cameron

Description: RStudio is a common R interface used by researchers. This overview is designed to familiarize new users with the interface, features, and best practices so they are ready to delve into conducting their own analyses. New content includes an overview of Quarto, an evolution of R Markdown, for documenting and sharing data analysis and research.

Getting Started with R Slides (Requires WCM CWID to access)
Getting Started with R Slides (Requires CU NetID to access)

Getting Started with R Recording

Nov. 29: Data Analysis with R

Instructor: Christopher Cameron

Description: This lecture presents several examples of data analysis and visualization in R. It will demonstrate a variety of analyses intended to help researchers determine if learning R is a good investment for their research, including new data analysis examples drawn from the WCM community.

Level: Introductory/Intermediate

Prereqs: Some familiarity with R

Data Analysis with R Slides (Requires WCM CWID to access)
Data Analysis with R Slides (Requires CU NetID to access)

Data Analysis with R Recording

Winter/Spring 2023

Feb. 7: Data Management in Science Research

Instructor: Adam Brazier

Description: An overview of managing data workflows for scientific computing, starting with data collection and aggregation, through processing and storing in an accessible form. We will cover some issues relating to security policy, integration of Identity and Access Management and retention policy (but this is not a security policy workshop!), possible storage venues and formats, models for aggregating and distributing data such as the pub/sub model, and modes of data storage such as relational database, file system, cloud, NoSQL, data lake, etc.

Level: Introductory/Intermediate

Prereqs: Some knowledge of software and data processing

Time: 9am-10am EST

Data Management in Science Research Slides (Requires WCM CWID to access)
Data Management in Science Research Slides (Requires CU NetID to access)
Data Management in Science Research Recording

Feb. 14: R Basics

Instructor: Christopher Cameron

Description: Learn to read R analysis scripts in this introduction to the R language. We will examine language fundamentals like built-in data types, conditional execution, flow control, and indexing, then look at some basic data summary and modeling functions with an emphasis on how R is meant to be used.

Level: Introductory

Prereqs: Some experience with statistical concepts and tabular data formats or spreadsheet software.

R Basics Slides (Requires CU NetID to access)
R Basics Slides (Requires WCM CWID to access)
R Basics Recording

Mar. 14: Python for Scientific Computing and Data Science

Instructor: Chris Myers

Description: An examination of the core components of the Python software ecosystem for scientific computing and data science, with a particular focus on numpy, scipy, and pandas. This lecture will describe the overall design and structure of these packages and some of their components, complemented by code examples that demonstrate some of the key functionality. Also addressed will be issues of performance and the integration of these core packages in the larger Python ecosystem.

Level: Intermediate

Prereqs: Some familiarity with the Python language would be useful but is not required.

Time: 9am-10am EDT

Python for Scientific Computing and Data Science Slides (Requires CU NetID to access)
Python for Scientific Computing and Data Science Slides (Requires WCM CWID to access)
Python for Scientific Computing and Data Science Recording

Mar. 21: Python for Digital Humanities and Social Science

Instructor: Christopher Cameron

Description: Humans generate messy data. While statistics-focused environments like R and Stata are great for data analysis, these specialized tools can be difficult to use with data that defies tabular representation. Human data, like written language, social relationships, images, and social media content, require flexible tools that can handle complexity. In this talk, we will provide an overview of Python, highlight how this free and open-source programming language supports digital humanities and social science research, and discuss Cornell and web-based resources to help you get started using Python in your research.

Level: Introductory

Prereqs: Some experience working with tabular data formats or spreadsheet software is helpful but not required.

Time: 12:15pm-1:15pm EDT

Python for Digital Humanities and Social Science Slides (Requires CU NetID to access)
Python for Digital Humanities and Social Science Slides (Requires WCM CWID to access)
Python for Digital Humanities and Social Science Recording

Mar. 28: Creating the Best Visualizations for your Data

Instructor: Ben Trumbore

Description: An introduction to choosing the best type of chart to use for the data you have and the message you want to convey. Includes a breakdown of the different types of data you might have and descriptions of the main types of 2D data visualization. Does not include instruction for any particular visualization tool.

Level: Introductory

Prereqs: None

Time: 9am-10am EDT

Creating the Best Visualizations for your Data Slides (Requires CU NetID to access)
Creating the Best Visualizations for your Data Slides (Requires WCM CWID to access)
Creating the Best Visualizations for your Data Recording

Apr. 11: Revision Control with Git

Instructor: Steve Lantz

Description: Git is a widely used tool for revision tracking and collaborative code development. The talk introduces Git and how to use it effectively in conjunction with a repository hosting service like GitHub.

Level: Intermediate

Prereqs: Programming ability and activity at a level to warrant revision tracking and (possibly) collaborative development of codes.

Revision Control with Git Slides (Requires CU NetID to access)
Revision Control with Git Slides (Requires WCM CWID to access)
Revision Control with Git Recording

Apr. 25: Python for Data Visualization

Instructor: Chris Myers

Description: An examination of some of the Python packages that support data visualization for various use cases, providing both a general discussion of capabilities and multiple code examples demonstrating specific functionality. This lecture will address the generation of both static images suitable for inclusion in publications and presentations, and interactive data visualizations useful for exploring complex datasets and steering computations. Packages examined include matplotlib, pandas, seaborn, plotnine, bokeh, plotly, and possibly others.

Level: Introductory/Intermediate

Prereqs: Some familiarity with the Python language would be useful but is not required.

Python for Data Visualization Slides (Requires CU NetID to access)
Python for Data Visualization Slides (Requires WCM CWID to access)
Python for Data Visualization Recording

May 2: Research Project Software Continuity

Instructor: Adam Brazier

Description: While producing long-lasting software in academic research domains shares many of the same problems as commercial development, the environment is often different. In particular, the number of coders is often smaller, the people writing code may be learning as they go, development of software is often not their main career goal, and the funding model is different. This means that industry approaches to producing, maintaining, and operating software may not apply, or may have to be modified for the research environment. In this talk we will see some ideas, based on experience of research software at a variety of scales, to suit the different situations in which researchers develop software.

Research Project Software Continuity Slides (Requires CU NetID to access)
Research Project Software Continuity Slides (Requires WCM CWID to access)
Research Project Software Continuity Recording

May 9: Working with Excel Files in Python and C#

Instructor: Ben Trumbore

Description: An introduction to working with Excel spreadsheets from within computer programs and scripts. Python and C# examples will be given for reading Excel files and accessing their contents, as well as populating, formatting, and writing new Excel files.

Level: Intermediate

Prereqs: Familiarity with Excel files and modest experience programming in Python, C# or Java.

Working with Excel Files in Python and C# Slides (Requires CU NetID to access)
Working with Excel Files in Python and C# Slides (Requires WCM CWID to access)
Working with Excel Files in Python and C# Recording

May 23: Case Study - Scripting ImageJ and PowerPoint with Python

Instructor: Christopher Cameron

Description: Do you have a workflow with elements that can be automated? Sometimes the hardest part is knowing what might be possible. This case study involves using Python to process multichannel confocal microscopy images with ImageJ and then organize the output into PowerPoint slides.

Level: Introductory

Prereqs: None

Case Study - Scripting ImageJ and PowerPoint with Python Slides (Requires CU NetID to access)
Case Study - Scripting ImageJ and PowerPoint with Python Slides (Requires WCM CWID to access)
Case Study - Scripting ImageJ and PowerPoint with Python Recording

June 6: Using the Whole Processor

Instructor: Steve Lantz

Description: Parallel processing is no longer just a concern for supercomputers--these days, it takes place in nearly all computing devices down to laptops and cell phones. This presentation describes parallel computing capabilities that are found within single processors and how applications can access them through techniques such as multithreading and vectorization.

Level: Introductory/Intermediate

Prereqs: Familiarity with programming in any language and with using a command-line interface

Slides - Using the Whole Processor (Requires CU NetID to access)
Slides - Using the Whole Processor (Requires WCM CWID to access)
Recording - Using the Whole Processor

June 20: Using Relational Databases for Research

Instructor: Adam Brazier

Description: An introduction to the use of relational (SQL) databases. After a brief overview of database structure, the session covers SQL queries, best practices, and development tools. We will mostly deal with ANSI SQL, which will run on most Relational Database Management Systems (RDBMSs), noting some important differences between them. Topics include SQL queries for data retrieval, insertion, and deletion; correlated subqueries; and how to construct a complicated query. We will also discuss the interface between the database and the code, including the use of Object-Relational Mapping (ORM) tools and stored procedures.

Level: Introductory/Intermediate

Prereqs: Knowledge of how to write SQL queries for data extraction, insertion, and deletion is not required. The session also works well as a companion to a workshop on that topic or to pre-existing knowledge.

Slides - Using Relational Databases (Requires CU NetID to access)
Slides - Using Relational Databases (Requires WCM CWID to access)
Recording - Using Relational Databases

Fall 2022

Nov. 1: Data Transfer Tools

Instructor: Ben Trumbore

Description: An introduction to the numerous tools that can be used to transfer data between computer systems and to cloud storage. Different tools offer different transfer speeds, security, and ease of use, and some automatically recover from failures and allow syncing between systems. Tools covered in this session include FTP, SCP, SFTP, rsync, rclone and Globus.

Prereqs: Familiarity with basic Linux commands.

Time: 9am-10am EST

Slides + Recording:

Data Transfer Tools Slides (requires WCM CWID to access)
Data Transfer Tools Slides (requires CU NetID to access)
Data Transfer Tools Recording

Nov. 9: Introduction to Modern R Data Analysis

Instructor: Christopher Cameron

Description: An overview of RStudio interfaces (scripts, console, and R Markdown notebooks) and popular modern data manipulation and visualization packages in R (especially ggplot2 and dplyr). This session is primarily useful for people seeking an entry point into the R ecosystem for research and data analysis tasks.

Prereqs: Some experience with statistical concepts and tabular data formats or spreadsheet software.

Time: 9am-10am EST

Introduction to Modern R Data Analysis Slides (requires WCM CWID to access)
Introduction to Modern R Data Analysis Slides (requires CU NetID to access)
Introduction to Modern R Data Analysis Recording

Nov. 15: Introduction to Jupyter Lab for Python

Instructor: Christopher Cameron

Description: Jupyter Notebooks are a popular format for scientific communication that intermingles descriptive text, code, statistical analysis, results, and visualizations in a single document. This workshop showcases the features of Jupyter notebooks, demonstrates how to use and share notebooks effectively, and explains how to address common pain points. This workshop is useful for people who send, receive or use Jupyter notebooks.

Prereqs: A basic familiarity with interactive Python is helpful but not required.

Time: 9am-10am EST

Introduction to Jupyter Lab for Python Slides (requires WCM CWID to access)
Introduction to Jupyter Lab for Python Slides (requires CU NetID to access)
Introduction to Jupyter Lab for Python Recording

Dec. 6: Introduction to Python

Instructor: Chris Myers

Description: An introduction to both the Python programming language and the broader Python software ecosystem of packages that support different sorts of tasks, for those interested in learning the language or deciding if Python is something that they want to learn more about. Where pertinent, connections to other programming languages and technical computing environments will be highlighted.

Prereqs: No knowledge of the Python language is assumed. Some prior programming experience in any language would be helpful, since there will be some expectation of familiarity with basic programming concepts.

Time: 9am-10am EST

Introduction to Python slides (requires WCM CWID to access)
Introduction to Python slides (requires CU NetID to access)
Introduction to Python recording

Dec. 13: Linux for Researchers

Instructor: Steve Lantz

Description: Presents an introduction to using Linux operating systems. Includes practical techniques for working with the file system, descriptions of common commands and information about customizing a user’s environment. Can be tailored to a specific flavor of Linux.

Prereqs: Some familiarity with hierarchical file systems and a modern computer operating system (macOS, Linux, or Windows).

Time: 9am-10am EST

Linux for Researchers Slides (Requires WCM CWID to access)
Linux for Researchers Slides (Requires CU NetID to access)
Linux for Researchers Recording


Need Help?