Module 1
Recordings
Recordings of each session will be made available as soon as possible after the session ends (generally within 24 hours). Links to the recordings are listed below. The home directory where the recordings are hosted is available here.
Day 1 | Monday, February 12, 2024
Course Introduction (45 min)
- Welcome
- Introductions
- What is Data Science?
- What is Earth System Data Science in the Cloud?
- Course Goals and Objectives
- Module Goals and Objectives
- Course Logistics
Introduction to Command Line (60 min)
- What is Bash?
- What is an Environment?
- Navigating your Environment
- Manipulating your Environment
- Command Line Text Editors
Introduction to Python (60 min)
- Launching Python
- Versions
- Python as a Calculator
- Variable Assignment
- Beginning Data Structures
Lunch & Learn (60 min)
- Introductions
- Personal Goals
- Ask Me Anything
Meet your Programming Assistant (60 min)
- Intro to ChatGPT (GPT-3.5 & GPT-4)
- Code Completion
- Best Practices for Communicating:
  - Generation
  - Correction
  - Documentation
  - Translation
Day 2 | Tuesday, February 13, 2024
Introduction to Git (60 min)
- Version Control
- Git Architecture
- Add, Commit, Push
Leveraging LLMs (60 min)
- Large Language Model (LLM) Landscape
- Accessing and using LLMs
- Learning with and from LLMs
- Prompt Engineering
Learning about Learning (30 min)
How do we learn programming languages?
- Read-Eval-Print Loop (REPL)
- Iteration
- Random Contextual Interference
- Programming Assistants
Programming in Python (30 min)
- Leveraging REPL for fast iteration
- Best Practices for Script Construction
- Comments and In-line Documentation
- Beginnings of Workflow Management
Lunch & Learn (60 min)
1200-1300 February 13, 2024
- Progress Check-In
- Questions
Fundamentals of Computing (60 min)
- What is a Programming Language?
- Types of Programming Languages
- Compiled and Scripted
- Interpreters
- Advantages and Disadvantages
- The Programming Language Landscape
- What is a Computer?
- Storage
- Memory (RAM)
- Compute
- Networking
- Python
- History
- Advantages and Disadvantages
- Parallelization (Global Interpreter Lock)
- Support for Different Data Types
- Support for Different Types of Analysis
Programming Paradigms (60 min)
- Objects
- Functions!
- Classes!
- Object Oriented vs Functional Programming
Day 3 | Wednesday, February 14, 2024
Collaborative Git (30 min)
- GitLab & Git
- Cloning, pulling, and pushing
NCICS Coffee Break (30 min)
0930-1000 February 14, 2024
- Celebrations
Collaborative Git (30 min)
- GitLab & Git
- Cloning, pulling, and pushing
Data Types | Python (30 min)
- Scalars
- Vectors
- Arrays/Matrices
- Data Frames
- Indexing
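As a preview of this session, the sketch below walks through the data types listed above using only built-in Python types. The variable names and values are illustrative assumptions, not course material. Arrays and data frames are normally provided by third-party libraries such as NumPy and pandas, which are deliberately not imported here; plain lists and a dict of lists stand in for them.

```python
# Illustrative values only; none of these names come from the course materials.

# Scalars: single values
temperature_c = 21.5          # float
station_count = 3             # int

# Vector: a one-dimensional sequence (a plain list stands in for a NumPy array)
readings = [19.0, 21.5, 23.0]

# Array/matrix: a two-dimensional structure stored as a list of rows
grid = [
    [1, 2, 3],
    [4, 5, 6],
]

# Data-frame-like table: named columns of equal length (a dict of lists)
table = {
    "station": ["A", "B", "C"],
    "temp_c": readings,
}

# Indexing is zero-based; negative indices count from the end
first_reading = readings[0]
last_reading = readings[-1]
middle_slice = readings[1:3]          # elements at positions 1 and 2
row1_col2 = grid[1][2]                # second row, third column
station_b_temp = table["temp_c"][1]   # column lookup, then position
```

The same indexing ideas carry over to library-backed arrays and data frames, although each library adds its own selection syntax on top.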
Dependency Management in Python (60 min)
- Install Libraries/Packages
- Importing Libraries/Packages
- Managing Libraries/Packages
- Environment Management
- Python Library Ecosystem
Lunch & Learn (60 min)
1200-1300 February 14, 2024
- Personal Programming Habits
- Ask Me Anything
Collaborative Git (30 min)
- GitLab & Git
- Branching, Merging, and Issues
Control Structures (60 min)
- Loops
- If/Then
- Case
- Try/Except
- Decorators
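The topics above can be previewed in one short sketch. Everything here is a hypothetical example rather than course material: `logged`, `classify`, and `safe_divide` are made-up names. The "Case" topic is shown as an if/elif/else chain; Python 3.10 and later also offer a `match` statement, which is left out so the sketch runs on older interpreters.

```python
import functools


def logged(func):
    """Decorator: record the wrapped function's name each time it is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append(func.__name__)
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper


@logged
def classify(value):
    """Branch on a value with if/elif/else."""
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    else:
        return "positive"


def safe_divide(a, b):
    """Handle an expected error with try/except instead of crashing."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


# Loop: apply the decorated function to each item in turn
labels = []
for v in [-2, 0, 5]:
    labels.append(classify(v))
```

After the loop runs, `classify.calls` holds one entry per call, which is one simple way to see that a decorator wraps a function without changing its return value.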
Principles & Programming Paradigms (30 min)
- R
- JavaScript
- Rust
Day 4 | Thursday, February 15, 2024
Introduction to Production Machine Learning (60 min)
- Models
- APIs
- Deployment
GitLab for Project Management (60 min)
- GitLab & Git
- Branches, Merges, and Issues
- Communication & Project Structure
Accessing Data: Importing and Querying (30 min)
- Local & Cloud
- Tabular & Gridded
- Data Volume, Velocity, and Variety
Team Kickoff (30 min)
- Introductions
- Research Theme Definition
- Personal and Project Goals
Lunch & Learn (60 min)
1200-1300 February 15, 2024
- Team Project Discussion
Intro to Data Viz in Python (60 min)
- Plotting Libraries
- Introduction to Matplotlib
- Saving Figures
Day 5 | Friday, February 16, 2024
Foundations of Parallel Computing (30 min)
- What is Parallel Computing?
- Units of Parallelization
- MapReduce
- Why MapReduce Changed the World
- The Two Key Types of Parallel Computing
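The MapReduce idea listed above can be sketched as a word count on a single machine. This is a minimal illustration under stated assumptions, not a distributed implementation: the sample `documents` and the `map_phase` and `reduce_phase` names are hypothetical, and the shuffle-by-key stage of a full MapReduce system is collapsed into a direct merge of partial counts.

```python
from collections import Counter
from functools import reduce

# Hypothetical input: each string stands in for one chunk of a larger dataset
documents = [
    "cloud data science",
    "earth system data",
    "data in the cloud",
]


def map_phase(doc):
    """Map: turn one chunk into partial word counts, independent of other chunks."""
    return Counter(doc.lower().split())


def reduce_phase(left, right):
    """Reduce: merge two partial results into one by summing counts per word."""
    return left + right


# Each map call touches only its own chunk, so the calls could run in parallel
mapped = map(map_phase, documents)

# The reduce step combines partial results pairwise into a final answer
word_counts = reduce(reduce_phase, mapped, Counter())
```

Because no map call depends on another, the map step is the unit that can be spread across workers, while the reduce step only needs a merge operation that works in any grouping order.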
Beginning a Project (30 min)
- Defining a Project
- Choosing a Language
- Finding Data
- Accessing Data
- Introduction to Data Formats
Working in Teams (60 min)
- Communication
- Organization
- Roles
- ESDS Team Projects
Closing Team Exercise (60 min)
- Closing Exercise
- Module Wrap Up
- Next Steps