I'm beginning another term today, and the majority of what I'm teaching this time is Cloud Computing using AWS. I spent a good part of 2021 researching and developing this course, and taught it for the first time during the winter semester. Now that I've arrived at the "teach it again" stage, I wanted to reflect on it a bit.
We offer a number of courses on cloud computing already, but nothing tailored to developers. As is so often my motivation, I wanted a course that provided a ready-made path for programmers to take, one which avoided the meandering, haphazard way that I had to learn it.
I decided to begin by asking friends in industry what they thought I should include and avoid. I reached out to colleagues, former students, and friends working at big companies (FAANG), startups, and in government. I spoke with people working in media, e-commerce, banking, the energy sector, and social media. It was fascinating to hear the different perspectives they had, and where they agreed or disagreed.
"What should I teach a junior developer about the cloud?"
Here's some of what I heard:
- "Everyone uses the cloud." Having cloud experience is really important for being able to go after good jobs in tech.
- "The cloud is enormous. You can't teach all of the cloud in a single course. Your students are going to be overwhelmed." Everyone is overwhelmed by it. Focus on breadth over depth.
- Focus on a single cloud. Don't bother with multi-cloud
- "The cloud is primarily Linux." Make sure they know how to use it. The cloud is glued together and automated with command-line scripts.
- "The programming language you choose doesn't matter." Use node, python, Go, whatever you want, they are all fine, but pick one you already know so you're not learning two things at once (our students know node.js the best, so I use that)
- "Everything in source control. Period. Always." Knowing git and GitHub is critical, and also that the entire lifecycle of software changes happens in git (proposal, implementation, testing, deploying). Force students to work entirely in git/GitHub for everything.
- "The cloud is cattle, not pets." As quickly as possible, move them away from thinking about logging into machines to do manual tweaks, and instead think about code and automation
- A lot of people said some version of "'It works on my computer' isn't useful," "Your code isn't useful if it isn't running in production," or "Cloud is what happens after you write your code." Everyone said some version of "CI/CD pipelines are critical for a junior dev to understand."
- "Most cloud workloads are run in containers." Almost everyone told me to focus on containers vs. manually using cloud instances, and to learn how to use them in dev, testing, CI/CD, and production. "Docker and compose are good choices at this stage"
- "Kubernetes is really important" and also "By no means should you teach Kubernetes in this course!" since it's too much ("even for industry"). Leave it for later in their journey
- "Help them understand the cloud's secret sauce: managed services." Learn how to leverage them in your applications vs. running your own.
- Security becomes a central concern in the cloud. Understand the principle of least privilege, the importance of the software supply chain, how to cope with dependencies, etc. Learn to use tools to help manage the complexity.
- Similarly, privacy matters more because all your code and data are literally in the cloud now. Understand the importance of limiting the data you collect/store (what if there's a breach?), and why Personally Identifiable Information (PII) is suddenly a concern in things like log messages.
- Make sure they know how to manage configuration and secrets properly
- Use structured logging everywhere and log aggregation/observability tools to deal with things at scale
- Because "everything is always failing" in the cloud, you have to write your software with different expectations
- You have to understand the pricing structures of your choices and how to avoid a massive bill. The paradox of the cloud is: "The cloud is cheap" but "The cloud is expensive." You can fix things by throwing money at your problems, or you can understand and use better designs. Tagging helps you figure out costs later on.
- Almost everyone I spoke to de-emphasized serverless, which surprised me--I thought it would be near the top of their list, but no one I spoke to thought it was critical to learn at first. I've come to the conclusion that it should almost be its own course vs. something I do in this one (maybe it should be the next one I make)
- Show them how to manage resources manually via the console, but also how to use Infrastructure as Code (IaC) to automate it
- "Learn AWS" - most people agreed that AWS isn't the easiest option, but is the most valuable to learn.
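To make the structured logging and PII advice above a bit more concrete, here is a minimal sketch in Python (the people I spoke to said the language doesn't matter much). The `PII_FIELDS` set, the `fields` key passed via `extra`, and the logger name `api` are all hypothetical choices for illustration, not something taken from the course material.

```python
import json
import logging

# Hypothetical set of field names treated as PII in this sketch.
PII_FIELDS = {"email", "name", "ip_address"}


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Structured fields arrive via `extra={"fields": {...}}`.
        # PII values are replaced before they reach the log stream.
        fields = getattr(record, "fields", {})
        for key, value in fields.items():
            entry[key] = "[REDACTED]" if key in PII_FIELDS else value
        return json.dumps(entry, sort_keys=True)


def build_logger(name: str = "api") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "user created",
        extra={"fields": {"user_id": "abc123", "email": "a@example.com"}},
    )
```

Because every line is a JSON object, a log aggregation tool can filter on `level` or `user_id` without parsing free-form text, and the email address never leaves the process.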
Based on the feedback I got, I developed a course based on AWS that works through the following major topics:
- Cloud Computing and AWS
- Using the AWS Console, CLI, and SDK to manage AWS resources
- Securing apps with Amazon Cognito User Pools, OAuth2
- Configuring apps with Environment Variables and Secrets
- Using git and GitHub to manage source code
- Using GitHub Actions to create a Continuous Integration (CI) workflow that runs Static Analysis, Unit Testing, and Integration Testing
- Using and Managing EC2 instances
- Working with docker, authoring Dockerfiles, docker-compose, and Docker best practices
- Working with public and private Docker registries to push, pull images (Docker Hub and Elastic Container Registry)
- Using GitHub Actions to create a Continuous Delivery (CD) workflow (build and push images to registry, automatic deploys)
- Deploying and running containers on AWS (manually and automatically as part of CD workflows)
- Running containers in CI/CD for integration testing, and simulating AWS with docker-compose (localstack, dynamodb-local, etc)
- S3 for object storage
- DynamoDB for NoSQL
- Infrastructure as Code and CloudFormation
Along the way, I have them build an HTTP REST API microservice, and slowly integrate more and more pieces of AWS, evolving their approach as they go. Over 10 labs and 3 assignments, they get to work with nearly a dozen AWS services and maintain a single app for 14 weeks.
AWS Academy Learner Lab
When I started my course development, I decided to target AWS Educate. It promptly disappeared a few months before I was set to do the first offering ("...everything failing all the time" right?). I had to quickly pivot to Amazon's new offering, AWS Academy.
The majority of what's offered through AWS Academy is pre-canned, lab-based courses that can be delivered at any academic institution. I'm not sure who the audience is, because I don't know too many professors who work this way (I always develop and create my own courses). However, one of the "courses" is called the Learner Lab, and it lets students access AWS resources without a particular course pathway.
To use AWS Academy, an academic institution first has to become a member (luckily, my institution already was). Then, you have to get "nominated" by an existing AWS Academy Member before you are allowed to create an Educator account. After this, you have to work through a number of Educator Orientation and On-boarding modules (these took me half a day).
Once you've jumped through the necessary hoops, you can start to create Classes and invite your students to create Student accounts. You essentially get a Learning Management System on top of AWS. I didn't use any of its features (we have our own LMS), but you could, and it seemed well made.
What's nice about the Learner Lab is that students don't need to create their own AWS Account and never need to enter a credit card (this is huge). Upon creating their account, each student is given $100 in credits to use during the course. If they are enrolled in multiple courses, they get $100 per course (i.e., per course, not per student). Free tier spending doesn't get counted against this $100, so it goes pretty far.
A student's credits cannot be increased or renewed. Students being students, that's something to be aware of, since any number of things can go wrong and leave a student locked out of the lab before the course is over. On the other hand, students being students, you also aren't going to wake up to a $10K bill in the middle of the term. It's a trade-off, but I think it mostly works.
The Learner Lab is essentially a sandboxed AWS account. You log in to AWS Academy and "Start" the lab environment. Doing so activates a pre-made AWS Account, which runs for 4 hours before being shut down. If you need to extend your time, click "Start" again and you get another 4 hours. While the lab is running, you can use the AWS Console, or other AWS APIs like you normally would. When the lab is stopped, services like EC2 instances are paused (they get restarted when the lab is restarted). However, many services still keep working. For example, S3 buckets, DynamoDB tables, even EC2 instances that are being managed by other services stay up (e.g., Elastic Beanstalk). It's a little hard to say what is and isn't running when you stop the lab, and therefore what is and isn't costing you credits.
This simplicity is also one of the downsides. Since you have almost zero ability to drill into an account and figure out what is currently running or where your cost is coming from, you only know that you've spent "$23," and that's it. I had one student come to me in a panic when he noticed he'd suddenly spent $70 in two days. "What's causing this!?" Great question! All of the usual ways you'd figure this out in AWS are not accessible in the Learner Lab, so good luck tracking it down. Thankfully, a professor can connect to a student's AWS work area and look around (also useful for evaluations, where you need to check how things are being used).
The Learner Lab account has access to ~50 AWS services in either us-east-1 or us-west-2. This includes things like CloudFormation, Cloud9, CloudWatch, EC2, EBS, ELB, Lightsail, Rekognition, S3, SageMaker, RDS, SNS, SQS, etc., which covers a lot. But it also leaves out some strange things: for example, there's no IAM, only 1 of the 17 ways to run containers, no way to do API Gateway with Lambda, no Route53, etc. If what you want to do is available, it generally works great, but some services have extra limitations.
For example, with EC2 you can only run Amazon Linux or Windows AMIs, and the largest instance type you get is r5.large (2 vCPU, 16 GiB RAM). However, you can run up to 32 vCPUs in parallel, so you can run quite a few instances at once.
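To put a number on "quite a few," here is a small back-of-the-envelope sketch in Python. The 32 vCPU ceiling and the 2 vCPUs per r5.large come from the limits described above; the function name `max_instances` is just something I made up for illustration.

```python
# Parallel vCPU ceiling quoted above for the Learner Lab.
VCPU_QUOTA = 32


def max_instances(vcpus_per_instance: int, quota: int = VCPU_QUOTA) -> int:
    """Return how many instances of one size fit under the vCPU quota."""
    if vcpus_per_instance <= 0:
        raise ValueError("vcpus_per_instance must be positive")
    # Integer division: a partial instance can't be launched.
    return quota // vcpus_per_instance


if __name__ == "__main__":
    # r5.large has 2 vCPUs, so 32 // 2 = 16 instances at once.
    print(max_instances(2))
```

So even at the largest allowed size, a student could have 16 instances running in parallel, which is far more than any of my labs need.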
The setup works, but it's not perfectly aligned with how most CS departments think about using computing resources. Most profs I know don't only give isolated labs. You have project work that builds week to week, and the ability to work with long-lived resources over the term is important. There was one point in the winter where all of the Learner Lab AWS resources got deleted (I mean for everyone, not just my students!). The AWS Academy mailing list of other professors around the world came alive as all kinds of people talked about term work being lost and what a disruption it was. It was pretty clear that people assume you can do term-based work in addition to discrete labs.
I think Amazon imagines a world where you use CloudFormation templates to work with stacks per lab. That's one way to solve this, but you can't start learning AWS with CloudFormation, at least I don't know how you'd teach it that way. Students need to work in the console manually for weeks or months before they can be expected to automate everything.
Another thing making this harder than it needs to be is the fact that many third-party IaC or other automation tools are hard to use with the Learner Lab because your credentials get rotated every time you start/stop the lab environment. Imagine you need to use AWS credentials in a CI/CD pipeline, but they change every time you do your work. I found ways around it through careful ordering of topics, and adding non-AWS services into the mix, but it felt like an unnecessary limitation. My requests to Amazon to fix it were met with, "We'll look into it."
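One way to live with rotating credentials is to never bake them into code or config files, and instead read them from the environment each time the program starts, failing loudly if they're missing. Here is a minimal Python sketch of that idea. The three variable names are the standard ones the AWS CLI and SDKs read; that the Learner Lab's credentials are temporary and include a session token is my assumption here, and `AwsCredentials` and `load_credentials` are hypothetical names for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Standard environment variable names read by the AWS CLI and SDKs.
# Assumption: lab credentials are temporary, so a session token is required.
REQUIRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


@dataclass(frozen=True)
class AwsCredentials:
    access_key_id: str
    secret_access_key: str
    session_token: str


def load_credentials(env: Mapping[str, str] = os.environ) -> AwsCredentials:
    """Read credentials from the environment, failing fast if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing AWS credentials: "
            + ", ".join(missing)
            + ". Start the lab and export fresh values."
        )
    return AwsCredentials(
        access_key_id=env["AWS_ACCESS_KEY_ID"],
        secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        session_token=env["AWS_SESSION_TOKEN"],
    )
```

This doesn't solve the CI/CD problem, since someone still has to paste in new values after every restart, but it keeps stale keys out of the repository and makes an expired setup obvious right away.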
The Learner Lab gives you some limited analytics. Using these reports, I can see that the average spend per student during the winter was $8.55 (total, for the whole term), and the average lab time was ~120 hours. Only one student hit $80 (he accidentally reserved a dedicated macOS instance for a few days without understanding what that meant), and another spent 336 hours in the lab. Time in the lab doesn't cost more per se, but it means resources are running longer. I think it's great to see people being curious and exploring.
The majority of what we did fit easily within the Free Tier. I was pretty nervous about how what I wanted to do would translate into per-student cost, since a professor can make recommendations (do this, please don't do that), but you never know what your students will do in reality.
I've learned that I could be more aggressive with what we spend and not run out of room. Even with everything I did, I only managed to spend $20 (the Learner Lab includes a Test Student account, which professors can use to work like a student). I'll see if this term's numbers match up with this conclusion, then slowly start turning up the volume.
Overall, I'm pleased with the whole thing. The course prep and research were fascinating, but the development was overwhelming. I wasn't sure what the students could and couldn't handle, but thankfully my first group proved that the idea will work.
I'm looking forward to updating the labs and projects in the coming terms to add different services, and expose the students to new corners of AWS. I'm also looking at ways to add AWS and the Learner Lab to other courses I teach. It's an obvious fit for my open source students, who need access to ephemeral development environments. I'm not sure if it would be too complicated for my web students. We'll see.
Wish me luck as I take a new (larger) cohort up the AWS mountain!