This week I had a chance to talk with Philip Guo, an associate professor at UC San Diego. He's been interested in how CS/programming instructors are dealing with the influx of LLMs, like ChatGPT/Copilot, in their classrooms. After talking with more than a dozen people in the US, he saw my "CheatGPT" post on Hacker News and reached out for my Canadian perspective.
I wanted to write down some of what we discussed, since the many AI conversations I've been having recently are forcing me to ask myself important questions–I still lack answers, but the process is helping me get closer to where I want to be.
We started with how I've used these tools. What I've found most compelling so far has been their ability to help me accelerate difficult tasks. For example: writing distractors for multiple-choice questions, or developing similar types of problems for students to work on without repeating any of the core ideas (i.e., "I need a question like this one, but make it different"). Philip talked about the idea of something being easy to recognize, while hard (i.e., tedious) to generate. I also talked about my experiments building an infinite API backend, which strictly adheres to JSON formats. I've also been playing with the new Chat Completions API to see what I can do with the extra System Prompt–so far I've had good success building a chatbot that helps teach students about coding concepts, but never gives them actual code snippets.
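To give a sense of what I mean, here's a minimal sketch of that kind of tutoring bot, assuming the Python `openai` library and an `OPENAI_API_KEY` in the environment. The prompt wording, model name, and function names are illustrative, not what I'm actually running:

```python
# Minimal sketch of a concept-tutoring chatbot on the Chat Completions API.
# The system prompt does the interesting work: it asks the model to explain
# ideas and guide the student without ever handing back runnable code.
# Assumes OPENAI_API_KEY is set in the environment.
import openai

SYSTEM_PROMPT = (
    "You are a patient programming tutor. Explain concepts, ask guiding "
    "questions, and point to documentation, but never provide actual code "
    "snippets, even if the student asks for them directly."
)

def ask_tutor(question, history=None):
    """Send a student's question to the model, keeping any prior conversation."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": question})

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    return response["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_tutor("How do I loop over the lines of a file in Python?"))
```

Whether the model will reliably honour a "never give code" instruction is an open question, which is part of what makes it an interesting experiment.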
I also talked about one of my big worries with early-semester CS students, namely, that they will overlook the value of going slow while learning, mistaking AI's speed for their own ability. I don't have a great way to describe this idea yet. I'm fascinated by how tasks that seem tedious can feel slow for different reasons. I asked Philip if he ever uses AI when coding: "I tried Copilot when it came out, but at some point I lost access and never felt compelled to go restore it." This is largely how I feel, too. I don't really need AI to write code. It's neat that I can, but in languages where I'm proficient, I can already code at about the same speed that I can type.
So what about a language where I'm not proficient? What if I put myself back into the position of my students? Here I run into a new problem: how do I judge the quality of the code I'm being presented? I see lots of people on Twitter enjoying themselves as they prototype new apps in an afternoon with LLMs: "I don't know any Python and I just wrote this!" I think that's fantastic, and I need to try doing it myself. But I also know that any software you plan to use beyond prototyping requires all kinds of verification, contextual and domain knowledge, security considerations, bug fixes, etc. ChatGPT and Copilot are just as likely to make things up and miss important details when writing code as they are when writing prose. I saw someone online discussing how Copilot had written code with a division-by-zero error, and another showing how it had confused seconds and milliseconds in an algorithm. There's a reason why even the best programmers get their code reviewed by each other. Writing a correct, secure, fast program is hard!
Speaking of code review, I already do a lot of it with my students. What I've experienced in my open source courses is that, for the most part, students don't know how to do it (yet), and are largely afraid to try (for fear they'll miss things or reveal their own shortcomings). Running your eyes and mouse over a piece of code isn't code review. However, a lot of the reason students struggle to do this well is that they are taught to write code, not to read it. We haven't prepared them for it.
Philip and I talked about how literary educators often rely on having their students read texts much more complex than anything they expect them to write. I've always subscribed to this approach, requiring my students to work on and in code that is beyond their current skill level. I've always believed that if you want to become a better writer, you should focus on reading more than writing. But if our current students aren't being taught to wrestle with large codebases as a reading exercise, how will they cope with AI-generated code that may or may not be valid? The reality is that new developers don't have the capacity to debug problems in code they don't yet understand. "Look at how much code it just wrote for me!" Amazing. How many bugs did it include for free? Without a major shift in the way we teach, moving from write-only to read-and-write, I don't see how learning to program is compatible with AI dropping 25-50 lines of code in your editor. Text isn't a reusable component. No one can be handed a bunch of text as-is and hope to combine it with what they already have.
We also discussed the loss of traditional approaches to assessment. One of my favourite assignments for my programming students is to ask them to write various functions, giving them a set of tests and a comment block that describes how the function should work in prose. The difficulty I see now is that what I've just described is a prompt for learning, and hence, a prompt for ChatGPT and Copilot. As a result, marking assignments has become a Turing Test: did you or the AI write this code?
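To make that concrete, here's a hypothetical stub in the style I'm describing (not one of my actual assignments). Handed to a student, the docstring and tests are a learning exercise; pasted into ChatGPT or Copilot, they're a complete prompt:

```python
# A hypothetical assignment stub: a prose description plus tests.
# The specification is exactly the kind of thing an LLM will happily complete.

def word_frequencies(text):
    """Return a dict mapping each lowercase word in `text` to the number of
    times it occurs. Words are separated by whitespace, and punctuation at
    the ends of words should be ignored."""
    # Student implements this...
    raise NotImplementedError

def test_simple_sentence():
    assert word_frequencies("the cat sat on the mat") == {
        "the": 2, "cat": 1, "sat": 1, "on": 1, "mat": 1
    }

def test_ignores_case_and_punctuation():
    assert word_frequencies("Stop! Stop, please stop.") == {"stop": 3, "please": 1}
```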
Back to what I was saying about tedium. I need to try and understand how to differentiate "slow" as in "I'm learning how to do this and I'm not very fast yet" from "slow" as in "this takes forever, despite my skill level." We're told that AI is here to free us from the latter. But in an educational setting, where students are slow in both senses, I need to find compelling ways to help them understand how to build patience as they struggle to acquire new skills. And I need to do this while not withholding the power of AI to free them from working on pointless exercises. To pretend that AI has no place in the classroom is ridiculous. But what are its limits?
I told Philip that I'm planning to have my open source students work with the ChatGPT API in the fall (or whatever it's called by then), building programs that leverage LLMs to solve their own problems. I want to see what students would do if they were allowed to work with this technology outside of academic integrity concerns. I think the only way I'm going to get closer to understanding how to sort this out is to plunge myself into something beyond my own skill level. And I think I need to do it with my students rather than coming up with a plan on my own.
I'm enjoying the many chances I've had lately to talk with people about these issues. There's so much to consider.