CheatGPT
This week I stepped out of one world and landed in another. It started while I was marking assignments in a few of my programming courses.
The quality of the code I was reading was amazing! My students were doing the sorts of things that my open source peers do: using modern syntax, using functional approaches instead of creating a lot of unnecessary state, composing deeply nested function calls, and employing interesting patterns for solving problems.
I praised the code I was seeing on a number of assignments. But then I started to notice how often I was writing these same comments. Despite the fact that professors like to think that they are becoming better teachers over time, it's hard to imagine that I've improved this much since last term, or the term before that, or any of the dozens of times I've taught this same course.
I don't think I've changed, nor has the subject. It's also hard to imagine that these students are an order of magnitude better. Yes, student cohorts can be vastly different; but this different? I think there's another explanation.
For a long time, I've been wondering aloud how higher education is going to deal with the sophistication of ChatGPT, Copilot, and other AI tools. Well, wonder no longer, because this is where I live now.
"Dave, you're making assumptions. Can you prove any of this?" I can, actually, since some submissions that required screenshots also included ChatGPT browser tabs, which helpfully included the initial text of the prompt. Apparently, it's not even something students feel they need to hide.
I love GPT, and I've been programming using OpenAI's GPT-3 API for a long time. It's been breaking my understanding of what is and isn't possible, and forcing me to think about new ways to solve technical problems. I get why everyone is excited to use it.
However, I'm not sure how to understand it as part of student learning. My daughters, who are seeing the same thing happening among their peers at university, encouraged me to think about this outside the context of my own courses: "Nothing you do in your class is going to change what's happening. This is a global phenomenon."
Of course they're right, so let's be clear: this is happening. People are using large language models (LLMs) to write their code. From GitHub's own marketing:
Back in June 2022, we reported that GitHub Copilot was already generating 27% of developers’ code. Today, we’re seeing this happen more and more with an average of 46% of code being built using GitHub Copilot across all programming languages, and 61% among developers using Java.
Think of all the time saved! But what if I rephrase this slightly: "an average of 46% of students' online tests and assignments [are] being built using GitHub Copilot." If we add ChatGPT to the mix, we could substitute words like "essays" and "reports" for "code." Is that good or bad? Innovative or regressive? Helping or hurting our students?
A lot of people who don't have to deal with the practical implications of this in their workaday lives have given me all kinds of helpful ideas: use oral or paper-based testing with no internet; "just" stop using assessments that people can cheat on; or have students write about the responses from the AI, discussing why they are good or bad.
The first solution is hard for lots of reasons, not least the current funding model of post-secondary institutions, which does not prioritize the faculty-to-student ratio necessary for ever more personalized or real-time assessment methods. Larger and larger classes make many of these good ideas impractical. Faculty have zero control over this, but by all means, please talk to our senior leadership. It would be great.
The second solution is challenging with students early in their studies. I teach two types of students: 1) some who are learning to program in their first year; 2) the rest in their final year, after they've learned half-a-dozen languages and written many programs.
In my opinion, the students learning to program do not benefit from AI helping to "remove the drudgery" of programming. At some point, you have to learn to program. You can't get there by avoiding the inevitable struggles of learning-to-program. "Dave, maybe you had to learn that way, but that world is gone!" It's true. I've been programming continuously for ~40 years. Following my inefficient, meandering path today would have zero benefit. It's impossible to go back, and I wouldn't want to. The potential of modern tooling, including AI, is too significant. We can't ignore or forbid AI in education. But having students use it to literally write their assignments isn't going to work either.
The second group of students, those in their final year, are in a different position. I'm (usually) not trying to get them to understand OOP, different ways of enumerating collections, or how and when to write a function. I'm also not assessing this. At this stage of their development, it's assumed that they can program.
For these students, AI provides something valuable. For example, this week I was reviewing code that one of my open source students had written. As I read, I needed to know more about how an API was being used. I did a Google search for the answer, and the only thing that came back was this same student's blog post (a common occurrence at this level). My problem had caused me to run into the limits of "search," which was no longer sufficient. So I went to ChatGPT and asked it about my ideas, and if there was a better way to use this API. Sure enough, ChatGPT spit out 25 lines of code that confirmed what I thought, and demonstrated an alternative way to write this code.
Except that half the code was made-up. One of the API calls that it was using doesn't exist (I wish it did!). It didn't matter to me, because I wasn't looking to use this code as-is (I went and read the official docs): I only wanted to read about other approaches. I wanted to sketch out an idea.
Text returned from LLMs requires an incredible level of discernment and close reading. The fact that you can sometimes copy/paste things directly from ChatGPT can mislead people into thinking that you always can. You can't. But what if that's all you're trying to do? What about students who don't really know what they're doing and just need to get something submitted for this damned assignment before 11:59 pm? I promise you that this use case is more common than people realize.
I don't know what the right answer is. I don't want to be like so many of the teachers I had, who were skeptical of students' ability to safely incorporate computers, then CD-ROMs, then the Internet, into their learning.
It's a heavy burden to have to figure this out, and no other change I've seen in the past 20+ years of teaching has really compared with this one. Students copy/pasting from Stack Overflow seems quaint by comparison. We've entered a new world and it's going to require a thoughtful approach.
I've appreciated people like Ethan Mollick writing about how to embrace, rather than fear, AI in education. This is absolutely how I want to approach things. However, as I've written above, it's not easy (for me) to see how to do it with all of my students.
One of the things I'm thinking about is that I might need to start teaching students how to use GPT/Codex and other LLMs in their programming. Instead of "Thou shalt not," I could try to model some ethical and pragmatic uses of AI in programming. What does programming, and learning-to-program, look like post-GPT? We need to talk about this. I hope that more of my colleagues will write about this, both in education and industry.