How to grade open source work

This fall our Open Source Mozilla course at Seneca will include not only Seneca students but also virtual participants from schools around the world. I'll say more about this next week, but today I want to respond to an email I received from one of the remote schools.

The question is this: "How do we evaluate open source work in an academic setting?"

I'll tell you what we do at Seneca, and others can add to this on their blogs. The short answer is that you focus on process over product, which lets you favour individual experience over canned, standardized assessments, which tend not to lead to meaningful student output.

Now the long version. It starts by replicating open source methods in the structure of the course. In the Mozilla context, this means heavily using IRC, blog posts, blog planets, wikis, Bugzilla, and mailing lists. That order is specific to the Mozilla community, and other communities reorder it (e.g., more geographically dispersed communities tend to rely less on IRC and more on mailing lists). This year I will also likely add Twitter to the mix, because it's another place the community shows up.

By adopting these tools and ways of working--essentially creating a telecommuting or pure online environment--we do two things. First, we put the students into the same collaborative space as the open source community, thus making it easy to get feedback, find projects, and work together. Let me emphasize the word 'same' because it's important: we don't create our own versions of these things and isolate the students; we join the community.

The second thing this approach accomplishes is that it captures participation in a way you can measure and evaluate as an instructor. The other day at our Seneca faculty start-up meeting, the issue of evaluation and student participation came up. Professors love to complain about how students don't participate in class (I noticed that fewer than 2% of the faculty in the room were participating in the discussion). Our experience is that students who don't actively participate in class are more likely to participate online.

The complaint is, "How can I mark students who don't participate?" and our answer is, "By moving them into the online environment, their participation there becomes something you can easily evaluate." Did they blog? Does their wiki page's history show changes in the past month or week? Were they active on IRC? Did they file and comment on bugs? Did they respond to reviews in Bugzilla? Open source is very picky about two things: 1) working in the open; and 2) keeping a history of what happened. Use that to your advantage.
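To make the idea concrete, here is one hypothetical way to turn that public history into a checklist. This is only a sketch: the `Activity` record, the channel names, the 30-day window, and the sample data are all assumptions made for illustration, and it presumes you have already gathered dated activity by hand or with a script of your own.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical channel names, loosely based on the questions above.
CHANNELS = ("blog", "wiki", "irc", "bugzilla")


@dataclass(frozen=True)
class Activity:
    """One dated, publicly visible piece of student activity."""
    student: str
    channel: str
    day: date


def recent_channels(activities, student, today, window_days=30):
    """Return the set of channels where the student was active in the window."""
    cutoff = today - timedelta(days=window_days)
    return {
        a.channel
        for a in activities
        if a.student == student and cutoff <= a.day <= today
    }


def missing_channels(activities, student, today, window_days=30):
    """Return the channels (in CHANNELS order) with no recent activity."""
    seen = recent_channels(activities, student, today, window_days)
    return [c for c in CHANNELS if c not in seen]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    log = [
        Activity("student_a", "blog", date(2008, 9, 20)),
        Activity("student_a", "bugzilla", date(2008, 9, 25)),
        Activity("student_a", "wiki", date(2008, 7, 1)),
    ]
    print(missing_channels(log, "student_a", today=date(2008, 9, 30)))
    # prints ['wiki', 'irc']
```

A list of missing channels is a prompt for a conversation with the student, not a mark in itself; how much weight each channel deserves is still your call.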

What about marking product? One of the most important realizations we had on this front was that we couldn't expect students to "release early, release often" if the grades didn't encourage this style of work. The first time I taught the course, I basically had one large mark for the final project, and students responded by doing most of their work in the last few weeks of the course. The community didn't like this, and neither did I. That's changed now that we've started asking students to make "dot releases."

In our course projects we ask students to make releases 0.1 to 0.3 in the first course, and 0.4 to 1.0 in the second. We find that students need more time to get 0.1-0.3 done than they do for subsequent releases (how do I build? how do I use all these new tools? what's my project?). Even so, we still force them to release regular versions of their code. What's a release? If it's a patch they are working on, we'd expect there to be at least a WIP (Work In Progress) patch posted to the bug, hopefully with a review request. If they are building something new, we expect a working program. It needs to work and it needs to be released (e.g., they have blogged it, the code is available, others can try it). We don't mark things sent by email. We go look on the web like any other user or developer, and if it isn't there, it doesn't exist as far as we're concerned.

Since students are now doing such regular releases, we have a much easier time marking them. Did they get it done on time? Have they made sufficient progress from the previous release? Is the release backed by enough supporting documentation and other info to allow someone to try it? Does it work? Is it done well?

Is it done well? I need to say something about that one. On the quality of the code, another question I hear is, "How do I mark code in an open source product that I myself don't understand?" This is a valid question. The answer is that you teach your students how to get into the open source review pipeline, and then let the community's code review process help you.

Here's an example. This is a piece of code that a student wrote and that then shipped in Firefox 3. In the bug you can see his repeated updates to the code (his "dot releases") as well as the reviewer's comments. Notice how focused a bug is: do this, don't do that. The learning doesn't happen in the bug. Behind the scenes, there is all sorts of work to understand and implement the reviewer's comments (IRC, wiki, blog, etc.). What the bug does is help you gauge and evaluate the quality of the code. If it's not good code, it won't pass review, and progress is measured in terms of the work to address comments and fix issues.

When you combine evaluation of participation in the community with quality of code measured by reviews, you have 80% of what we look for and grade. The other 20% is something we loosely call "Contribution." Open source is not just about people writing code. It's also not just about you writing code in the open. It's about participating in a culture of collaboration. The way you get help in the open source world is by being helpful yourself. I've seen lots of people quickly run out of goodwill in an IRC channel or on a list when all they do is take. In order to get your questions answered and find help repeatedly, you have to be helping too.
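If it helps to see the arithmetic, the weighting could be sketched like this. The 80/20 split is the one described above; the even 40/40 division between participation and code quality inside that 80%, the component names, and the 0-100 scale are assumptions made only for this illustration.

```python
# Weights in whole percent. The 80/20 split comes from the text; the 40/40
# division inside the 80% is an assumption made only for this sketch.
WEIGHTS = {"participation": 40, "code_quality": 40, "contribution": 20}


def final_mark(scores):
    """Combine component scores (each 0-100) into a weighted mark out of 100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Made-up component scores, for illustration only.
    example = {"participation": 90, "code_quality": 70, "contribution": 50}
    print(final_mark(example))  # prints 74.0
```

Keeping the weights in one place makes it easy to show students, up front, that a fifth of the mark depends on work outside their own project.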

Because this is such a necessary part of open source culture, we formalized it into a course requirement. Every student is mainly responsible for a project or bug. Secondarily, every student is responsible for finding ways to contribute to other projects, or to the community as a whole. We intentionally leave this requirement vague so that students can interpret it. It's not unlike Google's 20% time. We want the students to be watching for, and taking, opportunities to get involved in things. Maybe it's testing some build for a developer. Maybe it's helping with documentation. Maybe it's offering to write a small fix for some code. Whatever it is, it's contributing to the health of the community, and it's letting the students follow their passion.

So, in summary, this is what we do:

Mark process over product
Ensure product development happens in open source ways
Use community participation models and tools so things "show up"
Erase the boundary between student and community
Encourage experiments and passion through a 20% rule

There are no tests. There is no exam. Quite frankly, when our students go to interviews, they aren't asked for their last exam mark. The interviewers are more interested in seeing bugs, blogs, and code that has shipped. That's how you mark open source.