This week I had a discussion with my open source students in order to get a sense of where people were at with their work. One of the questions I had for them was, "What are you struggling with, what are you finding tricky?"
Here's some of what I heard from a class of 40 undergrad students who have been working on open source with Mozilla since September:
1. git
Specifically things like keeping their trees up-to-date, rebasing, squashing, etc. This didn't surprise me. Getting comfortable with common, local git workflows (add, commit, branch, checkout) is one thing; but becoming comfortable with using remotes, tracking fast moving upstream branches, re-aligning work you've done with modifications to the main project--it's a lot when you're just starting out. Unfortunately, the way open source uses GitHub, you have to deal with all of it at once.
I need to spend a bunch of time showing them how to accomplish things I'm seeing in their pull requests. I think it's easier to receive this when you've got a need for it.
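For anyone meeting these workflows for the first time, here is a minimal sketch of keeping a branch current with a moving upstream, then squashing before review. It builds throwaway repositories in a temporary directory so it is safe to run. The names project, fork, main, fix-typo, and the file names are all made up for illustration, and a real project may use a different default branch and its own conventions.

```shell
set -eu

# Throwaway sandbox; every name in here is a placeholder for illustration.
sandbox=$(mktemp -d)
cd "$sandbox"

# Identity for the demo commits, so the script runs without global git config.
export GIT_AUTHOR_NAME="Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="Student" GIT_COMMITTER_EMAIL="student@example.com"

# A stand-in for the main project, with one commit on a branch called main.
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main
echo "one" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
cd ..

# Your copy of the project, with the main project registered as "upstream".
git clone -q project fork
cd fork
git remote rename origin upstream

# Do some work on a topic branch, in two small commits.
git checkout -q -b fix-typo
echo "fix part 1" > fix.txt
git add fix.txt
git commit -q -m "Start fix"
echo "fix part 2" >> fix.txt
git commit -q -am "Finish fix"

# Meanwhile, the main project moves ahead without you.
cd ../project
echo "two" >> notes.txt
git commit -q -am "Upstream change"
cd ../fork

# Keep up to date: fetch upstream, then replay your commits on top of it.
git fetch -q upstream
git rebase -q upstream/main

# Squash the two commits into one. Interactively you would more likely
# run `git rebase -i upstream/main`; a soft reset does the same job here.
git reset -q --soft upstream/main
git commit -q -m "Fix typo"

# One commit ahead of upstream, none behind.
ahead=$(git rev-list --count upstream/main..HEAD)
behind=$(git rev-list --count HEAD..upstream/main)
echo "ahead=$ahead behind=$behind"
```

The demo avoids conflicts on purpose by touching different files; in a real rebase you would often have to stop, resolve conflicts, and continue with `git rebase --continue`.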
2. finding bugs
I hear this one a lot, and it's hard to solve in the general case. On the one hand, there are literally millions of open issues on GitHub. For example, here's 1.6 million open issues with the word fix in the title. GitHub published numbers for 2017 indicating that ~69 million issues were closed this year alone.
GitHub's same numbers from 2017 indicate that there are over 500K students on GitHub (I bet the number is a lot higher, since most of my students don't self-identify as such in their profiles). How is it, then, that it is so hard to match people who want to learn, contribute, and build things collaboratively with potential work in existing open source projects? My teaching for the past decade can basically be summed up as "find bugs for all my students." Some terms I do better than others, and it's always a challenge.
What makes a bug "good" for one student will also make it "no good" for another, whether due to issues of timing, current technical ability, fit with academic goals, etc. To get this right, you really need to take a hybrid approach here: one foot in the open source project, another in the classroom, and help to connect people to projects by knowing something about both. Much like trying to sell your house by putting a "For Sale" sign in the window, it's really hard to advertise project work and then magically have the right people locate it. Similarly, students randomly scrolling through issues rarely achieve a favourable outcome. I've had way more success by talking to people on both sides, and then matching them. Mozilla's Jason Laster has been excellent at this during the past months. "I've got a student who wants to work on something like X," I'd tell him. "OK, I've got an idea, I'll file a bug she can work." Amazing.
Another issue that comes up a lot is that not all communities can or want to engage with new contributors. This is fair, and to be expected as teams deal with various release cycles. This term alone I've been told by three groups I've approached that they didn't want me to bring students into their project area because they couldn't afford to spend the time mentoring. I think projects are getting better about knowing their limits, and being vocal about what they can and can't do.
I'm also experiencing an interesting attitude with my students this term: "I want to write code," I'm hearing a lot of them say. Many view adding new code as more important (or desirable) than debugging, refactoring, or removing existing code. I tend to prefer the latter myself, and think they would do better if they spent more time reading and less time writing code (we all would, to be honest). Part of this might be discomfort with #3.
3. how to read code and code history
There is such an art to this, and almost no one is taught to do it. We learn to write code, not read it. Part of the trick is knowing where to start, what to ignore, what's critical, and how to progress, since you don't start at main() and work your way to the end. Code has more in common with Choose Your Own Adventure books than it does with literature.
Reading code is almost always goal oriented. Sometimes the goal might be enjoyment, but that's not usually the case in my experience. Rather, you're usually trying to fix something, or understand how systems work. Unlike a novel that one might devour, you need to interrogate code. You can't trust it. It's almost certainly telling you lies, and often in the documentation! Something is failing. Something is doing what it shouldn't. You have to approach it carefully, from a distance, and be willing to change your mind as you uncover more facts.
To teach this, I think you have to have some goals, whether made-up or real, things you want to understand or fix. I'll look for a big example bug to work on and show them how I'd progress through the code, how to search for things, how to deal with code you don't understand, how to move out from an arbitrary point of understanding toward something more general.
Sometimes I've added a feature with my students in order to show how this works, and other times I'll take apart some existing code to try and figure out how it works. Both have advantages and disadvantages. One thing I might do this time is compare how a few different browsers implement some common feature, and go spelunking through Gecko, WebKit, and Chrome, learning as we go.
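On the "code history" half of this, git itself can answer a lot of the questions you'd ask while reading. Here is a small sketch, assuming a toy repository created on the spot; the file greet.js and its commit messages are invented for the example. It shows four standard commands: `git grep` to find where something is mentioned, `git log -- <file>` to see a file's history, `git log -S` to find the commit that introduced a piece of text, and `git blame` to see which commit last touched each line.

```shell
set -eu

# Throwaway repository; the file and commit messages are invented.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="Student" GIT_COMMITTER_EMAIL="student@example.com"
export GIT_PAGER=cat

git init -q demo
cd demo

printf 'function greet() {\n  return "hello";\n}\n' > greet.js
git add greet.js
git commit -q -m "Add greet"

printf 'function greet(name) {\n  return "hello " + name;\n}\n' > greet.js
git commit -q -am "Let greet take a name"

# Where is this thing mentioned? Search every tracked file.
git grep -n "greet"

# How did this file get to its current state?
git log --oneline -- greet.js

# Which commit added (or removed) this text? This is the "pickaxe" search.
introduced=$(git log -Sname --format=%s -- greet.js)
echo "$introduced"

# Which commit last touched each line?
git blame -s greet.js
```

Starting from a string you can see on screen and walking back to the commit (and, in a real project, the bug or review) that introduced it is often the fastest way to move from an arbitrary point toward the bigger picture.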
4. how to ask questions
We all need help. It sounds obvious, but it's not. It feels like we're the only ones who don't understand what's going on, especially in a big open source project where there's a flurry of activity all around us from people who seem to be so much smarter and more talented than us. It's easy to look at individuals as fixed points rather than in motion along a timeline: you are progressing, you are learning, you are moving forward, some people ahead of you, some behind. The rate doesn't matter. Your current position doesn't matter. That you're moving forward is all that matters.
And if your progress forward is being impeded by some issue you don't understand, you need to be brave and ask for help. If we can normalize this, it makes it easier. One of my students put it this way on Slack:
knowing how to ask questions in open source is something I feel like I need to get better at. In a way it's good to know other people in the class feel the same way
So how do you ask a question? Respectfully. Begin by respecting yourself. Don't devalue yourself or self-deprecate: you aren't dumb because you don't know something, so don't imply (to yourself or others) that you are. Next, respect the people you're asking. You've got something you need to know in order to progress--how much of the research could you do on your own? You're not respecting other people's time when you don't do any work and expect others to do it all for you. But if you've spent time wrestling with a problem, and come out the loser, it's wise to ask for help. Finally, when that help comes, you should be thankful and give respect to the person who has taken the time to help you. This can be as simple as an acknowledgement of the impact they've had, how they've helped you move forward. It doesn't need to be big, but it needs to happen.
5. understanding dev environments
One of my students kept hammering on this point, saying: "We're taught a dozen programming languages, but not how to use all these environments." It's true, learning new programming languages is easy in comparison to how we use them within ecosystems of tools and frameworks. Also, not everything about a dev environment is something you can see in code: lots of things are invisible practices we have as a community. It can be hard to learn these things on your own, because having access to a tool doesn't necessarily mean you have access to the knowledge of how it should be used.
This is why I encourage my students to get involved in the virtual community spaces a project uses. Whether that's Slack, IRC, a mailing list, a weekly call--whatever it is, we need to have the ability to watch people use their tools, and observe them doing things we haven't seen before. Just as chefs travel the world to work in great kitchens, and young artists apprentice at the studios of established artists, it's a great idea to join these spaces, and observe. No one will think it odd that you're listening vs. talking. You'll see lots of people ask for help, talk about problems they're having, and also reveal how they work, even if indirectly.
In much rarer cases, they'll show you directly. This is why I really love what Mike Conley has done with The Joy of Coding. As I write this, there are 120 episodes available, where Mike works on Firefox code, and talks about what he's doing as he does it. What an incredible role model Mike is, and what a gift to the community. Webpack's Sean Larkin is another guy I think does a great job here, inserting himself into the learning process of the community.
6. overcoming the feeling that you need permission to do things
"I noticed something confusing in the docs, should I file an issue?" It takes time to gain the confidence necessary to move from "I must be wrong" to "this seems wrong." These are two similar sentiments, but where one assumes the limitations of the self, the other begins with the belief that everything is broken, including me.
I think it was easier to overcome this 20 years ago, or even 10 years ago, before we all became digital consumers. Open source functions via the web, and the web was built on a foundation of permissionless publishing. I'm writing this now without first getting permission. The web enables us, all of us, to create content and put it beside content from individuals, institutions, governments, and corporations large and small. As such, the web is malleable, editable.
This means that you can and should work on things in projects on the web. You can comment on issues where you have something to say. You can draw attention to shortcomings in documentation, code, tests, or designs. Even better, you can fix these shortcomings. You can insert your ideas, your passion, your gifts, yourself into the web. You don't always need to, nor is it wise to give too much of yourself. But you can, and you should know that you have this power and right. You're important. Your ideas matter. You're welcome.
The other side of this permissionless approach is that no one will give you permission: no one will invite you in. It takes some courage. You have to decide you're going to belong somewhere, then start being active. You have to start saying we when you talk about the project and the code. You have to start believing that you're part of it. Because by virtue of the fact that you're working on it, you are part of it. You don't need permission, or if you still feel like you do, consider this to be your permission to get started.
Just tell them "Dave said I could."