In December I wrote about an experiment I'd started to build a blog aggregation project with my open source students:
> In the winter term I'm going to continue working on [Telescope] in the second course, and take [the students] through the process of shipping it into production. Moving from fixing a bug to fixing enough bugs in order to ship is yet another experience they need to have.
Today, a little over four months later, we’re going to ship Telescope 1.0.0. It's been a big project, and I wanted to take a moment to mark the occasion with a blog post and reflect on what we’ve done.
Little Contributions Add Up
The fundamental unit of Open Source is the individual contribution, be it filing an issue, adding documentation, creating a test, or building new features. As a community, we favour small, isolated changes that can be easily read, understood, reviewed, tracked, and debugged.
This approach can seem at odds with what software ends up being, from the user perspective. We work with huge operating systems, applications, web sites, and servers, and it's hard to imagine them being decomposed down into small commits and pull requests. How can adding 10 lines of code to a massive project have any effect?
In DPS911 and OSD700 I try to teach this lesson, slowly, over weeks and months, and by example. I could try to say it at the beginning of the term, but students wouldn't believe me. However, after spending a long time working together, and seeing something amazing assemble itself out of a million tiny pieces, I usually don't need to say it at all.
I started the Telescope project on October 27, 2019. As of today, April 17, 2020, here's what's been done:
- 450 Pull Requests (4 are still open as I write this)
- 542 Issues (483 have been Closed, and 59 remain Open)
- 1,248 Commits (50% of which are new this term)
- 4 community contributors, 59 student contributors, and 1 professor
- Over 100K words written in blog posts, which is roughly equivalent to a 400 page book, and another 52K words in our technical documentation (another 200 page book!)
What's most impressive to me is what got done by the small group of students that took the second course this term:
- 600+ commits
- 225 files changed with 9K+ lines added and 6K+ lines removed
- 16 contributors, of which 9 were students in the course doing most of the heavy lifting
It's really incredible to look back at what this group achieved together in four months. Even last term, with 64 students working on it, we didn't get nearly as much done.
What We're Shipping
When I started Telescope, I wrote a document explaining the goals I had for the project. In it I listed dozens of features I'd like to see get built, and tried to order them into stages, MVP vs. 1.0 vs. 2.0. Reading this list back again, we've done just about everything in the MVP and 1.0 goals (sorry, Kubernetes), and even a few of the ideas from 2.0.
Fundamentally, Telescope is a modern replacement for our previous Blog Planet. Chris hosted this in his personal account for 15 years, and it's always been something I loved (so much so that I recreated it on top of Telescope's new backend). The planet let us watch hundreds of students build confidence and have success, sharing their ups and downs as they went. To work in the open, it's necessary to write on the web. At a certain point, Googling for answers doesn't work anymore, and you need to start adding to the collective knowledge of the web. The real lesson of our open source courses is that you can move from being a consumer to a creator of technology, that what you have to contribute is important and worthy of being shared.
Telescope 1.0.0, our new planet, lets the Seneca community share and discover each other's writing about working open. People log in using their Seneca accounts and add their blog feeds. Telescope does the rest, automatically getting updates and presenting a timeline of all the writing that's happening in our community. It's meant for people writing about software development, and knows how to automatically format code to look amazing. You can also do full-text searches to find previous posts and authors.
Part of why I chose to do this project is that it used a lot of the tools and techniques our students need to learn to do modern web and fullstack development. It gave them the chance to experiment with dozens of cutting edge technologies.
I'll go into detail in a moment, but at a high level, Telescope is made up of the following apps and servers:
- A node.js backend web server providing REST APIs and GraphQL
- A node.js queue service for parallel feed processing
- A GatsbyJS frontend web app using Material UI React components
- SAML2 based Single Sign On Authentication
- A Redis database for caching feeds and posts
- An Elasticsearch database for full-text search of posts
- An Nginx reverse proxy and HTTP cache server
- Certbot for managing SSL certificates with Let’s Encrypt
- A node.js based GitHub Webhook Service to automatically manage deployments based on GitHub push events
This is a moderately complex but realistic application, which gave good opportunities for students to work with new technologies, and see how to connect things.
We used the Twelve-Factor App approach, and this helped a lot. Much of it was new to the students, but the value of these choices becomes clear as you start deploying, which we’ve been doing frequently in the later months.
We used Docker and Docker Compose to manage all the separate apps and services. This worked remarkably well for deployment, but was also a source of pain for many of our developers, especially those on Windows. "Complex systems are complex," and running a bunch of backend services is not easy, especially if you haven’t done it before. We tried solving this with documentation, mock infrastructure layers, and writing ways to connect some but not all pieces of the system to our staging box. It got better over time, but it never got easy. Some students only wanted to work on the frontend CSS, for example, and this was nearly impossible to do without some level of understanding about the backend and dependent services.
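To make the shape of this setup concrete, a compose file for a system like this looks roughly as follows. This is a trimmed, illustrative sketch, not Telescope's actual file; the service names and image versions are examples:

```yaml
version: '3'
services:
  telescope:
    build: .
    ports:
      - '3000:3000'
    depends_on:
      - redis
      - elasticsearch
  redis:
    image: redis:latest
    ports:
      - '6379:6379'
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.5.1
    environment:
      - discovery.type=single-node
  nginx:
    image: nginx:latest
    ports:
      - '80:80'
      - '443:443'
    depends_on:
      - telescope
```

With this in place, `docker-compose up` brings the whole system up together, which is exactly what made deployment easy and local development heavy.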
Continuous Integration and Deployment
We made use of multiple CI/CD platforms and tools, and also built our own.
CircleCI was used for linting and testing PRs on Linux. We found it to be much, much faster than Travis CI, and it was wonderful to have a backup CI for times when one or the other failed, or didn’t report to GitHub (frequent with Travis CI). However, we found CircleCI less intuitive to configure, and changes often took a few tries to get right.
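For lint-and-test jobs like ours, the CircleCI config is short. A minimal `.circleci/config.yml` along these lines (job and workflow names are illustrative, not our exact file):

```yaml
version: 2.1
jobs:
  test:
    docker:
      - image: circleci/node:12
    steps:
      - checkout
      - run: npm ci
      - run: npm run lint
      - run: npm test
workflows:
  build-and-test:
    jobs:
      - test
```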
Travis CI was used for linting and testing PRs on Linux, Windows, and macOS. It was wonderful to have access to all three OSes for testing. The Windows builds used the Chocolatey Package Manager, and sometimes network issues would cause it to fail, which was annoying. The macOS builds took forever, due to needing Homebrew to install various OS dependencies, but were reliable.
We used Zeit Now to build Pull Request Previews of the frontend on every commit to a PR and master. This was a game changer! Once we’d taught our frontend to use different backends via an environment variable, we could use a frontend on Zeit with our backend staging data to get a full experience. Reviewing frontend PRs went from painful to pure joy. The only thing that wasn't possible to test this way was authenticated parts of the frontend, but everything else worked perfectly. In retrospect, we probably should have found a way to do something similar with our backend on Heroku or another cloud provider.
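The environment-variable trick is worth showing, because it's what made the Zeit previews useful. Gatsby inlines env vars prefixed with `GATSBY_` into browser code at build time, so one frontend build can point at any backend. A minimal sketch, assuming a variable and helper named for illustration (not Telescope's actual names):

```javascript
// Gatsby exposes env vars prefixed with GATSBY_ to browser code at build
// time. GATSBY_API_URL and postsUrl are illustrative names.
const API_URL = process.env.GATSBY_API_URL || 'http://localhost:3000';

// Build a URL against whichever backend this build was configured to use.
function postsUrl(page = 1) {
  return `${API_URL}/posts?page=${page}`;
}

module.exports = { postsUrl };
```

Setting `GATSBY_API_URL` to the staging box in Zeit's build settings was all it took to pair a preview frontend with real backend data.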
We used GitHub Webhooks to automate our staging and production deployments: on merge or release, automatic rebuilds get started, shutting down the old instance, and starting the newly built apps when done. We had wanted to try doing a Green/Blue deployment pattern, where the old app could keep running until the new one was built, then cut over to the new one and spin down the old one. We had also hoped to leverage Kubernetes to handle a lot of this. However, we didn’t manage to get that part done for 1.0. Maybe in 2.0?
Speaking of staging and production, we set up two live boxes at https://dev.telescope.cdot.systems/ and https://telescope.cdot.systems/. It took some time to get live, auto deployed systems working, but once we did, it was fantastic. Testing became so much easier, and if the staging box ever went down, you heard about it right away from developers who were trying to test things.
Finally, we used release-it to manage our release process, with tagging, creating GitHub Releases, changelogs, etc. It took us a few releases to get all the bugs worked out (we even found a few bugs in release-it with our testing), but it’s running smoothly now, and I love it. I was able to let the students manage the release process without me, which is wonderful.
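Most of the release process lives in a small config file. A `.release-it.json` along these lines covers the tagging and GitHub Release steps (this is a sketch of the kind of options release-it supports, not our exact file):

```json
{
  "git": {
    "tagName": "v${version}",
    "requireCleanWorkingDir": true
  },
  "github": {
    "release": true
  },
  "npm": {
    "publish": false
  }
}
```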
Project Management
Most of the student software projects I see being built in our courses are a mess. People don't have enough experience to design, architect, and manage a software project at the same time that they are learning to code and use new frameworks.
On Telescope, I played the role of senior developer, writing some code, but mostly focusing on doing reviews and helping in issues. I think this apprentice model works better than expecting students to do everything on their own. If anything, it would have been better to have more senior/community people involved. When my former student, Ray Gervais, joined the project, it helped a lot. I hope that some of the current students will stay on the project in the coming months to help mentor the next batch of open source students.
We held weekly triage meetings, first in-person and then online. These were an excellent way for us to keep track of bugs, and make sure that nothing got lost in the backlog. I had the students take turns leading these, and taking notes. As much as possible, I wanted to distribute the role of "project manager" so that everyone got to feel a bit of what this entails.
I made every student a GitHub Collaborator, and in the second term, gave them all Admin access to the repo. I also required that every PR get two approving reviews, and that new commits invalidated old reviews. This worked well, since it meant that I didn’t have to do 100% of the reviews. I still reviewed almost everything at least once (and some PRs 10, 20, or 30 times); but in the later months it became common to see things get merged that I’d never read, written and reviewed entirely by the students. This felt like success, and I enjoyed seeing people take charge. It was very rare that master ever got broken. I was impressed with how seriously the students took managing a shared tree.
We used a bunch of GitHub project tooling to try to manage our releases. I think the Project board looks cool, but it didn't give me much insight I couldn't get from the Issues and PRs alone. We used Issue and Pull Request Templates, which I found annoying, but I think they helped raise the level of commenting by developers. However, nothing we did fixed people's use of the autoclose “Fixes #...” feature in GitHub. No matter what I do, people still get it wrong regularly. I have no idea why this isn’t baked into the PR UI by now; it’s completely undiscoverable to non-experts.
Developer Experience and Tooling
We used Prettier with Husky to reformat all of our code on every commit. It’s hard to overstate how much this improved the experience of working on this code with new and student developers. No matter what I did, it was impossible to fix all the line ending issues using any other git tooling or setting (I’ve tried everything). With Prettier, the problems simply vanished, and I never had to do style reviews again. It was very freeing, since I reviewed close to 500 pull requests. However, even this setup wasn’t perfect, and we found that we had to add a CI check to make sure that the styles in a commit were correct, since developers sometimes didn’t install all the necessary tools locally.
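In the era we built this, the Husky side of the setup lived in `package.json`. Something like the following wires Prettier into every commit, plus a script CI can run to catch commits made without the hooks installed (an illustrative sketch, not our exact config):

```json
{
  "husky": {
    "hooks": {
      "pre-commit": "lint-staged"
    }
  },
  "lint-staged": {
    "*.js": "prettier --write"
  },
  "scripts": {
    "prettier-check": "prettier --check \"**/*.js\""
  }
}
```

Running `npm run prettier-check` in CI is the backstop for developers whose local hooks never got installed.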
Just about everybody used VSCode, so we added a custom .vscode directory with recommended extensions and settings. We mainly used these to integrate and automate ESLint and Prettier in developers’ editors. It worked surprisingly well in most cases. However, a few developers never managed to get it set up properly, for a variety of reasons, not all of which I understand. I also added debug launch configs for our servers, and wish I'd done it earlier, so students could have debugged more in VSCode.
As I mentioned above, we wrote 52K words of technical documentation, covering all different setup, environment, workflow, and other technical aspects of the project. For the people who wrote and reviewed these, they were an invaluable resource. Even though we wrote this much, a lot of project knowledge still “lives in networks” vs. documents, with tools like Slack, blog posts, and code reviews doing as much work as our docs.
We used Jest for our testing framework. I’ve mostly worked with Mocha in the past, but I really enjoyed Jest. They’ve thought of everything I could want, and built it into the tool and its documentation. For example, we relied on Jest’s ability to write mocks for modules, which made it easy to write tests for complex parts of our system (e.g., authentication, simulating backend services like Elasticsearch and Redis).
In addition to Jest, we also used jest-fetch-mock, nock, and supertest to work with network requests in tests. All of them worked really well, and I’d use them again.
In total, we wrote more than 120 tests in 20 files, covering 67.47% of all code in our backend (Jest has lovely test coverage tooling built in, which is great). We have zero tests for the frontend. In general, I made some progress convincing people to write tests for their code, but it’s still a very new idea to the students, and it will take more time before they learn the value of increasing test coverage. Part of their value comes in maintaining code, doing big refactors, and updating dependencies with confidence you didn’t break expected behavior, none of which the students have had to do yet.
A lot of review time went into asynchronous JavaScript: async functions, Promises, moving between callbacks and Promises, working with Promise.all(), etc. I spent many, many hours reviewing and commenting on code that got it wrong. This is one of the harder parts of modern web development, and it takes a lot of work to understand it fully. I wish there were more static analysis tools we could add to our linting steps to catch these bugs.
We built our feed processor on Bull Queue, a Redis-based queue for node apps. It was great to work with, and gave the students a chance to try doing a queue-based, producer/consumer style server, which is different from the usual CRUD style apps they encounter in other classes. It also let them work with Redis, which none of them had done before. It's unclear to me whether they leave the project loving it as much as I do.
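The shape of the pattern is worth sketching. Producers add jobs, a worker drains them; Bull gives you this same idea backed by Redis, with retries and concurrency on top. This is a plain in-memory illustration of the pattern, not Bull's API:

```javascript
// A toy producer/consumer queue: illustrative only.
class JobQueue {
  constructor(processor) {
    this.processor = processor; // async fn that handles one job
    this.jobs = [];
    this.draining = null;
  }

  // Producer side: enqueue a job and kick off the worker if idle.
  add(data) {
    this.jobs.push(data);
    if (!this.draining) {
      this.draining = this.drain();
    }
    return this.draining;
  }

  // Consumer side: process jobs one at a time until the queue is empty.
  async drain() {
    const results = [];
    while (this.jobs.length) {
      results.push(await this.processor(this.jobs.shift()));
    }
    this.draining = null;
    return results;
  }
}
```

Because the queue in Bull lives in Redis rather than in memory, producers and consumers can be separate processes, which is what let our feed processing scale independently of the web server.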
We built full-text search into the app via Elasticsearch. This was ridiculously easy compared to the value it added in the frontend. The only downside to it was that it added another backend service dependency, which complicated the local setup, and required a lot more RAM on developer machines. I think if I was doing this again, I'd host password protected Redis and Elasticsearch instances that the developers could use remotely, and not require them to run these services locally.
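Part of why it was so easy is that the heart of the feature is a single query against the posts index. Something like this match query does most of the work (the field name is illustrative):

```json
{
  "query": {
    "match": {
      "text": "telescope"
    }
  }
}
```

Elasticsearch handles tokenizing, stemming, and relevance ranking; the app just indexes posts as they arrive and forwards search terms from the frontend.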
Speaking of passwords, we dealt with lots of security related issues, from securing Docker containers (we got hacked), to setting up SSL and HTTPS with Let's Encrypt, to hardening our web app routes, to user authentication and authorization using SAML2 (I wrote about this in detail previously).
We built out our backend API using a mix of REST and GraphQL. I don’t think any other courses at Seneca deal with GraphQL yet, and it was nice to give the students a chance to experiment with it. My general takeaway is that it’s probably too complicated and overkill for what we were building, and my plan of leveraging it in our frontend didn't work out as well as I'd hoped.
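For a blog planet, the GraphQL surface is small, which is part of why it felt like overkill. An illustrative slice of schema (type and field names are examples, not Telescope's exact schema):

```graphql
type Post {
  id: ID!
  author: String
  url: String
  html: String
}

type Query {
  getPosts(page: Int, perPage: Int): [Post]
}
```

When the queries are this simple, a REST endpoint with pagination parameters delivers the same thing with less machinery.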
Part of why I suggested we try using GraphQL is because the students wanted to use GatsbyJS. I think this was likely the wrong choice for the way that we architected our app, since we don’t really take advantage of Gatsby's build step to generate static pages, and instead rely on dynamic data in the app when it runs. Likely Create React App would have been a better fit, but we didn’t know that until we’d built it. I think it was great to have a chance to really dig into GatsbyJS and understand what it is and isn’t good at.
And finally, we used Material UI for our React layout and components. This greatly accelerated our frontend work. I think choosing a UI framework like this was a disappointment to some, who had visions of creating everything from scratch. But the reality of frontend work is that if you don’t have a team dedicated to this, it’s pretty much impossible for a group of student developers to moonlight as frontend designers. I’ve given up on bespoke UI components with teams at this level because the amount you need to understand in order to make them work cross-browser and cross-device, be fully responsive, and have proper accessibility built in is way too much. When people talk about wanting to be in control, they often mean the colours, but don’t understand all the other million things you need to deal with if you start from zero. I love to cook, but I don’t grow and grind all my spices by hand before I make a curry.
So far I've focused on technology, and I wanted to end with something more important. What keeps me doing this year after year are the people I get to work with in order to ship big software. That includes people at Seneca and in the larger open source community (as I write this, I'm being messaged on Twitter with a future project idea). I rarely need any of the technology I build (I hate computers most of the time), but I love working on a big project with other people. It's incredible how you can reliably build a solid piece of software if you leverage the energy and contributions of a dedicated team of individuals, and support them as they learn and grow.
I actually found it a bit emotional to work on Telescope, especially when we got the Search feature to work. As I was reviewing the code, and typed in various queries for the first time, suddenly I was looking at blog posts from hundreds of other students who have shipped "1.0"s with me over the past 15 years. I've always loved getting to work with smart, hard-working students and helping them achieve something great.
This term was no different, and I really enjoyed getting to know the team, and exploring so many interesting problems together. They were:
- Cindy Le
- Rafi Ungar
- Ana Garcia
- Josue Quilon Barrios
- Calvin Ho
- James Inkster
- Miguel Roncancio
- Krystyna Lopez
- Julia Yatsenko
All of them owe me another blog (you, too, Ana), so go read what they have to say as well. I'm sure it will add things I've forgotten.
This was a very challenging time to be doing the work that we did. We had big plans for our 1.0.0 release (cake! stickers!), none of which can happen now with social isolation. I'm thankful that this group didn't give up on our goals, and kept pushing code right til the end. All of us will remember the Winter of 2020 for the rest of our lives, and for me, Telescope was something positive to distract my mind from the world closing down around us.
I'm looking forward to having my future open source students use Telescope. I've never built open source software with my students and for my students before. It will be cool to show the next group how good my previous students were, and give them something to which they can aspire.
Now go add your feed to Telescope and write me a blog post.