<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Bread & Circuits]]></title><description><![CDATA[The blog of David Humphrey]]></description><link>https://blog.humphd.org/</link><image><url>http://blog.humphd.org/favicon.png</url><title>Bread &amp; Circuits</title><link>https://blog.humphd.org/</link></image><generator>Ghost 2.15</generator><lastBuildDate>Fri, 23 Feb 2024 02:24:27 GMT</lastBuildDate><atom:link href="https://blog.humphd.org/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[In 2023]]></title><description><![CDATA[<p>I've been enjoying the break between semesters and the quiet that only seems possible in the days between Christmas and New Year's.  This fall has been one of the busiest and most challenging I can remember, and I've needed this time to pause and recover.  However, since I'm not capable of</p>]]></description><link>https://blog.humphd.org/in-2023/</link><guid isPermaLink="false">658c6ef14a8ffa03eff4d796</guid><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Fri, 29 Dec 2023 21:57:01 GMT</pubDate><content:encoded><![CDATA[<p>I've been enjoying the break between semesters and the quiet that only seems possible in the days between Christmas and New Year's.  This fall has been one of the busiest and most challenging I can remember, and I've needed this time to pause and recover.  However, since I'm not capable of relaxing as such, I thought I'd add a bit of year-end reflection.  What was 2023 for me?</p><p>Professionally, 2023 was the year I fell in love with AI.  I've actually <a href="https://github.com/humphd/have-fun-with-machine-learning">tried</a> to break into AI a number of times before, but the technology wasn't yet geared to programmers like me.  
I'm not a data scientist or statistician, and to be honest, I don't care that much about algorithms for optimizing loss, the merits of various model architectures, or the many other things that go into creating machine learning models; nor do I want to become any of those things.  Instead, I want to <strong>use</strong> AI, to bolt it into programs and workflows the same way I use open source libraries and tools.</p><p>Amazingly, in December 2023, the version of me that is writing this post is doing exactly what I always hoped for, and I'm now immersed in all kinds of technologies that make this way of working with AI possible.  I can finally do the kind of work I've always wanted to, without needing to transform myself into something I'm not.  Ironically, my ML-adjacent skillset is allowing me to go faster than many who were focused solely on ML proper.</p><p>One such skill has been my previous investment in learning AWS, which has been an excellent foundation, enabling me to build things much larger than before.  While teaching open source still occupies the larger place in my heart, I think <a href="https://blog.humphd.org/teaching-aws-and-cloud-computing-for-programmers/">my AWS course</a> has likely become the more popular of my advanced courses at Seneca. I get why, too, since it's helped many former students jump into great jobs immediately upon graduation--something I find very satisfying.  Being able to scale up ideas in code is becoming an important skill.</p><p>At the start of last year, I was very much still finding my way with AI, trying to square it with my teaching practices, and feeling skeptical about its utility for programming.  
I spent many <a href="https://blog.humphd.org/cheatgpt/">months writing</a> about it, <a href="https://blog.humphd.org/ai-emerging-technologies-committee/">sitting on academic committees exploring it</a>, <a href="https://blog.humphd.org/on-tedium/">discussing it</a> with peers, and <a href="https://blog.humphd.org/adventures-with-chatgpt/">trying to use it myself</a>.  Those posts, and the discussions they began (both online and offline), were formative in my transition toward embracing the possibilities of AI in my work. However, by far the most significant outcome of that period was that it reconnected me with an old friend, and launched me in a direction I couldn't have anticipated.</p><p>Since my Mozilla days, I've understood the importance of writing about the work you do.  It might seem pointless, since it's often unclear how a thing you're thinking about or building will connect with someone else's work.  Writing in this way is a kind of intellectual fishing, whose aim is to lure interesting conversations and relationships into your life.  Over 15 years ago I was doing the same thing, writing about my work, when I <a href="https://gcc.gnu.org/legacy-ml/gcc/2008-01/msg00280.html">bumped into Taras Glek</a> and started <a href="https://blog.humphd.org/vocamus-557/">DXR</a>.</p><p>In early 2023, Taras and I were both finding it hard to engage our peers on the possibilities we saw with AI and programming.  Each of us was writing about it in our blogs, and seeing the other do the same.  For me, what was starting to feel obvious was still too far removed from the familiar patterns and "best practices" of academia.  In industry, Taras was having similar frustrations.  
I think it was our lack of community, combined with a prior understanding of how to work on experimental things quickly, that allowed us to rebuild our relationship so effectively around exploring AI together.</p><p>Taras had a baby AI web project he was having fun building, and wanted me to contribute.  I really hated the idea of getting sucked into a front-end programming project on the side of everything else I was doing (CSS, really?), but I agreed to help him improve it.  <a href="https://blog.humphd.org/chatcraft-org/">Together</a> we built <a href="https://chatcraft.org/">ChatCraft.org</a> and used it throughout the summer as a laboratory for exploring the many evolving features of LLMs, often in real-time as they were released.</p><p>Since then, ChatCraft has become the most productive tool I've ever built. I use it daily, as do hundreds of other people, amazingly, given how saturated this space is.  I checked just now and I have nearly 900 separate chats in my local ChatCraft database!  I realized earlier this week that I use ChatCraft more than Google and Stack Overflow combined.  It's shocking how deeply integrated it has become into my software development workflows. Surprisingly, I don't use any other AI tools (e.g., Copilot).  Being able to shape how my AI tooling works has been extremely powerful.  When I've needed ChatCraft to do something new, I modify the code (usually with ChatCraft's help) and it does it.  I can't think of another software project that I've worked on that has had this much value to me personally.</p><p>AI is part of what made ChatCraft so successful, but it's only one ingredient.  Like all successful applications of AI, how well you connect it back to humans is what ultimately determines its usefulness.  ChatCraft is AI, yes, but it's also the working dynamic I have with Taras.  He and I are completely different in our skills, approaches, preferred tech stacks, backgrounds, etc.  
If I were going to engineer a partnership, I'd never think to start it this way.  And yet, our different approaches, when combined, have been the key to our success.</p><p>For example, I'm obsessed with doing things <em>The Right Way</em> and Taras revels in being completely unorthodox.  Where I <em>know</em> that you're <em>supposed</em> to build things <em>this</em> way using <em>these</em> prescribed technologies, Taras uses whatever will let him go the fastest, mercilessly switching when he reaches the limits of an earlier choice.  Where I rule out using tools I don't know, Taras is fearless in adopting previously unknown tech.</p><p>But my careful, buttoned-up, read-the-manual approach has also influenced him, and the effect we each have on the other has moved us both to an amazing new productivity plane.  The same pattern happens over and over: one of us will start a branch in git, get stuck, and ping the other, who will push 5 more commits to unblock it, and throw it back. What I can't do, Taras is an expert at, and vice versa. All of this happens asynchronously, remotely, across multiple timezones.  In fact, the effectiveness of this style of working is amplified by our timezone differences--you go to bed stuck and wake up to a solution in your inbox. Lots of people insist that remote work can't be as productive as in-person work, but this way of working is the most productive either of us has ever been.</p><p>A few times in my career, working on software projects has felt more like playing in a band than doing "computer science" or "software engineering." In 2023, my work on AI with Taras has very much been "making music with a friend."  The fuzziness of AI has also made the experience more creative.  I've enjoyed it all immensely and learned so much.  Thank you, Taras.</p><p>Ultimately, what this work made possible was for me to explore more areas of interest by expanding my abilities.  
Many of the tools and technologies I've been using this fall look vastly different to what I've used in the past:</p><ul><li>TypeScript is now my favourite programming language (sorry JavaScript).  I never thought I'd say that, but working on and maintaining larger code with TS has been a dream.  TS + pnpm + vite/vitest has become my typical starting point.</li><li>I can finally use Python productively, and have used it to fulfill a personal goal and build multiple conservation AI projects for <a href="https://blog.humphd.org/a-month-with-audio-moth/">audio</a> and <a href="https://blog.humphd.org/releasing-action/">video</a> analysis of acoustic recorder, and camera trap data.  A secret weapon in the fight against Python dependency problems has been <a href="https://pixi.sh/">pixi</a>, which I love and always use now.</li><li>IaC tools like <a href="https://www.terraform.io/">Terraform</a>, <a href="https://terragrunt.gruntwork.io/">Terragrunt</a>, <a href="https://www.ansible.com/">Ansible</a>, GitHub Actions and many more have allowed me to build reliable cloud infrastructure more quickly.  I'm eager to find ways to bring more of these into my teaching this year.</li><li>Docker and Containers in general have been increasingly important as I've worked on deploying AI models.  I've been using containers in <a href="https://github.com/just-containers/s6-overlay">non-traditional ways</a> and even in development with <a href="https://containers.dev/">Development Containers</a> (this has been a fascinating way to share knowledge across a dev team).  In 2024 I'm going to be adding Kubernetes to this list, which is another one of those "not for me" technologies I've avoided that I'm having to contend with now.</li></ul><p>On a more personal note, another major theme of 2023 has been learning more about what it means to be part of the <a href="https://www.pewresearch.org/social-trends/2013/01/30/the-sandwich-generation/">Sandwich Generation</a>.  
My dad isn't well, and we've been helping him and my mom pack up and move out of the home they've lived in together for 50+ years.  That's been hard for everyone, and it's helped me understand more about my own limits and how fragile we all are as we age.</p><p>The reality of life for my dad (and my mom, who is caring for him) has had a profound impact on me.  Because I've been so connected to the plight of an immunocompromised senior, paired that with personal learning about the evolving science on Covid, and stayed connected with data like this recent wastewater graph in Toronto, I've found it hard to make sense of the total collapse of public and personal health in 2023:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/12/Screenshot-2023-12-27-at-3.13.48-PM.png" class="kg-image"><figcaption>Toronto Wastewater Data, Christmas 2023</figcaption></figure><p>The disconnect between what's actually happening and how people are dealing with it (or not dealing with it) has been made worse by my refusal to pretend.  I've literally taught thousands of students during the pandemic, and with that many people coming and going in my life, the variety and severity of stories I've heard has been heartbreaking.  So many have lost parents, spent weeks or months in hospital, or had to drop out due to their own long Covid or so that they can care for loved ones suffering the same.  Just this past week I got an email from a woman in my class who is withdrawing after spending the entire fall at a local children's hospital with her 1-year-old, who never recovered from Covid in September.  Maybe in my position, I've seen more than most people; or maybe I've just been willing to see what others have decided to avoid looking at altogether.  
In either case, it's changed me.</p><p>There's not a lot one can do anymore, but for my part, I spent all of 2023 lecturing in a mask, and I was grateful to the students who joined me, despite my school removing all protections and helping to reinforce a culture that ignores the most vulnerable amongst us.  I'll continue to do the same in 2024.  Despite the obvious challenges, masking has kept our entire family from getting sick with anything since March 2020 (no Covid, no RSV, no flu...nothing), which is remarkable given where our society has been and is heading right now.  Masks have worked for me.</p><p>Another reason I wanted to write this is that I've really missed my professional connections on Twitter and learning from my friends and colleagues--I realize I have to write in the same way I wish others would for me.  So many people I used to read on a daily basis are gone from social media now.  I can't say I blame them.  Twitter died in 2023, and I've found its demise a sad one.  I still use it, because certain communities I value only exist there; but I'm there less and less, and I wonder if I'll still be there a year from now.  Like so much of my life, I'm learning how to exist and be happy in smaller spaces.  So I'm thankful for this blog, which is turning 20 soon.</p><p>Thanks to so many who wrote and maintained the code I relied on this year.  Thanks to those who taught, encouraged, and supported me.  I've appreciated you. May God bless you in 2024.</p><p>Happy New Year, internet.</p>]]></content:encoded></item><item><title><![CDATA[Teaching ChatCraft to do Citations]]></title><description><![CDATA[<p>Back in the summer, I did a bunch of work to give <a href="https://chatcraft.org">chatcraft.org</a> the power to <a href="https://blog.humphd.org/teaching-chatcraft-to-use-tools/">use functions</a>.  
Since then, <a href="https://github.com/tarasglek/typescript2openai">Taras has improved it</a> to allow you to use TypeScript and JavaDoc-style comments to build your function with full metadata understood by OpenAI.  I didn't really have a reason</p>]]></description><link>https://blog.humphd.org/teaching-chatcraft-to-do-citations/</link><guid isPermaLink="false">654514b84a8ffa03eff4d73f</guid><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Fri, 03 Nov 2023 15:56:36 GMT</pubDate><content:encoded><![CDATA[<p>Back in the summer, I did a bunch of work to give <a href="https://chatcraft.org">chatcraft.org</a> the power to <a href="https://blog.humphd.org/teaching-chatcraft-to-use-tools/">use functions</a>.  Since then, <a href="https://github.com/tarasglek/typescript2openai">Taras has improved it</a> to allow you to use TypeScript and JavaDoc-style comments to build your function with full metadata understood by OpenAI.  I didn't really have a reason to use it at the time, but it was the kind of "nice to have" feature that seemed good to include.</p><p>But this week I did need it.  I was talking with some people about how they were using ChatCraft to help format citations properly, but they were frustrated that it would make up details.  "It's too bad it can't get the real info and use that."  But it can!</p><p>I wrote a <a href="https://gist.github.com/humphd/74cf88283239c62f53caff5cddf4cfe5">small TypeScript function</a> to use the <a href="https://www.crossref.org/documentation/retrieve-metadata/rest-api/">CrossRef REST API</a> to get all the metadata that's known about a given <a href="https://www.doi.org/">DOI</a> or DOI URL.  I then gave ChatCraft access to the function by including this in my prompt <code>@fn-url:https://gist.github.com/humphd/74cf88283239c62f53caff5cddf4cfe5</code>.  This <code>@fn-url:...</code> syntax tells the app it can use a function hosted at the given Gist URL.  The function describes what it can do, and the inputs/outputs.  
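A minimal sketch of what such a function can look like (an illustration only, not the actual Gist code; the `normalizeDoi` helper and the error handling are my own assumptions):

```typescript
// Hypothetical sketch of a CrossRef DOI lookup function (not the actual Gist code).

// Reduce a DOI URL like "https://doi.org/10.1093/ajae/aaq063" to the bare DOI.
function normalizeDoi(doiOrUrl: string): string {
  return doiOrUrl.trim().replace(/^https?:\/\/(dx\.)?doi\.org\//i, "");
}

/**
 * Get the metadata known about a DOI from the CrossRef REST API.
 * @param doiOrUrl A DOI ("10.1093/ajae/aaq063") or a DOI URL
 */
async function getDoiMetadata(doiOrUrl: string): Promise<unknown> {
  const doi = normalizeDoi(doiOrUrl);
  // DOIs contain slashes, which are valid in the /works/ path
  const res = await fetch(`https://api.crossref.org/works/${doi}`);
  if (!res.ok) {
    throw new Error(`CrossRef lookup failed for ${doi}: ${res.status}`);
  }
  // CrossRef wraps the record in an envelope: { status, "message-type", message }
  const body = await res.json();
  return body.message;
}
```

The JSDoc comment is what lets the metadata extraction describe the function to the model.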
OpenAI can then decide when/if to use it.</p><p>Here's an example of what it looks like when you use it:</p><p><a href="https://chatcraft.org/c/humphd/CVwmeKqB8iwwIO9CYKeV2">https://chatcraft.org/c/humphd/CVwmeKqB8iwwIO9CYKeV2</a></p><p>I prompt the LLM like so:</p><blockquote>I need to cite <a href="https://doi.org/10.1093/ajae/aaq063" rel="nofollow">https://doi.org/10.1093/ajae/aaq063</a> in APA, MLA, and Chicago</blockquote><p>And it calls the function, gets back the metadata, and returns a response:</p><blockquote>To cite the article "An Analysis of the Pricing of Traits in the U.S. Corn Seed Market" in APA, MLA, and Chicago styles, you can use the following formats</blockquote><blockquote>APA: Shi, G., Chavas, J., &amp; Stiegert, K. (2010). An analysis of the pricing of traits in the U.S. corn seed market. *American Journal of Agricultural Economics, 92*(5), 1324-1338. doi:10.1093/ajae/aaq063</blockquote><blockquote>MLA: Shi, Guanming, Jean-paul Chavas, and Kyle Stiegert. "An Analysis of the Pricing of Traits in the U.S. Corn Seed Market." *American Journal of Agricultural Economics*, vol. 92, no. 5, 2010, pp. 1324-1338. doi:10.1093/ajae/aaq063. </blockquote><blockquote>Chicago: Shi, Guanming, Jean-paul Chavas, and Kyle Stiegert. "An Analysis of the Pricing of Traits in the U.S. Corn Seed Market." *American Journal of Agricultural Economics* 92, no. 5 (2010): 1324-1338. doi:10.1093/ajae/aaq063.</blockquote><blockquote>Please note that the DOI (Digital Object Identifier) is included at the end of each citation.</blockquote><p>Extending ChatCraft with extra powers is really easy by adding <code>@fn:name</code> (for functions you write in the app itself) or <code>@fn-url:url</code> for functions you want to share with others.</p>]]></content:encoded></item><item><title><![CDATA[Releasing ACTION]]></title><description><![CDATA[<p>There are many milestones as a father, from watching your children learn to walk and talk, to learning to ride a bike and drive a car.  
Sometimes you're lucky enough to be involved in these experiences and get to share in the joy of discovery and feelings of success. This</p>]]></description><link>https://blog.humphd.org/releasing-action/</link><guid isPermaLink="false">6542aa614a8ffa03eff4d4e4</guid><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Thu, 02 Nov 2023 12:08:30 GMT</pubDate><content:encoded><![CDATA[<p>There are many milestones as a father, from watching your children learn to walk and talk, to learning to ride a bike and drive a car.  Sometimes you're lucky enough to be involved in these experiences and get to share in the joy of discovery and feelings of success. This week was like that for me.  I got the chance to help my youngest daughter release her first open source project.</p><p><a href="https://github.com/humphrem/action#automated-camera-trapping-identification-and-organization-network-action">Automated Camera Trapping Identification and Organization Network (ACTION)</a> is a command-line tool that uses AI models to automate camera trap video analysis.  It works on both terrestrial (mammals, etc.) and aquatic (fish) camera footage. Typically a project will have hundreds or thousands of videos, and manually scrubbing through them to find positive detections is incredibly time consuming and error-prone.</p><p>Using ACTION, one or more videos can be analyzed with AI models specially trained to detect animals in camera trap footage or fish in underwater video.  It uses the open source <a href="https://github.com/tamim662/YOLO-Fish">YOLO-Fish</a> and <a href="https://github.com/microsoft/CameraTraps/blob/main/megadetector.md">MegaDetector</a> neural networks to build a video analysis pipeline.  
The models are able to detect an "animal" or "fish" with a given confidence score in an image:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/11/image.png" class="kg-image"><figcaption>ACTION finding "Animal"s via MegaDetector in a frame of video</figcaption></figure><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/11/image-1.png" class="kg-image"><figcaption>ACTION finding a "Fish" via YOLO-Fish in a frame of video</figcaption></figure><p>However, as powerful as these AI models are, they aren't programs you can run on their own to accomplish real-world tasks.  I don't think people realize this, since it seems like everything has "AI" in it these days.  But the reality is quite different: you can use powerful AI models to do super-human things, <strong><em>but</em></strong> they require you to build custom pipelines around them to extract video frames, resize images, process detections, draw bounding boxes, process clips, create new videos, and allow a user to customize many settings.  Then you have to package it up in a way that others can use without downloading and installing half the internet.  The process turns out to be quite involved, and, dare I say, worthy of its own open source project!</p><p>Like all useful tools, ACTION was born out of genuine need.  My daughter, who is studying biodiversity at university, is currently completing an internship with a local conservation authority that is studying the endangered <a href="https://www.ontario.ca/page/redside-dace">Redside Dace</a> in their aquatic ecosystems. 
</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/11/0002-00_02.jpg" class="kg-image"><figcaption>ACTION successfully detecting a Redside Dace, along with a White Sucker</figcaption></figure><p>Specifically, their researchers have been comparing the effectiveness of different conservation monitoring technologies, from <a href="https://en.wikipedia.org/wiki/Electrofishing">electrofishing</a> to <a href="https://en.wikipedia.org/wiki/Environmental_DNA">eDNA</a> (environmental DNA) to "aquatic camera trapping."  The project seemed like a great fit, since my daughter is an expert on <a href="https://en.wikipedia.org/wiki/Camera_trap#:~:text=Camera trapping is a method,in ecological research for decades.">camera trapping</a>, with years of existing field experience.  However, applying the same ideas to underwater cameras was a totally new challenge.</p><p>Typically camera traps use motion and/or heat sensors to detect animals and trigger recordings.  In underwater environments, these same techniques don't map as well, since temperature, visual noise (i.e., debris and water turbidity), and light levels don't work the same as on land.</p><p>In her work, she needed to be able to record underwater environments at various sites using multiple cameras (i.e., multiple cameras upstream and downstream) and compare this to results from eDNA samples taken at the same time.  The fish she is studying are all small (i.e., minnows) and move very fast--nothing stays in frame for more than a second or two, sometimes less.</p><p>The footage ends up being periods of cloudy water with no fish, then suddenly a fish! then nothing again.  It's difficult to manually find these fish in the collected video footage and so easy to miss them completely as they race past in the current. 
</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/11/0013-00_10.jpg" class="kg-image"><figcaption>Not all fish are camera shy</figcaption></figure><p>"Finding the fish in these videos is going to take you forever," she was told.  Challenge accepted! So we built ACTION together.  Using the existing AI models with our custom pipeline, we are able to transform hours of video footage into short, separate video clips that contain all the fish.  After this, the serious work of species analysis and identification is much easier.</p><p>The process of building the tool together with my daughter was a lot of fun.  In addition to the internship, another of her current courses is Bioinformatics, where she's learning Unix, Bash scripting, and Python.  This made it easy for us to choose our tech stack.  As we iterated on the code, we were able to incorporate real-time feedback from the fieldwork and data she was collecting.  The feedback loop was amazing, since we quickly understood what would and wouldn't work with actual data.</p><p>For example, unlike with terrestrial camera traps, it became clear that having sufficient amounts of buffer (e.g., video frames before and after a detection clip) was critical: while a Raccoon might amble in front of your camera for 20 seconds, fish are there and then not.  You need extra frames to be able to slowly scrub through when identifying a flash of colour (was that a Redside Dace?).</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/11/0001-00_00.jpg" class="kg-image"><figcaption>ACTION detecting a family of Raccoons via MegaDetector</figcaption></figure><p>The real-world testing helped us come up with a <a href="https://github.com/humphrem/action#options">useful set of options</a> that can be tweaked by the user depending on the circumstances: what should the confidence threshold be for reporting a positive detection?  
how long should each clip be at minimum?  where should the clips get stored? should the detection bounding boxes get displayed or saved along with the clips?  Week by week we coded, refined our approach, and tested it on the next batch of videos.  Real-world testing on live data turns out to be the fastest path to victory.</p><p>Eventually the fish detection was working really well and we wondered if we could extend it to work for terrestrial camera traps.  Would the code work with MegaDetector, too?  The answer turned out to be "yes," which was very rewarding.  I've wanted to play with MegaDetector for years, but never had the reason to get started.  Once the pipeline was built, swapping in a different AI model was doable (though not easy).  We think that having this work for traditional camera traps, in addition to aquatic cameras, adds a lot to the potential usefulness of ACTION. </p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/11/0084-00_48.jpg" class="kg-image"><figcaption>ACTION detecting a Green Heron via MegaDetector</figcaption></figure><p>After we finished the project, we knew we had to share it.  Imagine if this had existed when the internship started (it didn't, we looked!).  We would love to see other people benefit from the same techniques we used.  Also, the fact that YOLO-Fish and MegaDetector are both open source is an incredible gift to the scientific and software communities.  We wanted to share, too.</p><p>Together we packaged things with <a href="https://prefix.dev/docs/pixi/overview">pixi</a> (thanks to the maintainers for accepting our fixes), learned <a href="https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files">how to do proper citations on GitHub</a>, and used <a href="https://github.com/humphrem/action/releases">GitHub Releases to host our ONNX model files</a> (protip: you can't use git-lfs for this).  
It was excellent real-world experience to see what's involved in shipping a cross-platform project.</p><p>For me, the most difficult problem technically was figuring out how to optimize the pipeline so it could run efficiently on our machines.  We don't have access to fancy GPUs, and needed something that would work on our laptops.  The answer was two-fold.  First, converting the models to <a href="https://onnx.ai/">ONNX</a> format and using <a href="https://onnxruntime.ai/">ONNX Runtime</a> instead of Yolo or PyTorch/TensorFlow as our model runtime.  This was a game changer, and also let us throw away a whole host of dependencies.  Finding the right magical incantations to achieve the conversions kept me up a few nights, but thanks to <a href="https://github.com/parlaynu/megadetector-v5-onnx">https://github.com/parlaynu/megadetector-v5-onnx</a> and <a href="https://github.com/Tianxiaomo/pytorch-YOLOv4">https://github.com/Tianxiaomo/pytorch-YOLOv4</a> we eventually got there.  We also learned that we didn't need to analyze every frame in a video, and could use some tricks to make things work faster by doing less (as always, the fastest code is the code you don't run).</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/11/0022-00_03.jpg" class="kg-image"><figcaption>ACTION detecting a Flying Squirrel mid-flight via MegaDetector</figcaption></figure><p>I learned a ton working on this project, and I'm thrilled to have been able to get involved in some of the amazing conservation biology research that I see my daughter doing all the time.  At the start of the summer, I set myself the task of figuring out how to use AI and conservation technology together, and this was the best possible realization of that goal.  
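The frame-sampling and clip-buffering tricks described above are easy to sketch. ACTION itself is written in Python, but the core idea fits in a few lines of TypeScript (illustrative only; the real project's function names, options, and defaults differ):

```typescript
// Illustration of ACTION-style pipeline logic; these names are made up for this sketch.

// Analyze only every Nth frame: at 30 fps, a 1.0 s sampling interval means 1 frame in 30.
function framesToSample(totalFrames: number, fps: number, intervalSec: number): number[] {
  const step = Math.max(1, Math.round(fps * intervalSec));
  const frames: number[] = [];
  for (let i = 0; i < totalFrames; i += step) frames.push(i);
  return frames;
}

// Turn detection timestamps (seconds) into clips, padded with buffer on both sides
// and merged whenever padded clips overlap, so a fast fish isn't split across files.
function detectionsToClips(times: number[], bufferSec: number, minClipSec: number): Array<[number, number]> {
  const clips: Array<[number, number]> = [];
  for (const t of [...times].sort((a, b) => a - b)) {
    const start = Math.max(0, t - bufferSec);
    const end = Math.max(t + bufferSec, start + minClipSec);
    const last = clips[clips.length - 1];
    if (last && start <= last[1]) {
      last[1] = Math.max(last[1], end); // overlaps the previous clip: extend it
    } else {
      clips.push([start, end]);
    }
  }
  return clips;
}
```

For example, detections at 5 s and 6 s with a 2 s buffer merge into one clip, while a detection at 20 s becomes its own clip.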
It's very rewarding to have my skillset complement my daughter's (neither of us could have done this project on our own), and I'm excited that I finally got a chance to work with my favourite scientist!</p><p>Do me a favour and please go give <a href="https://github.com/humphrem/action">https://github.com/humphrem/action</a> a star on GitHub, try it out, tell your friends, and let us know if you use it.  In the meantime, we'll look forward to seeing your citations roll in on Google Scholar.</p>]]></content:encoded></item><item><title><![CDATA[A month with AudioMoth]]></title><description><![CDATA[<p>In my personal time over the past few months, I've worked on a lot of code to connect machine learning models to various species-detection problems in conservation.  A bunch of this work has focused on automating species detection in camera trap videos, both terrestrial and aquatic.  I'll write about that work</p>]]></description><link>https://blog.humphd.org/a-month-with-audio-moth/</link><guid isPermaLink="false">6519ad634a8ffa03eff4d33c</guid><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Sun, 01 Oct 2023 18:46:40 GMT</pubDate><content:encoded><![CDATA[<p>In my personal time over the past few months, I've worked on a lot of code to connect machine learning models to various species-detection problems in conservation.  A bunch of this work has focused on automating species detection in camera trap videos, both terrestrial and aquatic.  I'll write about that work in another post at some point, but today I wanted to talk about a related project: automatic species detection for birds using audio recordings done with an AudioMoth.</p><p><a href="https://www.openacousticdevices.info/audiomoth">AudioMoths</a> are small, relatively inexpensive acoustic loggers used by bird, bat, and other biologists, ecologists, and people like me, to monitor wildlife species based on sound.  
Imagine a device, smaller than a deck of playing cards, with a built-in microphone, SD storage card, and microprocessor that can be programmed to record audio on a specific schedule.  </p><figure class="kg-card kg-image-card"><img src="https://static.wixstatic.com/media/b31671_731d319ae0e249ad98e5fcdc86cb9592~mv2.jpg/v1/crop/x_92,y_0,w_1178,h_1247/fill/w_540,h_572,al_c,q_80,usm_0.66_1.00_0.01,enc_auto/_edited_edited.jpg" class="kg-image" alt="AudioMoth"><figcaption>AudioMoth (without case)</figcaption></figure><p>Our family has spent years working with camera traps, which are great for collecting photos or videos (we prefer video) of wildlife based on movement.  The camera traps we use <em>do</em> record some audio, but it's poor quality.  I really wanted to see how much better I could do with an audio-only device like the AudioMoth.  I've loved using <a href="https://merlin.allaboutbirds.org/">Merlin Bird ID</a> to do bird species detection on my phone, and I wanted to see if I could do this kind of monitoring long-term in an environment. </p><p>Unfortunately (for me), AudioMoths are very popular with researchers, and can be hard to acquire.  Last winter I started stalking the <a href="https://groupgets.com/manufacturers/open-acoustic-devices/products/audiomoth">site where you can buy them in North America</a>.  You have to wait for them to have enough backers to build a new run of these units.  Eventually in early summer, I was able to buy one and it shipped at the end of August.</p><p>I spent a number of days learning how to use it.  You need 3 AA batteries, a micro-SD card, and a cable to connect to it via USB.  
There is an <a href="https://www.openacousticdevices.info/applications">open source app</a> that lets you create a config file, defining your recording settings, schedule, etc.</p><figure class="kg-card kg-image-card"><img src="https://static.wixstatic.com/media/b31671_2b260f28b72b4e4182990586ab2060cd~mv2_d_2344_2068_s_2.png/v1/fill/w_484,h_380,al_c,q_85,usm_0.66_1.00_0.01,enc_auto/AM1-1-USB.png" class="kg-image" alt="AudioMoth USB Connection"></figure><figure class="kg-card kg-image-card"><img src="https://groupgets-files.s3.amazonaws.com/AudioMoth/AudioMoth_Software.JPG" class="kg-image" alt="Programming the AudioMoth"></figure><p>I also opted to buy a <a href="https://store.groupgets.com/products/the-official-audiomoth-ipx7-waterproof-case-green?channel=buy_button&amp;referrer=https%3A%2F%2Fgroupgets.com%2Fmanufacturers%2Fopen-acoustic-devices%2Fproducts%2Faudiomoth&amp;variant=35617620295842&amp;">waterproof case</a>, which comes with a velcro strap you can use to attach the unit to a tree limb.  After this you mount it wherever you want to do your recording.</p><figure class="kg-card kg-image-card"><img src="https://store.groupgets.com/cdn/shop/products/DSC04196-min_1e5dda7c-1d48-4287-886e-a76afb8aa882.jpg?v=1641951098" class="kg-image" alt="AudioMoth Waterproof Case"></figure><p>It took me the better part of a week to figure out the right settings to use (I'm still not 100% sure what I'm doing is optimal).  I've found the AudioMoth to be a bit more like a research tool vs. a commercial product, in terms of its ease of use for an amateur/enthusiast like me.  However, I'm used to <em>learning-as-I-go</em> with technology, and if you're willing to tinker and experiment, you can get good results.</p><p>Once I was satisfied that my recordings were working, I moved the device into a better spot to try a longer experiment.  I picked an area where a number of different ecosystems meet: mixed deciduous forest, scrub brush, swamp wetland, edge of a creek, etc.  
This is usually a great birding spot, and also unlikely to cause issues with recording human activity (too early, too buggy, too dense):</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/10/audio_moth_location.jpg" class="kg-image"><figcaption>AudioMoth Location at intersection of different habitats</figcaption></figure><p>My goal was to record the bird chorus for 2 hours every morning (5:30 am - 7:30 am) for the month of September.  I wanted to collect this data in order to understand which species were moving through our woods during <a href="https://ebird.org/news/fall-birding-basics/">fall migration</a>.</p><p>Yesterday I went to collect my data.  However, when I got there, the AudioMoth was gone!  Something (raccoon, likely) had undone the velcro strap and it was laying in the long grass beneath the tree.  I'm going to need a better mounting option!</p><p>After downloading all the <code>.wav</code> audio files from the SD card, I wrote a bunch of Python code to use <a href="https://birdnet.cornell.edu/">BirdNet</a> as my species detection algorithm.  BirdNet can detect ~6,000 species, and uses latitude and longitude, as well as historic sighting data from <a href="https://ebird.org/home">eBird</a> to determine likely species lists for a given location/period.</p><p>My code uses the open source <a href="https://github.com/kahst/BirdNET-Analyzer">BirdNet-Analyzer</a> via <a href="https://joeweiss.github.io/birdnetlib/">birdnetlib</a>, both of which are amazing.  I was able to very quickly go from concept to working code in a matter of hours.  The BirdNet lib is able to extract data and audio recordings for each detection.  For example, you might get a detection like this, along with a 3 second <code>.mp3</code> file:</p><pre><code>{
	"common_name": "Common Yellowthroat",
	"scientific_name": "Geothlypis trichas",
	"start_time": 3207.0,
	"end_time": 3210.0,
	"confidence": 0.732090950012207,
	"extracted_audio_path": "20230831_053000_3206s-3210s.mp3"
}</code></pre><p>This is amazing!  However, coming from a "camera trap" background, I really wanted to have something larger than a single clip.  I decided to analyze all of the detections to find the top 10 by confidence (i.e., how confident is the model about the species) for each species, then use that to create a kind of "greatest hits" sample that combines each individual detection one-after-another.</p><p>This produced fantastic results, making it so much easier to compare audio for a species to other recordings, even those that occurred on different days.  Having multiple samples to listen to back-to-back also helped get a better sense of the range of what each species sounds like.</p><p>I did some more analysis on the data to get a complete list of the 39 species for the month:</p><pre><code>- American Crow
- Yellow-rumped Warbler
- Common Yellowthroat
- White-breasted Nuthatch
- Black-capped Chickadee
- Cedar Waxwing
- Common Grackle
- Blue Jay
- Downy Woodpecker
- American Goldfinch
- Swamp Sparrow
- White-throated Sparrow
- Eastern Bluebird
- Pileated Woodpecker
- Ruby-throated Hummingbird
- Hairy Woodpecker
- Gray Catbird
- Golden-crowned Kinglet
- Song Sparrow
- American Robin
- Eastern Phoebe
- Killdeer
- Eastern Kingbird
- Canada Goose
- Northern Flicker
- Barn Swallow
- Swainson's Thrush
- House Finch
- Rose-breasted Grosbeak
- Yellow Warbler
- Magnolia Warbler
- Blackpoll Warbler
- White-crowned Sparrow
- Common Loon*
- Savannah Sparrow
- Chestnut-sided Warbler
- Greater Yellowlegs
- Lesser Yellowlegs
- Ruby-crowned Kinglet
- Red-bellied Woodpecker</code></pre><p>I've been able to confirm most of these manually (to the best of my ability), however, the Common Loon detections are for sure not right (the recordings were of coyotes howling).  I'll need to tweak my parameters to the AI model.</p><p>Here's a graph of all the species and how many days each one was detected, giving a sense of frequency across the month. I was surprised that the Yellowthroats were the most numerous.</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/10/species_days.png" class="kg-image"></figure><p>Overall, I'm really impressed.  This was my first major test of the setup and my software, and other than nearly losing the AudioMoth in the swamp, everything worked great.</p><p>My next goal is to build a web-based explorer for the data.  For example, I don't have a great way to share the audio files and I'd love to pull in open licensed images of each bird species to build a multi-media experience that makes this data more accessible to others.  Thinking about this, it strikes me that it would be cool to build a bird-species visualization toolkit, since so much of what I need (open licensed images, audio, video, text) would also be needed by anyone else who wants to work on bird web apps.  Maybe I'll find that this already exists, which would be great.</p><p>In the meantime, if you're interested in wildlife audio recording, I'd recommend picking up an AudioMoth and doing your own experiments.  With very little knowledge, equipment, or code, I was able to start answering some interesting questions about my own local ecosystems.</p>]]></content:encoded></item><item><title><![CDATA[Teaching ChatCraft to use Tools]]></title><description><![CDATA[<p>It's <a href="https://blog.humphd.org/what-edit-unlocks/">been a month</a> since I wrote about the work we're doing on <a href="https://chatcraft.org">ChatCraft.org</a>.  
Since then we've been focused on a few things and I wanted to write about how they work.</p><p>First, Taras added support for <a href="https://openrouter.ai/">OpenRouter</a>, which was extended by <a href="https://github.com/Chigala">Chigala Kingsley</a>.  Previously, we could only work</p>]]></description><link>https://blog.humphd.org/teaching-chatcraft-to-use-tools/</link><guid isPermaLink="false">64c819a44a8ffa03eff4d253</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Mon, 31 Jul 2023 21:06:02 GMT</pubDate><content:encoded><![CDATA[<p>It's <a href="https://blog.humphd.org/what-edit-unlocks/">been a month</a> since I wrote about the work we're doing on <a href="https://chatcraft.org">ChatCraft.org</a>.  Since then we've been focused on a few things and I wanted to write about how they work.</p><p>First, Taras added support for <a href="https://openrouter.ai/">OpenRouter</a>, which was extended by <a href="https://github.com/Chigala">Chigala Kingsley</a>.  Previously, we could only work with OpenAI models, but OpenRouter also gives us access to Google's PaLM, Anthropic's Claude, and Meta's Llama models. This is fantastic, because it pairs so nicely with our existing ability to work with different models in the same chat (i.e., send a prompt to any model, or retry a response with a different model).  It's amazing being able to mix and match responses from so many LLMs across all these providers within the same tool.  This is the real power of ChatCraft, which isn't beholden to any particular LLM or vendor.</p><p>Second, we've been working a lot on adding support for function calling and the ability to create tools.  This is a feature that Taras and I have discussed at length for months, and it's been <em>extremely</em> difficult to nail down the UX.  
However, we've finally managed to ship an initial version.</p><p>OpenAI and ChatGPT have had the ability to <a href="https://platform.openai.com/docs/guides/gpt/function-calling">use functions</a> for a while.  The concept is this:</p><ol><li>You define the interface for a function (name, description, and schema for the arguments).  Imagine a function that can add two numbers, download a file and process it somehow, etc.</li><li>You let the LLM know about the existence of the function and either tell it to call it, or let it decide if it needs to use it when you send your prompt (i.e., functions can be optional or required)</li><li>If the LLM wants to call your function, it sends a special message in JSON format with the name and arguments it wants to pass to the function</li><li>The LLM never calls the function–you have to do this on your own.  When you're done executing the function on its behalf, you send the LLM the result and it continues processing</li></ol><p>ChatGPT lets you do this, too, but you're not in control of the process or functions.  We wanted to give users the ability to easily define and run their own custom functions, and to do so directly in their browser.</p><p>ChatCraft defines a function as an ES Module that includes the function and its metadata:</p><ol><li>The name of the function</li><li>The description of the function</li><li>The schema of the parameters (simplified JSON schema)</li><li>A default function export, which is the function itself</li></ol><p><a href="https://gist.github.com/humphd/647bbaddc3099c783b9bb1908f25b64e">Here's a simple example</a>, which can calculate the sum of a list of numbers.  LLMs aren't great at doing math, so a function that can do the calculation is really useful.  A function can do whatever you want, as long as it's doable in the context of browser-based JavaScript (or via CORS and API calls).  You return a <code>Promise</code> with your result and we send it back to the LLM.  
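The gist itself isn't reproduced here, but a sum module of the shape described above might look roughly like this (an illustrative sketch, not the actual gist's code; in the real ES Module the metadata and function would be `export`ed rather than plain declarations):

```javascript
// Illustrative sketch of a ChatCraft function module (the real gist may differ).

// 1. The name of the function
const name = "sum";

// 2. The description of the function
const description = "Calculates the sum of a list of numbers.";

// 3. The schema of the parameters (simplified JSON Schema)
const parameters = {
  type: "object",
  properties: {
    numbers: {
      type: "array",
      items: { type: "number" },
      description: "The list of numbers to add together",
    },
  },
  required: ["numbers"],
};

// 4. The default export: ChatCraft invokes this with the arguments the LLM
// supplies, then sends the resolved Promise value back to the LLM.
async function sum({ numbers }) {
  return numbers.reduce((total, n) => total + n, 0);
}
```

When the LLM decides to call `sum`, it produces a JSON message naming the function and its arguments, ChatCraft runs the default export with them, and the resolved value goes back into the conversation.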
If you want your data to get formatted in the UI you can wrap it in a Markdown codeblock (e.g., 3 backticks with a language).</p><p>These function modules can be written and hosted inside ChatCraft itself: <a href="https://chatcraft.org/f/new">https://chatcraft.org/f/new</a> will create a new one.  Or you can host them (as plain text) on the web somewhere (e.g., as a Gist).  In both cases we load them dynamically and run them on demand for the LLM.</p><p>You tell the LLM about these functions using the syntax <code>@fn: sum</code> or <code>@fn-url: https://gist.github.com/humphd/647bbaddc3099c783b9bb1908f25b64e</code>, where <code>@fn</code> refers to the name of a function stored in your local ChatCraft database, and <code>@fn-url</code> points to a remotely stored function.</p><p>I shared a chat earlier today to demonstrate how this works, and you can see it here:</p><p><a href="https://chatcraft.org/c/humphd/LtEkj_J3p66MkDadwzwrH">https://chatcraft.org/c/humphd/LtEkj_J3p66MkDadwzwrH</a></p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/07/Screenshot-2023-07-31-at-10.21.35-AM.png" class="kg-image"></figure><p>Here I use my remote function via the <code>@fn-url</code> syntax, and GPT-4 incorporates it into its answer.  We show all the steps of the process in the UI: the function call, the result, and what the LLM does with it.  In theory you could also collapse this all into one message, but we think "showing your work" is a more powerful approach.</p><p>That said, we're just getting started with the UX for this, and every time we've tested things we've realized new ways to make it better or bugs that need to be fixed.  However, it actually works and we're really excited about it!  
It's very easy to experiment and get things running fast.</p><p>The real power of this feature is going to come from combining it with custom system prompts, where we give the LLM an initial context and set of functions it can use to achieve specific goals.  Getting both of these features in place over the summer has opened up a very interesting space that we're excited to start exploring.</p><p>If you want to try this today for yourself, you're welcome to do so (it's live on <a href="https://chatcraft.org">ChatCraft.org</a>).  <a href="https://github.com/tarasglek/chatcraft.org">Let us know</a> what you build, what breaks, and what you think we should do next.</p>]]></content:encoded></item><item><title><![CDATA[Time vs Size]]></title><description><![CDATA[<p>I've been thinking about how much I favour measurements in <em>time</em> vs. <em>size</em> and how the shift to a chronological perspective allows me to overcome my fear of engaging with large tasks. This week a few things have overlapped in my mind to reinforce this idea.</p><p>First, during my summer</p>]]></description><link>https://blog.humphd.org/time-vs-size/</link><guid isPermaLink="false">64b29fb94a8ffa03eff4d076</guid><category><![CDATA[seneca]]></category><category><![CDATA[mozilla]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Sat, 15 Jul 2023 15:16:56 GMT</pubDate><content:encoded><![CDATA[<p>I've been thinking about how much I favour measurements in <em>time</em> vs. <em>size</em> and how the shift to a chronological perspective allows me to overcome my fear of engaging with large tasks.
This week a few things have overlapped in my mind to reinforce this idea.</p><p>First, during my summer holidays I've been binge-watching the <a href="https://www.youtube.com/@escapetoruralfrance">Escape to Rural France</a> YouTube channel, where <a href="https://www.buymeacoffee.com/8rCuAhZaiJ">Dan Preston</a> is <a href="https://www.youtube.com/watch?v=9JbwdGPpFA8">restoring a burnt-out château ruin</a> (the <a href="https://en.wikipedia.org/wiki/Ch%C3%A2teau_de_Chaumont_(La_Serre-Bussi%C3%A8re-Vieille)">Château de Chaumont</a>).  My eldest daughter has long been a fan of various château restoration channels, which is partly why I suspect this series ended up in my YouTube recommendations.  "Dad, you'd love this," and she's right.  I initially watched one episode, then another, and now I'm hooked.  The task he's taken on is enormous, and the hubris and total disregard for practicality are what make it so engaging.</p><p>It's slow, which I love. Each episode is focused on a day of work, which might entail clearing trees, shovelling debris, repairing brickwork, or discovering a beehive in an old bathroom wall.  There is no urgency to the pace.  The scale is both human and beyond reach: nothing big happens, and yet over time and bit by bit, large transformations occur.  In an episode I watched a few days ago, Dan pauses his work laying floor joists to reflect: the key to this work is thinking about everything as a set of small jobs vs. the whole, which would overwhelm.</p><p>One of my favourite regular characters on the channel is Nick, the tree surgeon.  Dan and Nick spend a lot of time thinking about how to <a href="https://www.youtube.com/watch?v=9FAdZXszMDA">open up and restore the grounds</a>, which have been badly overgrown.  Nick even spends one episode <a href="https://www.youtube.com/watch?v=4P8wPr9_4F8">removing a massive tree that's grown inside the Château</a>.
He also reflects on the need to think in years and decades, planning for the woodland, but recognizes how small actions taken now, in the present, will come to influence the future.</p><p>Watching Dan and Nick slowly pick away at massive jobs has been inspiring for me. This past week it motivated me to start work on a project I've wanted to do for a few years.  On our property we've been overrun with <a href="https://www.ontarioinvasiveplants.ca/invasive-plants/species/spotted-knapweed/">Spotted Knapweed</a>, an invasive plant from Europe that takes over Ontario grasslands, outcompetes native species, and increases fire risk.</p><p>In the past decade these "purple flowers" went from being something we thought were pretty to being the dominant plant in many areas of the property.  My youngest daughter has got me thinking more and more about the negative effects on biodiversity that invasive vs. native species have.  I've long wanted to do something about it, but every time I've contemplated it, the job has felt impossible.  You can control Knapweed with chemicals, but I don't want to go that route.  The alternative is to hand-pull it.</p><p>The weather here recently has been rainy, and with the rain the soil has been perfect for going to work on the Knapweed.  I've spent days carefully removing plant after plant, making sure to get the root.  Some plants have been taller than me, while others are just starting to come up.  Regardless, they all get pulled.</p><p>The task seemed ridiculous at first.  And yet, bit by bit and slowly, the patches of Knapweed are giving way.  Because I'd always thought about this project as a whole, starting it never made sense: surely there is no way to do this by hand, the job is too big!  
But by converting the size of the task to a series of steps in time (i.e., "I'll do this patch today, that patch tomorrow, ..."), I've been able to make an impact.</p><p>The consequence of approaching problems from the perspective of time is that it becomes possible to begin.  I'm not worrying about completing the project, which in the case of the Knapweed will likely never be done; rather, I'm deciding to participate <em>today</em>.  The problem is no longer measured in square meters but hours and days.</p><p>As I've been pulling Knapweed, I've also come to realize that this is my preferred approach to software as well.  I spend most of my time working on software vs. natural ecosystems, and in that work I've long known that many small fixes are invariably better than one massive effort.  I learned this first when working on Mozilla, where the scale of the code was inhuman (no one knew or understood it all, least of all me), but everything was accomplished through the repeated contributions of individuals.</p><p>I've since been reminded of this in my work with Taras on <a href="https://chatcraft.org/">chatcraft.org</a>.  I would never have started this project on my own because it would have looked too big.  Taras coming to me with something already started made it feel possible–I'm much more comfortable fixing bugs and improving code vs. writing it from zero.  That's the risk of "product thinking," focusing on applications vs. their code, where the former requires everything to exist, but the latter can be useful at various stages.  The reality is that by picking away at something, editing and refactoring what was already there, and slowly fixing and cleaning-up the code, you're able to make something really amazing.  
Especially if you don't get fixated on being "finished" and instead learn to embrace the half-finished nature of the work, it's possible to go very far.</p><p>I'm generally not good at being on vacation, but by allowing myself to exist at the intersection of a French château, Spotted Knapweed, and <a href="https://chatcraft.org/">chatcraft.org</a>, I'm enjoying the possibilities of existing in time.</p>]]></content:encoded></item><item><title><![CDATA[What Edit Unlocks]]></title><description><![CDATA[<p>Last week <a href="https://blog.humphd.org/my-spring-with-chatcraft-org/">I wrote</a> about a bunch of new <a href="https://chatcraft.org">ChatCraft</a> features we shipped, including the ability to edit any message in a chat.  This week we explored what this new edit feature could enable, extending the idea to the system prompt, which was previously hidden and fixed.  By combining a</p>]]></description><link>https://blog.humphd.org/what-edit-unlocks/</link><guid isPermaLink="false">6495f0eb4a8ffa03eff4cf50</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Fri, 23 Jun 2023 20:13:03 GMT</pubDate><content:encoded><![CDATA[<p>Last week <a href="https://blog.humphd.org/my-spring-with-chatcraft-org/">I wrote</a> about a bunch of new <a href="https://chatcraft.org">ChatCraft</a> features we shipped, including the ability to edit any message in a chat.  This week we explored what this new edit feature could enable, extending the idea to the system prompt, which was previously hidden and fixed.  By combining a visible system prompt with an edit button, we've been able to unlock some really interesting abilities in the app.</p><p>A lot has been written about the need to create detailed prompts for LLMs. However, I've been equally interested in exploring what I could do by altering the behaviour of the model via a good system prompt.  System prompts (or system messages) are often the first message in a chat.  
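Concretely, in the chat completions format a conversation is just an array of role-tagged messages, with the system message in the first slot (a minimal sketch; the message contents here are invented for illustration):

```javascript
// A conversation in the chat completions format: the (optional) system
// message comes first and shapes the assistant's behaviour for every turn.
const messages = [
  { role: "system", content: "I am ChatCraft, a web-based, expert programming AI assistant." },
  { role: "user", content: "How should I structure a custom React hook?" },
];
```

Everything that follows in the chat gets interpreted in light of that first message, which is why being able to edit it is so powerful.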
Most apps don't expose them to users.  From the <a href="https://platform.openai.com/docs/guides/gpt/chat-completions-api">OpenAI docs</a>:</p><blockquote>Typically, a conversation is formatted with a system message first, followed by alternating user and assistant messages. The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. However note that the system message is optional and the model’s behavior without a system message is likely to be similar to using a generic message such as "You are a helpful assistant." </blockquote><p>The system prompt helps to set the tone of your interaction with a model.  Do you want output in a particular format?  Are certain types of responses required and/or forbidden?  Are there underlying assumptions that the model needs to know about when interacting with the user?  The system prompt, like stage direction in a play, lets you guide the model on how to best play the part of a character in the story that's about to unfold.</p><p>As I mentioned above, most apps try to hide their system prompt, and there's <a href="https://twitter.com/marvinvonhagen/status/1657060506371346432">lots of interest</a> when someone can goad a model into revealing what it is.  I understand why many apps can't disclose this, and don't want people trying to circumvent it.  However, one of the things I love about ChatCraft is that we don't have to establish a wall around how the system works.  Instead, everything is editable!</p><p>Our current system prompt looks like this:</p><pre><code>I am ChatCraft, a web-based, expert programming AI assistant. I help programmers learn, experiment, and be more creative with code.

I follow these rules when responding:

- Use GitHub flavored Markdown
- ALWAYS include the programming language name (js) or type of data (csv) at the start of Markdown code blocks
- Format ALL lines of code to 80 characters or fewer
- Use Mermaid diagrams when discussing visual topics</code></pre><p>It's worked quite well, not least because we aren't trying to dramatically alter the nature of the LLMs we use (it can be hard to get a model to go against its training.  For example, try getting a model to respond <em>about</em> code, but <em>never</em> give actual source code).</p><p>As we've built ChatCraft, we've iterated on this prompt a few times to fix bugs, or nudge the responses toward or away from certain types of output.  It's pretty easy to change, but requires you to submit a PR to GitHub.</p><p>With our new edit feature, we wanted to see if we could make this process more open to experiments by users.  For example, I've wanted to test-drive the ChatCraft UI as a way to do research into biology and ecology vs. programming, and Taras has been exploring business topics.  Obviously these changes aren't something that we're going to land in git, so doing it was possible but frustrating.</p><p>By replacing our old "I'm a helpful assistant!" greeting message with an actual system prompt we expose in the UI, we make it possible to let the user take a more active role in establishing the limits and potential of the conversation.  Here's the new UI:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/3-1.jpg" class="kg-image"><figcaption>Greeting converted to System Prompt</figcaption></figure><p>The system prompt is truncated, but clicking "More..." will reveal it.  The user can also click "Edit" in the message's menu to customize it:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/2-1.jpg" class="kg-image"><figcaption>Edit a System Prompt</figcaption></figure><p>Now it's easy to do both minor tweaks or major surgery.  Here are 2 examples of the same prompt, but each chat uses a customized system prompt.  
Notice how different each conversation becomes as a result of these changes–one in French, the other in short sentences:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/4-1.jpg" class="kg-image"><figcaption>System Prompt altered to get French responses</figcaption></figure><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/5-1.jpg" class="kg-image"><figcaption>System Prompt altered to get short responses</figcaption></figure><p>These examples are really minor adjustments, but they show how powerful the technique can be.  I think that as we begin to include the system prompt in the scope of what we allow the user to alter, we'll discover new ways to work with the tool more quickly.</p><p>Taras and I are excited to apply this technique to our upcoming explorations into adding functions to ChatCraft.  Imagine being able to use ChatCraft to help you write functional tools that can then be used to solve problems in the conversation.  We think it's possible, and this is taking us one step closer!</p><p><a href="https://github.com/tarasglek/chatcraft.org">Let us know</a> if you manage to do something interesting with a custom system prompt or have more ideas.</p>]]></content:encoded></item><item><title><![CDATA[My Spring with ChatCraft.org]]></title><description><![CDATA[<p>It's <a href="https://blog.humphd.org/thinking-about-context/">been a month</a> since I last wrote about our work on <a href="https://chatcraft.org/">ChatCraft.org</a>, the developer-focused AI <a href="https://github.com/tarasglek/chatcraft.org">open source project</a> that I've been working on with <a href="https://twitter.com/tarasglek">Taras</a>.
In that time, we've been able to iterate on the UX, <a href="https://twitter.com/niallohiggins/status/1668694877574680576?s=51&amp;t=Q-jSo3oqjJCmqVa-pXrlSQ">work with users</a> to add new features, and implement a number of</p>]]></description><link>https://blog.humphd.org/my-spring-with-chatcraft-org/</link><guid isPermaLink="false">648b23c54a8ffa03eff4cca2</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Thu, 15 Jun 2023 17:24:26 GMT</pubDate><content:encoded><![CDATA[<p>It's <a href="https://blog.humphd.org/thinking-about-context/">been a month</a> since I last wrote about our work on <a href="https://chatcraft.org/">ChatCraft.org</a>, the developer-focused AI <a href="https://github.com/tarasglek/chatcraft.org">open source project</a> that I've been working on with <a href="https://twitter.com/tarasglek">Taras</a>.  In that time, we've been able to iterate on the UX, <a href="https://twitter.com/niallohiggins/status/1668694877574680576?s=51&amp;t=Q-jSo3oqjJCmqVa-pXrlSQ">work with users</a> to add new features, and implement a number of the ideas I described in my <a href="https://blog.humphd.org/thinking-about-context/">last post</a>.  It's coming along really well, so I thought I'd take a few minutes to show you what we have right now.</p><p>As I've worked on the code, and used it myself, the definition of what we're building has evolved.  At its most basic level, <a href="https://chatcraft.org/">ChatCraft.org</a> is a personal, web-based tool for discussing code and software development ideas with large language models.  "So it's ChatGPT, right?"  Sort of, but increasingly less so.  ChatCraft is really a tool for <em>iteratively</em> writing and thinking about code with LLMs.</p><p>As an open source software developer and educator, I spend the majority of my day in the GitHub UI: writing feedback, reviewing code, reading and adding comments to issues and PRs, and searching.
The flows and conventions of reading and writing on GitHub are now built into how I want to work: linkable, editable Markdown everywhere.  As such, I've had a strong desire to find a way to replicate my approach to talking to developers with my LLM conversations.  These are obviously completely different activities, both in aim and outcome; but they share an important idea.  In both cases, I'm thinking through writing.</p><p>Much as I'm doing now in this post, I write to understand and solidify my thinking.  When I'm blogging, I'm trying to communicate what I'm experiencing, first to myself, but also to others.  Similarly, when I'm working on GitHub, I'm at once attempting to develop both a line of thought as well as a line of commits.</p><p>Taras laughs at me for doing everything in pull requests vs. simply landing small things on <code>main</code>.  However, I find that I need the extra layer of explanation and writing to wrap around my code.  Just as I'm hoping to cleanly merge my branch with yours, I also need to find a way to turn what I'm thinking into something that can be integrated with what you are thinking.  I do this through writing, sometimes in code, more often in prose. </p><p>Because I write so much on GitHub, I've come to appreciate its affordances.  I think this comfort and familiarity with its UX has influenced how I wanted to see ChatCraft evolve.  Let me show you some of what I mean.</p><p>Here's the start of my most recent chat with ChatCraft, in which I'm fighting with a re-rendering bug in a hook within ChatCraft:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/1.jpg" class="kg-image"></figure><p>All chats have their own URL and get stored in a local database.  This chat is <a href="https://chatcraft.org/c/W1F3Tgin1QE5UFb5MMtBy">https://chatcraft.org/c/W1F3Tgin1QE5UFb5MMtBy</a>. You won't be able to open that link and get my data because it's stored in IndexedDB in my browser.  
We're using <a href="https://dexie.org/">Dexie.js</a> to interact with the database, and it's been <a href="https://dexie.org/docs/Tutorial/React">fabulous to use with React</a>.  Taras has dreams of using SQLite with WASM in a Worker down the road, but I wanted to get something working now.</p><p>Like Chats, each Message also has its own link, allowing me to deep-link to anything in my history.  The first message above by me is reachable via <a href="https://chatcraft.org/c/W1F3Tgin1QE5UFb5MMtBy#DnrwATIZiCsQ_C6JMCX95">https://chatcraft.org/c/W1F3Tgin1QE5UFb5MMtBy#DnrwATIZiCsQ_C6JMCX95</a> by clicking on the date <code>Jun 14, 2023, 3:53 PM</code></p><p>I can also open the sidebar (via the hamburger menu), revealing previous and shared chats (more on that below), making it easy to go back to something I was working on before:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/2.jpg" class="kg-image"><figcaption>Sidebar to navigate Previous and Shared Chats</figcaption></figure><p>If I instead want to search for something, all of my old chats and messages are indexed.  Here I'm searching for <code>context</code>, which returns a bunch of messages in various chats.  Clicking on any of their links (i.e., dates) will take me to the chat itself:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/3.jpg" class="kg-image"><figcaption>Searching within old Chats and Messages</figcaption></figure><p>Every message supports various actions via a "dots" menu on the right.  
For example, here are the actions for 1) a message written by me and 2) a message written by an LLM:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/4.jpg" class="kg-image"><figcaption>Actions for Human Messages</figcaption></figure><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/5.jpg" class="kg-image"><figcaption>Actions for AI Messages</figcaption></figure><p>In both cases, I can Copy, Download, Duplicate, Edit, and Delete.  Most of these are self-explanatory, but I'll call out Duplicate and Edit.</p><p>We had users asking to be able to "fork" a message, thereby taking a conversation in multiple directions at once.  We let you do that at the chat level (i.e., create a new chat in the DB with its own URL, but copy all the messages) or from a particular message in a chat (i.e., use this message as the end point and copy only what came before, vs. the whole thing).</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/6.jpg" class="kg-image"><figcaption>Creating a blank or duplicate chat</figcaption></figure><p>This is really useful when you want to experiment with different paths through a conversation, trying alternate prompts without erasing your current work.  Because each chat has its own URL, you can work in multiple tabs at the same time.</p><p>The Edit feature is also both obvious and amazing.  I've wanted it for a long time.  Using it, I can put myself in the driver's seat for all interactions.  Rather than passively reading and accepting an LLM's response, I can (and do!) edit it to reflect the way I want things to move.  Having the ability to edit and delete text anywhere in a chat gives incredible freedom to experiment and explore.  I'm no longer beholden to AI hallucinations, my own typos, or Markdown formatting issues.  Also, I'm not locked into a past version of how things went in a chat.  I can always rework anything to fit new directions.  
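<p>The two flavours of duplication described above can be sketched in TypeScript roughly like this (names are illustrative, not ChatCraft's actual API):</p>

```typescript
// Hypothetical sketch: "fork" a chat either wholesale, or using one
// message as the end point and discarding everything after it.
interface Message {
  id: string;
  author: "human" | "ai";
  text: string;
}

interface Chat {
  id: string;
  messages: Message[];
}

let nextId = 0;
const newChatId = () => `chat-${nextId++}`; // stand-in for a real unique id

// Create a new chat (with its own id, and therefore its own URL),
// copying all messages from the original.
function duplicateChat(chat: Chat): Chat {
  return { id: newChatId(), messages: chat.messages.map((m) => ({ ...m })) };
}

// Use a particular message as the end point: keep everything up to and
// including it, dropping the rest.
function forkAtMessage(chat: Chat, messageId: string): Chat {
  const idx = chat.messages.findIndex((m) => m.id === messageId);
  const kept = idx === -1 ? chat.messages : chat.messages.slice(0, idx + 1);
  return { id: newChatId(), messages: kept.map((m) => ({ ...m })) };
}
```

<p>Because the copy gets a fresh id, both chats can then evolve independently in separate tabs.</p>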
Remember that a chat is really a context for the messages that will follow, so it's helpful to be able to alter your current context to meet new expectations as the chat unfolds.</p><p>Extending this idea even further, another feature we've recently shipped is the ability to Retry an AI message with a different language model.  Imagine you're 5 messages deep into a conversation with GPT-3.5 and you wonder what GPT-4 might have said in response to the third message.  Or maybe you'd like to mix GPT-3.5 with GPT-4 in the same conversation, or compare how they work.  It's really simple to do:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/7.jpg" class="kg-image"><figcaption>Pick a Model to Retry a Response</figcaption></figure><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/8.jpg" class="kg-image"><figcaption>New Model's response is streamed into current chat</figcaption></figure><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/9.jpg" class="kg-image"><figcaption>Old versions are saved and easily switched between</figcaption></figure><p>Here I've taken a response from GPT-4 and retried it with ChatGPT.  The old response is saved as a version alongside the new one, and the new response is streamed into the current message.  I can easily switch back and forth between the two, making it easy to compare (I'm amazed how little they differ most of the time).  
I can also edit one of the responses and create a third version:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/Screenshot-2023-06-15-at-11.58.43-AM.png" class="kg-image"><figcaption>Editing an LLM response, creating a new version</figcaption></figure><p>The flexibility to mix and match LLMs, edit and retry prompts, delete messages that don't make sense, duplicate some or all of a message and keep going in new directions–all of this has made writing about code with LLMs incredibly productive for me.</p><p>Taras has also been really bullish about getting sharing to work.  We've built an initial version on top of <a href="https://pages.cloudflare.com/">CloudFlare Pages</a> and <a href="https://developers.cloudflare.com/pages/platform/functions/">Functions</a>, with <a href="https://docs.github.com/en/apps/oauth-apps/building-oauth-apps/authorizing-oauth-apps">GitHub OAuth</a> for authorization and <a href="https://www.cloudflare.com/products/r2/">R2 as our object store</a>.  I've wanted to learn CloudFlare's serverless tools for a while in order to compare them with what I know from AWS.  I have been impressed so far, and have filed issues on things that have been harder than they need to be.</p><p>Here's what sharing looks like:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/11.jpg" class="kg-image"><figcaption>Sharing a Chat</figcaption></figure><p>After authenticating with GitHub, I can manually or automatically summarize my chat and click Share Chat.  This creates a public URL that I can share with friends (NOTE: As I write this I notice that we've broken something with loading shared chats, which I've <a href="https://github.com/tarasglek/chatcraft.org/issues/109">filed to fix later</a>.  
You can try this one, which still works: <a href="https://chatcraft.org/c/humphd/3HuOFWRlV-kY28C2tjuGk">https://chatcraft.org/c/humphd/3HuOFWRlV-kY28C2tjuGk</a>):</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/12.jpg" class="kg-image"><figcaption>Creating a Public URL for a Chat</figcaption></figure><p>When I open a shared URL in a browser where I'm not logged in, I can read and duplicate the chat, whereas if I were logged in I could also edit:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/06/13.jpg" class="kg-image"><figcaption>Opening a Shared chat</figcaption></figure><p>I'm excited to extend and fix bugs in what we have so far.  <a href="https://github.com/tarasglek/chatcraft.org/pull/107">Taras has a PR</a> up to add dynamic model support, which will make it easier to try out more models.  We're also hoping to add models from vendors other than OpenAI.  I'm looking forward to being able to have a more "social" experience by mixing different LLMs in the same chat, which is easy now with the features I just outlined.  We're also interested in exploring adding <a href="https://github.com/tarasglek/chatcraft.org/issues/86">tools and functions</a> to the mix.</p><p>I don't think I'm alone in my belief that writing, and tools that support good writing, are the keys to unlocking LLMs. This week I had a <a href="https://twitter.com/humphd/status/1668266263494242306">tweet about people reflex-accepting Copilot suggestions</a> do close to a million views (lots of people seem to recognize what I was concerned with).  I think AI has tremendous power to support software developers, but as an educator who spends so much time with the next generation trying to learn our craft, I'm not convinced that we're getting this right in all cases.</p><p>With ChatCraft, I'm hoping to do the opposite.  I want to read, write, edit, review, and compare text with LLMs.  
I'm not interested in having text dropped into my editor, word processor, or other tools as-is.  I'm looking for, and trying to build, tools for thought.</p><p>I suspect that ChatCraft will look different again in another few months.  Maybe that will be because you've gotten involved and helped us do something cool.  In the meantime, I'm having fun in my spare time exploring what I think AI and coding can do.</p>]]></content:encoded></item><item><title><![CDATA[Thinking about Context]]></title><description><![CDATA[<p>I've <a href="https://blog.humphd.org/chatcraft-org/">written recently</a> about <a href="https://github.com/tarasglek/chatcraft.org">my work</a> on <a href="https://chatcraft.org/">ChatCraft.org</a>.  I've been doing a bunch of refactoring and new feature work, and things are in a pretty good state.  It mostly works the way I'd expect now.  <a href="https://github.com/tarasglek/chatcraft.org/issues">Taras and Steven have filed a bunch of good ideas</a> related to sharing, saving,</p>]]></description><link>https://blog.humphd.org/thinking-about-context/</link><guid isPermaLink="false">646262194a8ffa03eff4c9ee</guid><category><![CDATA[seneca]]></category><category><![CDATA[mozilla]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Mon, 15 May 2023 19:17:49 GMT</pubDate><content:encoded><![CDATA[<p>I've <a href="https://blog.humphd.org/chatcraft-org/">written recently</a> about <a href="https://github.com/tarasglek/chatcraft.org">my work</a> on <a href="https://chatcraft.org/">ChatCraft.org</a>.  I've been doing a bunch of refactoring and new feature work, and things are in a pretty good state.  It mostly works the way I'd expect now.  
<a href="https://github.com/tarasglek/chatcraft.org/issues">Taras and Steven have filed a bunch of good ideas</a> related to sharing, saving, and forking chats, and I've been exploring using <a href="https://developer.chrome.com/blog/sqlite-wasm-in-the-browser-backed-by-the-origin-private-file-system/">SQLite Wasm</a> for offline storage. But over the weekend I was thinking about something else.  Not a feature exactly, but a way of thinking about the linear flow of a chat.  The more I've worked with ChatCraft, the more I've learned about this form of dialog.  Because a number of separate features flow from this, I thought I'd start by sketching things out in my blog instead of git.</p><h2 id="a-chat">A Chat</h2><p>The (current) unit of AI interaction is the <em>chat</em>.  A chat, in contrast to a conversation, dialog, debate, or any of the other ways one might describe "talking," is a kind of informal talk between friends.  The word choice also gives a nod to the modern, technical meaning of "chat" as found in "chat app" or "chat online." When we "chat," we do so without formality, often in short bursts.</p><p>What informal chats depend upon is an existing, shared (i.e., external) context between the speakers.  I want to chat with you about details for some event we're planning, or to clarify something you said on the phone, or to quickly ask for help.  I can duck in and out of the conversation without ceremony, because this "chat" does not represent anything serious or lasting.  That is, the relationship of the participants is independent of this interaction--we're just chatting.</p><p>Something similar is at work when I'm talking to an AI.  Most of what I'm saying is not present in the chat.  Maybe I want to know the specific syntax for performing an operation in a programming language.  I'm not interested in learning the language, talking about why the syntax evolved the way that it did, debating other approaches, etc.  
I might write only 2 or 3 sentences, but everything I don't write is also necessary for the interaction to work.</p><p>Just as with a friend, I have to signal to an AI the type of thing I'm after.  Lots has been written about prompting an AI, but increasingly I'm becoming aware of the need to evolve that prompt over a series of messages, to refine the idea (both my own and the AI's), and work toward an understanding.  It's less about coming up with the right magical incantation to conjure an idea into existence, and more like a conversation over coffee with a colleague.  So there's always going to be an enormous, shared context that we generally won't discuss; but in addition to this we necessarily need to build a smaller, more immediate context within the discussion itself.</p><p>My interest in AI doesn't include "training LLMs from scratch," which is to say, I'm not concerned with the larger, shared context.  It's fundamentally important, but beyond me.  However, I am fascinated by this more intimate, smaller context that develops within the conversation itself.</p><h2 id="context-in-chatcraft">Context in ChatCraft</h2><p>A chat in ChatCraft looks like this:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/05/Screenshot-2023-05-15-at-1.23.34-PM.png" class="kg-image"><figcaption>Typical ChatCraft chat</figcaption></figure><p>To begin, we've got the usual back-and-forth you'd expect.  
Beneath the UI, we actually have the following:</p><ol><li>A system prompt, helping to define the way our assistant will behave.</li><li>AI message, the first one we seed</li><li>User message</li><li>AI message</li><li>Repeat...</li></ol><p>Lots of apps need to hide their system prompt (it's <a href="https://twitter.com/marvinvonhagen/status/1657060506371346432?s=51&amp;t=Q-jSo3oqjJCmqVa-pXrlSQ">hard to do well!</a>), but <a href="https://github.com/tarasglek/chatcraft.org/blob/main/src/hooks/use-system-message.ts">ours is easy to find</a>, since <a href="https://github.com/tarasglek/chatcraft.org">ChatCraft is open source</a>:</p><blockquote>You are ChatCraft.org, a web-based, expert programming AI.<br>You help programmers learn, experiment, and be more creative with code.<br>Respond in GitHub flavored Markdown. Format ALL lines of code to 80<br>characters or fewer. Use Mermaid diagrams when discussing visual topics.</blockquote><p>When the user enables "Just show me the code" mode, we amend it with this:</p><blockquote>However, when responding with code, ONLY return the code and NOTHING else (i.e., don't explain ANYTHING).</blockquote><p>By the time you read this, it will probably have changed again, but this is what it was when I wrote this post.</p><p>The AI/Human Message pairs are kind of an obvious construct, but after using this paradigm for a while, new things are occurring to me.</p><h2 id="there-is-only-one-author">There is Only One Author</h2><p>When I'm chatting with a friend, there are two (or more) people involved.  An effective and emotionally safe interaction will involve all parties getting a chance to speak and be heard.  
Furthermore, it's important that neither party manipulate or intentionally misrepresent what the other is saying.</p><p>These ideas are so obvious that I almost don't need to mention them; and it makes sense that they would find their way into how we model interactions with an AI as well.</p><p>In terms of manipulation, much has been said about AI hallucinations and how you have to be careful not to swallow whole any text that an AI provides.  This is true.  But I haven't read as much about people tinkering in the other direction.</p><p>When I first started working on ChatCraft, Taras had already added a very important feature: being able to remove a message from the current chat.  If the AI gets off on some tangent that I don't want, I can delete a response and try again.  It doesn't even have to be the last message I delete.</p><p>This seemingly simple idea has some profound implications.  By adding the ability to remove a message from anywhere in the current context, we establish the fact that only one party is involved: I am at once the author, editor, and reader.  There is no one else in the chat.</p><p>This realization becomes a foundation for building other interesting things. Let me give you a simple example.  ChatCraft takes advantage of GPT's ability to create <a href="https://mermaid.js.org/">Mermaid diagrams in Markdown</a>, and lets us render visual graphics:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/05/Screenshot-2023-05-15-at-1.39.40-PM.png" class="kg-image"><figcaption>Example Mermaid Diagram in ChatCraft</figcaption></figure><p>It can create <a href="https://twitter.com/tarasglek/status/1656223829570592769">some really complex diagrams</a>, which makes understanding difficult relationships much easier.  
But it also makes silly mistakes:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/05/Screenshot-2023-05-15-at-2.43.49-PM.png" class="kg-image"><figcaption>Syntax Errors in Pie Chart Diagrams</figcaption></figure><p>In these examples, our inline renderer has blown up trying to render diagrams with syntax errors (i.e., the numeric values shouldn't include <code>%</code>, <a href="https://mermaid.js.org/syntax/pie.html">per the docs</a>).  For a while, I was embracing the typical notion of what a "chat" should be and pointing out the error.  "My apologies, you're right..." would come the reply, and the error gets fixed.</p><p>But when I'm the only author in the chat, I should be able to manipulate and edit any response, be it mine or the AI's.  Fixing those graphs would be as simple as adding an EDIT button I can click to fix anything in an AI's response, thus unlocking my follow-up messages:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/05/Screenshot-2023-05-15-at-2.49.59-PM.png" class="kg-image"><figcaption>Fixed Pie Diagram</figcaption></figure><h2 id="mixing-multiple-ai-models">Mixing Multiple AI Models</h2><p>Another idea that becomes possible is swapping out the AI for one or more of the messages in the chat.  This would be unthinkable in a real chat, or even when chatting to an AI using a commercial product (why would anyone let me bring their competitor's AI into this interaction?).  But in an open source app, I should be able to move effortlessly between chatting with ChatGPT, GPT-4, Claude, Bard, etc.  I owe no fealty to an API provider.  I should be able to pull and mix responses from various AI backends, leveraging different AI models where appropriate for the current circumstance.</p><p>Speaking of different models and context, I've also been thinking about context windows.  
While chatting with ChatGPT in ChatCraft, I often hit the 4K token limit (it's 8K with GPT-4).  Rarely am I asking a question and getting an answer.  More often than not, it's a slow evolution of an idea or piece of code.  Up until now, that's meant I have to start manually pruning messages out of the chat to continue on.  But I've realized that I could implement a sliding context window, which would allow me to chat indefinitely with a ~4K context window that includes the most recent messages.</p><h2 id="working-with-data">Working with Data</h2><p>Part of what takes me over the 4K/8K limit is including blocks of code.  I usually write to ChatCraft the way I'd discuss something in GitHub: Markdown with lots of code blocks.  I even find myself copy/pasting three or four whole files into a single message.  It's made me realize that I need to be able to "upload" or "attach" files directly into the chat.  I want to talk about a piece of code, so let me drag it into the chat and have it get included as part of the context.  If I want to deal with it as a piece of text, I can still copy/paste it into my message; but if all I want is for it to ride along with the rest of what I'm discussing, I should be able to add it easily.</p><p>The same is true for other kinds of data.  ChatCraft can already render HTML, to build things like charts:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/05/Screenshot-2023-05-15-at-3.11.49-PM.png" class="kg-image"><figcaption>Rendering a Chart.js line chart in ChatCraft</figcaption></figure><p>Maybe I want to draw a graph using a bunch of CSV data.  What if I could drag or attach that data into the chat just like I mentioned above with code?  "I need a line graph of this CSV data..."</p><h2 id="conclusion">Conclusion</h2><p>I want to build a bunch of this, but thought I'd start by writing about it.  
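<p>As a footnote to the sliding context window idea above, here's roughly what I have in mind, as a TypeScript sketch (the token counter is a crude stand-in for a real tokenizer, and the names are illustrative rather than real ChatCraft code):</p>

```typescript
// Hypothetical sketch: keep the system prompt, then pack in as many of
// the most recent messages as fit within the token budget.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Very rough heuristic (~4 characters per token); a real implementation
// would use an actual tokenizer for the target model.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

function slidingWindow(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let budget = maxTokens - system.reduce((n, m) => n + approxTokens(m.content), 0);

  // Walk backwards from the newest message, keeping what fits.
  const kept: ChatMessage[] = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = approxTokens(rest[i].content);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

<p>Older messages silently fall out of the window, which is exactly the manual pruning I do today, just automated.</p>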
Using ChatCraft to build ChatCraft has evolved my understanding of what I want in an AI, and it's fun to be able to prototype and explore your own ideas without having to wait on features (<a href="https://twitter.com/humphd/status/1656372230018723877">or even access!</a>) from big AI providers.</p>]]></content:encoded></item><item><title><![CDATA[ChatCraft.org]]></title><description><![CDATA[<p>I've been continuing my experiments with AI development.  I <a href="https://blog.humphd.org/adventures-with-chatgpt/">wrote previously</a> about my attempts to use ChatGPT more intentionally, as a way to better understand how my students are encountering it.  Since then, I've been focusing on contributing to <a href="https://chatcraft.org/">https://chatcraft.org/</a> and wanted to talk about what it is</p>]]></description><link>https://blog.humphd.org/chatcraft-org/</link><guid isPermaLink="false">6459376e4a8ffa03eff4c8c9</guid><category><![CDATA[seneca]]></category><category><![CDATA[mozilla]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Tue, 09 May 2023 11:40:00 GMT</pubDate><content:encoded><![CDATA[<p>I've been continuing my experiments with AI development.  I <a href="https://blog.humphd.org/adventures-with-chatgpt/">wrote previously</a> about my attempts to use ChatGPT more intentionally, as a way to better understand how my students are encountering it.  Since then, I've been focusing on contributing to <a href="https://chatcraft.org/">https://chatcraft.org/</a> and wanted to talk about what it is and what it's been like to build it.</p><p>My recent AI posts prompted an old friend from my Mozilla days (<a href="https://twitter.com/tarasglek?lang=en">Taras Glek</a>) to reach out on Twitter.  He wanted to talk about our shared interest in AI and programming.  Both of us learned to program long before AI, but we also see the tremendous potential of using AI to accelerate our work going forward.  
We've also been finding that many of our colleagues and peers aren't as interested as we are, and having someone else to talk to and work with on this stuff has been important.</p><p>Taras wanted to show me an experiment he'd been building to create his own open source, programming-focused, ChatGPT web client.  It already had a lot of cool features like being able to render <a href="https://mermaid.js.org/">Mermaid diagrams</a> and HTML in responses from the ChatGPT API.  He'd also hooked up <a href="https://github.com/hwchase17/langchainjs">langchainjs</a>, which is a project I've been following with interest.</p><p>Seeing a pure browser-based web app (no server-side code) really inspired me.  For some reason, all of my AI work thus far has been done using two-tier web apps or with node.js all on the server.  I don't know why it never occurred to me to do all of this in the browser.  Seeing what Taras was doing, it all suddenly clicked for me: I really want my AI to be in the browser.</p><p>Before this, I was using various AI Assistants in VSCode to see what that's like.  I've tried <a href="https://about.sourcegraph.com/blog/open-sourcing-cody">Cody from Sourcegraph</a>, <a href="https://aws.amazon.com/codewhisperer/">Amazon CodeWhisperer</a>, and a few more.  So far this process has convinced me that what I really want is the ability to reason with, and explore ideas in code with an AI vs. having it dump suggestions in my editor.  I love having this be a browser tab vs. an editor extension.</p><p>Like me, Taras had started using GPT via the OpenAI Playground.  We both loved it.  You could try things in a web page, use it or delete it, and keep trying again until you were happy. It was so easy to experiment.  The ephemeral nature of the output (nothing being saved, not integrated with anything you're working on) encouraged playfulness and exploration. Then OpenAI brought out ChatGPT.  I don't need to tell you what it is.  
Again, the <a href="https://twitter.com/humphd/status/1646320648635969539">"it's just a website"</a> phenomenon really struck me.</p><p>Rather than give up on his own UI and use ChatGPT, Taras kept going with what he was building.  He wanted to know if I'd help him with the UI.  It felt like old times, when I was building <a href="https://blog.humphd.org/vocamus-557/">DXR (a web UI) on top of his Dehydra</a> gcc plugins.</p><p>Since then I've been working with Taras to rebuild the UI for what's become <a href="https://chatcraft.org">chatcraft.org</a>.  It's now gotten to the point that I only use it vs ChatGPT or VSCode assistants.  I like how much freedom it gives: paste in your OpenAI API key and you're ready to go.  No logins, annoying rate limits, and the UI and responses are tailored to what a programmer wants vs. being a general purpose chatbot.</p><p>I've also loved being able to build it the way that makes sense to us.  I don't have to wait on OpenAI or some other company to give me what I want--I can build it myself.  The cheap and ready access we have to the underlying models, and the flexibility of the web as a UI and rendering platform is amazing.</p><p>Another unexpected benefit of having an AI-based project is that it's helped me get over the hump of using AI to program.  I don't naturally think to use AI when programming: I've never had access to one in the past, and old habits die hard.  However, writing an AI app has made it obvious that I should be using AI to build it.  I've had all kinds of help from ChatGPT and GPT-4 while writing the code.</p><p>When I'd get stumped on something, I paste in the code and start talking about my bugs.  Because I work in Markdown, it's very similar to writing issues on GitHub.  Often I get what I need back: a push in the right direction and sometimes complete code as well.  I've also been amazed at how it has been able to replace automated tests.  
For example, the other day I was working on a bug in the syntax highlighting code, and I worked with ChatCraft on ChatCraft.  I'd ask it for examples of code blocks, fix the code, repeat, ask about bugs I was seeing, fix things, repeat.  Using the app as an AI-REPL is extremely productive and unlike any programming I've done before.  It's like assembling a robot with the robot's help.</p><p>I'm excited to try using it for some other AI experiments over the next few months. I have a few other collaborations I want to do with friends who are interested in AI, and I'm going to suggest we use ChatCraft to do the work.</p><p>ChatCraft.org is still pretty young, but I love it and wanted to share.  If you'd like to <a href="https://chatcraft.org">give it a try</a> and <a href="https://github.com/tarasglek/chatcraft.org">contribute</a>, please do.  Let Taras and me know what you think.</p><p>UPDATE: Taras has also <a href="https://taras.glek.net/post/chatcraft.org-opensourcechatgpt-for-devs/">written his own post</a> about ChatCraft.</p>]]></content:encoded></item><item><title><![CDATA[Starchart 1.0.0: My.Custom.Domain]]></title><description><![CDATA[<p>Over the Winter 2023 semester, my open source students and I have been working together to build a new project.  This past week we were able to make our 1.0.0 release and get it into production.  I wanted to write about what we made and how we made</p>]]></description><link>https://blog.humphd.org/starchart-1-0-mycustomdomain/</link><guid isPermaLink="false">6447da484a8ffa03eff4c73a</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Tue, 25 Apr 2023 15:07:57 GMT</pubDate><content:encoded><![CDATA[<p>Over the Winter 2023 semester, my open source students and I have been working together to build a new project.  This past week we were able to make our 1.0.0 release and get it into production.  
I wanted to write about what we made and how we made it.</p><p><a href="https://mycustomdomain.senecacollege.ca/">My.Custom.Domain</a> is a web app that allows Seneca faculty and students to create custom domains and obtain SSL certificates without spending any money or having to give any third-party personal information.  The project itself we call <a href="https://github.com/DevelopingSpace/starchart">Starchart</a>.  We built the app in partnership with Seneca ITS, who chose the My.Custom.Domain name, and have provided all our infrastructure and production support.</p><h2 id="a-tour-of-my-custom-domain">A Tour of My.Custom.Domain</h2><p>I'll start with a quick tour of what the app looks like.</p><p>Users begin by authenticating using their Seneca SSO accounts:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-9.57.02-AM.png" class="kg-image"><figcaption>Initial login screen</figcaption></figure><p>After authenticating with Seneca's Active Directory via SAML2, users are taken to the landing page (I'm logged in as a regular user below in development, because my UI looks different as an admin in production):</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-9.58.31-AM.png" class="kg-image"><figcaption>Landing page for managing DNS Records and Certificate</figcaption></figure><p>Users can create and manage DNS Records (A, AAAA, TXT, or CNAME) using a <code>{name}.{user}.mystudentproject.ca</code> domain:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-10.00.05-AM.png" class="kg-image"><figcaption>Managing DNS Records</figcaption></figure><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-10.03.41-AM.png" class="kg-image"><figcaption>Creating a new DNS 
Record</figcaption></figure><p>Users can request a new SSL Certificate, which is created for all of their subdomains (i.e., <code>*.{user}.mystudentproject.ca</code>).  The process takes some time, so we send a notification email when it's done:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-10.04.58-AM.png" class="kg-image"><figcaption>Certificate Request being Processed</figcaption></figure><p>Once the certificate is issued, we provide instructions and various ways to download, view, or copy it.  Users can also renew the certificate when it approaches its expiry date:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-10.07.56-AM.png" class="kg-image"><figcaption>Working with a Certificate</figcaption></figure><p>Finally, admin users can "impersonate" regular users to help fix problems.  In the screenshot below, I'm logged in as "user3" (an admin user) but my <strong>effective username</strong> is "user1" (a regular user):</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-10.08.57-AM.png" class="kg-image"><figcaption>Admins can Impersonate Regular Users to Help Solve Problems</figcaption></figure><p>We also provide information and instructional pages about DNS Records and SSL Certificates to users who are new to the concept (most of our users will be):</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/04/Screenshot-2023-04-25-at-10.12.39-AM.png" class="kg-image"><figcaption>Informational Pages to Help Users</figcaption></figure><h2 id="building-the-app">Building the App</h2><p>The app was built by a passionate team of developers:</p><table>
<thead>
<tr>
<th>Name</th>
<th>GitHub</th>
<th>Blog</th>
</tr>
</thead>
<tbody>
<tr>
<td>Stefan Frunza</td>
<td><a href="https://github.com/sfrunza13">sfrunza13</a></td>
<td><a href="https://dev.to/sfrunza13">https://dev.to/sfrunza13</a></td>
</tr>
<tr>
<td>Mario Leonardo</td>
<td><a href="https://github.com/Ririio">Ririio</a></td>
<td><a href="https://dev.to/ririio">https://dev.to/ririio</a></td>
</tr>
<tr>
<td>Denes Adam Dolhay</td>
<td><a href="https://github.com/dadolhay">dadolhay</a></td>
<td><a href="https://dev.to/dadolhay">https://dev.to/dadolhay</a></td>
</tr>
<tr>
<td>Wonkeun No</td>
<td><a href="https://github.com/Genne23v">Genne23v</a></td>
<td><a href="https://dev.to/Genne23v">https://dev.to/Genne23v</a></td>
</tr>
<tr>
<td>Chen-Yuan Chu</td>
<td><a href="https://github.com/cychu42">cychu42</a></td>
<td><a href="https://dev.to/cychu42">https://dev.to/cychu42</a></td>
</tr>
<tr>
<td>Taimoor Dawami</td>
<td><a href="https://github.com/SerpentBytes">SerpentBytes</a></td>
<td><a href="https://dev.to/tdaw">https://dev.to/tdaw</a></td>
</tr>
<tr>
<td>Eakampreet Singh</td>
<td><a href="https://github.com/Eakam1007">Eakam1007</a></td>
<td><a href="https://dev.to/eakam">https://dev.to/eakam</a></td>
</tr>
<tr>
<td>Tymur Levtsun</td>
<td><a href="https://github.com/Myrfion">myrfion</a></td>
<td><a href="https://dev.to/myrfion">https://dev.to/myrfion</a></td>
</tr>
</tbody>
</table>
<p>We also had fantastic contributions from various community members:</p><ul><li><a href="https://github.com/shawnyu5">Shawn Yu</a> </li><li><a href="https://github.com/sirinoks">Alex Romanova</a></li><li><a href="https://github.com/manekenpix">Josue Quilon Barrios</a></li><li><a href="https://github.com/sykoivisto">Sy Koivisto</a></li></ul><p>We also worked closely with Seneca ITS, represented by Mehrdad Ziaei.  Mehrdad was fantastic to work with and helped me solve many infrastructure and deployment issues, providing us with:</p><ul><li>Load Balancing, Domains, and Clusters for Staging and Production deployments, including shell and sudo access</li><li>SAML2 Active Directory Access</li><li>MySQL Staging and Production databases</li><li>AWS IAM Users and Route53 Hosted Zones for Staging and Production</li><li>Root domains (i.e., mystudentproject.ca)</li><li>Email Accounts for notifications on Staging and Production</li></ul><p>I think it's important to call out how successful this partnership was, since my colleagues and I often like to complain about how difficult ITS is to work with when trying to build our own custom infrastructure.  While it's true that we often don't understand each other, when we actually do collaborate on a project together, it's amazing how fast it can get done.</p><p>I also relied on Chris Tyler and his Linux knowledge to support me in properly configuring various things in production.  Thank you, Chris!</p><h2 id="technologies">Technologies</h2><p>Doing these term-long open source projects with my students, I often try to experiment with new technologies.  It's a great way for me to get caught up on new ways of working in our industry.</p><p>This term was no different, and we worked with a bunch of things that were new to me, including:</p><ul><li><a href="https://remix.run/">Remix.run</a> for our front-end and back-end. I really like Remix.  
It's a great fit for how my brain thinks about web development, with the flattening of front-end and back-end code, emphasis on web platform features over JS when possible, tight integration of all the various pieces, etc.  The only thing I didn't love was the way the development hot-reloader made some server-side patterns hard or impossible.  In the future, I'd probably break some pieces out into separate containers vs. trying to do it all as a Remix app. </li><li><a href="https://www.typescriptlang.org/">TypeScript</a>, which I've used a lot before, but never with students. This went quite smoothly.  Our students take many C/C++ courses in parallel to learning JS, so TypeScript seems to fit nicely into their understanding of how a programming language should work.  The added benefit in reviews and tests, where TypeScript helped us catch problems before we merged them, was amazing.  I'll add more TypeScript to future courses.</li><li><a href="https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-route-53/">AWS and the AWS SDK for Route53</a> with <a href="http://docs.getmoto.org/en/latest/docs/services/route53.html">Moto for mocking Route53</a> locally</li><li><a href="https://letsencrypt.org/">Let's Encrypt</a> for SSL Certificates, and their <a href="https://github.com/letsencrypt/pebble">Pebble containers</a> for testing</li><li><a href="https://nodemailer.com/about/">Nodemailer</a> for email notifications and <a href="https://github.com/mailhog/MailHog">MailHog</a> for mocking and testing</li><li><a href="https://chakra-ui.com/">Chakra-UI</a> for front-end components and styles.  I think Chakra-UI is the best React component framework I've ever used.  I'm using it on other projects now too, and it's so productive. </li><li><a href="https://www.prisma.io/">Prisma</a> for our ORM.  Prisma was great.  
I've suggested we add it to our node.js course, where students learn about ORMs</li><li><a href="https://redis.io/">Redis</a> and <a href="https://bullmq.io/">BullMQ</a> for background job processing.  I love Redis, but I still find BullMQ needlessly complicated in places.  But we made it all work.</li><li><a href="https://samlify.js.org/#/">Samlify</a> for our SAML SSO code and <a href="https://hub.docker.com/r/kristophjunge/test-saml-idp/">SimpleSAMLphp</a> for mocking and testing</li><li><a href="https://playwright.dev/">Playwright</a> for E2E testing.  Playwright gets better and better with every release.  I can't say enough positive things about it.</li><li><a href="https://vitest.dev/">Vitest</a> for Unit Testing.  I've been bumping into Vite and Vitest more and more lately, and it's been pretty smooth.</li><li>Docker, docker compose, and docker swarm for development, testing, and deployment.  I've used and taught Docker for many years, but this was my first experience using Docker swarm.  It's so good.  I don't understand why more people don't use it for smaller apps like this.  The way it manages secrets, does rolling updates, handles load balancing, etc. is all amazing.</li><li>GitHub Actions for CI/CD.  We ran into the usual small bugs, but it's still amazing to use, and did everything we wanted.</li><li>Lots of other things that I won't bother to mention, but you can see in <a href="https://github.com/DevelopingSpace/starchart">our repo</a></li></ul><p>I'm really happy with the tech stack, and would use just about all of this again in future projects.</p><h2 id="becoming-a-user-and-maintainer">Becoming a User and Maintainer </h2><p>I'm excited to start using My.Custom.Domain with my students.  In future versions of my AWS class, I want to have the students use it to work with HTTPS-based APIs, which is something you have to pay for in AWS (not supported in the Learner Lab).</p><p>I'm also keen to get new students involved in maintaining this code.  
I think that one of the benefits of projects like this is that they allow students to jump into different parts of the software lifecycle.  Building something takes one kind of skill; maintaining it takes another.</p><p>If you're a Seneca prof or student and want to try using this, please let me know.  Also, if you're interested in getting involved in the project, we still have a <a href="https://github.com/DevelopingSpace/starchart/issues">bunch of issues to fix</a>.</p><p>It feels good to get to this point in the term and have a project ship with so little fuss.  Not every software project is the right scope for a group of students to do in a semester.  It's fun when you can find one that fits like this.</p><p>Now go and make some domains and certificates!</p>]]></content:encoded></item><item><title><![CDATA[Unnecessarily Hard]]></title><description><![CDATA[<p>Denes and I have been trying to figure out a bug with our Starchart DNS code.  After hooking up the <a href="https://dev.to/dadolhay/week-10-completing-an-order-333n">Let's Encrypt Certificate flow</a>, we were finally in a position to test working with <code>_acme-challenge</code> <code>TXT</code> records in Route53.</p><p>Locally and in CI, we've been using the amazing <a href="http://docs.getmoto.org/en/latest/docs/services/route53.html">moto route53</a></p>]]></description><link>https://blog.humphd.org/unnecessarily-hard/</link><guid isPermaLink="false">6422f5964a8ffa03eff4c631</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Tue, 28 Mar 2023 14:56:59 GMT</pubDate><content:encoded><![CDATA[<p>Denes and I have been trying to figure out a bug with our Starchart DNS code.  
After hooking up the <a href="https://dev.to/dadolhay/week-10-completing-an-order-333n">Let's Encrypt Certificate flow</a>, we were finally in a position to test working with <code>_acme-challenge</code> <code>TXT</code> records in Route53.</p><p>Locally and in CI, we've been using the amazing <a href="http://docs.getmoto.org/en/latest/docs/services/route53.html">moto route53</a> server to test things.  It's given us the ability to develop the majority of the code quickly, but production is obviously different than mocking.  Route53, and DNS in general, is hard to simulate.  There's just so much that can go wrong, so many interconnected pieces, timing issues, etc.</p><p>Over the weekend we got into a loop of finding a bug and fixing a bug, finding another, fixing another. Eventually this has to work, right? How many can there be...</p><p>The process of finding, debugging, and fixing these bugs with AWS is unnecessarily hard.  I really like AWS and I even teach an upper-semester course on it.  But I feel like AWS makes things harder than they need to be for no apparent benefit.</p><p>Let me show you what I mean with one of the bugs.  Let's Encrypt needs us to set these  <code>_acme-challenge</code> <code>TXT</code> records in Route53.  I need to put the string value <code>8cV4hs2A8VmH3a2f2QYkvANYtXZWm9I93kUXYZtiGgE</code> into <code>_acme-challenge.whatever.com.</code> as a <code>TXT</code> record.  To do this in node.js, you need to use the AWS SDK and the <a href="https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/clients/client-route-53/classes/changeresourcerecordsetscommand.html">ChangeResourceRecordSetsCommand</a>.  It lets you specify an array of changes you want to apply: <code>CREATE</code>, <code>DELETE</code>, <code>UPSERT</code>.  For example:</p><pre><code>// The following example creates a resource record set that routes Internet traffic to a resource with an IP address of 192.0.2.44.
import { Route53Client, ChangeResourceRecordSetsCommand } from "@aws-sdk/client-route-53";

const client = new Route53Client({});

const input = {
  "ChangeBatch": {
    "Changes": [
      {
        "Action": "CREATE",
        "ResourceRecordSet": {
          "Name": "example.com",
          "ResourceRecords": [
            {
              "Value": "192.0.2.44"
            }
          ],
          "TTL": 60,
          "Type": "A"
        }
      }
    ],
    "Comment": "Web server for example.com"
  },
  "HostedZoneId": "Z3M3LMPEXAMPLE"
};
const command = new ChangeResourceRecordSetsCommand(input);
const response = await client.send(command);</code></pre><p>This is creating an <code>A</code> record with a single IP address <code>Value</code>.  We need to use a <code>TYPE</code> of <code>TXT</code>, but there is no example or mention of this.  Surely that means it's the same, right?  The API docs seem to imply this, noting of <code>Value</code>:</p><pre><code>ResourceRecords: [
  {
    Value: "STRING_VALUE", // required
  },
],</code></pre><p>"So, <code>ResourceRecords</code> is an <code>Array</code> of <code>Objects</code> with a <code>string Value</code>, got it."  No other details or examples of working with <code>TXT</code> records.</p><p>Off we go, and despite everything we throw at it, back comes a <code>400</code> error with <code>InvalidChangeBatch</code>.  Let's check the docs on this error:</p><pre><code>Throws: InvalidChangeBatch (client fault)

This exception contains a list of messages that might contain one or more error messages. Each error message indicates one error in the change batch.</code></pre><p><em>"Might contain,"</em> eh?  I'll skip ahead and tell you that it does not contain anything useful:</p><pre><code>"Code":"InvalidChangeBatch", "Type":"Sender", "name":"InvalidChangeBatch"</code></pre><p>Alright, so we're doing something wrong, but what?  We try half-a-dozen things, and fix some other bugs, but nothing will unlock this <code>InvalidChangeBatch</code> problem.  Now we're neck-deep in AWS browser tabs, with articles about every possible way to deal with Route53 other than what we need to do.</p><p>Eventually, deep in a Stack Overflow comment (not the answer!) I see someone toss out a lifeline: <em>"FYI, I needed to wrap my value in <code>\"...\"</code> in case that helps anyone."</em>  That not only helps me, but <strong>it's what should have been written in the official API docs to begin with</strong>!  Why is this so hard to find?</p><p>Armed with this new crumb of information, I again go searching in the AWS documentation labyrinth, and discover <a href="https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/ResourceRecordTypes.html#TXTFormat">this</a>:</p><blockquote>A TXT record contains one or more strings that are enclosed in double quotation marks (<code>"</code>).</blockquote><p>Furthermore, there are all kinds of other special cases you should deal with, including:</p><pre><code>A single string can include up to 255 characters, including the following:
  - a-z
  - A-Z
  - 0-9
  - Space
  - (hyphen)
  - ! " # $ % &amp; ' ( ) * + , - / : ; &lt; = &gt; ? @ [ \ ] ^ _ ` { | } ~ .

If you need to enter a value longer than 255 characters, break the value into strings of 255 characters or fewer, and enclose each string in double quotation marks ("). In the console, list all the strings on the same line:

"String 1" "String 2" "String 3"

For the API, include all the strings in the same Value element:

&lt;Value&gt;"String 1" "String 2" "String 3"&lt;/Value&gt;

The maximum length of a value in a TXT record is 4,000 characters.</code></pre><p>So, in other words, a bit more than <code>Value: "STRING_VALUE", // required</code>.  There's no mention of any of this in the API "docs."</p><p>As we're looking through this list of DOs and DON'Ts for a <code>TXT</code> record value, I ask myself: "Surely someone has written this code already, I wonder where we can get it?"  Denes jokingly says, "you mean like having it in the AWS SDK?"  I laugh.  Of course it isn't there.  Why would this be part of the SDK?  But seriously, AWS knows that I'm setting a record with a <code>Type</code> of <code>TXT</code>, and when I give them a <code>Value</code>, why not simply do the right thing?  Or at the very least, expose a function that formats a <code>TXT</code> record according to <em>their own specifications</em>!  Or give me an error message that says "invalid TXT value, USE QUOTES!"</p><p>It doesn't need to be this hard.  These docs could link to each other, so you can find the details instead of relying on the kindness of strangers on the internet.  The errors could actually tell you what's wrong, even link to URLs with the info you need.  The API could expose methods to help you get your job done and show you examples for edge cases (are <code>TXT</code> records really an edge case?).</p><p>If AWS were some fledgling startup that had just shipped a beta and hadn't had time to get to the documentation, or if Route53 were some new service that hadn't been battle-tested yet, I'd be more sympathetic.  But that's not what this is.  This is unnecessarily hard.</p>]]></content:encoded></item><item><title><![CDATA[A Week with ChatGPT]]></title><description><![CDATA[<p>ChatGPT continues to rule the news cycle.  
It's also increasingly finding its way into my work on our department's Academic Integrity Committee, where we've seen it pop up more and more in reported cases: students using it on tests, both paper-based (cell phone pic of a question uploaded to a</p>]]></description><link>https://blog.humphd.org/adventures-with-chatgpt/</link><guid isPermaLink="false">642069764a8ffa03eff4c34b</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Sun, 26 Mar 2023 18:50:12 GMT</pubDate><content:encoded><![CDATA[<p>ChatGPT continues to rule the news cycle.  It's also increasingly finding its way into my work on our department's Academic Integrity Committee, where we've seen it pop up more and more in reported cases: students using it on tests, both paper-based (cell phone pic of a question uploaded to a WhatsApp group, where a friend types it into ChatGPT) and electronic, as well as in assignments.  Much of the academic dialog online and at our institution has been focused on ChatGPT as a vector for cheating.</p><p>Meanwhile, a decade's worth of innovation has happened in the past few weeks as an arms-race plays out for control of AI in big tech.  I was trying to keep track of everything I saw being released, but it's impossible.  The rate of change is beyond anything I've ever experienced.  The idea that an instructor or institution can write a policy that gets on top of this is laughable.</p><p>Despite my concerns over academic integrity, I'm more interested in understanding how to properly use AI in teaching, learning, and development. As a result, I've been trying to take inspiration from people like <a href="https://twitter.com/simonw/status/1639691902357483520">Simon Willison</a>, who is using ChatGPT and Copilot to learn new technologies (e.g. 
<a href="https://simonwillison.net/2022/Dec/5/rust-chatgpt-copilot/">Rust</a>, <a href="https://simonwillison.net/2022/Dec/5/rust-chatgpt-copilot/">AppleScript</a>) and taking notes as he goes.</p><p>So this past week I challenged myself to use ChatGPT, both to better understand how my students are encountering it and to learn how to teach them to make better use of the responses they are getting.</p><p>I started the challenge by <a href="https://github.com/humphd/chatcss">building ChatGPT in CSS</a> for my second-semester web programming class.  We were studying CSS Layout, and the ChatGPT site is a fairly simple one to create, using the topics for the week: </p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/03/image.png" class="kg-image"><figcaption>ChatCSS - recreating ChatGPT in CSS for my Web Programming class</figcaption></figure><p>Writing the code, I spent a long time using their interface. I was struck by the <strong>Limitations</strong> column, which I doubt most people have read:</p><ul><li>May occasionally generate incorrect information</li><li>May occasionally produce harmful instructions or biased content</li></ul><p>These are <strong>incredible</strong> warnings to see displayed on a web site.  <em>"This site might harm you."</em>  Wow!  We could argue that this warning should be placed on lots of the web, but to see it called out like this is fascinating and not something I encounter very often.  I thought about that a lot this week.</p><p>I went to class on Monday, plugged my laptop in, and started with this ChatCSS tab open.  <em>"Does anyone recognize this site?"</em>  Silence! Eyes down. Nervous laughter.  It was like I had just begun a lecture about sex.  It's clear that ChatGPT has quickly earned taboo status in academic discourse between professors and students.  
That's too bad, because I believe it needs to get discussed openly.</p><p>Spending two lectures this week building the ChatCSS UI and talking about how it works allowed me to start to engage with the students, hear their ideas, discuss their questions.  They are absolutely using it, but mostly having to hide that fact.  They need help navigating this new world.  From this experience I realized that I need to talk about it more, not less.</p><p>Next, I did a bunch of reviews for my open source students working on the <a href="https://github.com/DevelopingSpace/starchart">Starchart</a> DNS/SSL project.  A number of times as I read their code, I asked questions about what I was seeing, and the answer came back, <em>"ChatGPT recommended this."</em>  I've never encountered this before.  ChatGPT was a third entity in our conversations, always <em>there</em> but also completely absent.  For example, it wanted to use a <code>403</code> HTTP status code to indicate that a resource being created asynchronously isn't ready yet (I'd use a <code>409</code>).  Or writing regular expressions to cover validation cases on IPv4, IPv6, and domain names.  I wanted to know about edge cases that were missed.  Having ChatGPT's output, but not ChatGPT itself, made the process a bit painful.  The code raised questions that couldn't be answered, which felt unsatisfying.  I learned that detaching text from a ChatGPT response, while still insisting that "ChatGPT wrote it," won't work.  Once you pull that text out of the ChatGPT window, you have to own it, answer for it, clean up after it.</p><p>I had a number of times this week where I struggled to name something in code, and I tried using ChatGPT to help me explore possible options.  For example, we have been dealing with overlaps between <code>name</code> vs. <code>subdomain</code> vs. <code>fqdn</code> in our Starchart code.  
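To illustrate the kind of overlap being weighed here, a hedged sketch with hypothetical values (these are not Starchart's actual variables, just the three candidate forms of the same identifier):

```javascript
// Hypothetical illustration of the naming overlap: the same DNS entry
// can be referred to by three progressively longer names.
const name = "www";                             // the bare record name
const subdomain = "www.jsmith";                 // record name plus the user's portion
const fqdn = "www.jsmith.mystudentproject.ca."; // fully qualified, under the root domain
```

Picking which of these a given function should accept is exactly the sort of decision that kept coming up.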
I didn't end up using a lot of what it suggested, but I think the exercise of forcing myself to get it out of my head, and into written form, helped me come to a decision faster.  In this way, using ChatGPT was a forcing function.  Much like writing an email to a colleague or discussing an issue with peers on Slack, writing my problem out kickstarted the problem-solving process.</p><p>Later in the week I used ChatGPT to solve a real problem I had.  I was struggling to write some GitHub Actions code for a Docker build and webhook deployment flow.  GitHub Actions are annoying to test, because you have to do it for real vs. running things locally.  I couldn't figure out or remember how to do the following:</p><ul><li>Interpolate environment variables into JSON strings</li><li>Get a short git sha (I only wanted the first part of the string)</li></ul><p>I found ChatGPT was eventually able to get me where I needed to be, but it took half-a-dozen questions/responses to get there.  For example, it told me to use the <code>substring</code> function:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/03/Screenshot-2023-03-26-at-12.25.04-PM.png" class="kg-image"></figure><p>However, there is <a href="https://docs.github.com/en/actions/learn-github-actions/expressions">no such function</a> (GitHub, please add it!), so I had to press it:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/03/Screenshot-2023-03-26-at-12.26.02-PM.png" class="kg-image"></figure><p>It did help, and I was able to write my workflow YAML file and get automatic deploys to work.</p><p>Next, I struggled to make sense of the <a href="https://docs.bullmq.io/">BullMQ API</a> for dealing with errors in flow jobs.  We have a series of asynchronous worker jobs that run in the background, and we need to chain them.  
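For context, BullMQ expresses chained jobs as a "flow": a parent job is processed only after all of its children complete. A minimal sketch of the tree shape a `FlowProducer` accepts (job and queue names here are illustrative, not Starchart's actual ones):

```javascript
// Hedged sketch: the tree of dependent jobs a BullMQ FlowProducer accepts.
// A parent job runs only after all of its children succeed, so the deepest
// child ("create-txt-record") is processed first. Names are illustrative.
const flowTree = {
  name: "issue-certificate",
  queueName: "certificates",
  children: [
    {
      name: "complete-dns-challenge",
      queueName: "dns",
      children: [{ name: "create-txt-record", queueName: "dns" }],
    },
  ],
};

// With a live Redis connection, submitting the flow looks roughly like:
//   const { FlowProducer } = require("bullmq");
//   const flow = new FlowProducer({ connection });
//   await flow.add(flowTree);
```

The tree itself is straightforward; what happens to the parent when a child fails is the part that caused the trouble here.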
When one of them fails, we need to execute an error handler, and the way we were doing it was not working.</p><p>I was pleased to discover that simply mentioning the name of the library, BullMQ, was enough, and it knew what I was talking about.  However, once again, I had to be patient to arrive at my desired destination.  I went back and forth with it many times before I could get it to stop hallucinating imaginary APIs and had to explicitly call out its inconsistency:</p><figure class="kg-card kg-image-card"><img src="https://blog.humphd.org/content/images/2023/03/Screenshot-2023-03-26-at-12.30.43-PM.png" class="kg-image"></figure><p>Not only is ChatGPT going to give you incorrect info, but it also has no problem contradicting itself in a single response.  This is something I'm not used to.  If a person did this to me, I'd get annoyed; in fact, I was annoyed in the exchange above.  Humans won't tolerate being lied to or manipulated like this.  This erosion of trust breaks down communication: "How can I trust you if you just lied to me?"  The truth is, you can't trust ChatGPT.  It neither lies nor tells the truth, flitting effortlessly between the two, giving what Neil Gaiman called "information-shaped sentences" on <a href="https://twitter.com/neilhimself/status/1639610373115375616">Twitter this week</a>.</p><p>You have to approach ChatGPT with zero trust and plan to verify everything.  Knowing this, I can route around incorrect information, and try to aim deeper at the truth I really need.  This tells me that using responses in areas I know nothing about is going to leave me vulnerable to misinformation.  My students, who don't know enough yet to push back at the AI, will struggle when it gives reasonable-looking but false code.</p><p>Later in the week, I asked it to rewrite a TypeScript file I was reviewing.  I wanted to see what other styles it could imagine for the code, which was somewhat complex to read.  
It did a nice job of renaming things, switching loop types to make it more readable, using early returns (amen!), etc.  It also broke the logic of the code.  When I shared it with the original author, he wasn't impressed.  Where I found the <em>style</em> vs. the <em>substance</em> of the code interesting, he was 100% focused on the <em>substance</em>, which had problems.  This taught me that I need to be clear about my expectations for the use of the text I share from ChatGPT: am I looking for truth or form or both?  What does my audience need and expect?</p><p>Next, I had some free time and wanted to figure out how to get <a href="https://github.com/microsoft/CameraTraps#whats-megadetector-all-about">Microsoft's Megadetector</a> model running with node.js and TensorFlow.  I'm really interested in Camera Traps and biodiversity tracking using technology.  I spent a long time going back and forth with ChatGPT on this one, before eventually giving up.  First, it gave me code to use it remotely via Azure vs. locally (not a bad idea, actually).  Once I told it that I wanted to run the model locally, it got me 80% of the way there.  However, it gave me invalid model URLs, combined Python/pip and node.js/npm dependencies, and made up an npm module for the image processing.  Once again, I had to be willing to sort through the response and push back in order to have it correct its mistakes.</p><p>This can feel uncomfortable, too, since a human at the other end of my needy and critical responses would quickly tire of our conversation.  ChatGPT doesn't, and will happily try again.  Eventually it became clear that the 2021 model cut-off date wasn't going to work with the current state of the Megadetector repo, and I ran out of time.  However, if I'd been willing to keep going a bit longer, I could have got something working.</p><p>Next, I decided to port some Python code I'd read online to <a href="https://ziglang.org/">Zig</a>.  
I don't know why I've been interested in Zig lately, but I keep coming back to it (I think where Rust feels too much for me, Zig somehow feels just right).  I was pleasantly surprised to see how easily this process went.  Once again, it hallucinated a mix of pip modules and zig standard library code; however, by now I was used to sorting out the 20% of the response that would need to be thrown away.  I liken the process to working with a senior dev who only has 5 minutes to answer your question: the response they give is "right" but includes a mix of real and pseudo-code that you have to expand and debug yourself.  Knowing how to set my expectations (that I'll almost never be able to use code it returns as-is, that responses won't be idiomatic to the ecosystem I'm targeting and will often mix ideas from different languages) has made the process less frustrating for me.</p><p>Next I tried something completely different and more risky. Having read <a href="https://twitter.com/peakcooper/status/1639716822680236032?t=WQoH_znL-Oa3s-D9qYa8oQ&amp;s=19">this thread</a> about a guy who used ChatGPT to help save his dog's life, I tried doing some medical research.  Given everything I've said above about needing to be able to verify all responses, not being able to trust at least 20% of what comes back, etc. you might think this is insane.  Again, expectation setting is key here.  I can't use what it gives back as-is, but by now I don't plan to ever do that.  Here, I'm interested in having it guide and inform my research.</p><p>I asked about a condition I wanted to know about, starting by asking what the usual way of treating it would be (i.e., without me saying anything). I wanted to see if it could give me information I already know to be true, before going into areas I'm less sure about.  It was able to confirm current treatment options, which I know are correct.  I then asked for the state of the research into newer methods: what's emerging?  
It gave me 3 amazing jumping-off points, which sent me on a fruitful literature review and opened my eyes to some interesting possibilities I hadn't heard about.  Here I needed to pair ChatGPT with research into the scientific literature.  Unlike code, where I can lean on a mix of my own understanding and tools to verify what I'm being told, I have no ability to assess the value of medical information.  One needs to be ready to jump out of ChatGPT and into proper research, but doing so can be a really valuable exercise.  I know from experience that I could never get Google to give me this info (I've tried)--I don't know how to put it into a searchable question format.  However, after working with ChatGPT, Google once again becomes useful, since I have keywords I can use for search.</p><p>Finally, I used ChatGPT to help me quickly do a task that would have been time-consuming to code or do manually.  I had a bunch of CSS, and I wanted a sorted list of all the unique CSS properties I'd used.  I pasted the entire file into ChatGPT and told it to give me exactly that, which it did.  This was probably the most satisfying use I'd had all week.  It was something I knew how to do manually, and could also write a program to do, but I wanted it done fast without doing either of those.  It "just worked" and got it 100% correct.  It made me realize that this kind of language/text programming problem is something I should be feeding to ChatGPT more often.  I do a lot of it while teaching, coding, and in my daily life.</p><p>This week I found that I had to force myself to use ChatGPT; it wasn't my default.  I found the realities of using it both more and less magical than expected.  As predicted, I can't trust it.  However, if I enter the conversation somewhat adversarially, I can often extract something useful.  I'm not talking to AI.  We're not chatting.  It's much more combative, defensive, and frustrating than talking to a person.  
I have to verify everything I'm given, which means I have to be in the mood to do that.  I can't be tired and let the response wash over me.  But I also recognize the value of what I was able to do with it this week.  I solved real problems, from technical to academic to personal.  It's interesting and I need to try it more.</p>]]></content:encoded></item><item><title><![CDATA[Starchart 0.7]]></title><description><![CDATA[<p>Today we're shipping <a href="https://github.com/DevelopingSpace/starchart">Starchart</a> v0.7.  This felt like a big one, so I wanted to discuss what we did. I'd say the theme of this release is "2 steps forward, 1 step backward."  I'll discuss some of the highlights.</p><h2 id="ci-cd-and-staging">CI/CD and Staging</h2><p>First, we now have a working</p>]]></description><link>https://blog.humphd.org/starchart-0-7/</link><guid isPermaLink="false">642054ed4a8ffa03eff4c1c7</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Sun, 26 Mar 2023 15:40:45 GMT</pubDate><content:encoded><![CDATA[<p>Today we're shipping <a href="https://github.com/DevelopingSpace/starchart">Starchart</a> v0.7.  This felt like a big one, so I wanted to discuss what we did. I'd say the theme of this release is "2 steps forward, 1 step backward."  I'll discuss some of the highlights.</p><h2 id="ci-cd-and-staging">CI/CD and Staging</h2><p>First, we now have a working CI/CD pipeline to staging!  This has taken much longer than I anticipated, and required all sorts of people to lend a hand (thank you to many members of the team who helped me, to Mehrdad in ITS, Chris Tyler, and even ChatGPT!).  Our <a href="https://github.com/DevelopingSpace/starchart/blob/main/.github/workflows/ci.yaml">GitHub Actions Workflow</a> runs all of our tests and static analysis tools, then builds and pushes a Docker image to the <a href="https://github.com/DevelopingSpace/starchart/pkgs/container/starchart">GitHub Container Registry</a>.  
We then use a <a href="https://github.com/adnanh/webhook">webhook</a> from GitHub Actions to staging to trigger an update of the running Docker swarm to the new image tag.</p><p>ITS asked us to see if we could avoid sending secrets over the network, which we're doing via an HMAC SHA-256 digest on the webhook body.  We have the same encrypted secret on GitHub Actions as well as the staging server, and use it to calculate our digest, which has to match for the webhook to get processed.  It works really smoothly.</p><p>A major benefit of having a live CI/CD pipeline like this is that you can more easily test features as they land.  But testing leads to finding bugs, and this week we found lots of bugs!  For example, we learned that Let's Encrypt won't support <code>_</code>s in domain names (our AWS Route53 Hosted Zone had <code>_stage_</code> in it).  We also learned that our SMTP setup for email notifications, which works great in development, isn't working against Office365.</p><h2 id="reconciling-our-tech-debt">Reconciling our Tech Debt</h2><p>Another thing testing does is reveal inadequate architectural choices.  Imagine building a bridge and working from both ends simultaneously toward the middle.  Eventually, a small error on one side is going to reveal itself in the middle when things don't line up the way they should.</p><p>We experienced this when we landed our Let's Encrypt certificate order code this week and tried to use it for the first time on staging.  Denes and I were excited to witness this small victory live, but the server logs told us a different story: our DNS data model and internal API wouldn't work the way we wanted.  This realization has led us to move to a new approach, which Denes spent the weekend hacking together.</p><p>Our DNS works across two systems: 1) we store DNS record data in MySQL; 2) we deploy those records to AWS Route53.  Previously, we did atomic DNS changes.  
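To make the HMAC scheme above concrete, here is a minimal sketch using Python's standard library (names are illustrative, not Starchart's actual code):

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, body: bytes, received_digest: str) -> bool:
    """Recompute the HMAC-SHA256 digest of the webhook body and compare it,
    in constant time, against the digest the sender attached."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_digest)
```

Because both sides hold the same secret and the digest covers the entire body, a request that was tampered with in transit (or sent by someone without the secret) fails the comparison, and no secret ever crosses the network.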
If someone wanted to add a DNS record, we issued a request, waited for Route53 to sync the change, then updated our MySQL database.  However, we're going to move to a kind of "Virtual DOM" approach.  In <a href="https://legacy.reactjs.org/docs/faq-internals.html">much the same way that React deals with state and applies changes to a virtual DOM</a> before diff'ing the real DOM, we're going to maintain our DNS "state" in MySQL and reconcile that with Route53.  We take the state of the database and our hosted zone's records and do a comparison: if a record exists in Route53 but not in MySQL, it needs to be deleted; if a record exists in MySQL but not in Route53, it needs to be created; and if it exists in both but differs, it needs to be updated to match.</p><p>Using this new DNS "reconciler," we'll continually sync the state of our database with Route53 in the background, allowing us to simplify the app in the foreground, where we can work exclusively with the database as our "source of truth" and ignore Route53 completely--eventually Route53 will match our database.  It will also allow us to quickly reflect the database into Route53 when we make a move (e.g., when we moved to a new hosted zone this week to get rid of the <code>_stage_</code> name).</p><p>Making this overhaul at such a late stage of development is a concern.  Do we have enough time to properly shift all of our code over to the new architecture, fix the bugs it causes, and also complete the other existing work we have?  It's a great question!  I'm fairly confident that we can, but it's going to require a bunch of co-ordination across the team to sort out all the various systems that need updating. Changes of this scale aren't without risk and cost.</p><p>Another side effect of this change is that it's going to trigger us to delete a huge amount of code and tests that Won and others worked hard to write over the past 2 months.  
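At its heart, the reconciler's comparison above is a dictionary diff. A rough sketch, with records simplified to name/value pairs (real Route53 records also carry type, TTL, and multiple values):

```python
def plan_dns_changes(database: dict[str, str], route53: dict[str, str]):
    """Diff desired state (the database) against actual state (Route53) and
    return the operations needed to make Route53 match the database."""
    creates = {name: value for name, value in database.items()
               if name not in route53}
    updates = {name: value for name, value in database.items()
               if name in route53 and route53[name] != value}
    deletes = [name for name in route53 if name not in database]
    return creates, updates, deletes
```

Run in a background loop, this lets the app write only to the database, while the reconciler eventually makes Route53 agree.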
I know <a href="https://blog.humphd.org/vocamus-1626/">from experience</a> that it can feel pretty demoralizing to watch your hard work get flushed down the drain.  "What was the point of all that?" And yet, without that amazing code having been written in the first place, there's no way that we could have come to understand what our design should <em>actually</em> look like.  Brooks says it better than I can:</p><blockquote>“Where a new system concept or new technology is used, one has to build a system to throw away, for even the best planning is not so omniscient as to get it right the first time. Hence plan to throw one away; you will, anyhow.”  --Fred Brooks, "The Mythical Man-Month"</blockquote><p>Like all project leaders, I wish I'd figured this out sooner. What if I'd insisted on more integration testing earlier in the process?  Would we be at this stage now, or would it have dragged the inevitable out longer?  The truth is, this always happens. Real software projects are full of this stuff and, above all, I'm trying to give the students an authentic experience of real-world, open source software development.  I'd say this is "mission accomplished!"</p><h2 id="other-notable-changes">Other Notable Changes</h2><p>I took a stroll through the changesets from 0.6.0 until today, and it's packed with amazing work by the team.  Here's some of what I saw:</p><ul><li>Won added code to limit the number of records a user can create, and fixed a bug where we counted the Let's Encrypt <code>TXT</code> records in that value by mistake.</li><li>Tymur fixed the UI buttons for the DNS Records to work properly when they shouldn't be usable and added validation to our create/edit form, which is amazing.</li><li>Eakam chipped away at many of our front-end "papercuts": fixing the styling on the Renew button in the DNS Records table, improving the front page's accessibility, and creating a first-run experience when users create their first DNS Record.  
We also got new icons and a favicon added (goodbye Remix logo!) and fixed some HTML issues that Lighthouse exposed.</li><li>Eakam and Chen-Yuan worked a bunch on our tests and test infrastructure, and we added support for <a href="https://www.youtube.com/watch?v=jF0yA-JLQW0">Playwright's new and very cool <code>--ui</code> flag</a>. </li><li>TD wrote an epic PR to rename DNS Domain to DNS Record across the entire tree--all APIs, the data model, etc.--to standardize our naming.  Won did something similar with the terms Subdomain and FQDN (fully-qualified domain name). We've needed both of these for a while.  He also got our notifications wired up to the DNS worker, which I discussed above.</li><li>Stefan got our SLO flow finalized and fixed a bad bug we had with overly strict cookie settings.  The SAML stuff on staging has been working amazingly well!</li><li>I rewrote our DNS worker flow so it properly dealt with error cleanup, and we learned a lot about BullMQ's flows and error patterns this week. </li><li>Denes finished the Let's Encrypt certificate order flow and landed the first version of our DNS reconciler code, which I discussed above.</li><li>Mario's been working to get the front-end pieces of the certificate order logic landed, which he <a href="https://user-images.githubusercontent.com/32577022/227579673-cbcefb3c-aa38-4b15-9102-38e01f989998.mp4">demoed on GitHub this week</a>. It will land in 0.8, I think.</li><li>Dependabot updated a bunch of our deps.  Good bot!</li></ul><h2 id="toward-0-8">Toward 0.8</h2><p>This week we're going to focus on paying down the tech debt I outlined above, and also try to get the last of our main features in place. Stefan, Tymur and others are working on adding an Admin UI and the ability for admin users to assume the role of a regular user.  Mario, Eakam and others are finalizing our UI to add missing copy, include instruction pages for users, etc.  
We also need to add the logic for disabling a user, finish implementing all the expiration and notification logic, add error UI, and handle dozens of other small things.</p><p>There's lots to do, but also lots that's already done.  We're making good progress.</p>]]></content:encoded></item><item><title><![CDATA[On Tedium]]></title><description><![CDATA[<p>This week I had a chance to talk with <a href="https://pg.ucsd.edu/">Philip Guo</a>, an associate professor at UC San Diego.  He's been interested in how CS/programming instructors are dealing with the influx of LLMs, like ChatGPT/Copilot, in their classrooms.  After talking with more than a dozen people in the US,</p>]]></description><link>https://blog.humphd.org/on-tedium/</link><guid isPermaLink="false">640e27504a8ffa03eff4c07a</guid><category><![CDATA[seneca]]></category><dc:creator><![CDATA[David Humphrey]]></dc:creator><pubDate>Sun, 12 Mar 2023 20:28:14 GMT</pubDate><content:encoded><![CDATA[<p>This week I had a chance to talk with <a href="https://pg.ucsd.edu/">Philip Guo</a>, an associate professor at UC San Diego.  He's been interested in how CS/programming instructors are dealing with the influx of LLMs, like ChatGPT/Copilot, in their classrooms.  After talking with more than a dozen people in the US, he saw my "<a href="https://blog.humphd.org/cheatgpt/">CheatGPT</a>" post on Hacker News and reached out for my Canadian perspective.</p><p>I wanted to write down some of what we discussed, since the many AI conversations I've been having recently are forcing me to ask myself important questions–I still lack answers, but the process is helping me get closer to where I want to be.</p><p>We started with how I've used these tools.  What I've found most compelling so far has been their ability to help me accelerate difficult tasks.  
For example: writing distractors for multiple-choice questions or developing similar types of problems for students to work on, without repeating any of the core ideas (i.e., "I need a question like this one, but make it different").  Philip talked about the idea of something being easy to recognize, while hard (i.e., tedious) to generate.  I also talked about my <a href="https://blog.humphd.org/pouring-language-through-shape/">experiments building an infinite API backend</a>, which strictly adheres to JSON formats. I've also been playing with the new <a href="https://platform.openai.com/docs/guides/chat/chat-vs-completions">Chat completions API</a> to see what I can do with <a href="https://minimaxir.com/2023/03/new-chatgpt-overlord/">the extra System Prompt</a>–so far I've been having good success building a chatbot that helps teach students about coding concepts, but never gives them actual code snippets.</p><p>I also talked about one of my big worries with early-semester CS students, namely, that they will overlook the value of going slow while learning, mistaking AI's speed for their own ability.  I don't have a great way to describe this idea yet.  I'm fascinated with how tasks that seem tedious can feel slow for different reasons.  I asked Philip if he ever uses AI when coding: "I tried Copilot when it came out, but at some point I lost access and never felt compelled to go restore it."  This is largely how I feel, too.  I don't really need AI to write code.  It's neat that I can, but in languages where I'm proficient, I can already code at about the same speed that I can type.</p><p>So what about a language where I'm not proficient?  What if I put myself back into the position of my students?  Here I run into a new problem.  How do I judge the quality of the code I'm being presented?  I see lots of people on Twitter enjoying themselves as they prototype new apps in an afternoon with LLMs: "I don't know any Python and I just wrote this!"  
I think that's fantastic, and I need to try doing it myself.  But I also know that any software you plan to use beyond prototyping requires all kinds of verification, contextual and domain knowledge, security considerations, bug fixes, etc.  ChatGPT and Copilot are just as likely to make things up and miss important details when writing code as they are when hallucinating prose.  I saw someone online discussing how Copilot had written code with a division-by-zero error, and another showing how it had confused seconds and milliseconds in an algorithm.  There's a reason why even the best programmers get their code reviewed by each other.  Writing a correct, secure, fast program is hard!</p><p>Speaking of code review, I already do a lot of it with my students.  What I've experienced in my open source courses is that, for the most part, students don't know how to do it (yet), and are largely afraid to try (for fear they'll miss things or reveal their own shortcomings).  Running your eyes and mouse over a piece of code isn't code review.  However, a lot of the reason why students struggle to do this well is that they are taught to write code, not to read it.  We haven't prepared them for it.</p><p>Philip and I talked about how literary educators often rely on having their students read texts much more complex than anything they expect them to write.  I've always subscribed to this approach, requiring my students to work on and in code that is beyond their current skill level.  I've always believed that if you want to become a better writer, you should focus on reading more than writing.  But if our current students aren't being taught to wrestle with large codebases as a reading exercise, how will they cope with AI-generated code that may or may not be valid?  The reality is that new developers <a href="https://twitter.com/Carnage4Life/status/1633449630557171712">don't have the capacity to debug</a> problems in code they already don't understand.  
"Look at how much of the code it just wrote for me!"  Amazing.  How many bugs did it include for free? Without a major shift in the way we teach, moving from write-only to read-and-write, I don't see how learning to program is compatible with AI dropping 25-50 lines of code in your editor.  Text isn't a reusable component.  No one can be handed a bunch of text as-is and hope to combine it with what they already have.</p><p>We also discussed the loss of traditional approaches to assessment.  One of my favourite assignments for my programming students is to ask them to write various functions, giving them a set of tests and a comment block that describes how the function should work in prose.  The difficulty I see now is that what I've just described is a prompt for learning, and hence, a prompt for ChatGPT and Copilot.  As a result, marking assignments has become a Turing Test: did you or the AI write this code?</p><p>Back to what I was saying about tedium.  I need to try and understand how to differentiate "slow" as in "I'm learning how to do this and I'm not very fast yet" from "slow" as in "this takes forever, despite my skill level."  We're told that AI is here to free us from the latter.  But in an educational setting, where students are slow in both senses, I need to find compelling ways to help them understand how to build patience as they struggle to acquire new skills.  And I need to do this while not withholding the power of AI to free them from working on pointless exercises.  To pretend that AI has no place in the classroom is ridiculous.  But what are its limits?</p><p>I told Philip that I'm planning to have my open source students work with the ChatGPT API in the fall (or whatever it's called by then), building programs that leverage LLMs to solve their own problems.  I want to see what students would do if they were allowed to work with this technology outside of issues of academic integrity.  
I think the only way I'm going to get closer to understanding how to sort this out is to plunge myself into something beyond my own skill level.  I also think I need to do it with my students vs. coming up with a plan on my own.</p><p>I'm enjoying the many chances I've had lately to talk with people about these issues.  There's so much to consider.</p>]]></content:encoded></item></channel></rss>