Learning how to Learn with ChatGPT

By Michael Faber // November 7, 2023

I was going to write up a short one-page document explaining the differences between ChatGPT's two models: GPT-3.5 and GPT-4 (the paid subscription model).  I was going to explain some of the under-the-hood differences, like how GPT-4 has a larger model size than the previous version, leading to a significantly improved understanding of linguistic structures.  I would have gone on to explain how this, plus the much larger token limit, allows users to load in orders of magnitude more information, enabling much more nuanced prompting and responses that pick up on more subtlety.  Then I would have added that this results in a model that does much better in zero- and few-shot scenarios - in other words, one that is more accurate in areas where the model hasn't been explicitly trained, and can make more sophisticated generalizations without prior examples.  And I would have ended this section on its rote capabilities by talking about how some of the beta features allow for multi-modal inputs, including providing the system with images and data to process and respond to.

The point of all that setup would have been to lay the groundwork for an argument about equity and access.  Most of those improvements, capabilities, and additional sophistication come with the $20/month ChatGPT Plus account, leaving the free version with the previous stats.  (I may have gone on a parenthetical aside here (as I am wont to do) to point out that GPT-3.5 is plenty powerful for most use cases, and is supremely up to the task of basic information paraphrasing, writing, and analysis.  But then I would have pivoted to the next point, which is that) Duke students with the means to access paid tools would still find a potentially inequitable advantage over peers who cannot afford them.  And yes, this inequality already exists with tools like Chegg and other "homework helper" applications, but ChatGPT seems to be something different.  It feels less like another flash-in-the-pan app and more like a fundamental change in how we access, process, and interact with information, akin to the Internet itself, and it should probably be treated as such - including a recognition that access, or lack thereof, can create unfair learning conditions.

[Image: computers labelled GPT-3.5 and GPT-4, illustrating how you can use ChatGPT to learn programming]

Over the last few weeks, I have used both GPT-3.5 and GPT-4 for a variety of tasks: some are fine on both, some fail miserably on both, and a few truly show the difference between 3.5 and 4.  On basic tasks like writing drafts and coming up with weekly meal plans, of course both are up to the task, and not very instructive on capability.  On the other end of the spectrum, I found a Harvard Physics "Problem of the Week" website with some very tricky math and physics problems (and, more importantly, the solutions).  Time after time, given the exact problem, it found creative and exciting ways to get the solutions horribly wrong.  GPT-3.5 and GPT-4 were for the most part similarly unsuccessful, though each in its own unique way 💫.  GPT-4's LaTeX interpreter at least made its incorrect answers look pretty.

Where I did find some interesting improvement from GPT-3.5 to GPT-4 was in two key areas: programming and data analysis.  Over the last ten years or so I have made a handful of unsuccessful attempts to learn the programming language Processing, a Java-based language used primarily for generative art / creative coding.  Essentially, it lets you draw with code (it is super cool).  As someone with a design and tech background, I have tons of ideas for interesting procedural designs that could be written in Processing, but I have always gotten stymied by the syntax, logic, etc.  So I took a new stab a few weeks ago with ChatGPT at my side, and made some really significant progress.  It was a whole new experience working toward a visual/process goal without needing to fully understand everything the code was doing.  But eventually, as I added more sophisticated asks and tried to introduce some nuanced improvements, I hit a wall with GPT-3.5.  In my specific case, I was asking it to do two things at once: introduce some noise into the plotting of concentric circles, while also keeping them concentric.  However much I prompted and re-prompted, it would waffle between providing the noise modification I wanted (and breaking the concentricity) or keeping things concentric and losing the randomness.

[Image: Processing sketch showing a spirograph design]

So I started again on GPT-4, and with this model's subtler language understanding it flew right past that particular brick wall and on to much more interesting and complex designs.  (Look out, maybe, for a Roots class on this sometime soon? ¯\_(ツ)_/¯ )  As a small aside, some of the libraries I was using are also newer, so GPT-4's more up-to-date knowledge helped too, though it probably contributed less to this move forward than the improved model itself.  Perhaps even more interestingly, I found that I was actually learning Processing along the way.  I could stop and ask things like "Here is my understanding of what Perlin noise is and what it's doing in the context of my code.  Is that a correct way to think about this?" and it would stop, unpack the function, and confirm (or refute) my understanding, essentially teaching me along the way.
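For the curious, one way to satisfy both constraints at once boils down to a single idea: perturb each ring's radius as a function of angle, but never move the shared center.  My actual sketch is in Processing, so here is only a minimal Python sketch of that idea - the sum-of-sines `wobble` function is a hand-rolled stand-in for Perlin noise, and all names and numbers are illustrative:

```python
import math

def wobble(theta, seed=0.0):
    """Smooth, periodic pseudo-noise in [-1, 1] - a toy stand-in for Perlin noise."""
    return (math.sin(3 * theta + seed) + 0.5 * math.sin(7 * theta + 2 * seed)) / 1.5

def ring_points(center, base_radius, amplitude, n=360, seed=0.0):
    """Sample a ring whose radius wobbles with angle but whose center never moves."""
    cx, cy = center
    pts = []
    for i in range(n):
        theta = 2 * math.pi * i / n
        # The perturbation is purely radial, so concentricity is preserved.
        r = base_radius + amplitude * wobble(theta, seed)
        pts.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return pts

# Every ring shares the same center; only the radius is noisy.
rings = [ring_points((200, 200), r, amplitude=0.15 * r) for r in (40, 80, 120)]
```

Because the noise amplitude is kept below the gap between neighboring base radii, the wobbly rings stay nested as well as concentric - which is exactly the combination GPT-3.5 kept losing.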

The second area of improvement, and the one that continues to impress (and frustrate - more on that part later), is the "Advanced Data Analysis" beta feature in ChatGPT Plus.  This modality lets users upload datasets, like CSVs, and then use natural language to analyze them.  GPT takes the data along with your prompts, writes and executes Python code in real time (which you can view/access if you want, or just let it do its thing in the background), and spits out the results.  In a few minutes we had built data frames for the Roots program, and I was able to ask it to build an algorithm to calculate an "Impact Score" based on attendance and enrollment, then provide insights on which kinds of classes have higher impact scores, which days or times of day have the best attendance, and so on.  None of this is possible with GPT-3.5, and you can imagine how valuable it could be for academic endeavors, research, etc.  Having the system execute code on your behalf feels like a meaningful step closer to AI Agents, an important milestone in AGI development.
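The real Roots data and the formula GPT generated aren't shown here, but the shape of the analysis is easy to sketch in plain Python.  Everything below is made up for illustration: hypothetical session records and one plausible scoring formula (attendance rate weighted by turnout), not the actual algorithm:

```python
from collections import defaultdict

# Hypothetical session records - the real Roots schema and numbers differ.
sessions = [
    {"title": "Intro to Python", "day": "Tue", "enrolled": 20, "attended": 16},
    {"title": "Creative Coding", "day": "Thu", "enrolled": 15, "attended": 14},
    {"title": "Data Viz",        "day": "Tue", "enrolled": 25, "attended": 15},
    {"title": "Intro to Python", "day": "Thu", "enrolled": 20, "attended": 18},
]

def impact_score(session):
    """One plausible 'Impact Score': attendance rate weighted by turnout."""
    rate = session["attended"] / session["enrolled"]
    return round(rate * session["attended"], 2)

# Average impact score per weekday - the "which day is best?" question.
by_day = defaultdict(list)
for session in sessions:
    by_day[session["day"]].append(impact_score(session))

best_day = max(by_day, key=lambda day: sum(by_day[day]) / len(by_day[day]))
```

The point is less the formula itself than the workflow: Advanced Data Analysis writes and runs code like this for you from a plain-English prompt, which is precisely why sanity-checking each step matters.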

Grain (boulder?) of salt time, though.  There is a learned skill to this (which I am by no means a master of yet), and I often had to lead it through these calculations piecemeal in order to feel confident that the result would be accurate.  It failed when I asked it to jump straight to conclusions off the bat, but was much more successful with step-wise modification and analysis of the data.  For example, asking straight away "Which day is best for Roots classes?" did not produce accurate results.  But walking through the data structure, explaining what each table contained, asking it to build a reasonable algorithm to measure impact, sanity-checking the results, cleaning various data it was struggling with, and then finally asking "which day is best?" did get some valid (in the ballpark, for sure) insight.  One frustrating caveat to this Advanced Data Analysis mode is that the data you provide, as well as all of its Python programs and results, are deleted and the chat is essentially 'reset' whenever you leave for long enough or close the browser.  This is ostensibly a Good Thing™ because it provides at least some reasonable confidence that they aren't storing this data or feeding it back into their models.  (OpenAI continues to stress this point whenever it can, for what it's worth.)  But it can make deep dives challenging, because you have to start from scratch if you can't complete your work in one session.

So at the end of a few weeks of dedicated exploration, research, and playing (DALL·E 3 is also quite fun, in the same way - but more custom - that we use reaction gifs in group chats), I was intending to write this document to make the case that there are measurable improvements between GPT-3.5 and GPT-4, exacerbating an emerging issue of access to AI as an education inequity problem - maybe not quite on the same order of magnitude as "access to the internet," but probably not far off.  And I still think that is the case, and it's worth considering how to level that playing field for students.  But the other day, as I watched the OpenAI Dev Day keynote address, I couldn't help but think beyond this immediate issue.

OpenAI, the makers of ChatGPT and de facto industry leader in AI at the moment, held their very first Dev Day in San Francisco, and it had all the makings of a Silicon Valley, Apple-style keynote address: a young tech-industry CEO (who, though ostensibly a real human, borders on uncanny-valley territory in his presentation style) standing in front of a massive screen, sharing obscene growth numbers and highlighting all kinds of new features for the platform to adoring (yet adorably awkward) applause.  And there was a lot of promising stuff announced, most of it for paid or beta-access customers: GPT-4 Turbo (better, bigger, faster), make-your-own GPTs, reduced pricing, a new "Assistants" API (integrate assistive tech into your own apps), text-to-speech capabilities, and a lot more.  But one overarching sentiment stuck with me throughout Sam Altman's keynote.  He repeatedly said things like "we believe in gradual, iterative deployment" and "it's important for people to start building with and using these agents now, to get a feel for what the world is going to be like as they become more capable" (a reference to AI Agents, which many believe to be the next step on the way from Large Language Models like ChatGPT to Artificial General Intelligence).  He was not-so-implicitly conceding that what we are living through today (and what already feels to me a bit like society-altering magic) is still essentially a demo version of what AI can be.  It is as if we are in the "14.4kbps dialup modem era" of AI, and they have just announced 28.8kbps, knowing they have gigabit fiber waiting in the wings, but the world isn't quite ready yet (he's probably right).

[Image: the Distracted Boyfriend meme, but with a robot instead of the girl in the red dress]

AI is likely going to fundamentally change how we learn, how we teach, how we do research, and how we exist as humans.  So yes, we likely need to invest in resources to provide equitable opportunity to students at Duke today, including access to tools like ChatGPT.  But what I thought would be the core thesis of this essay (access to GPT-4) now feels like a given in light of the larger picture.

A few days ago I had an exciting conversation with some faculty about thoughtful pedagogical approaches to AI in the classroom (how can we make ChatGPT a partner on a climate-based product development project team, so that it best complements the students' skillsets and helps them test a product/idea/concept more quickly?).  New approaches to teaching need to be less about "how do I make my class GPT-proof" and more "how can I teach my students thoughtful and ethical use of an ever-changing technology landscape, which they will interact with (and likely shape) on a daily basis as 21st-century digital citizens."  We will also need to consider how Duke's technological systems will take advantage of a technology that, in no time, is going to be an expected way of interacting with computers and information (I am already frustrated with Alexa - it feels ancient now).  How do we engage directly with students, faculty, and staff on improved experiences around class registration, applying to college, or any other IT system we have, in light of AI today and what's coming?

My initial goal with this exploration was to delineate the distinctions between GPT-3.5 and GPT-4, and to understand the capabilities and limitations of the current technology.  But in the long term (or perhaps the short or medium term, given the velocity of change), the current and previous versions hardly matter: both will be obsolete soon.  Iterative improvements touted as monumental change today will look like quaint rounding errors in just a few years.  Focusing solely on the disparities between two young versions of a groundbreaking technology risks overlooking the broader narrative: it is not only about who can wield the most powerful AI today (though that is important to answer today), but also about preparing students for a future where AI is intricately woven into the fabric of both their education and their daily lives.  (insert dial-up modem sounds here)


(Words written by my human brain, images produced by DALL·E 3, and edited with the help of GPT-4, which doesn't like my parenthetical asides but I'm keeping them anyway in a desperate and likely futile attempt to maintain a grip on my humanity - GPT-5 will probably have no problem impersonating me and my idiosyncratic writing style)