Pandora
An open-source framework for evaluating online assessments on their susceptibility to AI-based cheating
Demo of Pandora running on a Canvas quiz composed of MIT OCW questions (4x speed, actual time to complete quiz was 1 minute and 40 seconds)
Roy Lee and InterviewCoder/Cluely have made headlines recently, printing money and claiming to be shining a light on lazy hiring practices by allowing virtual interviewees to have their interview questions answered for them by a stealthy AI assistant. Much debate can be had about the true motivations behind these projects and his freshly-funded startup, but a relevant takeaway is this:
AI is making most forms of online assessment obsolete. Interviews are one thing - but in addition to having InterviewCoder answer the technical questions, you can also have Pickle replicate your face and speaking style on video in real time so you (or an agent) can remote-operate your Zoom meetings. In fact, I’ll be shocked if a company hasn’t already hired a fully virtual software engineer claiming to be human and is paying it a full-time human salary (unfortunate for Devin). Or if r/overemployed isn’t having a field day with this stuff.
As for online courses - you can already screenshot whatever you want, put it into ChatGPT, and get the correct answer for all but the most complex problems. You can also automate this end-to-end with about an hour of effort, so that you never even have to touch your Canvas or Moodle or Blackboard. Online education is a fantastic thing for many reasons, but it is particularly susceptible to cheating: a process that we all know ultimately cheats learners of their skills etc etc but is a temptation to nearly every student.
It’s obviously not a new problem either - in my undergrad systems class (shoutout CS 241), I read a paper titled “Smart like a fox: How clever students trick dumb automated programming assignment assessment systems” that demonstrated various ingenious ways students managed to cheat online assessment systems. Many companies, like Cluely, and Chegg before them, would have you pay for this service, but what they offer is not particularly sophisticated technology (I vibe coded most of this in two hours last Sunday). Their advantage lies in their brand and in figuring out how to cheat stealthily - the sophisticated and new bit is intelligence-on-tap from LLMs.
Cheating itself has never been that hard to do, but it’s when it becomes easy that it’s most dangerous. Loosely speaking, people who brazenly cheat using a paid tool were always going to cheat. I am optimistic that the majority of students have some interest in their studies and typically cheat out of desperation, caused by impending deadlines or by assignments so lazily built that they disrespect the student’s time. Most online courses (and the Learning Management Systems that underpin them) are lazy, unimaginative products that can be trivially ‘solved’ with two hours of effort. You shouldn’t waste your time with them - a better example I’ve seen recently is Math Academy, a company that’s achieving unprecedented success teaching university-level mathematics to middle schoolers. They know that one of the best ways to discourage cheating is to make people pay $50 a month for their product, because why would you do that and then cheat through the course? You’d feel like a fool. (Not sponsored, just a big fan. They’ve also published a 500-page book on their learning design and philosophy, and perhaps even more admirably made it free for anyone to read on Google Docs.)
So now what? Project-based assignments and in-person assessment are less susceptible to AI-based cheating, and we’ve always known that these were more effective learning techniques in the first place. But classic online learning, built around 16 weekly modules and knowledge check quizzes, needs to change.
Educators - I highly recommend downloading Pandora and running it on your online assessments. If it one-shots them, they’re probably too easy. Feel free to shoot me an email at aaron@tryquetzal.ai and I’ll happily review any assessment of yours and tell you how to make it AI-proof.
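As a rough illustration of the "one-shots them" rule of thumb, here is a minimal sketch of how you might score a run yourself. It is not Pandora's actual output format or API: the `QuestionResult` record, the `susceptibility` and `verdict` helpers, and the 0.9 cutoff are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class QuestionResult:
    """One AI attempt at one question (hypothetical record format)."""
    question_id: str
    ai_answer: str
    correct_answer: str


def susceptibility(results: list[QuestionResult]) -> float:
    """Fraction of questions the AI got right on its first try."""
    if not results:
        raise ValueError("no results to score")
    # Compare answers case-insensitively, ignoring surrounding whitespace.
    correct = sum(
        r.ai_answer.strip().lower() == r.correct_answer.strip().lower()
        for r in results
    )
    return correct / len(results)


def verdict(score: float, threshold: float = 0.9) -> str:
    """Label an assessment; the 0.9 cutoff is an arbitrary illustrative choice."""
    return "probably too easy" if score >= threshold else "holds up better"


# Made-up results for a four-question quiz.
results = [
    QuestionResult("q1", "O(n log n)", "O(n log n)"),
    QuestionResult("q2", "42", "42"),
    QuestionResult("q3", "B", "C"),
    QuestionResult("q4", " true", "True"),
]
score = susceptibility(results)
print(f"{score:.0%} answered correctly: {verdict(score)}")
```

Exact string matching only makes sense for multiple-choice or short-answer questions; free-response items would need a human (or a rubric) to judge correctness.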
Students - if you paid for Chegg or are paying for Cluely, you can likely build something that does the same thing for free, or for a pittance paid to OpenAI and Browserbase. You can clone this repo, load it up into Cursor, and ask it to figure out how to navigate your particular online course. Even if you’ve never coded a day in your life, you can copy and paste this article into ChatGPT and it’ll give you step-by-step instructions on how to set all of this up. But obviously, if you’re going to just have a bot do your entire course for you, why are you enrolled in the first place?