Assess programming in Python and R with Numbas

Screenshot of a Numbas question with a code editor. The editor contains Python code which has been marked correct.

We’ve written a new extension which allows you to run and mark Python and R code in Numbas. Very exciting!

For a few years, Chris Graham and George Stagg have been teaching our modules on computing for mathematicians. To support those modules, they wanted a way of automatically marking code written by students, both for small exercises in the course notes and for practical sessions. We tried out a few systems, all of which involved sending code off to a server, which would run it and return the results.

This eventually became a Numbas extension with a custom part type, which did some unpleasant things to the submission code to get the results from the server. It worked well and students really appreciated it, but because it required a back-end server to support it, we could never share these questions with the rest of the world.

The big selling-point of Numbas is that everything runs on the student’s device, so we’ve been waiting for it to be possible to run the languages we teach, Python and R, in a web browser.

Just before Christmas, I had another look at Pyodide, which I’d experimented with in the past. It has now developed to the point where we can use it in Numbas! I made a proof of concept quite quickly, and I’ve spent the past month adding features to Numbas so that it all works smoothly.

George has also done some great work compiling R into WebAssembly, so we can use that too.

So now we’ve released our extension, and you can use it!

Here’s a video I made showing some of the things it can do:

Now have a go at the demo exam yourself!

I’ll do a general Numbas development post later, explaining the new features that made this possible. Here, I’ll explain how the “Code” part type works:

  • The extension adds a new “code editor” input method that is available to custom part types. It uses Ace, an open-source code editor.
  • When the student submits some code, the runner for the appropriate language is loaded and the student’s code is run, along with some other blocks of code defined by the question author:
    • variable definitions – you can copy Numbas question variables into the code environment; Numbas does the translation for you.
    • a preamble, to set up anything that the student should use in their answer, such as variables or functions.
    • a postamble, to do something with the student’s answer or set things up for the tests.
    • validation and marking tests, which check that the student’s answer is valid (if you ask the student to define a function with certain properties, you might want to reject an answer that doesn’t define a function), and then decide how much credit to give.
  • Once the code has finished running, the part’s marking algorithm runs. It can inspect the results of each of the blocks of code separately. The “Code” part type’s marking algorithm shows any errors produced by the student’s code, rejects code if it fails any validation tests, and awards credit based on the results of the marking tests.
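
As a rough illustration of that sequence, here's a minimal sketch in plain Python. It isn't the extension's actual code: the names `run_block`, `run_test` and `mark_submission` are made up for this example, tests are assumed to be simple expressions paired with a weight, and everything runs with `exec` in one shared namespace rather than in a browser-based runner.

```python
import contextlib
import io
import traceback


def run_block(source, namespace):
    """Run one block of code in the shared namespace, capturing output and any error."""
    stdout = io.StringIO()
    error = None
    try:
        with contextlib.redirect_stdout(stdout):
            exec(source, namespace)
    except Exception:
        error = traceback.format_exc()
    return {"stdout": stdout.getvalue(), "error": error}


def run_test(expression, namespace):
    """A test passes only if its expression evaluates to something truthy without an error."""
    try:
        return bool(eval(expression, namespace))
    except Exception:
        return False


def mark_submission(student_code, variables, preamble, postamble,
                    validation_tests, marking_tests):
    """Hypothetical marking routine: run each block in turn, then validate and award credit."""
    namespace = dict(variables)  # question variables copied into the code environment
    results = {
        "preamble": run_block(preamble, namespace),
        "student": run_block(student_code, namespace),
        "postamble": run_block(postamble, namespace),
    }
    # Show any error produced by the student's code.
    if results["student"]["error"]:
        return {"valid": False, "credit": 0.0, "feedback": results["student"]["error"]}
    # Reject the answer if it fails any validation test.
    if not all(run_test(test, namespace) for test in validation_tests):
        return {"valid": False, "credit": 0.0, "feedback": "The answer failed a validation test."}
    # Award credit in proportion to the weight of the marking tests passed.
    total = sum(weight for _, weight in marking_tests)
    earned = sum(weight for test, weight in marking_tests if run_test(test, namespace))
    return {"valid": True, "credit": earned / total if total else 0.0, "feedback": ""}


result = mark_submission(
    student_code="def double(x):\n    return 2 * x\n",
    variables={"n": 5},
    preamble="",
    postamble="answer = double(n)",
    validation_tests=["callable(double)"],
    marking_tests=[("answer == 10", 1), ("double(-3) == -6", 1)],
)
print(result["valid"], result["credit"])  # prints: True 1.0
```

Keeping the result of each block separate is what lets the marking step tell "the student's code raised an error" apart from "the code ran but didn't define a function".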

It works really nicely! I was surprised by how quickly it runs.

The Python and R runners are quite big – a few megabytes each – so they’re loaded from a content delivery network instead of being included in the exam package like the rest of the Numbas code. The GeoGebra extension works similarly, and the code could be mirrored or included in the package if necessary, so we don’t think this will cause an archival problem.

The results of each run of code are cached, so when you’re reviewing an attempt at a question using the programming extension, your computer doesn’t have to re-run the code.
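
The general idea can be sketched like this. It's an illustration only, not how the extension stores its results: `RunCache` and `fake_runner` are hypothetical names, and keying the cache on a hash of the language and the code is an assumption made for the example.

```python
import hashlib
import json


class RunCache:
    """Remember the result of each (language, code) run so it is executed only once."""

    def __init__(self, runner, saved=None):
        self._runner = runner              # hypothetical callable: (language, code) -> result
        self._results = dict(saved or {})  # results restored from a saved attempt, if any

    @staticmethod
    def _key(language, code):
        payload = json.dumps([language, code]).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def run(self, language, code):
        key = self._key(language, code)
        if key not in self._results:       # only run code we have not seen before
            self._results[key] = self._runner(language, code)
        return self._results[key]

    def dump(self):
        """Return the cached results as JSON, ready to be stored with the attempt."""
        return json.dumps(self._results)


calls = []


def fake_runner(language, code):
    """Stand-in for a real code runner; records each call so we can count them."""
    calls.append(code)
    return {"stdout": f"ran {language}", "error": None}


cache = RunCache(fake_runner)
cache.run("python", "print(1)")
cache.run("python", "print(1)")  # served from the cache; the runner is not called again

# Reviewing later: restore the saved results, so nothing needs to be re-run.
review = RunCache(fake_runner, saved=json.loads(cache.dump()))
review.run("python", "print(1)")
print(len(calls))  # prints: 1
```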

People have been asking if Numbas can mark code almost since the start, so I’m really glad to finally make it possible.

The programming extension is available to everyone in the Numbas editor at mathcentre.ac.uk, and you can find the “Code” part type by clicking More part types when adding a part to a question. There’s documentation on how it all works in our GitHub repository.

We’re busily writing lots of questions using the new features this extension provides. Please give it a go yourself, and let us know how you get on. As we start to use the extension seriously, I’m sure we’ll find bits that could be smoother and think of more features we’d like – this is just the beginning.