Submission Correctness Tests

Automatically checking student submissions and giving meaningful feedback about what they are doing wrong is central to the learning experience on DataCamp. The lifeblood of the automated system that points out mistakes and guides students to the correct solution is the Submission Correctness Test, or SCT.

The SCT is a script of custom tests that accompanies every coding exercise. These custom tests have access to the code students submitted, the output it produced, and the workspace it created. For every language taught on DataCamp, there is an open-source library that provides a wide range of functions to verify these elements of a student submission. When these functions spot a mistake, they automatically generate a meaningful feedback message.

The table below lists the utility packages used to write SCTs. If you're authoring R exercises, you write your SCTs in R. If you're building Python, SQL, or Shell exercises, you write your SCTs in Python. The documentation pages for each package list all of its functions, with examples and best practices.

| Exercise language | SCT language | GitHub | Documentation | Build status |
| --- | --- | --- | --- | --- |
| R | R | testwhat | link | Build Status |
| Python | Python | pythonwhat | link | Build Status |
| SQL | Python | sqlwhat | link | Build Status |
| Shell | Python | shellwhat | link | Build Status |

In the remainder of this article, xwhat is used whenever the information applies to all of the SCT packages listed above.

How it works

When a student starts an exercise on DataCamp, the coding backend:

  • Starts a student coding process, and executes the pre_exercise_code in this process. This code initializes the process with data, loads relevant packages, etc., so that students can focus on the topic at hand.
  • Starts a solution coding process at the same time, in which both the pre_exercise_code and the solution are executed. This coding process represents the 'ideal final state' of an exercise. (A toy illustration of this two-process setup follows after the submission steps below.)

When students click Submit Answer, the coding backend:

  • Executes the submitted code in the student coding process and records any outputs or errors that are generated.
  • Tells xwhat to check the submitted code, by calling the test_exercise() function that is available in all four of the SCT packages. Along with the SCT (the R/Python script with custom tests), the backend also passes the following information:

    • The student submission and the solution as text.
    • A reference to the student process and the solution process.
    • The output and errors that were generated when executing the student code.

    If there is a failing test in the SCT, xwhat marks the submitted code as incorrect and automatically generates a feedback message. If all tests pass, xwhat marks the submitted code as correct, and generates a success message. This information is relayed back to the coding backend.

  • Bundles the code output and the correctness information, so it can be shown in the learning interface.
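
To make the two-process setup concrete, here is a toy illustration in plain R. It is not how the backend is actually implemented; two ordinary environments simply stand in for the student and solution coding processes, using the variable-assignment exercise from the next section.

```{r}
# Toy illustration only: plain R environments stand in for the two coding processes.
pre_exercise_code <- quote(invisible(NULL))   # would normally load packages and data
solution_code     <- quote(m <- 5)
submission_code   <- quote(m <- 4)

# Solution process: pre_exercise_code + solution
solution_env <- new.env()
eval(pre_exercise_code, solution_env)
eval(solution_code, solution_env)

# Student process: pre_exercise_code + submission
student_env <- new.env()
eval(pre_exercise_code, student_env)
eval(submission_code, student_env)

# An SCT such as check_object("m") %>% check_equal() essentially compares the two:
exists("m", envir = student_env, inherits = FALSE)   # TRUE: m was defined
identical(student_env$m, solution_env$m)             # FALSE: its value is wrong
```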

Example

To understand how SCTs affect the student's experience, consider the markdown source for an R exercise about variable assignment:

## Create a variable

```yaml
type: NormalExercise
```

In this exercise, you'll assign your first variable.

`@instructions`
Create a variable `m`, equal to 5.

`@sample_code`
```{r}
# Create m

```

`@solution`
```{r}
# Create m
m <- 5
```

`@sct`
```{r}
ex() %>% check_object("m") %>% check_equal()
success_msg("Well done!")
```
  • Student submits a <- 4
    • Feedback box appears: "Did you define the variable m without errors?"
    • This message is generated by check_object(), which checks whether m was defined in the student coding session.
  • Student submits m <- 4 (correct variable name, incorrect value)
    • Feedback box appears: "The contents of the variable m aren't correct."
    • This message was generated by check_equal(), which compares the value of m in the student coding session with the value of m in the solution coding session.
    • Notice that there was no need to repeat the value 5 in the SCT; testwhat inferred it from the solution process.
  • Student submits m <- 5 (correct answer)
    • All checks pass, and the message "Well done!" is shown, as specified in success_msg().

How to write good SCTs

A good SCT allows for different ways of solving the problem but gives targeted and actionable feedback in case of a mistake.

  • Inspecting results > inspecting output > inspecting actual code.

    As mentioned before, the SCT packages can access three pieces of information (and their solution counterparts): the student process, the output the student generated, and the code they wrote. When you check the student process for the correct variables, or check whether an expression evaluates correctly in the student process, you are not checking how students got there; you are only checking whether they got there. Hence, these tests are more robust. Code-based checks, on the other hand, are more restrictive: they expect the student to type something specific, which leaves little room for alternative solutions.

    Verify the contents of an object rather than the code that defines it. Verify the student's output rather than the code that generated that printout. Verify the result of calling a function rather than the arguments used to call it. The first sketch after this list contrasts the two approaches.

  • Use test_correct() whenever it makes sense.

    The seemingly opposite requirements of robustness to different solutions vs. targeted feedback can be satisfied by using test_correct(). This function takes two sets of tests: 'checking' tests and 'diagnosing' tests. Checking tests verify the end result, while diagnosing tests dive deeper to look at the mistakes a student made. If the checking tests pass, the typically more restrictive diagnosing tests are not executed. If the checking tests fail, the diagnosing tests are executed, which gives more detailed feedback. This allows flexibility in how students code up the end result, but specific feedback when they make a mistake. Tying into the previous point, you typically want your checking tests to be process-based checks, while your diagnosing tests can be more code-based. The second sketch after this list shows the pattern.

  • Be frugal.

    With every check you write, you add another hurdle that students can bump into, keeping their submission from passing. For every SCT function you add, ask yourself whether you really need it and whether it tests what the exercise asks of the student. Does it really matter if they do that printout, or is it a nice-to-have? Is it absolutely necessary that they specify the exact same plot title, or is specifying a title argument enough? Do you really have to check the existence of an intermediate variable if the end result is okay?

  • Rely on the feedback messages generated by the SCT packages as much as you can.

    The automatically generated messages know what the solution is and exactly what the student did, and can give tailored feedback because of it. In addition, we are constantly working on improving the feedback mechanisms; if you do not specify custom messages, your course's SCTs automatically leverage these improvements. Custom feedback messages only make sense when you want to give very specific hints related to the course content or point out a common mistake that 90% of students make; the third sketch after this list shows how to add one.

  • Think like the student: what are the most common mistakes they will make?

    You can shape your SCT differently or use different SCT functions depending on how you think students will make mistakes.

  • Follow common style guides to make your SCTs readable and self-documenting.

    When written well and formatted properly, SCTs can be succinct and easy to read: the testing code documents itself. In R, use the %>% operator to chain your SCT calls. In Python, use . chaining and multi() to clearly show your intentions. Extra comments can make sense when chopping things up for a large exercise or when you want to explain a workaround for a corner case, but every comment you write can also become outdated, which can cause confusion.
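
To make the first guideline concrete, here is a minimal sketch of a result-based check versus a code-based check for the `m <- 5` exercise above. It reuses the testwhat functions from the earlier example; check_code() and its fixed argument are assumptions to verify against the testwhat documentation.

```{r}
# Result-based: passes for m <- 5, m = 5, m <- 10 - 5, ... because it only
# inspects the value of m in the student process.
ex() %>% check_object("m") %>% check_equal()

# Code-based: only passes if the literal text "m <- 5" appears in the
# submission, so equivalent solutions are rejected.
# (check_code() and fixed = TRUE are assumptions; see the testwhat docs.)
ex() %>% check_code("m <- 5", fixed = TRUE)
```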
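The test_correct() pattern from the second guideline could look as follows for an exercise that asks students to store the mean of a vector in `result`. The piped check_correct() form and the use of `.` inside it mirror the style of the earlier example, but treat the exact call shape as an assumption and double-check it against the xwhat documentation.

```{r}
# Assumed call shape for the piped check_correct(); verify against the testwhat docs.
ex() %>% check_correct(
  # Checking test: is the end result correct, no matter how it was computed?
  check_object(., "result") %>% check_equal(),
  # Diagnosing tests: only run when the checking test fails,
  # to pinpoint what went wrong in the mean() call.
  check_function(., "mean") %>% check_arg("x") %>% check_equal()
)
```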
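Finally, if a course-specific hint really is needed, most check functions accept a custom message argument. The incorrect_msg argument below is an assumption; verify the exact argument name in the testwhat reference.

```{r}
# Only override the automatic message when you have a course-specific hint to give.
# (incorrect_msg is an assumption; check the check_equal() signature in testwhat.)
ex() %>% check_object("m") %>%
  check_equal(incorrect_msg = "Have another look at the value you assigned to `m`; it should equal 5.")
```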
