I feel a little inspired by Sam Ford’s post The Year We Talk About Our Awful Metrics. Ford writes about the need for change in metrics of online media publications, but we could just as easily be discussing the metrics of learning management systems, ed-tech vendor platforms, and institutional analytics.

Ford argues that we need to “get serious” about better forms of measurement in 2017. As long as we are tracking metrics with little meaning, we aren’t really improving learning.

Let me give you a few examples to illustrate similar problems in education.

Page Clicks

As in, how many pages of the ebook has the student accessed? Because the student must read every page they access, right? And they don’t just scroll through pages to see roughly what the book has in it? Realistically, I think we all acknowledge these inevitabilities, but that doesn’t stop us from creating blingy dashboards to display our metric wares.

Consider the following scenarios.

Scenario 1: Student A has clicked on 55 pages whereas student B has only clicked on 10 pages. This means:

a. Student A has read more than Student B. Student A is a more engaged student.

b. Student B was reading deeply and Student A was skimming.

c. Student A reads faster than student B.

d. Student A read more online. Student B borrowed a book from a friend and read more on paper.

e. None of the above. Who knows what it really means.

Scenario 2: Student A has clicked on 55 pages whereas student B has only clicked on 10 pages. Both students spent 2 hours in the eReader platform.

a. Student A has read more than Student B. Student A is a more engaged student.

b. Student B was reading deeply and Student A was skimming.

c. Student A reads faster than student B.

d. Student A read more online. Student B borrowed a book from a friend and read more on paper.

e. None of the above. Who knows what it really means.

In either case, how much do we really know about how much Students A and B have learned? Nothing. We know absolutely nothing. These metrics haven’t done a thing to tell us what either student is capable of recalling or retrieving from memory. There is nothing to help us see whether the student can make sensible decisions related to the topics and nothing to show whether concepts can be transferred to new situations. Page clicks are a bad metric. All they tell me is that students log in more every Sunday night than on any other night (and that metric has been the same for a decade now).
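
To make the point concrete, here is a minimal sketch in Python of what a page-click dashboard is actually computing. The log data and field layout are entirely invented for illustration; the only real claim is that counting page accesses, or bucketing them by day of week, never touches recall, decision-making, or transfer.

```python
from collections import Counter
from datetime import datetime

# Hypothetical eReader click log: (student, page, timestamp).
# All of the data below is invented purely for illustration.
click_log = [
    ("A", 12, "2017-01-08 21:05"), ("A", 13, "2017-01-08 21:06"),
    ("A", 14, "2017-01-08 21:07"), ("B", 12, "2017-01-08 20:15"),
    ("B", 12, "2017-01-10 19:40"),
]

# Metric 1: distinct pages accessed per student -- the dashboard number.
pages_seen = Counter()
for student, page, _ts in click_log:
    pages_seen[student, page] = 1
distinct_pages = Counter(student for student, _page in pages_seen)
print(distinct_pages)  # Counter({'A': 3, 'B': 1})

# Metric 2: clicks by day of week -- the "students work on Sunday" insight.
by_weekday = Counter(
    datetime.strptime(ts, "%Y-%m-%d %H:%M").strftime("%A")
    for _student, _page, ts in click_log
)
print(by_weekday)  # mostly Sunday

# Neither number says anything about what Student A or B can recall,
# decide, or transfer to a new situation.
```

Swap in a real export from your LMS and the output gets prettier, but the gap between “clicked” and “learned” stays exactly the same.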

But wait … there are more metrics …

Attendance

We can measure attendance – whether it be logging in to the LMS or physically showing up in the classroom. Surely this is a valuable measure of learning?

Again no, it’s not a measure of learning. At best, it is a necessary-but-not-sufficient condition for learning. Yes, we do need students to show up in some way to learn. In very active face-to-face classrooms that engage all students in learning activities, I might go so far as to say that showing up is a good measure of learning, but this is still the exception rather than the norm. And even if the classroom is active, learning is more effective with certain kinds of activities: those involving interaction, those involving varied practice, and those where students have to learn to recognize and remedy their own errors.

Attendance, by itself, does not measure learning.

Time to Complete

At organizations where the learning is assessed directly (competency-based education and MOOCs, for example), there is often a metric around the “time to complete” a course. This is a particularly dangerous metric because of the extreme variability across courses and students. Again, let’s look at two scenarios.

Scenario 1: Course 1 is a 4-credit course that takes (on average) 45 days to complete. Course 2 is a 4-credit course that takes (on average) 30 days to complete.

a. Course 1 is poorly designed and Course 2 is well-designed.

b. Course 1 is harder than Course 2.

c. Course 1 and Course 2 seem about equal in terms of difficulty and design.

d. None of the above.

Scenario 2: Course 1 is a 4-credit course that takes (on average) 45 days to complete and requires students to turn in 4 papers. Course 2 is a 4-credit course that takes (on average) 30 days to complete and requires students to pass 2 exams.

a. Course 1 is poorly designed and Course 2 is well-designed.

b. Course 1 is harder than Course 2.

c. Course 1 and Course 2 seem about equal in terms of difficulty and design.

d. Students procrastinate more on writing papers than on taking exams.

e. None of the above.

In either case, what does the “time to complete” actually tell us about the quality of learning in the courses? If we were comparing two Calculus I courses, and they were taught with different platforms, equivalent assessment, and the same teacher, I might start to believe that time-to-complete was correlated with design, learning quality, or difficulty. But in most cases, comparing courses via this metric is like comparing apples to monkeys. It’s even worse if that data doesn’t have any kind of context around it.

Number of Clicks per Page

This is one of my favorites. I think you’ll see the problem as soon as you read the scenarios.

Scenario 1: Page A got 400 clicks during the semester. Page B got only 29 clicks.

a. Page A has more valuable resources than Page B.

b. Students are accidentally wandering to Page A.

c. Page A is confusing so students visit it to reread it a lot.

d. Page B was only necessary for those students who did not understand a prerequisite concept.

e. Page A is more central in the structure of the course. Students click through it a lot on their way to somewhere else.

Scenario 2: Page A contains a video on finding the derivative using the chain rule and got 400 clicks during the semester. Page B contains a narrative on finding the derivative using the power rule and got only 29 clicks during the semester.

a. Page A has more valuable resources than Page B.

b. Page A is a more difficult topic than Page B, so students revisit it a lot.

c. The video on Page A is confusing so students watch it on multiple occasions trying to figure it out.

d. Page B was only necessary for those students who did not understand a prerequisite concept.

e. Page A is more central in the structure of the course. Students click through it a lot on their way to somewhere else.

Number of clicks per page is meaningless unless there is some meaningful relationship between the pages being compared. For example, if we are looking at 5 pages that each contain one resource for learning how to find the derivative using the chain rule, the comparison of data might be interesting. But even in this case, I would want to know the order in which the links appear to students. And just because a student clicks on a page, it doesn’t mean they learned anything on the page. They might visit the page, decide they dislike the resource, and go find a better one.
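
If I had to look at clicks-per-page data anyway, the bare minimum context I would want is shown in the rough sketch below. The data, the field names, and the “position in module” idea are all hypothetical; the point is simply to compare only pages that cover the same concept, and to account for where each page sits before reading anything into the counts.

```python
# Hypothetical per-page click counts with just enough context attached
# (topic and position in the module) to make a comparison defensible.
pages = [
    {"page": "A", "topic": "chain rule", "position_in_module": 1, "clicks": 400},
    {"page": "B", "topic": "power rule", "position_in_module": 4, "clicks": 29},
    {"page": "C", "topic": "chain rule", "position_in_module": 2, "clicks": 55},
]

# Only compare pages that teach the same concept...
chain_rule_pages = [p for p in pages if p["topic"] == "chain rule"]

# ...and look at them in the order students encounter them, since pages
# that sit earlier (or on the way to somewhere else) collect clicks
# regardless of their value.
for p in sorted(chain_rule_pages, key=lambda p: p["position_in_module"]):
    print(p["page"], p["position_in_module"], p["clicks"])

# Even with this context, a click is not evidence of learning: the student
# may open the page, dislike the resource, and go find a better one.
```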

Completion of Online Assignments

Surely we can use completion of assignments as a meaningful metric of learning? Surely?

Well, that depends. What do students access when they are working on assignments? Can they use any resource available online? Do they answer questions immediately after reading the corresponding section of the book? Are they really demonstrating learning? Or are they demonstrating the ability to find an answer? Maybe we are just measuring good finding abilities.

Many online homework platforms (no need to name names, it’s like all of them) pride themselves on delivering just-in-time help to students as they struggle (watch this video, look at this slide deck, try another problem just like this one). I think this is a questionable practice. It is important to target the moment of impasse, but too much help means the learning might not stick. Impasse is important because it produces struggle and a bit of frustration, both of which can improve learning outcomes. Perfect delivery of answers at just the right moment might not have strong learning impact because the struggle stops at that moment. I don’t think we know enough about this yet to say one way or another (correct me if you think I’m missing some important research).

Regardless, even completion of assignments is a questionable measure of learning. It’s just a measure of the student’s ability to meet a deadline and complete a task given an infinite number of resources.

Where do we go from here?

Ford hopes that the ramifications of 2016 will foster better journalism in 2017, journalism that people read, watch, or listen to more intentionally, maybe even (shock!) remembering a story and the publisher it came from the next day.

I hope that education can focus more on (shock!) finding meaningful ways to measure whether a student actually learned, not just whether they clicked or checked off tasks. Reflecting on my own online learning experiences in the last year, I am worried. I’m worried we have fallen so deep down the “data-driven decisions” rabbit hole that we are no longer paying attention to the qualitative data that orbits the metrics. Good instructors keep their finger on the pulse of the learners, ever adjusting for those qualitative factors. But as the data is rolled up to departments, institutions, and vendors, where does that qualitative picture go?

I will close with a few goals for institutions, instructors, and vendors for 2017:

  1. Demand better learning metrics from ed-tech vendors. What that metric should be depends on the platform, so begin asking for what you really want.
  2. Build more integrations that pass quality learning data from the ed-tech vendor to the institution. Sometimes the platform does have better metrics, but the institution can’t access them.
  3. Create metrics that measure learning mastery over time in your own courses. This means choosing a few crucial concepts and probing them repeatedly throughout the learning experience to ensure each concept is sticking (one rough sketch of what this could look like follows this list).
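
For that third goal, here is a rough sketch of what “mastery over time” might look like in practice. Everything in it is hypothetical (the concepts, the weeks, and the probe scores are made up), and the summary rule is deliberately crude; what matters is that the unit of analysis becomes a concept probed repeatedly across the course, not a click.

```python
from statistics import mean

# Hypothetical record of repeated probes of a few crucial concepts.
# Each entry: (concept, week of the course, proportion of students correct).
probes = [
    ("chain rule", 3, 0.62), ("chain rule", 7, 0.71), ("chain rule", 12, 0.68),
    ("limits", 1, 0.80), ("limits", 6, 0.55), ("limits", 11, 0.50),
]

# Group the probe scores by concept, in chronological order.
history = {}
for concept, week, score in sorted(probes, key=lambda p: p[1]):
    history.setdefault(concept, []).append(score)

# A deliberately crude "is it sticking?" summary: compare the latest probe
# to the first one and report the average across all probes.
for concept, scores in history.items():
    trend = "holding" if scores[-1] >= scores[0] - 0.05 else "fading"
    print(f"{concept}: average {mean(scores):.2f}, {trend}")
```

Whether the probe is a quiz question, a one-minute paper, or an exam item matters far more than the code; the code just keeps us honest about asking again later.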

These are all concepts I hope to continue exploring with more research and more detail over the next year. If you want to join me on that journey, consider subscribing here.