Clickety Click Click: Awful Measures for Learning

Dec 19, 2016

I feel a little inspired by Sam Ford’s post The Year We Talk About Our Awful Metrics. Ford writes about the need for change in metrics of online media publications, but we could just as easily be discussing the metrics of learning management systems, ed-tech vendor platforms, and institutional analytics.

Ford argues that we need to “get serious” about better forms of measurement in 2017. As long as we keep tracking metrics with little meaning, we aren’t really improving learning.

Let me give you a few examples to illustrate similar problems in education.

Page Clicks

As in, how many pages of the ebook has the student accessed? Because the student must read every page they access, right? And they don’t just scroll through pages to see roughly what the book has in it? Realistically, I think we all acknowledge these inevitabilities, but that doesn’t stop us from creating blingy dashboards to display our metric wares.

Consider the following scenarios.

Scenario 1: Student A has clicked on 55 pages whereas student B has only clicked on 10 pages. This means:

a. Student A has read more than Student B. Student A is a more engaged student.

b. Student B was reading deeply and Student A was skimming.

c. Student A reads faster than student B.

d. Student A read more online. Student B borrowed a book from a friend and read more on paper.

e. None of the above. Who knows what it really means.

Scenario 2: Student A has clicked on 55 pages whereas student B has only clicked on 10 pages. Both students spent 2 hours in the eReader platform.

a. Student A has read more than Student B. Student A is a more engaged student.

b. Student B was reading deeply and Student A was skimming.

c. Student A reads faster than student B.

d. Student A read more online. Student B borrowed a book from a friend and read more on paper.

e. None of the above. Who knows what it really means.

In either case, how much do we really know about how much Students A and B have learned? Nothing. We know absolutely nothing. These metrics haven’t done a thing to show what either student is capable of recalling or retrieving from memory. There is nothing to help us see whether the student can make sound decisions related to the topics, and nothing to show whether concepts can be transferred to new situations. Page clicks are a bad metric. All they tell me is that students log in more every Sunday night than on any other night (and that metric has been the same for a decade now).

But wait … there are more metrics …

Attendance

We can measure attendance – whether it be logging in to the LMS or physically showing up in the classroom. Surely this is a valuable measure of learning?

Again, no, it’s not a measure of learning. It’s a necessary-but-not-sufficient condition for learning. Yes, we do need students to show up in some way to learn. In very active face-to-face classrooms that engage all students in learning activities, I might go so far as to say that showing up is a reasonable proxy for learning, but this is still the exception rather than the norm. And even if the classroom is active, learning is more effective with certain kinds of activities: those involving interaction, those involving varied practice, and those where students have to learn to recognize and remedy their own errors.

Attendance, by itself, does not measure learning.

Time to Complete

At organizations where learning is assessed directly (competency-based education programs and MOOCs, for example), there is often a metric around the “time to complete” a course. This is a particularly dangerous metric because of the extreme variability. Again, let’s look at two scenarios.

Scenario 1: Course 1 is a 4-credit course that takes (on average) 45 days to complete. Course 2 is a 4-credit course that takes (on average) 30 days to complete.

a. Course 1 is poorly designed and Course 2 is well-designed.

b. Course 1 is harder than Course 2.

c. Course 1 and Course 2 seem about equal in terms of difficulty and design.

d. None of the above.

Scenario 2: Course 1 is a 4-credit course that takes (on average) 45 days to complete and requires students to turn in 4 papers. Course 2 is a 4-credit course that takes (on average) 30 days to complete and requires students to pass 2 exams.

a. Course 1 is poorly designed and Course 2 is well-designed.

b. Course 1 is harder than Course 2.

c. Course 1 and Course 2 seem about equal in terms of difficulty and design.

d. Students procrastinate more on writing papers than on taking exams.

e. None of the above.

In either case, what does the “time to complete” actually tell us about the quality of learning in the courses? If we were comparing two Calculus I courses, and they were taught with different platforms, equivalent assessment, and the same teacher, I might start to believe that time-to-complete was correlated with design, learning quality, or difficulty. But in most cases, comparing courses via this metric is like comparing apples to monkeys. It’s even worse if that data doesn’t have any kind of context around it.

Number of Clicks per Page

This is one of my favorites. I think you’ll see the problem as soon as you read the scenario.

Scenario 1: Page A got 400 clicks during the semester. Page B got only 29 clicks.

a. Page A has more valuable resources than Page B.

b. Students are accidentally wandering to Page A.

c. Page A is confusing so students visit it to reread it a lot.

d. Page B was only necessary for those students who did not understand a prerequisite concept.

e. Page A is more central in the structure of the course. Students click through it a lot on their way to somewhere else.

Scenario 2: Page A contains a video on finding the derivative using the chain rule and got 400 clicks during the semester. Page B contains a narrative on finding the derivative using the power rule and got only 29 clicks during the semester.

a. Page A has more valuable resources than Page B.

b. Page A is a more difficult topic than Page B, so students revisit it a lot.

c. The video on Page A is confusing so students watch it on multiple occasions trying to figure it out.

d. Page B was only necessary for those students who did not understand a prerequisite concept.

e. Page A is more central in the structure of the course. Students click through it a lot on their way to somewhere else.

Number of clicks per page is meaningless unless there is a meaningful relationship between the pages. For example, if we are looking at 5 pages that each contain one resource for learning how to find the derivative using the chain rule, the comparison of data might be interesting. But even in this case, I would want to know the order in which the links appear to students. And just because a student clicks on a page, it doesn’t mean they learned anything on that page. They might visit the page, decide they dislike the resource, and go find a better one.

Completion of Online Assignments

Surely we can use completion of assignments as a meaningful metric of learning? Surely?

Well, that depends. What do students access when they are working on assignments? Can they use any resource available online? Do they answer questions immediately after reading the corresponding section of the book? Are they really demonstrating learning? Or are they demonstrating the ability to find an answer? Maybe we are just measuring good finding abilities.

Many online homework platforms (no need to name names, it’s like all of them) pride themselves on delivering just-in-time help to students as they struggle (watch this video, look at this slide deck, try another problem just like this one). I think this is a questionable practice. It is important to target the moment of impasse, but too much help means the learning might not stick. Impasse is important because it produces struggle and a bit of frustration, both of which can improve learning outcomes. Perfect delivery of answers at just the right moment might not have strong learning impact because the struggle stops at that moment. I don’t think we know enough about this yet to say one way or another (correct me if you think I’m missing some important research).

Regardless, even completion of assignments is a questionable measure of learning. It’s just a measure of the student’s ability to meet a deadline and complete a task given an infinite number of resources.

Where do we go from here?

Ford hopes that the ramifications of 2016 will foster better journalism in 2017: journalism that people read, watch, or listen to more intentionally, maybe even (shock!) remembering a story and the publisher it came from the next day.

I hope that education can focus more on (shock!) finding meaningful ways to measure whether a student actually learned, not just whether they clicked or checked off tasks. Reflecting on my own online learning experiences in the last year, I am worried. I’m worried we have fallen so deep down the “data-driven decisions” rabbit hole that we are no longer paying attention to the qualitative data that orbits the metrics. Good instructors keep their finger on the pulse of the learners, ever adjusting for those qualitative factors. But as the data ports up to departments, institutions, and vendors, where does that qualitative picture go?

I will close with a few goals for institutions, instructors, and vendors for 2017:

  1. Demand better learning metrics from ed-tech vendors. What those metrics should look like depends on the platform. Begin asking for what you really want.
  2. Build more integrations that pass quality learning data from the ed-tech vendor to the institution. Sometimes the platform does have better metrics, but the institution can’t access them.
  3. Create metrics that measure learning mastery over time in your own courses. This means choosing a few crucial concepts and probing them repeatedly throughout the learning experience to make sure the concepts are sticking (a minimal sketch of what this could look like follows this list).
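To make that third goal concrete, here is a minimal sketch of a mastery-over-time metric, using purely hypothetical data and function names (not tied to any particular LMS or platform): record every low-stakes probe of a crucial concept and compare each student’s early-course and late-course performance, rather than taking a single snapshot score.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical probe records: (student, concept, week, score on a 0-1 scale).
# In practice these would come from repeated low-stakes questions spread
# across the course, not from a single exam.
probes = [
    ("ana", "chain rule", 3, 0.4),
    ("ana", "chain rule", 6, 0.7),
    ("ana", "chain rule", 11, 0.9),
    ("ben", "chain rule", 3, 0.8),
    ("ben", "chain rule", 6, 0.5),
    ("ben", "chain rule", 11, 0.5),
]

def mastery_trend(records):
    """Group probe scores by (student, concept), then compare the average of
    the earlier probes to the average of the later probes, so we can see
    whether a concept is sticking over time."""
    by_key = defaultdict(list)
    for student, concept, week, score in records:
        by_key[(student, concept)].append((week, score))
    report = {}
    for key, results in by_key.items():
        results.sort()                          # order the probes by week
        half = max(1, len(results) // 2)
        early = mean(score for _, score in results[:half])
        late = mean(score for _, score in results[half:])
        report[key] = {"early": round(early, 2),
                       "late": round(late, 2),
                       "trend": round(late - early, 2)}
    return report

for key, stats in mastery_trend(probes).items():
    print(key, stats)
```

The specific statistics matter far less than the shape of the data: repeated probes of the same concept spread across the course, so a flat or downward trend becomes visible long before the final exam.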

These are all concepts I hope to continue exploring with more research and more detail over the next year. If you want to join me on that journey, consider subscribing here.



Why high contextual interference?

Dec 2, 2016

This week I followed a hunch and, with the help of a friend who is a music educator, dug into some additional research around this idea of blocked and random practice. In music, there are a few goals to achieve with any passage:
  • can you play a passage accurately by itself?
  • can you play the passage in the larger context of the piece?
  • can you play the passage to tempo?
  • can you play the passage with the right expression?

Think about these goals in your own subject area and see if you can find a similar set of goals. For example, here are some potential goals for solving a math problem:

  • can you find the correct solution?
  • can you solve the problem in an elegant way?
  • can you prove your solution is correct?
  • can someone else understand your solution?

The first research paper I looked at was When Repetition Isn’t the Best Practice Strategy (2001), by Laura A. Stambaugh. A short summary is available here, though the original paper is a bit harder to get ahold of. I’ll elaborate a bit on the summary with the relevant points to our study of learning design.

Students were asked to practice three passages (denoted below in three colors) in either blocked- or random-format practice sessions. The three practice sessions took place on three different days (denoted 1, 2, and 3 in the diagram below). The performance during the last three trials of each practice session was used as the baseline measure of comparison for the retention measures.

[Diagram: three color-coded passages practiced across sessions 1, 2, and 3 in blocked vs. random order]
In this experiment, practicing “randomly” meant working through the same three passages in a shuffled order within each session, rather than completing all repetitions of one passage before moving on to the next (blocked practice). At the end of the three sessions, there were no performance differences between the two groups. However, when tested for retention, the blocked-practice students slowed back down toward their early-practice speeds. While the accuracy of the two groups was still the same, the random-practice students could now play the passages faster than the blocked-practice students. Stambaugh also tested transferability of skills, but did not find any statistically significant differences in this experiment. One other variable Stambaugh thought to test was attitude toward practice depending on the research treatment (maybe students would really dislike random practice or blocked practice?). Here too, there were no statistically significant differences in attitude toward practice between the two student groups.
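To make the two treatments concrete, here is a small sketch of how a blocked schedule differs from a random schedule for three passages in a single session. This is my own illustration with an assumed number of repetitions, not Stambaugh’s actual materials:

```python
import random

PASSAGES = ["A", "B", "C"]   # three passages (the three colors in the diagram)
REPS_PER_PASSAGE = 6         # repetitions per passage per session (assumed)

def blocked_session(passages, reps):
    """Low contextual interference: finish every repetition of one passage
    before moving on to the next, e.g. A A A A A A B B B B B B C C C C C C."""
    return [p for p in passages for _ in range(reps)]

def random_session(passages, reps, seed=None):
    """High contextual interference: the same total repetitions of the same
    passages, but shuffled so the learner must keep switching tasks."""
    trials = [p for p in passages for _ in range(reps)]
    random.Random(seed).shuffle(trials)
    return trials

print("blocked:", " ".join(blocked_session(PASSAGES, REPS_PER_PASSAGE)))
print("random: ", " ".join(random_session(PASSAGES, REPS_PER_PASSAGE, seed=1)))
```

Both groups get identical amounts of practice; only the ordering, and therefore the amount of task switching, differs.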

One of the reasons I find this article interesting is that it discusses the idea of contextual interference, the amount of cognitive disruption the learner experiences during practice with multiple tasks. When the learner has to redirect attention as the tasks change, the result is a high degree of contextual interference. When the tasks don’t change much (blocked practice), the brain can go into a sort of “autopilot” and stop paying attention. At that point, there may not be much point in practicing more that day. Practicing the same things on a different day would have a positive effect (that’s spaced repetition).

 


Recorded Webinar: Teaching Math in 2020

Nov 30, 2016

Just realized I never shared this webinar video from 2014 (you know, back when 2020 still seemed pretty far away).

What Does Teaching Math look like in 2020?

With every new iteration of technology, we create a generation of students whose primary media “language” for learning and interacting with the world is different than the one before it. In the last 5 years, technologies like free online videos, personalized learning software, and mobile devices have been chipping away at the corners of education and traditional teaching. Technology-enhanced learning is here to stay, and it will alter the face of education, like it or not. This webinar is your guide to navigating and thriving in this new world.


AMATYC Keynote Notes: Challenge and Curiosity

Nov 21, 2016

In the 2016 AMATYC keynote, I covered three main themes:

  1. Interaction & Impasse (last post)
  2. Challenge & Curiosity (this post)
  3. Durable Learning

Here are references and resources for Challenge & Curiosity:

First, I have to point you to one of my favorite books on the subject, A Theory of Fun for Game Design, by Raph Koster.

A quote from the book: “How do I get somebody to learn something that is long and difficult and takes a lot of commitment, but get them to learn it well?” – James Gee

How do players learn a game? 

  • They give it a try
  • They push at boundaries
  • They try over and over
  • They seek patterns

It looks something like this:

[Image: How does a player learn a game? A web of many nodes and branches radiating out from the person at the center, with bridges between branches and many potential paths to expand knowledge.]

How do we teach students?

  • We tell them what we’re going to tell them.
  • We tell them.
  • We tell them what we told them.
  • We have them practice repetitively.

It looks something like this:

[Image: How do we teach students? A few linear paths branching out from the person at the center, with few nodes and few places to expand on knowledge.]

Reference: Productive Failure in Mathematical Problem Solving

There’s a much wider body of research on productive failure worth reading.

Video: Playing to Learn Math

Resource: Good Questions from Cornell

Resource: Classroom Voting Questions from Carroll College

Design more activities that let the student figure out the mathematical puzzle, instead of providing all the secrets yourself.

[Image: the graph of a rational function with a vertical asymptote at x = 5 and a horizontal asymptote at y = 2.]

Explain the differences in the graphs: the student is given five rational functions to graph; each looks only slightly different algebraically, but the graphs turn out very different.
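As one hypothetical version of such an activity (my own example set, not the original handout), five functions that look almost identical on paper but graph very differently might be:

```latex
\begin{align*}
f_1(x) &= \frac{2x-1}{x-5}      && \text{vertical asymptote } x=5,\ \text{horizontal asymptote } y=2\\
f_2(x) &= \frac{2x-10}{x-5}     && \text{simplifies to } 2\text{: a horizontal line with a hole at } x=5\\
f_3(x) &= \frac{2x-1}{(x-5)^2}  && \text{vertical asymptote } x=5,\ \text{horizontal asymptote } y=0\\
f_4(x) &= \frac{2x^2-1}{x-5}    && \text{vertical asymptote } x=5,\ \text{oblique asymptote } y=2x+10\\
f_5(x) &= \frac{2x-1}{x^2-25}   && \text{vertical asymptotes } x=\pm 5,\ \text{horizontal asymptote } y=0
\end{align*}
```

The first of these matches the graph shown above; the puzzle for students is explaining why such small algebraic changes produce such different behavior.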


Interdisciplinary Courseware to the Rescue?

Oct 13, 2016

In the midst of all the bling of media-rich, adaptive, personalized, [insert-buzzword-here] digital products, there is an underlying problem lurking:

The general education curriculum in higher education has barely changed. Today’s world is cross-disciplinary, culturally diverse, and team-oriented. There is almost no real problem that can be solved within a single siloed content area by a team of one.

[Image: a map of interconnected nodes linking a variety of subject areas in research.]

Interdisciplinary Thinking, from New Scientist’s article “Open your Mind to Interdisciplinary Research”

We need new cross-disciplinary curricula. We need courses that are more engaging and reflective of today’s real issues. We need courses like these (referenced from my 2009 post on Hacking Higher Education):

  • Trend Analysis (Math + History)
  • Biology and Human Enhancement (Biology + Philosophy)
  • Science of Exercise (Science + Health & PE)
  • Exploring Water Issues (Science + Politics)
  • Design and Digital Presentations (Graphic Design + Communication)
  • Data Analysis and Information Presentation (Statistics, Graphic Design, and Communication)
  • Exploring Recycling and Refuse (Science, Government, and Humanities)
  • Chemistry of Nutrition (Chemistry + Health & PE)
  • Poverty and World Culture (Humanities, Government, and Sociology)
  • Sociology and Psychology of the Web (Sociology + Psychology)
  • How Computers Think (CIS + Philosophy)
  • Art, Media, and Copyright (Fine Arts + Law)
  • Writing for the Digital Age (CIS + Communication + English)
  • Energy (Physics, Chemistry, and Government)
  • Information, Query, and Synthesis (Literacy, Logic, English)

The problem is that very few faculty can teach courses like these without extensive learning or teamwork, and there are very few authors who could write such a curriculum from scratch.

This is exactly the moment when “digital courseware” should rise to the occasion. Digital courseware could be built to support these kinds of interdisciplinary courses with a well-designed learning experience (not just text, but formative assessment and designed interactions with students and faculty). It could be multimedia-rich, adaptive, personalized, and all that good buzzword stuff.

With a solid digital courseware backbone to support the learning, faculty could be tapped from different disciplines to evaluate work, answer questions, and coach students in their learning. No one faculty member would have to learn all the nuances of the course immediately.

So why aren’t we getting that? Why are we just getting more Algebra, English Comp, and Freshman Biology courses? Because that’s what we keep asking for. We keep saying, “give us better pass rates for these courses we currently teach.” We keep funding the rebuild (and rebuild) of those courses that create retention and graduation pressure in higher education. What if the problem is not the delivery of the course, but in the course itself? What if students are never going to do better in these courses because deep at the heart of the issue, the student knows the course isn’t applicable to the world they live in?

The Big History course (funded by Bill Gates) is an admirable step towards creating a more modern and more interdisciplinary curriculum. MOOCs do not have to pay attention to credit counts, what “department” the course lives in, or how it will or will not count as an elective towards multiple degrees. Consequently, MOOC providers have the freedom to build interesting, modern, and cross-disciplinary courses like The Science of Everyday Thinking (from EdX) or Politics and Economics of International Energy (from Coursera).

But why is it outsiders to education who have to lead these efforts? Educators should begin asking for the “right” curriculum from courseware providers (looking at traditional publishers, digital platforms, and MOOCs). We need to ask for the curriculum we want to teach instead of the curriculum we have always taught.

Of course, courseware providers aren’t going to build something they don’t think has a market yet – and so we have a classic “chicken and egg” problem. This seems like exactly the kind of problem that needs a funding push. If a beautiful digital course on “How Computers Think” or “Poverty and World Culture” became available nationally at a low cost, I’d like to think that institutions and faculty would be able to step up to the challenge of figuring out the rest of the logistics to offer these courses.


Why prototype a digital course?

Oct 11, 2016

Very few of us would buy an unbuilt home without at least viewing a model home that conveys the look and feel of the interior and exterior of the rest of the community. We should be unwilling to build (or buy) an entire course (a “row” of units, modules, chapters, or weeks of content) without seeing at least one “model unit” first.

[Image: the exterior of a craftsman-style model home.]

From http://www.houzz.com/photos/36213135

In the software world, a low-fidelity prototype is used to give the look and feel of a future product. With this prototype, there is some hand-waving (mockups) to explain away missing functionality, and potential users are asked how they would navigate and use the product. This happens long before the product is built, and it is iterative.

In the learning world, we should consider that course builds (especially large-scale digital courseware) need the same kind of prototype. Before the time and money is invested to build a full course, consider building one unit as completely as possible, and make sure your stakeholders (students, faculty, instructional designers, deans, customers) actually want to learn in this course. Choose a prototype unit that is most representative of the majority of the learning in the course; this is usually not the first or last unit.

When the model unit is being designed and built, this is the ideal time to collaborate iteratively with students, faculty, IT, assessment, and instructional designers. While it will take some time to change the model unit as opinions shift, it will not take as much time as remodeling every unit in the course.

After you’ve got stakeholder approval for the model unit design, make sure to carefully document what features this prototype contains, since your team will need to apply it consistently across the full development. Here are just a few of the learning features you might want to apply across your multi-unit build:

  • content: where did it come from? what quantity per learning objective?
  • examples: how often, how relevant?
  • interaction: how much, what kind, and how often?
  • assessment: what kind? how often? authentic? purely for practice? for learning scaffolding?
  • images: for what purpose, how often?
  • videos: how long are they, what stylistic elements are there, how often do they occur?
  • simulations or games: for what purpose? how often?

As digital learning becomes more accepted (thanks, MOOCs) and blended learning becomes a more standard model at traditional institutions, I hope we’ll see much more collaborative prototyping, followed by intentional design, in these courses.
