Research Statement

I explore the intersection of mathematics education and technology. My dissertation focuses on how computers can be used to learn more about how our students understand math, so that we may better help them do so. Much of my service work has been developing online homework systems and teaching aids for math classes.

Dissertation

Overview

We want to anticipate changes in any college algebra student's behavior and attitude, so that we may intervene and assist them when necessary. To classify behavior and attitude, we built on work done by a colleague, Professor Rachel Manspeaker, who categorized students at the first exam. She clustered students based upon their work up to and including the first exam and observed in interviews that the clusters were distinguished by behaviors and general beliefs about mathematics. Her groups were over-achiever (OA), rote memorizer (RM), employee (E), under-achiever (UA), and Sisyphean striver (SS). Most of these group names are self-descriptive. The employee, under-achiever, and Sisyphean striver groups, however, deserve a short definition. An employee is a student who looks at college like a low-paying job; they do what the teacher says, nothing more, and expect to be compensated with a good grade. If they didn't learn the material, then it's the teacher's fault. The under-achievers are simply smart slackers. Sisyphean strivers are those who, like Sisyphus, expend a great deal of effort but go unrewarded for their struggles. These students are the ones who work extremely hard and do well on the homework, but perform poorly on the exams.

/image-assests/f09-n163.png

In my dissertation, we extend this categorization at the first exam to a model that classifies each student at every assignment and exam. Our model is built using naive Bayesian techniques. At each assignment, it assigns to each student a probability distribution giving the probability that the student is in each behavioral group at the time of that assignment. The figure to the right shows the distributions for one such student. The horizontal axis lists the assignments, including exams, in the order in which they occur during the semester. At each point on this axis, there is a set of stacked bars, with the height of each bar representing the probability of being in the corresponding category. The categories are distinguished by colors as indicated in the legend. (If you're reading a printout of this, the order of the colors in the legend matches the order of the bars. So the rote memorizers, which are at the top of the legend and denoted with a lighter color, are always the topmost set of bars, while the Sisyphean strivers, associated with a darker color, are always at the bottom.) One may note that the leftmost set of stacked bars is evenly distributed. This is because we require an initial distribution for each student, and we took this to be the uniform distribution for everyone. In this particular example, the student was categorized by Prof. Manspeaker as an over-achiever, and the distributions agree with that starting after the first exam and continuing to the end of the term. But we see that initially, up to the first exam, this student was likely a Sisyphean striver. Later in the term, the likelihood of this pupil being a Sisyphean striver rises significantly, though briefly, yet does not eclipse the over-achiever probability.
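To make the idea of a per-assignment distribution concrete, here is a minimal sketch of a sequential naive Bayes update starting from the uniform prior. It is not the dissertation's actual model: the single feature (a normalized score), the Gaussian likelihood, and every number in `SCORE_PARAMS` are assumptions chosen only for illustration.

```python
import math

GROUPS = ["RM", "OA", "E", "UA", "SS"]

# Hypothetical (mean, std) of a normalized assignment score for each group.
# Illustrative values only; they are not estimates from the study.
SCORE_PARAMS = {
    "RM": (0.75, 0.15),
    "OA": (0.95, 0.05),
    "E": (0.70, 0.15),
    "UA": (0.50, 0.25),
    "SS": (0.85, 0.10),
}


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, used here as an assumed likelihood."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def update(prior, score):
    """One Bayes step: posterior is proportional to prior * P(score | group)."""
    unnorm = {g: prior[g] * gaussian_pdf(score, *SCORE_PARAMS[g]) for g in GROUPS}
    total = sum(unnorm.values())
    return {g: p / total for g, p in unnorm.items()}


def classify_over_time(scores):
    """Return one distribution per time point, beginning with the uniform prior."""
    dist = {g: 1.0 / len(GROUPS) for g in GROUPS}
    history = [dist]
    for score in scores:
        dist = update(dist, score)
        history.append(dist)
    return history


# Four hypothetical assignment scores for one student.
history = classify_over_time([0.90, 0.95, 1.00, 0.92])
most_likely = max(history[-1], key=history[-1].get)
```

Each entry of `history` corresponds to one set of stacked bars in a plot like the one described above, with the first entry being the uniform distribution.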

The Bayesian model gave us one means of describing student behaviors over time. In looking for other models of behavior, we realized that we could generate transition probabilities from a behavioral group at one assignment to any group at the next assignment. From this, we could build a hidden Markov model (HMM). We did this in part because having multiple models that corroborate one another is standard practice in data mining. Also, we knew it would be easier to work with HMMs because we could reduce the size of our data. This reduction was accomplished using the Viterbi algorithm, which computes a most likely path through the behavioral groups for each student. These Viterbi paths, as we will call them, assign one group to a student at every assignment, which is a reduction from a probability distribution over groups at every point.
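As an illustration of how a most likely path is computed, the following is a minimal sketch of the standard Viterbi algorithm in log space. The two-group state space, the "high"/"low" observations, and all probabilities are hypothetical toy values, not the transition or emission estimates used in the dissertation.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return a most likely state sequence for the observations."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Choose the predecessor state that maximizes the path score.
            prev, score = max(
                ((r, best[t - 1][r] + math.log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy example with two groups and coarse score observations (all hypothetical).
states = ["OA", "E"]
start_p = {"OA": 0.5, "E": 0.5}
trans_p = {"OA": {"OA": 0.8, "E": 0.2}, "E": {"OA": 0.2, "E": 0.8}}
emit_p = {"OA": {"high": 0.9, "low": 0.1}, "E": {"high": 0.2, "low": 0.8}}

path = viterbi(["high", "high", "low", "low", "low"], states, start_p, trans_p, emit_p)
```

For this toy input the path is `['OA', 'OA', 'E', 'E', 'E']`: one group per assignment rather than a full distribution, which is the data reduction described above.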

Note that the HMM and its Viterbi paths offer another way to model students' behaviors and attitudes. But the Viterbi paths can only be computed after the semester, since the results of the algorithm vary greatly if it is run over an initial, partial sequence rather than the entire sequence. While this prevents us from acting in real time, so to speak, it does allow us to come up with new research questions to investigate. For example, we can now examine how the number of over-achievers, or good students, changes. Or we can compare a new way of teaching logarithms this term with the old way from previous terms to see whether the number of rote memorizers decreases with the new method.

It would be ideal if the Viterbi paths clustered. Then we could easily predict changes in a student by examining which Viterbi path cluster they belong to. Unfortunately, standard clustering techniques failed to produce meaningful groups. So instead we focused on runs in the paths, that is, constant and contiguous sub-paths. The interpretation of a run is that a student who is in a run of group \(g\) is a student who stays in group \(g\) over a period of time. We chose to look at “long runs,” which were runs at least as long as a week's worth of assignments. From these runs, we obtained significant results.
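Extracting long runs from a Viterbi path can be sketched as follows. The example path and the threshold of three assignments per week are hypothetical; the actual number of assignments in a week is not stated here.

```python
from itertools import groupby


def long_runs(path, min_length):
    """Return (group, start_index, length) for every run of at least min_length."""
    runs = []
    start = 0
    for group, members in groupby(path):
        length = len(list(members))
        if length >= min_length:
            runs.append((group, start, length))
        start += length
    return runs


# Hypothetical Viterbi path for one student, one group label per assignment.
path = ["SS", "SS", "OA", "OA", "OA", "OA", "E", "OA", "OA", "OA"]

# Assume, for illustration only, that a week holds three assignments.
runs = long_runs(path, min_length=3)
```

Here `runs` is `[('OA', 2, 4), ('OA', 7, 3)]`: the two-assignment stretch of SS and the single E are too short to count as long runs.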

Results

Our results address two questions: (1) how quickly we can say that a student belongs to a group and (2) how we could anticipate when a student would change groups. Our models predict that most students start a long run as early as two weeks into the term. This has profound implications if we want our classes to start off on the right foot. Granted, there are a great many things outside of class which affect students, sometimes dramatically. But as there are things which we can affect, these results encourage us to pay careful attention to such factors.

In the following two figures, the number of students in a long run is plotted, with the groups of each run distinguished by color as before. The graph on the left is the number of students in a long run over the semester, while the graph on the right is the number of students in a long run of the employee group. In a class of about 270 students, the graph of all the runs clearly shows how fast people settle into a run of some group. In the employee long run graph, we observe interesting fluctuations in the number of students. We see that the number of students acting like employees has a few local maxima. The first pair of peaks occurs just after the first exam; the second pair of peaks occurs two weeks before the third exam and during the third exam; and finally we see a peak at the end, during the final exam. This seems to indicate that students start dragging their feet, so to speak, at particular times in the middle of the semester. This midterm lull is something every teacher notices, which is encouraging for the validity of our models.

/image-assests/student-runs.png

Future Work

The nice thing about our approach is that it does not depend upon Prof. Manspeaker's classification of students. All this technique requires is some way to classify students at any one point during the semester; from that, we can produce a model which extends this arbitrary grouping throughout the semester. We chose Prof. Manspeaker's groups because they divided the course into roughly equal groups. This helped the analysis by giving us data which would demonstrate changes in the groups, and therefore more ways to examine the data.

I would like to continue to look at changes in behaviors to try to identify more complex patterns, to apply the technique to other classification schemes, and to extend the analysis over multiple courses as well. The ultimate goal of this work is to provide students with an experience like what they get with Amazon. At Amazon, each person has a different home page, with personalized recommendations and suggestions based upon purchase history, viewing history, and many other metrics. Whereas Amazon's motive is profit, our motive would be for students to learn more, and maybe even to have an experience with math that isn't horrible. By using data mining, we too can personalize instruction, just as Amazon personalizes recommendations. As instructors, our goal would be to customize the course, homework, and assistance provided to the student based solely upon their attendance, homework scores, and exam scores. In particular, I'd like to be able to predict accurately when an individual student may be in trouble, e.g., moving from an over-achiever to an employee. With such a tool in place, I want to create effective ways of intervening before problems occur.