A case study empirical comparison of three methods to evaluate tutorial behaviors
Abstract
Researchers have used various methods to evaluate the fine-grained interactions of intelligent tutors with their students. We present a case study comparing three such methods on the same data set, logged by Project LISTEN's Reading Tutor from usage by 174 children in grades 2-4 (typically 7-10 years) over the course of the 2005-2006 school year. The Reading Tutor chooses randomly between two different types of reading practice. In assisted oral reading, the child reads aloud and the tutor helps. In "Word Swap," the tutor reads aloud and the child identifies misread words. One method we use here to evaluate reading practice is conventional analysis of randomized controlled trials (RCTs), where the outcome is performance on the same words when encountered again later. The second method is learning decomposition, which estimates the impact of each practice type as a parameter in an exponential learning curve. The third method is knowledge tracing, which estimates the impact of practice as a probability in a dynamic Bayes net. The comparison shows qualitative agreement among the three methods, which is evidence for their validity.
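The two model-based methods named in the abstract can be illustrated with a minimal sketch. It assumes the forms commonly given in the literature for learning decomposition (an exponential curve in which one practice type is weighted relative to the other) and for knowledge tracing (a Bayesian update followed by a learning transition). All function and parameter names here are hypothetical, the choice of which practice type carries the weight `B` is an assumption, and nothing below is taken from the paper's own implementation or fitted values.

```python
import math


def predicted_error(t_assisted, t_word_swap, A, b, B):
    """Learning decomposition sketch (assumed form).

    A: error rate before any practice, b: learning rate,
    B: relative value of one assisted-reading exposure, measured in
       units of one Word Swap exposure (assumed baseline).
    """
    return A * math.exp(-b * (B * t_assisted + t_word_swap))


def kt_update(p_known, correct, p_learn, p_guess, p_slip):
    """Knowledge tracing sketch: one observation, then one chance to learn.

    p_learn plays the role of the "impact of practice" probability; a
    separate p_learn per practice type would be passed in by the caller.
    """
    if correct:
        num = p_known * (1 - p_slip)
        den = num + (1 - p_known) * p_guess
    else:
        num = p_known * p_slip
        den = num + (1 - p_known) * (1 - p_guess)
    posterior = num / den  # P(known | observation)
    return posterior + (1 - posterior) * p_learn
```

In this sketch, a value of `B` above 1 would mean one assisted-reading exposure counts for more than one Word Swap exposure, and a higher `p_learn` for one practice type would mean that type is more likely to move a word into the known state.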
BibTeX
@conference{Zhang-2008-122143,
author = {Xiaonan Zhang and Jack Mostow and Joseph E. Beck},
title = {A case study empirical comparison of three methods to evaluate tutorial behaviors},
booktitle = {Proceedings of International Conference on Intelligent Tutoring Systems (ITS '08)},
year = {2008},
month = {June},
pages = {122--131},
}