
Sometimes I Wish That It Would Rain Here

Friday, February 22, 2008

evaluation, interpretability, and the utility of dreaming

a few nights ago, my girlfriend and I watched The Science of Sleep. for those not familiar, the basic premise revolves around a main character who has difficulty distinguishing between his waking life and his dreams. afterward, we got into (what I felt was) a somewhat confused conversation about whether or not it was a good movie, why it was a good movie, and how you know it's a good movie (and I fear that conversation has led indirectly to what is an almost equally confused blog post). one of the pseudo-conclusions to which we came is that it's a good movie because of its interpretability. that is, there are several parts of the movie, foremost the ending but also bits and pieces throughout, where it was, we think, left intentionally unclear what exactly happened. the point is not to figure out the "true" or "real" story at those points. rather, the genius of the movie seemed to be its ability to engage the audience in interpreting those somewhat ambiguous parts. my mind kept slipping towards questions of evaluation; how do we know it's a good movie? when the key aspect of the movie has nothing to do with the movie objectively and everything to do with the interactions between movie and viewer, how can one really say anything about the movie itself?

evaluation is a huge buzzword in HCI. "ok, cool, you built your system, but does it work? does it achieve the intended goal?" I've heard it described that part of a dissertation is scoping out a problem, picking a portion of that problem, describing the win condition wherein you know that the problem has been solved, and, crucially, demonstrating that the win condition has been achieved. even when we recognize that evaluation is an interactive process, that it's really about the meeting of system and user, evaluation so often boils down to a question of success. does the system achieve the goals it set out to accomplish? there are certainly lots of conversations going on right now about richer, fuller means of evaluation, focusing less on system evaluation and more on experience evaluation, and emphasizing that evaluation is a process of determining value, which is necessarily contextually (historically, culturally, etc.) contingent. personally, I find a lot of this work both particularly compelling and very liberating, especially with respect to the epistemological questions it raises; what do we as a field consider valid knowledge, and how do we validate methods of knowledge production? on the other hand (maybe it's just that I'm having a hard time shedding my positivist roots), I still have a desire to know, does it work?

this desire becomes inherently problematic when the ostensible goal of a system is to support, facilitate, encourage, and even engender critical thinking and reflection, especially when that reflection hinges on the interpretability of the system. here, I refer to interpretability not as a question of "do participants interpret this system properly?" rather, the question I want to ask is, "to what extent does the system present a resource for interpretation?" it's difficult enough to ascertain whether or not interactors are engaging in these abstract processes--critical thinking, reflection, interpretation--to begin with. now, try to determine to what extent the interactor's thoughts, feelings, and behavior are a result of interacting with the system. the very notion seems misguided; we're not dealing with a simple cause-and-effect relationship here, but rather a whole complex system in which I doubt any single aspect can be causally linked to any other. besides, this isn't about controlling for confounding factors. it's about getting people to think, to critically engage, and to question, reconsider, and possibly even reformulate their conceptual frame.

I think one of the difficulties in my case is that the system I'm developing has a goal that seems objectively evaluable. does it do what I say it does? am I able to automatically identify conceptual metaphors (a la Lakoff and Johnson) in bodies of written text? well, I think the question of whether or not it works, or how well it works, depends largely upon the interactor's (i.e., "user's") interpretation of the system's results. moreover, I think it hinges on the interpretability of those results. the question, I suspect, should not be, "does the system accurately and correctly identify conceptual metaphors?" rather, the question should be, "does the system produce results that serve as a resource for the interactor's interpretation, and through that interpretive process does the interactor engage in critical thinking and reflection?" not that this is a particularly easy question to answer, but it seems a somewhat more useful one in terms of evaluating, i.e., determining the value of, the system. it's not about measuring success; it's about understanding the interactors' experience with the system.

this ended up getting too long for a single post, so I'll end with the above thoughts about evaluating interpretability and reflection. more stuff about dreaming to follow...
