Evaluating ubiquitous systems is hard, and the problem has attracted the attention of others in the research community. These investigators, like others in CSCW, argue that there is a basic mismatch between traditional evaluation techniques and the needs posed by ubiquitous systems. Specifically, these systems are embedded in a variety of complex real-world environments that cannot easily be modeled (as required by theoretical analyses), or simulated, measured, or controlled (as required by laboratory experiments). As a result, many investigators have abandoned traditional comparative evaluation techniques and opted instead for techniques adapted from the social sciences, such as anthropology. We wanted to perform a comparative evaluation similar to a laboratory experiment, but in such a way that we could observe the effects of our design decisions in relatively unconstrained, real-world use. This led us to the process described in this paper.