Most industry executives, military planners, research managers or venture capitalists charged with assessing the potential of an R&D project are probably familiar with the wry twist on Arthur C. Clarke's third law*: "Any sufficiently advanced technology is indistinguishable from a rigged demo."
After serving for five years as independent evaluators of emerging military technologies nurtured by the Defense Advanced Research Projects Agency (DARPA), a team from the National Institute of Standards and Technology (NIST) shares critical "lessons learned" that can help businesses and others negotiate the promises and pitfalls encountered when pushing the technology envelope to enable new capabilities.
Writing in the International Journal of Intelligent Control and Systems,** the NIST researchers also describe the evaluative framework they devised for judging the performance of a system and its components as well as the utility of the technology for the intended user. Called SCORE (System, Component, and Operationally Relevant Evaluations), the framework is a unified set of criteria and software tools for evaluating emerging technologies from different perspectives and levels of detail and at various stages of development.
SCORE was developed for evaluating so-called intelligent systems, a fast-growing category of technologies ranging from robots and unmanned vehicles to sensor networks, natural language processing devices and "smart" appliances. By definition, explains Craig Schlenoff, acting head of NIST's Systems Integration Group, "Intelligent systems can respond to conditions in an uncertain environment, be it a battlefield, a factory floor, or an urban highway system, in ways that help the technology accomplish its intended purpose."
Schlenoff and his colleagues used their SCORE approach to evaluate technologies as they progressed under two DARPA programs: ASSIST and TRANSTAC. In ASSIST, DARPA is funding efforts to instrument soldiers with wearable sensors—video cameras, microphones, global positioning devices and more—to continuously record activities while they are on a mission. TRANSTAC is driving the development of two-way speech-translation systems that enable speakers of different languages to communicate with each other in real-world situations, without an interpreter. By providing constructive feedback on system capabilities, the SCORE evaluative framework helps to drive innovation and performance improvements.
Several of the lessons recounted by the NIST team concern how to maximize the contributions of test subjects and technology developers without biasing test results. "There is often a balancing act," they write, "between creating the evaluation environment in a way that shows the system in the best possible light vs. having an environment that is as realistic as possible."
They also discuss unavoidable trade-offs due to costs, logistics, or other factors. While evaluators and technology developers should never lose sight of their ultimate objective, the NIST researchers also stress the need for flexibility over time. As more is learned about the system and about user requirements, features may change and project goals may be modified, necessitating adjustments to the evaluation approach.
"The main lesson," Schlenoff explains, "is that the extra effort devoted to evaluation planning can have a huge effect on how successful the evaluation will be. Bad decisions made during the design can be difficult and costly to fix later on."