Understanding Failure Response in Service Discovery Systems
Kevin L. Mills, Stephen Quirolgico, Christopher E. Dabrowski
Service discovery systems enable distributed components to find each other without prior arrangement, to express capabilities and needs, to aggregate into useful compositions, and to detect and adapt to changes. First-generation discovery systems can be categorized by which of three underlying architectures they use and by their choice of behaviors for discovery, monitoring, and recovery. This paper reports a series of investigations into the robustness of the designs that underlie selected service discovery systems. The paper presents a set of experimental methods for analyzing robustness in discovery systems under increasing failure intensity. These methods yield quantitative measures of effectiveness, responsiveness, and efficiency. Using these methods, we characterize the robustness of alternative service discovery architectures and discuss the benefits and costs of various system configurations. Overall, we find that first-generation service discovery systems can be robust in difficult failure environments. This work contributes to a better understanding of failure behavior in existing discovery systems, allowing potential users to configure deployments to obtain the best achievable robustness at the lowest available cost. The work also contributes to design improvements for next-generation service discovery systems.
Mills, K., Quirolgico, S. and Dabrowski, C., Understanding Failure Response in Service Discovery Systems, Journal for Cluster Computing, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=150110 (Accessed June 2, 2023)