The workshop will bring together multiple research and development (R&D) communities focusing on big image analyses in cloud computing environments. Such analyses are frequently supported by web client-server systems that execute a wide spectrum of algorithms designed to extract image-based measurements and to perform image classification, object detection, object registration, object tracking, and object recognition. The purpose of this workshop is to discuss the bio-medical and bio-materials science application needs for big image analysis solutions, current open-source technical solutions, and community-wide R&D interests in defining inter-operable algorithmic plugins for web client-server systems designed for big image analyses.
There is increasing interest in enabling discoveries from high-throughput, high-content microscopy imaging of biological specimens and bio-material structures under a variety of conditions. As automated imaging across multiple dimensions reaches throughputs of thousands of images per hour, the computational infrastructure for handling the images has become a major bottleneck. The resulting challenges range from transferring, storing, archiving, annotating, quantifying, and visualizing data to providing mechanisms that let non-computational experts from a variety of application domains apply the latest machine learning and artificial intelligence models. These challenges arise from big image data, complex phenomena to model, and non-trivial computational scalability that must accommodate advanced hardware and cutting-edge algorithms. Furthermore, the challenges are amplified by the need to engage a broad community of experts in analyzing complex image content and by the need to reproduce discoveries based on image measurements and any decisions derived from those measurements. Such measurements, discoveries, and decisions are critical for biological and bio-materials science applications, for instance, quality assurance of stem cell therapies, design of cancer treatments, high-throughput screening in drug discovery, and vaccine discoveries from atomic-resolution structures of viruses and protein complexes.
Existing solutions: To overcome the aforementioned challenges, several research institutions have prototyped web-based systems that facilitate access to large image databases and to high-performance computing (HPC) and cloud hardware resources. These prototypes leverage a variety of web technologies on the client side and a spectrum of databases, scientific computational workflow engines, and communication protocols on the server side in order to hide the infrastructure complexity from domain application experts and make those experts more productive in conducting research. While the web-based solutions deliver infrastructure capabilities, their capabilities for processing large images remain limited to the computational tools provided by each development team: the development of new tools is specific to each web solution, and a definition of an inter-operable web computational plugin does not exist.
With the increasing popularity of software containers as standardized units of deployment, there is an opportunity for the communities working with large microscopy images to discuss creating inter-operable web computational plugins. Such a web computational plugin consists of a software container and a web user interface (UI) description file for entering the parameters needed for software execution. Each container packages code with all of its dependencies and has an entry point for running the computation in any computing environment. Each UI description file contains metadata about the plugin container and its computation parameters; the description file is intended for dynamically generating a web UI for entering those parameters.
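To make the plugin concept concrete, the following is a minimal sketch of how a web system might turn a plugin description file into UI form fields. The manifest field names (`name`, `container`, `inputs`) and the widget mapping are illustrative assumptions, not a published standard:

```python
import json

# Hypothetical plugin description file; the schema shown here is an
# illustrative assumption, not an agreed-upon community standard.
MANIFEST = json.loads("""
{
  "name": "cell-segmentation",
  "container": "example.org/plugins/cell-segmentation:1.0.0",
  "inputs": [
    {"id": "imageDir",  "type": "path",   "title": "Input image collection"},
    {"id": "threshold", "type": "number", "title": "Intensity threshold"}
  ]
}
""")

def render_form_fields(manifest):
    """Map each declared plugin input to a generic web form field description."""
    widget_for = {"path": "file-picker", "number": "number-input", "string": "text-input"}
    return [
        {"id": inp["id"],
         "label": inp["title"],
         "widget": widget_for.get(inp["type"], "text-input")}
        for inp in manifest["inputs"]
    ]

fields = render_form_fields(MANIFEST)
```

Because the server only reads the description file, any plugin that ships a conforming manifest could get a parameter-entry UI without plugin-specific front-end code.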
Scope, Goals, and Topics:
The workshop scope will touch on the technical topics of big image data management, software containerization, software execution on advanced hardware architectures, container-based workflow management, dashboards for monitoring computations, web technologies for dynamic content creation, web-based visualization of large images, delivery of provenance information, and web plugins for image annotation creation. The main goal of the workshop is to establish a community consensus on creating inter-operable web computational plugins that can be chained into scientific workflows/pipelines and executed over large image collections regardless of the underlying cloud infrastructure components.
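The chaining of plugins into a pipeline can be sketched as follows. This is a minimal illustration under stated assumptions: the container names are hypothetical, and `run_plugin` stands in for a real launcher (e.g., a workflow engine dispatching to Docker or Kubernetes), simply recording each invocation here:

```python
def run_plugin(container, inputs, output_dir):
    """Placeholder for launching a containerized plugin; a real workflow
    engine would start the container and wait for completion."""
    return {"container": container, "inputs": inputs, "output": output_dir}

# Hypothetical three-stage image analysis pipeline.
pipeline = [
    ("example.org/plugins/flat-field-correction:1.0", "corrected/"),
    ("example.org/plugins/cell-segmentation:1.0",     "masks/"),
    ("example.org/plugins/feature-extraction:1.0",    "features/"),
]

data_dir = "raw-images/"
invocations = []
for container, out_dir in pipeline:
    invocations.append(run_plugin(container, {"imageDir": data_dir}, out_dir))
    data_dir = out_dir  # each stage consumes the previous stage's output
```

The design point is that chaining only requires agreement on how a plugin declares its inputs and outputs, not on how any individual plugin is implemented.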
Vested interests in the workshop topics: Multiple government agencies, non-profit research institutions, and commercial entities have invested substantial resources in building software for big data analyses and in setting up hardware infrastructure for cloud computing and high-performance computing. To improve the utilization of these investments, the private and public sectors could benefit from inter-operability of cloud-deployable algorithmic plugins. Specifically, improving the reuse and execution of open-source and closed-source algorithms applied to big image analyses is of interest to bio-medical and bio-materials science stakeholders, as the domain problems are complex and require very large numbers of image observations. We plan to facilitate discussions by forming working groups focusing on:
- Containerization of execution code
- Data storage and access interfaces for object-, block- and file-level storage
- Inter-operability requirements of workflow engines for running containerized plugins
- Standard packaging of web UI modules
- Security of container-based distribution
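On the last working-group topic, one basic building block for securing container-based distribution is digest pinning: a consumer verifies that downloaded container content hashes to a digest published by the plugin author, as in OCI/Docker image references of the form `repo/plugin@sha256:<digest>`. A minimal sketch, with an illustrative payload standing in for real image content:

```python
import hashlib

def verify_digest(blob: bytes, expected_digest: str) -> bool:
    """Return True only if the blob hashes to the pinned SHA-256 digest."""
    actual = "sha256:" + hashlib.sha256(blob).hexdigest()
    return actual == expected_digest

# Illustrative content; in practice this would be a container layer or manifest.
layer = b"example container layer content"
pinned = "sha256:" + hashlib.sha256(layer).hexdigest()  # digest published by the author

ok = verify_digest(layer, pinned)               # unmodified content verifies
tampered = verify_digest(layer + b"x", pinned)  # any modification fails
```

Digest pinning only guarantees integrity; questions such as signing, provenance, and trust delegation would remain for the working group to discuss.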