We present a characterization of four basic terabyte-scale image computations on a Hadoop cluster in terms of their relative efficiency according to a modified Amdahl's law. The work is motivated by the lack of standard benchmarks and stress tests for big image processing operations on a Hadoop cluster platform. Our benchmark design and evaluations were performed on one of three microscopy image sets, each consisting of roughly half a terabyte of image data. All image processing benchmarks executed on the NIST Raritan cluster with Hadoop were compared against baseline measurements: the TeraSort/TeraGen suite previously designed for Hadoop testing, image processing executions on a multiprocessor desktop, and executions on the NIST Raritan cluster using Java Remote Method Invocation (RMI) under multiple configurations. By applying our methodology to assess the efficiency of computations across cluster configurations, we can rank configurations and help scientists quantify the benefits of running image processing on a cluster.
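The abstract ranks cluster configurations by efficiency under a modified Amdahl's law. The paper's specific modification is not given here, so the following is only a minimal sketch of the classic Amdahl's law speedup and the derived parallel efficiency; the parallel-fraction and worker-count values are illustrative assumptions, not measurements from the paper.

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Classic Amdahl's law speedup S(N) = 1 / ((1 - p) + p / N),
    where p is the parallelizable fraction of the workload."""
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / n_workers)


def parallel_efficiency(parallel_fraction: float, n_workers: int) -> float:
    """Parallel efficiency E(N) = S(N) / N; values near 1.0 indicate
    that adding workers pays off, values near 0 that it does not."""
    return amdahl_speedup(parallel_fraction, n_workers) / n_workers


if __name__ == "__main__":
    # Hypothetical example: a workload that is 95% parallelizable
    # running on a 16-node cluster.
    print(f"speedup:    {amdahl_speedup(0.95, 16):.2f}")
    print(f"efficiency: {parallel_efficiency(0.95, 16):.2f}")
```

Comparing such efficiency numbers across configurations (Hadoop, RMI, single desktop) is one way a ranking like the one described above can be made quantitative.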
Proceedings Title: 2013 IEEE International Conference on Big Data
Conference Dates: October 6-9, 2013
Conference Location: San Diego, CA
Conference URL: http://www.ischool.drexel.edu/bigdata/bigdata2013/callforpaper.htm
Pub Type: Conferences
Keywords: Big Data Industry Standards, Big Data Open Platform, Big Data Applications and Infrastructure