Panel Discussion
A. Panel for CSC 2011
B. Panelists
Prof. Lixin Gao, University of Massachusetts
C. Panel Theme
The Cloud is no longer a debated future technology; it simply exists. It grew out of years of vision, research, and development. The Cloud now penetrates our daily lives and provides resources ranging from computation and hardware to software and services.
Title: Distributed Framework for Iterative Computations on Massive Datasets
Abstract: Iterative algorithms are pervasive in many applications such as search engine algorithms, machine learning, and recommendation systems. These applications typically involve datasets of massive scale, so fast iterative computation over massive datasets is essential. This is particularly important for online queries such as keyword-based search queries. In this talk, we present an overview of the MapReduce framework and propose two frameworks, iMapReduce and pMapReduce, that enable fast iterative computations. By supporting iterative computation and prioritized execution, we can ensure faster convergence of the iterative process. Both iMapReduce and pMapReduce preserve the MapReduce distributed computing framework and are particularly efficient for online queries such as top-k queries. We implement iMapReduce and pMapReduce on top of Apache Hadoop and evaluate their performance. Our evaluation results show that pMapReduce can reduce computation time by two orders of magnitude compared with MapReduce. At the end of the talk, I will provide an overview of ongoing projects in my research group.
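To make the idea of an iterative computation expressed as repeated map and reduce phases concrete, here is a minimal single-process sketch in plain Python. It is not the iMapReduce or pMapReduce API, and it does not use Hadoop. The PageRank example, the function names (`map_phase`, `reduce_phase`, `iterate_pagerank`), the tiny graph, and the damping and tolerance values are all assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Emit (target, contribution) pairs: each node splits its rank over its out-links."""
    for node, out_links in graph.items():
        if out_links:
            share = ranks[node] / len(out_links)
            for target in out_links:
                yield target, share


def reduce_phase(pairs, nodes, damping):
    """Sum the contributions per node, then apply the damping factor."""
    sums = defaultdict(float)
    for target, value in pairs:
        sums[target] += value
    n = len(nodes)
    return {node: (1 - damping) / n + damping * sums[node] for node in nodes}


def iterate_pagerank(graph, damping=0.85, tol=1e-9, max_iters=200):
    """Repeat map + reduce until the ranks change by less than tol (L1 distance)."""
    nodes = list(graph)
    ranks = {node: 1.0 / len(nodes) for node in nodes}
    for iteration in range(1, max_iters + 1):
        new_ranks = reduce_phase(map_phase(graph, ranks), nodes, damping)
        delta = sum(abs(new_ranks[v] - ranks[v]) for v in nodes)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks, iteration


# Hypothetical three-node graph: every node has at least one out-link.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks, iterations = iterate_pagerank(graph)

# A top-k query over the converged result, here with k = 2.
top_k = sorted(ranks, key=ranks.get, reverse=True)[:2]
```

Each pass of the loop corresponds to one full map and reduce round. In a plain MapReduce deployment, each round would typically be submitted as a separate job, which is the overhead that frameworks with built-in iteration support aim to avoid. Prioritized execution is not modeled in this sketch.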