CluE's Data-Intensive Computing Journey Takes Next Step
05 May 2008
The Computer and Information Science
and Engineering (CISE) directorate at the National Science Foundation
(NSF) released a solicitation for proposals for the new Cluster
Exploratory (CluE) initiative. The CluE program was announced in
February as a part of a relationship between Google, IBM and NSF. NSF
hopes this initiative will help lead to innovations in the field of
data-intensive computing, as well as serve as an example for future
collaborations between the private sector and the academic computing
research community.
CluE will provide NSF-funded researchers access to software and services
running on a Google-IBM cluster to explore innovative research ideas in
data-intensive computing. NSF will allocate cluster computing resources
for a broad range of proposals which will explore the potential of this
technology to contribute to science and engineering research and produce
applications which promise to benefit society as a whole.
"The software and services that run on these data clusters provide a
brand new paradigm for highly parallel, highly reliable distributed
computing, especially for processing massive amounts of data," said
Jeannette Wing, assistant director for CISE at NSF. Academic researchers
have expressed a need for access to similar computing resources that
will allow them to engage and explore this emerging and pervasive model
of computing.
In the last five years, private sector companies have launched a number
of highly effective Internet-scale applications powered by massively
scaled, highly distributed computing resources known as data clusters.
Sometimes referred to as data centers or server farms, these clusters
contain as many as 90,000 servers, each co-located with hundreds of
gigabytes of data. These increases in network capacity and fundamental
changes in computer architecture are encouraging software developers to
take new approaches to computer-science problem solving.
Until now, such resources have not been easily available or affordable
for academic researchers. In October 2007, Google and IBM created a
large-scale computer cluster of approximately 1,600 processors to give
the academic community access to otherwise prohibitively expensive
resources. Earlier this year, NSF joined the two companies to assist
with this effort, and the CluE initiative was born. "With the CluE
initiative," Wing said, "through the software and services provided by
Google and IBM, the academic research community will now have access to
such resources."
This new relationship expands access to this research infrastructure to
academic institutions across the nation. In an effort to create greater
awareness of research opportunities using data-intensive computing, NSF
is now soliciting proposals from academic researchers and will select
those who receive access to the cluster.
NSF will also provide support to the researchers to conduct their work,
while Google and IBM will cover the costs associated with operating the
cluster and provide other support to the researchers.
Wing noted that the initiative is looking for proposals that focus on
data-intensive applications and "not cluster computing per se. We are
not looking for scientific applications that are based primarily on
solving massive numbers of partial differential equations since high-end
computing resources are available for such research already."
From this initial solicitation of the CluE initiative, NSF expects to
award up to $5 million spread across 10 to 15 awards, depending on
availability of funds. Selected projects will be funded up to $500,000,
for durations of up to two years.