Indiana University's "Big Red" supercomputer is again ranked among
the world's fastest, and IU accomplishments in advanced
cyberinfrastructure are attracting attention and acclaim at the
International Supercomputer Conference being held this week in
Dresden, Germany. Big Red placed 30th on the June 2007 list of the
world's 500 fastest supercomputers unveiled today at the conference,
and Indiana University's Data Capacitor team, with partners from
Technische Universitaet Dresden, demonstrated impressive performance
on a distributed transatlantic Lustre file system designed to move
large amounts of scientific data quickly and easily.
Thanks to an upgrade this spring with the assistance of the Indiana
Economic Development Corporation, Big Red moved up in rank, after
placing 31st in the previous (fall 2006) list. In existence since
1993, the TOP500 list is compiled twice a year by a group of highly
respected leaders in the supercomputing community and released at
the world's two largest supercomputing conferences, held each June
in Germany and each November in the US. The higher rankings of the
TOP500 list change very rapidly.
IU's Data Capacitor, a 535-terabyte storage system, is also
featured in the week's hardware news at the International
Supercomputing Conference. Using Wide Area Network access to the
Lustre file system and the Data Capacitor over GEANT2 and Internet2
advanced research networks, a team from IU and the Technische
Universitaet Dresden achieved data transfer rates of nearly 100
megabytes per second over a single 1-gigabit link across the
Atlantic Ocean and has plans
to increase the capabilities of long-distance data access via Lustre
in the near future.
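To put the transfer figure in perspective, the following is a small illustrative calculation, not part of the demonstration itself. It assumes decimal units (1 megabyte = 10^6 bytes, 1 gigabit = 10^9 bits) and a sustained rate; the function names are made up for this sketch.

```python
# Back-of-the-envelope check of the reported transfer rate.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 Gbit = 10**9 bits).

def link_utilization(mbytes_per_s: float, link_gbits_per_s: float) -> float:
    """Fraction of the link's nominal bit rate used by the data stream."""
    payload_bits_per_s = mbytes_per_s * 1e6 * 8
    link_bits_per_s = link_gbits_per_s * 1e9
    return payload_bits_per_s / link_bits_per_s


def hours_to_transfer(terabytes: float, mbytes_per_s: float) -> float:
    """Hours needed to move a data set at a sustained transfer rate."""
    seconds = (terabytes * 1e12) / (mbytes_per_s * 1e6)
    return seconds / 3600


utilization = link_utilization(100, 1)
hours_per_terabyte = hours_to_transfer(1, 100)
print(f"link utilization: {utilization:.0%}")
print(f"hours to move 1 TB: {hours_per_terabyte:.2f}")
```

Under these assumptions, 100 megabytes per second corresponds to roughly four fifths of the nominal capacity of a 1-gigabit link, and one terabyte would take a little under three hours to move.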
The ability to transparently access data across long distances is
critical to enabling new scientific advances, as the amount of
research data "born digital" continues to skyrocket. IU is also
working within the US to enable use of the Data Capacitor across
long distances, within the NSF-funded TeraGrid. Managing massive
amounts of data and large-scale computational analysis is a critical
aspect of Indiana University's strategy for supporting innovation.
Craig Stewart, associate dean for research technologies and chief
operating officer of Pervasive Technology Labs at Indiana University,
said, "Big Red has enabled scientific innovations at IU and, via the
TeraGrid, throughout the nation. The key challenge for us in the
months ahead will be to use Big Red to enable new business
innovations within the State of Indiana, working with our colleagues
from Purdue for the benefit of the economy of the state."
This material is based upon work supported by the National Science
Foundation under grant numbers CNS-0521433, ACI-0338618,
OCI-0451237, OCI-0535258, and OCI-0504075. Collaboration with the
Technische Universitaet Dresden has been supported by TU-D, Indiana
University, and the Fulbright Senior Scholar's program. Any
opinions, findings and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily
reflect the views of the National Science Foundation (NSF), Lilly
Endowment, Inc., or any other funding agency.
Indiana University Cyberinfrastructure News
Often a job's workflow has different requirements at different
stages of processing. It would be wasteful in a massively parallel
system to collect hundreds of processors, only to make them wait
while the data files are copied or other serial tasks are
performed. By the same token, if only a minimal number of
processors are requested for the bulk of the job's processing, it
would be nice to increase the number of processors for the parallel
calculations that could benefit. In LoadLeveler, this type of
workflow adjustment is called "staging" or "stepping."
Staging is accomplished in the LoadLeveler (LL) submit script. A
typical LL script has a keyword stanza, followed by the keyword
"queue" and then a shell command execution space. If only a single
stanza exists in the LL script, the job is always assigned
"step 0." This can be confirmed by noticing that job IDs always
seem to end in a zero. However, multiple stanzas can be linked
together in a single LL script to form a workflow.
Suppose we want to run a job that will:
  1. Copy a (previously compiled) executable file and a data file
     to a (previously created) scratch directory,
  2. Run the executable on 48 processors, and
  3. Copy the results to a safe place and remove the output files
     from the scratch directory.
This can be accomplished entirely in a single script, and the job
need never reserve more processors than are required for that portion
of the workflow.
How do we link the stanzas together? We introduce two new keywords,
step_name and dependency. We will first name each stanza with
step_name, and then use the dependency keyword to require a
successful completion of the previous step prior to executing the
next one. Each step is allocated its own time block. So, if a queue
limit is one week and six job steps are used, the resulting workflow
could run as long as six weeks. There is virtually no limit to the
number of steps which may be strung together. One ocean model job
required nearly one hundred! A simple example of the case presented
above may be found at:
http://rac.uits.iu.edu/hpc/loadleveler_ex.shtml -
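The chaining described above can be sketched with a short Python helper that prints the keyword stanzas of a three-step job. This is a minimal illustration, not the example at the URL above: the step names are hypothetical, and the resource keywords and values (job_type, node, tasks_per_node) are assumptions that would need to match your site's configuration. The shell commands that follow the stanzas in a real script are omitted.

```python
# Sketch: generate LoadLeveler keyword stanzas in which each step
# depends on the successful completion (exit code 0) of the previous one.
# Step names and resource values below are illustrative assumptions.

def build_stanzas(steps):
    """steps: list of (step_name, {keyword: value}) in execution order."""
    lines = []
    previous = None
    for name, keywords in steps:
        lines.append(f"# @ step_name = {name}")
        if previous is not None:
            # Run this step only if the previous step exited with 0.
            lines.append(f"# @ dependency = ({previous} == 0)")
        for key, value in keywords.items():
            lines.append(f"# @ {key} = {value}")
        lines.append("# @ queue")
        lines.append("")
        previous = name
    return "\n".join(lines)


def worst_case_weeks(step_count, queue_limit_weeks):
    """Each step gets its own time block, so the limits add up."""
    return step_count * queue_limit_weeks


steps = [
    ("copy_in", {"job_type": "serial"}),
    # 12 nodes x 4 tasks = 48 processors (node layout is an assumption).
    ("compute", {"job_type": "parallel", "node": 12, "tasks_per_node": 4}),
    ("copy_out", {"job_type": "serial"}),
]
job_text = build_stanzas(steps)
print(job_text)
print("worst case (weeks):", worst_case_weeks(6, 1))
```

Only the middle step asks for parallel resources, so the copy steps do not hold 48 processors idle. The last line reproduces the arithmetic from the text: six steps under a one-week queue limit could run for up to six weeks.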
The Cygwin tools are ports of the popular GNU development tools for
Microsoft Windows. They run thanks to the Cygwin library, which
provides the Unix system calls and environment. With Cygwin
installed, Windows users can use tools such as hsi or htar to
connect to the Massive Data Storage System (MDSS). HSI is fast,
robust, and user-friendly, and provides the best-performing interface
between IU's supercomputers and MDSS. HSI also automatically selects
the best class of service for a given upload.
Cygwin can be downloaded from http://www.cygwin.com/mirrors.html.
Versions of hsi and htar that work with Cygwin can be downloaded
from https://rfs.iu.edu/clients/, which also offers the HPSS.conf
file for download. Place HPSS.conf in /usr/local/etc from the
Cygwin shell, or in /cygdrive/usr/local/etc/ from the Windows command
shell. To run hsi or htar under Cygwin, the Windows firewall must
allow hsi data transfers. If you can't modify the firewall rules, use
the "firewall -on" command to get around the firewall.
HSI commands will seem familiar to UNIX and FTP users. A session
might look like the following (here % is the UNIX shell prompt,
? is the HSI prompt):
    %
    % hsi
    Principal: jdoe
    [jdoe]Password:
    Username: jdoe UID: 11021 CC: 11021 Copies: 1 [hsi.3.3.3 Fri
        Jan 12 13:36:06 EST 2007]
    ?
    ? put myfile1.dat
    put myfile1.dat : /hpss/j/d/jdoe/myfile.dat ( 10485760 bytes,
        12283.4 KBS (cos=3))
    ? cd test2
    ? get myfile2.mov
    Scheduler: retrieving file(s)
    get myfile2.mov : /hpss/j/d/jdoe/movies/myfile2.dat
        (2005/09/29 08:49:03 10485760 bytes, 16842.8 KBS )
Below is an example of using htar once you have opened a Cygwin
command shell:
    Desktop> mkdir /cygdrive/c/tmp
    Desktop> ./htar.exe cf test/t.tar /cygdrive/c/Documents\ and\ Settings/jdoe/My\ Documents
    Principal: jdoe
    [jdoe]Password:
    HTAR: HTAR SUCCESSFUL
    Desktop>
(The htar command must be entered on one line; the path shown is from
a Windows XP system.)
This backs up the user's "My Documents" folder to the
Massive Data Storage System. -
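Paths that contain spaces, such as "Documents and Settings", are the usual stumbling block in the htar command above. As a hedged sketch for anyone scripting such backups, Python's standard shlex module can produce a safely quoted command line. The helper name is hypothetical; the htar arguments simply mirror the example above.

```python
import shlex

# Sketch: build a quoted htar command line for a source directory whose
# path contains spaces. htar_create_command is a hypothetical helper;
# the "cf" usage mirrors the session shown above.

def htar_create_command(archive, source_dir, htar="./htar.exe"):
    args = [htar, "cf", archive, source_dir]
    # shlex.quote wraps any argument containing spaces in single quotes.
    return " ".join(shlex.quote(arg) for arg in args)


command = htar_create_command(
    "test/t.tar",
    "/cygdrive/c/Documents and Settings/jdoe/My Documents",
)
print(command)
```

Single-quoting the whole path is equivalent, in a Cygwin (bash) shell, to escaping each space with a backslash as in the session above.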
Use of Research Technologies systems requires a communications
client known as the secure shell (ssh). Graphical applications
require another package known as an X server. These software
packages can be a hassle to install and configure. For convenience,
Research Technologies has created a CD that provides both the secure
shell and an X server, which run directly from the CD with no
software to install or configure. The CD is known as
XLiveCD, and it is available from http://xlivecd.indiana.edu/.
After you download a copy and burn it to CD, here's how to use it:
1. Insert the CD into the drive
2. Accept the license and indicate that you want to run the software
3. Indicate the number of buttons your mouse has
4. Wait for a window to appear
When the window appears, you can click in it and then run the
secure shell to connect to a system. For example, user hoagyc
would use the following command to connect to Big Red:
ssh hoagyc@bigred.teragrid.iu.edu
You'll then be asked to confirm the authenticity of the host's key and
to supply your password. You can then run applications, and
graphical applications will create windows on your workstation.
XLiveCD does not yet run on Windows Vista. Briefly, the software
that was used to build XLiveCD does not yet support Windows Vista.
We are watching the development of that software, and we will
release a new version of XLiveCD when the software is stable.
XLiveCD has been very popular. Since December 2004, over 1,000 copies
have been downloaded by people at IU, and over 175,000 copies have
been downloaded throughout the world. -
There are several job openings available in Indianapolis, focused on
development of applications for multi-core processors.
Technical staff are expected to be computational or computer
scientists with a Ph.D. or M.S. in an appropriate scientific or
engineering discipline, able to work with other scientists/engineers
in creating applications and middleware (tools, libraries, etc.) for
advanced multi-core technologies in targeted disciplines. Experience
developing and optimizing highly scalable parallel code is
essential. Other key skills needed/desired are:
* Initiative and self-motivation; able to work successfully and
  achieve objectives without frequent supervision.
* Able to excel in situations where not all project
  elements are fully or clearly defined.
* Successful in working in distributed collaborative teams.
  Demonstrated talent and success in leading teams are valued.
* Strong communication skills, both oral and written.
* Effective in engaging with customers/business partners in
  establishing projects and achieving the project objectives.
Those interested should contact researchtechnologies@iu.edu -
Wednesday, July 25, 12:30-1:30 -- ICTC Room 497 & IMU Walnut Room:
Research Technologies Round Table
Scott McCaulay will discuss IU's role in the TeraGrid.
--------
Tuesday, July 10 -- Submission deadline
Workshop on Progress Toward Petascale Applications in Bioinformatics
and Computational Biology, to be held in conjunction with the IEEE
7th International Symposium on Bioinformatics & Bioengineering
(BIBE 2007), which will be held at Cambridge-Boston, Massachusetts,
USA, October 14-17, 2007.
The final paper submission deadline has been extended to July 10, 2007.
All accepted papers will be published by IEEE, indexed in EI, INSPEC,
DBLP, and the Library of Congress, and can be further included in
journal issues dedicated to the IEEE 7th BIBE, such as BMC, with a
unique PubMed ID for each paper (both SCI and PubMed/Medline indexed).
--------
Sunday, October 14, 2007 -- Boston, MA
Indiana University is offering a TeraGrid-related tutorial, "Using
IU's Big Red PowerPC Cluster and IU Storage Resources via the
TeraGrid," at BIBE 2007 (Bioinformatics and Bioengineering).
The primary purpose of this tutorial is to enable TeraGrid users to
learn about the Big Red system so that they can easily use codes
already ported and optimized for that system (e.g. WRF, NAMD, MILC),
or rapidly migrate other applications to Big Red.
In addition, as massive computations commonly depend on massive data
sets as input, and produce massive data sets as output, it may be
useful to obtain a working knowledge of IU's archival data storage
system, and how to store and access files via GridFTP.
Plan to attend to gain hands-on experience with Big Red and IU's
High Performance Storage System, as resources on the TeraGrid. For
more information, see
http://www.cs.gsu.edu/BIBE07/index.php -
Planned maintenance
-------------------
System          Date   Time         Action
Libra           07/03  08:00-12:00  logging updates
RDC             07/03  08:00-17:00  OS Patches
Steel           07/03  06:00-10:00  OS Patches
Data Capacitor  07/03  08:00-17:00  updates and patches -
If you have questions pertaining to IU's cyberinfrastructure, or you
are encountering some difficulty, there are several ways to obtain
help.
An introduction and overview titled "Indiana University's
CyberInfrastructure: The least you need to know" has been updated
and is available at http://rc.uits.iu.edu/education_and_training/ .
The IU Knowledge Base (http://kb.iu.edu) is an excellent source of
help on how to do things.
If you have problems which the KB does not enable you to solve,
questions about system outages, or if you just have a problem and
you don't know who to contact, send email to
researchtechnologies@iu.edu.
