V9 Facilities Dec 18, 2015
In 2011, the University made a 5-year commitment to build out the facility into a comprehensive cyberinfrastructure for research computing, creating the department of Research Computing as part of UF IT. UF Research Computing has grown to a staff of about 20 FTE. In 2015, the University renewed its commitment for another 5 years, through 2020. Research Computing now supports the work of 400 faculty-led research groups.
Further details on UF Research Computing can be found at http://www.it.ufl.edu/units/ and http://www.rc.ufl.edu.
Universities in the state of Florida joined forces in the Sunshine State Education & Research Computing Alliance (SSERCA) to build a robust cyberinfrastructure and to share expertise and resources (http://sserca.org). The current members are Florida Atlantic University (FAU), Florida International University (FIU), Florida State University (FSU), University of Central Florida (UCF), University of Florida (UF), University of Miami (UM), and University of South Florida (USF). The affiliate institutions are Florida Agricultural and Mechanical University (FAMU), University of North Florida (UNF), University of West Florida (UWF), Florida Polytechnic University (FPU), Florida Institute of Technology (FIT), Nova Southeastern University, and New College of Florida.
The Florida Lambda Rail (FLR) provides the underlying fiber optic network and the network connectivity between these institutions and many others. The FLR backbone completed its upgrade to 100 Gbps in June 2015. The University of Florida is connected to this backbone at the full speed of 100 Gbps and has been connected at that rate to the Internet2 backbone since January 2013.
Research Computing operates the HiPerGator supercomputer, a cluster-based system that had a combined capacity of about 21,000 cores in multi-core servers. In November 2015, this capacity was expanded by adding 30,000 new Intel cores, described at http://www.rc.ufl.edu/resources/hardware/hipergator-2-0/, bringing the total to 51,000 cores. The servers are part of an integrated InfiniBand fabric. The clusters share over 5 petabytes (PB) of distributed storage via the Lustre parallel file system. In addition, Research Computing houses about 2 PB of storage for the High Energy Physics collaboration of the Compact Muon Solenoid (CMS) experiment. The system includes over 100 NVIDIA GPU accelerators and 24 Intel Xeon Phi accelerators, available for experimental and production research as well as for training and teaching.
Research projects may involve storing and processing restricted data, including intellectual property (IP), protected health information (PHI), and Controlled Unclassified Information (CUI), regulated by the Health Insurance Portability and Accountability Act (HIPAA), the International Traffic in Arms Regulations (ITAR), the Export Administration Regulations (EAR), and the Family Educational Rights and Privacy Act (FERPA). For such projects, Research Computing supports two environments:
Research Shield (https://shield.ufl.edu/) meets the NIST 800-53 rev4 “moderate” rating for contracts that require FISMA compliance and has been operating since June 2015, and
GatorVault (http://www.rc.ufl.edu/resources/hardware/gatorvault/) is approved for PHI, FERPA, IP, and ITAR/EAR restrictions and started operating in December 2015.
More details can be found at http://www.rc.ufl.edu/services/compliant-environment/.
The Research Computing systems are located in the University of Florida data center. The machine room is connected to other campus resources by the 200 gigabit-per-second (Gb/s) Campus Research Network (CRN), now commonly called the Science DMZ. The CRN was created with an NSF Major Research Instrumentation (MRI) award in 2004 and has been maintained by the University since the end of that award. The CRN connects the HPC systems to the Florida Lambda Rail, from which the National Lambda Rail and Internet2 are accessible. In April 2013, the University of Florida became the first institution to meet all requirements to become an Internet2 Innovation Platform, which implies the use of software-defined networking (SDN), the implementation of a Science DMZ, and a connection at 100 Gb/s to the Internet2 backbone. An NSF CC-NIE award in 2012 funded the 100 Gb/s switch, and an NSF MRI grant awarded in 2012 funded the upgrade of the CRN (Science DMZ) to 200 Gb/s. The upgrade has been operational since the winter of 2013.
By the end of 2014, the campus network infrastructure was upgraded to support virtual network environments. These virtual environments make it possible to extend physical networks beyond their physical boundaries, which traditionally coincide with individual buildings. There are three physical networks:
The Academic network,
The Health network, which allows protected health information to be stored and accessed, and
The Campus Research network, or Science DMZ, which connects HPC resources with data-generating instruments.
With the virtual network environments, it is possible to connect instruments in any enabled building to the Science DMZ virtual environment, even if the instrument resides in a building that is served by the physical Health network. Similarly, researchers can choose to be connected to the Academic virtual network even if their offices are in a Health network building. The virtual environments allow the correct policies and security measures to be deployed on a fine-grained scale to meet the needs of the people using the network. Future virtual network environments to be added include:
An administrative virtual network environment, with a level of security between that of the Academic and Health networks, and
An industrial building control network environment, which will separate traffic for monitoring and controlling building systems from the networks used by the occupants of the buildings.
The funding model for Research Computing includes the commitment from the Provost, the VP for Research, and the VP and CIO to provide machine-room facilities with electrical power and cooling, as well as professional staff. The University has a substantial investment in research computing infrastructure, including a data center completed in 2013 on the East Campus that provides 10,000 sq. ft. of machine room space, of which 5,000 sq. ft. is dedicated to research computing.
The University pays the salaries of the 20 highly qualified staff members, including several with a PhD or Master's degree in science or engineering. In addition to sharing the system design, installation, and administration duties, staff members provide application support and consulting services to faculty members, their research associates, and their graduate students. This support ranges from assistance with job flow management and installation of open-source software to teaching students how to improve the MPI performance of their programs.
Research Computing provides advanced support and training to the user community. Many training materials are now available online. The training schedule can be found at http://wiki.rc.ufl.edu/doc/Training.
In addition, user feedback meetings are held, and a training workshop called Research Computing Day is organized every semester. Several graduate courses use HiPerGator to train and prepare graduate students to use the clusters and the software for their thesis research.