The ASC Program petascale ecosystem at LLNL, into which the Sequoia system will be integrated, incorporates a single enterprise-wide file system to which multiple computational, visualization and archival resources read and write simulation, visualization and checkpoint/restart data. The incentives for an enterprise-wide file system are strong: it is prohibitive in cost and performance to move or copy the multi-petabyte file sets created in the simulation phase for subsequent processing, such as post-processing and visualization. The goal of LLNS with the development of Sequoia is to integrate this system into the existing Secure Computing Facility (SCF) simulation environment based on the Lustre enterprise-wide file system. This simulation environment at LLNL will be based on 1 and 10 Gb/s Ethernet and possibly InfiniBand™ technology.
Figure 1-5: The Sequoia simulation environment at LLNL includes access to the Lustre enterprise-wide file system, Login Nodes (LN), Service Nodes (SN) and control management network, visualization cluster (VIS), archive and WAN resources.
A schematic of the Sequoia simulation environment at LLNL is depicted in Figure 1-5. This SOW includes the Sequoia back-end of compute nodes (CN) and I/O nodes (ION), the login nodes (LN), the control management network, and the Service Nodes (SN). Other existing and future compute, visualization and storage resources are part of the overall LLNL classified simulation environment. A Lustre-based enterprise-wide file system and a 1/10 Gigabit Ethernet and possibly IBA federated Storage Area Network (SAN) switch are LLNS Furnished Government Property (LFGP).
In this Sequoia target architecture, the CN are a set of nodes that run end-user scalable MPI and SMP parallel scientific simulation applications; they are scaled to meet the overall peak petaFLOP/s and delivered application performance metrics in section 2.1. The ION provide Lustre I/O function-shipping capability and high-bandwidth access to the Lustre OSS and MDS resources, giving applications running on the CN parallel file system access. The number of ION is scaled to meet the delivered I/O performance requirements in sections 2.3 and 2.9.1. In addition, the ION provide IP routing from the CN to the SAN. The LN provide nodes for users to log in (via ssh and associated tools) and interact with the system to perform code development activities, run and interact with interactive jobs, and manage (launch, terminate and status) batch jobs. The number of LN is scaled to meet the number of active users and compilations required in section 2.6.1. The SN are a set of nodes that provide all scalable system administration and RAS functionality; the number of required SN is determined by the Offeror's scalable system administration and RAS architecture and the overall size of the system.
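The scaling relationships above (ION count driven by delivered I/O bandwidth, LN count by active users and compilations) amount to a simple sizing calculation. The sketch below is purely illustrative: the `ion_count` helper and the bandwidth figures in the example are hypothetical and do not come from this SOW's requirements tables.

```python
import math

def ion_count(target_gbs: float, per_ion_gbs: float) -> int:
    """Number of I/O nodes needed to deliver a target aggregate bandwidth.

    Both inputs are hypothetical sizing parameters (in GB/s): the delivered
    I/O bandwidth requirement, and the sustained Lustre bandwidth a single
    ION can forward between the interconnect and the SAN.
    """
    return math.ceil(target_gbs / per_ion_gbs)

# Example with made-up numbers: a 512 GB/s delivered-bandwidth requirement
# served by IONs that each sustain 3.2 GB/s implies 160 IONs.
print(ion_count(512, 3.2))  # → 160
```

The same ceiling-division pattern applies to sizing the LN pool against concurrent users, with the per-node capacity figure substituted accordingly.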
The diagram explicitly shows the interconnection, via the SAN switch, of the Sequoia back-end, the front-end nodes, the service nodes, and the Lustre file system. This configuration provides for the addition of future services via connection to the SAN switch.
The login nodes are the interactive resources through which users access Sequoia. Users will edit, configure and compile codes, create job control files, launch jobs on Sequoia, post-process output, and perform other interactive activities. System administrators will also utilize the login nodes to control and configure Sequoia.
A large federated 1/10 Gigabit Ethernet and possibly IBA switch is the main communications path from Sequoia to the outside world. This switch is designed to provide high-speed connectivity to the Lustre file system, which is the main disk storage for Sequoia. The switch also gives other resources access to the files on the Lustre file system: interactive users on the login nodes will have ready access to files on Lustre, as will visualization servers, archive services, and other resources on the SCF network.
A control and management network (CMN), shown in yellow in Figure 1-5, provides system administrators with a separate command and control path to Sequoia. This private network is not available to unprivileged users.
Figure 1-6: ASC Dawn Simulation Environment.
The ASC Program intends to integrate the Dawn system into the existing SCF 1/10 Gb/s Ethernet federated-switch Storage Area Network (SAN) currently in use at LLNL for classified computing (see Figure 1-6). The ASC Program will augment this SAN and Lustre file system with the necessary networking and RAID disk resources to provide an appropriately scaled Lustre file system for Dawn and the other computing resources connected to the SCF simulation environment. It is therefore essential that the I/O subsystem connections for Dawn be based on a SAN technology that can interoperate in this heterogeneous environment. At this time, the leading contenders for this SAN technology appear to be InfiniBand™ 4x QDR, 1000Base-SW and 10GBase-SW.
In addition, the ASC Program expects TCP/IP offload engines (TOEs) to be available for these competing SAN technologies. These TOEs will allow extremely fast TCP/IP communications that do not burden the cores/threads on the Dawn and Sequoia nodes originating the traffic. Thus the ideal Dawn and Sequoia systems will have outboard (to the I/O nodes) TOE devices that interface the SAN to the external networking environment.
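One concrete example of the per-packet work a TOE moves off the host cores is the Internet ones'-complement checksum required for TCP/IP. The sketch below is a generic illustration of that arithmetic (per RFC 1071), not anything specified in this SOW; on a TOE-equipped path this computation, along with segmentation and acknowledgment processing, runs on the adapter rather than on the CN/ION cores.

```python
def internet_checksum(data: bytes) -> int:
    """Ones'-complement Internet checksum (RFC 1071) over a byte string.

    Illustrative only: this is the kind of per-packet arithmetic a TOE
    performs in hardware so the host cores do not have to.
    """
    if len(data) % 2:          # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # sum 16-bit words
    while total >> 16:                          # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF                      # ones' complement

# RFC 1071 worked example: these eight bytes sum to 0xDDF2,
# so the checksum is its complement, 0x220D.
print(hex(internet_checksum(bytes([0, 1, 0xF2, 3, 0xF4, 0xF5, 0xF6, 0xF7]))))
```

A receiver verifies by checksumming the data with the checksum field included; a correct packet yields zero.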
External networking I/O to LAN, WAN, and SAN networks in the ideal system would support multiple protocols, perform channel striping, and have sufficient bandwidth to be in balance with the other elements of the system. Depending upon system protocol support, IP version 4 and IP version 6 traffic will be carried on the LAN and WAN. These circuits will support either IP over 1000Base-SW or 10 Gb/s Ethernet.
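Since the LAN and WAN circuits will carry both IP version 4 and IP version 6 traffic, services on the login nodes would typically listen dual-stack. The sketch below is a generic POSIX-sockets illustration under that assumption (it is not drawn from this SOW): a single IPv6 socket with `IPV6_V6ONLY` cleared accepts both IPv4 and IPv6 clients, with IPv4 peers appearing as IPv4-mapped addresses.

```python
import socket

def dual_stack_listener(port: int = 0) -> socket.socket:
    """Listen for both IPv4 and IPv6 TCP clients on one socket.

    Illustrative sketch only. With IPV6_V6ONLY cleared, IPv4 clients
    appear as IPv4-mapped IPv6 addresses of the form ::ffff:a.b.c.d.
    """
    srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    srv.bind(("::", port))   # port 0 lets the kernel pick a free port
    srv.listen(8)
    return srv
```

The alternative design, two sockets bound separately to IPv4 and IPv6, behaves identically to clients but doubles the descriptor and accept-loop bookkeeping.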
The operating environment shall conform to DOE security requirements. Software modifications must be made in a timely manner to meet changes to those security requirements.