

Workload Management System



Date: 08.01.2017

3.12 Workload Management System


The primary goal of the OSG Workload Management System (WMS) effort is to build, integrate, test, and support the operation of a flexible set of software tools and services for efficient and secure distribution of workload among OSG sites. Two software suites are currently used for that purpose within OSG: Panda and glideinWMS, both of which draw heavily on Condor software.

The Panda system continued as a supported WMS service for the Open Science Grid, and served as a crucial infrastructure element of the ATLAS experiment at the LHC. We completed the migration of the Panda software to an Oracle database backend, which enjoys strong support from major OSG stakeholders and allows us to host an instance of the Panda server at CERN, where ATLAS is located, creating efficiencies in support and operations. Work has started on an upgrade of the Panda monitoring system, which will allow for better integration with the ATLAS Production Dashboard, as well as easier implementation of user interfaces suitable for Panda applications outside of ATLAS. The Panda system proved itself in a series of challenging scalability tests conducted in the months preceding the LHC run in the fall of 2009.

To foster wider adoption of Panda in the OSG user community, we created a prototype of a data service that makes Panda easier for individual users to use. It provides a Web-based user interface for uploading and managing input and output data, and a secure backend that allows Panda pilot jobs to both download and transmit data as required by the Panda workflow. No additional software is required on the users' desktop PCs or lab computers, which will be helpful for smaller research groups that may lack the manpower to support larger software stacks. This, in part, helped us to start collaborating with the BNL research group of the Long Baseline Neutrino Experiment at DUSEL, who expressed strong interest in using Panda to meet their science goals. Other potential users include an interdisciplinary group at the University of Wisconsin (Milwaukee), who are starting work on an advanced laser spectrometry system.

We started using Panda to facilitate the ITB (Integration Testbed) activity in OSG. This allows site administrators to monitor test job execution from a single Web location and to have test results automatically documented via the Panda logging mechanism.

The glideinWMS system made progress toward the project goal of pilot-based large-scale workload management. Version 2.0 has been released and is capable of servicing multiple virtual organizations (VOs) with a single deployment. The FermiGrid facility has expressed interest in putting this in service for VOs based at Fermilab. Experiments such as CMS, CDF and MINOS are currently using glideinWMS in their production activities. The IceCube project recently used the glideinWMS facility at UCSD to run their jobs on OSG resources. Discussions are underway with new potential adopters including DZero and CompBioGrid. We also continued the maintenance of gLExec (user ID management software), a collaborative effort with EGEE, as a project responsibility.
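The pilot-based approach mentioned above can be illustrated with a toy sketch: pilots are started at sites first and only then pull real payload jobs from a central queue, which lets a single queue serve several virtual organizations. This is a simplified model under stated assumptions, not the glideinWMS or Panda implementation; the names `Job`, `CentralQueue`, `run_pilot`, and the site labels are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    vo: str  # virtual organization that owns the job


class CentralQueue:
    """Hypothetical central job queue shared by several VOs."""

    def __init__(self):
        self._jobs = deque()

    def submit(self, job):
        self._jobs.append(job)

    def fetch(self, supported_vos):
        """Return the oldest queued job whose VO the pilot may serve, or None."""
        for job in self._jobs:
            if job.vo in supported_vos:
                # Safe: we return immediately after mutating the deque.
                self._jobs.remove(job)
                return job
        return None


def run_pilot(site, supported_vos, queue, max_jobs):
    """Simulate one pilot that pulls up to max_jobs payloads once it runs at a site."""
    executed = []
    for _ in range(max_jobs):
        job = queue.fetch(supported_vos)
        if job is None:
            break  # nothing suitable is queued, so the pilot exits
        executed.append((site, job.vo, job.job_id))
    return executed


# Jobs from three VOs are submitted to the same queue (illustrative data only).
queue = CentralQueue()
for job_id, vo in enumerate(["cms", "minos", "cms", "cdf"]):
    queue.submit(Job(job_id, vo))

at_site_a = run_pilot("site_a", {"cms"}, queue, max_jobs=5)
at_site_b = run_pilot("site_b", {"minos", "cdf"}, queue, max_jobs=1)
```

In this sketch the job-to-site assignment happens only when a pilot asks for work, so a site that never starts a pilot never receives a payload. A production system adds authentication, resource matching, and monitoring on top of this basic pattern.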

In the area of WMS security enhancements, we completed the integration of gLExec into Panda. gLExec is also actively used in glideinWMS. In addition to giving the system more flexibility from a security and authorization standpoint, this allows us to maintain a high level of interoperability between the OSG workload management software and our WLCG collaborators in Europe: by following a common set of policies and using compatible tools, both Panda and glideinWMS can operate transparently in both domains. An important part of this activity was the integration test of a new suite of user authorization and management software (SCAS) developed by WLCG, which involved testing upgrades of gLExec and its interaction with site infrastructure. We made a significant contribution to that activity and continue to actively support further work in that area.

Work continued on the development of the Grid User Management System (GUMS) for OSG. This is an identity mapping service which allows sites to operate on the Grid while relying on traditional methods of user authentication, such as UNIX accounts or Kerberos. Based on our experience with GUMS in production since 2004, a number of new features have been added which enhance its usefulness for OSG.

In this reporting period, we faced and addressed the following challenges:


  • The absence of a lightweight, easy-to-maintain data transfer mechanism for jobs executed in the Panda pilot-based framework. This was resolved by creating such a system, which is now available to users.

  • The legacy nature of certain parts of the Panda information system, notably monitoring, which impeded both integration with the ATLAS experiment-wide dashboard software and the specialization of user interfaces needed in other cases. We addressed this by initiating work to upgrade Panda monitoring, moving it to a modern platform and a modular structure with a network API.

  • The lack of awareness among potential entrants to OSG of the capabilities and advantages of the Workload Management Systems run by our organization. This will need to be rectified by producing a set of comprehensive documentation available from the OSG Web site.

This program of work continues to be important for the science community and OSG for several reasons. First, a reliable WMS is a crucial requirement for any science project involving large-scale distributed computing that processes vast amounts of data. Several key OSG stakeholders, in particular the LHC experiments ATLAS and CMS, fall squarely in that category, and the Workload Management Systems supported in the OSG serve as a key enabling factor for these communities. Second, drawing new entrants to OSG will provide the benefit of access to opportunistic resources to organizations that otherwise would not be able to achieve their research goals. As more improvements are made to the system, Panda will be in a position to serve a wider spectrum of science disciplines.


