Core 1: Biomedical Computation Research



Date: 01.02.2018

The trajectory analysis methods reviewed above are general and can be applied to any of our Driving Biological Problems. For example, one of the major challenges in our simulation of neuroprosthetic dynamics is to relate neural recordings to simulated arm movements. We will explore the use of topological mode analysis to relate these complex trajectories. Bringing together experts in trajectory analysis (Leo Guibas), simulation (e.g., Scott Delp), and neuroscience (Krishna Shenoy) enables us to test the ability of existing methods to predict motor performance and motivates the development of new methods.

1.6 Software Engineering


Simbios is able to create and disseminate high-quality software for biomedical research. Our goal is to distribute industrial-quality tools that researchers may confidently employ. Our software must deliver correct answers with high performance, run on a variety of platforms, download and install smoothly, remain maintainable over a long period of time, and be well documented and actively supported. We employ well-established tools and best-practice methodology from industrial software engineering to achieve these goals.

1.6.1 Professional Software Staff


We employ a small group of experienced software development managers and professional programmers who have developed our software engineering protocols. Our Executive Director and Chief Software Architect each have decades of management and professional software development experience, as does the staff they hired. Establishing a rigorous software development methodology able to capture the best work of many bright and creative individuals is more a sociological exercise than one of command and control. We have developed a self-sustaining software development culture that encourages best practices.

1.6.2 Open development process


Our development methodology is open to all, transparent, and subject to discussion and modification. We have built considerable infrastructure to ensure that source code and unit test results are immediately available to everyone. For example, anyone interested in a particular source module can sign up for notification, after which they receive an automatically generated email every time a code change is made. Automated unit tests are run every night on multiple platforms, and the results are posted on simtk.org (example shown). Notification of failed tests is sent to all interested parties, creating social pressure that encourages developers to test well.

We have instituted a variety of lightweight practices that encourage openness. For example, every staff member and most academic contributors post a weekly “AOI” email (for accomplishments, objectives, and issues) to a list to which all the other contributors, managers, and other interested parties subscribe. Each post begins with an assessment of what happened with last week’s objectives and includes an updated set of objectives for the next week. Weekly Simbios staff meetings include a rotation of staff, postdocs, and other contributors in which code review, methodologies, validation issues, successes and failures, and future plans are discussed with fellow staff, the PIs, and the Executive Director. These sessions are oriented toward problem solving rather than criticism. An important consequence of these seemingly soft practices is that a large amount of high-quality data is available for management decision making. We use these data to identify problems and direct resources effectively. Our tools also provide us with strict but discreet control of what software ends up as part of our supported code base.


1.6.3 Architecture, abstraction, and shared vision


Although we have an open process, our development occurs within a defined object-oriented [30] architecture as developed by a small number of experienced software architects (Sherman & Eastman). The software’s architectural features are discussed and revised as appropriate, but it is ultimately the responsibility of the software architect to choose the set of abstractions (“objects” in OO) around which the overall code is to be built, and to communicate those abstractions and the reasoning behind them to the entire team. These can be controversial decisions, but it is important to choose a set of immutable architectural features to create a coherent vision for the software as a whole so that the pieces work together. As an example, we chose early on to separate the concepts of “system” and “state” in the SimTK architecture; this has many day-to-day consequences and affects much of the work we do and how users work with the code. Other designers might have chosen to encapsulate a system and its state into a single object; that would have led to substantially different code characteristics. Simbios’ wide range of scales (molecules to whole organisms) has mandated flexible and powerful object designs, and our goal for architectures to handle a vast range of scales has helped “future proof” the architecture of our code.
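The separation of “system” and “state” described above can be illustrated with a minimal sketch. This is not SimTK code: the class names `OscillatorSystem` and `State`, the function `step`, and the harmonic-oscillator model are hypothetical and chosen only to show the idea that an immutable model description can be shared by many independent sets of time-varying values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Time-varying values for one trajectory; holds no model parameters."""
    time: float
    position: float
    velocity: float


@dataclass(frozen=True)
class OscillatorSystem:
    """Immutable model description (hypothetical): parameters only."""
    mass: float
    stiffness: float

    def acceleration(self, state: State) -> float:
        # Hooke's law F = -k x combined with F = m a.
        return -self.stiffness * state.position / self.mass


def step(system: OscillatorSystem, state: State, dt: float) -> State:
    """Advance one velocity-Verlet step and return a new State."""
    a0 = system.acceleration(state)
    x1 = state.position + state.velocity * dt + 0.5 * a0 * dt * dt
    a1 = system.acceleration(State(state.time + dt, x1, state.velocity))
    v1 = state.velocity + 0.5 * (a0 + a1) * dt
    return State(state.time + dt, x1, v1)


# One system drives two independent trajectories without being copied or modified.
system = OscillatorSystem(mass=1.0, stiffness=4.0)
a = State(time=0.0, position=1.0, velocity=0.0)
b = State(time=0.0, position=-0.5, velocity=0.0)
for _ in range(100):
    a = step(system, a, 0.01)
    b = step(system, b, 0.01)
```

Had the parameters and the time-varying values been combined in a single object, running the second trajectory would have required copying or resetting the model, which is the kind of day-to-day consequence the design choice is meant to avoid.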

1.6.4 Validation and testing


All software requires constant testing in the form of unit tests to catch bugs. Unit tests are the responsibility of every SimTK programmer, and our nightly build system automatically runs many tests and reports problems. Scientific software development also entails the much more demanding process of validation, that is, showing that the software produces numerical results that are correct with respect to some defined metric, such as the theory on which the software is based. For example, any results from a multibody code must satisfy Newton’s second law, F = ma. A thermostat applied to a molecular distribution should yield velocities that follow a Boltzmann distribution. Hardware-accelerated molecular force fields must produce answers identical to those of their unaccelerated precursors, to within an expected numerical tolerance that is itself a subject for careful analysis. Validation tests generally require domain expertise and often involve collaboration between programmers and scientists. We take validation seriously, and our code reviews always involve questions of validation. Validation is part of our software engineering culture, and we provide support for time-consuming validation work. Whenever possible, the results of validation work are encapsulated in regression tests that are run nightly along with the unit tests to catch problems that may be introduced later.
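Two of the validation checks named above can be sketched as small regression-style tests. The sketch below is illustrative only and makes several assumptions: `lj_energy_fast` is a hypothetical stand-in that imitates an accelerated kernel by rounding to single precision, the velocities are drawn from a random number generator rather than produced by a real thermostat, and the tolerance values are placeholders rather than analyzed bounds. The second check tests equipartition (mean kinetic energy of kT/2 per degree of freedom), which is a necessary condition for a Boltzmann velocity distribution but not a full distribution test.

```python
import math
import random
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lj_energy_reference(r: float) -> float:
    """Lennard-Jones pair energy (epsilon = sigma = 1) in double precision."""
    s6 = (1.0 / r) ** 6
    return 4.0 * (s6 * s6 - s6)


def lj_energy_fast(r: float) -> float:
    """Hypothetical stand-in for an accelerated kernel: single-precision rounding."""
    s2 = to_float32((1.0 / r) ** 2)
    s6 = to_float32(s2 * s2 * s2)
    return to_float32(4.0 * to_float32(s6 * s6 - s6))


# Check 1: the "accelerated" energies agree with the reference within a tolerance.
# REL_TOL and ABS_TOL are illustrative assumptions, not analyzed bounds.
REL_TOL, ABS_TOL = 1e-5, 1e-5
radii = [0.9 + 0.01 * i for i in range(211)]
mismatches = [
    r for r in radii
    if not math.isclose(lj_energy_fast(r), lj_energy_reference(r),
                        rel_tol=REL_TOL, abs_tol=ABS_TOL)
]

# Check 2: equipartition. Each velocity component of a Boltzmann distribution
# is Gaussian with variance kT/m, so the mean of (1/2) m v^2 should be kT/2.
rng = random.Random(0)  # fixed seed keeps the regression test repeatable
kT, mass, n = 2.5, 1.0, 50_000
velocities = [rng.gauss(0.0, math.sqrt(kT / mass)) for _ in range(n)]
ke = 0.5 * mass * sum(v * v for v in velocities) / n
# The standard error of the sample mean is kT / sqrt(2 n); allow five of them.
ke_tolerance = 5.0 * kT / math.sqrt(2.0 * n)

print("tolerance mismatches:", len(mismatches))
print("equipartition check passed:", abs(ke - 0.5 * kT) < ke_tolerance)
```

The absolute tolerance matters here because the Lennard-Jones energy passes through zero at r = 1, where a purely relative comparison would reject results that differ only by rounding.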

SimTK software also carries validation over to the end-user installation process. We include with our distributions a set of test cases that a user can run after installation to confirm that the install was performed successfully. This can be particularly important for more complex installations involving heterogeneous hardware support, as is the case for GPU-accelerated computation.
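A post-installation check of this kind might take the following shape. This is a sketch under stated assumptions, not the test suite shipped with SimTK: the names `InstallCase`, `run_install_checks`, and `missing_backend` are hypothetical, and the two cases are trivial placeholders for real numerical test cases. The point it illustrates is that every case runs even when one code path (for example, an accelerated backend) fails to load, so the user gets a complete report.

```python
import math
from typing import Callable, List, NamedTuple, Tuple


class InstallCase(NamedTuple):
    """One shipped test case: a computation and its expected result."""
    name: str
    compute: Callable[[], float]
    expected: float
    rel_tol: float


def run_install_checks(cases: List[InstallCase]) -> List[Tuple[str, bool, str]]:
    """Run every case and collect (name, passed, detail) without raising."""
    results = []
    for case in cases:
        try:
            value = case.compute()
            ok = math.isclose(value, case.expected, rel_tol=case.rel_tol)
            detail = f"got {value!r}, expected {case.expected!r}"
        except Exception as exc:  # e.g. a missing driver or library
            ok, detail = False, f"{type(exc).__name__}: {exc}"
        results.append((case.name, ok, detail))
    return results


def missing_backend() -> float:
    # Hypothetical stand-in for an accelerated code path that fails to load.
    raise RuntimeError("accelerated backend not available")


cases = [
    InstallCase("reference path", lambda: math.sqrt(2.0) ** 2, 2.0, 1e-12),
    InstallCase("accelerated path", missing_backend, 0.0, 1e-12),
]
results = run_install_checks(cases)
for name, ok, detail in results:
    print(f"{'PASS' if ok else 'FAIL'}  {name}: {detail}")
```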

1.6.5 Specific tools and practices


We have adopted Subversion (http://subversion.tigris.org) for source code control and use it for code, documentation, specifications, publications and anything else of value that we want to preserve, maintain, and share. We use CMake (http://www.cmake.org) as our multi-platform build system, automated regression test facility, and release packaging system. We use Doxygen (http://www.doxygen.org) to generate reference documentation directly from source code (Figure 1.9). We built direct support for these excellent open source tools into simtk.org’s project management facility so that it is easy for anyone to set up a project that uses them, in addition to bug and feature request tracking, user forums and Wiki pages, and granular access control. This is easy enough that many student class projects choose to use simtk.org-hosted projects.

We follow a continuous build cycle, with new builds of our entire software base occurring every night, combined with a rigorous release cycle in which branches are made off the development trunk, then extensively tested, used, and stabilized before becoming a posted release. We feel that this provides the benefits of an agile development process but with a level of conservatism appropriate for the release of scientific software, where numerically correct results are expected at all times. Our developers examine existing open source software, and when they find technically and legally suitable packages we incorporate them. Examples include LAPACK for linear algebra, Ipopt for constrained nonlinear optimization, OpenCL for hardware-independent parallelism, and many more. We use object-oriented programming in languages that support it: primarily C++, but also Java and Python. Although we have formal coding practices, and formal presentations at times, we have found that the most powerful form of review is to have people work together on the same code. We typically assign a staff professional to work with a biocomputational graduate student or postdoc to “harden” research code into something reusable by others; this has been a highly effective approach and we intend to continue the practice.



References


1. Stone, J.E., et al., Accelerating molecular modeling applications with graphics processors. J Comput Chem, 2007. 28(16): p. 2618-40.

2. Friedrichs, M.S., et al., Accelerating molecular dynamic simulation on graphics processing units. Journal of Computational Chemistry, 2009. 30(6): p. 864-872.

3. Elsen, E., et al., N-Body simulation on GPUs, in Proceedings of the 2006 ACM/IEEE conference on Supercomputing. 2006, ACM: Tampa, Florida. p. 188.

4. Harvey, M.J., G. Giupponi, and G. De Fabritiis, ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time Scale. J. Chem. Theory Comput., 2009. 5(6): p. 1632-1639.

5. Munshi, A., ed. The OpenCL Specification. 2008, The Khronos OpenCL Working Group.

6. Bowman, G.R., X. Huang, and V.S. Pande, Using generalized ensemble simulations and Markov state models to identify conformational states. Methods, 2009. 49(2): p. 197-201.

7. Shirts, M. and V.S. Pande, COMPUTING: Screen Savers of the World Unite! Science, 2000. 290(5498): p. 1903-1904.

8. Jayachandran, G., et al., Parallelized-over-parts computation of absolute binding free energy with docking and molecular dynamics. J Chem Phys, 2006. 125(8): p. 084901.

9. Anderson, F.C. and M.G. Pandy, A Dynamic Optimization Solution for Vertical Jumping in Three Dimensions. Computer Methods in Biomechanics and Biomedical Engineering, 1999. 2(3): p. 201-231.

10. Hatze, H., An efficient simulation method for discrete-value controlled large-scale neuromyoskeletal system models. Journal of Biomechanics, 2001. 34(2): p. 267-271.

11. Izaguirre, J.A., S. Reich, and R.D. Skeel, Longer Time Steps for Molecular Dynamics. Journal of Chemical Physics, 1999. 110(20): p. 9853-9864.

12. Sanbonmatsu, K.Y. and C.S. Tung, High performance computing in biology: Multimillion atom simulations of nanoscale systems. Journal of Structural Biology, 2007. 157(3): p. 470-480.

13. Phillips, J.C., et al., Scalable molecular dynamics with NAMD. Journal of Computational Chemistry, 2005. 26(16): p. 1781-1802.

14. Schlick, T., E. Barth, and M. Mandziuk, Biomolecular Dynamics at Long Timesteps: Bridging the Timescale Gap Between Simulation and Experimentation. Annual Review of Biophysics and Biomolecular Structure, 1997. 26(1): p. 181-222.

15. Amdahl, G.M., Validity of the single-processor approach to achieving large scale computing capabilities, in AFIPS Conference Proceedings, vol. 30 (Atlantic City, N.J., Apr. 18-20). 1967, AFIPS Press: Reston, Va. p. 483-485.

16. Boal, D., Mechanics of the Cell. 2002: Cambridge University Press.

17. Phillips, R., J. Kondev, and J. Theriot, Physical Biology of the Cell. 2008: Garland Science.

18. Chazal, F., et al., Proximity of persistence modules and their diagrams, in Proceedings of the 25th annual symposium on Computational geometry. 2009, ACM: Aarhus, Denmark. p. 237-246.

19. Chodera, J.D., et al., Automatic discovery of metastable states for the construction of Markov models of macromolecular conformational dynamics. J Chem Phys, 2007. 126(15): p. 155101.

20. Yao, Y., et al., Topological methods for exploring low-density states in biomolecular folding pathways. J Chem Phys, 2009. 130(14): p. 144115.

21. Nadler, B., et al., Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators, in Neural Information Processing Systems (NIPS), 2005. 2005.

22. Hatcher, A., Algebraic Topology. 2002: Cambridge University Press.

23. Muhammad, A. and A. Jadbabaie. Decentralized Computation of Homology Groups in Networks by Gossip. in American Control Conference, 2007. ACC '07. 2007.

24. Sun, J., M. Ovsjanikov, and L. Guibas, A Concise and Provably Informative Multi-scale Signature Based on Heat Diffusion. Proc. Eurographics Symposium on Geometry Processing (SGP), 2009.

25. Andersen, R., F. Chung, and K. Lang. Local Graph Partitioning using PageRank Vectors. in Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS). 2006. Washington, DC: IEEE Computer Society.

26. Andersen, R., F. Chung, and K. Lang. Local Graph Partitioning using PageRank Vectors. in Foundations of Computer Science, 2006. FOCS '06. 47th Annual IEEE Symposium on. 2006.

27. Bendich, P., et al. Inferring Local Homology from Sampled Stratified Spaces. in Foundations of Computer Science, 2007. FOCS '07. 48th Annual IEEE Symposium on. 2007.

28. Noe, F., et al., Hierarchical analysis of conformational dynamics in biomolecules: transition networks of metastable states. J Chem Phys, 2007. 126(15): p. 155102.

29. Jain, A.K., M.N. Murty, and P.J. Flynn, Data clustering: a review. ACM Comput. Surv., 1999. 31(3): p. 264-323.

30. Koontz, W.L.G., P.M. Narendra, and K. Fukunaga, A Graph-Theoretic Approach to Nonparametric Cluster Analysis. IEEE Trans. Comput., 1976. 25(9): p. 936-944.

31. Edelsbrunner, H., D. Letscher, and A. Zomorodian, Topological persistence and simplification, in Proceedings of the 41st Annual Symposium on Foundations of Computer Science. 2000, IEEE Computer Society. p. 454.

32. Chodera, J.D., et al., Long-time protein folding dynamics from short-time molecular dynamics simulations. Multiscale Modeling & Simulation, 2006. 5(4): p. 1214-1226.




PHS 398/2590 (Rev. 06/09) Page Continuation Format Page



