Protein Explorer: Structural Visualization and Functionalities
Sana Bashir, Graduate Student, Dept. Biotechnology, University of Texas at Dallas, Texas
Protein 3D structures are being solved at an accelerating rate, with expanding impact on the understanding of basic biological mechanisms and on drug design. Researchers who are not specialists in crystallography or protein modeling need to be able to find, view, and understand published 3D protein structures easily. Realistic 3D pictures help in understanding the structure and function of biologically significant molecules. Although architects, engineers, the movie industry and computer-game developers have used visualization techniques for more than two decades, biologists are only beginning to discover three-dimensional (3D) graphics software as a powerful tool to visualize molecular and cellular structures. This paper discusses one such visualization tool, Protein Explorer, which has served as a good tool for novices as well as researchers. The software is still being modified, and I will outline several further improvements that could add to its usefulness.
Introduction
To date, most protein configurations have been determined by experimental measurements, such as x-ray crystallography, electron crystallography, or NMR techniques. The function of a protein is primarily determined by its three-dimensional geometry, which is, in turn, determined by its one-dimensional amino acid sequence. Biological scientists and pharmacologists need good protein visualization software that lets them view and explore known protein structures in 3D, so that they can apply human intuition to the folding problem and develop drugs that interact with proteins without directly solving the folding problem. For this reason, research in computational biology is allowing scientists to create visualization environments intended to facilitate the study of proteins, where the motion and behavior of proteins and other molecules are studied in the computer rather than in the test tube.
Before discussing Protein Explorer's features, I would like to briefly review the software that came before it, to show how the concept of “protein visualization” has progressed over the years. Small-scale renderings were standardized in 1965, when Carroll Johnson created the Oak Ridge Thermal Ellipsoid Program (ORTEP), a computer program that renders atoms as balls or ellipsoids connected by stick bonds. Cyrus Levinthal and Robert Langridge used computer graphics for larger-scale renderings in structural biology.
The first molecular visualization program based on vectors was Mage; it lacked the depth perception needed in 3D modeling. The theoretical foundations for creating realistic 3D images were laid as early as the 1960s, when the first ray-tracing algorithms to render virtual scenery were developed (Appel, 1968; Goldstein & Nagel, 1971). In 1989, some crystallographers replaced the original "vector"-based displays with new Silicon Graphics workstations.
Biochemistry was shifting its emphasis from bonds and connectivity to surfaces and interactions in 3D. The two possible ways of increasing depth perception were to add shadowing from a light source and to allow images to be manipulated in real time. Roger Sayle worked on a new implementation of the ray-tracing algorithm. In his view, spheres were the easiest graphical objects to draw, and he hoped the parallel machines available at the time would achieve some speed-up on the problem; this work led to the birth of RasMol in 1992. Sayle spent time comparing different algorithms for speeding up ray tracing on parallel computers. He won the University of London prize and had the second-fastest program in the world for displaying shadows on space-filling molecules. The fastest at the time was an implementation of ray tracing in Raster3D running on the 800-processor Edinburgh Parallel Supercomputer. He identified the improvements that made Raster3D faster and was able to apply them to RasMol.
Continually improving over the years, ray-tracing software now produces highly realistic images by a process that could be described as 'reverse seeing'. Instead of collecting the light from light sources and objects as an eye or a camera does, the computer shoots rays from a virtual camera representing the viewer's position into a scene to calculate the color of every pixel. If the ray does not hit an object, the pixel adopts the background color. If the ray hits an object, the computer assigns the color of the object to the pixel, and then creates new rays bouncing away from it, or if the object is transparent, passing through it. The process is repeated until the rays hit the background or a light source, or the number of rays exceeds a pre-determined limit. The cumulative color information from all the rays is then used to calculate the final color of each pixel.
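The 'reverse seeing' loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used by RasMol or Raster3D: the scene contains opaque spheres only, each with a hypothetical `reflectivity` value, there are no light sources or transparency, and all function names are assumptions made for this example.

```python
import math

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def normalize(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))

def hit_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest hit in front of the
    origin, or None. Rays starting inside a sphere are treated as misses."""
    oc = sub(origin, center)
    b = 2.0 * dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-6 else None

def trace(origin, direction, spheres, background, depth=0, max_depth=3):
    """Color seen along one ray. Each sphere is a tuple
    (center, radius, color, reflectivity); the tuple layout is hypothetical."""
    nearest = None
    for center, radius, color, reflectivity in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (nearest is None or t < nearest[0]):
            nearest = (t, center, color, reflectivity)
    if nearest is None:
        return background                     # the ray left the scene
    t, center, color, reflectivity = nearest
    if depth >= max_depth or reflectivity == 0.0:
        return color                          # bounce limit reached or matte object
    point = add(origin, scale(direction, t))
    normal = normalize(sub(point, center))
    reflected = sub(direction, scale(normal, 2.0 * dot(direction, normal)))
    bounced = trace(point, normalize(reflected), spheres, background,
                    depth + 1, max_depth)
    # Blend the object's own color with what the bounced ray saw.
    return tuple((1.0 - reflectivity) * c + reflectivity * b
                 for c, b in zip(color, bounced))
```

A full renderer would call `trace` once per pixel, with the ray direction derived from the pixel position and the virtual camera, and would add shadow rays toward each light source.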
Later, Chime, a Netscape plug-in derived from RasMol, brought molecular visualization to the Internet community. RasMol was designed with advanced users in mind and had to be operated through command-line scripting. Protein Explorer, a RasMol derivative, was therefore developed to provide a user-friendly interface and faster rendering options. All features of RasMol and Chime were carried over into this software to make them accessible to non-technical users. The next section discusses Protein Explorer and its progress in detail.
Protein Explorer
The goal of Protein Explorer is to enable one to focus on the science, not the software. Chime adds a great deal of power to RasMol, but using the full power of either program requires advanced technical knowledge and a lot of time, so that power is inaccessible to most of the people who could benefit from it. Protein Explorer was developed to make the power of Chime accessible to novices. Eric Martz coordinated the development of Protein Explorer for non-technical users, motivated by his interest in molecular visualization and his study of RasMol. Protein Explorer helps visualize the three-dimensional structures of protein, DNA, and RNA macromolecules, their interactions, and the binding of ligands, inhibitors, and drugs. It was designed to be suitable for high school and college students, yet it is also widely used by graduate students and researchers. In its current version, Protein Explorer analyzes and visualizes (with either the Chime plug-in or the Java-based Jmol) atomic interactions within a protein or protein complex, including resolved water molecules, attached ligands, and nucleic acids. Different levels of analysis can be chosen: contacts can be grouped and sorted by atom, residue or contact type (hydrogen bond, hydrophobic–hydrophobic and aromatic–aromatic). The output provides characteristics for every atom–atom contact (atom properties and contact area). A typical output is illustrated in Figure 1.
Figure 1. Graphical output of Protein Explorer. Residue interactions from PDB entry 1D66. All water molecules are in red. DNA and two protein strands are shown.
According to the developers, the software's main features are:
-
A variety of one-click renderings and color schemes help to visualize the backbone, secondary structure, distributions of hydrophobic vs. hydrophilic residues, non-covalent bonding interactions, salt bridges, amino acid or nucleotide sequences, sequence-to-structure mappings and locations of residues of interest, and patterns of evolution and conservation. See the gallery of snapshots of Protein Explorer in action.
-
The easy user interface means that you do not need to learn any "RasMol command language", although those who have learned some can use it freely.
-
Explanations, color keys, and help are displayed automatically with each operation you perform. The software includes an extensive glossary and an index of features to help you find what you are looking for.
-
It is available online through the Chime plug-in.
-
Ongoing Work
Even though Protein Explorer serves its function properly, there are still some software and visual accuracy issues. I discussed these limitations in my presentation; below are the main technical limitations that the Protein Explorer developers are working on.
-
Software Issues:
-
It is not possible to load multiple PDB files into Chime
-
Protein Comparator is not functioning in new beta version
-
Saving the state of your PE session is problematic
-
Accuracy Issues:
-
The positions of covalent bonds may be shown incorrectly.
-
Cylinders are not available as a cartoon rendering for alpha helices.
-
There is no mechanism to locate and display all hydrogen bonds.
Developers have also ported Protein Explorer to the Jmol Java applet to reach Java users. Jmol is free, has an open-source license and development team, and operates on a wider range of platforms and browsers than MDL Chime does. Good rendering quality is relatively easy to achieve with Java's built-in graphics API. Having open source code means that bugs can be fixed and enhancements added as needed by the user community, and a greater variety of processing options can be added. In addition, the interface of the program and the presentation of results can be improved: one can extend the existing GUI and optimize the presentation of the structures.
According to the Jmol developers, it uses a completely different means of rendering. Instead of painting a ribbon or other flat surface in 2D, or using a sequence of spheres to generate a trace (as in the Chime version of Protein Explorer), it creates a 3D surface, a mesh of quadrilaterals, that is displayed using the same mechanism as for orbitals. The only drawback is that it uses significantly more memory. To reduce that cost, there is now a preliminary check to see whether the alpha carbon of a group is well outside the window frame; if it is, the whole group is skipped during rendering.
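The preliminary off-screen check can be illustrated with a short Python sketch. It is not Jmol's actual code: the `ca_screen` field, the `margin` parameter, and both function names are hypothetical, and "well outside" is interpreted here as lying beyond a fixed pixel margin around the window.

```python
def is_well_outside(x, y, width, height, margin):
    """True if screen point (x, y) lies more than `margin` pixels
    beyond any edge of a width-by-height window."""
    return (x < -margin or x > width + margin or
            y < -margin or y > height + margin)

def groups_to_render(groups, width, height, margin=50):
    """Keep only the groups whose alpha carbon projects inside the window
    or within `margin` pixels of it. Each group is a dict with a
    hypothetical 'ca_screen' entry holding the projected (x, y) position."""
    return [g for g in groups
            if not is_well_outside(g["ca_screen"][0], g["ca_screen"][1],
                                   width, height, margin)]
```

The margin matters because a residue whose alpha carbon is just off-screen may still have side-chain atoms or mesh quadrilaterals that reach into the window.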
-
Future Implementation in Protein Explorer?
Since the changes mentioned above are already in the works, I decided to survey other visualization tools and graphics techniques used in the computer world and to propose some additional improvements that Protein Explorer could benefit from.
-
Homology Modeling
Many visualization packages now provide homology modeling, also known as comparative modeling, as a method for constructing an atomic-resolution model of a protein from its amino acid sequence, using the experimentally determined structure of a related protein as a template. Comparative modeling cannot be done within Protein Explorer. The workaround is to produce a comparative model outside Protein Explorer, with tools such as SWISS-MODEL and DeepView, and then load the newly created PDB file into Protein Explorer for visualization. Adding the functionality of these secondary tools to Protein Explorer would let users do the work in one place without having to learn additional software.
-
Utilization of Static Data
In Protein Explorer, the static data comes from PDB files. The software extracts the atomic coordinates to render an accurate 3D structure of the molecule, but not all of the information in the PDB file is extracted. It is arguably more informative to browse and display such auxiliary information in close association with the underlying 3D structure. Because of instrument limitations and experimental or recording errors, existing protein data files may contain erroneous or missing information. For example, the experiment may not reveal the locations of all residues, or it may reveal unexpected covalent bonds. Protein Explorer does not display such information. In the PDB file, this kind of supplementary information is recorded in REMARK records, for example when an atom is missing. These remarks need to be highlighted in the user interface so that novices know which data are incomplete or unreliable. Also, when all atoms are displayed, there is no information about angular deviations, and the user is forced to use other applications such as VMD or DeepView to get that information. Visualizing the predefined remarks in PDB data files together with bond angles and lengths would increase the functionality and flexibility of the software.
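As a rough illustration of what such a feature would need to read, the Python sketch below collects REMARK records by number and computes a bond length and bond angle from ATOM coordinates. It is a minimal sketch, not part of Protein Explorer: the function names are hypothetical, only the standard fixed-column PDB layout is assumed, and the warning helper looks only at REMARK 465 (missing residues) and REMARK 470 (missing atoms).

```python
import math
from collections import defaultdict

def collect_remarks(pdb_text):
    """Group the text of REMARK records by their remark number."""
    remarks = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK"):
            continue
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[1].isdigit():
            text = parts[2].rstrip() if len(parts) == 3 else ""
            remarks[int(parts[1])].append(text)
    return dict(remarks)

def missing_data_warnings(remarks):
    """Short user-facing warnings for remark types that flag gaps."""
    labels = {465: "missing residues", 470: "missing atoms"}
    return [f"REMARK {n}: structure has {label}"
            for n, label in labels.items() if n in remarks]

def atom_coords(pdb_text):
    """Map (chain, residue number, atom name) to (x, y, z) using the
    fixed columns of ATOM/HETATM records."""
    coords = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            key = (line[21], int(line[22:26]), line[12:16].strip())
            coords[key] = (float(line[30:38]), float(line[38:46]),
                           float(line[46:54]))
    return coords

def bond_length(a, b):
    """Distance between two atoms, in the units of the file (angstroms)."""
    return math.dist(a, b)

def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / (
        math.dist(a, b) * math.dist(c, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
```

A viewer could show the warnings next to the structure and report, for instance, the N-CA-C angle of a selected residue so that the user can compare it with expected values.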
-
Protein Modification Capabilities
Protein Explorer should not be just visualization software; it should also provide a mechanism for editing structures. Many visualization tools provide this capability by setting up a database utility that tracks the modifications made to the protein, on the local machine or on a server, without changing the original PDB file. Each molecule object, together with its atoms and bonds, has a direct pointer to its respective record in the database.
Figure 2: Database Architecture
DeepView and VMD already implement this capability. These programs are written in the same language as Protein Explorer (C++), so this system could be integrated into Protein Explorer relatively easily.
-
Collaborative Cross-Environment Visualization
The collaborative cross-environment visualization seen in the SnB Visualizer software could be adapted for Protein Explorer to allow real-time interaction among multiple clients. This addition would provide a high-end interactive interface that lets students and teachers work on projects online in real time. It could be useful for researchers worldwide as well.
Figure 3: Collaborative Cross-Environment Visualization
The main components of a visualization system are its database and visualization algorithms. To support collaboration, each user modifies a local copy of the data rather than editing the shared data directly. With this design, whenever the molecule object is mutated by a client user, the local database is notified and updated, and a message containing the database keys of the updated records and their new values is sent to the server. The server then updates its own local database and broadcasts the notice of change to its other clients. Each such client updates its local database, which results in an update of the local molecule objects.
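The update flow described above can be sketched with a small in-memory Python model. It is a sketch under simplifying assumptions, not the SnB Visualizer design: there is no real networking, each database is a plain dictionary, and the class and method names are hypothetical.

```python
class Server:
    """Holds the shared database and relays changes between clients."""

    def __init__(self):
        self.database = {}
        self.clients = []

    def register(self, client):
        self.clients.append(client)
        client.server = self

    def receive(self, sender, changes):
        # Update the server's own copy, then notify every other client.
        self.database.update(changes)
        for client in self.clients:
            if client is not sender:
                client.apply_remote(changes)


class Client:
    """One user's session with its own local database."""

    def __init__(self, name):
        self.name = name
        self.database = {}
        self.server = None

    def mutate(self, key, value):
        # A local edit updates the local database first, then sends the
        # changed key and its new value to the server.
        self.database[key] = value
        if self.server is not None:
            self.server.receive(self, {key: value})

    def apply_remote(self, changes):
        # A change broadcast by the server updates the local database;
        # a real viewer would refresh its molecule objects here.
        self.database.update(changes)
```

For example, after two clients register with one server, a call such as `alice.mutate(("atom", 17, "x"), 12.5)` on one client leaves the same key and value in the server's database and in the other client's database. Conflict handling for simultaneous edits is deliberately left out of this sketch.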
-
Molecular Dynamic Visualization
Molecular visualization has been a focus of the graphics community for many years. In recent work, game developers have collaborated with molecular biologists to create high-end visualization software based on the texture mapping and graphics algorithms used in computer games. The idea is to simplify 3D rendering by using texture maps instead of computing color shades with the ray-tracing algorithm. Another reason is that gaming technology has made advances in rendering graphics. And it does not stop there: new methods and graphics-supported modeling software that make use of the highly developed human visual system could greatly enhance the understanding of any research data.
Each protein viewer available today has its own requirements for appearance and speed; it must strike a balance between rendering quality and interaction performance based on the needs of the application. Another factor that drives the design of a molecular visualization system is the intended viewing platform, which may range from a high-end graphics workstation to a web browser running on a laptop. Protein Explorer is currently used on low-end platforms. High-end platforms can sustain interactive performance with large amounts of geometry, while web browsers must be lightweight and able to create compelling results without too much computation or geometry. Furthermore, each platform may have a variety of application programming interfaces (APIs) for three-dimensional programming, such as OpenGL, Java3D, and Direct3D. Graphics cards are designed for game applications rather than for scientific visualization, but this does not mean that their features cannot be useful for scientific applications. Future work could allow molecular software like Protein Explorer to render 3D images with customized surface textures using the DirectX API. DirectX has the advantage of a common API and works across multiple graphics cards under the Windows operating system. The main caveat is that DirectX is not available on Linux or Unix systems, so a Protein Explorer implemented with DirectX would not run there. The figure below shows how customized texture mapping can help separate the different parts of a protein structure and make the structure easier to understand. Currently, Protein Explorer has only limited color coding to represent protein structure.
Figure 4: Ribbon diagram in which coloring residues individually is implemented using a one-dimensional texture map.
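The idea behind per-residue coloring with a one-dimensional texture can be shown without any graphics API. In the Python sketch below, which is an illustration only and not DirectX code, the texture is a list of RGB texels, each residue is assigned a texture coordinate in [0, 1], and the color is found by nearest-texel lookup. All names and the blue-to-red gradient are assumptions made for this example.

```python
def make_gradient_texture(size, start, end):
    """Build a 1D texture of `size` RGB texels (size >= 2) blending
    linearly from the `start` color to the `end` color."""
    return [tuple(s + (e - s) * i / (size - 1) for s, e in zip(start, end))
            for i in range(size)]

def residue_texcoord(index, count):
    """Texture coordinate in [0, 1] for residue `index` of `count`,
    placed at the center of that residue's slot."""
    return (index + 0.5) / count

def sample_nearest(texture, u):
    """Nearest-texel lookup, as a graphics card does with point sampling."""
    i = min(int(u * len(texture)), len(texture) - 1)
    return texture[i]

def color_residues(count, texture):
    """One color per residue along the chain."""
    return [sample_nearest(texture, residue_texcoord(i, count))
            for i in range(count)]
```

On graphics hardware the same lookup happens per pixel, so changing the color scheme only requires replacing the small texture rather than recoloring the ribbon geometry.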
Conclusion
Protein Explorer has contributed a great deal to helping both non-technical and technical users understand protein structure and function. The improvements proposed in this paper are considerably more advanced, but they are worth mentioning. Many visualization packages compete with Protein Explorer, and these improvements would give it an edge over its competitors. The key for current software is to be more dynamic and interactive as well as accurate in displaying data. Protein Explorer needs to make more resources available to users so that they do not have to look for them in other tools. This can be done by implementing homology modeling algorithms, displaying warnings for incorrect PDB data, and displaying bond lengths and similar measurements within Protein Explorer, thus saving users the time and effort of looking elsewhere for answers. In addition, allowing protein data to be evaluated in real time by multiple users, through the integration of online interactive database systems, would help researchers and educators share ideas across the globe. As far as 3D visualization goes, simple animations of moving ball-and-stick or ribbon representations are limited in the insight and intuition about proteins that they can provide. Protein Explorer's goal should be to integrate modern commodity graphics technology into an environment where multiple, linked views of the protein allow a user to interactively probe and query the protein to see patterns and relationships. That would help users understand protein structure in more detail. Most protein visualization software parses Protein Data Bank (PDB) or similar files in order to form rigid 3D protein models. However, a protein is not a wholly rigid construct. Proteins naturally move and deform, sometimes a lot, and this movement is also related to the protein's function.
The ultimate path for Protein Explorer is to advance toward scalable 3D modeling software that accounts for all degrees of freedom along a protein's carbon backbone.
References
-
E. Martz. Protein Explorer: Easy Yet Powerful Macromolecular Visualization. Trends in Biochemical Sciences, 27 (2002.02). 107-109. (http://proteinexplorer.org)
-
R. A. Sayle and E. J. Milner-White. RASMOL: Biomolecular Graphics For All. Trends in Biochemical Sciences, 20, Sep. 1995, 374–376.
-
The impact of computer games on scientific and information visualization: If you can’t beat them, join them. Panel at IEEE Visualization, 2000. Theresa-Marie Rhyne, Chair.
-
Molecular Visualization: A Microcosm of the E-Revolution http://home.earthlink.net/~shalpine/gallery/CGA/halpine.pdf
-
Nucleic Acids Research, 2005, Vol. 33, Web Server issue http://ligin.weizmann.ac.il/publications/pdf/Sobolev2005(NAR).pdf