Date prepared: 2004-10-21




Construct a grid computing environment for bioinformatics
Yu-Lun Kuo; Chao-Tung Yang; Chuan-Lin Lai; Tsai-Ming Tseng
Dept. of Comput. Sci. & Inf. Eng., Tunghai Univ., Taichung, Taiwan
Conference: Proceedings. 7th International Symposium on Parallel Architectures, Algorithms and Networks. I-SPAN'04 , Page: 339-44
Editor: Hsu, D.F.; Hiraki, K.; Shen, S.; Sudborough, H.
Publisher: IEEE Comput. Soc , Los Alamitos, CA, USA , 2004 , xvi+645 Pages
Conference: Proceedings. 7th International Symposium on Parallel Architectures, Algorithms and Networks. I-SPAN'04 , Sponsor: Univ. of Hong Kong , 10-12 May 2004 , Hong Kong, China
Language: English

Abstract: Internet computing and grid technologies promise to change the way we tackle complex problems. They will enable large-scale aggregation and sharing of computational, data and other resources across institutional boundaries, and harnessing these new technologies effectively will transform scientific disciplines ranging from high-energy physics to the life sciences. The computational analysis of biological sequences is a kind of computation-driven science. Because biological data are growing quickly and the databases are heterogeneous, a grid system can be used to share and integrate heterogeneous biological databases. Bioinformatics tools can speed up the analysis of large-scale sequence data, especially sequence alignment. FASTA is a tool for aligning multiple protein or nucleotide sequences. The version of FASTA we used is distributed and parallel. The software uses a message-passing library called MPI (Message Passing Interface) and runs on distributed workstation clusters as well as on traditional parallel computers. A grid computing environment is proposed and constructed on multiple Linux PC clusters using the Globus Toolkit (GT) and Sun Grid Engine (SGE). Experimental results and the performance of the bioinformatics tool on the grid system are also presented in this paper.

30)


Proceedings. 7th International Symposium on Parallel Architectures, Algorithms and Networks. I-SPAN'04
Editor: Hsu, D.F.; Hiraki, K.; Shen, S.; Sudborough, H.
Publisher: IEEE Comput. Soc , Los Alamitos, CA, USA , 2004 , xvi+645 Pages
Conference: Proceedings. 7th International Symposium on Parallel Architectures, Algorithms and Networks. I-SPAN'04 , Sponsor: Univ. of Hong Kong , 10-12 May 2004 , Hong Kong, China
Language: English

Abstract: The following topics are dealt with: routing; wireless networks; content distribution; parallel algorithms; interconnection networks; fault tolerance; graphs; load balancing; semantic Web; data distribution; communication performance; parallel architecture; Internet technology and applications; quality of service; optical networks; mobile computing; network security and management; and bioinformatics.

31)


Experiences on adaptive grid scheduling of parameter sweep applications
Huedo, E.; Montero, R.S.; Llorente, I.M.
Lab. Computacion Avanzada, CSIC-INTA, Torrejon de Ardoz, Spain
Conference: Proceedings. 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing , Page: 28-33
Publisher: IEEE Comput. Soc , Los Alamitos, CA, USA , 2004 , xiii+442 Pages
Conference: Proceedings. 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing , 11-13 Feb. 2004 , Coruna, Spain
Language: English

Abstract: Grids offer a dramatic increase in the number of available compute and storage resources that can be delivered to applications. This new computational infrastructure provides a promising platform to execute loosely coupled, high-throughput parameter sweep applications. This kind of application arises naturally in many scientific and engineering fields such as bioinformatics, computational fluid dynamics (CFD), and particle physics. The efficient execution and scheduling of parameter sweep applications is challenging because of the dynamic and heterogeneous nature of grids. We present a scheduling algorithm built on top of the GridWay framework that combines: (i) adaptive scheduling to reflect the dynamic grid characteristics; (ii) adaptive execution to migrate running jobs to better resources and provide fault tolerance; (iii) re-use of common files between tasks to reduce the file transfer overhead. The efficiency of the approach is demonstrated in the execution of a CFD application on a highly heterogeneous research testbed.

32)


Asynchronous HMM with applications to speech recognition
Garg, A.; Balakrishnan, S.; Vaithyanathan, S.
Almaden Res. Center, San Jose, CA, USA
Conference: 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
Part: vol.1 , Page: I-1009-12 vol.1
Publisher: IEEE , Piscataway, NJ, USA , 2004 , 5 vol. (cix+1045) Pages
Conference: 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing , 17-21 May 2004 , Montreal, Que., Canada
Language: English

Abstract: We develop a novel formalism for modeling speech signals which are irregularly or incompletely sampled. This situation can arise in real-world applications where the speech signal is transmitted over an error-prone channel in which parts of the signal can be dropped. Typical speech systems based on hidden Markov models cannot handle such data, since HMMs rely on the assumption that observations are complete and made at regular intervals. We introduce the asynchronous HMM, a variant of the inhomogeneous HMM commonly used in bioinformatics, and show how it can be used to model irregularly or incompletely sampled data. A nested EM algorithm is presented in brief which can be used to learn the parameters of this asynchronous HMM. Evaluation on real-world speech data, which has been modified to simulate channel errors, shows that this model and its variants significantly outperform the standard HMM and methods based on data interpolation.

33)


An asynchronous GALS interface with applications
Smith, S.F.
Electr. & Comput. Eng. Dept., Boise State Univ., USA
Conference: 2004 IEEE Workshop on Microelectronics and Electron Devices (IEEE Cat. No.04EX810) , Page: 41-4
Publisher: IEEE , Piscataway, NJ, USA , 2004 , xii+136 Pages
Conference: 2004 IEEE Workshop on Microelectronics and Electron Devices , 16 April 2004 , Boise, ID, USA
Language: English

Abstract: A low-latency asynchronous interface for use in globally-asynchronous locally-synchronous (GALS) integrated circuits is presented. The interface is compact and does not alter the local clocks of the interfaced local clock domains in any way (unlike many existing GALS interfaces). Two applications of the interface to GALS systems are shown. The first is a single-chip shared-memory multiprocessor for generic supercomputing use. The second is an application-specific coprocessor for hardware acceleration of the Smith-Waterman algorithm. This is a bioinformatics algorithm used for sequence alignment (similarity searching) between DNA or amino acid (protein) sequences and sequence databases such as the recently completed human genome database.
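
Note: for readers unfamiliar with the Smith-Waterman algorithm named in the abstract above, the following is a minimal software sketch of its scoring recurrence, not the hardware coprocessor the paper describes. The function name and the scoring values (match, mismatch, and a linear gap penalty) are illustrative assumptions; practical tools typically use substitution matrices and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    # Only the previous row of the dynamic-programming matrix is kept.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Open or extend a gap in either sequence.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Flooring at zero is what makes the alignment local.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its upper, left, and upper-left neighbours, which is why the recurrence lends itself to the kind of parallel hardware acceleration the abstract mentions.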

34)


Computational Methods for SNPs and Haplotype Inference. DIMACS/RECOMB Satellite Workshop. Revised Papers. (Lecture Notes in Bioinformatics Vol.2983)
Editor: Istrail, S.; Waterman, M.; Clark, A.
Publisher: Springer-Verlag , Berlin, Germany , 2004 , ix+152 Pages
Conference: Computational Methods for SNPs and Haplotype Inference. DIMACS/RECOMB Satellite Workshop. Revised Papers , 21-22 Nov. 2002 , Piscataway, NJ, USA
Language: English

Abstract: The conference focused on methods for SNP and haplotype analysis and their applications to disease associations. The ability to score large numbers of DNA variants (SNPs) in large samples of humans is rapidly accelerating, as is the demand to apply these data to tests of association with diseased states. The problem suffers from excessive dimensionality, so any means of reducing the number of dimensions of the space of genotype classes in a biologically meaningful way would likely be of benefit. Linked SNPs are often statistically associated with one another (in "linkage disequilibrium"), and the number of distinct configurations of multiple tightly linked SNPs in a sample is often far lower than one would expect from independent sampling. These joint configurations, or haplotypes, might be a more biologically meaningful unit, since they represent sets of SNPs that co-occur in a population. Recently there has been much excitement over the idea that such haplotypes occur as blocks across the genome, as these blocks suggest that fewer distinct SNPs need to be scored to capture the information about genotype identity. There is need for formal analysis of this dimension reduction problem, for formal treatment of the hierarchical structure of haplotypes, and for consideration of the utility of these approaches toward meeting the end goal of finding genetic variants associated with complex diseases.
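
Note: the statistical association between linked SNPs that the abstract above calls linkage disequilibrium is commonly summarized by the coefficient D and the squared correlation r^2. The sketch below computes both from haplotype counts at two biallelic SNPs. The function name and the example counts are hypothetical and are not taken from the proceedings.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r^2) for two biallelic SNPs from the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total           # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second SNP
    # D is the excess of the observed haplotype frequency over the value
    # expected if the two SNPs were sampled independently.
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Hypothetical counts: only two of the four possible haplotypes are observed,
# so the two SNPs are perfectly correlated.
d, r2 = linkage_disequilibrium(50, 0, 0, 50)
```

When r^2 is close to 1, scoring one of the two SNPs captures nearly all the information carried by the other, which is the dimension reduction the abstract discusses.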

35)


IT service infrastructure for integrative systems biology
Curcin, Vasa; Ghanem, Moustafa; Guo, Yike; Rowe, Anthony; He, Wayne; Pei, Hao; Qiang, Lu; Li, Yuanyuan
Department of Computing Imperial College London, London SW7 2BZ, United Kingdom
Conference: Proceedings - 2004 IEEE International Conference on Services Computing, SCC 2004 , Shanghai, China , 20040915-20040918 , (Sponsor: IEEE Computer Society, TSC-SC; IBM T.J. Watson Research Center; Shanghai Jiao Tong University (SJTU), China; University of Hong Kong, E-Business Technology Institute, China)
Proceedings - 2004 IEEE International Conference on Services Computing, SCC 2004 , 2004
Language: English

Abstract: Despite the large number of software tools and hardware platforms aiming to solve the problems that bioinformatics is facing today, there is no platform solution that can scale up to its demands, in terms of both scope and sheer volume. The DiscoveryNet scientific workflow system is here extended into a service-centric component architecture that brings together cross-domain applications through web and grid services and composes them as novel service offerings. Two case studies implemented on top of the platform, SARS analysis and microarray/metabonomics, are described.

36)


Integrating text mining into distributed bioinformatics workflows: A Web services implementation
Gaizauskas, Rob; Davis, Neil; Demetriou, George; Guo, Yikun; Roberts, Ian
Department of Computer Science University of Sheffield, Sheffield, United Kingdom
Conference: Proceedings - 2004 IEEE International Conference on Services Computing, SCC 2004 , Shanghai, China , 20040915-20040918 , (Sponsor: IEEE Computer Society, TSC-SC; IBM T.J. Watson Research Center; Shanghai Jiao Tong University (SJTU), China; University of Hong Kong, E-Business Technology Institute, China)
Proceedings - 2004 IEEE International Conference on Services Computing, SCC 2004 , 2004
Language: English

Abstract: Workflows are useful ways to support scientific researchers in carrying out repetitive analytical tasks on digital information. Web services can provide a useful implementation mechanism for workflows, particularly when they are distributed, i.e., where some of the data or processing resources are remote from the scientist initiating the workflow. While many scientific workflows primarily involve operations on structured or numerical data, all interpretation of results is done in the context of related work in the field, as reflected in the scientific literature. Text mining technology can assist in automatically building helpful pathways into the relevant literature as part of a workflow in order to support the scientific discovery process. In this paper we demonstrate how these three technologies - workflows, text mining, and web services - can be fruitfully combined in order to support bioinformatics researchers investigating the genetic basis of two physiological disorders - Graves' disease and Williams syndrome.

37)


Bioinformatics and Systems Biology, rapidly evolving tools for interpreting plant response to global change
Blanchard, Jeffrey L.
Conference: Linking Functional Genomics with Physiology for Global Change , Denver, CO, United States , 20031105-20031105
Field Crops Research v 90 n 1 Nov 8 2004. p 117-131 , 2004
Language: English

Abstract: Global change is impacting the evolutionary trajectory of our planet's biota. In spite of the widely appreciated magnitude of this process, we still have a limited ability to estimate biological effects of increased atmospheric CO2 or of climate change. Many new molecular techniques, including microarrays and metabolic profiling, are emerging that allow the direct observation of the vast repertoire of an organism's cellular processes in laboratory and ecological settings. The challenge now is to integrate these large data sets containing spatial and temporal components into models that enable us to explain how organisms respond to increased atmospheric CO2 and eventually to develop models that accurately predict their evolutionary trajectory. In response, the field of bioinformatics is expanding to better facilitate information transfer between laboratory experiments and mathematical modeling in support of the emerging field of Systems Biology. © 2004 Elsevier B.V. All rights reserved.

38)


Integration of genomics approach with traditional breeding towards improving abiotic stress adaptation: Drought and aluminum toxicity as case studies
Ishitani, Manabu; Rao, Idupulapati; Wenzl, Peter; Beebe, Steve; Tohme, Joe
Conference: Linking Functional Genomics with Physiology for Global Change , Denver, CO, United States , 20031105-20031105
Field Crops Research v 90 n 1 Nov 8 2004. p 35-45 , 2004
Language: English

Abstract: Traditional breeding efforts are expected to be greatly enhanced through collaborative approaches incorporating functional, comparative and structural genomics. Potential benefits of combining genomic tools with traditional breeding have been a source of widespread interest and resulted in numerous efforts to achieve the desired synergy among disciplines. The International Center for Tropical Agriculture (CIAT) is applying functional genomics by focusing on characterizing genetic diversity for crop improvement in common bean (Phaseolus vulgaris L.), cassava (Manihot esculenta Crantz), tropical grasses, and upland rice (Oryza sativa L.). This article reviews how CIAT combines genomic approaches, plant breeding, and physiology to understand and exploit underlying genetic mechanisms of abiotic stress adaptation for crop improvement. The overall CIAT strategy combines both bottom-up (gene to phenotype) and top-down (phenotype to gene) approaches by using gene pools as sources for breeding tools. The strategy offers broad benefits by combining not only in-house crop knowledge, but publicly available knowledge from well-studied model plants such as Arabidopsis [Arabidopsis thaliana (L.) Heynh.]. Successfully applying functional genomics in trait gene discovery requires diverse genetic resources, crop phenotyping, genomics tools integrated with bioinformatics and proof of gene function in planta (proof of concept). In applying genomic approaches to crop improvement, two major gaps remain. The first gap lies in understanding the desired phenotypic trait of crops in the field and enhancing that knowledge through genomics. The second gap concerns mechanisms for applying genomic information to obtain improved crop phenotypes. A further challenge is to effectively combine different genomic approaches, integrating information to maximize crop improvement efforts. Research at CIAT on drought tolerance in common bean and aluminum resistance in tropical forage grasses (Brachiaria spp.) is used to illustrate the opportunities and constraints in breeding for adaptation to abiotic stresses.

39)


From sequence to structure using PF2: Improving methods for protein folding prediction
Hussain, Saleem
Conference: Proceedings 17th IEEE Symposium on Computer-Based Medical Systems, CBMS 2004 , Bethesda, MD, United States , 20040624-20040625 , (Sponsor: IEEE Computer Society; Texas Tech University College of Engineering)
Proceedings of the IEEE Symposium on Computer-Based Medical Systems, v 17 , 2004
Language: English

Abstract: Projects dependent on proteomic data are challenged not by the lack of methods to analyze this information, but by the lack of means to capture and manage the data. A few primary players in the bioinformatics realm are promoting the use of selected standardized technologies to access biological data. Many organizations exposing bioinformatics tools, however, do not have the resources required for utilizing these technologies. In order to provide interfaces for non-standardized bioinformatics tools, open-source projects have led to the development of hundreds of software libraries. These tools lack architectural unity, making it difficult to script bioinformatics research projects, such as protein structure prediction algorithms, which involve the use of multiple tools in varying order and number. As a solution, we have focused on building a software model, named the Protein Folding Prediction Framework (PF2), which provides a unifying method for the addition and usage of connection modules to bioinformatics databases exposed via web-based tools, software suites, or e-mail services. The framework provides mechanisms that allow users to create and add new connections without supplementary code as well as to introduce entirely new logical scenarios. In addition, PF2 offers a convenient interface, a multi-threaded execution-engine, and a built-in visualization suite to provide the bioinformatics community with an end-to-end solution for performing complex genomic and proteomic inquiries.

40)


26th international conference on software engineering: ICSE 2004
Anon (Ed.)
Conference: Proceedings - 26th International Conference on Software Engineering, ICSE 2004 , Edinburgh, United Kingdom , 20040523-20040528 , (Sponsor: Institution of Electrical Engineers, IEE; British Computer Society, BCS; Association for Computing Machinery, ACM SIGSOFT; Association for Computing Machinery, ACM SIGPLAN; IEEE Computer Society Technical Council on Software Engineering)
Proceedings - International Conference on Software Engineering, v 26 , 2004
Language: English

Abstract: The proceedings contain 122 papers from the 26th International Conference on Software Engineering: ICSE 2004. The topics discussed include: controlling the complexity of software designs; software engineering challenges in bioinformatics; adding high availability and autonomic behavior to Web services; grid small and large: distributed systems and global communities; a model driven approach for software systems reliability; component-based self-adaptability in peer-to-peer architectures; and one more step in the direction of modularized integration concerns.

41)


Software engineering challenges in bioinformatics
Barker, Jonathan; Thornton, Janet
European Bioinformatics Institute Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom
Conference: Proceedings - 26th International Conference on Software Engineering, ICSE 2004 , Edinburgh, United Kingdom , 20040523-20040528 , (Sponsor: Institution of Electrical Engineers, IEE; British Computer Society, BCS; Association for Computing Machinery, ACM SIGSOFT; Association for Computing Machinery, ACM SIGPLAN; IEEE Computer Society Technical Council on Software Engineering)
Proceedings - International Conference on Software Engineering, v 26 , 2004
Language: English

Abstract: Data from biological research is proliferating rapidly and advanced data storage and analysis methods are required to manage it. We introduce the main sources of biological data available and outline some of the domain-specific problems associated with automated analysis. We discuss two major areas in which we are likely to experience software engineering challenges over the next ten years: data integration and presentation.

42)


hMiDas and hMitChip: New opportunities in mitochondrial bioinformatics and genomic medicine
Alesci, Salvatore; Su, Yan A.; Chrousos, George P.
Conference: Proceedings 17th IEEE Symposium on Computer-Based Medical Systems, CBMS 2004 , Bethesda, MD, United States , 20040624-20040625 , (Sponsor: IEEE Computer Society; Texas Tech University College of Engineering)
Proceedings of the IEEE Symposium on Computer-Based Medical Systems, v 17 , 2004
Language: English

Abstract: We developed a human mitochondria-focused gene database (hMiDas) and customized cDNA microarray chip (hMitChip) to help biomedical research in mitochondrial genomics. The current version of hMiDas contains 1,242 gene entries (including mtDNA genes, nuclear genes related to mitochondria structure and functions, predicted loci and experimental genes), organized in 15 categories and 24 subcategories. The database interface allows keyword-based searches as well as advanced field and/or case-sensitive searches. Each gene record includes 19 fields, mostly hyperlinked to the corresponding source. Moreover, for each gene, the user is given the option to run a literature search using PubMed, and a gene/protein homology search using BLAST and FASTA. The hMitChip was constructed using hMiDas as a reference. Currently, it contains a selection of 501 mitochondria-related nuclear genes and 192 control elements, all spotted in duplicate on glass slides. Slide quality was checked by microarray hybridization with 50 μg of Cy3-labeled sample cDNA and Cy5-labeled comparing cDNA, followed by array scan and image analysis. The hMitChip was tested in vitro using RNA extracted from cancer cell lines. Gene expression changes detected by hMitChip were confirmed by quantitative real-time RT-PCR analysis.

43)


DWDM-RAM: A data intensive grid service architecture enabled by dynamic optical networks
Lavian, T.; Mambretti, J.; Cutrell, D.; Cohen, H.; Merrill, S.; Durairaj, R.; Daspit, P.; Monga, I.; Naiksatam, S.; Figueira, S.; Gutierrez, D.; Hoang, D.; Travostino, F.
Conference: 2004 IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2004 , Chicago, IL, United States , 20040419-20040422 , (Sponsor: Institute of Electrical and Electronics Engineers, IEEE)
2004 IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2004 , 2004
