Abstract: The following topics are dealt with: dynamic multithreaded algorithms; cache-oblivious algorithms and data structures; graphs and trees; optimally competitive list batching; algorithmic complexity; scheduling; and approximation algorithms.
Proceedings of the IEEE 30th Annual Northeast Bioengineering Conference (IEEE Cat. No.04CH37524) Editor: Schreiner, S.; Cezeaux, J.L.; Muratore, D.M. Publisher: IEEE, Piscataway, NJ, USA, 2004, xxiii+262 Pages Conference: Proceedings of the IEEE 30th Annual Northeast Bioengineering Conference, Sponsor: BEACON, Tyco Healthcare, Reebok, BEI, The Whitaker Found, 17-18 April 2004, Springfield, MA, USA Language: English
Abstract: The following topics were dealt with: neural engineering; biomedical instrumentation; medical imaging; physiological monitoring; cardiovascular biomechanics; biosensors; bioMEMS; biomaterials tissue and cellular engineering; rehabilitation engineering; telemedicine and virtual reality in medicine; biomedical education; pharmaceutical engineering; drug delivery; bio-optics; bioinformatics; surgical devices; and the medical applications of nanosystems and nanotechnology.
Applications of Evolutionary Computing. Evo Workshops 2004: EvoBIO, EvoCOMNET, EvoHOT, EvoMUSART, and EvoSTOC. Proceedings (Lecture Notes in Comput. Sci. Vol.3005) Editor: Raidl, G.R. Publisher: Springer-Verlag, Berlin, Germany, 2004, xix+562 Pages Conference: Applications of Evolutionary Computing. Evo Workshops 2004: EvoBIO, EvoCOMNET, EvoHOT, EvoMUSART, and EvoSTOC. Proceedings, Sponsor: EvoNET, Univ. of Coimbra, 5-7 April 2004, Coimbra, Portugal Language: English
Abstract: The following topics are dealt with: EvoBIO; evolutionary bioinformatics; EvoCOMNET; evolutionary computation; communications, networks, and connected systems; EvoHOT; hardware optimization techniques; binary decision diagrams; multilayer floorplan layout problem; EvoIASP; image analysis; signal processing; object recognition systems; EvoMUSART; evolutionary music; evolutionary art; EvoSTOC; evolutionary algorithms; stochastic environment; optimization problems; and dynamic environments.
Bioinformatics: a knowledge engineering approach Kasabov, N.
Sch. of Bus., Auckland Univ. of Technol., New Zealand
Conference: 2004 2nd International IEEE Conference on 'Intelligent Systems'. Proceedings (IEEE Cat. No.04EX791) Part: Vol.1, Page: 19-24 Vol.1 Editor: Yager, R.R.; Sgurev, V.S. Publisher: IEEE, Piscataway, NJ, USA, 2004, 756 Pages Conference: 2004 2nd International IEEE Conference on 'Intelligent Systems'. Proceedings, Sponsor: IEEE Instrumentation and Measurement Soc., IEEE IM/CS/SMC Joint Chapter of Bulgaria, 22-24 June 2004, Varna, Bulgaria Language: English
Abstract: The paper introduces the knowledge engineering (KE) approach for the modeling and the discovery of new knowledge in bioinformatics. This approach extends the machine learning approach with various rule extraction and other knowledge representation procedures. Examples are given of applying the KE approach, and especially one of the recently developed techniques, evolving connectionist systems (ECOS), to challenging problems in bioinformatics, including DNA sequence analysis, microarray gene expression profiling, protein structure prediction, finding gene regulatory networks, medical prognostic systems, and computational neurogenetic modeling.
Unordered tree mining with applications to phylogeny Shasha, D.; Wang, J.T.L.; Sen Zhang
Courant Inst. of Math. Sci., New York Univ., NY, USA
Conference: Proceedings. 20th International Conference on Data Engineering, Page: 708-19 Publisher: IEEE Comput. Soc, Los Alamitos, CA, USA, 2004, xx+880 Pages Conference: Proceedings. 20th International Conference on Data Engineering, Sponsor: Microsoft Res., bea, IBM, MITRE, Sun Microsystems, 30 March-2 April 2004, Boston, MA, USA Language: English
Abstract: Frequent structure mining (FSM) aims to discover and extract patterns frequently occurring in structural data, such as trees and graphs. FSM finds many applications in bioinformatics, XML processing, Web log analysis, and so on. We present a new FSM technique for finding patterns in rooted unordered labeled trees. The patterns of interest are cousin pairs in these trees. A cousin pair is a pair of nodes sharing the same parent, the same grandparent, or the same great-grandparent, etc. Given a tree T, our algorithm finds all interesting cousin pairs of T in O(|T|^2) time where |T| is the number of nodes in T. Experimental results on synthetic data and phylogenies show the scalability and effectiveness of the proposed technique. To demonstrate the usefulness of our approach, we discuss its applications to locating co-occurring patterns in multiple evolutionary trees, evaluating the consensus of equally parsimonious trees, and finding kernel trees of groups of phylogenies. We also describe extensions of our algorithms for undirected acyclic graphs (or free trees).
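The cousin-pair definition in the abstract above can be illustrated with a short Python sketch. This is not the authors' algorithm: it is a naive enumeration that does not attempt to meet the quadratic bound quoted in the abstract, and it ignores node labels and the "interesting" filter. The `parent` mapping, the function name `cousin_pairs`, and the restriction to nodes at equal depth are assumptions made for illustration.

```python
from itertools import combinations


def cousin_pairs(parent):
    """Map each same-depth node pair to the number of steps up to their
    nearest shared ancestor (1 = same parent, 2 = same grandparent, ...).

    `parent` maps every node to its parent; the root maps to None.
    """
    def ancestors(node):
        chain = []
        while parent[node] is not None:
            node = parent[node]
            chain.append(node)
        return chain

    chains = {node: ancestors(node) for node in parent}
    pairs = {}
    for u, v in combinations(sorted(parent), 2):
        cu, cv = chains[u], chains[v]
        if len(cu) != len(cv):
            continue  # assumption: only nodes at equal depth are compared
        for steps, (a, b) in enumerate(zip(cu, cv), start=1):
            if a == b:
                pairs[(u, v)] = steps
                break
    return pairs


# Hypothetical toy tree: r has children a and b; a has c and d; b has e.
TOY_TREE = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a", "e": "b"}
PAIRS = cousin_pairs(TOY_TREE)
```

In the toy tree, c and d share a parent, while c and e first meet at their grandparent r.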
LDC: enabling search by partial distance in a hyper-dimensional space Koudas, N.; Ooi, B.C.; Shen, H.T.; Tung, A.K.H.
Shannon Lab., AT&T Labs Res., Basking Ridge, NJ, USA
Conference: Proceedings. 20th International Conference on Data Engineering, Page: 6-17 Publisher: IEEE Comput. Soc, Los Alamitos, CA, USA, 2004, xx+880 Pages Conference: Proceedings. 20th International Conference on Data Engineering, Sponsor: Microsoft Res., bea, IBM, MITRE, Sun Microsystems, 30 March-2 April 2004, Boston, MA, USA Language: English
Abstract: Recent advances in research fields like multimedia and bioinformatics have brought about a new generation of hyper-dimensional databases which can contain hundreds or even thousands of dimensions. Such hyper-dimensional databases pose significant problems to existing high-dimensional indexing techniques which have been developed for indexing databases with (commonly) fewer than a hundred dimensions. To support efficient querying and retrieval on hyper-dimensional databases, we propose a methodology called local digital coding (LDC) which can support k-nearest neighbors (KNN) queries on hyper-dimensional databases and yet co-exist with ubiquitous indices, such as B+-trees. LDC extracts a simple bitmap representation called digital code (DC) for each point in the database. Pruning during KNN search is performed by dynamically selecting only a subset of the bits from the DC based on which subsequent comparisons are performed. In doing so, expensive operations involved in computing L-norm distance functions between hyper-dimensional data can be avoided. Extensive experiments are conducted to show that our methodology offers significant performance advantages over other existing indexing methods on both real life and synthetic hyper-dimensional datasets.
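The general idea of pruning with a partial distance can be sketched in Python as follows. This is not LDC and does not use digital codes or a B+-tree. It only shows the generic observation that a squared Euclidean distance accumulated over some of the dimensions is a lower bound on the full distance, so a candidate can be abandoned early. The function name and the toy data are hypothetical.

```python
import heapq


def knn_partial_distance(query, points, k):
    """Exact k-NN under squared Euclidean distance with early abandonment.

    Returns a sorted list of (squared_distance, index) for the k nearest points.
    """
    heap = []  # max-heap of the best k so far, stored as (-distance, index)
    for idx, point in enumerate(points):
        # Current k-th best distance; infinite until k candidates are held.
        bound = -heap[0][0] if len(heap) == k else float("inf")
        partial = 0.0
        for a, b in zip(query, point):
            partial += (a - b) ** 2
            if partial > bound:
                break  # partial sum already exceeds the bound: prune
        else:
            if len(heap) == k:
                heapq.heapreplace(heap, (-partial, idx))
            else:
                heapq.heappush(heap, (-partial, idx))
    return sorted((-neg, idx) for neg, idx in heap)


# Hypothetical toy data set with two dimensions.
NEAREST = knn_partial_distance([0, 0], [[0, 0], [1, 1], [5, 5], [0.5, 0]], k=2)
```

The saving grows with dimensionality, because a distant candidate is usually rejected after only a few coordinates have been examined.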
Proceedings. 20th International Conference on Data Engineering Publisher: IEEE Comput. Soc, Los Alamitos, CA, USA, 2004, xx+880 Pages Conference: Proceedings. 20th International Conference on Data Engineering, Sponsor: Microsoft Res., bea, IBM, MITRE, Sun Microsystems, 30 March-2 April 2004, Boston, MA, USA Language: English
Abstract: The following topics are dealt with: XML; query processing; tree data structures; database management systems; Internet; indexing; semi-structured data; data mining; streams; sensors; middleware; workflow; Web data management; security; data warehouses; OLAP; enterprise systems; scientific and biological databases; bioinformatics; and clustering.
Design and implementation of a computational grid for bioinformatics Chao-Tung Yang; Yu-Lun Kuo; Chuan-Lin Lai
Dept. of Comput. Sci. & Inf. Eng., Tunghai Univ., Taichung, Taiwan
Conference: Proceedings. 2004 IEEE International Conference on e-Technology, e-Commerce and e-Service, Page: 448-51 Editor: Yuan, S.-T.; Liu, J. Publisher: IEEE Comput. Soc, Los Alamitos, CA, USA, 2004, xxi+575 Pages Conference: Proceedings. 2004 IEEE International Conference on e-Technology, e-Commerce and e-Service, Sponsor: IEEE Task Committee of e-Commerce, Fu-Jen Univ. of Taiwan, BIKMrdc of Fu-Jen Univ., Academia Sinica, Nat. Sci. Council of Taiwan, Ministry of Educ. of Taiwan, Information Syst. Frontiers, Microsoft, ChungHwa Data Mining Soc, 28-31 March 2004, Taipei, Taiwan Language: English
Abstract: Internet computing and grid technologies promise to change the way we tackle complex problems. They enable large-scale aggregation and sharing of computational, data and other resources across institutional boundaries, and harnessing these new technologies effectively transforms scientific disciplines ranging from high-energy physics to the life sciences. The computational analysis of biological sequences is a computation-driven science. Because biological data are growing quickly and the databases that hold them are heterogeneous, a grid system can be used to share and integrate heterogeneous biological databases. Bioinformatics tools can speed up the analysis of large-scale sequence data, especially sequence alignment and analysis. FASTA is a tool for aligning multiple protein or nucleotide sequences. The two bioinformatics programs we used are distributed and parallel versions. The software uses a message-passing library called MPI (message passing interface) and runs on distributed workstation clusters as well as on traditional parallel computers. A grid computing environment is proposed and constructed on multiple Linux PC clusters using the Globus Toolkit (GT) and Sun Grid Engine (SGE). Experimental results and the performance of the bioinformatics tools on the grid system are also presented.
Aligning multiple sequences by genetic algorithm Li-fang Liu; Hong-wei Huo; Bao-shu Wang
Sch. of Comput. Sci. & Technol., Xidian Univ., Xi'an, China
Conference: 2004 International Conference on Communications, Circuits and Systems (IEEE Cat. No.04EX914) Part: Vol.2, Page: 994-8 Vol.2 Publisher: IEEE, Piscataway, NJ, USA, 2004, 1584 Pages Conference: 2004 International Conference on Communications, Circuits and Systems, Sponsor: Ministry of Educ. (MOE) of PR China, City Univ. of Hong Kong, K.C. Wong Educ. Found, 27-29 June 2004, Chengdu, China Language: English
Abstract: The paper presents a genetic algorithm for solving multiple sequence alignment in bioinformatics. The algorithm involves four different operators, one type of selection operator, two types of crossover operators, and one type of mutation operator; the mutation operator is realized by a dynamic programming method. Experimental results of benchmarks from the BAliBASE show that the proposed algorithm is feasible for aligning equidistant protein sequences, and the quality of alignment is comparable to that obtained with ClustalX.
Algorithms for estimating information distance with application to bioinformatics and linguistics Kaitchenko, A.
Dept. of Phys. & Comput., Wilfrid Laurier Univ., Waterloo, Ont., Canada
Conference: Canadian Conference on Electrical and Computer Engineering 2004 (IEEE Cat. No.04CH37513) Part: Vol.4, Page: 2255-8 Vol.4 Publisher: IEEE, Piscataway, NJ, USA, 2004, 2908 Pages Conference: Canadian Conference on Electrical and Computer Engineering 2004, Sponsor: Cisco Syst., General Elec., Ryerson Univ., AVFX Audio Visual, Bell Canada, Dofasco, Dye & Durham, Gennum Corp., IEEE Canada Found., Univ. of Toronto, Niagara College of Appl. Arts and Technol, 2-5 May 2004, Niagara Falls, Ont., Canada Language: English
Abstract: We review unnormalized and normalized information distances based on incomputable notions of Kolmogorov complexity and discuss how Kolmogorov complexity can be approximated by data compression algorithms. We argue that optimal algorithms for data compression with side information can be successfully used to approximate the normalized distance. Next, we discuss an alternative information distance, which is based on relative entropy rate (also known as Kullback-Leibler divergence), and compression-based algorithms for its estimation. We conjecture that in bioinformatics and computational linguistics this alternative distance is more relevant and important than the ones based on Kolmogorov complexity.
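The idea of approximating Kolmogorov complexity with a compressor can be sketched with the widely used normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The Python sketch below uses `zlib` as a stand-in compressor. It does not implement the compression with side information that the abstract argues for, nor the relative-entropy-based distance, and the sample inputs are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed input, used as a rough proxy for complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical inputs: a repetitive DNA-like string and an English-like string.
DNA = b"ACGT" * 250
TEXT = b"the quick brown fox jumps over the lazy dog " * 25
SAME = ncd(DNA, DNA)
DIFFERENT = ncd(DNA, TEXT)
```

A string compared with itself should score lower than two unrelated strings, although with a real compressor the self-distance is only close to zero rather than exactly zero.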
g the three-dimensional structures of proteins: combined alignment approach Jaehyun Sim; Seung-Yeon Kim; Jooyoung Lee; Ahrim Yoo
Sch. of Comput. Sci., Korea Inst. for Adv. Study, Seoul, South Korea
Journal of the Korean Physical Society Conference: J. Korean Phys. Soc. (South Korea), vol.44, no.3, pt.1, Page: 611-16 Publisher: Korean Phys. Soc, March 2004 Conference: 12th Thermal and Statistical Physics Workshop, 19-21 Aug. 2003, Suanbo, Chungbuk, South Korea Language: English
Abstract: Protein structure prediction is a great challenge in molecular biophysics and bioinformatics. Most approaches to structure prediction use known structure information from the Protein Data Bank (PDB). In these approaches, it is most crucial to find a homologous protein (template) from the PDB to a query sequence and to align the query sequence to the template sequence. We propose a profile-profile alignment method based on the cosine similarity criterion, and combine this with a sequence-profile alignment, the secondary structure prediction of the query protein, and the experimental secondary structure of the template protein. Our method, which we call combined alignment, provides good results for the 1107 query-template pairs of the SCOP database and the CASP5 target proteins. The results show that combined alignment significantly improves the recognition of distant homology.
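The cosine similarity criterion named in the abstract can be sketched in Python as a score between two profile columns, each treated as a vector of per-residue frequencies. This covers only the column-scoring step under that assumption. The alignment procedure, the secondary-structure terms, and the way the authors combine them are not reproduced, and the function names are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length profile columns."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def column_score_matrix(profile_a, profile_b):
    """Score every column of one profile against every column of the other.

    A matrix like this could feed a dynamic-programming alignment.
    """
    return [[cosine_similarity(ca, cb) for cb in profile_b] for ca in profile_a]


# Hypothetical two-letter alphabet to keep the example small.
SCORES = column_score_matrix([[1.0, 0.0], [0.5, 0.5]], [[1.0, 0.0], [0.0, 1.0]])
```

Identical columns score 1 and columns with no residue types in common score 0, regardless of how many sequences contributed to each profile.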
The role of computer science in undergraduate bioinformatics education Burhans, D.T.; Skuse, G.R.
Dept. of Comput. Sci., Canisius Coll., Buffalo, NY, USA
SIGCSE Bulletin Conference: SIGCSE Bull. (USA), vol.36, no.1, Page: 417-21 Publisher: ACM, March 2004 Conference: Thirty-Fifth SIGCSE Technical Symposium on Computer Science Education, Sponsor: ACM Special Interest Group on Comput. Sci. Educ, 3-7 March 2004, Norfolk, VA, USA Language: English
Abstract: The successful implementation of educational programs in bioinformatics presents many challenges. The interdisciplinary nature of bioinformatics requires close cooperation between computer scientists and biologists despite inescapable differences in the ways in which members of these professions think. It is clear that the development of quality curricula for bioinformatics must draw upon the expertise of both disciplines. In addition, biologists and computer scientists can benefit from opportunities to carry out interdisciplinary research with one another. This paper examines the role of computer science in undergraduate bioinformatics education from the perspectives of two bioinformatics program directors. Their respective programs exemplify two substantively different approaches to undergraduate education in bioinformatics due to the fact that they are at markedly different institutions. One institution is a large, technical university, offering both undergraduate and graduate degrees in bioinformatics while the other is a small, Jesuit liberal arts college with an undergraduate program in bioinformatics. Despite these differences there is considerable overlap with respect to the role of computer science. This paper discusses the ways in which computer science has been integrated into these two undergraduate bioinformatics programs, compares alternative approaches, and presents some of the inherent challenges.
Challenges posed by adoption issues from a bioinformatics point of view Moise, D.L.; Wong, K.; Moise, G.
Dept. of Comput. Sci., Alberta Univ., Edmonton, Alta., Canada
Conference: "Fourth International Workshop on Adoption-Centric Software Engineering (ACSE 2004)" W6S Workshop - 26th International Conference on Software Engineering, Page: 75-9 Publisher: IEE, Stevenage, UK, 2004, vi+85 Pages Conference: "Fourth International Workshop on Adoption-Centric Software Engineering (ACSE 2004)" W6S Workshop - 26th International Conference on Software Engineering, Sponsor: IEEE Comput. Soc., SIGSOFT, IEE, 25 May 2004, Edinburgh, Scotland, UK Language: English
Abstract: Developing interoperability models for data is a crucial factor for the adoption of research tools within industry. In this paper, we discuss efficient data interoperability models within a field where they are highly needed: the bioinformatics field. We present the challenges that interoperability models for data must face within this field and we discuss some existing strategies built to address these challenges. The potential of a semi-structured data model based on XML is discussed. Also, a novel approach that enhances the capabilities of the data integration model by automatically identifying XML documents generated based on the same DTD is presented. Practices developed within this application domain can be used for the benefit of similar adoption issues in various other domains.
Software engineering challenges in bioinformatics Barker, J.; Thornton, J.
Eur. Bioinformatics Inst., Cambridge, UK
Conference: Proceedings. 26th International Conference on Software Engineering, Page: 12-15 Publisher: IEEE Comput. Soc, Los Alamitos, CA, USA, 2004, xviii+786 Pages Conference: Proceedings. 26th International Conference on Software Engineering, Sponsor: IEE, Assoc. for Comput. Machinery Special Interest Group on Software Eng., IEEE Comput. Soc, 23-28 May 2004, Edinburgh, UK Language: English
Abstract: Data from biological research is proliferating rapidly and advanced data storage and analysis methods are required to manage it. We introduce the main sources of biological data available and outline some of the domain-specific problems associated with automated analysis. We discuss two major areas in which we are likely to experience software engineering challenges over the next ten years: data integration and presentation.
BLID: an application of logical information systems to bioinformatics Ferre, S.; King, R.D.
Dept. of Comput. Sci., Wales Univ., Aberystwyth, UK
Conference: Concept Lattices. Second International Conference on Formal Concept Analysis, ICFCA 2004. Proceedings (Lecture Notes in Artificial Intelligence Vol.2961), Page: 47-54 Editor: Eklund, P. Publisher: Springer-Verlag, Berlin, Germany, 2004, ix+409 Pages Conference: Concept Lattices. Second International Conference on Formal Concept Analysis, ICFCA 2004. Proceedings, 23-26 Feb. 2004, Sydney, NSW, Australia Language: English
Abstract: BLID (bio-logical intelligent database) is a bioinformatic system designed to help biologists extract new knowledge from raw genome data by providing high-level facilities for both data browsing and analysis. We describe BLID's novel data browsing system which is based on the idea of logical information systems. This enables combined querying and navigation of data in BLID (extracted from public bioinformatic repositories). The browsing language is a logic especially designed for bioinformatics. It currently includes sequence motifs, taxonomies, and macromolecule structures, and it is designed to be easily extensible, as it is composed of reusable components. Navigation is tightly combined with this logic, and assists users in browsing a genome through a form of human-computer dialog.
The automatic generation of programs for classification problems with grammatical swarm O'Neill, M.; Brabazon, A.; Adley, C.
Biocomputing & Dev. Syst. Group, Univ. of Limerick, Ireland
Conference: Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753) Part: Vol.1, Page: 104-10 Vol.1 Publisher: IEEE, Piscataway, NJ, USA, 2004, xxx+2371 Pages Conference: Proceedings of the 2004 Congress on Evolutionary Computation, Sponsor: IEEE Neural Network Soc., Evolutionary Programming Soc., IEE, 19-23 June 2004, Portland, OR, USA Language: English
Abstract: This case study examines the application of grammatical swarm to classification problems, and illustrates the particle swarm algorithms' ability to specify the construction of programs. Each individual particle represents choices of program construction rules, where these rules are specified using a Backus-Naur Form grammar. Two problem instances are tackled, the first a mushroom classification problem, the second a bioinformatics problem that involves the detection of eukaryotic DNA promoter sequences. For the first problem we generate solutions that take the form of conditional statements in a C-like language subset, and for the second problem we generate simple regular expressions. The results demonstrate that it is possible to generate programs using the grammatical swarm technique with a performance similar to the grammatical evolution evolutionary automatic programming approach.
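The mapping from rule choices to a program can be sketched in Python as follows: a list of integers selects productions from a BNF-style grammar, always expanding the leftmost nonterminal and picking the production at index codon mod number-of-choices, as in grammatical evolution. In grammatical swarm the integers would be derived from a particle's position; here they are given directly. The toy grammar, the function name, the wrap limit, and the choice to consume one integer per expansion are simplifying assumptions, not details taken from the paper.

```python
# Hypothetical toy grammar producing simple patterns over a DNA alphabet.
GRAMMAR = {
    "<re>": [["<sym>"], ["<sym>", "<re>"]],
    "<sym>": [["A"], ["C"], ["G"], ["T"], ["."]],
}


def map_genome(codons, grammar, start, max_wraps=2):
    """Expand `start` into a string of terminals, or return None if the
    codons (reused up to `max_wraps` extra times) run out first."""
    symbols = [start]
    output = []
    used = 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in grammar:
            output.append(symbol)  # terminal: emit as-is
            continue
        if used >= limit:
            return None  # mapping failed to terminate within the budget
        choices = grammar[symbol]
        production = choices[codons[used % len(codons)] % len(choices)]
        used += 1
        symbols = list(production) + symbols  # leftmost derivation
    return "".join(output)


PATTERN = map_genome([1, 0, 1, 3, 0, 2], GRAMMAR, "<re>")
FAILED = map_genome([1], GRAMMAR, "<re>")
```

The first call yields a three-symbol pattern, while the second keeps choosing the recursive production and exhausts its budget, which is why the mapping needs a termination guard.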
Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753) Part: Vol.1 Publisher: IEEE, Piscataway, NJ, USA, 2004, xxx+2371 Pages Conference: Proceedings of the 2004 Congress on Evolutionary Computation, Sponsor: IEEE Neural Network Soc., Evolutionary Programming Soc., IEE, 19-23 June 2004, Portland, OR, USA Language: English
Abstract: The following topics are discussed: evolutionary multiobjective optimization; evolutionary algorithms; combinatorial and numerical optimization; swarm intelligence; evolutionary computation and games; evolutionary computation in bioinformatics and computational biology; evolutionary design; evolutionary computing in the process industry; evolutionary computation in finance and economics; evolutionary scheduling; evolutionary design and evolvable hardware; evolutionary design automation; evolutionary computation in cryptology and computer security; learning and approximation in design optimization; and coevolution and collective behavior.