U N A M Universidad Nacional Autónoma de México. Web Services with Applications in Bioinformatics. March 24, 2009.


1 U N A M Universidad Nacional Autónoma de México. Web Services with Applications in Bioinformatics. March 24, 2009

2 Introduction. Navigating through time in genetics. The genomic era: the human genome. Challenges: data explosion; integrated analyses. Bioinformatics: what is it? Consortia and groups. Tools. Web services: web services, workflows.

3 Navigating through time in genetics. 1865: Mendel's Peas. Gregor Mendel describes his experiments with peas, showing that heredity is transmitted in discrete units. 1869: Friedrich Miescher isolates DNA for the first time. Miescher isolated a material rich in phosphorus from the cells and called it nuclein. 1879: Mitosis observed. Walter Flemming described chromosome behavior during animal cell division. Source: http://www.genome.gov/25019887

4 1900s. 1900: Rediscovery of Mendel's Work; 1902: Orderly Inheritance of Disease Observed; 1902: Chromosome Theory of Heredity; 1909: The Word Gene Coined; 1911: Fruit Flies Illuminate the Chromosome Theory. 1940s. 1941: One Gene, One Enzyme; 1943: X-ray Diffraction of DNA; 1944: DNA is "Transforming Principle"; 1944: Jumping Genes. 1950s. 1952: Genes are Made of DNA; 1953: DNA Double Helix; 1955: 46 Human Chromosomes; 1955: DNA Copying Enzyme; 1956: Cause of Disease Traced to Alteration; 1958: Semiconservative Replication of DNA; 1959: Chromosome Abnormalities Identified. Source: http://www.genome.gov/25019887

5 1960s. 1961: mRNA Ferries Information; 1961: First Screen for Metabolic Defect in Newborns; 1966: Genetic Code Cracked; 1968: First Restriction Enzymes Described. 1970s. 1972: First Recombinant DNA; 1973: First Animal Gene Cloned; 1975-77: DNA Sequencing; 1976: First Genetic Engineering Company; 1977: Introns Discovered. 1980s. 1981-82: First Transgenic Mice and Fruit Flies; 1982: GenBank Database Formed; 1983: First Disease Gene Mapped; 1983: PCR Invented; 1986: First Time Gene Positionally Cloned; 1987: First Human Genetic Map; 1987: YACs Developed; 1989: Microsatellites, New Genetic Markers; 1989: Sequence-tagged Sites, Another Marker. Source: http://www.genome.gov/25019887

6 1990s. 1990: Launch of the Human Genome Project; 1990: ELSI Founded; 1990: Research on BACs; 1991: ESTs, Fragments of Genes; 1992: Second-generation Genetic Map of Human Genome; 1992: Data Release Guidelines Established; 1993: New HGP Five-year Plan; 1994: FLAVR SAVR Tomato; 1994: Detailed Human Genetic Map; 1994: Microbial Genome Project; 1995: Ban on Genetic Discrimination in Workplace; 1995: Two Microbial Genomes Sequenced; 1995: Physical Map of Human Genome Completed; 1996: International Strategy Meeting on Human Genome Sequencing; 1996: Mouse Genetic Map Completed; 1996: Yeast Genome Sequenced; 1996: Archaea Genome Sequenced; 1996: Health Insurance Discrimination Banned; 1996: 280,000 Expressed Sequence Tags (ESTs); 1996: Human Gene Map Created; 1996: Human DNA Sequence Begins; 1997: Bermuda Meeting Affirms Principle of Data Release; 1997: E. coli Genome Sequenced; 1997: Recommendations on Genetic Testing; 1998: Private Company Announces Sequencing Plan; 1998: M. Tuberculosis Bacterium Sequenced; 1998: Committee on Genetic Testing; 1998: HGP Map Includes 30,000 Human Genes; 1998: New HGP Goals for 2003; 1998: SNP Initiative Begins; 1998: Genome of Roundworm C. elegans Sequenced; 1999: Full-scale Human Genome Sequencing; 1999: Chromosome 22.

7 2000 - 2001. 2000: Free Access to Genomic Information; 2000: Chromosome 21; 2000: Working Draft; 2000: Drosophila and Arabidopsis Genomes Sequenced; 2000: Executive Order Bans Genetic Discrimination in the Federal Workplace; 2000: Yeast Interactome Published; 2000: Fly Model of Parkinson's Disease Reported; 2001: First Draft of the Human Genome Sequence Released; 2001: RNAi Shuts Off Mammalian Genes; 2001: FDA Approves Genetics-based Drug to Treat Leukemia. The President and Prime Minister Blair issued a Joint Statement in an effort to ensure that the public derives the maximum possible benefit from the sequence of the human genome. Source: http://www.genome.gov/25019887

8 2002 - 2003. 2002: Mouse Genome Sequenced; 2002: Researchers Find Genetic Variation Associated with Prostate Cancer; 2002: Rice Genome Sequenced; 2002: The International HapMap Project is Announced; 2002: The Genomes to Life Program is Launched; 2002: Researchers Identify Gene Linked to Bipolar Disorder; 2003: Human Genome Project Completed; 2003: Fiftieth Anniversary of Watson and Crick's Description of the Double Helix; 2003: The First National DNA Day Celebrated; 2003: ENCODE Program Begins; 2003: Premature Aging Gene Identified. Source: http://www.genome.gov/25019887

9 2004 - The Future. 2004: Rat and Chicken Genomes Sequenced; 2004: FDA Approves First Microarray; 2004: Refined Analysis of Complete Human Genome Sequence; 2004: Surgeon General Stresses Importance of Family History; 2005: Chimpanzee Genomes Sequenced; 2005: HapMap Project Completed; 2005: Trypanosomatid Genomes Sequenced; 2005: Dog Genomes Sequenced; 2006: The Cancer Genome Atlas (TCGA) Project Started; 2006: Second Non-human Primate Genome is Sequenced; 2006: Initiatives to Establish the Genetic and Environmental Causes of Common Diseases Launched; The Future. Source: http://www.genome.gov/25019887

10

11 Challenges of genomics

12 Data explosion. The human genome. "If our strands of DNA were stretched out in a line, the 46 chromosomes making up the human genome would extend more than six feet [close to two metres]. If the... length of the 100 trillion cells could be stretched out, it would be... over 113 billion miles [182 billion kilometres]. That is enough material to reach to the sun and back 610 times." [Source: Centre for Integrated Genomics] The Human Genome Project is involved in determining the exact order of the DNA bases of the entire human genome. The human genome contains more than 3.2 billion base pairs and more than 30 000 genes.

13 How much information is there? NCBI - National Center for Biotechnology Information. Established in 1988 as a national resource for molecular biology information, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information, all for the better understanding of molecular processes affecting human health and disease. http://www.ncbi.nlm.nih.gov/sites/entrez?db=genome&cmd=search&term=

14 Genome: genome size, number of genes. Human genome: 3 billion DNA base pairs, with a data size of approximately 750 megabytes (at 2 bits per base pair).
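The 750-megabyte figure follows from a simple encoding assumption that the slide does not state: four possible bases (A, C, G, T) fit in 2 bits each. A minimal sketch of that arithmetic, where the function name and the choice of decimal megabytes are assumptions made for illustration:

```python
def genome_size_megabytes(base_pairs: int, bits_per_base: int = 2) -> float:
    """Storage for a sequence when every base takes a fixed number of bits."""
    total_bits = base_pairs * bits_per_base
    total_bytes = total_bits / 8
    return total_bytes / 1_000_000  # decimal megabytes (assumption)


if __name__ == "__main__":
    # 3 billion base pairs, as on the slide
    print(genome_size_megabytes(3_000_000_000))
```

Real sequence files are usually larger than this, since plain-text formats such as FASTA spend a full byte per base plus headers and line breaks.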

15 More specialized databases.

16 The future. Integrated and applied analyses. Pillars. Challenges.

17 I. Genomics to biology. Elucidating the structure of genomes and identifying the function of the myriad encoded elements will allow connections to be made between genomics and biology and will, in turn, accelerate the exploration of all realms of the biological sciences. II. Genomics and health. Genomics holds the promise of individualized medicine, with treatment managed according to each genetic profile.

18 Bioinformatics. Recent advances in biological sciences research are producing enormous growth in the volume and complexity of the available biological information. Information and communication technologies are crucial for storing and interpreting these data efficiently and robustly in research centers.

19 But what is bioinformatics?

20 A definition of bioinformatics. The application of information technologies to molecular biology. This includes the compilation, maintenance, distribution, analysis, and use of the immense quantities of biological information available.

21 Main areas of application. Major research areas: sequence analysis; genome annotation; computational evolutionary biology; measuring biodiversity; analysis of gene expression; analysis of regulation; analysis of protein expression; analysis of mutations in cancer; prediction of protein structure; comparative genomics; modeling biological systems; high-throughput image analysis; protein-protein docking.

22 Major organizations: Bioinformatics Organization (Bioinformatics.Org): The Open-Access Institute; EMBnet; European Bioinformatics Institute; European Molecular Biology Laboratory; The International Society for Computational Biology; National Center for Biotechnology Information; National Institutes of Health; Open Bioinformatics Foundation (umbrella non-profit organization supporting certain open-source projects in bioinformatics); Swiss Institute of Bioinformatics; Wellcome Trust Sanger Institute. Major journals: Algorithms in Molecular Biology; Bioinformatics; BMC Bioinformatics; Briefings in Bioinformatics; Evolutionary Bioinformatics; Genome Research; The International Journal of Biostatistics; Journal of Computational Biology; Cancer Informatics; Journal of the Royal Society Interface; Molecular Systems Biology; PLoS Computational Biology; Statistical Applications in Genetic and Molecular Biology; IEEE/ACM Transactions on Computational Biology and Bioinformatics; International Journal of Bioinformatics Research and Applications. See also: List of Bioinformatics journals at Bioinformatics.fr; EMBnet.News at EMBnet.org. EMBnet is the worldwide organisation bringing bioinformatics professionals to work together to serve the expanding fields of genetics and molecular...

23 Bioinformatics software. Software tools for bioinformatics range from simple command-line tools to complex graphical programs and CGI applications. Best-known tools: BLAST, an algorithm for determining the similarity of arbitrary sequences against other sequences, possibly from curated databases of protein or DNA sequences. EMBOSS, a software analysis package. RSAT, Regulatory Sequence Analysis Tools.

24 A bioinformatics « world » for humans http://tux.crystalxp.net/en.id.10838-brunocb-leonard-de-vinci----tux-de-vitruve.html

25 My sweet home-made bioinformatics platform. Complete datasets; Download; Do your analysis: scripts; BLAST, BLAT, RSAT, Clustalw, MEME, …; Download and install; Parsing HTML; Web-page-only resources; Filtered datasets; Download; SQL queries; Perl script.

26 My nightmare (home-made) platform. Complete datasets; Filtered datasets; Download; Perl script; Download; SQL queries; BLAST, BLAT, RSAT, Clustalw, MEME, …; Download and install; Do your analysis: scripts; Parsing HTML; Web-page-only resources. UPDATES; NEW ANNOTATION; DEPENDENCIES; UPDATES; LIBRARIES; NEW DATABASE SCHEMA.

27 Bye bye home-made platform… http://www.genomequest.com/landing-pages/ODI-webinar-web.html

28 Problems: Massive data that must be processed and integrated. The data sit on different servers, in different databases, and in different formats: a data exchange problem. The tools are numerous and sit on different servers, with different forms of access (CGI forms, HTML), different input and output formats, and different languages: an interoperability problem (communication between tools).

29 Solution to the data exchange problem. Exchange data through a format defined in XML. XML structures data and documents as trees of tags with attributes. The XML data model is a tree that does not distinguish between objects and relations and has no notion of class hierarchy. If we want semantics (meaning): languages for defining ontologies and metadata on the Web. RDF Schema and RDF query languages. OWL (Web Ontology Language).
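To make the "tree of tags with attributes" concrete, here is a small sketch that parses an XML record and walks it node by node using Python's standard library. The record itself is hypothetical: the tag names, attribute names, and values are invented for illustration and do not come from any real database schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: every tag, attribute, and value is illustrative only.
GENE_XML = """
<gene id="gene1" organism="Escherichia_coli_K12">
  <name>exampleGene</name>
  <location strand="+" start="190" end="255"/>
</gene>
"""


def describe(element, depth=0):
    """Return one indented line per node: tag, attributes, and text if any."""
    line = "  " * depth + element.tag
    if element.attrib:
        line += " " + str(dict(element.attrib))
    text = (element.text or "").strip()
    if text:
        line += " = " + text
    lines = [line]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(GENE_XML)
tree_lines = describe(root)
print("\n".join(tree_lines))
```

Nothing in the parsed tree says that `location` is a relation or that `gene` belongs to a class; that missing layer is what the ontology languages on the slide are meant to supply.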

30 Solution to the interoperability problem. A Web service is a set of protocols and standards used to exchange data between applications. Software applications written in different programming languages and running on any platform can use Web services to exchange data over computer networks such as the Internet. Interoperability is achieved by adopting open standards. OASIS and the W3C are the bodies responsible for the architecture and standardization of Web services.

31

32 Programs « talking » to programs retrieve-seq -org Saccharomyces_cerevisiae -feattype CDS -type upstream -format fasta … click #!/usr/bin/perl -w RSAT server in Bruxelles login ssh Anonymous access anywhere

33 A future bioinformatics « world » for computers ? I have a dream…

34 A future bioinformatics « world » for computers ? I have a dream… Run analysis remotely Only retrieve necessary data Data always up-to-date No need for local installation A unified way to access data and programs Programs interacting with programs over the internet

35 Web Services to the rescue ? Stein. Creating a bioinformatics nation. Nature (2002) vol. 417 (6885) pp. 119-20 « Although this proposal may seem a far cry from what happens now, the technology exists to make it reality. The World Wide Web consortium, with industry heavy-weights such as IBM and Microsoft, are providing an alphabet soup of standards: SOAP/XML, WSDL, UDDI and XSDL. »

36 What are Web Services (WS)? Definition: A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. Source: W3C: http://www.w3.org/TR/ws-gloss/ Diagram: a client PERL script (#!/usr/bin/perl -w) calls run_BLAST() over the network (internet); the service provider (server) runs run_BLAST(), which executes blastall and sends back the results.

37 Various types of Web services: SOAP. SOAP-based Web Services. SOAP: Simple Object Access Protocol. Standard of the W3C with specifications: messaging with XML, HTTP for transport. Diagram: the client PERL script (#!/usr/bin/perl -w) sends the BLAST parameters ($sequence, $subst_matrix, $threshold) as XML over HTTP to run_BLAST(), which calls blastall; the BLAST result comes back as XML into $result.
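The XML messaging step can be sketched without any network access. The example below builds a SOAP 1.1 request envelope for the three BLAST parameters named on the slide and then reads them back, as a server would. The envelope namespace is the standard SOAP 1.1 one; the service namespace `urn:example-blast`, the element layout, and the threshold value are assumptions for illustration, not the interface of any real BLAST service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example-blast"  # hypothetical service namespace


def build_request(sequence, subst_matrix, threshold):
    """Serialize the BLAST parameters into a SOAP 1.1 request envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}run_BLAST")
    for name, value in (("sequence", sequence),
                        ("subst_matrix", subst_matrix),
                        ("threshold", threshold)):
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_parameters(envelope_xml):
    """Deserialize: recover the parameters on the receiving side."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    call = body.find(f"{{{SERVICE_NS}}}run_BLAST")
    return {child.tag: child.text for child in call}


request_xml = build_request("MHLEGRDGRR", "BLOSUM62", 10.0)
print(request_xml)
print(read_parameters(request_xml))
```

In practice a SOAP library does this serialization and the HTTP POST for you; writing it by hand only shows what travels over the wire.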

38 Various types of Web services: SOAP. PERL script (#!/usr/bin/perl -w), run_BLAST(), blastall.
Request envelope (XML): blastp; SWISS; MHLEGRDGRR YPGAPAVELL QTSVPSGLAE LVAGKRRLPR GAGGADPSHS
Response envelope (XML):
    BLASTP 2.2.18 [Mar-02-2008]
    Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.
    Reference for compositional score matrix adjustment: Altschul, Stephen F., John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis, Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
    Query= query (50 letters)
    Database: SWISS: SWISS sequence taken from the header [Last update Mar/02/2009] 405,506 sequences; 146,168,000 total letters
    Searching..................................................done
    Sequences producing significant alignments: Score (bits), E Value
    sp|Q04671|P_HUMAN RecName: Full=P protein; AltName: Full=Melanoc... 104 1e-22
    >sp|Q04671|P_HUMAN RecName: Full=P protein; AltName: Full=Melanocyte-specific transporter protein; AltName: Full=Pink-eyed dilution protein homolog; Length = 838
    Score = 104 bits (260), Expect = 1e-22, Method: Compositional matrix adjust.
    Identities = 50/50 (100%), Positives = 50/50 (100%)
    Query: 1 MHLEGRDGRRYPGAPAVELLQTSVPSGLAELVAGKRRLPRGAGGADPSHS 50
             MHLEGRDGRRYPGAPAVELLQTSVPSGLAELVAGKRRLPRGAGGADPSHS
    Sbjct: 1 MHLEGRDGRRYPGAPAVELLQTSVPSGLAELVAGKRRLPRGAGGADPSHS 50
    Database: SWISS: SWISS sequence taken from the header [Last update Mar/02/2009]
    Posted date: Mar 2, 2009 5:30 AM
    Number of letters in database: 146,168,000
    Number of sequences in database: 405,506
    Lambda K H: 0.314 0.136 0.403
    Gapped Lambda K H: 0.267 0.0410 0.140
    Matrix: BLOSUM62
    Gap Penalties: Existence: 11, Extension: 1
    Number of Sequences: 405506
    Number of Hits to DB: 17,615,102
    Number of extensions: 565364
    Number of successful extensions: 858
    Number of sequences better than 10.0: 2
    Number of HSP's gapped: 858
    Number of HSP's successfully gapped: 2
    Length of query: 50
    Length of database: 146,168,000
    Length adjustment: 23
    Effective length of query: 27
    Effective length of database: 136,841,362
    Effective search space: 3694716774
    Effective search space used: 3694716774
    Neighboring words threshold: 11
    Window for multiple hits: 40
    X1: 16 ( 7.2 bits)
    X2: 38 (14.6 bits)
    X3: 64 (24.7 bits)
    S1: 42 (21.9 bits)
    S2: 62 (28.5 bits)

39 Various types of Web services: SOAP. Server: PERL run_BLAST(), blastall. Client: serialization of the BLAST parameters to XML, deserialization of the XML result. Client libraries: SOAP::Lite, SOAP::WSDL, XML::Compile::WSDL11 (Perl); ZSI, SOAPpy (Python); AXIS, METRO (Java); PHP-SOAP (PHP).

40 Various types of Web services : SOAP PERL run_BLAST () SOAP::Lite/Apache XML BLAST result Client ? AXIS / Tomcat deserialization serialization PHP-SOAP/ Apache blastall

41 Various types of Web services : SOAP PERL run_BLAST () XML BLAST result Client deserialization serialization blastall PERL BLAST parameters XML Client serialization XML result deserialization XML

42 Various types of Web services: SOAP-WSDL. WSDL: Web Services Description Language, written in XML: « a machine-readable description of the operations offered by the service ». The server « introduces itself » to the clients: names of the available services (= methods); parameters of each service (name + type); result of each service (type). Excerpt from the RSATWS WSDL: <definitions name="RSATWS" targetNamespace="urn:RSATWS" xmlns:tns="urn:RSATWS" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:html="http://www.w3.org/1999/xhtml" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> Documentation strings from the same WSDL: "Parameters for the operation retrieve_seq." "Return type. Accepted values: 'server' (result is stored on a file on the server), 'client' (result is directly transferred to the client), 'both' (result is stored on the server and transferred to the client), and ticket (an identifier, allowing to monitor the job status and retrieve the result when it is done, is returned to the client). Default is 'both'." "Organism. Words need to be underscore separated (example: Escherichia_coli_K12)." "A list of query genes." "Return sequences for all the genes of the organism if value = 1. Incompatible with query."
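Because a WSDL is ordinary XML, a client can discover the available operations by parsing it. The sketch below lists operation names from a hand-written WSDL 1.1 fragment. The fragment is an assumption built for illustration: it borrows the names `RSATWS`, `RSATWSPortType`, `retrieve_seq`, and `supported_organisms` from the slides, but its structure and documentation text are not copied from the real RSATWS.wsdl.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace, as on the slide

# Hand-written fragment for illustration; NOT the real RSATWS.wsdl.
WSDL_FRAGMENT = """
<definitions name="RSATWS" targetNamespace="urn:RSATWS"
             xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="RSATWSPortType">
    <operation name="retrieve_seq">
      <documentation>Illustrative description only.</documentation>
    </operation>
    <operation name="supported_organisms">
      <documentation>Illustrative description only.</documentation>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Map each operation name in the WSDL to its documentation string."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        for op in port_type.findall(f"{{{WSDL_NS}}}operation"):
            doc = op.find(f"{{{WSDL_NS}}}documentation")
            operations[op.get("name")] = doc.text if doc is not None else ""
    return operations


for name, doc in list_operations(WSDL_FRAGMENT).items():
    print(name, "-", doc)
```

Stub generators do essentially this, then go further and emit typed client code for each operation.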

43 Various types of Web services: SOAP-WSDL. WSDL: The URL of the WSDL is necessary to « consume » a SOAP/WSDL Web Service (= write a client). It allows automatic generation of client-side libraries (« client stub »), which reduces the amount of code you have to write: the stub handles serialization of the parameters to XML and deserialization of the XML result. Example: to write a client for RSAT Web Services in PERL you need: SOAP::WSDL installed; the WSDL at http://rsat.ulb.ac.be/rsat/web_services/RSATWS.wsdl; the PERL library « RSATWS », downloadable on the RSAT Website, generated from the WSDL.

44 Various types of Web services: SOAP-WSDL. Example of code for RSAT PERL client:
    #!/usr/bin/perl -w
    use SOAP::WSDL;
    use lib 'RSATWS';   # directory holding the library generated from the WSDL
    use MyInterfaces::RSATWebServices::RSATWSPortType;
    ## New SOAP object
    my $soap = MyInterfaces::RSATWebServices::RSATWSPortType->new();
    ## Parameters (the value must be a quoted string, not a bareword)
    my %args = ('format' => 'text');
    ## Send the request to the server
    my $som = $soap->supported_organisms({'request' => \%args});
    ## Get the result
    unless ($som) {
        printf "A fault (%s) occurred: %s\n", $som->get_faultcode(), $som->get_faultstring();
    } else {
        my $results = $som->get_response();
        my $result  = $results->get_client();
        print "Supported organism(s): \n" . $result;
    }

45 Various types of Web services: REST. RESTful Web services: HTTP transport but no messaging system. Can be seen as a way to retrieve resources via their URLs. Most often used for databases. Often not really considered as « Web Services ». Example: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=U12345&rettype=fasta returns:
    >gi|540023|gb|U12345.1|AMU12345 Aepyceros melampus isolate am5 D-loop, partial sequence; mitochondrial
    ACTACCGCTATCAATATACTCCCACAAATATCAAGAGCCTTCCCAGTATTAAATTTGCTAAAATTTTAAA
    AATTCAATACGAACTTCACACTCCACAGCCTCACGCGAAATTAATAATACGTATTTAAATTCTAGAGTAC
    ATACCATGAACTATCGTTTAGTACATGAATTTACACACGTCAGCCCGATCAAATGTTTATGTACATAACA
    CATTATATATGTACATTTCAGTTTGTGTATATAGACATAACATTAATGTAATAAAGACATAATATGTATA
    TAGTACATTAATTGATTGTCCTCAAGCATATAAGCAAGTACTAGACATTCACTAGCGGTACATAGTACAT
    TTCATTGTTCATCGTACATAGCGCATGTCAGNCAAATCCGTTCTTGTCAACATGCATATCCCGTCCACTA
    GATCAC
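A REST call of this kind needs little more than a URL builder and a parser for the response. The sketch below rebuilds the efetch URL from the slide and parses FASTA text into records, using only Python's standard library. The function names are invented for illustration, and `fetch_fasta` is defined but not called here; whether the 2009 endpoint still answers at this address, or now requires HTTPS or other parameters, is not verified.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL exactly as shown on the slide (2009); it may have changed since.
EFETCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, record_id, rettype="fasta"):
    """Build the resource URL from the database, record id, and return type."""
    return EFETCH + "?" + urlencode({"db": db, "id": record_id, "rettype": rettype})


def parse_fasta(text):
    """Split FASTA text into (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_fasta(db, record_id):
    """Network call, shown for completeness and not executed in this sketch."""
    with urlopen(efetch_url(db, record_id)) as response:
        return parse_fasta(response.read().decode("utf-8"))


print(efetch_url("nucleotide", "U12345"))
```

Compared with the SOAP examples, there is no envelope and no service description: the URL is the whole request, which is why REST is light to use but leaves the response format to documentation.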

46 Web Services: pros and cons. Advantages: language independence, hence interoperability; a standard for accessing and describing the services; improved connectivity between the programs; possibility of constructing workflows. Drawbacks: language independence is not that straightforward, since making a universal server is hard and each language has its own implementation of the standard; heavy system (SOAP/WSDL) that needs maintenance by service providers; efficiency suffers from heavy network traffic plus serializing/deserializing.

47 WS everywhere. Amazon, Google. http://seekda.com/: extensive search engine for Web Services (currently 27 813 services). http://demo.service-finder.eu (alpha version, promising).

48 WS in Bioinformatics http://www.ebi.ac.uk/Tools/webservices/ http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html http://xml.ddbj.nig.ac.jp/index.html http://rsat.bigre.ulb.ac.be/rsat/ http://www.genome.jp/kegg/soap/ http://api.bioinfo.no/wsdl/JasparDB.wsdl

49 Adding meaning… Semantic Web services propose to extend these technologies, which are still being consolidated, with ontologies and semantics that allow services to be selected, integrated, and invoked dynamically, and that let them reconfigure themselves dynamically to adapt to changes (e.g. interruption of services or the appearance of more suitable ones) without human intervention.

50 What are semantic Web services? Semantic Web services are a new technology resulting from the combination of the Semantic Web and Web services. Semantic Web services = Web services + Semantic Web

51 Web services and the Semantic Web. Web services: a set of protocols and standards that allow data exchange independently of platform and programming language. Semantic Web: based on adding semantics to the data published on the Web so that machines can process the information contained in documents in a way similar to how human users do.

52 Why do semantic Web services arise? So many services are available today that it is not feasible, in time or efficiency, for a human user to determine which service or services are needed to satisfy a specific need. Semantic Web services therefore describe Web services with semantic content, so that service discovery, composition, and invocation can be performed automatically by software entities able to process the available semantic information.

53 Ontology. Represents the capabilities of the service and its usage restrictions. Integrates the semantics of the service with its description. It consists of the following elements. Functional information about the service: inputs, outputs, preconditions, postconditions. Non-functional information: category, cost, quality of service.
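The idea of automatic discovery from such descriptions can be sketched with a toy registry. Each service below carries functional information (inputs, outputs) and non-functional information (category, cost), and a matcher picks the services that can run with the data at hand and produce what is wanted. All service names, data-type labels, and costs are hypothetical, and plain string matching stands in for the ontology reasoning a real semantic framework would perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    name: str
    inputs: frozenset     # functional: data types the service needs
    outputs: frozenset    # functional: data types the service produces
    category: str = ""    # non-functional
    cost: float = 0.0     # non-functional


def discover(registry, available, wanted):
    """Services runnable with the available inputs that yield the wanted output, cheapest first."""
    matches = [s for s in registry
               if s.inputs <= available and wanted in s.outputs]
    return sorted(matches, key=lambda s: s.cost)


# Hypothetical registry: names, types, and costs are illustrative only.
REGISTRY = [
    ServiceDescription("similarity_search_a", frozenset({"protein_sequence"}),
                       frozenset({"similar_sequences"}), "sequence analysis", 2.0),
    ServiceDescription("similarity_search_b", frozenset({"protein_sequence"}),
                       frozenset({"similar_sequences"}), "sequence analysis", 1.0),
    ServiceDescription("upstream_retrieval", frozenset({"gene_id", "organism"}),
                       frozenset({"dna_sequence"}), "sequence retrieval", 1.0),
]

found = discover(REGISTRY, frozenset({"protein_sequence"}), "similar_sequences")
print([service.name for service in found])
```

Preconditions, postconditions, and quality of service would extend the same record; they are left out here to keep the matching rule visible.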

54 Workflows. Connecting tools. Find Relevant Genes from Online Databases; Find Associations between Frequent Terms; Gene Expression Analysis.

55

56 Example of workflow. Sand et al. Nature Protocols (2008) vol. 3 (10) pp. 1604-1615

57 Taverna: a workbench to design workflows http://taverna.sourceforge.net/

58 WS in bioinformatics: Utopia? Work is on service providers. Reluctance of service providers to add/switch to WS: it takes time and human resources to set up WS; necessity to find people who are WS experts or willing to learn WS. Lack of advertisement. Lack of a global registry. Various WS: SOAP/REST + BioMOBY + SOAPLAB, all accessed in different ways. Lack of users!!!

59 A future bioinformatics « world » for computers ? I still have a dream…

60 Acknowledgements. Prof. Jacques van Helden; Dr. Morgan Thomas; Group: Luis José Muniz Rascado, Jair, Lilian, Shirley, Ale, Aura; Dr. Julio Collado Vides

