Introduction to Bioinformatics. Roderic Guigó i Serra. Bioinformatics, UPF. Academic year 2012-2013.


Van Leeuwenhoek. In 1676 his credibility was questioned when he sent the Royal Society a copy of his first observations of microscopic single-celled organisms. Heretofore, the existence of single-celled organisms was entirely unknown … The Royal Society arranged to send an English vicar, as well as a team of respected jurists and doctors, to Delft, Holland, to determine whether it was in fact Van Leeuwenhoek's ability to observe and reason clearly (Wikipedia)

ACTCAGCCCCAGCGGAGGTGAAGGACGTCCTTCCCCAGGAGCCGGTGAGAAGCGCAGTCGGGGGCACGGGGATG AGCTCAGGGGCCTCTAGAAAGATGTAGCTGGGACCTCGGGAAGCCCTGGCCTCCAGGTAGTCTCAGGAGAGCTAC TCAGGGTCGGGCTTGGGGAGAGGAGGAGCGGGGGTGAGGCCAGCAGCAGGGGACTGGACCTGGGAAGGGCTGG GCAGCAGAGACGACCCGACCCGCTAGAAGGTGGGGTGGGGAGAGCATGTGGACTAGGAGCTAAGCCACAGCAGG ACCCCCACGAGTTGTCACTGTCATTTATCGAGCACCTACTGGGTGTCCCCAGTGTCCTCAGATCTCCATAACTGGGA AGCCAGGGGCAGCGACACGGTAGCTAGCCGTCGATTGGAGAACTTTAAAATGAGGACTGAATTAGCTCATAAATGG AAAACGGCGCTTAAATGTGAGGTTAGAGCTTAGAATGTGAAGGGAGAATGAGGAATGCGAGACTGGGACTGAGATG GAACCGGCGGTGGGGAGGGGGAGGGGGTGTGGAATTTGAACCCCGGGAGAGAAAGATGGAATTTTGGCTATGGA GGCCGACCTGGGGATGGGGAAATAAGAGAAGACCAGGAGGGAGTTAAATAGGGAATGGGTTGGGGGCGGCTTGGT AACTGTTTGTGCTGGGATTAGGCTGTTGCAGATAATGGAGCAAGGCTTGGAAGGCTAACCTGGGGTGGGGCCGGGT TGGGGTCGGGCTGGGGGCGGGAGGAGTCCTCACTGGCGGTTGATTGACAGTTTCTCCTTCCCCAGACTGGCCAATC ACAGGCAGGAAGATGAAGGTTCTGTGGGCTGCGTTGCTGGTCACATTCCTGGCAGGTATGGGGCGGGGCTTGCTCG GTTTTCCCCGCTTCTCCCCCTCTCATCCTCACCTCAACCTCCTGGCCCCATTCAAGCACACCCTGGGCCCCCTCTTC TTCTGCTGGTCTGTCCCCTGAGGGGAAAGCCCAGGTCTGAGGCTTCTATGCTGCTTTCTGGCTCAGAACAGCGATTT GACGCTCTGTGAGCCTCGGTTCCTCCCCCGCTTTTTTTTTTTCAGCCAGAGTCTCACTCTGTCGCCCAGGCTGGAGT GCAGTGGCGCAATCTCAGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCTATTCTCCCGCCTCAGCCTCCCGAGT AGCTGGGACTACAGGCGCCCGCCACCATGCCCGGCTAATTTTTTGTACTTTGAGTAGGGAAGGGGTTTCACTGTATT ATCCAGGATGGTCTCTATCTCCTGACCTCGTGATCTGCCCGCCTGGCCTCCCAAAGTGCTGGAATTACAGGCGTGAG CCTCCGCGCCCGGCCTCCCCATCCTTAATATAGGAGTTAGAAGTTTTTGTTTGTTTGTTTTGTTTTGTTTTTGTTTTGTT TTGAGATGAAGTCCCTCTGTCGCCCAGGCTGGAGTGCAGTGGCTCCCAGGCTGGAGTTCAGTGGCTGGATCTCGGC TCACTGCAAGCTCCGCCTCCCAGGTTCACGCCATTCTCCTGCCTCAGCCTCCGGAGTAGCTGGGACTACAGGAACA TGCCACCACACCCGACTAACTTTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTGGAA CTCCTGACCTCAGGTGATCTGCCTGCTTCAACCTCCCAAAGTGCTGGGATTACAGACGTGGGCCACCGCGCCCGGC TGGGAGTTAAGAGGTTTCTAATGCATTGCATTAGAATACCAGACACGGGACAGCTGTGATCTTTATTCTCCATCACCC CACACAGCCCTGCCTGGGGCACACAAGGACACTCAATACACGCTTTTCGGGCGCGGTGGCTCAAGCTGTAATCCCA 
GCACTTTGGGAGGCTGAGGCGGGTGGTACATGAGGTCAGGAGATCGAGACCATCCTGGCTAACATGGTGAAACCC CGTCTCTACTAAAAATACAAAAAACTAGCCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGAGGCTGA GGCAGGAGAATGGCGTGAACCTGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGCGCCACTGCACTCCAGCCTGG GTGACACAGCGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATACACGCTTTTCCGCTAGGCA CGGTGGCTCACCCCTGTAATCCCAGCATTTTGGGAGGCCAAGGTGGGAGGATCACTTGAGCCCAGGAGTTCAACAC CAGACTCAGCAACATAGTGAGACTCTCTCTACTAAAAATACAAAAATTAGCCAGGCCTGGTGCCACACACCTGTGGT CCCAGCTACTCAGAAGGCTAAGGCAGGAGGATCGCTTAAGCCCAGAAGGTCAAGGTTGCAGTGAACCACGTTCAG GCCACTGCAGTCCAGCCTGGGTGACAGAGCAAGACCCTGTCTGTAAATAAATAACGCTTTTCAAGTGATTAAACAGA CTCCCCCCTCACCCTGCCCACCATGGCTCCAAAGCAGCATTTGTGGAGCACCTTCTGTGTGCCCCTAGGTACTAGCT GCCTGGACGGGGTCAGAAGGAACCTGAACCACCTTCAACTTGTTCCACACAGGATGCCAGGCCAAGGTGGAGCAA CCGGTGGAGCCAGAGACAGAACCCGACGTTCGCCAGCAGGCTGAGTGGCAGAGCGGCCAGCCCTGGGAGCTGG CACTGGGTCGCTTTTGGGATTACCTGCGCTGGGTGCAGACACTGTCTGAGCAGGTGCAGGAGGAGCTGCTCAGCC CCCAGGTCACCCAGGAACTGACGTGAGTGTCCCCATCCCGGCCCTTGACCCTCCTGGTGGGCGGCTATACCTCCCC AGGTCCAGGTTTCATTCTGCCCCTGCCACTAAGTCTTGGGGGCCTGGGTCTCTGCTGGTTCTAGCTTCCTCTTCCCAT TTCTGACTCCTGGCTTTAGCTCTCTGGAATTCTCTCTCTCAGTTCTGTTTCTCCCTCTTCCCTTCTGACTCAGCCTGTC ACACTCGTCCTGGCGCTGTCTCTGTCCTTCACTAGCTCTTTTATATAGAGACAGAGAGATGGGGTCTCACTGTGTTGC CCAGGCTGGTCTTGAACTTCTGGGCTCAAGCGATCCTCCCACCTCGCCTCCCAAAGTGCTGGGAATAGAGACATGA GCCACCTTGCTCGGCCTCCTAGCTCTTTCTTCGTCTCTGCCTCTGCTCTCTGCGTCTGTCTTTGTCTCCTCTCTGCCTC TGTCCCGTTCCTTCTCTCTTGGTTCACTGCCCTTCTGTCTCTCCCTGTTCTCCTTAGGAGACTCTCCTCTCTTCCTTCT CGAGTCTCTCTGGCTGATCCCCATCTCACCCACACCTATCC

1943: Schrödinger, “What is Life?”
Chromosomal matter is “an aperiodic crystal”, made up of a succession of a small number of isomeric elements*, whose specific sequence is responsible for its function.
(*) “the number of atoms in such a structure need not be very large to produce an almost unlimited number of possible arrangements. For illustration, think of the Morse code…”
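The counting argument behind the footnote can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up here, and the four-letter alphabet of length 100 is an assumed example of "a small number of elements", not a figure from the slide.

```python
def arrangements(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of exactly `length` symbols."""
    return alphabet_size ** length


def arrangements_up_to(alphabet_size: int, max_length: int) -> int:
    """Number of distinct non-empty sequences of at most `max_length` symbols."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


# Morse-like code: 2 signs (dot, dash) in groups of 1 to 4 signs.
print(arrangements_up_to(2, 4))        # 2 + 4 + 8 + 16 = 30

# Assumed example: a 4-letter alphabet in a sequence of 100 positions.
print(len(str(arrangements(4, 100))))  # number of decimal digits of 4**100: 61
```

The number of possible sequences grows exponentially with length, which is why a short alphabet is enough to encode a very large number of distinct messages.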

Late 1940s: first digital computers (ENIAC)

Amino acid sequence of bovine insulin:
MALWTRLRPLLALLALWPPPPARAFVNQHLCGS
HLVEALYLVCGERGFFYTPKARREVEGPQVGAL
ELAGGPGAGGLEGPPQKRGIVEQCCASVCSLYQ
LENYCN

Early 1960s: the genetic code

DNA sequence:
GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGATATTTCTGATGAGT
CGAAAAATTATCTTGATAAAGCAGGAATTACTACTGCTTGTTTACGAATTAAATCG
AAGTGGACTGCTGGCGGAAAATGAGAAAATTCGACCTATCCTTGCGCAGCTCGA
GAAGCTCTTACTTTGCGACCTTTCGCCATCAACTAACGATTCTGTCAAAAACTGA
CGCGTTGGATGAGGAGAAGTGGCTTAATATGCTTGGCACGTTCGTCAAGGACTG
GTTTAGATATGAGTCACATTTTGTTCATGGTAGAGATTCTCTTGT
Amino acid sequence:
MALWTRLRPLLALLALWPPPPARAFVNQHLCGSHLVEALYLVCGERGFFY
TPKARREVEGPQVGALELAGGPGAGGLEGPPQKRGIVEQCCASVCSLYQ
LENYCN
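The genetic code is what connects the two kinds of sequence: each triplet of bases (codon) maps to one amino acid. A minimal sketch of that mapping follows, using the standard genetic code. The names `CODON_TABLE` and `translate` are illustrative, and the example input is a made-up codon string; nothing here claims that the DNA and protein sequences shown on the slide correspond to each other.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids listed in TCAG codon order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, reading from the first base.

    Spaces are ignored and trailing bases that do not fill a codon are dropped.
    """
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(translate("ATGGCCTAA"))  # MA*
```

A real tool would also handle the other reading frames, the reverse strand and ambiguous bases; this sketch reads a single frame only.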

1957: invention of the programming language FORTRAN

1960s: transistors and integrated circuits. Computers become smaller and therefore faster and cheaper. During the 1960s computers are introduced into banks, financial institutions, universities and research centers.

Sequence alignment and comparison

substitution matrices

Sequence alignment. The substitution matrices provided a model under which the concept of optimal alignment could be formalized and computed. The optimal alignment between two sequences is the alignment that maximizes the sum of the amino acid substitution values at each aligned position.
[Figure: substitution matrix over the amino acids A, R, N, D, C, Q, and an example alignment with rows S K - E A E and - S K E A E and its summed score]
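The definition above (score = sum of substitution values over aligned positions) can be sketched as follows. The substitution values and the gap penalty are hypothetical numbers chosen for illustration, not a published matrix and not the values used on the slide, so the resulting score is not expected to match the slide's.

```python
# Hypothetical substitution values, for illustration only.
TOY_SCORES = {
    ("S", "S"): 4, ("K", "K"): 5, ("E", "E"): 5, ("A", "A"): 4,
    ("K", "S"): 0, ("E", "K"): 1, ("A", "E"): -1,
}


def pair_score(a: str, b: str, scores: dict) -> int:
    """Look up a substitution value, treating the matrix as symmetric."""
    if (a, b) in scores:
        return scores[(a, b)]
    return scores[(b, a)]  # raises KeyError if the pair is not defined


def alignment_score(row_a: str, row_b: str, scores: dict, gap: int = -2) -> int:
    """Sum substitution values over aligned columns; a gap ('-') costs `gap`."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            total += gap  # assumed linear gap penalty
        else:
            total += pair_score(a, b, scores)
    return total


print(alignment_score("SK-EAE", "-SKEAE", TOY_SCORES))  # -2 + 0 - 2 + 5 + 4 + 5 = 10
```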

Optimal sequence alignment
Dynamic programming: Needleman and Wunsch, 1970; Smith and Waterman
The total number of possible alignments between two sequences of length 100 is approximately […]. With dynamic programming (DP), the number of operations required to obtain the optimal alignment is approximately 3 x 100^2.
Query: 25  IPREVIERLARSQIHSIRDLQRLLEIDSVGSEDSLDTSLRAHGVHATKHVPEKRPLPIRR 84
           IP E+ + L+   I S  DLQRLL+ DS G ED  +  L     H+   +        R
Sbjct: 10  IPEELYKMLSGHSIRSFDDLQRLLQGDS-GKEDGAELDLNMTRSHSGGELESLA----RG 64
Query: 85  KRSI------EEAVPAVCKTRTVIYEIPRSQVDPTSANFLIWPPCVEVKRCTGCCNTSSV 138
           KRS+      E A+ A CKTRT ++EI R  +D T+ANFL+WPPCVEV+RC+GCCN  +V
Sbjct: 65  KRSLGSLSVAEPAMIAECKTRTEVFEISRRLIDRTNANFLVWPPCVEVQRCSGCCNNRNV 124
Query: 139 KCQPSRVHHRSVKVAKVEYVRKKPKLKEVQVRLEEHLECAC 179
           +C+P++V  R V+V K+E VRKKP  K+  V LE+HL C C
Sbjct: 125 QCRPTQVQLRPVQVRKIEIVRKKPIFKKATVTLEDHLACKC 165
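A minimal sketch of global alignment by dynamic programming, in the spirit of Needleman and Wunsch, is given below. It makes simplifying assumptions that are not from the slide: a plain match/mismatch score stands in for a substitution matrix, gaps carry a linear penalty, and the parameter values are arbitrary. Each cell of the table is the best of three candidates, which is where the "about 3 x n^2 operations" estimate comes from.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # a[i-1] against a gap
                score[i][j - 1] + gap,            # b[j-1] against a gap
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


best, top, bottom = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(top)
print(bottom)
```

The table has (n+1) x (m+1) cells, so time and memory grow with the product of the two lengths rather than with the number of possible alignments.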

Mid-1970s: DNA sequencing (Sanger; Maxam and Gilbert). By the end of the sixties, hundreds of proteins had been sequenced, but the sequencing of nucleic acids remained elusive. Sanger (Cambridge); Maxam and Gilbert (Harvard).

1970s: Internet. Advanced Research Projects Agency

gagttttatcgcttccatgacgcagaagttaacactttcggatatttctgatgagtcgaaaaattatcttgataaagcaggaattactactgcttgtttacgaattaaat cgaagtggactgctggcggaaaatgagaaaattcgacctatccttgcgcagctcgagaagctcttactttgcgacctttcgccatcaactaacgattctgtcaaaaactg acgcgttggatgaggagaagtggcttaatatgcttggcacgttcgtcaaggactggtttagatatgagtcacattttgttcatggtagagattctcttgttgacatttta aaagagcgtggattactatctgagtccgatgctgttcaaccactaataggtaagaaatcatgagtcaagttactgaacaatccgtacgtttccagaccgctttggcctct attaagctcattcaggcttctgccgttttggatttaaccgaagatgatttcgattttctgacgagtaacaaagtttggattgctactgaccgctctcgtgctcgtcgctg cgttgaggcttgcgtttatggtacgctggactttgtgggataccctcgctttcctgctcctgttgagtttattgctgccgtcattgcttattatgttcatcccgtcaaca ttcaaacggcctgtctcatcatggaaggcgctgaatttacggaaaacattattaatggcgtcgagcgtccggttaaagccgctgaattgttcgcgtttaccttgcgtgta cgcgcaggaaacactgacgttcttactgacgcagaagaaaacgtgcgtcaaaaattacgtgcggaaggagtgatgtaatgtctaaaggtaaaaaacgttctggcgctcgc cctggtcgtccgcagccgttgcgaggtactaaaggcaagcgtaaaggcgctcgtctttggtatgtaggtggtcaacaattttaattgcaggggcttcggccccttacttg aggataaattatgtctaatattcaaactggcgccgagcgtatgccgcatgacctttcccatcttggcttccttgctggtcagattggtcgtcttattaccatttcaacta ctccggttatcgctggcgactccttcgagatggacgccgttggcgctctccgtctttctccattgcgtcgtggccttgctattgactctactgtagacatttttactttt tatgtccctcatcgtcacgtttatggtgaacagtggattaagttcatgaaggatggtgttaatgccactcctctcccgactgttaacactactggttatattgaccatgc cgcttttcttggcacgattaaccctgataccaataaaatccctaagcatttgtttcagggttatttgaatatctataacaactattttaaagcgccgtggatgcctgacc gtaccgaggctaaccctaatgagcttaatcaagatgatgctcgttatggtttccgttgctgccatctcaaaaacatttggactgctccgcttcctcctgagactgagctt tctcgccaaatgacgacttctaccacatctattgacattatgggtctgcaagctgcttatgctaatttgcatactgaccaagaacgtgattacttcatgcagcgttacca tgatgttatttcttcatttggaggtaaaacctcttatgacgctgacaaccgtcctttacttgtcatgcgctctaatctctgggcatctggctatgatgttgatggaactg accaaacgtcgttaggccagttttctggtcgtgttcaacagacctataaacattctgtgccgcgtttctttgttcctgagcatggcactatgtttactcttgcgcttgtt cgttttccgcctactgcgactaaagagattcagtaccttaacgctaaaggtgctttgacttataccgatattgctggcgaccctgttttgtatggcaacttgccgccgcg 
tgaaatttctatgaaggatgttttccgttctggtgattcgtctaagaagtttaagattgctgagggtcagtggtatcgttatgcgccttcgtatgtttctcctgcttatc accttcttgaaggcttcccattcattcaggaaccgccttctggtgatttgcaagaacgcgtacttattcgccaccatgattatgaccagtgtttccagtccgttcagttg ttgcagtggaatagtcaggttaaatttaatgtgaccgtttatcgcaatctgccgaccactcgcgattcaatcatgacttcgtgataaaagattgagtgtgaggttataac gccgaagcggtaaaaattttaatttttgccgctgaggggttgaccaagcgaagcgcggtaggttttctgcttaggagtttaatcatgtttcagacttttatttctcgcca taattcaaactttttttctgataagctggttctcacttctgttactccagcttcttcggcacctgttttacagacacctaaagctacatcgtcaacgttatattttgata gtttgacggttaatgctggtaatggtggttttcttcattgcattcagatggatacatctgtcaacgccgctaatcaggttgtttctgttggtgctgatattgcttttgat gccgaccctaaattttttgcctgtttggttcgctttgagtcttcttcggttccgactaccctcccgactgcctatgatgtttatcctttgaatggtcgccatgatggtgg ttattataccgtcaaggactgtgtgactattgacgtccttccccgtacgccgggcaataacgtttatgttggtttcatggtttggtctaactttaccgctactaaatgcc gcggattggtttcgctgaatcaggttattaaagagattatttgtctccagccacttaagtgaggtgatttatgtttggtgctattgctggcggtattgcttctgctcttg ctggtggcgccatgtctaaattgtttggaggcggtcaaaaagccgcctccggtggcattcaaggtgatgtgcttgctaccgataacaatactgtaggcatgggtgatgct ggtattaaatctgccattcaaggctctaatgttcctaaccctgatgaggccgcccctagttttgtttctggtgctatggctaaagctggtaaaggacttcttgaaggtac gttgcaggctggcacttctgccgtttctgataagttgcttgatttggttggacttggtggcaagtctgccgctgataaaggaaaggatactcgtgattatcttgctgctg catttcctgagcttaatgcttgggagcgtgctggtgctgatgcttcctctgctggtatggttgacgccggatttgagaatcaaaaagagcttactaaaatgcaactggac aatcagaaagagattgccgagatgcaaaatgagactcaaaaagagattgctggcattcagtcggcgacttcacgccagaatacgaaagaccaggtatatgcacaaaatga gatgcttgcttatcaacagaaggagtctactgctcgcgttgcgtctattatggaaaacaccaatcttcccaagcaacagcaggtttccgagattatgcgccaaatgctta ctcaagctcaaacggctggtcagtattttaccaatgaccaaatcaaagaaatgactcgcaaggttagtgctgaggttgacttagttcatcagcaaacgcagaatcagcgg tatggctcttctcatattggcgctactgcaaaggatatttctaatgtcgtcactgatgctgcttctggtgtggttgatatttttcatggtattgataaagctgttgccga tacttggaacaatttctggaaagacggtaaagctgatggtattggctctaatttgtctaggaaataaccgtcaggattgacaccctcccaattgtatgttttcatgcctc 
caaatcttggaggcttttttatggttcgttcttattacccttctgaatgtcacgctgattattttgactttgag
1977: ΦX174 virus genome

1982: the first electronic databases

Accelerating database searches: hash methods
FASTA: 1982, Wilbur and Lipman; 1985, Lipman and Pearson
BLAST: 1990, Altschul, Gish, Miller, Myers and Lipman
Query sequence: WATSNANDCRICK
Hash table (k = 1) keys: ACDIKNRSTW
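The hash-table idea on this slide can be sketched in a few lines: index every word of length k in the query by its positions, then look up the words of a database sequence to find candidate matches ("seeds") without comparing every pair of positions. The function names and the target string `CRICKET` are illustrative assumptions; actual FASTA and BLAST implementations add scoring, extension and statistics on top of this step.

```python
from collections import defaultdict


def build_kmer_index(sequence: str, k: int = 1) -> dict:
    """Map each word of length k to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(sequence) - k + 1):
        index[sequence[pos:pos + k]].append(pos)
    return dict(index)


def find_seeds(index: dict, target: str, k: int = 1) -> list:
    """Return (query_pos, target_pos) pairs where the same k-mer occurs in both."""
    seeds = []
    for t_pos in range(len(target) - k + 1):
        for q_pos in index.get(target[t_pos:t_pos + k], []):
            seeds.append((q_pos, t_pos))
    return seeds


index = build_kmer_index("WATSNANDCRICK", k=1)
print("".join(sorted(index)))  # ACDIKNRSTW
print(index["A"])              # [1, 5] (0-based positions)

# Hypothetical target sequence, with k = 2 to make the seeds more specific.
print(find_seeds(build_kmer_index("WATSNANDCRICK", k=2), "CRICKET", k=2))
```

Larger k gives fewer, more specific seeds, at the cost of missing matches that are shorter than k.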

1982, Doolittle: relationship between oncogenes and growth factors. Search of the Platelet-Derived Growth Factor sequence.

1990: The Human Genome Project
The Human Genome Program (HGP) is producing large quantities of complex map and DNA sequence data. Informatics projects in algorithms, software, and databases are crucial in accumulating and interpreting these data in a robust and automated fashion at genome and sequencing centers. Computer systems play essential roles in all aspects of genome research, from data acquisition and analysis to data management. Without powerful computers and appropriately designed data-management systems, high-volume genome research cannot proceed.

1990: WWW at CERN
“This proposal concerns the management of general information about accelerators and experiments at CERN. It discusses the problems of loss of information about complex evolving systems and derives a solution based on a distributed hypertext system.” (Tim Berners-Lee)

Human Genome Project Milestones

2001: the culmination of the project

Medline articles with the keyword “Bioinformatics” (table: year, number of articles). To 1990: 0

Bioinformatics, Genomics, Systems Biology in Medline

“What is past is prologue.” W. Shakespeare, The Tempest

Mid-1970s: DNA sequencing (Sanger; Maxam and Gilbert). By the end of the sixties, hundreds of proteins had been sequenced, but the sequencing of nucleic acids remained elusive. Sanger (Cambridge); Maxam and Gilbert (Harvard).

ABI PRISM 3700 DNA Analyzer

Further Evolution of Large-scale Genome Sequencing
Data unit: approximately 10x coverage of a human genome
2000: Human genome working drafts; 10 years and a cost of about $3 billion
2008: Major genome centers can sequence the same number of base pairs every 4 days; 1000 Genomes Project launched; world-wide capacity dramatically increasing
2009: Every 4 hours ($25,000)
2010: Every 14 minutes ($5,000)
Illumina HiSeq 2000 machine produces 200 gigabases per 8-day run (BGI have ordered 128)
Slide from Paul Flicek, EBI. Bioinformatics Advisory Council
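To make the scale of these changes concrete, the figures on this slide can be turned into ratios. The sketch below uses only numbers stated on the slide; the variable and function names are made up, and a 365-day year is assumed for the ten-year figure.

```python
MINUTES_PER_DAY = 24 * 60

# Time needed to produce one data unit (~10x human coverage), per the slide.
TIME_PER_UNIT_MINUTES = {
    2000: 10 * 365 * MINUTES_PER_DAY,  # 10 years (365-day years assumed)
    2008: 4 * MINUTES_PER_DAY,         # 4 days
    2009: 4 * 60,                      # 4 hours
    2010: 14,                          # 14 minutes
}


def speedup(year_from: int, year_to: int) -> float:
    """How many times faster one data unit is produced in year_to vs. year_from."""
    return TIME_PER_UNIT_MINUTES[year_from] / TIME_PER_UNIT_MINUTES[year_to]


print(speedup(2000, 2008))         # 912.5
print(speedup(2008, 2009))         # 24.0
print(round(speedup(2009, 2010)))  # 17

# Throughput of one machine at 200 gigabases per 8-day run.
print(200 / 8)                     # 25.0 gigabases per day
```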

ENIAC, 1950s: 2.4 x 0.9 x 30 (m); 385 operations/second; […] operations/second/cm^3
MacBook Air, 2010s: ~1 x 32.5 x 22.7 (cm); 133,656,056 operations/second; […] operations/second/cm^3
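The density figures did not survive in this transcript, but they can be recomputed from the dimensions and speeds that did. The sketch below is that arithmetic only; the function name is made up, and the results are derived here rather than copied from the slide, so they may differ from the slide's rounding.

```python
def ops_per_second_per_cm3(ops_per_second: float, dims_cm: tuple) -> float:
    """Operations per second divided by the volume of a box, in cm^3."""
    width, depth, height = dims_cm
    return ops_per_second / (width * depth * height)


# ENIAC: 2.4 x 0.9 x 30 m, converted to centimetres.
eniac = ops_per_second_per_cm3(385, (240, 90, 3000))
# Laptop: ~1 x 32.5 x 22.7 cm.
laptop = ops_per_second_per_cm3(133_656_056, (1, 32.5, 22.7))

print(f"{eniac:.2e}")           # 5.94e-06
print(f"{laptop:.2e}")          # 1.81e+05
print(f"{laptop / eniac:.1e}")  # ratio between the two densities
```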

Celera Genomics, year […]: […],000 m^2; 2 yr; 3 Gb at 10x; 5x10^-6 Gb/day/m^3
HiSeq, year […]: […] x 94 x 76 (cm); 1 day; 120 Gb; 10^2 Gb/day/m^3

Moore’s Law