Bioinformatics www.geocities.com/mirkozimic/bioinfo Introduction, Biological Databases Prof. Mirko Zimic.


Bioinformatics: Introduction, Biological Databases. Prof. Mirko Zimic

What is Bioinformatics? Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data. What is Computational Biology? The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems. (Working Definition of Bioinformatics and Computational Biology - July 17, 2000).

The “Ideal” Syllabus. Molecular Biology: basic concepts, genomic and proteomic structure. Core Bioinformatics: biological databases, sequence analysis, functional genomics. Advanced Bioinformatics: molecular evolution and phylogeny, protein structure prediction, the transcriptome, the proteome. Informatics: information theory, basic statistics, database technologies, knowledge representation. Biocomputing.

Konrad Zuse with the reconstructed Z1. Zurich

During World War II, the British built Colossus to break German machine ciphers. Enigma

In 1944 IBM and Harvard University unveiled the Mark I, the first computer to fit the modern definition. It measured 15 meters long and 2.40 meters high, and weighed 10 tons. It used electromechanical relays.

This is one of the relays used in the Mark I

It added in less than a second, multiplied in about six seconds, and divided in about twelve.

Cost-effectiveness! Bioinformatics turns out to be a highly cost-effective discipline.

On Life... “Living things are composed of lifeless molecules” (Albert Lehninger) Can biology be reduced to the fundamental laws of physics?

Bioinformatics began with the development of biological databases, followed by the development of tools for fast information retrieval… Today, bioinformatics seeks to develop prediction algorithms based on the information stored in biological databases.

Historical Perspective Key developments: Dayhoff, Atlas of Protein Sequence and Structure ( ) Genbank/EMBL nucleic-acid sequence databases ( ) Entrez (early 90’s – date) Sequence alignment algorithms: Needleman/Wunsch (1970), Smith/Waterman (1981), FASTA (Pearson/Lipman, 1988), BLAST (Altschul, 1990) Genomes (1995 – date)

Collecting Sequence Data Genome (DNA-level): Genomic sequencing  Complete picture of genome  Generates physical map  Includes regulatory and other silent regions Transcriptome (RNA-level): Expression-library sequencing  Expressed genes only  Splicing / variant forms  Can correlate with levels of expression Proteome (protein-level): Protein sequencing  Insight into biological function  Gives information on protein-protein interactions  Post-translational modifications detected

The exponential growth of molecular sequence databases & CPU power (plot of base pairs and sequences by year; doubling time ~ one year)

Databases contain more than just DNA & protein sequences

The “omics” Series Genomics –Gene identification & characterisation Transcriptomics –Expression profiles of mRNA Proteomics –functions & interactions of proteins Structural Genomics –Large scale structure determination Cellinomics –Metabolic Pathways –Cell-cell interactions Pharmacogenomics –Genome-based drug design

Structural Genomics What is structural genomics? Genomes and folds: –Finding folds in genomes –Structural properties of entire proteomes –Comparing genomes in terms of structure Selection of targets for structural genomes –Covering the sequence space with structures –Using structure to understand function –Systematic structure determination for complete genomes –Special targets –Predicting success of structure determination Adaptation of proteins to extreme environments Structural genomics resources on the internet

Functional Genomics Development and application of global (genome-wide or system-wide) experimental approaches to assess gene function by making use of the information provided by structural genomics.

Commercial Structural Genomics Initiatives IBM (Blue Gene project: 2000) –Computational protein folding Geneformatics (1999) –Modeling for identifying active sites Prospect Genomics (1999) –Homology modeling Protein Pathways (1999) –Phylogenetic profiling, domain analysis, expression profiling Structural Bioinformatics Inc (1996) –Homology modeling, docking

Human Genome Project: the genome sequence is almost complete! –approximately 3.5 billion (3.5 × 10^9) base pairs.

Raw Genome Data

Implications for Biomedicine Physicians will use genetic information to diagnose and treat disease. –Virtually all medical conditions (other than trauma) have a genetic component. Faster drug development research –Individualized drugs –Gene therapy All Biologists will use gene sequence information in their daily work

Bioinformatics Challenges  Lots of new sequences being added - automated sequencers - Human Genome Project - EST sequencing  GenBank has over 10 Billion bases and is doubling every year!! (problem of exponential growth...)  How can computers keep up? The huge dataset
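
The growth claim above can be made concrete with a little arithmetic. The following is a minimal sketch that assumes a constant doubling time of one year and a starting size of 10 billion bases, both taken from the slide; the function names are hypothetical and real database growth does not follow a perfectly constant rate.

```python
import math


def projected_size(current_bases: float, years: float,
                   doubling_time_years: float = 1.0) -> float:
    """Size after `years`, assuming pure exponential growth."""
    return current_bases * 2 ** (years / doubling_time_years)


def years_to_reach(current_bases: float, target_bases: float,
                   doubling_time_years: float = 1.0) -> float:
    """Years needed to grow from current_bases to target_bases."""
    return doubling_time_years * math.log2(target_bases / current_bases)


if __name__ == "__main__":
    start = 10e9  # "over 10 billion bases" (figure quoted on the slide)
    for year in range(6):
        print(f"year +{year}: {projected_size(start, year):.3g} bases")
```

Under these assumptions the dataset grows 32-fold in five years, which is why the slide asks how computers can keep up.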

Genome comparisons Designed for looking at complete bacterial genomes.

AT content Forward translations Reverse Translations DNA and amino acids Gene finding
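
The operations listed on this slide are simple enough to sketch in a few lines. The example below is illustrative only and is not the tool shown in the presentation: the function names are hypothetical, the standard genetic code is assumed, and "reverse translation" is interpreted here as translating the reverse-complement strand, which is an assumption about what the slide means.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def at_content(seq: str) -> float:
    """Fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse-complement strand of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate DNA to protein in the given frame (0, 1 or 2).

    Unknown codons become 'X'; a trailing partial codon is ignored.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    dna = "ATGGCCTAA"
    print(at_content(dna))
    print(translate(dna))                      # forward strand, frame 0
    print(translate(reverse_complement(dna)))  # reverse strand, frame 0
```

Calling `translate` with frames 0 to 2 on both strands gives the six reading frames that a simple gene-finding pass would scan for open reading frames.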

Bringing a New Drug to Market. Discovery and preclinical testing: compounds are identified and evaluated in laboratory and animal studies for safety, biological activity, and formulation (5,000 compounds evaluated). 5 compounds enter clinical trials. Phase I: evaluates safety and dosage in 20 to 100 healthy human volunteers. Phase II: assesses effectiveness and looks for side effects in 100 to 500 patient volunteers. Phase III: confirms effectiveness and monitors adverse reactions from long-term use in 1,000 to 5,000 patient volunteers. Review and approval by the Food & Drug Administration: 1 compound approved. Time axis: years, up to 16.

Impact of Structural Genomics on Drug Discovery

Epitopes … B-cell epitopes, Th-cell epitopes

Vaccine development in the post-genomic era: the reverse vaccinology approach.

How a molecule changes during MD

In Silico Analysis Gene/Protein Sequence Database Disease related protein DB Candidate Epitope DB VACCINOME Peptide Multitope vaccines Epitope prediction

Biological Research in the 21st Century: “The new paradigm, now emerging, is that all the 'genes' will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical.” - Walter Gilbert

II. The Role of the Biologist in the Information Age

The Internet provides abundant biological information. It can be overwhelming… (Web) New skills are needed: locating the required information efficiently.

Computing in the lab - everyday tasks (vs. computational biology)  ordering supplies  reference books  lab notes  literature searching

Training "computer" scientists  Know the right tool for the job  Get the job done with tools available  Network connection is the lifeline of the scientist  Jobs change, computers change, projects change, scientists need to be adaptable

The job of the biologist is changing As more biological information becomes available … –The biologist will spend more time using computers –The biologist will spend more time on data analysis (and less doing lab biochemistry) –Biology will become a more quantitative science (think how the periodic table and atomic theory affected chemistry)

Setting up a workstation for bioinformatics analysis -Windows vs. Linux -Freeware / open-source software -Free online databases -Computing clusters -Grids

An example… Cysteine protease of Fasciola hepatica: in search of an immunogenic peptide

Alignment: mammalian cysteine proteases vs. the Fasciola hepatica cysteine protease. Legend: identical AA, divergent AA

Discontinuous epitope, formed by distant portions of the sequence. Denaturation: the epitope is lost upon denaturation.

Denaturation: the epitope is preserved as such. Continuous epitope, formed by a single portion of the sequence

Three-dimensional homology modeling. 56% sequence identity with chymopapain (1YAL)
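
A sequence-identity figure like the one quoted above is computed from a pairwise alignment. The sketch below shows one common convention (identical residues divided by the number of columns where neither sequence has a gap); this is an assumption, since the slide does not say which convention or alignment program was used, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Columns where either sequence has a gap are excluded from the count.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy alignment, not the real protease sequences.
    print(percent_identity("ACDEFG-HIK", "ACDQFGWHLK"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different tools are not always directly comparable.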

Identical AA, divergent AA. Surface analysis: atoms displayed by van der Waals radius

TMEGQYMKNERTSISFS YYTVQSGSEVELK NLIGSE QSQTCSPLRVN RYNKQLGVAKV Selection of sequences that are (1) divergent, (2) solvent-accessible, and (3) continuous.
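
The three selection criteria can be expressed as a scan for runs of alignment columns. The following is a hedged sketch, not the procedure actually used in the presentation: the function name, the per-column accessibility flags, and the minimum length are all hypothetical, and the input is assumed to be a pairwise alignment of the target against a single reference.

```python
def candidate_peptides(target: str, reference: str, accessible: list,
                       min_len: int = 5, gap: str = "-") -> list:
    """Return continuous target segments that are divergent and accessible.

    target, reference: aligned sequences of equal length.
    accessible: one boolean per alignment column (True = solvent-exposed).
    Columns where the target has a gap are skipped without breaking a run,
    because they do not interrupt the target's own sequence.
    """
    if not (len(target) == len(reference) == len(accessible)):
        raise ValueError("inputs must have the same length")
    peptides, run = [], []
    for t, r, exposed in zip(target, reference, accessible):
        if t == gap:
            continue
        if t != r and exposed:
            run.append(t)
        else:
            if len(run) >= min_len:
                peptides.append("".join(run))
            run = []
    if len(run) >= min_len:
        peptides.append("".join(run))
    return peptides


if __name__ == "__main__":
    # Toy data only; the accessibility flags are invented for illustration.
    tgt = "AKLMNPQRST"
    ref = "AKAAAAQRST"
    print(candidate_peptides(tgt, ref, [True] * 10, min_len=3))
```

In practice the accessibility flags would come from a structural model, and divergence would be judged against several mammalian sequences rather than one.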

Another example… Sensitivity of the HIV-1 aspartyl protease to the most common inhibitors

“Cartoon” representation of the HIV-1 protease enzyme

HIV-1 protease enzyme showing the secondary-structure elements, flaps, and active site

HIV-1 protease enzyme indicating the consensus residues for inhibitor-enzyme binding

INDINAVIR

RITONAVIR

COMPARISON BETWEEN A RITONAVIR-SENSITIVE AND A RITONAVIR-RESISTANT ENZYME

One more example… Phylogenetic ordering and GC content in trypanosomatids

Reported %GC variation for each codon position in Trypanosomatids (Alonso et al,1992)
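
The quantity plotted here, %GC at each codon position, is straightforward to compute from a coding sequence. This is a minimal sketch with a hypothetical function name; it assumes the input is an in-frame coding sequence starting at the first codon position, and it is not the method of the cited study.

```python
def gc_by_codon_position(cds: str) -> tuple:
    """Return (%GC1, %GC2, %GC3) for an in-frame coding sequence.

    A trailing partial codon, if any, is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    result = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this position
        gc = sum(base in "GC" for base in bases)
        result.append(100.0 * gc / len(bases) if bases else 0.0)
    return tuple(result)


if __name__ == "__main__":
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCCTAA")
    print(f"GC1={gc1:.1f}%  GC2={gc2:.1f}%  GC3={gc3:.1f}%")
```

The third position is usually the most informative for compositional bias, because most changes there are synonymous and therefore less constrained by selection on the protein.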

Codon usage in Trypanosomatids leucine

Codon usage in Trypanosomatids serine
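
Codon usage for a single amino acid, such as leucine or serine (both have six synonymous codons), can be tabulated as relative frequencies. The sketch below assumes the standard genetic code and an in-frame coding sequence; the function name is hypothetical and the example sequence is invented, not trypanosomatid data.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(cds: str, amino_acid: str) -> dict:
    """Relative frequency of each synonymous codon used for one amino acid."""
    cds = cds.upper()
    counts = Counter()
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if CODON_TABLE.get(codon) == amino_acid:
            counts[codon] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


if __name__ == "__main__":
    toy_cds = "CTGCTGTTAAGCTCT"  # invented sequence: Leu Leu Leu Ser Ser
    print("Leu:", codon_usage(toy_cds, "L"))
    print("Ser:", codon_usage(toy_cds, "S"))
```

Comparing these frequencies across species shows whether GC-rich codons (for example CTG or CTC for leucine) are preferred over AT-rich ones (TTA), which ties codon usage back to the genomic GC content.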

Phylogeny of Trypanosomatid lineage (Maslov & Simpson)

“Hole” formation by DNA replication

GC content variation in time Restriction: AA family conservation and AA conservation

%GC variation in Trypanosomatid lineage (Nuclear coding DNA)