Structural Bioinformatics


Presentation on: "Structural Bioinformatics". Transcript of the presentation:

1 Structural Bioinformatics
Elodie Laine. Bachelor's degree in Biology (Licence de Biologie), Semester 2. Computational and Quantitative Biology, UMR 7238, CNRS-UPMC. e-documents:

2 Proteins: several dimensions
Proteins are biological objects that can be viewed and represented in several dimensions. Proteins play essential and very diverse roles in all processes governing life. They are made of essential building blocks, namely amino acids: organic compounds containing an amine group (NH2), a carboxylic acid group (COOH) and a radical. In proteins they are linked by peptide bonds, so they are not amino acids per se but amino acid residues. The common scaffold constitutes the backbone of the protein and the radicals form the side chains. There are 20 different types of amino acids encoded in protein genes, displaying different radicals and thus different physicochemical properties. 3V686 –

3 Amino acids: an amino acid, the peptide bond
Arginine (R), lysine (K), aspartate (D), glutamate (E), asparagine (N), glutamine (Q), cysteine (C), methionine (M), histidine (H), serine (S), threonine (T), valine (V), leucine (L), isoleucine (I), phenylalanine (F), tyrosine (Y), tryptophan (W), glycine (G), alanine (A), proline (P).
Amino acids are the basic building blocks of proteins. There are 20 types, with different sizes and physicochemical properties. Inside proteins, they are connected sequentially by peptide bonds and are called amino acid "residues". Each residue contains [...] atoms.

4 Amino acids: an amino acid
The 20 types of amino acids are denoted by three-letter or one-letter codes: ARG (R), LYS (K), ASP (D), GLU (E), ASN (N), GLN (Q), CYS (C), MET (M), HIS (H), SER (S), THR (T), VAL (V), LEU (L), ILE (I), PHE (F), TYR (Y), TRP (W), GLY (G), ALA (A), PRO (P).
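The correspondence between three-letter and one-letter codes can be written down as a lookup table. The following is a minimal sketch; the names `THREE_TO_ONE` and `to_one_letter` are illustrative and do not come from the slides.

```python
# Three-letter to one-letter codes for the 20 amino acid types listed on the slide.
THREE_TO_ONE = {
    "ARG": "R", "LYS": "K", "ASP": "D", "GLU": "E", "ASN": "N",
    "GLN": "Q", "CYS": "C", "MET": "M", "HIS": "H", "SER": "S",
    "THR": "T", "VAL": "V", "LEU": "L", "ILE": "I", "PHE": "F",
    "TYR": "Y", "TRP": "W", "GLY": "G", "ALA": "A", "PRO": "P",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes into a one-letter sequence string."""
    return "".join(THREE_TO_ONE[name.upper()] for name in residues)


# Hypothetical input: five residue names given in three-letter code.
print(to_one_letter(["Gln", "Asn", "Cys", "Gln", "Leu"]))  # prints QNCQL
```

A dictionary keeps the lookup explicit; an unknown residue name raises a `KeyError` instead of being silently skipped.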

5 ~10 to ~1000 amino acid residues
Proteins: several levels of organization. 1st level of organization: primary structure. …QNCQLRPSGWQCRPTRGDCDLPEFCPGDSSQCPDVSLGDG… ~10 to ~1000 amino acid residues. Covalent bonds. 1 protein = 1 polypeptide chain.

6 Weak chemical bonds from backbone to backbone
Proteins: several levels of organization. 2nd level of organization: secondary structure. β-sheet, α-helix. Weak chemical bonds from backbone to backbone. Other elements: 3₁₀ helix > turns > loops > random coil.


8 Proteins: several levels of organization
3rd level of organization: tertiary structure. A protein sequence adopts a particular fold in solution, which corresponds to the free energy minimum. Types of non-covalent interactions: salt bridge, hydrogen bond, hydrophobic contact, van der Waals, pi-pi stacking.

9 4th level of organization: quaternary structure
Proteins: several levels of organization. 4th level of organization: quaternary structure. Arrangement of the domains within a protein, or of the proteins within a macromolecular assembly.

10 Three-dimensional structure
Protein structures: a brief history. Three-dimensional structure of myoglobin: Kendrew et al. (1958) Nature. α-helix model: Pauling & Corey (1951) PNAS.

11 Hydrophobic-polar (HP) 2D-lattice model
Protein folding. The geometric configuration of the native state of a protein determines its macroscopic properties, its dynamic behavior and its function. The number of possible conformations for a given protein is astronomical.
Levinthal's paradox, e.g.: 100 aa, 3 conformations/aa => 3^100 ≈ 5 × 10^47 conformations; 1 fold per 10^-13 s => 10^27 years (age of the universe: 10^10 years). And yet proteins fold spontaneously within a few milliseconds. How is this possible?
The space of biologically accessible conformations is much smaller than that of all possible conformations: rapid formation of local interactions (folding of modules, nucleation points); presence of intermediate/transition states (molten globules); funnel-like energy landscape where the native state corresponds to a deep free energy minimum.
Hydrophobic-polar (HP) 2D-lattice model: a square lattice where each bead is an amino acid, with 2 types of amino acids and non-zero interaction energies only for HH contacts.
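The arithmetic behind Levinthal's paradox and the scoring rule of the HP lattice model can both be sketched in a few lines. This is an illustration under stated assumptions, not the course's own code: the function names are hypothetical, a year is taken as 365.25 days, and the HH contact energy of -1 is an assumed value (the slide only says that HH contacts are the only non-zero interactions).

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def levinthal_years(n_residues=100, conf_per_residue=3, seconds_per_conf=1e-13):
    """Years needed to visit every conformation one after the other."""
    n_conformations = conf_per_residue ** n_residues
    return n_conformations * seconds_per_conf / SECONDS_PER_YEAR


def hp_energy(sequence, coords, contact_energy=-1):
    """Energy of an HP chain placed on a 2D square lattice.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) beads.
    coords: one (x, y) integer lattice point per bead.
    Only H-H pairs that are lattice neighbours without being bonded
    neighbours in the chain contribute (contact_energy is assumed).
    """
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # i + 1 is the bonded neighbour
            if sequence[i] == "H" and sequence[j] == "H":
                dx = abs(coords[i][0] - coords[j][0])
                dy = abs(coords[i][1] - coords[j][1])
                if dx + dy == 1:  # adjacent lattice sites
                    energy += contact_energy
    return energy


print(f"{levinthal_years():.1e} years")
print(hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))  # folded: one HH contact
print(hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)]))  # extended: no contact
```

With the slide's numbers the first function gives a value on the order of 10^27 years. In the toy HP example, the folded square brings the two H beads into contact and is therefore lower in energy than the extended chain.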

12 Sequence-structure-function paradigm
Dynamics. Today, this and other similar programs predict that about 40% of all human proteins contain at least one intrinsically disordered segment of 30 amino acids or more, and that some 25% are likely to be disordered from beginning to end. The disordered segments of the tumor suppressor protein p53 allow it to interact with several hundred different partners.

13 Functions of proteins
Pumps chemicals out of cells; stores iron in cells; recognizes foreign bodies; hormones; supports organs and tissues; light sensor; digests food in the stomach; copies the information contained in a DNA strand; forms structural pillars; rotary motor powered by electrochemical energy.

14 Complexity of living systems
Several isoforms of the same protein can be produced from a single gene by alternative splicing. Proteins are dynamic objects: they can adopt several conformations in solution. Proteins do not act alone: they form a complex network of interactions with each other, with DNA/RNA and with small molecules (ATP...). A single protein can perform several completely different functions (moonlighting proteins).

15 The gap between sequences and structures

16 Experimental structure determination
Electron density map obtained from X-ray crystallography. Multiple models obtained from nuclear magnetic resonance. These two experimental techniques are the most widely used to determine the three-dimensional coordinates of protein structures. They require computational methods to generate models.
[...] fold recognition and ab initio protein structure prediction, classification of structural motifs, and refinement of sequence alignments. The accuracy of current protein secondary structure prediction methods is assessed in weekly benchmarks such as LiveBench and EVA.

17 X-ray crystallography
The first step is to grow crystals of the molecule that we are interested in. There are lots of ways to produce crystals. The next step is to put the crystals in a special X-ray beam. The crystal scatters the X-rays onto an electronic detector, which functions as a recorder. With specialized computer programs (or in today's jargon "apps"), it is possible to use the information gathered on the detector to construct a so-called "electron density map", which is basically a roadmap that tells us what the molecule looks like in three dimensions. From the map, a model of the molecule is constructed using specialized computer graphics programs. The process is summarized in the figure below.

18 Diffraction pattern
The relative intensities of the spots provide the information needed to determine the x, y, z positions of each atom of the crystallized protein. The minimal distance between two spots defines the resolution.
When a crystal is mounted and exposed to an intense beam of X-rays, it scatters the X-rays into a pattern of spots or reflections. The relative intensities of these spots provide the information to determine the arrangement of molecules within the crystal in atomic detail. Each spot corresponds to a different type of variation in the electron density; the crystallographer must determine which variation corresponds to which spot (indexing), the relative strengths of the spots in different images (merging and scaling) and how the variations should be combined to yield the total electron density (phasing). One image of spots is insufficient to reconstruct the whole crystal: to collect all the necessary information, the crystal must be rotated step-by-step through 180°, with an image recorded at every step.

19 Nuclear magnetic resonance (NMR)
Atomic nuclei (¹H or ¹⁵N) possess an intrinsic spin angular momentum, which is modified under the effect of an external magnetic field. When the nuclear magnetic moment associated with a nuclear spin is placed in an external magnetic field, the different spin states are given different magnetic potential energies. In the presence of the static magnetic field, which produces a small amount of spin polarization, a radio frequency signal of the proper frequency can induce a transition between spin states. This "spin flip" places some of the spins in their higher energy state. If the radio frequency signal is then switched off, the relaxation of the spins back to the lower state produces a measurable amount of RF signal at the resonant frequency associated with the spin flip. This process is called Nuclear Magnetic Resonance (NMR).

20 NMR experiment
The precession of the proton spin in the magnetic field is the interaction which is used in proton NMR. As a practical technique, a sample containing protons (hydrogen nuclei) is placed in a strong magnetic field to produce partial polarization of the protons. A strong RF field is also imposed on the sample to excite some of the nuclear spins into their higher energy state. When this strong RF signal is switched off, the spins tend to return to their lower state, producing a small amount of radiation at the Larmor frequency associated with that field. The emission of radiation is associated with the "spin relaxation" of the protons from their excited state. It induces a radio frequency signal in a detector coil which is amplified to display the NMR signal.
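The Larmor frequency mentioned in these notes scales linearly with the applied field: f = γB / (2π). As a rough sketch, the proton frequency for a given field strength can be computed as follows. The gyromagnetic ratio is the rounded CODATA value for the proton and is not taken from the slides; the function name and the two field strengths are illustrative.

```python
import math

# Proton gyromagnetic ratio in rad s^-1 T^-1 (rounded CODATA value, not from the slides).
GAMMA_PROTON = 2.6752218744e8


def larmor_frequency_mhz(b_field_tesla, gamma=GAMMA_PROTON):
    """Larmor frequency f = gamma * B / (2 * pi), returned in MHz."""
    return gamma * b_field_tesla / (2 * math.pi) / 1e6


# Illustrative field strengths in tesla.
for b_field in (1.5, 14.1):
    print(f"B = {b_field} T -> {larmor_frequency_mhz(b_field):.1f} MHz")
```

A field of about 14.1 T corresponds to a proton frequency of roughly 600 MHz, which is why NMR spectrometers are commonly named after their proton frequency.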

21 Protein structure database

22 Example entry: structure solved by electron microscopy

23 Crystallographic structures
The asymmetric unit of the crystal may contain several different proteins (chains) and/or several copies of the same protein. This arrangement is determined by the physical constraints of the crystal. Asymmetric unit (AU): {A-D, B-E, C-F}.
The biological unit corresponds to the functional arrangement of the protein(s) in the cell. It is either known from experiments or predicted. {A-D} {B-E} {C-F}

24 PDB format: the first column indicates the section
Information about the molecule(s) present in the PDB entry.

25 PDB format: section containing the three-dimensional coordinates of the protein atoms
Columns: atom id, atom type, amino acid type, chain, amino acid id, x coordinate, y coordinate, z coordinate, occupancy, B-factor, element.
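Because the coordinate section of a PDB file is fixed-width, the fields listed above can be extracted by slicing each line at known column positions. The sketch below uses the column ranges of the standard ATOM record; the function names, dictionary keys and file path are hypothetical, and the code assumes complete, well-formed records (blank occupancy or B-factor fields would raise an error).

```python
def parse_atom_line(line):
    """Parse one fixed-width ATOM record of a PDB file into a dict."""
    return {
        "atom_id": int(line[6:11]),         # columns 7-11
        "atom_name": line[12:16].strip(),   # columns 13-16
        "res_name": line[17:20].strip(),    # columns 18-20
        "chain": line[21],                  # column 22
        "res_id": int(line[22:26]),         # columns 23-26
        "x": float(line[30:38]),            # columns 31-38
        "y": float(line[38:46]),            # columns 39-46
        "z": float(line[46:54]),            # columns 47-54
        "occupancy": float(line[54:60]),    # columns 55-60
        "b_factor": float(line[60:66]),     # columns 61-66
        "element": line[76:78].strip(),     # columns 77-78
    }


def read_atoms(path):
    """Return the parsed ATOM records of a PDB file (path is hypothetical)."""
    with open(path) as handle:
        return [parse_atom_line(line) for line in handle if line.startswith("ATOM")]
```

Slicing by column rather than splitting on whitespace matters here, since adjacent fields in a PDB file can touch each other without any separating space.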

26 3D visualization
Representations: sticks, spheres, surface, cartoon. Visualization software: PyMOL, Chimera, VMD...

27 Basic unit: the domain
A protein domain is a stable unit of protein structure that can fold independently. Small proteins and most medium-sized ones have a single domain. Historically, protein domains have been described on the basis of their structural compactness, function, evolution or folding.
The concept of the domain was first proposed in 1973 by Wetlaufer after X-ray crystallographic studies of hen lysozyme [1] and papain [2] and by limited proteolysis studies of immunoglobulins.[3][4] Wetlaufer defined domains as stable units of protein structure that could fold autonomously. Each definition is valid and will often overlap, i.e. a compact structural domain that is found amongst diverse proteins is likely to fold independently within its structural environment. Nature often brings several domains together to form multidomain and multifunctional proteins with a vast number of possibilities. 90% of domains have fewer than 200 residues[25], with an average of approximately 100 residues.[26]
An appropriate example is pyruvate kinase, a glycolytic enzyme that plays an important role in regulating the flux from fructose-1,6-bisphosphate to pyruvate. It contains an all-β regulatory domain, an α/β-substrate binding domain and an α/β-nucleotide binding domain, connected by several polypeptide linkers [9] (see figure, right). Each domain in this protein occurs in diverse sets of protein families. The central α/β-barrel substrate binding domain is one of the most common enzyme folds. It is seen in many different enzyme families catalysing completely unrelated reactions.[10] The α/β-barrel is commonly called the TIM barrel, named after triose phosphate isomerase, which was the first such structure to be solved.[11] It is currently classified into 26 homologous families in the CATH domain database.[12] The TIM barrel is formed from a sequence of β-α-β motifs closed by the first and last strand hydrogen bonding together, forming an eight-stranded barrel. There is debate about the evolutionary origin of this domain. One study has suggested that a single ancestral enzyme could have diverged into several families,[13] while another suggests that a stable TIM-barrel structure has evolved through convergent evolution.[14]
The TIM barrel in pyruvate kinase is 'discontinuous', meaning that more than one segment of the polypeptide is required to form the domain. This is likely to be the result of the insertion of one domain into another during the protein's evolution. It has been shown from known structures that about a quarter of structural domains are discontinuous.[15][16] The inserted β-barrel regulatory domain is 'continuous', made up of a single stretch of polypeptide. Covalent association of two domains represents a functional and structural advantage, since there is an increase in stability when compared with the same structures non-covalently associated.[17] Other advantages are the protection of intermediates within inter-domain enzymatic clefts that may otherwise be unstable in aqueous environments, and a fixed stoichiometric ratio of the enzymatic activity necessary for a sequential set of reactions.[18]
Figure: pyruvate kinase.

28 Structural classification of proteins
What is the motivation for classifying protein structures? To better understand the biological functions of proteins, and to determine the evolutionary relationships between proteins. Structures tend to diverge less than sequences. Proteins that share sequence similarity adopt similar shapes. Generally, above 40% sequence identity, the structures are very similar. Decarboxylases with 21% sequence identity: convergent or divergent evolution?
Nearly all proteins share some degree of structural similarity. Grouping proteins that adopt the same 3D shape makes it possible to better understand their function and to determine the evolutionary relationship between them.
Figure: ribbon diagram of the structure of a monomer of benzoylformate decarboxylase (BFD) and pyruvate decarboxylase (PDC). BFD (top) and PDC (bottom) share a common fold and overall biochemical function, but they recognize different substrates and have low (21%) sequence identity. The bound thiamine pyrophosphate cofactor is shown in space-filling representation in both structures. The green spheres are metal ions. (PDB 1bfd and 1pvd)
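The sequence identity figures quoted here (40%, 21%) are computed from a pairwise alignment. A minimal sketch is given below, assuming the two sequences are already aligned and of equal length; the function name, the toy sequences and the choice of ignoring gapped columns are assumptions, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns with a gap in either sequence are ignored; this is one
    convention among several for choosing the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (hypothetical sequences): 8 identical residues out of 9 compared columns.
print(f"{percent_identity('QNCQLRPSGW', 'QNCELRP-GW'):.1f}%")
```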

29 Structural classification of proteins
Levels of classification (diagram axes: local/global; increasing similarity):
secondary structure: α-helix, β-sheet, loop…
domain: protein structural unit
class: secondary structure content
fold/topology: overall shape
superfamily: similar function & homology
resources:

30 Basic unit: the domain
Classes: (1) all alpha; (2) all beta; (3) alpha and beta, mixed (α/β); (4) alpha and beta, segregated (α+β); (5) small: metal ligand, heme and/or disulfide bridges.
Figure: pyruvate kinase, with domains labelled (2) and (3).

31 Structural classes: All Alpha, All Beta, Alpha/Beta, Alpha+Beta
Myohemerythrin: All Alpha. Neuraminidase (beta propeller): All Beta. TATA-binding protein: Alpha+Beta. Aspartate semialdehyde dehydrogenase: Alpha/Beta.

32 How to compare two structures?
Root-mean-square deviation (RMSD) after superposition. This measure expresses the minimal overall average distance between the n corresponding atoms of the superposed structures a and b, where (x, y, z) are the Cartesian atomic coordinates. The RMSD can be computed on a selection of atoms (backbone, heavy atoms…). Computing the RMSD requires that exactly n atoms of structure a correspond to n atoms of structure b.
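In its standard form, the RMSD is the square root of the mean squared distance between corresponding atoms: RMSD = sqrt((1/n) Σ [(x_a,i − x_b,i)² + (y_a,i − y_b,i)² + (z_a,i − z_b,i)²]). The sketch below implements only this formula and assumes the two structures have already been superposed; it does not perform the optimal superposition itself. The function name and the toy coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two already-superposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("both structures must contain the same non-zero number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom of b is displaced by 1 unit along z.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(a, b))  # prints 1.0
```

The length check mirrors the requirement stated above that exactly n atoms of one structure correspond to n atoms of the other.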

33 Structural bioinformatics: what for?
Predict protein structures. Compare protein structures. Simulate protein motions and... characterize their interactions, in order to... design new drugs.

34 Conclusion
Proteins are biological objects with several levels of organization. They are composed of amino acid residues, which are themselves composed of atoms. They perform a wide variety of biological functions. They adapt their shape and their motions to environmental conditions. They interact with each other and with other molecules in the cell. Predicting and characterizing protein structure makes it possible to describe and understand the molecular mechanisms that underlie biological processes.

