Showing posts with label genes. Show all posts

Wednesday, October 27, 2010

Big data


It took 13 years to crunch through the 3 billion base pairs that make up the human genome. These data have been violating our assumptions ever since. My introductory biology textbook, published in 1996, speculates that there might be up to 100,000 genes in the genome. It turns out there are far fewer: about 20,000-30,000 by more recent estimates. The Human Genome Project sequenced only a few individuals and combined them into a single reference genome. However, many of the big questions we have about genetics concern the differences between individuals.

We are starting to get answers to these questions. Today’s issue of Nature carries a paper from the 1000 Genomes Project, a massive collaborative effort spanning three continents that is designed to describe and explain the variation between individuals’ genomes.

In this work, several types of variation were investigated in independent pilot studies. First, patterns within several mother-father-child trios were examined. Second, a group of 179 people had their whole genomes sequenced. Last, sparser sequencing was done on ~700 people from very diverse genetic backgrounds. While this paper mostly serves as a progress report and proof of concept, one very interesting bit is the finding that, on average, any given person carries 50-100 gene variants that have been associated with a higher risk of illness. This is very reminiscent of last week’s PNAS article showing that possessing such “risky” alleles does not decrease your lifespan to a statistically significant degree.



The 1000 Genomes Project Consortium (2010). A map of human genome variation from population-scale sequencing. Nature, 467 (7319), 1061-1073. DOI: 10.1038/nature09534

Thursday, September 16, 2010

Why scientists aren’t going to find my neurotic gene any time soon

There has been some talk lately about a recent study that found no robust statistical associations between variation across the whole genome and scores on a standard personality test.

Seem surprising? Not so much.

So, a gene codes for a protein. What does the protein do? Such a great number of things that it’s difficult to even list them: a protein can become a structural element of a cell (such as the actin and myosin that make up muscle tissue), or it can become a neurotransmitter or another messenger in a complicated cascade of signaling events. For example, the PubMed description of the protein neuregulin 1 (statistically associated with schizophrenia and with high creativity) starts with “The protein encoded by this gene was originally identified as a 44-kD glycoprotein that interacts with the NEU/ERBB2 receptor tyrosine kinase to increase its phosphorylation on tyrosine residues.” Technical language aside, the very first thing this description tells us about the protein is its relationship to another protein, within a specific biochemical context (phosphorylation).

Alright. So a gene codes for a protein, and this protein is a widget that works in concert with other such widgets in a particular biochemical and environmental context. Does it even make sense to say that there is a gene for a complex behavioral phenomenon such as schizophrenia, depression, or a neurotic personality? Not very much, really. At least not in the sense of “I have this computer for writing this blog post”. Kenneth Kendler underlines the lack of a direct causal link with the following analogy:

“A jumbo jet contains about as many parts as there are genes in the human genome. If someone went into the fuselage and removed a 2-foot length of hydraulic cable connecting the cockpit to the wing flaps, the plane could not take off. Is this piece of equipment then a cable for flying?”

While most people would answer that no, the cable does not directly cause the jet to fly, this is exactly the logic we use when we try to find a gene for X.

The issue is that we expect genes to have very lawful 1:1 correspondences with specific traits because in school we learned about Mendel’s pea pods, or cystic fibrosis, or Huntington’s disease, which show such a relationship. This type of inheritance seems to be the exception rather than the rule. There is a wide distribution of association strengths between a single gene and a particular outcome. Scientists express this strength using a statistic called the odds ratio. Briefly, this is the odds that someone with gene A will have disease X, divided by the odds that someone without gene A will have disease X. For a completely Mendelian disease (one like cystic fibrosis that cannot be contracted through the environment), the odds ratio is infinite because if you have the gene, you will always have the disease, and if you don’t have the gene, you never will. Statistical associations that we perceive as strong (such as the link between heavy smoking and lung cancer) have an odds ratio of about 20. Psychiatric associations, on the other hand, have odds ratios between 1 and 2. In other words, don’t rush out to get genetically tested for depression. It won’t do you much good.
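To make the odds ratio a little more concrete, here is a minimal Python sketch of the calculation from a 2x2 table of counts. The function names and every count below are invented for illustration; none of them come from a real study. They are simply chosen to land near the three cases described above (Mendelian, strong, and weak association).

```python
import math

def odds(affected, unaffected):
    """Odds of having the disease: affected / unaffected."""
    if unaffected == 0:
        # Everyone in the group is affected (or the group is empty).
        return math.inf if affected > 0 else math.nan
    return affected / unaffected

def odds_ratio(carriers_sick, carriers_well, others_sick, others_well):
    """Odds of disease among carriers divided by odds among non-carriers."""
    carrier_odds = odds(carriers_sick, carriers_well)
    other_odds = odds(others_sick, others_well)
    if other_odds == 0:
        # No non-carrier is ever sick: infinite if any carrier is.
        return math.inf if carrier_odds > 0 else math.nan
    return carrier_odds / other_odds

# Hypothetical counts, made up for this example only.
mendelian = odds_ratio(50, 0, 0, 950)    # every carrier sick, no one else
strong = odds_ratio(50, 50, 10, 200)     # odds of 1 versus odds of 0.05
weak = odds_ratio(30, 70, 25, 75)        # odds of ~0.43 versus ~0.33

print(mendelian, strong, weak)
```

With these made-up numbers, the Mendelian case comes out infinite, the strong case comes out at about 20, and the weak case comes out at roughly 1.3, which is the range where psychiatric associations tend to sit.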

 
In part, this lack of association is due to complex interactions between genes and the environment. For example, people with a particular variant of a serotonin transporter are more likely to experience depression, but only in the context of having experienced a stressful life event.

A possible exception to the single-genes-don’t-change-behavior-in-isolation rule might be the COMT gene. This gene makes the enzyme that breaks down several neurotransmitters in the brain, including dopamine. Like many genes, it comes in different variants (or alleles). However, unlike alleles that change an individual’s hair color, different alleles of the COMT gene have been associated with striking differences in cognitive function. Incredibly, these differences arise from a single amino acid difference in the enzyme! Having valine rather than methionine at amino acid position 158 of the enzyme is associated with a host of poorer psychological outcomes. As each person inherits one copy of the gene from each parent, individuals can have two valines, two methionines, or one of each. Interestingly enough, the number of valines correlates with the degree of negative outcome. For example, in a 2008 study, people recorded the events that were taking place in their lives and rated how positive these events were. The authors found that valine-valine individuals rated a very pleasant event only as positively as methionine-methionine people rated a sort-of positive event. Given these results, it is easy to see how valine-valine individuals might be more prone to difficulties with major depression and addiction.

 
So, genes code for proteins, which work together in an incredibly complex biochemical context created by other genes, the environment, and the interactions among the genes, the biochemical milieu, and the environment. Instead of asking ourselves why we haven’t found the gene for X, we should really be asking ourselves why we keep asking that question.