Cold Spring Harbor Laboratory on Long Island has a diverse offering of conferences that attract experts from all areas of biology. For the last six years, a sister group has been organizing conferences in Suzhou, China. I spent last week at the 2016 CSH Asia conference on Systems Biology.
While I have been to many conferences on aging and a few on evolution, this was my first Systems Biology conference. I looked forward to learning how biologists think about whole-body issues of homeostasis, organization, and (maybe if I was lucky) blood signals that regulate aging.
What I found instead was that researchers in systems biology are doing what other biologists are doing: they are babes in toyland, exploring the potential of a seductive array of new biomolecular tools. They are compiling catalogs and making maps and correlating every chemical they can find with every other chemical, and collaborating with statisticians to look for patterns in the data. If “systems thinking” is from the top down, what I found at this meeting was just the opposite.
A hundred years ago, Lord Rutherford said that “All science is either physics or stamp collecting.” He was mocking the biologists’ program of collecting specimens, classifying and cataloging them. Twentieth century biology turned this around; it was neither physics nor stamp collecting, but model-building. Systems biologists in particular have analyzed living organisms in terms of signals and networks and energy flows, and have generated a great deal of understanding. At its best, biology has forged a new mode of science.
So why is it that 21st Century systems biology is looking once more like stamp collecting? It’s a question I asked one of the conference organizers (in more polite terms), and he responded that “we are working from a reductionist framework. We are trying to build understanding from the bottom up.”
There’s a deeper answer
Why, after the revolutionary successes of late 20th century biology, should bioscience find itself back at square one, trying to build a foundation? The short answer: genetics is simple; epigenetics is complicated.
The heyday of genetics started from Crick’s decoding of the genetic code in 1961. DNA was revealed to be a blueprint for producing proteins that would do the body’s work. The code was just as simple and elegant as it could be, and the machinery to do the translation was segregated in ribosomes, which could be isolated and picked apart. The era of genetics ended in 2003 with the completion of the human genome project. The results were a surprise to everyone, and the message took a while to filter into conceptual thinking: The genome is 3% genes and 97% gene regulation. All the impressive tasks of development, homeostasis, and metabolism are performed by an exquisitely adapted system that turns genes on and off in the right place at the right time.
2003 ended the era of genetics and began the era of epigenetics. How is gene expression regulated and controlled? DNA methylation was the first mechanism discovered. But as clues appeared and mechanisms were partially elucidated, it has become apparent that epigenetics is as complicated and intractable as it can possibly be. Besides methylation, there are more than 100 modifications of the DNA and its associated proteins (histones) that affect gene expression. There is also the way in which DNA folds around itself, leaving some regions open where they can be transcribed and keeping other regions under tight wraps. Finally, there is a variety of post-translational modifications; even after a stretch of DNA has been transcribed into RNA and translated into a protein, the protein can be turned on or off by adding a phosphate, methyl, or acetyl group at any number of modification sites.
Metabolism is now seen as a dense web of interacting processes, intertwined causes and effects. Gene network maps draw lines between genes that are co-expressed, and can divide the territory into subsets (modules) that are more closely related to each other than to other modules.
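The module-finding idea behind these gene network maps can be sketched with toy data: correlate every gene with every other, draw an edge wherever the correlation is strong, and call the connected components “modules.” Here is a minimal illustration in Python; the expression data, the gene count, and the 0.7 threshold are all hypothetical, not taken from any real study:

```python
import numpy as np

# Hypothetical expression matrix: rows = samples, columns = genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 50, 8
expr = rng.normal(size=(n_samples, n_genes))
# Plant a toy signal: genes 0-2 share one driver, genes 3-4 share another.
expr[:, :3] += 3 * rng.normal(size=n_samples)[:, None]
expr[:, 3:5] += 3 * rng.normal(size=n_samples)[:, None]

# Correlate every gene with every other gene.
corr = np.corrcoef(expr, rowvar=False)

# Draw an edge between genes whose |correlation| exceeds a threshold,
# then take connected components of the graph as "modules".
threshold = 0.7
adjacency = (np.abs(corr) > threshold) & ~np.eye(n_genes, dtype=bool)

def modules(adj):
    seen, groups = set(), []
    for start in range(len(adj)):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:                      # depth-first search
            g = stack.pop()
            if g in comp:
                continue
            comp.add(g)
            stack.extend(int(j) for j in np.flatnonzero(adj[g]))
        seen |= comp
        groups.append(sorted(comp))
    return groups

# With this seed, the planted groups {0,1,2} and {3,4} come out as modules.
print(modules(adjacency))
```

Real co-expression tools use far more sophisticated clustering than a hard threshold, but the underlying picture is the same: the “territory” is divided wherever correlations are dense.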
But this is a picture that only a computer could love; it contains intricacies on a scale that human consciousness cannot grasp with conceptual understanding.
Contrast this to the naive simplicity of Crick’s Central Dogma of Molecular Biology: Information flows in one direction, from DNA to RNA to proteins. Crick lived to see his insight de-dogmatized by exceptions, but it is only since his death (2004) that the essential, bewildering complexity of biochemical networks has been revealed.
No wonder the community of systems biologists feels that they are starting over again, collecting, classifying and cataloging stamps.
Biochemical science these last few years seems to be driven by newly available technologies. These are so powerful and coming so fast that just exploring what they can tell us is occupying the lion’s share of available funding and lab space. I knew about some of these, and several more were new to me last week.
- CRISPR-Cas9. This is a tool adapted from bacterial defenses against viruses. It has made it easy to delete a particular gene within a living cell culture, and perhaps within a living, breathing animal. It has been adapted to insert an exogenous gene at a particular location on a particular chromosome, and even to modify a particular section of DNA to turn a gene on or off. Cas9 has been limited to small “payloads”, adding short sequences of DNA, but just last year, techniques were reported for splitting a larger payload among multiple Cas9 vectors [Ref1, Ref2] in such a way that they piggyback on one another.
- Hi-C is a modern, computer-intensive version of a 20-year-old procedure for mapping a coiled and folded chromosome in 3-space. First, the chromosome is frozen in place by introducing random crosslinks between nearby strands. Then it is fragmented with a DNA-slicing enzyme. Then the pieces are sequenced (this is the new part) with high-throughput sequencing that can tell you which genes are physically close to which other genes in the folded, 3-D configuration. Finally, a computer reconstructs a picture of the 3-D configuration.
- ATAC-Seq. This is a tool for finding the genes that, at any given time, are in open stretches of DNA (euchromatin), available to be transcribed. Chromosomes are peppered with an enzyme that slices up DNA. The fragments are then collected and sequenced, and a computer program matches the fragments to map where they came from. The DNA that was tightly-packed (heterochromatin) contributes few fragments because the enzyme can’t reach it. Thus the genes that are seen are representative of what is available for transcription.
- Methylation mapping. Methylation of C’s at CpG sites (a C followed by a G) within a chromosome is the best-studied mechanism of epigenetic regulation. Just in the last few years, it has become possible to map the methylation state of an entire genome. Treatment with bisulfite converts only the unmethylated C’s in a strand of DNA to uracil, leaving the methylated C’s unchanged. By sequencing the strand before and after this transformation, and using a computer to map the differences, the places where C’s were methylated can be identified.
- ChIP-Seq. If you know of a particular transcription factor—a protein that binds to DNA and turns certain genes on or off—then this technique can tell you where on the DNA the protein attaches itself. The technique combines two older technologies: immunoprecipitation, where an antibody is introduced that picks out one particular protein, and high-throughput sequencing, which identifies and locates the patch of DNA that is stuck to the ChIPped protein.
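The before-and-after comparison in the methylation-mapping item above reduces to a simple rule: a C in the original sequence that still reads as C after bisulfite conversion was methylated; one that now reads as T (via uracil) was not. A toy sketch in Python, with entirely made-up sequences:

```python
# Toy bisulfite methylation call: compare the original sequence with its
# bisulfite-converted read. An unmethylated C is converted to uracil and
# sequences as T; a C that still reads as C was protected by methylation.
reference = "ACGTCGACGT"  # hypothetical original strand (C's at 1, 4, 7)
converted = "ATGTCGATGT"  # hypothetical read after bisulfite conversion

methylated_positions = [
    i for i, (ref, conv) in enumerate(zip(reference, converted))
    if ref == "C" and conv == "C"   # survived conversion -> methylated
]
unmethylated_positions = [
    i for i, (ref, conv) in enumerate(zip(reference, converted))
    if ref == "C" and conv == "T"   # converted to uracil (reads as T)
]

print(methylated_positions)    # [4]
print(unmethylated_positions)  # [1, 7]
```

Real pipelines must align millions of short reads before making this comparison, but each individual methylation call comes down to this one-character test.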
Gene expression maps have been around for a few years. They provide an enormous amount of information about what genes are being expressed where and when, but they are notoriously difficult to make sense of. More recent are correlation maps, in which every gene is correlated with every other gene in a huge matrix that shows how likely they are to be expressed at the same time. I am intrigued by principal component analysis. You can start with a set of genes and measure all the cross correlations, and the math comes up with a combination of the genes most likely to be expressed together (Principal Component #1).
Sometimes a single gene sticks out, and the authors conclude that this particular cancer can be treated by targeting this particular gene.
More often, the results show a combination of hundreds of genes that tend to be expressed together in a particular proportion, say 1.2% gene #1, 0.04% gene #2, 0.15% gene #3…and so on, with coefficients for hundreds of genes. This is the output of a principal component analysis. If such a profile is identified with a healthy state, or a young state, we do not yet have the capability to shape this profile of gene expression in a cell culture, let alone in a living animal. But it is not inconceivable that we will acquire this ability with advancing biotechnology of the coming decade(s).
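For the mathematically curious, what a principal component analysis computes is concrete: PC #1 is the eigenvector of the gene–gene covariance matrix with the largest eigenvalue, i.e., the weighted combination of genes that varies together the most. A minimal sketch on a toy expression matrix; the sample count, gene count, and loadings are all hypothetical:

```python
import numpy as np

# Hypothetical expression matrix: rows = cells/samples, columns = genes.
rng = np.random.default_rng(1)
n_samples, n_genes = 100, 6
latent = rng.normal(size=n_samples)               # one shared biological signal
true_loadings = np.array([2.0, 1.5, 1.0, 0.0, 0.0, 0.0])
expr = latent[:, None] * true_loadings + 0.3 * rng.normal(size=(n_samples, n_genes))

# Principal component #1 = eigenvector of the covariance matrix with the
# largest eigenvalue: the combination of genes most likely to vary together.
centered = expr - expr.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
pc1 = eigvecs[:, -1]                              # last column = largest eigenvalue
if pc1[0] < 0:                                    # fix the arbitrary sign
    pc1 = -pc1

# Coefficients for genes 0-2 dominate; genes 3-5 sit near zero.
print(np.round(pc1, 2))
```

The output is exactly the kind of profile described above: a vector of coefficients, one per gene, with no single gene “in charge.”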
The University of Toronto laboratory of Brenda Andrews is fully automated, with robotic handling of yeast culture plates, robotic data collection, and computerized data analysis. All that’s left for the human to do is to write the historical introduction and submit the manuscript. I am a fan of artificial intelligence and machine learning for some applications. Computers have their own ideas of what constitutes a pattern or a trend. They often come up with unexpected solutions to problems, even simple solutions on rare occasions. But AI never produces elegant theories or new ways to look at the big picture. We give that up when we rely on computers to do science for us.
Growth, Development and Aging
Growth and development are programmed, to be sure, but (so far as we know at present) there appears to be no central coordinator of the process. Rather, the tangled web of chemical signals adapts and responds to changes in the body and in itself. The intelligence is not in a central brain, but is distributed through the system itself. There is no one calling the shots. The metabolism behaves intelligently the way a beehive behaves intelligently, though no single bee has a clue concerning the hive’s plans and strategies.
I have bet my career on the thesis that aging is a metabolic program, a continuation of the process of development into a phase of self-destruction. I used to think that this meant there were genes for aging. I was the most optimistic speaker at the anti-aging meetings. “All we have to do is find the aging genes and turn them off.”
Then I accepted the new picture centered on epigenetics. I thought that chemical signals were arranged in hierarchies, with a few Hox genes and transcription factors controlling a much larger number of workhorse proteins that actually get the job done. The job of the anti-aging scientist would be to re-balance the transcription factors to create a more youthful profile, and the workhorse proteins would dutifully take care of the rest.
But more recently, I learned that there are thousands of transcription factors, comparable to the number of genes they regulate. And the lines between promoters, enhancers, transcription factors, and metabolites have been blurred. A less optimistic scenario is beginning to come into focus for me. I believe that aging is a continuation of the developmental program, but development is inscrutably complex, and it seems to be controlled by a web of interacting molecules that play multiple roles. Each one is both a cause and an effect. Many have roles both as regulatory agents and as workhorses. There’s no one in charge of the factory. The factory is designed so ingeniously that it runs itself.
Study of Development may be a Key to Aging
Much is known about the details of development, but there is no systemic understanding of how the process is put together:
- How much is predetermined in cell lineage?
- How much is self-organization?
- How much is centrally organized, through internal secretions?
- How do these three interact?
Of course, the same questions may be asked about aging. It may be more feasible to approach these questions through development than through aging, (1) because development happens on a faster time scale, (2) because aging contains a stochastic element not present in development, and (3) because the phenotypes of development can be observed clearly and locally. I recognize that this suggestion means going back to basic science to make a long-term investment in understanding of aging, but maybe that’s what we need.