A New Approach to Methylation Clocks

A new approach to methylation clocks from Morgan Levine uses massive computer resources and sophisticated mathematics. I am enthusiastic about it, not just because it produces better results than previous methods, but because I suspect it is better aligned with the way that biological systems actually work.

The Clockmaker’s Dilemma

Your goal is a robust and accurate measure of biological age. You start with a sample of (for example) 5,000 people, and let’s suppose for now you know their “biological age” (we’ll come back to this). For each person, you also have 850,000 methylation levels: one number for each of 850,000 sites in the human genome where methylation is known to vary, called “CpGs”.

Here’s the paradox. You could easily come up with a formula that “predicts” biological age precisely for all 5,000 people, because you have 850,000 parameters to play with. In general, you can always make a formula that works perfectly in N cases if you have N different knobs you can turn to make the formula fit. In this case, you have a lot more knobs than cases. In mathematical terms, you have more unknowns than you have constraints.

In this way, you could construct a clock that is perfectly accurate for all 5,000 in your group. But it’s been jerry-rigged to do that. The clock you develop will be unreliable for predicting the age of someone who is not in this group of 5,000. Peculiarities about this particular set of 5,000 have been incorporated into the model, distorting its priorities. Statisticians call this phenomenon “overfitting”.
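The paradox can be seen in a few lines of simulation. This is a minimal sketch on made-up data, scaled down to 50 people and 500 sites. The assumption that only 5 sites carry an age signal, and all the numbers, are illustrative and not taken from any real clock.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scaled-down stand-in for 5,000 people x 850,000 sites: far more knobs than cases.
n_train, n_test, n_sites = 50, 50, 500

# Assumption for illustration: only the first 5 sites carry any age signal.
true_weights = np.zeros(n_sites)
true_weights[:5] = 2.0

def simulate(n_people):
    """Return made-up (mean-zero) methylation levels and ages for n_people subjects."""
    levels = rng.normal(size=(n_people, n_sites))
    age = 50 + levels @ true_weights + rng.normal(scale=3.0, size=n_people)
    return levels, age

X_train, age_train = simulate(n_train)
X_test, age_test = simulate(n_test)

# Minimum-norm least-squares fit. With more sites than people, the fit can
# reproduce every training age exactly, noise included.
age_mean = age_train.mean()
coef, *_ = np.linalg.lstsq(X_train, age_train - age_mean, rcond=None)

def predict(levels):
    return levels @ coef + age_mean

def rmse(predicted, actual):
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

train_rmse = rmse(predict(X_train), age_train)
test_rmse = rmse(predict(X_test), age_test)
print(f"error on the calibration group: {train_rmse:.2e} years")
print(f"error on new people:            {test_rmse:.2f} years")
```

The error on the calibration group is essentially zero, while the error on new people is several years. That gap is the signature of overfitting.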

The opposite approach would be to look through the 850,000 methylation sites (CpGs) and find the one that best correlates with “biological age” for your sample population. The result will be an aging clock with much less accuracy, but chances are strong that it will work just as well with a new set of people as it did with your original 5,000.

Between these two extremes, you, the Clockmaker, look for a formula relating multiple CpGs that fits your 5,000 sample subjects well, while avoiding overfitting. But how can you know if you’re overfitting? There is no standard answer, and various methods are used, with names like LASSO, elastic net, and “leave-one-out” cross-validation. Some general principles are:

  • The total number of CpGs referenced should be much less than the total number of calibration subjects. The best clocks reference a few hundred CpGs, and are calibrated with many thousands of subjects.
  • Don’t get too fancy. If a single CpG or combination of 3 or 4 is highly correlated with age, that is probably real, but more complex combinations are suspect.
  • Spread the algorithm out, so that no single CpG (or a few CpGs) can have a big effect on the outcome.
  • Large positive and large negative components that cancel each other out just right to produce the age prediction tend to be fragile, and can produce large errors if a single CpG is mismeasured.
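To make one of the named methods concrete, here is a minimal sketch of LASSO, written from scratch with coordinate descent. The data are synthetic, the function names are hypothetical, and the penalty strength `alpha=0.5` is an arbitrary choice for illustration. Real clocks are built with established statistical packages and tuned by cross-validation.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, and set it to zero if it is smaller."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_sweeps=200):
    """Minimize (1/2n) * ||y - X w||^2 + alpha * sum(|w|), one weight at a time."""
    n_people, n_sites = X.shape
    weights = np.zeros(n_sites)
    residual = y.astype(float).copy()          # equals y - X @ weights while weights == 0
    col_scale = (X ** 2).sum(axis=0) / n_people
    for _ in range(n_sweeps):
        for j in range(n_sites):
            if col_scale[j] == 0.0:
                continue
            # Correlation of site j with what the other sites leave unexplained.
            rho = X[:, j] @ residual / n_people + col_scale[j] * weights[j]
            new_weight = soft_threshold(rho, alpha) / col_scale[j]
            residual += X[:, j] * (weights[j] - new_weight)
            weights[j] = new_weight
    return weights

rng = np.random.default_rng(0)
n_people, n_sites = 200, 50                    # many more people than sites

# Assumption for illustration: only 3 of the 50 sites truly relate to the target.
true_weights = np.zeros(n_sites)
true_weights[:3] = [3.0, -2.0, 1.5]
X = rng.normal(size=(n_people, n_sites))
y = X @ true_weights + rng.normal(scale=1.0, size=n_people)

weights = lasso_coordinate_descent(X, y - y.mean(), alpha=0.5)
selected = np.flatnonzero(weights)
print("sites kept by the penalty:", selected)
print("their weights:", np.round(weights[selected], 2))
```

The penalty sets the weights of uninformative sites to exactly zero and shrinks the rest, which is one way of enforcing “don’t get too fancy”. Elastic net adds a second penalty on squared weights that tends to spread weight across correlated sites rather than picking just one.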

The methods listed above are common to all the best methylation clocks (and also to their cousins, based on the proteome or the microbiome or the immune system). The differences among methylation clocks are based on what is defined ahead of time as the target “biological age”. Steve Horvath’s first clock was calibrated with chronological age — which is already a pretty good surrogate for biological age. The Levine/Horvath PhenoAge clock was calibrated using a combination of metabolic factors that correlate with health, including inflammation, DNA transcription, DNA repair, and mitochondrial activity. The Lu/Horvath GrimAge clock was calibrated with actual mortality statistics, derived from banked blood samples from decades in the past, so the future lifespan of the donors was now known. Other mortality-related data were also involved, and the GrimAge clock is presently most accurate for predicting all-cause mortality.

Creation of methylation clock algorithms illustrates this open secret: Statistics is as much an art as a science. Experience and sound judgment are more important than mathematical sophistication.

Interlude: How biological systems are different from machines

How many of us have had the experience of sitting in a plane, leaving the gate, and after a long time out on the tarmac, a voice comes over the PA system saying, there’s a bad valve in the hydraulic system for the left aileron and we’re waiting for them to locate the part in the warehouse?

Yes? How many have been on that plane when the pilot comes on again a minute later — never mind, we have a spare capacitor for the on-board radar and it’s the same size and shape as the valve, so we’ll use that instead. “Flight attendants, prepare for takeoff!”

No? You haven’t had that happen? It’s because airplanes, like computers and washing machines and radio telescopes, are engineered from parts that are individually optimized for one function only, and then the parts are assembled and linked together in one very particular way that makes the machine work.

But evolution is not an engineer, and living things are not constructed out of parts that are separately optimized for exactly one function. Your bones support the body’s frame, but they also store calcium and manufacture blood cells. Your lymph nodes collect and channel cellular waste products, but they also generate an army of lymphocytes to fight infection, and they are responsible for fluid homeostasis. Your liver stores glycogen and also generates hundreds of different molecules important for digestion, regulation, and metabolism, even clotting factors for the blood.

Early geneticists were “flying blind”, with no knowledge of the molecular mechanisms of inheritance; still they figured out very early that the body doesn’t work the way a human designer would have designed it. The word “epistasis” was coined in 1909 by William Bateson. It meant gene interactions: several genes combine to create one phenotype. The very next year, Ludwig Plate coined the word “pleiotropy”. It is the converse: a single gene has multiple effects. At the time, these words were coined because they were thought to be exceptions to the rule of one-gene-one-trait.

Now we know that one-gene-one-trait is the exception. The body is not engineered the way a machine is engineered. Every molecule has multiple functions. Every function is regulated by multiple pathways. Before we curse the body for being organized this way, consider the benefit: We’re not waiting on the tarmac every time there’s a single part that doesn’t work. The body is wonderfully, amazingly, robustly homeostatic. Far more so than any human-engineered machine that is designed for maximum “fault tolerance”.

Levine’s Innovation for more robust aging clocks

For aging clock technology, the message from the above story is that using individual CpGs for a starting point may not be optimal. We suspect that CpGs, like other biological entities, work together closely in teams. Anything that we might identify as a function (e.g. growth, inflammation, aging itself) might be regulated not by a single CpG, but by a team. Just as the members of a sports team might vary from day to day, the particular CpGs on a team might vary slightly from one individual to the next. But the team has a function and an identity and a signature that is robust.

This spring, the following paper appeared on bioRxiv: A computational solution for bolstering the reliability of epigenetic clocks, by scientists at Yale and Elysium Health, headed by Morgan Levine. Levine’s innovation was to take the same statistical methodology I described above (LASSO, etc.) and apply it not to individual CpGs but to teams of CpGs. How do you identify the team members? This leads to Principal Component Analysis (PCA), which is the mathematical part of the story. (more mathematical treatment here)

Simple example of PCA

Imagine charting for 1,000 children their age, height, and weight. Imagine a 3D graph where the x, y, and z axes are age, height and weight. Each child is a point in this space. The points fill a blob in 3-dimensional space, and perhaps the blob is cigar-shaped, because age, height, and weight all tend to increase together. The cigar doesn’t point toward any of the three axes (x, y, z), but it points obliquely out into 3-space.

The direction that the cigar points is called the first principal component. Chances are that the cross-section of the cigar is not round but oval shaped, because taller children tend to be heavier at the same age. The direction in which the cigar is widest is called the second principal component, and the direction in which it is flattened is the third principal component. This example has only 3 principal components, because it exists in 3-space.
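The cigar can be simulated directly. This is a minimal sketch with invented growth numbers, chosen only to give a cigar-shaped cloud. Because years, centimeters, and kilograms are different units, each variable is standardized first, which is a common convention but still a choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n_children = 1000

# Made-up growth relationships (not real pediatric data).
age = rng.uniform(2, 16, size=n_children)                      # years
height_residual = rng.normal(scale=5, size=n_children)         # tall or short for one's age
height = 85 + 6 * age + height_residual                        # cm
weight = 10 + 3 * age + 0.3 * height_residual + rng.normal(scale=3, size=n_children)  # kg

data = np.column_stack([age, height, weight])
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

# Eigenvectors of the covariance matrix are the principal directions of the cloud.
cov = np.cov(standardized, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)    # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
components = eigenvectors[:, order]                # column k is principal component k+1

explained = eigenvalues / eigenvalues.sum()
pc1 = components[:, 0]
print("share of variance along each component:", np.round(explained, 3))
print("first component (age, height, weight):", np.round(pc1, 2))
```

The first component has roughly equal loadings of the same sign on all three variables, which is the oblique direction of the cigar. The overall sign of an eigenvector is arbitrary, so the loadings may all come out negative.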

PCA for methylation CpGs

Levine uses 78,464 CpGs that vary with age, so instead of 3-space, each person’s methylation profile represents a point in 78,464-dimensional space. Some combination of these tends to vary together most consistently, and that combination is the first principal component. There are 78,464 principal components, but perhaps only a few hundred that are interesting.

The mathematical procedure for finding principal components proceeds in two steps. Step 1 is to compute the correlation coefficient between each of the 78,464 CpGs and every other one, and lay all these numbers out in a giant square matrix. Step 2 is to diagonalize the matrix. The directions of the principal components are called eigenvectors of the matrix.
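The two steps can be sketched on a small synthetic example: 40 sites instead of 78,464, with two hypothetical “teams” of sites that each follow a hidden factor. The team sizes and noise levels are assumptions for illustration, and this is not the code from the Levine paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n_people, n_sites = 300, 40
team_a = np.arange(0, 15)     # hypothetical team driven by one hidden factor
team_b = np.arange(15, 25)    # a second, smaller team driven by another factor

factor_a = rng.normal(size=n_people)
factor_b = rng.normal(size=n_people)
levels = rng.normal(scale=0.5, size=(n_people, n_sites))   # independent site-level noise
levels[:, team_a] += factor_a[:, None]
levels[:, team_b] += factor_b[:, None]

# Step 1: correlation of every site with every other site (n_sites x n_sites).
corr = np.corrcoef(levels, rowvar=False)

# Step 2: diagonalize. eigh is meant for symmetric matrices; it returns
# eigenvalues in ascending order, so reverse to put the largest first.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

pc1 = eigenvectors[:, 0]
top_sites = np.sort(np.argsort(np.abs(pc1))[-len(team_a):])
print("largest eigenvalues:", np.round(eigenvalues[:3], 1))
print("sites with the largest loadings on the first component:", top_sites)
```

The first eigenvector picks out the larger team and the second picks out the smaller one. In practice, PCA software often skips forming the square matrix explicitly and works from a singular value decomposition of the data matrix, which gives the same directions.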

For you computer geeks, the number of arithmetic operations required to diagonalize a matrix goes up with the cube of the dimension of the matrix. So the number of operations for a 78,464 × 78,464 matrix is in the range of 500 trillion. On a desktop computer capable of 1 TFLOPS (1 trillion floating point operations per second), this suggests the diagonalization might require just a few minutes.
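The back-of-envelope estimate can be checked in a few lines. This sketch counts exactly n cubed operations and ignores the constant factor that real eigenvalue routines carry, so it is an order-of-magnitude figure only. The memory line assumes the full matrix is stored as 8-byte floating point numbers.

```python
n_sites = 78_464
flops_per_second = 1e12        # the 1 TFLOPS desktop assumed in the text

operations = n_sites ** 3      # cubic scaling, constant factor ignored
seconds = operations / flops_per_second
matrix_gigabytes = n_sites ** 2 * 8 / 1e9   # 8-byte floats, full square matrix

print(f"{operations:.2e} operations")
print(f"{seconds / 60:.1f} minutes at 1 TFLOPS")
print(f"{matrix_gigabytes:.0f} GB to hold the matrix in memory")
```

The count comes to a little under 500 trillion operations, or about 8 minutes at the assumed speed. The matrix itself would occupy tens of gigabytes, so on an ordinary desktop memory may be a tighter constraint than arithmetic.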

Why I’m enthusiastic about PCA methylation clocks

First, as I wrote above, this approach seems to be well-aligned with biological complexity. It is a departure from the tendency of most scientists to be more comfortable with reductionist paradigms. Biology works with teams of molecules. The CpGs that form a principal component tend to vary together, turning on and off in a coordinated way. It is reasonable to think of a principal component as a “team”. We expect the team to function more consistently than any of its individual members.

Second, there are quirks and errors in lab technique and in quality control for individual bead chips (Illumina Corp) that process the DNA samples and measure methylation. These can have large effects on any single CpG site, but they are unlikely to affect an entire principal component in a consistent way. So we expect the PCA methodology to be more robust against variations in lab technique and variability from one bead chip to the next.
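A small simulation shows why a team score should be less sensitive to measurement noise than any single site. This is a sketch under strong simplifying assumptions: one hypothetical team of 100 sites that all follow the same hidden signal, and independent technical noise of the same size at every site. It is not a reproduction of the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(3)
n_people, n_sites = 500, 100

# One hypothetical team: every site follows the same hidden biological signal.
signal = rng.normal(size=n_people)
true_levels = np.tile(signal[:, None], (1, n_sites))

# Two technical replicates of the same samples, each with its own measurement noise.
replicate_1 = true_levels + rng.normal(scale=1.0, size=true_levels.shape)
replicate_2 = true_levels + rng.normal(scale=1.0, size=true_levels.shape)

# First principal component of replicate 1, via SVD of the centered data.
site_means = replicate_1.mean(axis=0)
_, _, vt = np.linalg.svd(replicate_1 - site_means, full_matrices=False)
pc1 = vt[0]

# Score both replicates on that same component.
scores_1 = (replicate_1 - site_means) @ pc1
scores_2 = (replicate_2 - site_means) @ pc1

single_site_agreement = np.corrcoef(replicate_1[:, 0], replicate_2[:, 0])[0, 1]
pc_agreement = np.corrcoef(scores_1, scores_2)[0, 1]
print(f"replicate agreement for one site:     {single_site_agreement:.2f}")
print(f"replicate agreement for the PC score: {pc_agreement:.2f}")
```

Under these assumptions a single site agrees only moderately between replicates, while the component score agrees almost perfectly, because independent errors at 100 sites largely average out.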

Third, in practice the Levine team reports that their computational method already produces the most precise age measurements yet. PCA computation slashes the uncertainty introduced by technical and lab issues by a factor of 6.

…And why I’m cautious

Steve Horvath is the father of methylation clocks and also the person who has published more research in this area than anyone else. Several years ago, I asked Steve about PCA analysis, and he said that he had tried using it a decade ago, and abandoned the PCA methodology on the way to his groundbreaking 2013 clock.

When we work with individual CpGs, we usually have some sense of what genes are associated with each CpG, and we have a sense of what those genes do. When we work with PCs, we are flying blind: they are just mathematical constructs, so there is no known associated physiological function. “Correlation is not the same as causation”, so it is even possible that these thousands of CpGs are correlated, but that they don’t really work together as a team at all. The major danger of using PCs is excessive abstraction. We are manipulating mathematical objects formally and trusting that the results will be meaningful. This increases the risk of the “overfitting” problem I described above. Here is a rather technical cautionary editorial.

Of course, there is no guarantee that the first principal component or the second will be correlated with age. Looking for PCs that correlate well with age is just like looking for individual CpGs that correlate well with age. And certain PCs will be found to work well together to predict age, just as in the classical method certain CpGs work well together.
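Here is a minimal sketch of that search on synthetic data. The setup is an assumption made for illustration: the largest team of sites tracks something unrelated to age, and a smaller team tracks age, so the component with the most variance is not the one that matters for a clock.

```python
import numpy as np

rng = np.random.default_rng(4)
n_people, n_sites = 500, 40
other_team = np.arange(0, 20)    # hypothetical team tracking something unrelated to age
age_team = np.arange(20, 30)     # hypothetical smaller team that tracks age

age = rng.uniform(20, 80, size=n_people)
age_signal = (age - age.mean()) / age.std()
other_signal = rng.normal(size=n_people)

levels = rng.normal(scale=0.5, size=(n_people, n_sites))   # independent site-level noise
levels[:, other_team] += other_signal[:, None]
levels[:, age_team] += age_signal[:, None]

# Principal components via SVD of the centered data; rows of vt are the components,
# ordered from most to least variance.
centered = levels - levels.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt.T          # column k holds each person's score on component k+1

# Correlate each of the first few components with age and pick the strongest.
age_corr = np.array([np.corrcoef(scores[:, k], age)[0, 1] for k in range(5)])
best_pc = int(np.argmax(np.abs(age_corr)))
print("correlation with age, components 1 to 5:", np.round(age_corr, 2))
print("component most correlated with age:", best_pc + 1)
```

In this setup the second component, not the first, carries the age signal. A clock built on components would then feed the scores, rather than raw site values, into a penalized regression of the kind used for conventional clocks.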

PCA methylation clocks are a new technology without a track record, and for now the established and validated clocks should serve us well.

The future

The Levine paper already contains many computational tests and interesting results, but it is new and not yet peer reviewed. Still, I’m hopeful that this represents a new direction for methylation and other aging clocks. It has the feel of a right approach.

Levine is committed to open science even though she is affiliated with the for-profit Elysium Health, which has its own proprietary methylation clock, and even though universities are jealously guarding IP rights in this era. The good news is that the peer-reviewed version of her paper will be published shortly, full details of the algorithms will be available on GitHub, and scripts in the R programming language will be released for the use of other researchers. I hope there are others who pick up on this technology so it moves rapidly forward.

If PCA clocks correlate well with previously validated clocks but offer tighter uncertainties, we’ll know we’re on the right track.