A fold makes something more compact, as anyone who knows how to pack will attest. Folds are also important in genetics. For example, genomic DNA, which is roughly 2 meters long, must be folded to fit inside a cell nucleus about 10 μm (micrometers) in diameter, a space some 200,000 times smaller than the DNA's length.
Look casually at a cell’s genome, its collection of chromosomes, through a standard microscope, and it seems to be a chaotic jumble of noodles. But much more powerful imaging techniques reveal that the genome folds into about 10,000 loops that do not become entangled with one another.
The genome of an organism is the whole of its hereditary information encoded in its DNA. The human genome is stored on 23 chromosome pairs in the cell nucleus and in the small mitochondrial DNA – that’s 46 chromosomes that contain two sets of roughly 20,000 genes. Each gene spells out a coded message telling the cell how to make a particular protein.
Looping helps to determine which genes get expressed or activated in different cells. That activation influences the functions the cells perform. Or so we believe.
Erez Lieberman Aiden told the JHV that understanding the role of looping might have huge implications in our understanding of health and disease. Here in Houston, Aiden and the researchers at the Center for Genome Architecture are making breakthroughs in understanding the mechanisms and rules for loop formation.
Aiden is assistant professor in the Department of Genetics at the Baylor College of Medicine, where he directs the Center for Genome Architecture, and in the Department of Computer Science and the Center for Theoretical Biological Physics at Rice University.
He’s at the cutting edge of mapping the loops in tissues across the human body. These loops obey a simple code, hidden in the sequence of the genome itself. They also appear to be ancient structures, seen in other mammals such as mice.
Strikingly, half the loops in mouse cells are found at the corresponding position in the human genome, said Aiden.
“Mice diverged from humans roughly 60 million years ago. Many loops have been preserved from that time. If loops weren’t doing anything, then mutations would probably have gradually effaced those loops. Since loops haven’t disappeared, that indicates mutation somehow is unable to do this because these things are actually doing something useful – so, if the bases get mutated, the organism won’t survive.”
At the Center for Genome Architecture, Aiden and his laboratory team are working to create the first atlas of looping in the human genome.
The genome in every cell is the same. However, different cells do different things.
“Although the genes are the same, somehow the genes are regulated differently. Their patterns of activity differ. A heart muscle cell works differently from a brain cell. Given the fact that the gene sequences are the same, how does that work?
“We know that small segments of DNA, known as enhancers, are needed to activate genes. Enhancer elements can turn a gene on or off. In our maps, we’ve discovered the enhancer elements are frequently far away from the genes they regulate. How can a far away enhancer element turn that gene on or off? Through our 3D mapping, a thought emerged – what actually happens is since the genome is folded up, things that seemed to be far apart in 1D are actually close in 3D because of the looping. If we understood the loops, we’d better understand how genetic regulation works.
“For many years, the major obstacle was nobody knew how to map the loops. About five years ago, our group in Houston published the first reliable maps of looping. That led the NIH [National Institutes of Health] to support us in creating an atlas, showing the position of loops across the genome in a wide variety of cell types.”
The hope was that if researchers understood the mechanism and process of looping, they could unlock the secret of gene regulation. But scientific discovery often doesn’t move in a straight line.
In an article published in the March 2019 issue of Scientific American, Aiden wrote that until recently, researchers thought loops regulated genes. But, now that we have seen loops in action, he said, the thinking is gene regulation may be only one aspect of what loops do.
“That’s one of the big questions on the agenda,” said Aiden. “There’s some evidence that supports a role for loops in regulation. But, the evidence is not as dramatic as we expected. One of the major things we hope to get from this 3D mapping is how loops contribute to regulation, and whether there are other functions they perform in addition.”
Loops now appear to be one of the pattern controllers – “a conductor of the genetic orchestra” – influencing when particular genes become active enough to affect cell function.
“As we continue to explore the loops, we expect to better understand gene regulation, and to find clues about how many diseases arise,” said Aiden. “Like any explorers in uncharted territory, we need better maps.”
Better maps are on the way. Recently, Olga Dudchenko, a postdoctoral associate in the Aiden lab, developed a new method to sequence and assemble the genome of any organism or person, to a level of quality similar to that of the Human Genome Project, for less than $1,000. The original Human Genome Project, which determined the sequence of DNA letters in a typical person, took more than a decade and cost roughly $3 billion.
Aiden and Dudchenko have created the DNA Zoo, a consortium of academic labs, zoos and aquariums around the world that is working to assemble the genomes of hundreds of species, chronicling the evolution of loops across the tree of life.
In addition to his work on genomes, Aiden has co-authored a book about big data as a new way of understanding the world. The book describes a database, undertaken with Google, consisting of the 500 billion words contained in books published between the years 1500 and 2008 in English, French, Spanish, German, Chinese and Russian. By entering a string of up to five words into the database, one can see a graph charting the phrase’s use over time.
The project demonstrates how vast digital databases can transform our understanding of language, culture and the flow of ideas as recorded in books, said Aiden.
“Two hundred years ago, many of the books were written by the clergy about religious subjects. The range of our intellectual life reflected in books has expanded considerably.
“We can identify cultural trends on the upswing, but we can’t predict what the human story will be with perfect certainty. We know two things: Aspects of our culture are predictable. And, our ability to collect and analyze data about our society will increase.”
Of all the unanswered questions he’s encountered, what question would Aiden most like to know the answer to?
Aiden was silent for almost 3 minutes. Then, he answered:
“I’d like to know what my kids will become when they grow up. I believe I’ll get an answer to that question someday, and I’m incredibly curious about it.
“I have four children: Gabriel Galileo, (age) 9; Maayan Amara, 7; Judah Abraham, 5; and Elijah Amichai, 3. I really don’t have specific goals or plans for them. And, if I had specific plans, I’m sure my kids would do something else.”
How loop formation works
Loop formation is believed to begin when an “extrusion complex” lands on DNA. Two ring-shaped subunits, called cohesin rings, then slide in opposite directions, causing the loop to grow until the process is brought to a close by a protein called CTCF.
All cells have the same genes and the same CTCF-binding motifs. A cell that does not need particular genes can suppress those genes by deactivating CTCF motifs.
Visualize the cohesin rings working like tri-glides or a buckle on a belt. At first, they attach anywhere on the genome, with the DNA going in one ring and out the other. Then, the two rings slide in opposite directions (one to the left along the linear molecule, and one to the right), extruding a growing loop as they go.
Eventually, each ring approaches a site where a CTCF protein is bound. If the underlying CTCF-binding motif is pointing toward the approaching ring, the sliding ring stops on contact. If the motif is facing the other way, the cohesin ring ignores it and keeps going. In this way, a CTCF-binding motif is like a stop sign for cohesin traffic, thereby guiding the formation of loops.
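For readers who think in code, the stop-sign rule described above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the Aiden lab's own model. The chromosome is reduced to a row of numbered positions, each CTCF-binding motif is marked ">" (pointing right) or "<" (pointing left), and the names slide and extrude_loop are invented for this example. The sketch follows a single extrusion complex and ignores everything else that happens in a real nucleus.

```python
def slide(position, step, blocking_orientation, ctcf_sites, genome_length):
    """Slide one cohesin ring along the toy chromosome until it is stopped.

    step is -1 for the ring moving left and +1 for the ring moving right.
    The ring halts only at a motif that points toward it; a motif facing
    the other way is ignored (assumed encoding: ">" or "<").
    """
    while ctcf_sites.get(position) != blocking_orientation:
        next_position = position + step
        if not 0 <= next_position < genome_length:
            break  # ran off the end of the toy chromosome
        position = next_position
    return position


def extrude_loop(landing_site, ctcf_sites, genome_length):
    """Return the (left, right) anchors of the loop grown from one landing site."""
    # The left-moving ring arrives from the right, so a ">" motif points at it.
    left_anchor = slide(landing_site, -1, ">", ctcf_sites, genome_length)
    # The right-moving ring arrives from the left, so a "<" motif points at it.
    right_anchor = slide(landing_site, +1, "<", ctcf_sites, genome_length)
    return left_anchor, right_anchor


# Hypothetical motif layout on a 100-position chromosome.
ctcf_sites = {20: ">", 35: "<", 80: "<", 90: ">"}

print(extrude_loop(50, ctcf_sites, 100))  # (20, 80)
```

In this made-up layout, the ring sliding left passes the motif at position 35 because it faces away, then stops at position 20. The ring sliding right stops at position 80. The loop is therefore anchored by two motifs that point at each other.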