The Beekeeper’s Companion Since 1861

The Curious Beekeeper

CRISPR for the Curious: A Primer

- July 1, 2020 - Rusty Burlew - (excerpt)

CRISPR headlines make the news every day, served with a side order of alphabet soup that can make your head spin. To say the acronyms and initialisms are off-putting is an understatement. However, CRISPR technology is here to stay and it will change our lives forever, so a basic understanding is worth the effort.

At its core, CRISPR is a gene-editing tool. A gene is a string of genetic material, usually DNA, which resides at a specific spot on a chromosome and has a particular function. The instructions carried by the gene direct the formation of proteins or RNA molecules. Taken together, all the genes of an organism make up its genome. The genome of the honey bee, Apis mellifera, comprises roughly 10,000 genes, while the human genome sports approximately 20,000 to 25,000 genes.

In effect, editing a gene is not much different from editing an ABJ article. The editor reads through the strings of letters and changes the ones he doesn’t like. For example, I always write “further” and my editor (name withheld for privacy) always changes it to “farther,” a correction that improves the quality of my writing. Farthermore, it makes me sound smarter than I am.

Similarly, gene editing is a process in which the DNA or RNA can be modified to change its characteristics. Sequences can be altered, added, or deleted, depending on what the researcher is trying to do. Gene editing has been around since the 1970s, but the early techniques were time-consuming, costly, and often produced random and unpredictable results. But CRISPR technology, first developed about 15 years ago, revolutionized our ability to reliably edit the genes of nearly any organism.


Gene edits in the wild

CRISPR, pronounced like the plastic box in which you keep lettuce, is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. Crazy, right? Who would think of that? But oddly, the name helps explain how CRISPR was discovered back in 1987, and why it exists. Bear with me here because the story is cool in a nerdy sort of way.

CRISPR evolved naturally in some species of bacteria and archaea so they could protect themselves from invasion by viruses (called bacteriophages). Yes, you read that right. Bacteria and archaea get viruses just like we do, so they needed a way of coping. Since they don’t have complex immune systems, the bacteria and archaea devised a method of dealing with the intruders by editing their own genomes.


A short alphabet

As you know, the genes of living organisms are built with just four types of bases: adenine (A), cytosine (C), guanine (G), and thymine (T). The order in which the bases are strung together dictates the recipe for building a protein. But with only four “letters” in the alphabet, the words can be long and ungainly.

Early geneticists, especially those working with bacteria, sometimes noticed strange sequences of bases that formed palindromes. In case you forgot, a palindrome is a sequence of letters or numbers that reads the same forward and backward. The ultimate palindrome appeared on your calendar earlier this year on 02/02/2020. If you ignore the punctuation, that date reads the same whether you start from the front or the back. Single words can be palindromes, too. Radar, kayak, racecar, and civic are perfect palindromes.
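For readers who like to tinker, the palindrome test fits in a few lines of Python. This is only a sketch: the function name is_palindrome is made up for illustration, and it ignores punctuation and capitalization, just as we did with the date.

```python
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same forward and backward.

    Punctuation, spaces, and capitalization are ignored (an assumption
    made here so that dates like 02/02/2020 qualify).
    """
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


for candidate in ["02/02/2020", "Radar", "kayak", "racecar", "civic", "CRISPR"]:
    print(candidate, is_palindrome(candidate))
```

Every entry in the list passes except the last one; CRISPR itself, sadly, is not a palindrome.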

The discovery of palindromes in genetic code was odd, but even more perplexing was the code the scientists found sandwiched between the palindromes. For example, let’s say you found a sequence of the four bases that read a-c-c-t-a-g-t-a-g-c and then, a little farther along the chain, you found its mirror image: c-g-a-t-g-a-t-c-c-a. All well and weird, but between the two was more code, let’s say g-c-a-t-g-g-c-t.

All strung together, you would see a-c-c-t-a-g-t-a-g-c-g-c-a-t-g-g-c-t-c-g-a-t-g-a-t-c-c-a. Not knowing the meaning of any of it, some early researchers named the end pieces (wait for it) “clustered regularly interspaced short palindromic repeats (CRISPR),” while others called them bookends. The bookend analogy works well since they are mirror images bracketing some stuff in the middle.
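If you want to convince yourself that the two end pieces really are mirror images, here is a small Python sketch using the example bases from the text. It follows the simplified bookend picture used in this article, and the variable and function names are invented for illustration. The pieces are joined in the order they were found: first bookend, middle piece, second bookend.

```python
# Example sequences copied from the text; names are illustrative only.
left_bookend = "acctagtagc"
spacer = "gcatggct"
right_bookend = left_bookend[::-1]  # reverse the first bookend to get its mirror image


def is_mirror_pair(a: str, b: str) -> bool:
    """Return True if b is a spelled backward."""
    return a == b[::-1]


# Join the pieces in the order found: bookend, middle piece, bookend.
assembled = left_bookend + spacer + right_bookend

print("right bookend:", "-".join(right_bookend))
print("mirror images?", is_mirror_pair(left_bookend, right_bookend))
print("assembled:", "-".join(assembled))
```

Reversing the first bookend gives c-g-a-t-g-a-t-c-c-a, which is exactly the second one, so the pair brackets the middle piece like a set of matching bookends.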

The pieces in the middle didn’t seem to belong, so the researchers called them spacers. The term “spacer” makes them seem insignificant, but it turns out that the most important part of the entire sequence (the “meat,” if you will) is actually the spacer. We’re getting to that.

Some enterprising researchers ran the middle pieces — the spacers — against enormous databases of genetic code. It took a while, but they found strings of similar code in the genomes of some viruses. But what was virus code doing inside bacterial DNA? And why was it offset by brackets of genetic palindromes? Even Hercule Poirot would be mystified.
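The database search can be imagined as a simple lookup. The Python sketch below is a toy version under loud assumptions: the three “genomes” are invented strings, not real viral sequences, and it looks only for exact matches, whereas the real searches turned up similar, not necessarily identical, code. The function name find_spacer is hypothetical.

```python
def find_spacer(spacer: str, genomes: dict[str, str]) -> list[tuple[str, int]]:
    """Return (name, position) for every genome containing the spacer exactly."""
    hits = []
    for name, sequence in genomes.items():
        position = sequence.find(spacer)  # -1 means the spacer is not present
        if position != -1:
            hits.append((name, position))
    return hits


# Invented toy database; these are NOT real viral genomes.
toy_database = {
    "phage_A": "ttaggcatggctaacg",
    "phage_B": "ccgtaatcgggatcca",
    "phage_C": "aaagcatggctttt",
}

for name, position in find_spacer("gcatggct", toy_database):
    print(f"spacer found in {name} at position {position}")
```

In this made-up database, the spacer turns up in two of the three entries, which is the kind of hit that told researchers the middle pieces had a viral origin.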


Spoils of war

A battle between a bacterium and a virus is like a duel. First, a hapless bacterium gets infected with a virus. Not good, as we know. But by engaging its primitive immune system, the bacterium puts up a fight. During the fray, the bacterium excises a piece of the virus’s genetic code — Slash, slash. Take that, you vicious viral varmint! — which the bacterium copies and stores in its own DNA for future reference.

The stolen code is like a password. If the bacterium survives the viral attack, this stored password will allow the bacterium to recognize the same virus in the future and “remember” how to fight it. The one problem is, how will the bacterium know which piece of code is relevant? Since code comes in strings, how can the bacterium know where the password starts and stops? Well, duh! It stores the code between bookends — two palindromic sequences that can be readily recognized.


Coded for life

<p>When you write a paragraph in HTML, the language of most websites, you start off with a code that means “the paragraph starts here.” Then, at the end of your paragraph, you use a similar code that includes a leading slash, meaning “the paragraph ends here.”</p>

CRISPR does the same thing. The palindromes mark the beginning and the end of the stolen password that resides in the middle, and the password identifies the particular virus that the bacteria may have to fight in the future. And since the bacteria — single-celled organisms — store the password within their genetic code, the information gets passed into future generations of baby bacteria.
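To see how bookends make a password easy to find, here is a rough Python sketch. It is an analogy in code, not a model of the actual biochemistry: the function names, the bookend sequences, and the sample strings are all assumptions chosen for illustration.

```python
def extract_spacers(dna: str, left: str, right: str) -> list[str]:
    """Collect every stretch of code stored between a left and a right bookend."""
    spacers = []
    start = dna.find(left)
    while start != -1:
        begin = start + len(left)        # the password starts after the left bookend
        end = dna.find(right, begin)     # and stops where the right bookend begins
        if end == -1:
            break                        # unmatched bookend: nothing more to read
        spacers.append(dna[begin:end])
        start = dna.find(left, end + len(right))
    return spacers


def recognizes(spacers: list[str], invader: str) -> bool:
    """Return True if any stored password appears in the invader's code."""
    return any(s in invader for s in spacers if s)


# Illustrative bookends and a made-up stretch of bacterial DNA.
LEFT, RIGHT = "acctagtagc", "cgatgatcca"
bacterial_dna = "tttt" + LEFT + "gcatggct" + RIGHT + "aaaa"

passwords = extract_spacers(bacterial_dna, LEFT, RIGHT)
print("stored passwords:", passwords)
print("known virus?", recognizes(passwords, "ttaggcatggctaacg"))
print("stranger?", recognizes(passwords, "ccgtaatcgggatcca"))
```

Without the bookends, the function would have no way of knowing where the password starts and stops, which is the same problem the bacterium solves by bracketing the stolen code.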

Over time, strains of bacteria acquire defenses against many types of virus. Within the DNA of these bacteria, long strings of alternating bookends and passcodes are called ….