Short and Simple Introduction to 3mers and DFTs in Engineering and DNA

It is amazing that a single cable TV wire can simultaneously carry hundreds of TV channels. This is possible because the transmitter can merge many signals into one, and the receiver at the other end can separate them again. Doing so involves sophisticated coding and decoding machinery.

Some aspects of the coding and decoding machinery behind the modern internet and cable TV are related to the detection of 3mers in biology. I explore them briefly here.

Suppose I pressed middle C on an electronic piano keyboard that was set to produce pure sine waves. If I had an oscilloscope picture of the sound, it would look like:

[Image: middle C]

If I pressed the C that is 3 octaves above middle C, it would have a frequency that is 2^3 (or 8) times higher. Thus on the oscilloscope, the sound would look like:

[Image: C, 3 octaves above middle C]

If I pressed both keys at the same time, the resulting sound would simply be the sum of the two notes, and the combined waveform would look like:

[Image: merged sounds]
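The addition described above can be sketched in a few lines of Python. This is only an illustration: the sample rate and the 261.63 Hz value for middle C (standard tuning) are my assumptions, not figures from the keyboard example itself.

```python
import math

SAMPLE_RATE = 44_100             # samples per second (assumed)
MIDDLE_C_HZ = 261.63             # approximate frequency of middle C (assumed tuning)
HIGH_C_HZ = MIDDLE_C_HZ * 2**3   # three octaves up: 2^3 = 8 times the frequency

def sine_wave(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Return n_samples of a unit-amplitude sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]

n = 1000
low = sine_wave(MIDDLE_C_HZ, n)    # middle C
high = sine_wave(HIGH_C_HZ, n)     # C three octaves above

# Pressing both keys: the merged signal is the sample-by-sample sum.
merged = [a + b for a, b in zip(low, high)]
```

Because each note has amplitude 1, the merged samples always stay between -2 and 2.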

If all I had was the merged signal, would I be able to separate the merged signal into the individual parts? The answer is “yes”. Joseph Fourier formalized the math of such a procedure and created the equations associated with his name such as the Fourier Transform:

[Image: Fourier slide]
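To make the "yes" concrete, here is a minimal sketch of a Discrete Fourier Transform pulling the two notes back out of a merged signal. It is a direct, slow implementation written for readability. The window of 64 samples and the choice of 1 and 8 cycles per window (keeping the 8-to-1 ratio of the piano example) are assumptions made so that each note falls exactly on one frequency bin.

```python
import cmath
import math

def dft(signal):
    """Naive Discrete Fourier Transform: one complex coefficient per frequency bin."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 64                      # samples in the window (assumed)
LOW_CYCLES = 1              # "middle C": 1 cycle per window (assumed)
HIGH_CYCLES = 8             # "3 octaves above": 8 cycles per window

# Merged signal: the sum of the two pure sine waves.
merged = [math.sin(2 * math.pi * LOW_CYCLES * t / N) +
          math.sin(2 * math.pi * HIGH_CYCLES * t / N)
          for t in range(N)]

spectrum = dft(merged)

# Only the first N/2 bins are distinct for a real-valued signal.
magnitudes = [abs(c) for c in spectrum[:N // 2]]

# Bins with a clearly nonzero magnitude are the "notes" present in the signal.
peaks = [k for k, m in enumerate(magnitudes) if m > 1.0]
print(peaks)  # [1, 8]
```

The two peaks sit at exactly the frequencies that were added together, and every other bin is zero up to rounding error.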

In principle, an infinite number of “notes” can be merged into one signal by a hypothetical transmitter and then separated back into the constituents by a receiver. In practice, this cannot happen because of physical limitations.

In nature, there are mechanisms which can implement Fourier decompositions such as the human ear for decomposing sound waves into various periodicity components (aka notes):

[Image: ear]

or prisms for decomposing light waves into various periodicity components (aka colors):

[Image: prism]

In the world of engineering designs, Fourier decompositions are accomplished by circuits:

[Image: equalizer]

or computer algorithms implementing Fourier Transform math:

[Image: DFT]

As for DNA, detecting periodicities (like 3mers) using Fourier transforms is a bit forced, in my opinion needlessly complex, and perhaps in the end not the best tool for detecting exons. More mathematical complexity does not necessarily make a tool more accurate at identifying exons. In fact, 3mers can be seen even in short stretches of synthetic genomes created by random number generators.
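For readers who want to see what a Fourier-style 3mer measurement looks like, here is a minimal sketch. It assumes one common setup: turn the sequence into a 0/1 indicator for one nucleotide and read off the squared DFT magnitude at the frequency corresponding to period 3. The function names and both test sequences are hypothetical, made up for illustration.

```python
import cmath
import random

def indicator(seq, base):
    """1 where the sequence has the given base, 0 elsewhere."""
    return [1 if c == base else 0 for c in seq]

def period3_power(seq, base):
    """Squared DFT magnitude at bin N/3 (period 3).

    Assumes len(seq) is divisible by 3 so that period 3 falls on an exact bin.
    """
    x = indicator(seq, base)
    n = len(x)
    k = n // 3
    coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
    return abs(coeff) ** 2

# A perfectly periodic toy sequence: G appears at every third position.
periodic = "GCA" * 20

# A synthetic sequence from a random number generator (seed fixed for repeatability).
random.seed(0)
random_seq = "".join(random.choice("ACGT") for _ in range(60))

print(period3_power(periodic, "G"))    # about 400, i.e. (20 G's)^2
print(period3_power(random_seq, "G"))  # typically nonzero as well
```

The second number is the point made above: a short random sequence generally shows some period-3 power too, so a nonzero reading by itself does not demonstrate an exon.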

One of the reasons the Fourier transform is an imperfect tool for detecting periodicities in DNA is that the periodicities are short-lived and only approximately periodic. Worse, noise due to sampling DNA will create lots of false periodicity signals. And worse yet, each position presents only two possible discrete states for a given nucleotide (Guanine or not-Guanine, Adenine or not-Adenine), which is essentially like having only two possible amplitudes, 0 or 1. Fourier analysis is usually applied to signals whose amplitude can take any value in a continuous range. Having only 2 discrete amplitudes makes it virtually impossible to overlap information by adding periodic signals together, as can easily be done with sound waves (like the example above). One cannot take a DNA strand with a strong 3mer pattern, somehow add it to a DNA strand with a 4mer pattern, and get a DNA strand containing both patterns.
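A tiny sketch shows why the addition fails. The two toy sequences are hypothetical and the 0/1 Guanine indicator is the encoding described above. Adding the indicators position by position, the way the two piano notes were added, produces a value that no single strand can represent.

```python
def indicator(seq, base):
    """1 where the sequence has the given base, 0 elsewhere."""
    return [1 if c == base else 0 for c in seq]

# Toy strands (hypothetical): G every 3rd position, and G every 4th position.
period3 = indicator("GCA" * 4, "G")    # 1 at positions 0, 3, 6, 9
period4 = indicator("GCAT" * 3, "G")   # 1 at positions 0, 4, 8

# Add them the way sound waves add.
summed = [a + b for a, b in zip(period3, period4)]
print(summed)  # [2, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0]

# A position is either Guanine (1) or not (0); "2" is not a state DNA can hold.
invalid_positions = [i for i, v in enumerate(summed) if v not in (0, 1)]
print(invalid_positions)  # [0]
```

Clipping the 2 back down to 1 would give a valid strand, but it would no longer be the sum of the two signals, so the clean separation that works for sound is lost.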

The search for overlapping information in DNA must go beyond Discrete Fourier Transforms and turn to more accessible constructs. The BioLanguages website hopes to explore methods of identifying overlapping information in DNA.