Saturday, March 15, 2014

The Neural Sound of Music

The awaron hypothesis suggests that bilateral neural packets are the basic mechanism of both sensation and thought. Neural packets form to mimic sensory input, and those mimes then persist as what we call thought, which is an awaron packet for a given moment of time. Each organ of sensation will have its own unique mimes, but there will be shared characteristics, and so the human ear should sense sound in ways similar to how the eye processes vision.

Although there are many components within the human ear, the basic neural organ of hearing is the cochlea. The cochlea has a spiral shape (see Figure), with paired impulse and response fluid canals and the well-known frequency response shown in the figure. Tiny hairs along the length of the basilar membrane, the wall between the spiral cochlea's impulse and response canals, are the neurons that sense sound through fluid deflection of each hair. The neural patterns in the EEG that come from basilar excitation are many and varied, but the underlying neural network is still not well understood. In fact, science does not yet seem to have a clear understanding of the neural impulse patterns for even simple organisms.

One of the characteristics of human hearing, for example, is that we sense and enjoy music with the seven notes of the octave. We recognize that this particular 7-mer tonality is pleasing for humans, but science does not yet understand the neural network that underlies it. The basilar regions associated with sensation of sound from 200 to 20,000 Hz are shown in the figure along with how they map onto the topology of the cochlea. The frequency diagrams show where along a hypothetical uncoiled basilar membrane we sense each frequency. However, there are frequencies below 200 Hz that are very important for enjoying music, and there seems to be a problematic compression of these longer wavelengths, below 200 Hz, into the tip of the basilar membrane.
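The place-frequency map of the basilar membrane is commonly modeled with the Greenwood function, which is standard auditory science rather than part of this hypothesis. A short sketch under that standard model shows how frequency maps to distance along an uncoiled membrane of about 35 mm:

```python
def greenwood_frequency(x_from_apex_mm):
    """Characteristic frequency (Hz) at a point on the human basilar
    membrane, per the standard Greenwood place-frequency function
    F = A * (10**(a*x) - k), with human constants A = 165.4 Hz,
    a = 0.06 per mm (x measured in mm from the apex), k = 0.88."""
    return 165.4 * (10 ** (0.06 * x_from_apex_mm) - 0.88)

# Characteristic frequencies run from ~20 Hz at the apex (tip)
# to ~20 kHz at the base (stapes end) of a ~35 mm membrane.
for x in (0.0, 6.5, 17.5, 35.0):
    print(f"{35.0 - x:5.1f} mm from stapes -> {greenwood_frequency(x):8.1f} Hz")
```

Under this model, middle C at 262 Hz falls roughly 28 to 29 mm from the stapes, reasonably close to the 27 mm figure this post uses for its tonal midpoint.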

The 7 notes of the octave are very suggestive, though, of a binary or bilateral difference sampling between groups of selected neurons, and the top three rows of Pascal's triangle from the binomial theorem describe how this binary sampling adds up to 7 (the row sums are 1 + 2 + 4). This implies that the same kind of bilateralism that we recognize as left-right body symmetry is also a part of hearing. Indeed, bilateralism is very common in all higher organisms, and it also appears in the binary frequency analysis of sound and other spectral data with the Cooley-Tukey fast Fourier transform (FFT) algorithm. The basic FFT algorithm processes spectral data by sampling a time series with powers-of-two averaging, and bilateralism is therefore an efficient way to sample and compress time-series data into frequency amplitudes for representation in thought packets.
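Both counting claims above can be checked directly: the sums of the top three rows of Pascal's triangle are 1 + 2 + 4 = 7, and the radix-2 Cooley-Tukey algorithm does split a signal bilaterally into sum and difference halves at each stage. A minimal sketch of the standard algorithms (not specific to this hypothesis):

```python
import cmath

def pascal_row(n):
    """Binomial coefficients C(n, 0..n) for row n of Pascal's triangle."""
    row = [1]
    for k in range(n):
        row.append(row[-1] * (n - k) // (k + 1))
    return row

# Row sums are powers of two; the top three rows sum to 1 + 2 + 4 = 7.
top_three_total = sum(sum(pascal_row(n)) for n in range(3))

def fft(x):
    """Radix-2 Cooley-Tukey FFT: recursively split into even/odd halves,
    then combine each pair of bins as a sum and a difference."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t            # "sum" half of the bilateral pair
        out[k + n // 2] = even[k] - t   # "difference" half
    return out
```

The powers-of-two structure is why the input length must be a power of two for this basic radix-2 form.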

Therefore it seems very reasonable that along the basilar membrane, neurons from 20 mm to the end would be progressively paired into 7-mers, with the midpoint defined by middle C at 262 Hz, which peaks at 27 mm from the stapes. These progressive bilateral neural pairings would then form difference modes that complement the sum and total modes and enhance sensation of frequencies lower than ~1000 Hz, as shown. These 7-order difference pairings would then effectively provide for our pleasure in hearing the tones and chords of music.
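One way to read "progressively paired into 7-mers" is as a binary sum/difference tree: eight paired inputs yield exactly 1 + 2 + 4 = 7 difference modes plus one total mode. A hypothetical sketch of that reading, essentially a Haar-style decomposition; this interpretation is mine, not something the post states:

```python
def progressive_bilateral_modes(samples):
    """Progressive bilateral pairing: at each level, adjacent samples are
    replaced by their pairwise sums, and the pairwise differences are kept
    as modes. Eight inputs yield 4 + 2 + 1 = 7 difference modes plus one
    total (sum) mode, matching the 7-mer count in the text."""
    sums, diffs = list(samples), []
    while len(sums) > 1:
        pairs = list(zip(sums[0::2], sums[1::2]))
        diffs.extend(a - b for a, b in pairs)
        sums = [a + b for a, b in pairs]
    return sums[0], diffs  # (total mode, difference modes)
```

With eight hypothetical hair-cell responses as input, the single total mode would carry overall loudness while the seven difference modes would carry the tonal structure.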


It is no coincidence that there is a 7-mer compression of retinal information from the eye just as there apparently is in the ear. This means that both our auditory and visual sensations end up using the same neural bandwidth at 7x the EEG delta wave, which is the EEG alpha wave at about 11 Hz.

The sensation of sound results in an awaron packet of bilateral neurons. There are approximately 30,000 neurons in the auditory nerve; 90%, or 27,000, innervate sensation, while the remaining 3,000 neurons provide feedback and gain control by stiffening gain hairs in the membrane. Each cochlear hair cell synaptically couples to about 10 other neurons, which provides 270,000 neural nodes per frame or heartbeat, or 450,000 nodes/s. With a Hopfield reduction factor of 0.14 and a frame of 0.6 s, this is an overall effective sampling rate of 7.9 kB/s, or 4.7 kB/frame.
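The packet arithmetic above can be replicated step by step; every constant here is the post's own figure:

```python
# The post's figures for the auditory awaron packet.
total_neurons = 30_000
sensory_neurons = int(total_neurons * 0.90)    # 27,000 innervate sensation
synapses_per_cell = 10                         # couplings per hair cell
hopfield_reduction = 0.14                      # effective bits per node
frame_s = 0.6                                  # one frame per heartbeat

nodes_per_frame = sensory_neurons * synapses_per_cell    # 270,000 nodes/frame
nodes_per_second = nodes_per_frame / frame_s             # ~450,000 nodes/s
kB_per_frame = nodes_per_frame * hopfield_reduction / 8 / 1000   # ~4.7 kB/frame
kB_per_second = kB_per_frame / frame_s                   # ~7.9 kB/s
```

The kB figures treat each Hopfield-reduced node as one bit, divided by 8 bits per byte.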

The Nyquist cutoff for human hearing is twice the 20,000 Hz upper range and would correspond to a neural network of about 290,000 nodes/s with a Hopfield reduction of 0.14, which suggests that we use roughly two-thirds of the neural bandwidth for pure frequency response. The remaining third of the neural bandwidth would then encode tonality and phase, attributes that are especially critical for music.
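The bandwidth fraction follows from the same constants as the packet arithmetic above:

```python
hopfield_reduction = 0.14
nyquist_rate_hz = 2 * 20_000                   # Nyquist rate for a 20 kHz ceiling

# Nodes/s needed for pure frequency response (the post rounds to 290,000).
nyquist_nodes_per_s = nyquist_rate_hz / hopfield_reduction

# Total packet capacity: 27,000 sensory neurons x 10 couplings per 0.6 s frame.
available_nodes_per_s = 27_000 * 10 / 0.6

frequency_fraction = nyquist_nodes_per_s / available_nodes_per_s  # ~2/3
```

Roughly two-thirds of the capacity covers raw frequency response, leaving the remainder for tonality and phase.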

In addition to the sensation of tones or frequencies, which are the vowels of speech, there is also the sensation of sound starting and stopping, which gives the consonants of speech. Starting and stopping of sound involve very high-frequency clipping sounds, and starts and stops are quite a bit simpler to compress than tonal sounds. As the figure shows, a possible three difference modes sum to 3 from the top two rows of the Pascal triangle (1 + 2), which would be a so-called theta EEG mode at 4.8 Hz, shown in the actual EEG spectrum below. Start and stop encoding is likely due to bilateral coupling of just 3 difference modes for hairs from 0 to 20 mm from the stapes.



This compression would be consistent with the total of 10 interconnections associated with each auditory neuron: 7 for tone and 3 for start and stop, or phase. All of these sensations, though, would be subject to the overall phase of the delta mode at 1.6 Hz. While start and stop data are important for sensation of all sounds, start and stop or phase encoding is especially important for the low frequencies of music, since phase sets the tempo.
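The EEG mode frequencies quoted throughout all follow from the 1.6 Hz delta base by the same mode counting:

```python
delta_hz = 1.6               # overall phase mode, one per heartbeat/frame
theta_hz = 3 * delta_hz      # 3 start/stop difference modes -> 4.8 Hz
alpha_hz = 7 * delta_hz      # 7 tonal difference modes -> 11.2 Hz (~11 Hz)
modes_per_neuron = 3 + 7     # matches the 10 interconnections per neuron
```

The 7 + 3 = 10 split is the post's accounting for the ten synaptic couplings per auditory neuron.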

Although this hypothesis or conjecture for the auditory neural network is not yet validated, it does appear to be consistent with much of the data that is available. The prevalence of cochlear implants now provides a basis for testing it. While encoding frequencies above about 1000 Hz (soprano C6 is 1047 Hz) is fairly straightforward with implants under this hypothesis, frequencies below 1000 Hz would need a special folding algorithm around middle C at 262 Hz, or wherever a person's tonal midpoint happens to be.
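The post does not specify what such a folding algorithm would look like. As a purely hypothetical illustration, one simple reading is to reflect frequencies below the tonal midpoint up above it before encoding; the function name and mapping here are invented for this sketch:

```python
def fold_about_midpoint(freq_hz, midpoint_hz=262.0):
    """Hypothetical folding map for implant encoding (illustration only):
    frequencies below the tonal midpoint are reflected to the same
    distance above it, so one electrode range can carry both sides."""
    if freq_hz >= midpoint_hz:
        return freq_hz
    return midpoint_hz + (midpoint_hz - freq_hz)

# Under this toy mapping, a 100 Hz bass tone is presented at 424 Hz,
# while tones above the midpoint pass through unchanged.
```

A real device would presumably need the `midpoint_hz` parameter tuned per listener, as the next paragraph suggests.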

It is possible that people with so-called perfect pitch have a natural tonal midpoint that is very close to standard middle C at 262 Hz. Most people without perfect pitch, though, need to shift their hearing reference tone, much as we do with color vision, by feedback to the gain hair neurons. Since cochlear implants do not respond to neural feedback, this tonal shift must be performed electronically by the device and would need to be tuned for each person.