Objection to Computational Approach to Auditory Perception

Project: Direct Auditory Perception

Akpan Jimmy Essien

1. Introduction

Scientists seek invariants in order to offer scientific explanations of auditory analysis. In the absence of invariance, however, all known hearing theories, whether psychoacoustic or ecological, rest on computations of spectral or mechanical parameters. This option hangs on the belief that the ear performs computations to arrive at auditory percepts. In this regard, Ullman (1980), like some of the predecessors he cites, took a stand against direct auditory perception. A careful examination of the body of evidence on which he founded his objection reveals serious deficiencies, the most serious of all being total neglect of invariance in auditory analysis. To portray how the ear apprehends auditory sensations directly, two veils that eclipse direct auditory perception need to be removed: (1) computations; and (2) the object/medium controversy. The present update to the project addresses the first problem. From the standpoint of invariance, it will be shown that a calculator (or any device) carries out no computations at all to arrive at the results it displays.

2. Evidence 1

Take, for example, a weighing machine. Suppose we place element x on the scale and the dial indicates a weight of 3 kg. If we place another element y, whose weight is 5 kg, on the scale, the device indicates 8 kg. When another element z, whose weight is 7 kg, is added, the machine indicates 15 kg. How does a human observer, by human intelligence (HI), interpret the performance of the weighing machine (the AI)? For HI, the AI adds (or computes) the weights of the elements on the scale to arrive at the sum of 15 kg. But does the device add in the sense of computation? Let us see. If we remove elements x, y, and z from the scale, the scale indicates 0 kg.
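The weighing-machine scenario can be put in a few lines of code. The sketch below is illustrative only; the `Scale` class and its method names are invented for the purpose, but the point it makes is the one argued here: the device exposes a single reading and retains no record of the individual elements.

```python
class Scale:
    """A hypothetical weighing machine: it reports only the total load."""

    def __init__(self):
        self._load_kg = 0.0

    def place(self, weight_kg):
        self._load_kg += weight_kg

    def clear(self):
        self._load_kg = 0.0

    def reading(self):
        # The dial shows a single quantity; the individual
        # contributions of x, y, and z survive nowhere in the device.
        return self._load_kg


scale = Scale()
scale.place(3)              # element x
scale.place(5)              # element y
scale.place(7)              # element z
print(scale.reading())      # 15.0
scale.clear()
print(scale.reading())      # 0.0
scale.place(15)             # all three elements in one sack
print(scale.reading())      # 15.0 again: the same reading, differently arrived at
```

Nothing in the object corresponds to an act of addition over x, y, and z; the "sum" exists only in the observer's interpretation of the dial.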
Now, we put all the elements in one sack and place the sack on the scale; the device indicates 15 kg. What do we conclude? To answer, let us bring the matter closer to sensory behaviour.

3. Evidence 2

A heating engineer installs radiator x in a room, and a thermometer gives the room temperature as 9 °C. That temperature is too uncomfortable! The engineer installs another radiator y; the thermometer indicates 15 °C. That is better, but still not warm enough! The engineer installs a third radiator z, and the thermometer reads 21 °C. That's better! The relevant question is: does the thermometer compute the output of the individual radiators x, y, and z to arrive at the room temperature of 21 °C? Let us put it to the test. The engineer, on attaining the desired room temperature, realises that the radiators have taken up all the space he needed on the wall for other purposes. To resolve the problem, he takes down the radiators and, after some calculations of device sizes and outputs, installs one powerful radiator; the thermometer again indicates a room temperature of 21 °C. How do these changes portray the performance of the thermometer? Before answering these questions, let us examine a case involving sound.

4. Frequency Estimation in Pitch Perception Research

Auditory research is perhaps the most controversial science today: some 2,500 years after the first scientific study, in the form of the Pythagorean string ratios, the field is still investigating the origin of pitch. The most exploited parameter is frequency of vibration. How easy is it to estimate frequency of vibration using mechanical devices? There are many known methods of frequency estimation, among them 'peak picking' and 'zero-crossing calculations'. In the early years of perception studies using spectrographic or glottographic images, estimations were arrived at manually. Today, hearing research has acquired sophisticated technology for frequency estimation. The algorithms perform well on synthetic signals but fail woefully when faced with the lawless spectral variability of naturally produced stimuli. The failure is not attributable to the programs; rather, their writers must admit inadequate knowledge of the raw materials they aim to process. Are these artefacts intelligent? Not really. After all, they only do what they were asked to do. Outside such instructions, they can do nothing at all. Let us now consider all the questions raised above.

5. Observations

One may assemble a host of examples of the activities and performances of AI and argue that the devices compute to produce results. In the case of the weighing machine, 3 + 5 + 7 = 15; and in the case of the thermometer, we can work out the contributions of radiators x, y, and z to the temperature of the room. At the same time, the scenarios show that the AI indicates what is, not the individual elements contributing to the sum. In other words, the scenarios indicate that AI operates on the basis of invariance: the output is the result of the different inputs regardless of how the sum was arrived at. In this regard, Ullman (1980) mentioned the calculator. He specifically acknowledged that the human system does not necessarily function in that manner. The acknowledgment vitiates the analogy between brain functions and the principle of a calculator. Interestingly, he recognized the presence of "currents and voltages" in a calculator. The currents and voltages are comparable to elements of different weights on a weighing scale, or radiators of different outputs in the hypothetical heating scenario above. In very simple terms, the inputs to a calculator are different magnitudes of the same parameter that the device has been made to add.
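The invariance point can be stated numerically. In the sketch below, the linear temperature model and its constants are invented purely for illustration; what matters is that each device's output depends only on the total input, never on how that total is partitioned among contributing elements.

```python
def dial_reading(loads_kg):
    # The weighing machine registers only the sum of what is on the pan.
    return sum(loads_kg)

def room_temperature(radiator_kw, base_c=3.0, degrees_per_kw=6.0):
    # Hypothetical linear model of the heating scenario: the thermometer
    # registers only the total heat input, however it is divided.
    return base_c + degrees_per_kw * sum(radiator_kw)

# Separate elements x, y, z versus one sack: one and the same reading.
assert dial_reading([3, 5, 7]) == dial_reading([15]) == 15

# Radiators x, y, z installed one after another...
assert room_temperature([1.0]) == 9.0
assert room_temperature([1.0, 1.0]) == 15.0
assert room_temperature([1.0, 1.0, 1.0]) == 21.0
# ...versus one powerful radiator: one and the same temperature.
assert room_temperature([3.0]) == room_temperature([1.0, 1.0, 1.0]) == 21.0
```

The partition of the input into x, y, and z exists for the observer who set it up, not for the device that displays the result.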
The device is conscious of nothing; it displays the end quantity. The computations are done not by the device per se but by the device operator. For the device to perform any operation on its own, the algorithm for that operation must have been established by its maker; otherwise the device can do nothing at all. This fact is borne out by pitch extraction algorithms that fail because they have not been educated on certain aspects of the tasks assigned to them. They are not intelligent at all.

6. Application

If, now, we transplant this principle from AI to human auditory perception, we can draw the conclusion that the auditory mechanism was created and equipped with a robust algorithm to do what it does; and, like AI, the human auditory mechanism does it unconsciously. This cannot sensibly be denied; otherwise we would not have spent the 2,500 years since Pythagoras seeking to understand what we hear as pitch, or how we hear it. The traditional computational approach to hearing may be traced back to the string ratio theory, with all its computations of the physical parameters of sound sources and their ratios, leading to frequencies of vibration and their ratios for pitch intervals, all without knowledge of what determines pitch. The comfort of the computational approach to hearing was suddenly disrupted by Gibson's direct realism.

7. Implication

Direct realism, however, is unsustainable, in fact utterly useless, without establishing the object of perception. Without the underlying cause of an auditory sensation, speculation on the principle of the auditory system is futile. The evidence supports the view that in music production it is the performer who computes; all the mechanical adjustments in music performance are different routes to the invariant that underlies a given auditory sensation.
The auditory system, for its part, simply extracts and measures the magnitude of the invariant, the property of the sound source that underlies the sensation, for example pitch. In speech, all the articulatory gestures are here recognized as different routes to the invariant that is perceived as a speech code. The computations leading to the invariant are carried out by the human speaker, not by the ear. The brain, for its part, does no computations at all; rather, it simply extracts the invariant that determines the speech code. Because all the mechanical and articulatory computations lead to the same code, the code does not change across musical instruments or human speakers (Essien, 2016).
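The zero-crossing method mentioned earlier offers a concrete illustration of both claims: different spectra can lead to the same invariant, and an algorithm does only what it was asked to do. The sketch below is illustrative only; the signals, amplitudes, and noise level are chosen for the example, not taken from any cited data.

```python
import numpy as np

def zero_crossing_freq(signal, sample_rate):
    """Estimate frequency by counting sign changes (zero crossings)."""
    crossings = int(np.sum(np.diff(np.signbit(signal))))
    duration = len(signal) / sample_rate
    return crossings / (2.0 * duration)   # two crossings per cycle

sr = 44_100
t = np.arange(sr) / sr                    # one second of samples

pure_tone = np.sin(2 * np.pi * 220 * t)                    # one spectrum
rich_tone = pure_tone + 0.5 * np.sin(2 * np.pi * 440 * t)  # another spectrum

# Different spectra, same invariant: both estimates fall near 220 Hz.
print(zero_crossing_freq(pure_tone, sr))
print(zero_crossing_freq(rich_tone, sr))

# Add noise, as in naturally produced stimuli, and the estimator fails
# woefully: it only does what it was asked to do, namely count sign
# changes, and the noise produces a flood of spurious crossings.
rng = np.random.default_rng(0)
noisy_tone = pure_tone + 0.3 * rng.standard_normal(sr)
print(zero_crossing_freq(noisy_tone, sr))  # far above 220 Hz
```

The estimator has no notion of pitch; it was instructed to count crossings, and outside that instruction it can do nothing at all.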

8. Conclusion

This update to the project Direct Auditory Perception argues, from the standpoint of invariance, that mechanical devices (AI) do not carry out any computations to arrive at the results they display. Whether in music or speech, different operations (or computations of different parameters) by the musician or speaker lead to the same auditory code. The auditory system does not appear to carry out any computations at all; rather, it apprehends the invariant to which all computations lead. The computations are done not by the auditory system but by the musician or human speaker. To explain direct auditory perception scientifically, it is imperative to identify the parameter of the sound source that underlies every auditory attribute of sound.

P.S. The next update will address the Object/Medium Controversy and move us closer to a better understanding of Direct Auditory Perception.

References

Ullman, S. (1980). Against direct perception. Behavioral and Brain Sciences, 3(3), 373-415.

Essien, A. J. (2016). The mechanical invariance factor in music production and perception. Paper 1179, Proceedings of the 22nd International Congress on Acoustics (ICA 2016), Buenos Aires, Argentina, 5-9 September 2016.