THE EFFECT OF SOURCE TIMBRE ON PITCH

Running head: THE EFFECT OF SOURCE TIMBRE ON PITCH REPRODUCTION

The Effect of Source Timbre on Pitch Reproduction
Priyanka Shekar
Center for Computer Research in Music and Acoustics
Stanford University
Stanford, California, USA

Author Note
I wish to acknowledge the helpful direction of my instructor, Takako Fujioka, and teaching assistant, François Germain, while pursuing this study. I would also like to thank my classmates and peers for participating in my experiment and giving candid feedback.

Abstract
This study investigates the effect of source timbre, namely sine (pure) tone and vocal, on the accuracy of pitch reproduction. It further asks whether musical and vocal experience influence this ability. Twenty-five Stanford University students participated and were grouped by musical experience into singers, musicians and non-musicians. Stimuli were presented as sine and vocal tones of 1-second length in matched pitch pairs (same fundamental frequency), and participants were required to reproduce each stimulus as faithfully as possible. Vocal responses were processed with a pitch extraction algorithm in order to calculate mean pitch inaccuracy, that is, the deviation of response pitch from stimulus pitch. Sine tones were reproduced significantly less accurately than vocal tones, and with significantly more variability. The results suggest that greater vocal and/or musical experience may reduce dependence on timbre for pitch reproduction; however, more samples are required to draw a conclusion on this.
Keywords: pitch, timbre, reproduction, perception, singing

The Effect of Source Timbre on Pitch Reproduction

Introduction
The ability to sing on key may define a benchmark when characterizing people as singers and non-singers. Matching fundamental frequency, or pitch, is one of the most basic skills one must acquire, and an unmatched or off-key performance is immediately recognizable, even to the untrained ear [1]. Many people claim to have problems when expected to sing, such as at public gatherings or celebrations. The term ‘tone deafness’ is generalized to mean poor-pitch singing, and implies that the cause is poor perception ability. However, the process of tune imitation also involves production, memory and sensorimotor integration [2].

Pitch and timbre are elementary building blocks of music. Pitch modulations are normally heard as a progressing melody, whereas modulations of timbre may be perceived as different instrumentations. Interestingly, in a vocal melody, the varying timbre of the human voice does not detract from the perception of a single melodic line. However, given that both pitch and timbre lie in the spectral domain, some interaction between the two is expected [4].

Pitch Discrimination
Warrier & Zatorre [4] examined the effect of tonal context on pitch perception, and the influence of perception on the interrelationship between pitch and timbre. For the small changes in fundamental frequency examined in their study, timbre, or spectral shape, always influenced discrimination of pitch. In all sets of experiments, participants judged tones of different timbre to differ more in pitch than identically pitched tones of the same timbre. Tonal context was found to facilitate pitch perception, in good agreement with previous studies.

Pitch Reproduction
Pfordresher & Brown [2] conducted two experiments in vocal pitch imitation with participants who had no vocal training and little or no musical training, in order to distinguish vocal ability from training. The first experiment investigated the ability to discriminate pitches. Stimuli were high-pitched pure tones to facilitate discrimination. The second experiment presented participants with a novel melody in a synthetic voice. Comparing the results of the two experiments made it possible to determine whether the root cause was perception, production or the mapping process between the two. Between 10% and 15% of participants were defined as “poor-pitch singers”, imitating pitch at least one semitone off in each experiment. The poor-pitch singers were found to reproduce intervals in a compressed manner although, interestingly, they did not differ from good singers in pitch discrimination accuracy. The authors concluded that poor-pitch singing stems from incorrectly mapping input pitch onto action, rather than from a deficit specific to the perceptual, memory or motor systems.

Murry [1] examined the accuracy of pitch-matching in singers and non-singers over repeated trials. The singers, who were pursuing vocal careers and had an average of 6 years of professional experience, were matched in age with the non-singers. Stimuli were presented as pure sine tones in the participant’s vocal range. Results demonstrated that singers improved between the first 3 trials and the final 3, whereas non-singers showed no such trend. As may be expected, this suggests that singers rely on skill and training to aid pitch-matching.

Motivation for Study
Previous studies have examined pitch reproduction with either a sine (pure) tone stimulus [1] or a human voice stimulus [2]. However, none has examined whether the timbre of the stimulus facilitates or hinders accurate reproduction. This study therefore asks whether source timbre affects pitch reproduction, specifically for sine tone and vocal timbres. It also examines whether the effect differs between singers, non-singer musicians and non-singer non-musicians. It is hypothesized that singers are largely unaffected by timbre for pitch reproduction, non-singer musicians are somewhat affected, and non-singer non-musicians are significantly affected. In the latter two cases, vocal timbre is expected to facilitate correct pitch reproduction better than sine tone.

An experiment was formulated to test this hypothesis at Stanford University (Stanford, CA, USA). Participants from the student population were asked to listen to 2 sets of pitch-matched stimuli at various fundamental frequencies (pitches) within an octave of a comfortable voice part from Soprano, Alto, Tenor and Bass (SATB), in sine tone and voice timbres, for 1 second each. Each participant was then required to reproduce vocally what they had heard on the vowel /ɑː/ (ahh, as in father), as immediately and as faithfully as possible. Reproduction efforts were recorded and analyzed for accurate imitation of fundamental frequency.

Method
The experimental design used to investigate the effect of sine tone and vocal timbre (test condition) on pitch reproduction for singers, musicians and non-musicians (test categories) is detailed below.

Stimuli
The stimulus material was formed from 2 timbres, ‘voiced’ by sine wave and by human voice (on the vowel /ɑː/). The set of stimuli presented to a participant consisted of 10 sine tones and 10 pitch-matched vocal tones (same fundamental frequency as the sine tones), each 1 second in length, and randomly ordered uniquely for each participant. The pitch of each tone pair was randomly selected within the continuous range of a musical octave, in order to reduce bias from participants’ tonal schema. Three sets of stimuli were created in the middle of the vocal range for the Alto, Tenor and Bass voice parts, and each participant was able to select an appropriate range for the reproduction task. A stimulus set was initially created for the Soprano voice part; however, its samples were found to contain significant vibrato, rendering the texture inconsistent with the other stimuli, so this part was removed from the voice part choices. For the vocal tones, male participants listened to stimuli sung by males in the Tenor or Bass parts, and female participants listened to stimuli sung by females in the Alto part.

Sine tones were generated using the ‘Audacity’ audio editing software. Vocal tones were sourced from live recordings of 3 singing students in the Stanford University Music Department, for each of the voice parts. Further processing of sine and vocal tones was done using the ‘Audacity’ software (see Figures 1 & 2). A 1-second steady-state portion was extracted from each vocal sample to form the stimulus. Sine tone and vocal samples were then processed with a fade envelope of 0.025 seconds length at the edges to remove clicks. Vocal samples were recorded from sung Western tonal scales, and thus also
required pitch-shifting in order to pitch-match the sine tones and bring the vocal tones into the continuous pitch space used in this experiment. The perceptual loudness of all stimulus samples was matched to -6dB by mapping to the Fletcher-Munson equal-loudness contour. Note that signal amplitude is not a correct indicator of loudness when timbre varies, because of the variance in harmonic energy present, in addition to the frequency dependence of human hearing sensitivity.

Subjects
Twenty-five Stanford University students were recruited as participants by word-of-mouth advertising. Participants were not compensated financially; however, they were offered snacks or chocolates as a token of appreciation. Participants self-verified that they did not have a vocal or hearing impediment before proceeding with the task. Participants were grouped according to musical experience into singers, musicians (but non-singers), and non-musicians (and therefore also non-singers). A participant was categorized as a musician if they had more than 3 years of consistent musical experience, meaning at least 1 music lesson per week and/or practice at least 3 times per week. Singers were categorized similarly. A participant who reported less than 3 years of both musical and singing experience was categorized as a non-musician. Refer to Table 1 for a breakdown of participants.

Task
Refer to Figure 3 for a schematic of the pitch reproduction task. Each participant was first asked to answer an electronic questionnaire to determine vocal and musical ability for categorization into a study group, and to confirm that they did not have a vocal or hearing impediment before proceeding. Following this, a warm-up phase required the participant to determine a comfortable voice part. If the participant normally sang a particular voice part, they were assigned to this stimulus pitch range. If not, they were asked to listen to a sample of a
musical scale sung in each range, and could choose a suitably comfortable part. Stimuli were played to the participant over ‘Apple EarPod’ in-ear headphones. The directional microphone on the same headphones was used to record the response.

The experimental regime was programmed using the ‘PsychoPy’ software (see Figure 4) and presented to the participant on the researcher’s laptop. It commenced after the warm-up phase. First, the participant undertook a set of 4 practice trials with the researcher present in the room. Here, the participant was encouraged to clarify the task, and to adjust and set the listening volume to a comfortable level. After this, the researcher left the room and the participant commenced the actual task.

Each cycle of the task was as follows (see Figure 5). A visual cue for stimulus play was displayed for 1.5 seconds, immediately followed by a stimulus tone presented for 1 second. The cue then switched to indicate recording mode, and the participant was required to reproduce the tone within 5 seconds, as immediately and as faithfully as possible on the vowel /ɑː/, while the response was automatically recorded. The whole stimulus set played through consecutively in this manner, in order to counter short-term memory and schema effects. The participant could press a key to skip ahead to the next stimulus if they had finished their response or were unable to complete it.

The mobility of the laptop and headphones setup greatly eased the collection of data. A quiet setting was sought in the music studios and practice rooms at the Center for Computer Research in Music and Acoustics and the Music Department at Stanford University.
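The sine-tone stimulus described under Stimuli (1 second long, with 0.025-second fades at both edges) can be illustrated with a short sketch. The study generated its tones in ‘Audacity’, so the code below is not the procedure that was used. The function name `sine_stimulus`, the 44100 Hz sample rate, the linear fade shape and the peak amplitude of 0.5 (roughly -6 dB relative to full scale) are assumptions made for illustration only.

```python
import math

def sine_stimulus(f0_hz, duration_s=1.0, fade_s=0.025, amplitude=0.5, sample_rate=44100):
    """Return float samples of a sine tone with linear fades at both edges.

    All defaults except the 1.0 s duration and 0.025 s fade are assumed values.
    """
    n_total = int(round(duration_s * sample_rate))
    n_fade = int(round(fade_s * sample_rate))
    samples = []
    for n in range(n_total):
        if n_fade > 0:
            # Ramp 0 -> 1 over the first n_fade samples and 1 -> 0 over the last n_fade.
            gain = min(1.0, n / n_fade, (n_total - 1 - n) / n_fade)
        else:
            gain = 1.0
        samples.append(amplitude * gain * math.sin(2.0 * math.pi * f0_hz * n / sample_rate))
    return samples

# Hypothetical 220 Hz tone; the study drew pitches at random within an octave.
tone = sine_stimulus(220.0)
# Peak level in dB relative to full scale (1.0).
peak_db = 20.0 * math.log10(max(abs(s) for s in tone))
```

The fade removes the abrupt onset and offset that would otherwise be heard as clicks. Matching peak amplitude in this way does not match perceived loudness across timbres, which is why the study applied a separate loudness-matching step.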

Results
All raw vocal response data collected from participants was found to be valid and usable for analysis. Some samples were rather soft or noisy. However, this was mitigated by cross-checking pitch extraction information between two audio processing software programs, ‘MATLAB’ and ‘Audacity’.

Data Analysis
Most data processing and analysis was done using ‘MATLAB’ software, including extraction of mean pitch from the vocal samples and calculation of pitch inaccuracy for each participant, followed by a mixed ANOVA on these results.

Pitch Extraction. The raw vocal response data from the reproduction task was processed with an open-source pitch extraction algorithm written for ‘MATLAB’ [5]. Default settings were used. The mean pitch (fundamental frequency) was found for each response, and this figure was cross-checked with a pitch detection feature in the ‘Audacity’ program.

Pitch Inaccuracy. Pitch inaccuracy was calculated for each response as the percentage deviation of the mean response pitch from the true stimulus pitch. As mentioned earlier, pitch is equivalent to fundamental frequency, or f0. The equation for this calculation is as follows:

\[
\text{pitch inaccuracy (\%)} = \frac{\text{response } f_0 - \text{stimulus } f_0}{\text{stimulus } f_0} \times 100
\]
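As a worked illustration of the percentage deviation defined above, the following sketch evaluates it for a made-up response and relates it to musical intervals. The function names and the example frequencies are hypothetical and are not taken from the study's MATLAB code. The sign is kept as the equation is written, and the conversion to cents is added only for orientation.

```python
import math

def pitch_inaccuracy_percent(response_f0, stimulus_f0):
    """Signed percentage deviation of the response f0 from the stimulus f0."""
    if stimulus_f0 <= 0:
        raise ValueError("stimulus_f0 must be positive")
    return (response_f0 - stimulus_f0) / stimulus_f0 * 100.0

def deviation_in_cents(response_f0, stimulus_f0):
    """The same deviation on a log scale: 100 cents is one equal-tempered semitone."""
    return 1200.0 * math.log2(response_f0 / stimulus_f0)

# Hypothetical example: a 220 Hz stimulus reproduced at 226.6 Hz.
example_percent = pitch_inaccuracy_percent(226.6, 220.0)
example_cents = deviation_in_cents(226.6, 220.0)

# A response exactly one semitone sharp has a frequency ratio of 2 ** (1 / 12).
semitone_percent = pitch_inaccuracy_percent(220.0 * 2 ** (1 / 12), 220.0)
```

Under this definition, a response one equal-tempered semitone sharp corresponds to an inaccuracy of a little under 6%, which gives a rough sense of scale when reading the plots of mean pitch inaccuracy.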

Mixed-Design Analysis of Variance (Mixed ANOVA). The pitch inaccuracy of vocal responses was averaged for each participant over the two conditions, sine tone stimulus and vocal tone stimulus. Similarly, the standard deviation of the pitch inaccuracy per condition was found for each participant. Thus, each participant was associated with two values for mean pitch inaccuracy and two values for the standard deviation of pitch inaccuracy. A mixed ANOVA was conducted with a between-subjects factor as grouping (singers, musicians, non-musicians) and a
within-subjects factor as condition (sine tone, vocal tone). The mixed ANOVA routine used was also written for ‘MATLAB’ [6]. Post-hoc comparison for each grouping was carried out using the paired t-test function in ‘MATLAB’.

Summary Statistics
Refer to Figures 6, 7 & 8 for individual plots of mean pitch inaccuracy per grouping, and to Figure 9 for a combined plot of mean pitch inaccuracy for all groupings. Figure 10 plots the standard deviation of pitch inaccuracy for all groupings.

The results of the mixed ANOVA reveal the following for mean pitch inaccuracy:
• Grouping is not significant as a main effect: F(2, 22) = 0.219, P = 0.805
• Condition is significant as a main effect: F(1, 22) = 6.789, P = 0.016
• The interaction between grouping and condition is significant: F(2, 22) = 4.239, P = 0.028

The results for the standard deviation of pitch inaccuracy are as follows:
• Grouping is not significant as a main effect: F(2, 22) = 1.249, P = 0.306
• Condition is significant as a main effect: F(1, 22) = 12.127, P = 0.02
• The interaction between grouping and condition is not significant: F(2, 22) = 3.012, P = 0.0698

Because the interaction between grouping and condition was significant for mean pitch inaccuracy, a post-hoc comparison was carried out for each group independently, using a paired t-test. Results are as follows:
• For singers, the effect of condition is not significant: P = 0.420
• For musicians, the effect of condition is significant: P = 0.0425
• For non-musicians, the effect of condition is significant: P = 0.0492
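The per-participant summaries and the post-hoc comparison described above can be sketched as follows. This is not the MATLAB code used in the study. The function names and the numbers are hypothetical, the use of the sample (n - 1) standard deviation is an assumption, and only the paired t statistic is computed, since turning it into a P value requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics

def summarize(inaccuracies):
    """Mean and sample standard deviation of one participant's inaccuracies in one condition."""
    return statistics.mean(inaccuracies), statistics.stdev(inaccuracies)

def paired_t(values_a, values_b):
    """Paired t statistic and degrees of freedom for two matched lists."""
    if len(values_a) != len(values_b) or len(values_a) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [a - b for a, b in zip(values_a, values_b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical per-participant mean inaccuracies (%) for one group; not data from the study.
sine_means = [4.0, 6.0, 5.0, 7.0]
vocal_means = [2.0, 3.0, 4.0, 3.0]
t_stat, dof = paired_t(sine_means, vocal_means)
```

Pairing the two condition means within each participant is what makes condition a within-subjects comparison: each participant serves as their own baseline, so differences in overall ability between participants do not enter the test.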

Discussion
The numerical results highlighted in the previous section are discussed in the context of the research question posed on the effect of source timbre on pitch reproduction. Interesting informal comments by participants are also included.

Numerical Data
The effect of timbre on both the mean and the standard deviation of pitch inaccuracy was found to be significant. Sine tones were reproduced with less accuracy than vocal tones, and also with a larger degree of variability in accuracy. Singers were found to be largely unaffected by the two condition timbres. Musicians were found to be significantly affected by timbre, with less accurate reproduction of sine tones compared to vocal tones. Non-musicians were also found to be significantly affected by timbre in the same way. However, it appears that musicians and non-musicians are affected by timbre to similar degrees.

With respect to the research question and hypothesis, the results are in general agreement. Timbre does have a significant effect on the accuracy of pitch reproduction, and sine tones are reproduced with less accuracy than vocal tones. Unexpectedly, however, there appears to be no distinct difference between the effect of timbre on musicians and on non-musicians. A likely reason for this is the small sample size for non-musicians (n = 4) as compared to musicians (n = 16). It is necessary to conduct this experiment on more non-musicians and singers to draw conclusions on the influence of musical and vocal experience.

Informal Verbatim
Many participants were curious about the aims of the experiment after taking it, and recounted their feelings and ideas. Some of these are captured below, and may reveal branching ideas to be investigated in the future, such as the interpretation of non-vocal tones as vocal tones:

• Non-musician on sine tone - "it would be more natural to sing an /oo/ or /ee/ sound"
• Non-musician on vocal tone - "more natural and repeatable"
• Musician on sine tone - "harder to sing because there was an extra step to process the pitch"
• Musician on vocal tone - "this sounds like me!"
• Musician on all tones - "my perfect pitch meant I rounded everything to diatonic tones"
• Singer on sine tone - "easier to match because it was more pure"

References

1. Murry, T. (1990). Pitch-matching accuracy in singers and nonsingers. Journal of Voice, 4(4), 317-321.
2. Pfordresher, P. Q., & Brown, S. (2007). Poor-Pitch Singing in the Absence of "Tone Deafness". Music Perception, 25(2), 95-115.
3. Strait, D. L., O'Connell, S., Parbery-Clark, A., & Kraus, N. (2013). Musicians' Enhanced Neural Differentiation of Speech Sounds Arises Early in Life: Developmental Evidence from Ages 3 to 30. Cerebral Cortex.
4. Warrier, C. M., & Zatorre, R. J. (2002). Influence of tonal context and timbral variation on perception of pitch. Perception & Psychophysics, 64(2), 198-207.
5. Sun, X. (2001). SHRP - a pitch determination algorithm based on Subharmonic-to-Harmonic Ratio (SHR) [MATLAB source code]. Available at http://www.mathworks.com/matlabcentral/fileexchange/1230-pitch-determination-algorithm (Accessed 1 June 2013). See also: Sun, X., "Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio". To appear in the Proc. of ICASSP2002, Orlando, Florida, May 13-17, 2002.
6. Johnson, M. (2010). Mixed (Between/Within) Subjects ANOVA. Available at http://www.mathworks.com/matlabcentral/fileexchange/27080-mixed-betweenwithin-subjects-anova (Accessed 8 June 2013).

Tables

Table 1
Subject Categorizations

Category      Group          Number
Gender        Male           14
              Female         11
Voice Part    Bass           8
              Tenor          6
              Alto           11
Experience    Non-musician   4
              Musician       16
              Singer         5
N = 25

Figures

Figure 1. Sine Tone Stimulus (1.0s) – Linear Fade Envelope at Edges (0.025s), Perceptual Loudness -6dB

Figure 2. Vocal Tone Stimulus (1.0s) – Linear Fade Envelope at Edges (0.025s), Perceptual Loudness -6dB, DC Offset Removed

Figure 3. Schematic of Experiment Task Flow
[Schematic. Phases: Q&A; Warm up; Practice trials (4); Actual trials (M = 20). Resources: online questionnaire; SATB part samples; instructions and practice stimuli set; actual stimuli set. Participant tasks: determine musician/singer category; choose SATB part; practice task, set headphones volume and ask questions; perform task. Researcher: present; not present.]

Figure 4. ‘PsychoPy’ Program – an Example of a Stimulus Trial

Figure 5. One Cycle of Pitch Reproduction Task
[Diagram. Cue (1.5s); Stimulus (1.0s); Response (5.0s); option to skip to next test.]

Figure 6. Mean Pitch Inaccuracy - Singers

Figure 7. Mean Pitch Inaccuracy - Musicians

Figure 8. Mean Pitch Inaccuracy - Non-musicians

Figure 9. Mean Pitch Inaccuracy – All Groups

Figure 10. Standard Deviation of Pitch Inaccuracy – All Groups