Bayesian and Unsupervised Machine Learning for Jazz Music Analysis

Qiuyi Wu, Ernest Fokoue
Rochester Institute of Technology

Objectives
▶ Can music deliver information tantamount to text?
▶ To what extent do people grasp the meaning a composer expresses in each piece of music?
▶ Why does music from diverse cultures evoke so many different feelings?
▶ What similarities exist between music from different cultures, composers, or genres?

Representation

Text   | letter | word   | topic | document | corpus
Music  | note   | notes* | chord | song     | album

*Notes in each beat can be regarded as a "word."

▶ L: number of notes in each measure
▶ N: number of measures in each song
▶ M: number of songs in the whole album
▶ K: number of harmonies (key-profiles) = 24

Generative Process
1. Draw θ ∼ Dirichlet(α).
2. For each harmony k ∈ {1, ..., K}:
   • Draw β_k ∼ Dirichlet(η).
3. For each measure u_n (the notes in the nth measure) of song m:
   • Draw a harmony z_n ∼ Multinomial(θ).
   • Draw the pitches of the nth measure: x_n | z_n ∼ Multinomial(β_{z_n}).
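A minimal sketch of this generative process in Python, for illustration only (not the authors' implementation); the values of α, η, N, and L below are toy assumptions, while K = 24 and the 12 pitch classes follow the notation above.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 24   # number of harmonies (key-profiles)
V = 12   # pitch classes
N = 16   # measures in one song (toy value)
L = 8    # notes per measure (toy value)

alpha = np.full(K, 0.5)   # assumed symmetric Dirichlet prior over harmonies
eta = np.full(V, 0.5)     # assumed symmetric Dirichlet prior over pitch classes

# Step 2: one key-profile (distribution over the 12 pitch classes) per harmony.
beta = rng.dirichlet(eta, size=K)            # shape (K, V)

# Step 1: harmony (chord) proportions for a single song.
theta = rng.dirichlet(alpha)                 # shape (K,)

# Step 3: for each measure, draw a harmony, then draw its notes.
song = np.empty((N, V), dtype=int)
for n in range(N):
    z_n = rng.choice(K, p=theta)             # harmony of the n-th measure
    song[n] = rng.multinomial(L, beta[z_n])  # pitch-class counts of the L notes

print(song.shape)   # (N, 12): one toy measure-note matrix
```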

Introduction
Topic modeling is a text-mining method for detecting the "topics" hidden in a collection of documents. Here we borrow this tool for music mining, aiming to discover the hidden harmonic structure of music pieces. Both the Bayesian approach and modern unsupervised learning via latent Dirichlet allocation have previously been applied to traditional Western musical scores and audio tracks. In this work, we extend that line of research from classical to improvisational (jazz) music and show how this unsupervised probabilistic model works on music collections of 7 jazz giants.

Figure: Top 10 Series of Notes for Each Topic

Track the improvisational or solo part from the audio waveform:
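One way to obtain beat-level chroma features from a waveform is sketched below with librosa; the library choice and the file name solo_track.wav are assumptions, since the poster does not name its audio toolchain.

```python
import numpy as np
import librosa

# Hypothetical audio file; the poster does not specify format or path.
y, sr = librosa.load("solo_track.wav")

# 12-dimensional chroma (pitch-class energy) per analysis frame.
chroma = librosa.feature.chroma_stft(y=y, sr=sr)        # shape (12, n_frames)

# Aggregate frames within each beat so that one chroma vector roughly
# corresponds to one "word" (beat) in the model.
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_chroma = librosa.util.sync(chroma, beat_frames, aggregate=np.mean)

print(beat_chroma.shape)   # (12, n_beats)
```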

Takeaways
▶ Music and text are both information carriers and emotion deliverers.
▶ Harmonic chords can be learned via topic modeling in the music space.
▶ The human emotion hidden in music can be detected and extracted through the probabilistic model.

Model Comparison

Text Mining [4]: the standard LDA plate diagram (hyperparameters α and η; per-document topic proportions θ; per-word topic assignments z; topics β; observed words w; plates N, M, K), with joint distribution

\[
p(\theta, z, w \mid \alpha, \beta, \eta) = \prod_{k=1}^{K} p(\beta_k \mid \eta) \prod_{m=1}^{M} p(\theta_m \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta_m)\, p(w_n \mid z_n, \beta).
\]

Application
▶ Main jazz giants studied: Duke Ellington, Miles Davis, John Coltrane, Charlie Parker, Sonny Rollins, Louis Armstrong, Thelonious Monk.
▶ Input data (see the sketch below):
  • Use mxl files to extract the notes in each measure.
  • Based on the concept of duration (the length of time a pitch is sounded), and since the total duration of each measure is fixed, we can create a measure-note matrix for each song.
  • Collect the tensor of such matrices to create an album (the corpus, in text-mining terms) containing the 7 musicians' songs.
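A minimal sketch of how one such measure-note matrix could be built from an mxl file, assuming the music21 parser and a hypothetical file name (the poster does not state which toolkit was used):

```python
import numpy as np
from music21 import converter

# Hypothetical MusicXML (mxl) file name.
score = converter.parse("charlie_parker_tune.mxl")
part = score.parts[0]
measures = list(part.getElementsByClass("Measure"))

# Rows: measures; columns: the 12 pitch classes (music21's pitchClass: 0 = C, ..., 11 = B).
X = np.zeros((len(measures), 12))

for n, measure in enumerate(measures):
    for element in measure.recurse().notes:   # Note and Chord objects
        for p in element.pitches:             # .pitches works for both Note and Chord
            # Weight each pitch class by its duration (in quarter lengths),
            # following the duration-based representation described above.
            X[n, p.pitchClass] += element.duration.quarterLength

print(X.shape)   # the measure-note matrix for one song
```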

Music Mining

Figure: Measure-Note Matrix from a Charlie Parker song

Notation (graphical model):
▶ u: notes (observed)
▶ z: chord per measure (hidden)
▶ θ: chord proportions for a song (hidden)
▶ α: parameter controlling the chord proportions
▶ β: key-profiles [2]
▶ η: parameter controlling the key-profiles

Figure: Intuition Behind LDA in Music Piece

The joint distribution of the music-mining model is

\[
p(\theta, z, x \mid \alpha, \beta, \eta) = \prod_{k=1}^{K} p(\beta_k \mid \eta) \prod_{m=1}^{M} p(\theta_m \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta_m)\, p(x_n \mid z_n, \beta),
\]

where
▶ x_n is a V × 1 indicator vector recording which of the 12 pitch classes {A, A#, B, ..., G#} the notes in the nth measure come from;
▶ z_n ∈ {A major, F minor, ..., Eb major} indexes one of the 24 key-profiles; as an indicator, z_{n,i} = 1 for exactly one i.
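A minimal sketch of fitting a model of this kind, using scikit-learn's LatentDirichletAllocation as a stand-in for the authors' inference procedure (which the poster does not specify); as a simplification, each measure is treated as its own "document" over the 12-pitch-class vocabulary, so the fitted proportions are per measure rather than per song, and the priors below are assumed toy values.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Toy stand-in for the real corpus: stack the measure-note count matrices of
# the 7 musicians' songs; rows are measures, columns the 12 pitch classes.
rng = np.random.default_rng(0)
X_album = rng.integers(0, 5, size=(7 * 32, 12))

lda = LatentDirichletAllocation(
    n_components=24,        # K = 24 harmonies (key-profiles)
    doc_topic_prior=0.5,    # alpha (assumed value)
    topic_word_prior=0.5,   # eta (assumed value)
    random_state=0,
)
theta_hat = lda.fit_transform(X_album)   # per-measure harmony proportions
beta_hat = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

print(theta_hat.shape, beta_hat.shape)   # (n_measures, 24) and (24, 12)
```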

Application (Work in Progress)

Figure: Four measures (2 frames per measure) of the tonality animation in the song [1]

For improvisational tracking from audio, following Diane Hu et al. [3], there is one more step: draw the chroma vector c_n from

\[
p(c_n \mid u_n, A) = \frac{1}{\sqrt{(2\pi)^p \lvert \Sigma \rvert}} \exp\!\left( -\frac{1}{2}\,(c_n - A u_n)^{\top} \Sigma^{-1} (c_n - A u_n) \right).
\]
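A direct transcription of this Gaussian density into numpy, with A, Σ, u_n, and c_n replaced by toy stand-ins (the poster does not give their values, only that c_n is p-dimensional); a sketch for illustration only.

```python
import numpy as np

def chroma_log_likelihood(c_n, u_n, A, Sigma):
    """log p(c_n | u_n, A): Gaussian with mean A @ u_n and covariance Sigma."""
    p = c_n.shape[0]
    diff = c_n - A @ u_n
    _, logdet = np.linalg.slogdet(Sigma)
    quad = diff @ np.linalg.solve(Sigma, diff)
    return -0.5 * (p * np.log(2.0 * np.pi) + logdet + quad)

# Toy stand-ins: p = 12 chroma bins, u_n a one-hot pitch-class indicator.
rng = np.random.default_rng(0)
p = 12
u_n = np.zeros(p)
u_n[rng.integers(p)] = 1.0
A = np.eye(p)                                   # assumed template matrix
Sigma = 0.1 * np.eye(p)                         # assumed noise covariance
c_n = A @ u_n + rng.normal(scale=0.1, size=p)   # simulated chroma observation

print(chroma_log_likelihood(c_n, u_n, A, Sigma))
```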

References
[1] P. Toiviainen and T. Eerola. MIDI Toolbox 1.1. https://github.com/miditoolbox/, 2016.
[2] C. L. Krumhansl. Cognitive Foundations of Musical Pitch. Oxford University Press, 1990.
[3] D. Hu and L. K. Saul. A Probabilistic Topic Model for Music Analysis. International Society for Music Information Retrieval (ISMIR), 2009.
[4] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 2003.

Contact Information
▶ Web: https://qiuyiwu.github.io/
▶ Email: [email protected]
▶ Phone: +1 (585) 520 4347