Speeds of Videos Considering Speeds of Objects and Background

Hayato Kumagai

Teruhisa Hochin

Hiroki Nomiya

Graduate School of Information Science
Kyoto Institute of Technology
Kyoto, Japan
e-mail: {hochin, nomiya}@kit.ac.jp

Abstract- This paper tries to obtain the speed of a video in which one or two objects and a background move at various speeds and in various directions. The speeds are measured in beats per minute (BPM). The relationship between the objects and the background is clarified through several experiments. When only objects move, the average speed of the objects becomes the speed of the video. When objects and the background move at the same time, the speed of the background influences the speed of the video. The influence of the background speed decreases as the object speed increases.

Keywords- speed; videos; object; background

I. BACKGROUND

In recent years, a wide variety of multimedia data has become available because of advances in the Internet, computers, and related technologies. Video posting sites such as YouTube allow ordinary people to upload their videos and music pieces. Videos and music pieces are generally combined with harmony in mind to make them more impressive. Harmonies between audio and visual materials have been studied [1]. It is said that semantic harmony and temporal harmony exist, and that they are important for the general harmony. If the harmony is very high, audiovisual materials tend to impress people and enhance the impression [1]. Synchronizing music pieces and videos is, however, very difficult. Moreover, people may have to find audio materials and/or visual ones before they try to match them with each other.

In order to synchronize a video and a music piece, the speed of the video must be determined. Determining the speed of a video is not easy when objects move in different directions or the background moves. Moreover, the sizes of objects and the patterns of the background may affect the speed of a video.

This paper tries to obtain the speed of a video clip. In this study, the speeds are measured in beats per minute (BPM). Objects and a background in a video move at various speeds and in various directions. The relationship between the objects and the background is clarified through several experiments. When only objects move, the average speed of the objects is the speed of the video. When objects and the background move at the same time, the speed of the background influences the speed of the video. We find that the more the speed of an object increases, the smaller the influence of the background becomes.

The remainder of this paper is organized as follows. Section II shows related works on the interaction of sight and hearing, and on the harmony of music and video clips. Section III clarifies the relationships between objects and the background of a video, and shows how to calculate the speed of a video from the speeds of the objects and that of the background. Section IV gives some considerations. Finally, Section V concludes this paper.

II. RELATED WORKS

A. Harmony of Music and Movie
It is said that semantic harmony and temporal harmony exist between music and video clips [1]. These two harmonies are explained as follows.

1) Semantic harmony: Semantic harmony is the harmony caused by the consistency of mood and the symbolic similarity of music pieces and video clips. A pair of a bright music piece and a bright video is an example of consistency of mood. A pair of a song about a pigeon and a video of a pigeon is an example of symbolic similarity.

2) Temporal harmony: Temporal harmony is the harmony of the temporal aspects of a music clip and a video clip. Temporal harmony can be divided into two categories: the harmony of accent structures and the harmony of speeds. The harmony of accent structures is the synchronization of the accents of a music clip and those of a video clip. The harmony of speeds is the fit between the tempo of a music clip and the speed of a video clip.

The relationships and the effects of these harmonies are shown in Fig. 1. Fig. 1 shows that the harmony of accent structures affects the degree of harmony and the impact of the audiovisual material. An intense part of a movement [5], a portion where the direction of a movement changes, or a portion where there is no movement is considered an accent of a video clip. A portion including large amplitude is considered an accent of a music clip. The starting and finishing positions of music and video clips are considered to be important for the accent structure.

Fig. 1. Effect of temporal harmony on the impression [1] (diagram labels: accent structure of sound, accent structure of the video, collation, synchronization feeling, impact, speed of movie, speed of music, balance of speed, light feeling, complexity)

Fig. 2. Examples of the three shot motion classes [3]

B. Classifying the Film
Adams classified film shots by their motion behavior into three classes: No Motion, Fluid Motion, and Staccato Motion [3]. The No Motion class is self-explanatory. A fluid transition is a continuous change from one scene to another: the change between two states takes place without jumps, and the textures change slowly. In contrast, staccato transitions draw attention to themselves by pronounced change; they are discrete and abrupt. Fig. 2 shows the magnitudes of the motions of three samples: No Motion, Fluid Motion, and Staccato Motion. The motion classes are defined as follows:

No Motion: $|\bar{m}| < T_1$
Fluid Motion: $\overline{\left|\partial m / \partial t\right|} < T_2$
Staccato Motion: $\overline{\left|\partial m / \partial t\right|} > T_2$

where $\bar{m}$ denotes an average motion vector, and $T_1$ and $T_2$ are thresholds. The classification of the motion behavior is divided into two steps. Shots are first labeled either No Motion or Motion. If they are Motion, they are then classified as either Fluid Motion or Staccato Motion.
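The two-step labeling described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of [3]: the function name `classify_shot`, the representation of a shot as a list of per-frame scalar motion values, the use of frame-to-frame differences as the first derivative, and the handling of values exactly equal to a threshold are all assumptions made for this sketch.

```python
from statistics import fmean


def classify_shot(motion, t1, t2):
    """Label a shot as No Motion, Fluid Motion, or Staccato Motion.

    motion: per-frame scalar motion values of one shot (hypothetical input,
            e.g. pan + tilt per frame); at least two samples are required.
    t1, t2: thresholds T1 and T2 (their values are not given in the text).
    """
    if len(motion) < 2:
        raise ValueError("need at least two motion samples")

    # Step 1: Motion vs. No Motion, using the magnitude of the average motion.
    if abs(fmean(motion)) < t1:
        return "No Motion"

    # Step 2: average magnitude of the first derivative, approximated here
    # by frame-to-frame differences (assumption of this sketch).
    diffs = [b - a for a, b in zip(motion, motion[1:])]
    mean_abs_derivative = fmean([abs(d) for d in diffs])

    # Assumption: a value exactly equal to T2 is treated as Staccato Motion.
    if mean_abs_derivative < t2:
        return "Fluid Motion"
    return "Staccato Motion"
```

Averaging the derivative magnitudes, rather than summing them, keeps the result comparable between shots of different lengths.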

The average motion, which is the sum of the pan and the tilt of the camera motion, is used to decide whether a shot has motion or not. The threshold is set high enough to absorb small object contributions and slight framing corrections, but low enough to identify real camera movements. The shots that contain motion are classified as Fluid or Staccato Motion by the following method. First, the first derivative of the motion is calculated for each shot. The average of its magnitude is then taken over the shot in order to normalize the values of shots that have different lengths. This value is compared with a threshold to decide whether the shot includes Fluid or Staccato Motion.

C. Rhythmic Information Analysis in Movies
Suzuki proposed a method to extract rhythmic information from a video clip [4]. There are two types of rhythm: a fixed rhythm and a free one. Almost all rhythms of videos are classified as the free rhythm and are characterized by their temporal accents. S. M. Eisenstein's montage theory is used to classify the rhythmic presentation elements of videos based on shot lengths, movements, texture, and their combination. In that theory, the montage of a video clip is divided into four types:
1) Metric montage: shot length
2) Rhythmic montage: movement of objects and camera
3) Tone montage: shadow / texture
4) Overtone montage: combination of 1) to 3)

In order to examine a video clip, shot lengths, optical flow, spatial frequency, and the brightness difference between two frames are obtained. In conclusion, however, spatial frequency alone is the most effective for analyzing a video clip. In the following, words that represent the size category of the subject, such as long shot (the subject is very far from the camera) and close-up (the subject is close to the camera), are referred to as the field size.

The film "Battleship Potemkin" is used to analyze the rhythm of a movie. From that film, the scenes "Odessa harbor" and "Odessa steps" are used.
The former is a quiet scene characterized by shading and texture. The latter is a fierce, dynamic scene in which close-ups create a dynamic sense of rhythm. The results of the spatial frequency analysis of each scene are shown in Fig. 3. In Fig. 3, the vertical axis and the horizontal one represent the power spectrum and the frame number, respectively. Fig. 3 (a) and (b) each contain seven line graphs, each of which shows the sum of the power spectrums of a certain frequency. In the scene titled "Odessa harbor," the power spectrum is generally stable except for the beginning and the end of the scene. In particular, the value of the high-frequency area increases around the 800th frame. It is considered that a close-up makes the focus clear. In the scene titled "Odessa steps," the value changes dynamically. The texture of the grand staircase plays the dominant role, and the visible particle size in each field size is reflected in the power spectrum. As an interesting characteristic, in the opening scene where a woman's umbrella approaches the camera, the power spectrum value draws a characteristic curve corresponding to that scene. Thus, when an object moves toward the camera, or the camera is operated, the value of the spatial frequency changes since the particle size of the texture continuously changes. The spatial frequency analysis is effective not only for the analysis of the texture but also for the analysis of the field size and the movement.

Fig. 3. Spatial frequencies of (a) "Odessa harbor" and (b) "Odessa steps" (vertical axis: power spectrum; horizontal axis: frame number; the lines range from the sum of power spectrums of the lowest frequency to that of the highest frequency)

III. SPEEDS OF VIDEOS

Speeds of videos are measured in beats per minute (BPM), that is, the number of beats per minute. The BPM is commonly used to measure the tempo of a music piece. Beats, however, include stepping and clapping in a broad sense. People may step or clap while watching a dance. In such a case, the beats are decided by the impression of the people. Therefore, using beats to measure the speed of a video is considered natural.

Some experiments were conducted to measure the BPM of videos. A 19.5-inch Dell monitor with a resolution of 1600×900 was used. Videos were created by using PowerPoint 2010 with the default slide size, whose width is 25.4 cm and height is 19.05 cm when a slide is printed. The size seen on a monitor differs if the resolution differs, so the unit is not shown in the following figures. In this experiment, the shape of an object is a circle. One of the authors, a male, served as the subject.

TABLE I
SPEEDS OF OBJECTS HAVING VARIOUS SIZES

Size of the circle    x     2x    3x    4x    5x    6x
BPM                   106   99    100   93    78    93

Fig. 4. BPM of various speeds of four kinds of backgrounds

A. Effect of the Size of an Object
First, the influence of the size of an object is examined. This is because it is said that the size of a moving object influences the impact of a video [1]. In the experiment, the size of a circle is changed in six steps. Supposing that the minimum size is x, the sizes are x, 2x, 3x, 4x, 5x, and 6x. Here, for example, 3x means three times as large as x. A circle moves from left to right in the horizontal direction. The BPM is measured by tapping a smartphone. The speed of the circle is 6.0 cm/sec. The result is shown in Table I. Almost all of the BPMs are around 100 regardless of the size of the circle. The result shows that there seems to be no relationship between the size of an object and its BPM.

B. BPM of an Object or a Background
Next, we examine the speed of the background. As the background, we use four kinds of patterns: polka dots, right diagonal stripes, horizontal stripes, and grid patterns. The speeds of these four backgrounds are changed. A background moves from left to right at 2.54 and 5.08 cm/sec, and from the bottom to the top at 1.95 and 3.81 cm/sec. The result of the experiment is shown in Fig. 4. The BPM depends on the speed of the background, whereas it does not depend on the kind of background. From here on, the grid pattern is used as the background.

Fig. 5. BPM of various speeds and directions of objects

Fig. 6. BPM of various speeds and two directions of a background

Fig. 7. The average of all values

Next, the speed of a circle is changed. In order to clarify the effect of the direction of the movement, a circle moves in three directions: vertical, horizontal, and circular. In the same way, the BPM of the background is also measured when the background moves. Its moving directions are horizontal and vertical. The results of the experiments on the circle and the background are shown in Fig. 5 and Fig. 6, respectively. From Fig. 5 and Fig. 6, it can be seen that the BPM increases with the speed, and that the BPM does not depend on the moving direction. The BPM of the circle and that of the background tend to take similar values. The average values of the BPMs of the circles moving in the horizontal, vertical, and circular directions, and of the BPMs of the backgrounds moving in the horizontal and vertical directions, are shown in Fig. 7. The BPM appears to increase linearly with the speed. The linear approximation line of the average values is also drawn in Fig. 7. It is formulated as Equation (1), where the speed is given in cm/sec.

$$\mathrm{BPM} = 11.5\,\mathit{speed} + 34.1 \qquad (1)$$

The BPM of only one moving object, or of a moving background, can be calculated through Equation (1).
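As a small illustration, Equation (1) can be written as a Python function. The function name `bpm_single` and the constant names are hypothetical; the coefficients are those of the linear approximation in Equation (1), which was fitted to measurements by a single subject and may not generalize.

```python
# Coefficients of the linear approximation in Equation (1).
SLOPE_BPM_PER_CM_PER_SEC = 11.5
INTERCEPT_BPM = 34.1


def bpm_single(speed_cm_per_sec):
    """BPM of one moving object or background, following Equation (1)."""
    return SLOPE_BPM_PER_CM_PER_SEC * speed_cm_per_sec + INTERCEPT_BPM


if __name__ == "__main__":
    # Speeds used later in the experiments (cm/sec).
    for speed in (3.83, 7.24, 8.47):
        print(f"{speed:5.2f} cm/sec -> {bpm_single(speed):.1f} BPM")
```

Rounded to one decimal place, the three speeds give 78.1, 117.4, and 131.5 BPM, which are the values quoted in the following subsections.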

C. BPM of Two Moving Objects
Next, we measure the BPM when two circles move at the same time. In this experiment, the two circles move at different speeds. This is because the speed of circles moving at the same speed is considered to be the same as that of one circle. The speeds adopted in the experiment are those of a circle crossing the screen in 3.0 and 6.6 seconds. These are 8.47 and 3.83 cm/sec, respectively, because the width of the screen is 25.4 cm as described before. Moreover, the two circles move either in the same direction or in opposite directions. In order to clarify the influence of the sizes, the size is changed in five steps. Circles of the same size and circles of different sizes are examined. For the different sizes, the size of one circle varies from x to 4x while that of the other is fixed to 5x, because there are too many combinations if the sizes of both circles are varied.

The result is shown in Table II. In Table II, the first (second) column shows the size of the left (right) circle. The third (fourth) column shows the speed of the left (right) circle. The fifth (sixth) column shows the BPM when the two circles move in the same direction (opposite directions). Regardless of the size and the direction, the BPM is around 100. This suggests the hypothesis that the BPM depends on neither the sizes nor the directions of the moving objects.

TABLE II
BPM OF TWO MOVING OBJECTS

Size of circle     Speed [cm/sec]     BPM
left   right       left   right       same   opposite
x      x           8.47   3.83        106    108
2x     2x          8.47   3.83        106    106
3x     3x          8.47   3.83        108    106
4x     4x          8.47   3.83        105    105
5x     5x          8.47   3.83        107    109
x      5x          8.47   3.83        109    107
2x     5x          8.47   3.83        110    106
3x     5x          8.47   3.83        106    107
4x     5x          8.47   3.83        100    97

From Equation (1), when the speeds are 8.47 and 3.83 cm/sec, the BPMs become 131.5 and 78.1, respectively. The average of these values is 104.8. This value is very close to those shown in Table II. This result gives the hypothesis that when two objects move at the same time, the average of their BPMs is the BPM of the video. This means that the speed of a video in which two objects move at different speeds $v_1$ and $v_2$ is the average of $v_1$ and $v_2$, as shown in Equation (2).

$$v_{video} = \frac{v_1 + v_2}{2} \qquad (2)$$
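The step from "average of the BPMs" to "average of the speeds" follows because Equation (1) is linear: averaging the two BPMs gives the same value as applying Equation (1) to the average speed. The following Python sketch checks this for the speeds used in the experiment; the function names are hypothetical.

```python
def bpm_from_speed(speed_cm_per_sec):
    """BPM of a single moving object, following Equation (1)."""
    return 11.5 * speed_cm_per_sec + 34.1


def video_speed_two_objects(v1, v2):
    """Speed of a video with two moving objects, following Equation (2)."""
    return (v1 + v2) / 2


v1, v2 = 8.47, 3.83  # cm/sec, the two speeds used in Table II

# Average of the two BPMs obtained from Equation (1).
mean_of_bpms = (bpm_from_speed(v1) + bpm_from_speed(v2)) / 2

# Equation (1) applied to the average speed of Equation (2).
bpm_of_mean_speed = bpm_from_speed(video_speed_two_objects(v1, v2))

if __name__ == "__main__":
    print(f"mean of BPMs:      {mean_of_bpms:.1f}")
    print(f"BPM of mean speed: {bpm_of_mean_speed:.1f}")
```

Both quantities come to about 104.8 BPM, the value compared with Table II in the text.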

D. The Effect of the Background Speed
Here, the speed of the circle is fixed, and that of the background is changed. The directions of the movement of the circle are horizontal, vertical, and circular. Those of the background are horizontal and vertical. When the circle and the background both move horizontally (vertically), they move in the same direction. To investigate the effect of the background speed when the circle moves at a fixed speed, the BPM was measured fifty-six or thirty times. The speed of the circle is 7.24 cm/sec. The results for the background moving in the horizontal and vertical directions are shown in Fig. 8 and Fig. 9, respectively.

Fig. 8. BPM of various speeds and horizontal direction of background

Fig. 9. BPM of various speeds and vertical direction of background

Fig. 10. The average values shown in Fig. 8 and Fig. 9

Fig. 11. The averages of the values shown in Fig. 10

The averages of the values shown in Fig. 8 and in Fig. 9 are shown in Fig. 10. These values are very close. The averages of the values shown in Fig. 10 are shown in Fig. 11. Fig. 11 shows the influence of the background speed on the BPM when the speed of the circle is 7.24 cm/sec. The linear approximation of the values shown in Fig. 11 is given in Equation (3).

$$\mathrm{BPM}_{video} = 3.6\,\mathit{speed}_{background} + 117.5 \qquad (3)$$

Here, the value of 117.5 is the BPM when the background does not move. This value can also be obtained from Equation (1): substituting 7.24 cm/sec into Equation (1) gives 117.4, which is almost the same as 117.5. This shows that Equation (1) agrees with Equation (3). The value of 3.6 is the slope of the line. It indicates the degree of the influence of the background speed. Equation (3) can be generalized to Equation (4).

$$\mathrm{BPM}_{video} = \mathit{Slope} \cdot \mathit{speed}_{background} + \mathrm{BPM}_{static\_background} \qquad (4)$$

Here, $\mathrm{BPM}_{static\_background}$ is the BPM of a circle when the background does not move, which can be calculated by Equation (1).
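Equation (4) can be sketched in Python as follows, with the slope passed in as a parameter. The function names are hypothetical, and the intercept is computed from Equation (1) rather than taken from the fitted value of 117.5 in Equation (3), so results differ slightly from Equation (3).

```python
def bpm_static_background(speed_object):
    """BPM of an object over a static background, following Equation (1)."""
    return 11.5 * speed_object + 34.1


def bpm_video(slope, speed_background, speed_object):
    """BPM of a video with a moving object and background, per Equation (4).

    slope: degree of influence of the background speed (3.6 in Equation (3),
           measured for an object speed of 7.24 cm/sec).
    Speeds are in cm/sec.
    """
    return slope * speed_background + bpm_static_background(speed_object)


if __name__ == "__main__":
    # Object at 7.24 cm/sec with the slope of Equation (3).
    for speed_background in (0.0, 2.5, 5.0):
        bpm = bpm_video(3.6, speed_background, 7.24)
        print(f"background {speed_background:.1f} cm/sec -> {bpm:.1f} BPM")
```

With a static background the function reduces to Equation (1), which mirrors the agreement between Equations (1) and (3) noted above.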

E. Total Effect
Similar experiments were conducted for various speeds of a circle: 2, 4, 6, 7.24, 8, 10, 12, and 14 cm/sec. The values of the slopes obtained are shown in Fig. 12. The value of the slope decreases as the speed of the circle increases. The linear approximation line shown in Fig. 12 is formulated in Equation (5).

$$\mathit{Slope} = -0.23\,\mathit{speed}_{object} + 5.45 \qquad (5)$$

Fig. 12. Values of slopes against speeds of a circle

By using Equation (5), the value of the slope can be obtained once the speed of an object is known. If the speed of the background is also determined, the BPM of the video can be calculated by using Equation (4).

F. Application Example
Let us consider the case where the speed of an object is 5 cm/sec and that of the background is 13 cm/sec. In this case, the value of the slope is obtained as 4.3 from Equation (5). The BPM of the object when the background does not move is obtained as 91.6 by using Equation (1). Therefore, the BPM of this case can be calculated as shown in Equation (6).

$$\mathrm{BPM}_{video} = 4.3\,\mathit{speed}_{background} + 91.6 \qquad (6)$$

As the speed of the background is 13 cm/sec in this case, the BPM of the video becomes 4.3 × 13 + 91.6 = 147.5.

IV. CONSIDERATION

The experiment showed that the size of an object did not affect its BPM. This may be natural because the speed of the object is the same. The BPM of an object and that of a background can be obtained by using the same equation, which is Equation (1). This shows that there is no distinction between an object and a background in calculating the speed. The speed of a video in which two objects move seems to be the average speed of the two objects. Other experiments using more objects and various speeds are required. The influence of the background speed decreases as the object speed increases. This means that an object moving fast dominates the speed of a video. When the speed of an object becomes slow, the speed of the background affects the speed of the video. This may agree with our impression of a video.

V. CONCLUSION

This paper tried to obtain the speed of a video. In this study, we measured the speeds in BPM. Objects and backgrounds in videos moved in different directions. We clarified the relationship between objects and backgrounds through several experiments. When only objects move, the average speed of the objects becomes the speed of the video. When objects and the background move at the same time, the speed of the background influences the speed of the video. We found that the more the speed of an object increases, the smaller the influence of the background becomes.

In the experiments, we used only simple videos in which a circle moves in the horizontal, vertical, or circular direction. Many other patterns of movement should be tried. The speeds of a circle and the background were constant within each video. Acceleration, deceleration, stops, and so on should also be tried. Only one subject joined the experiment. Experiments with many subjects should be conducted. These are all included in future work.

In general, when people create audiovisual materials, they would like to create more complex materials such as an animation, a film, or a video in which characters or people move in complicated ways, and in which the background contains objects such as flowers, trees, and clouds moving in various directions. Our goal is to obtain the speed of these videos. This will help people who are eager to create excellent audiovisual materials retrieve appropriate videos.

ACKNOWLEDGEMENT
This work was partially supported by JSPS KAKENHI Grant Number 16K00370.

REFERENCES
[1] S. Iwamiya, Multimodal communication of music and video, University Kyushu Publication, Fukuoka, 2000.
[2] C/C++ language sample program for tempo analysis of an audio file (online), (reference February 15, 2016).
[3] B. Adams, "Automated film rhythm extraction for scene analysis," Proc. of IEEE Int'l Conf. on Multimedia and Expo, 2001.
[4] R. Suzuki, "Rhythmic information analysis in movies," Unisys technology review, vol. 71, 2001.
[5] T. Hochin, W. Xue, "Synchronizing Music and Video of Query Results in Cross-Media Retrieval System," KES 2007, LNAI 4693, pp. 793-800, 2007.
[6] H. Kumagai, "Synchronization Method for Improving for Temporal Harmony of Music and Video Clips," Proc. of the 3rd Int'l Conf. on Applied Computing & Information Technology (ACIT 2015), pp. 251-256, 2015.