“Motion-Based Machine Vision Techniques for the Management of Large Crowds”, IEEE 6th International Conference on Electronics, Circuits and Systems (ICECS '99), September 5-8 1999, pp. 961-964, Cyprus

MOTION-BASED MACHINE VISION TECHNIQUES FOR THE MANAGEMENT OF LARGE CROWDS

B. A. Boghossian and S. A. Velastin
Department of Electronic Engineering, King's College London
Strand, London WC2R 2LS, England, UK
{boghos.boghossian, sergio.velastin}@kcl.ac.uk

ABSTRACT

The early detection and prevention of crowd-related emergencies are the main aims of large crowd management Closed Circuit Television (CCTV) systems. However, environmental and psychological factors degrade the performance of CCTV operators and so pose a threat to crowd safety. We present computer vision techniques that estimate the paths and directions of crowd flows in CCTV images and improve the perception of scene dynamics by offering on-line illustrations. Moreover, we present motion-based algorithms to detect crowd-related emergencies and assist CCTV operators in ensuring crowd safety.

1. INTRODUCTION

Closed Circuit Television (CCTV) systems are widely employed by police and other local authorities to monitor public events that involve crowd interactions in confined areas. The early detection, and hence the prevention, of crowd-related emergencies is the main aim of CCTV operators. This task is not trivial because the images captured by distant CCTV cameras are often too small to meet the needs of the investigators. Zooming is therefore used to scan areas of interest in detail, which sacrifices the advantage of comprehensive site coverage. Moreover, the crowd flows present in the scene are very difficult to trace visually because crowd movements are slow and restricted by overcrowding. Figure 1 shows a typical CCTV image used by the Metropolitan Police for crowd management.

This paper discusses the use of machine vision techniques to capture crowd flows and display visual information to assist CCTV operators in understanding the dynamics of the scene under observation. Moreover, we consider the automated detection of crowd-related emergencies by performing higher level processing on the measured flow paths and their directions. Hence, we present motion-based computer vision techniques that detect:

- circular flow paths close to site exits, indicating trapped crowds;
- crowd flow diverging from a point in all directions, which might indicate a potential danger (fights, fire etc.);
- obstacles in the flow paths that might correspond to injured pedestrians or deliberate flow disturbances.

By adopting a motion-based approach, all processing is performed in motion space. Transformation from image space to motion space is achieved by on-line block-matching motion detection that is realised via specialised hardware. This approach benefits from:

- a 128-fold reduction of the input data, and consequently of the processing cost, where motion detection is performed with blocks of 8x8 pixels on pairs of video frames;
- an increase in the dimensionality of the input data to include speed and direction information.

The paper is organised as follows: section 2 presents the system architecture, section 3 illustrates the methods adopted to extract the crowd flow paths and their directions, section 4 presents the incident detection algorithms developed and section 5 discusses the experimental results.
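As a rough check of the 128-fold data reduction quoted in the introduction, the short sketch below counts the pixel values consumed per motion vector. It assumes one reading of the figure, namely that each motion vector summarises one 8x8 block taken from each of the two frames in a pair; the function names are hypothetical and the 512x512 image size is the one stated in section 2.

```python
def pixels_per_vector(block_size=8, frames_per_pair=2):
    """Pixel values summarised by a single motion vector (assumed reading)."""
    return block_size * block_size * frames_per_pair


def motion_field_size(width, height, block_size=8):
    """Number of motion vectors across and down for non-overlapping blocks."""
    return width // block_size, height // block_size


cols, rows = motion_field_size(512, 512)
input_values = 2 * 512 * 512          # two frames of 512x512 pixels
print(cols, rows)                     # size of the motion field
print(input_values // (cols * rows))  # pixel values per motion vector
print(pixels_per_vector())            # same ratio computed per block
```

Under this reading, the ratio is independent of the image size and depends only on the block size and the number of frames compared.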

2. SYSTEM ARCHITECTURE

Figure 1: New Year's eve celebrations at Trafalgar square in London (a typical CCTV camera image employed for crowd management).

The system consists of a Pentium 166 PC fitted with a black and white video digitiser (256 grey scale) and a motion detection board developed by the authors [1][2]. Figure 2 shows the operation configuration. We realise real-time block-matching motion detection via specialised hardware that operates on images of dimensions 512x512 pixels. Image processing algorithms are applied to the continuously updated motion vectors to perform feature extraction from motion. The aspect of real-time processing is vital in this application, where prompt detection of emergencies is important to allow effective action to be taken. The processing rate for the system presented varies between 6 and 16 Hz depending on the complexity of the algorithm running. Due to the slow rate of change in scene events and the nature of the situations of interest, these figures are considered to be within the real-time processing requirement.

Figure 2: System architecture (diagram labels: Camera; PX 500; STi 3220; Pentium 166 MHz; Video Digitizer; Motion Detector; Image Processor; improved display and detection alarms; CCTV operator).

3. DETECTION AND SEGMENTATION OF CROWD FLOW

Other researchers in the field of machine vision [5-9] have investigated the automated estimation of crowd motion, flow, density and behaviour. Detection of abnormal behaviour, estimation of crowd density and classification of motion directions are the most common objectives of such systems. However, most of these systems assume normal or low crowding levels and all of them operate on scenes with small crowds in indoor environments. Very limited literature exists on the management of large crowds, and most of that work falls in the category of crowd safety research rather than automated on-line crowd management.

The interaction of large crowds in highly crowded environments can be classified as non-rigid elastic motion whose only constraint is some degree of continuity [3][4]. Here we adopt a non-model-based approach and assume a small degree of continuity in the crowd flow to allow noise suppression. We use the block-matching motion detection technique to track image pixels as features for correspondence through successive frames. Motion detection is performed on two video frames that are separated in time by several frames (typically seven) to capture the very slow movements of individuals within the crowd. This technique is highly sensitive to image noise because of the Mean Absolute Error (MAE) criterion used to establish pixel correspondences. Moreover, motion inconsistencies within the flow arise from variations in the velocities and directions of individuals within the crowd.

As a first stage after motion detection, we eliminate outlying motion vectors by assuming some degree of smoothness and applying a robust spatial filter with a 3x3 window. Some sudden disturbances due to signal glitches or objects obstructing the field of view were experienced in the video sequences examined; therefore a stabilising stage was introduced to ignore any global or sudden changes in the motion information. Motion inconsistencies are smoothed by nonlinear temporal operations that accumulate the motion information from successive iterations and perform neighbourhood smoothing. Finally, region-growing segmentation is employed to separate different flows based on their directions and positions. Colour coding is used to label different crowd groups and their directions, and “arrows” (or needles) are superimposed on the digitised CCTV images to illustrate scene dynamics on-line, as shown in Figure 3.
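The first two stages described above, block matching with an MAE criterion followed by a 3x3 spatial filter, can be illustrated in software. The sketch below is not the authors' hardware implementation: the exhaustive search, the search range of 4 pixels, the component-wise median used as the "robust" filter and all function names are assumptions made for illustration. Frames are plain lists of grey-level rows.

```python
from statistics import median


def block_mae(prev, curr, y0, x0, dy, dx, block):
    """Mean Absolute Error between a block in prev and a displaced block in curr."""
    total = 0
    for i in range(block):
        for j in range(block):
            total += abs(prev[y0 + i][x0 + j] - curr[y0 + dy + i][x0 + dx + j])
    return total / (block * block)


def block_match(prev, curr, block=8, search=4):
    """Return one (dy, dx) displacement per block, chosen by minimum MAE."""
    h, w = len(prev), len(prev[0])
    field = []
    for y0 in range(0, h - block + 1, block):
        row = []
        for x0 in range(0, w - block + 1, block):
            # Zero displacement is the baseline, so ties favour "no motion".
            best, best_err = (0, 0), block_mae(prev, curr, y0, x0, 0, 0, block)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    inside = (0 <= y0 + dy and y0 + dy + block <= h
                              and 0 <= x0 + dx and x0 + dx + block <= w)
                    if not inside:
                        continue  # candidate block falls outside the frame
                    err = block_mae(prev, curr, y0, x0, dy, dx, block)
                    if err < best_err:
                        best, best_err = (dy, dx), err
            row.append(best)
        field.append(row)
    return field


def median_filter_3x3(field):
    """Component-wise 3x3 median (an assumed choice of robust spatial filter)."""
    rows, cols = len(field), len(field[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Border cells reuse their nearest neighbours (edge replication).
            window = [field[min(max(r + i, 0), rows - 1)][min(max(c + j, 0), cols - 1)]
                      for i in (-1, 0, 1) for j in (-1, 0, 1)]
            out_row.append((median(v[0] for v in window),
                            median(v[1] for v in window)))
        out.append(out_row)
    return out
```

In the setting of the paper, `prev` and `curr` would be two digitised frames separated by the chosen frame gap (typically seven frames), and the filtered field would feed the stabilising and temporal smoothing stages.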

4. DETECTION OF CROWD-RELATED EMERGENCIES

The performance of CCTV systems in detecting potentially dangerous situations relies on the performance and experience of the people operating them. CCTV operators are certainly fallible and, according to research carried out at the UK’s Home Office [10], their attention span lies between thirty and fifty minutes, after which “video-blindness” degrades their performance. Therefore, automated detection of crowd-related emergencies will assist CCTV operators, improving both system performance and crowd safety. The following subsections present the algorithms developed for the detection of potentially dangerous events.

4.1 Detection of circular flow paths

Circular flow paths originate close to scene exits when large crowds attempt to evacuate the scene. Scene exits act as bottlenecks for the large crowd flow; thus, when different flows meet at exits, some are pushed back into the scene, creating circular flow paths. We present a circular motion detection technique based on a computationally inexpensive Hough voting scheme that identifies the centre of the circular motion as a peak in Hough space. The optical flow field is allowed to be noisy and partly incorrect, since the Hough transform is robust towards such data.

A motion vector belongs to a circular flow if it is a tangent to the circle and points in the flow direction. In other words, the normal to the tangent vector passes through the centre of the circular flow. To detect the clockwise and anticlockwise circular flows separately, two accumulator spaces are needed. Therefore, each motion vector in the measured crowd flow field contributes a line to each of the two Hough spaces. These lines are normal to the direction of the corresponding motion vector, with angles of -90° and +90° for the clockwise and the anticlockwise spaces respectively. The length of these lines is set equal to the radius of the largest expected circular flow to reduce ambiguity. As peaks in the Hough space are detected, audible alarms are triggered with a graphical indication of the position of the detected incident. Figure 3 captures a clockwise circular flow and shows the detected peak in Hough space. At each cycle, the two Hough spaces are updated, peaks are identified and the “HOUGH SPACE” window is redrawn in 12 milliseconds, allowing prompt detection of any emergency situation.

4.2 Detection of diverging flows

The outbreak of local threats to crowd safety, like fights or fire, is manually detected by spotting the crowd flow diverging from that location. Technically, the same approach as in subsection 4.1 can be employed to detect these flow patterns. A peak in Hough space represents the centre of a diverging flow if each motion vector in the measured crowd flow field maps to a line extending in the direction opposite to the vector. In other words, all motion vectors pointing away from a point contribute to a peak in Hough space at that point. This allows us to detect such diverging motion patterns and alert the operators via audible and visual means.

4.3 Detection of obstacles

Stationary pedestrians or objects obstructing the flow paths appear as motion-free regions surrounded by homogeneous flow. We perform a region-growing segmentation to group the motion-free regions in the scene, thus marking the isolated stationary regions as obstacles in the flow path. Figure 3 shows the detection of police flow control units as obstacles in the crowd flow. A delay in the detection of such regions is introduced to calculate a confidence measure that reduces uncertainty in the detection, hence reducing the false positive detection rate and improving reliability.
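The Hough voting scheme of subsections 4.1 and 4.2 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: each motion vector is rotated by a fixed angle (-90° for the clockwise space, +90° for the anticlockwise space, 180° for diverging flow) and casts votes along a line of length `max_radius`. The function names, the accumulator resolution of one cell per unit, and the synthetic ring generator are assumptions. The angle signs follow the paper when the y axis points upwards; in image coordinates with y pointing down the clockwise and anticlockwise senses swap.

```python
import math


def hough_votes(vectors, width, height, angle_deg, max_radius):
    """Accumulate votes along each motion vector rotated by angle_deg.

    vectors: iterable of (x, y, vx, vy) in accumulator units, y axis up.
    """
    acc = [[0] * width for _ in range(height)]
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    for x, y, vx, vy in vectors:
        norm = math.hypot(vx, vy)
        if norm == 0:
            continue  # stationary blocks cast no votes
        ux = (vx * cos_a - vy * sin_a) / norm
        uy = (vx * sin_a + vy * cos_a) / norm
        visited = set()  # each vector votes at most once per cell
        for step in range(1, max_radius + 1):
            cx, cy = round(x + ux * step), round(y + uy * step)
            if 0 <= cx < width and 0 <= cy < height and (cx, cy) not in visited:
                visited.add((cx, cy))
                acc[cy][cx] += 1
    return acc


def find_peak(acc):
    """Return (x, y, votes) of the strongest accumulator cell."""
    votes, x, y = max((v, x, y) for y, row in enumerate(acc)
                      for x, v in enumerate(row))
    return x, y, votes


def synthetic_ring(cx, cy, radius, kind, n=8):
    """Synthetic test flow: n unit vectors placed on a circle around (cx, cy)."""
    vectors = []
    for k in range(n):
        phi = 2 * math.pi * k / n
        c, s = math.cos(phi), math.sin(phi)
        vx, vy = {"anticlockwise": (-s, c),
                  "clockwise": (s, -c),
                  "diverging": (c, s)}[kind]
        vectors.append((cx + radius * c, cy + radius * s, vx, vy))
    return vectors


flow = synthetic_ring(10, 10, 5, "anticlockwise")
print(find_peak(hough_votes(flow, 21, 21, 90, 8)))
```

A practical detector would also need a vote threshold before raising an alarm; the paper does not state one, so none is assumed here.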

5. EXPERIMENTAL RESULTS AND DISCUSSION

The performance of the proposed system was evaluated on live-VTR (pre-recorded) video sequences. It should be noted that a VTR source normally has a worse signal-to-noise ratio (SNR) than a live camera (we have typically measured a reduction in SNR of 9 dB at playback). Thus evaluation using a VTR provides a realistic worst-case scenario. Moreover, live-camera video sequences were used as a means to generate synthetic test data.

The accuracy of the detected crowd flows is evaluated as a function of the number of skipped video frames in the motion detection scheme. For the scene under observation, selecting a gap of 280 milliseconds (which corresponds to 7 skipped video frames in CCIR/PAL mode) between the two captured video frames yields accurate representations of the available flows. Automation could be introduced by engaging a mechanism that adapts the detection gap length to the available motion in the scene, but this aspect is beyond the scope of this paper. By selecting the minimum allowed flow size (in pixels), the observer can control the level of detail in which the crowd flow information is represented, hence concentrating on the dominant flows in the scene and ignoring the less significant information.

The emergency incident detection algorithms could not be exhaustively evaluated on real examples because these incidents are uncommon, so only a few examples are available. Hence, synthetic data is also used to complement the evaluation process and give measures of reliability and performance. Each of the proposed algorithms is tested on real and synthetic data, and their performance and drawbacks are discussed as follows:

- The circular flow detection process does not perform well in cases where the crowd flows circulate around elliptical paths. These paths might be observed due to either the viewing perspective or other environmental factors. In such cases the otherwise sharp peaks in Hough space are scattered, causing detection uncertainties. However, none of the real examples tested presented elliptical flow paths, hence all were detected accurately by the system.
- The performance of the diverging crowd detection process was evaluated on synthetic data only, since no real examples were available. At this stage of evaluation, the algorithm detected all incidents correctly.
- The confidence measure introduced to reduce uncertainty in the detection of obstacles in the flow path (section 4.3) has successfully eliminated the false detection cases. However, the algorithm performance was degraded by the increase in the false negative (missed detection) rate due to the delay introduced in the detection. Therefore, a direct trade-off exists between the level of confidence offered and the delay in the detection, making the fine-tuning of these parameters dependent on scene parameters and end-user requirements.
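The trade-off between confidence and detection delay in obstacle detection (section 4.3) can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' algorithm: motion-free cells are grouped by 4-connected region growing, a region counts as "isolated" when it does not touch the border of the motion field (a hypothetical test), and the confidence measure is approximated by requiring a cell to persist for `delay` consecutive iterations. All names are hypothetical.

```python
def stationary_regions(moving):
    """Group motion-free cells (False in `moving`) into 4-connected regions."""
    rows, cols = len(moving), len(moving[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if moving[r][c] or seen[r][c]:
                continue
            stack, region = [(r, c)], []
            seen[r][c] = True
            while stack:  # grow the region from the seed cell
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not moving[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            regions.append(region)
    return regions


def is_isolated(region, rows, cols):
    """Assumed isolation test: the region does not touch the field border."""
    return all(0 < y < rows - 1 and 0 < x < cols - 1 for y, x in region)


class ObstacleDetector:
    """Report a cell only after `delay` consecutive iterations as an obstacle."""

    def __init__(self, delay):
        self.delay = delay
        self.age = {}  # cell -> consecutive iterations flagged

    def update(self, moving):
        rows, cols = len(moving), len(moving[0])
        current = {cell
                   for region in stationary_regions(moving)
                   if is_isolated(region, rows, cols)
                   for cell in region}
        # Cells that are no longer flagged lose their accumulated age.
        self.age = {cell: self.age.get(cell, 0) + 1 for cell in current}
        return sorted(cell for cell, n in self.age.items() if n >= self.delay)
```

A larger `delay` suppresses short-lived motion-free regions, which reduces false alarms, but it also postpones or misses obstacles that do not persist long enough, which is the trade-off discussed above.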

6. SUMMARY AND CONCLUSIONS

The flow of large crowds in CCTV images is estimated and flow paths are illustrated on-line to assist observers in understanding scene dynamics. Moreover, automated detection of crowd-related emergencies is considered as a step towards improving the overall performance of CCTV systems. The algorithms presented proved practical in terms of prompt detection of incidents and performance reliability. However, live demonstrations on site would contribute more to the understanding of the system performance.


7. ACKNOWLEDGEMENTS

The authors would like to thank the Metropolitan Police Service for providing the necessary video footage to design and test the machine vision algorithms presented.

8. REFERENCES

[1] B. A. Boghossian, S. A. Velastin, ‘Real-time motion detection of crowds in video signals’, IEE Colloquium on High Performance Architecture for real-time image processing, pp. 12/1-12/6, February 1998.
[2] B. A. Boghossian, ‘Real-time motion detection in video signals’, MSc thesis, Department of Electronic Engineering, King’s College London, October 1997.
[3] C. Kambhamettu, D. B. Goldgof, D. Terzopoulos, T. S. Huang, ‘Nonrigid motion analysis’, in Handbook of PRIP: Computer vision, vol. 2, 1994.
[4] J. K. Aggarwal, Q. Cai, W. Liao, B. Sabata, ‘Articulated and elastic motion: A review’, Proc. of the Workshop on Motion of Non-rigid and Articulated Objects, pp. 2-14, November 1994, Austin, Texas.
[5] A. C. Davies, J. H. Yin, S. A. Velastin, ‘Crowd Monitoring Using Image Processing’, Electronics & Communication Engineering Journal, vol. 7, no. 1, pp. 37-47, February 1995.
[6] S. Bouchafa, D. Aubert, S. Bouzar, ‘Crowd Motion Estimation and Motionless Detection in Subway Corridors by image processing’, IEEE Conference on Intelligent Transportation Systems 1997 (ITSC’97), pp. 332-337.
[7] S. Regazzoni, A. Tesei, V. Murino, ‘A real-time vision system for crowding monitoring’, Proceedings of the International Conference on Industrial Electronics 1993 (IECON’93), vol. 3, pp. 1860-1964.
[8] T. Coianiz, M. Boninsegna, B. Caprile, ‘A Fuzzy Classifier for Visual Crowding Estimates’, IEEE International Conference on Neural Networks, vol. 2, pp. 1174-1178, June 1996.
[9] J. H. Yin, ‘Automation of crowd data-acquisition and monitoring in confined areas using image processing’, Ph.D. thesis, King’s College London, September 1996.
[10] E. Wallace, C. Diffley, ‘CCTV control room ergonomics’, Police Scientific Development Branch of the Home Office, Publication No 14/98.


Figure 3: Scene dynamics are illustrated via colour coded arrows that are displayed on-line to assist CCTV operators in detecting abnormal flows. (A) Illustrates the detection of circular motion as peaks in Hough space. (B) Illustrates the detection of obstacles in the flow path.