
AN OPTIMUM FRAMEWORK FOR ENTITIES TRACKING IN POPULATIONS

Pavel Loskot
College of Engineering, Swansea University, Swansea, United Kingdom, SA2 8PP
E-mail: [email protected]

ABSTRACT

Tracking entities within populations is a task encountered in many scenarios, such as observing biological cells, studying the behavior of crowds, and evaluating transactions on the Internet. This paper outlines the optimality of a tracking process in which associations between the entities in consecutive populations observed at discrete time instances must be determined. Since sub-optimum tracking methods are prone to the propagation of association errors, the optimum tracking is defined as a Maximum A Posteriori Probability (MAP) or a Maximum Likelihood (ML) estimation problem over the set of time-varying attributes associated with each entity in the population. A subset of these attributes can then also be used to evaluate the characteristics of individuals in the population, as one direct application of the entities tracking.

Index Terms: Decision theory, entity tracking, graphs, graph matching, optimum estimation.

1. INTRODUCTION

Entity tracking is a necessary pre-processing step towards characterizing populations of entities. The populations considered in practical scenarios can be large (typically, up to tens of thousands of entities). This has led to a number of practical but sub-optimum tracking schemes [1–4]. Specifically, image-processing based multi-target tracking of biological cells is developed in [1]. The tracking scheme in [1] minimizes a weighted sum of functionals of all the cells' regions, boundaries and predicted motions. The tracking in [2] is achieved by modeling the locomotive forces acting on individuals from their surrounding entities. Video-processing based tracking of crowded populations is devised in [3] assuming multi-modal motion models of the crowds. Recent advances in cytometry for tracking biological cells are outlined in [4].

In this paper, our aim is to formulate an optimum tracking of entities in populations that are observed at discrete time instances. We assume that each entity in the population can be described by a set of measurable attributes, and adopt a measurement model involving additive noise. We also assume that the attributes of only those entities located in the observable space are measurable at any given time instant. These observable entities form a sub-population. The attributes, representing random or non-random processes, are then used to estimate associations of the entities between the current and previous observed sub-populations. The entities randomly enter or leave the observation space, i.e., they join or leave the sub-population, respectively. Knowledge of the statistical dependence of the measurements, conditioned on the random attributes or parametrized by the non-random attributes, is used to obtain the jointly optimum estimators of the entity associations across discrete time observations.

978-1-4799-2890-3/14/$31.00 ©2014 IEEE

A mathematical model of the population observations is given in Section 2. The jointly optimum tracking of the population entities is obtained in Section 3. Implications of the optimum tracking for some applications are discussed in Section 4. Finally, Section 5 concludes the paper and provides hints about possible future work.

2. POPULATION TRACKING

A population $P$ is formed by a collection of $N_{tot}$ distinct entities. Each entity is described by a set of measurable attributes from a space $\mathcal{A}$. The attributes represent a mixture of possibly mutually correlated time-varying processes with continuous or discrete values. We assume that the population is only observed at discrete time instances $t_0 < t_1 < \cdots$, so that, using a suitable basis [5], the continuous time attributes $A(t) \in \mathcal{A}$ can be decomposed into a sequence of discrete time samples $A(0), A(1), \ldots$ corresponding to the time instances $t_0, t_1, \ldots$, respectively. It is further assumed that at each observation time $t_n$, only a subset of $N_n \le N_{tot}$ entities representing a sub-population $P_n \subseteq P$ can be observed. An entity is observable at time $t_n$ provided that all its attributes of interest $A_i(n) \in \mathcal{A}$, $i = 1, 2, \ldots$, can be measured. We denote by $X_i(n)$ the measurement of attribute $A_i(n)$. Note that if $A_i(n)$ is not measurable at time $t_n$, the value of $X_i(n)$ is undefined. The main objective of population tracking can be defined as follows.

Population tracking objective: Given a sequence of observed sub-populations $P_n$, $n = 0, 1, \ldots$, with the corresponding sets of un-ordered measurements $X_i(n) \in \mathcal{X}(n)$, reconstruct the ordered, but possibly incomplete, sequences $[X_i(0), X_i(1), \ldots, X_i(n)] \equiv X_i$ for the attributes of interest $A_i \in \mathcal{A}$.

The sequences $X_i$ of discrete time random processes $X_i(n)$ are incomplete, since some measurements may be missing; the positions of the missing samples are estimated in the tracking process.

Hence, consider the observed sub-populations $P_{n-1}$ and $P_n$ in Fig. 1 containing $|P_{n-1}| = N_{n-1}$ entities (in quadrant II in Fig. 1) and $|P_n| = N_n$ entities (in quadrant I) at times $t_{n-1}$ and $t_n$, respectively. The unobserved sub-populations $P \setminus P_{n-1}$ (in quadrant III) and $P \setminus P_n$ (in quadrant IV) are estimated to have the sizes $L_{n-1}$ and $J_n$, respectively. Thus, $L_{n-1}$ entities are assumed to have 'left' the previous sub-population $P_{n-1}$ (i.e., they are not observable at time $t_n$), whereas $J_n$ entities are assumed to have (re-)joined the current sub-population $P_n$ (i.e., they were not observed at time $t_{n-1}$).

In order to reconstruct the sequences $X_i$, one has to associate entities between the sub-populations $P_{n-1}$ and $P_n$, for $n = 1, 2, \ldots$. For each new sub-population $P_n$, we may identify up to $J_n$ new entities (likely) not present in the sub-population $P_{n-1}$ (or fewer than $J_n$ new entities if some of them had already been identified in some previous sub-populations $P_{n-i}$, $i = 2, 3, \ldots$). Similarly, at time $t_n$, we may identify $L_{n-1}$ entities within the sub-population


ISCCSP 2014

$P_{n-1}$ that are (likely) not present in the currently observed sub-population $P_n$. Consequently, the numbers of entities in the observed sub-populations are related as,

$$N_{n-1} - L_{n-1} + J_n = N_n$$

where

$$(N_{n-1} - N_n)^+ \le L_{n-1} \le N_{n-1}$$
$$(N_n - N_{n-1})^+ \le J_n \le N_n$$

and $(\cdot)^+$ changes its argument to 0 if it is negative. The total number of observed entities within the whole population $P$ up to the current time instant $t_n$ can be upper bounded as (cf. Fig. 2),

$$N_{tot} \le \sum_{i=0}^{n} J_i$$

where the right-hand side assumes that none of the $J_i$ joining entities had been observed previously; note that the initial sub-population $P_0$ consists entirely of $J_0$ newly joined entities.

The tracking of entities through the observed sub-populations corresponds to associating the entities in the currently observed sub-population $P_n$ with the entities in the previously observed sub-population $P_{n-1}$. Such associations can be represented as a bipartite graph matching problem. In general, a graph matching is a set of graph edges that have no graph vertices in common [6]. For the bipartite graph in Fig. 1, the matching connects entities on the left (from $P_{n-1}$) to entities on the right (from $P_n$), such that there exists exactly one edge at each graph vertex. Note that there are exactly $N_{n-1} - L_{n-1} = N_n - J_n$ entities perceived to be located in both observed sub-populations $P_{n-1}$ and $P_n$. In some scenarios, we may be interested only in tracking the entities between the consecutive observation times $t_{n-1}$ and $t_n$. In such a case, we do not have to differentiate among the $J_n$ joining entities and the $L_{n-1}$ leaving entities, so these entities can be represented as two super-entities (indicated by dotted ovals in Fig. 1), and we only indicate which entities have left the sub-population $P_{n-1}$ and which are currently joining the sub-population $P_n$.

In the ideal case, the matching between the sub-populations $P_{n-1}$ and $P_n$ is perfect, so we could record the measurements $x_i(n)$ as indicated in Fig. 2. Thus, at each observation time $t_n$, we may observe up to $J_n$ new (previously unobserved) entities. Any missing values at time $t_n$ correspond to the entities that have left the observable sub-population between times $t_{n-1}$ and $t_n$. However, in more realistic scenarios, the index $i$ (i.e., the entity identification in the tracking process) for the measurement $x_i(n)$ must be estimated, whereas the time index $n$ can be assumed to be always correctly determined. Consequences of incorrect decisions on the index $i$ are discussed in Section 4.

For clarity of presentation, we assume a single scalar attribute of interest $A_i(n)$, and adopt the following model of the attribute measurements, i.e.,

$$x_i(n) = A_i(n) + w_i(n)$$

where $w_i(n)$ is an additive noise representing the measurement uncertainty. Recall also that the index $i$ corresponds to some particular entity that has to be identified in the association process. In order to obtain the optimum estimator for the indices $i$, we assume that all $N_{tot}$ entities in the population $P$ have some arbitrary, but fixed, indexing. For example, after $n = 2$ observations, we could arrange the measurements for all $N_{tot}$ entities in $P$ as shown in Fig. 2. We define the column vectors of attributes as $A(n) = [A_1(n), \ldots, A_{N_{tot}}(n)]^T$, and of noise samples as $w(n) = [w_1(n), \ldots, w_{N_{tot}}(n)]^T$. Then, the $N_n$ measurements $x_i(n)$ of the attribute $A_i(n)$ in the observed sub-population $P_n$ at time $t_n$ can be expressed as,

$$x(n) = S(n)\,(A(n) + w(n)) \tag{1}$$

where $x(n) = [x_1(n), \ldots, x_{N_n}(n)]^T$, and $S(n)$ is a binary (zero-one) random matrix of dimension $(N_n \times N_{tot})$. The weights of all rows of the matrix $S(n)$ are exactly 1, whereas exactly $N_n$ columns of $S(n)$ have the weight 1, and the remaining $(N_{tot} - N_n)$ columns have the weight 0. The matrix $S(n)$ represents random ordering of the measurements prior to identification and tracking of the entities. Thus, the selection and permutation matrix $S(n)$ in (1) selects $N_n$ observed values which are then randomly arranged into a column vector $x(n)$ of $N_n$ measurements. The matrix $S(n)$ in (1) is unknown to the observer, and it also represents undesired perturbations of the measurements resulting in un-ordered observations. The matrix $S(n)$ can be constructed by random column permutations $\lambda_c$ of the matrix $[I_{N_n} \;\; 0_{N_n \times (N_{tot} - N_n)}]$, where $I$ and $0$ denote the identity matrix and the all-zeros matrix, respectively. Note also that, for all permutations $\lambda_c$, the matrix $S_1 = \lambda_c(I)$ is orthogonal, since always $S_1^T S_1 = I$. Consequently and importantly, the association of the entities between the observed sub-populations $P_{n-1}$ and $P_n$ can be equivalently expressed as,

$$\hat{x}(n) = \hat{S}_1(n)\, x(n) \tag{2}$$

where the binary permutation matrix $\hat{S}_1(n)$ of size $(N_n \times N_n)$ should be matched to the permutation and selection matrix $S(n)$ in order to rearrange the measurements in $x(n)$ into the desired order. The estimation of the matrix $\hat{S}_1(n)$ is discussed in the following section.

3. TRACKING OPTIMALITY

One possible, but sub-optimum, strategy to associate the entities between the sub-populations $P_{n-1}$ and $P_n$ is to enumerate (one-by-one) either the entities in the previous sub-population $P_{n-1}$ or those in the current sub-population $P_n$. For the former, each entity in $P_{n-1}$ is either associated with the corresponding entity in $P_n$, or it is declared to have left the sub-population by the current time $t_n$. The remaining $J_n$ unassigned entities in $P_n$ are then declared to have joined the current sub-population $P_n$. Equivalently, one may consider enumerating the entities in $P_n$, and for each such entity either find an association with an entity in the previous sub-population $P_{n-1}$, or declare the entity to have joined $P_n$ at the current time $t_n$. The remaining unassigned entities in $P_{n-1}$ are then declared to have left the previous sub-population $P_{n-1}$.

However, one-by-one enumeration and association of the entities between the sub-populations $P_{n-1}$ and $P_n$ is clearly sub-optimal for three reasons. First, enumeration of the entities in $P_{n-1}$ and then in $P_n$ is likely to result in two different associations, i.e., two different matrices $\hat{S}_1(n)$. Second, the association errors tend to propagate due to the sequential decisions made about individual associations. Third, only two consecutive observations of the sub-populations $P_{n-1}$ and $P_n$ are considered; taking into account the observations $P_i$ from all time instances $t_i$, $i = 0, 1, \ldots, n$, must provide associations that are more reliable (i.e., less erroneous).

The decision tree to mutually associate the entities in the bipartite graph in Fig. 1 is depicted in Fig. 3. Recall that the bipartite graph corresponds to the previous and current sub-populations $P_{n-1}$ and $P_n$, respectively. The association process starts at the root node (the filled square), and then continues along the branches towards the leaves. All decisions must eventually be validated (at the filled circles) to ensure consistency (i.e., agreement) among all associations.
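The first reason given above, that enumerating the entities of the previous and of the current sub-population can yield different associations, can be illustrated with a minimal sketch. It is not a method from this paper: the function name `greedy_associate`, the nearest-neighbour rule, the distance threshold `gate` and the sample values are all assumptions made only for illustration, with a single scalar attribute per entity.

```python
def greedy_associate(sources, targets, gate):
    """One-by-one association (illustrative assumption, not the paper's method).

    Each source, taken in order, is linked to its nearest still-unassigned
    target whose distance does not exceed `gate`. Returns a dict
    {source_index: target_index}; sources absent from the dict found no target.
    """
    free = set(range(len(targets)))
    pairs = {}
    for i, value in enumerate(sources):
        candidates = [j for j in sorted(free) if abs(targets[j] - value) <= gate]
        if candidates:
            best = min(candidates, key=lambda j: abs(targets[j] - value))
            pairs[i] = best
            free.remove(best)
    return pairs


prev = [0.0, 1.0]  # hypothetical measurements at time t_{n-1}
curr = [0.9, 2.0]  # hypothetical measurements at time t_n
gate = 1.5         # hypothetical maximum distance for an association

# Enumerate the previous sub-population; pairs are (previous index, current index).
from_prev = set(greedy_associate(prev, curr, gate).items())
# Enumerate the current sub-population and express the pairs in the same form.
from_curr = {(i, j) for j, i in greedy_associate(curr, prev, gate).items()}

print(sorted(from_prev))
print(sorted(from_curr))
```

In this example the first enumeration associates both entities, whereas the second associates only one pair and implicitly declares one entity as having left and one as having joined, so the two enumeration orders disagree.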


There are five types of errors that may occur in the association process:
E1: the entity is assumed to have left $P_{n-1}$ even though it did not;
E2: the entity is assumed to have joined $P_n$ even though it did not;
E3: two entities between $P_{n-1}$ and $P_n$ are associated even though the entity in $P_{n-1}$ has left $P_{n-1}$;
E4: two entities between $P_{n-1}$ and $P_n$ are associated even though the entity in $P_n$ has joined $P_n$;
E5: the association of two entities between $P_{n-1}$ and $P_n$ is incorrect, which induces a double association error.

As the first step in defining the optimum association, we have to assign to each error a penalty $\mu_e \ge 0$, $e = 1, 2, 3, 4, 5$. The assignment of the penalties $\mu_e$ to the errors $E_e$ is application dependent. Here, we assume that the penalty $\mu = \mu_e = 1$ if the error occurred, and $\mu = 1 - \mu_e = 0$ otherwise. The association errors can be identified by comparing the estimated association matrix $\hat{S}_1(n)$ with the square sub-matrix $S_1(n)$ of $S(n)$ which is formed by all non-zero columns of $S(n)$. Since the positions of these non-zero columns are exactly known, the matrix $S(n)$ is uniquely determined by the matrix $S_1(n)$. Note that the error-free matrix $\hat{S}_1(n)$ yields $\hat{S}_1^T(n) S_1(n) = I$. The overall penalty $\mu_{tot}(\hat{S}_1, S_1)$ is the sum of the individual penalties $\mu_e$ corresponding to each association error.

The entities randomly move in and out of the observation space. For instance, one of the observable attributes in the space $\mathcal{A}$ could be a 3D location of the entity. If the location attribute is outside of the observation boundaries, no attributes of that entity can be observed. For the purpose of the optimum entities tracking, the random mobility of the entities in and out of the observation space can be described by a generally non-stationary joint probability $\Pr(J_n = j, L_{n-1} = l)$. The a priori probability of the perturbation matrix in (1) can be evaluated as,

$$\Pr(S(n) = s) = \sum_{j,l} \Pr(S(n) = s \mid j, l)\, \Pr(J_n = j, L_{n-1} = l)$$

where, conditioned on the particular values of $J_n$ and $L_{n-1}$, it is reasonable to assume that the entries of $S(n)$ are equiprobable. Assuming the penalty $\mu_e = 1$ for all five types of errors discussed above, and since, for a particular time $t_n$, $S(n)$ is a random matrix, we obtain the MAP estimator of $S(n)$ as [5],

$$\hat{S}(n) = \arg\max_{s}\; f_{x(n),x(n-1)|S(n)}(x(n), x(n-1) \mid S(n) = s)\, \Pr(S(n) = s)$$

where $f_{x(n),x(n-1)|S(n)}(x(n), x(n-1) \mid S(n) = s)$ is the joint probability density function (PDF) of the measurements $x(n)$ and $x(n-1)$ conditioned on the particular value of the random matrix $S(n)$. The square matrix $\hat{S}_1(n)$ in (2) is formed by taking the $N_n$ non-zero columns of the matrix $\hat{S}(n)$.

Alternatively, if the mobility of the entities is not known or is difficult to estimate, we may not have knowledge of $\Pr(J_n, L_{n-1})$, and thus we cannot obtain the a priori distribution $\Pr(S(n))$. In such a case, we can resort to the ML estimation of the matrix $S(n)$ [5], i.e.,

$$\hat{S}(n) = \arg\max_{s}\; f_{x(n),x(n-1)}(x(n), x(n-1), S(n) = s)$$

where $f_{x(n),x(n-1)}(x(n), x(n-1), S(n) = s)$ is the PDF parametrized by a particular value of the (deterministic) matrix $S(n)$. The matrix $\hat{S}_1(n)$ in (2) is again obtained from the matrix $\hat{S}(n)$.
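As a minimal sketch of how such an ML estimate could be computed in a very restricted special case, consider the following assumptions, none of which are stated in this paper: a single scalar attribute that does not change between $t_{n-1}$ and $t_n$, no entities joining or leaving, and independent zero-mean Gaussian noise of equal variance on every measurement. Under these assumptions, maximizing the likelihood over the permutations is equivalent to minimizing the sum of squared differences between associated measurements. The function name `ml_association` and all numerical values are hypothetical.

```python
import itertools
import random


def ml_association(x_prev, x_curr):
    """Brute-force ML estimate of the ordering that links x_curr to x_prev.

    Returns a tuple `order` where x_curr[order[i]] is associated with
    x_prev[i]. Under the Gaussian assumptions stated in the lead-in,
    maximizing the likelihood over permutations is the same as minimizing
    the sum of squared differences between associated measurements.
    """
    n = len(x_prev)

    def cost(order):
        return sum((x_curr[order[i]] - x_prev[i]) ** 2 for i in range(n))

    # Exhaustive search over all n! permutations (only viable for small n).
    return min(itertools.permutations(range(n)), key=cost)


rng = random.Random(0)
attributes = [1.0, 4.0, 9.0, 16.0]  # hypothetical A_i, constant over time
sigma = 0.1                         # hypothetical noise standard deviation

# Measurements at time t_{n-1}, kept in the fixed entity order.
x_prev = [a + rng.gauss(0.0, sigma) for a in attributes]

# Measurements at time t_n, shuffled by an unknown permutation that plays
# the role of S_1(n) in (1).
positions = list(range(len(attributes)))
rng.shuffle(positions)  # x_curr[k] belongs to entity positions[k]
x_curr = [attributes[i] + rng.gauss(0.0, sigma) for i in positions]

estimate = ml_association(x_prev, x_curr)
true_order = tuple(positions.index(i) for i in range(len(attributes)))
print(estimate == true_order)
```

The exhaustive search grows factorially with the number of entities, so it only serves to make the estimator concrete. With a sum-of-costs objective of this form, the search is a linear assignment problem, for which polynomial-time methods such as the Hungarian algorithm exist.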

Strictly speaking, both the MAP and ML estimators of $S(n)$ above are still sub-optimal, since only the last two measurements $x(n-1)$ and $x(n)$ are considered. However, extending these estimators by assuming the joint PDF of the measurements $x(i)$, $i = 0, 1, \ldots, n$, should be straightforward. Furthermore, the MAP and ML estimators described here may readily incorporate the complete set of observable attributes $A_i(n) \in \mathcal{A}$. Some of these attributes tend to vary less between the two consecutive observations at time instances $t_{n-1}$ and $t_n$. Such (almost) time-invariant attributes may be particularly well-suited to serve as an explicit labeling or signature for the entities in the population. We conjecture that the MAP and ML estimators will implicitly put more weight on the slow-varying attributes than on the fast-varying attributes via the joint PDF of the corresponding measurements.

Moreover, one may be interested in identifying and tracking only a sub-group of the entities exhibiting characteristic features expressed as a sub-set of some attributes. For instance, the tracking objective may be to decide whether a chosen entity in the current sub-population has been present in the previous sub-population, or whether this entity will remain in the sub-population during the next observation. However, the MAP and ML estimators presented in this section provide the jointly optimum decisions assuming the consecutive sub-populations $P_{n-1}$ and $P_n$. The optimal MAP and ML estimators for tracking a single entity will outperform, in general, their counterparts designed to track a sub-group of the entities.

4. APPLICATIONS

In general, typical attributes common to many population tracking applications are the (3D) entity location, its shape and its color. However, as pointed out above, the optimum tracking requires considering all the measurable attributes, even though the less time-varying attributes are likely to contribute more to the tracking decisions.
Automated (machine) tracking of the entities in large populations is often required in high-throughput microscopy imaging [1]. The cells can be explicitly labeled by an intake of nanoparticles with different fluorescence [4]. Thus, the attributes aiding the tracking also include the numbers of such nanoparticles and their wavelength (color). However, the biological cells may duplicate in between observations (so-called mitosis), which makes the tracking problem much more challenging and also requires modifying the bipartite graph model in Fig. 1. In particular, in addition to the cells joining or leaving the observation space, one has to consider the cell duplication event yielding two daughter cells. Then, the cells identified in the sub-population $P_{n-1}$ to have split between times $t_{n-1}$ and $t_n$ require locating the corresponding daughter cells at time $t_n$, while one or both of the daughter cells may have left the sub-population $P_{n-1}$. The daughter cells inherit some of the attributes; for instance, nanoparticles in the mother cell are usually randomly divided between the two daughter cells. Note also that tracking of the cell mitosis violates the (bipartite) graph matching rule (i.e., only one graph edge allowed per node).

There are other cases of populations where the entities are explicitly assigned well-defined and time-invariant labels: for example, transactions on the Internet, and the worldwide baggage delivery by airline companies. In these scenarios, however, the labels are not guaranteed to be unique even though the observation noise can be neglected. This difficulty can be overcome by extending the deterministic labeling with other attributes induced by the observation space, such as the spatio-temporal location. Moreover, the forecasting of attributes (e.g., transaction trends, and anticipated future baggage location) rather than their estimation may be more important [7].
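The idea of extending a non-unique deterministic label with a location attribute can be sketched as follows. This is only an illustration under assumptions that are not part of this paper: each record is a hypothetical `(label, location)` pair with a scalar location, labels are noise-free, and records sharing a label are associated by the ordering that minimizes the total squared location change. The function name `match_records` and the sample records are invented for the example.

```python
import itertools
from collections import defaultdict


def match_records(prev, curr):
    """Associate records by label, resolving repeated labels by location.

    `prev` and `curr` are lists of (label, location) tuples. Returns a sorted
    list of (previous index, current index) pairs. Records whose label has no
    counterpart in the other list stay unmatched.
    """
    prev_by_label = defaultdict(list)
    curr_by_label = defaultdict(list)
    for i, (label, _) in enumerate(prev):
        prev_by_label[label].append(i)
    for j, (label, _) in enumerate(curr):
        curr_by_label[label].append(j)

    pairs = []
    for label, p_idx in prev_by_label.items():
        c_idx = curr_by_label.get(label, [])

        def cost(order):
            # Total squared location change for this candidate ordering.
            return sum((curr[j][1] - prev[i][1]) ** 2
                       for i, j in zip(p_idx, order))

        # Exhaustive search within one label group (groups assumed small).
        best = min(itertools.permutations(c_idx), key=cost)
        pairs.extend(zip(p_idx, best))
    return sorted(pairs)


# Hypothetical records: the label "BAG42" is not unique.
prev = [("BAG42", 0.0), ("BAG42", 10.0), ("BAG7", 5.0)]
curr = [("BAG7", 5.5), ("BAG42", 9.0), ("BAG42", 1.0)]

matches = match_records(prev, curr)
print(matches)
```

Here the two records labeled "BAG42" cannot be distinguished by the label alone, and the location attribute decides which of the two possible associations is selected.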


[Figure 1: bipartite graph linking the $N_{n-1}$ entities observed at time $t_{n-1}$ (quadrant II) to the $N_n$ entities observed at time $t_n$ (quadrant I), with the $L_{n-1}$ leaving entities in quadrant III and the $J_n$ joining entities in quadrant IV.]

Fig. 1. The bipartite graph matching for the population entities tracking.

[Figure 2: measurements $x_i(n)$ of the entities $i = 1, \ldots, 8$ arranged against the observation times $t_0, \ldots, t_4$, with $J_0$, $J_1$ and $J_2$ marking the newly joined entities.]

Fig. 2. The idealized attribute tracking of entities in a population.

[Figure 3: decision tree whose branches, for the measurements $x_1(n-1), \ldots, x_{N_{n-1}}(n-1)$ and $x_1(n), \ldots, x_{N_n}(n)$, are labeled "existing predecessor", "newly (re)joined", "existing successor" and "(temporarily) left".]

Fig. 3. The decisions tree for a bipartite graph matching.

5. CONCLUSION AND DISCUSSION

We developed a mathematical model of the population observations in order to define the jointly optimum tracking of entities. All observable attributes were considered in making the tracking decisions, taking into account any correlations among the attributes (in time, in between entities, and among themselves) as well as their time variations. The proposed MAP and ML estimators of the entity associations between the consecutive sub-populations provide the jointly optimum decisions leading to the smallest mean number of association errors. These estimators could be modified to be optimal even when only a small sub-group of the entities needs to be tracked.

The tracking decisions are normally only a pre-processing step to evaluate the population characteristics from their discrete time observations using selected attributes. Not only may the attribute processes be non-stationary and have limited availability of their samples (observations), but these samples are also randomly mixed among different attribute processes due to the entities tracking errors. Statistical processing of such mixed random processes to recover the desired features of the entities constitutes another level of optimality that can be considered jointly with the optimality of the entities tracking.

Obviously, formulating the optimum tracking is only the first step towards practical algorithms having good performance; this requires considering concrete scenarios and adopting specific underlying models of the attribute processes. Realistic scenarios may also require the tracking to be performed in real time, which may severely constrain the affordable computational complexity, especially in cases of large populations. For these reasons, sub-optimum linear estimators such as Kalman filtering may be attractive [1]. More importantly, the labeling design for entities can be another promising direction to overcome the practical constraints.

6. REFERENCES

[1] K. Li, M. Chen, T. Kanade, E. D. Miller, L. E. Weiss, and P. G. Campbell, “Cell population tracking and lineage construction with spatiotemporal context,” Med. Image Analysis, vol. 12, no. 5, pp. 546–566, Oct. 2008.
[2] S. Ali and M. Shah, “Floor fields for tracking in high density crowd scenes,” in Computer Vision – ECCV 2008, ser. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2008, vol. 5303, pp. 1–14.
[3] M. Rodriguez, S. Ali, and T. Kanade, “Tracking in unstructured crowded scenes,” in Proc. Comp. Vision, 2009, pp. 1389–1396.
[4] S. C. Bendall, G. P. Nolan, M. Roederer, and P. K. Chattopadhyay, “A deep profiler’s guide to cytometry,” Trends in Immunol., vol. 33, no. 7, pp. 323–332, 2012.
[5] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory and Detection Theory. Prentice Hall, 1993 and 1998, vol. I and II.
[6] T. Tassa, “Finding all maximally-matchable edges in a bipartite graph,” J. Theor. Comp. Sci., vol. 423, pp. 50–58, Mar. 2012.
[7] M. Crovella and B. Krishnamurthy, Internet Measurement: Infrastructure, Traffic and Applications, 1st ed. Wiley, 2006.
