Privacy Preserving Crowd Monitoring:
Counting People without People Models or Tracking
There is currently great interest in vision technology for monitoring all types of environments. This could have many goals, e.g. security, resource management, or advertising. Yet, the deployment of vision technology is invariably met with skepticism by society at large, given the perception that it could be used to infringe on individuals' privacy rights. This tension is common in all areas of data-mining, but becomes an especially acute problem for computer vision for two reasons: 1) the perception of compromised privacy is particularly strong for technology which, by default, keeps a visual record of people's actions; 2) the current approaches to vision-based monitoring are usually based on object tracking or image primitives, such as object silhouettes or blobs, which imply some attempt to "identify" or "single out" the individual.
From the layman's point of view, there are many problems in environment monitoring that can be solved without explicit tracking of individuals. These are problems where all the information required to perform the task can be gathered by analyzing the environment holistically: e.g. monitoring of traffic flows, detection of disturbances in public spaces, detection of speeding on highways, or estimation of the size of moving crowds. By definition, these tasks are based on properties of either 1) the "crowd" as a whole, or 2) an individual's "deviation" from the crowd. In both cases, to accomplish the task it should suffice to build good models for the patterns of crowd behavior. Events could then be detected as variations in these patterns, and abnormal individual actions could be detected as outliers with respect to the crowd behavior. This would preserve the individual's identity until there is good reason to do otherwise.
In this work, we introduce a new formulation for surveillance technology,
which is averse to individual tracking and, consequently, privacy
preserving. We illustrate this new formulation with the problem of
pedestrian counting. This is a canonical example of a problem that
vision technology addresses with privacy-invasive methods:
detect the people in the scene, track them over time,
and count the number of tracks.
In contrast to these methods, we show that there is no need for pedestrian
detection, object tracking, or object-based image primitives to
accomplish the pedestrian counting goal, even when the crowd
is sizable and inhomogeneous, e.g. has sub-components with different dynamics.
In fact, we argue that, when considered under the constraints of
privacy-preserving monitoring, the problem actually appears to become
simpler. We simply develop methods for segmenting the crowd
into the sub-parts of interest (e.g. groups of people moving in different
directions) and estimate the number of people by analyzing
holistic properties of each component. This is shown to be quite
robust and accurate.
The system is also privacy-preserving in the sense that
it can be implemented with hardware that does not produce a
visual record of the people in the scene, i.e. with special-purpose cameras that output low-level features (e.g. segmentations, edges, and texture).
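
The idea of estimating a count from holistic properties of a crowd segment can be illustrated with a minimal sketch. The abstract does not specify which features or which regression model are used, so everything below is an assumption made for illustration: a single feature (the pixel area of a segment's binary mask), an ordinary least-squares line, and hypothetical training values. No person is detected or tracked at any point; only the aggregate mask is examined.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def segment_area(mask):
    """Holistic feature: number of foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)


def estimate_count(mask, slope, intercept):
    """Predict the number of people in a segment from its area alone."""
    return max(0.0, slope * segment_area(mask) + intercept)


# Hypothetical training data (assumed, not from the paper):
# segment areas in pixels and the corresponding hand-labelled counts.
train_areas = [100, 200, 300, 400]
train_counts = [2, 4, 6, 8]
slope, intercept = fit_line(train_areas, train_counts)

# A hypothetical 25 x 10 segment mask with every pixel set (250 pixels).
mask = [[1] * 10 for _ in range(25)]
estimate = estimate_count(mask, slope, intercept)
print(round(estimate, 2))
```

In practice one such regressor would be fitted per crowd segment (e.g. one per direction of motion), and richer features such as edge or texture statistics could replace or complement the area; those choices are likewise assumptions here.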
Antoni Chan, Nuno Vasconcelos