School of Electronic Engineering and Computer Science

Multi-target tracking and scene analysis

Computer Vision Group and Centre for Intelligent Sensing

QMUL has an international reputation in multi-target tracking and scene analysis whose application aims to reduce risks to the security of nation states.


Public interest has also been informed and stimulated by this research [I6-I9].

Our work has generated considerable interest from governmental organisations (invitations from the EU JRC and the US Navy to offer advice), and from industry and research organisations (invitations from leading organisations such as Thales, Honeywell and Fraunhofer to team up in projects to integrate our tools into their products).

Moreover, the work on Event Detection was evaluated on 52 hours of the real multi-camera surveillance dataset of the TRECVID 2010 challenge and generated the lowest error score for the Person Runs detection task. The work is also core technology used in securing several national and international research projects whose current active funding is over £1.5M. These projects include:

  • a four-year Marie Curie Industry-Academia Partnership ‘CENTAUR’ Crowded ENvironments moniToring for Activity Understanding and Recognition (FP7-PEOPLE-324359, 2013-2016);
  • an EPSRC project on Multisource audio-visual production from user-generated content (EP/K007491/1, 2013-2015);
  • a three-year EU ARTEMIS collaborative project ‘COPCAMS’ COgnitive & Perceptive CAMeraS (ARTEMIS-2012-1-332913, 2013-2016);
  • and a seven-year Erasmus Mundus Doctorate Programme in Interactive Cognitive Environments (Education, Audiovisual & Culture Executive Agency, FPA n. 2010-0012, 2010-2017).

Finally, this work is one of the major research threads underpinning the newly founded QMUL research Centre for Intelligent Sensing.

Our Impact

The application of our multi-target tracking and scene analysis tools aims to reduce risks to the security of nation states. The work on tracking and scene analysis was demonstrated in both the 2006 Evaluation of advanced detection and tracking algorithms (ETISEO) and the 2006 Video Analysis and Content Exploitation (VACE) evaluation campaigns, leading to invited talks at the final ETISEO workshop in December 2006 (as top performer from Europe) and at the Group of Experts in Vision Systems conference (2008) organized by the UK Centre for the Protection of the National Infrastructure (CPNI).

The work also led to an invitation to present at the Emerging Surveillance Capabilities & Requirements workshop (by invitation only), organized by the Directorate General Joint Research Centre of the European Commission, Institute for the Protection and Security of the Citizen (2011).

The new object-tracking method that improves the sampling efficiency of particle filters, developed during Cavallaro’s EPSRC First Grant ‘MOTINAS’ (Multi-modal object tracking in a network of audiovisual sensors; project partners: Intel, GE industrial (Visiowave); 2006 – 2008), was licensed in 2006 to General Dynamics (“a global corporation with government and commercial customers on six continents and in more than 40 countries that provides combat and intelligence IT systems”) [I3]; has been used since 2006 by the Robotics group of the University of Copenhagen [I4]; and has been part of the teaching material for course E6998 at Columbia University since 2008 [I5].

As additional recognition of the expertise acquired through the development of the technology for multi-target tracking and scene analysis, Cavallaro was invited to serve as an evaluator for the European Commission (2007-2009); the French and Canadian National Research Agencies (2006 and 2010); the UK Engineering and Physical Sciences Research Council (2012); Microsoft Research (2007); and the Swiss and the US National Science Foundations (2008 and 2011). He was also a member of the review panel for the Anticipate and Affect tenet of the Sciences Addressing Asymmetric Explosive Events (SAAET) Program of the US Office of Naval Research (ONR) in Washington (2011). The overall work on multi-target tracking and scene analysis has more than 2400 citations (source: Google Scholar).

Public interest has also been informed and stimulated by this research, which has been covered by the Boston Globe [I6], SmartPlanet [I7], The Guardian [I8] and [I9]. Moreover, the YouTube channel demonstrating the corresponding results has over 17000 views [I10].

Underpinning Research

The growing adoption of video surveillance systems has recently been driven by hardware advances, such as camera miniaturization, and by the increased availability of low-cost data storage. However, the opportunities offered by automated video surveillance are not yet exploited due to the lack of accurate and efficient algorithms for event detection and behaviour analysis.

The extraction of high-level information from surveillance video mainly relies on the analysis of lower level video data like objects and their trajectories, which are generated by multi-target trackers. While reliable tracking is possible under constrained conditions, the problem of tracking in a generic unconstrained scenario is still unsolved.

QMUL was the first to formulate the multi-object tracking problem for image sequences using Random Finite Sets, efficiently and accurately solving the challenge of filtering, over space and time, multiple targets moving simultaneously in cluttered scenes.

This work was a key outcome of Andrea Cavallaro’s EPSRC First Grant MOTINAS (2006-2008), which produced 7 journal papers and the first monograph on Video Tracking. In this context a new object-tracking method was developed that improves the sampling efficiency of particle filters. This work also led to the demonstrator for the EU-ACTS three-year EU FP7 collaborative project APIDIS (ICT-2007.4.2; 2008-2010) and to two paper awards at IEEE ICASSP. The work on multi-object detection and tracking with the PHD filter was demonstrated to be more efficient and accurate than other methods, and it also allows for environmental modeling.

The subsequent algorithm that was defined to link and fuse trajectory segments across partially overlapping cameras was awarded the best-paper award at IEEE AVSS 2009. The work on perceptually-sensitive video analysis and processing contributed to the definition of a coherent framework for enhancing video-encoder performance using a pre-processing step that incorporates perceptual factors, independently of the specific coder.

This work attracted the interest of the Perceptual Engineering group at BT, which subsequently supported an EPSRC industrial CASE project (2006-2009) for the development of perceptually-sensitive analysis and processing. This EPSRC project led to the development of a perceptually-sensitive video encoder [I1]; to a student-paper award at IEEE ICASSP 2009, the foremost signal processing conference; to two IEEE Trans. on Image Processing papers; and to an invitation to present the work at Philips (Eindhoven) to start a new collaboration on intelligent sensing [I2].

References (citations source: Google Scholar)



Completed grants underpinning the research 

  • [G1] A. Cavallaro, FP7 EU Project APIDIS, Autonomous Production of Images based on Distributed and Intelligent Sensing (2008-2010), 3 academic and 3 industrial partners, total project value: 2.615M€, total EU funding 1.925M€, QM funding: €315,600.
  • [G2] A. Cavallaro, EPSRC grant, Multi-modal object tracking in a network of audiovisual sensors (2006–2008), Project partners: Intel, GE industrial (Visiowave), Value: £125,526.
  • [G3] A. Cavallaro, Industrial research contract, Audio detection and classification of events, Visiowave research contract (2005 – 2006), Value: £23,014.
  • [G4] A. Cavallaro, EPSRC CASE grant, Audiovisual semantic discovery, Project partner: British Telecom (2005 – 2009), Initial value: £56,892
  • [G5] A. Cavallaro, Industrial research contract, British Telecom, contribution for the CASE project Audiovisual semantic discovery (2005 – 2009). Value: £25,000
  • [G6] A. Cavallaro, EPSRC CASE grant, Perceptually-sensitive encoding, Project partner: British Telecom, (2006 – 2009). Value: £59,464
  • [G7] A. Cavallaro, Industrial research contract, British Telecom, contribution for the CASE project Perceptually-sensitive encoding (2006 – 2009). Value: £29,000


Impact Corroboration 

