Video Motion Detection
There are many methods of detecting intruders into premises. These include such systems as:
- Intruder alarms.
- Fence mounted detectors.
- Buried vibration or electric field devices.
- Active infrared devices.
- Passive infrared devices.
- Microwave devices.
- Video motion detection devices.
This chapter is concerned with Video Motion Detection (VMD) devices. These may be installed inside or outside the premises and, besides detecting intruders, can be used as part of a building management system. VMD may be used either as a stand-alone system or integrated with other detection systems. In an ideal world, detection devices would give no false alarms and would detect 100% of genuine intrusions. Unfortunately, this is not an ideal world, and a certain amount of compromise is necessary. The aim is to keep this compromise at the most effective and acceptable level for achieving the system objectives.
There are really only two types of alarm: genuine alarms and false alarms. Sometimes mention is made of ‘spurious alarms’, unexplained alarms and system failures. These can only be considered as false alarms because the system has alarmed for no apparent reason. A genuine alarm is one created by deliberate nefarious human action, e.g. by movement of a person or vehicle into the detection field or disturbance of the alarm system. A false alarm is one that has no deliberate human input, such as those caused by animals, birds or any malfunction of equipment. One measure of the efficiency of a system is the ‘False Alarm Rate’ (FAR). This is the number of false alarms per unit of time, e.g. five per day. The acceptable FAR level will depend on many local site considerations. The objective is to reduce it to the minimum without missing any real alarms. Another measure is the ‘probability of detection’ (PD), which is the ratio of detections to the number of attempts in controlled tests. The ideal for PD is 100%.
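As a rough illustration of the two measures, the arithmetic can be sketched in Python. The function names and the trial figures below are hypothetical, chosen only to reproduce the "five per day" example.

```python
def false_alarm_rate(false_alarms: int, days: float) -> float:
    """False alarms per day over the observation period."""
    if days <= 0:
        raise ValueError("observation period must be positive")
    return false_alarms / days


def probability_of_detection(detections: int, attempts: int) -> float:
    """Fraction of controlled intrusion attempts that were detected."""
    if attempts <= 0:
        raise ValueError("at least one attempt is required")
    return detections / attempts


# Hypothetical trial: 35 false alarms logged over 7 days,
# and 48 of 50 controlled test walks detected.
far = false_alarm_rate(35, 7)
pd = probability_of_detection(48, 50)
print(f"FAR = {far:.1f} per day, PD = {pd:.0%}")
```

Both figures only mean something when quoted together: a system can reach a very low FAR simply by being too insensitive to achieve a useful PD.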
Uses Of VMD
The primary function of a VMD system is to relieve CCTV operators from the stress of monitoring one or many screens of information that may not change for long periods. The VMD system monitors all the cameras in its system and reacts only when there is suspicious activity in one of the scenes. During the long periods of inactivity the operator can continue with other tasks, secure in the knowledge that when something occurs the system will immediately respond. Even a moderate-sized system, with eight cameras, would prove impossible for an operator to monitor unaided. Eight monitors could not be viewed with any degree of concentration for more than about twenty minutes. If the monitors were set to sequence, activity on seven of the eight cameras would be lost for most of the time, and the arrangement would be totally ineffective for detecting intruders. With more cameras in a system, the task of detecting intruders becomes impossible and technology must take over the strain.
The idea of VMD systems is that the processor continuously monitors all the cameras in the system. During this time, the operator may select or sequence cameras using the conventional switching system. The system may include an additional monitor connected to the VMD system that will normally show a blank screen. When activity occurs in any camera that the VMD system interprets as an intruder, the alarmed camera is immediately switched to the blank monitor and a warning is sounded to alert the operator. The operator’s attention is therefore immediately focused on the camera covering the alarm. The detection of an intruder can also set off further events, such as setting a video recorder to real time recording or setting a matrix switching system to sequence through a specific series of cameras. The operator can analyse the scene and take the appropriate course of action.
An intruder could generate an alarm and be out of view of the camera before it is displayed. The operator would therefore see just a blank screen and be unsure about what to do next. To overcome this, at the time of detection, many VMD systems will capture an alarm image sequence containing one or more freeze frames. This may be displayed as the first view on the previously blank screen. The operator may then examine the scene at the instant of alarm in more detail.
Principle of operation
In the descriptions that follow, reference is made to a ‘frame’ of video. Some systems use frames, some use fields, and some can select between the two. This also applies to storage devices. For ease of description, the term frame is used throughout, but the actual method used should be checked for the system being considered.
Video Motion Detection is an electronic method of detecting a change in the field of view of a camera. In its simplest form, this is achieved by storing one frame of the video information and then comparing the next frame with it to decide whether there has been a change. The change detected would be a difference in the video voltage, indicating a change of brightness within the scene. This would not initially be treated as an alarm; the system would wait for a further frame to confirm or reject the change. If confirmed as a change of brightness in the scene, then an alarm would be generated. This could cause a contact to close and activate some warning device such as a buzzer, or cause the switcher to select the camera that detected the motion. The sampling process may take somewhere between one fiftieth of a second and one second to detect a change, depending on the method of sampling. This simple detector could be used in an environment where all conditions were absolutely stable and the only possible change in brightness would be due to an intruder. However, the intruder could be a mouse or a person, and the system could not differentiate between the two. In addition, by the time the alarm is displayed on a monitor, the cause of it could be out of view. If the scene were being continuously recorded, the event could be reviewed, but this may be too late to take effective action.
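The store, compare and confirm sequence described above might be sketched as follows. This is a minimal illustration under stated assumptions, not any manufacturer's method: frames are assumed to be small grids of 8-bit brightness values, and the names `mean_abs_diff` and `detect` and the threshold value are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # rows of 8-bit brightness values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute brightness difference over the whole picture."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def detect(frames: List[Frame], threshold: float = 10.0) -> List[int]:
    """Return the indices of frames at which an alarm is raised.

    A change against the stored frame is only a candidate; the alarm is
    raised when the following frame confirms it.
    """
    reference = frames[0]          # the stored frame
    pending = False                # True while a change awaits confirmation
    alarms: List[int] = []
    for i in range(1, len(frames)):
        changed = mean_abs_diff(reference, frames[i]) > threshold
        if changed and pending:
            alarms.append(i)       # confirmed by a second frame
            pending = False
            reference = frames[i]
        elif changed:
            pending = True         # keep the reference, wait one more frame
        else:
            pending = False        # change not confirmed, or no change at all
            reference = frames[i]
    return alarms


dark = [[20] * 4 for _ in range(4)]
half_bright = [[20] * 4, [20] * 4, [200] * 4, [200] * 4]
print(detect([dark, dark, half_bright, half_bright, half_bright]))
print(detect([dark, half_bright, dark, dark]))
```

In the first sequence the change persists and is confirmed one frame after it first appears; in the second it lasts a single frame and is discarded. Note that the sketch has exactly the weakness the text describes: it reports that something changed, not what changed.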
For the purposes of this chapter the following definitions are used, although there are no standard terms at present. A CELL is a single detection block that is analysed electronically for brightness changes. A cell may be a single pixel, a block of pixels, or the whole screen. A ZONE is a group of cells that have been defined as an active area. The exact meaning of ‘zone’ must be checked against the manufacturer’s specification before assuming what area is covered and to what degree of definition.
Comparing complete frames, as in the simple detector described above, therefore has severe drawbacks. The next development was to divide the picture into a number of separate areas or cells. This was refined by making it possible to switch cells on or off to define the area of the scene that is of interest. Diagram 18.4 illustrates a VMD system that divides the picture into cells, and how only a selected part of the scene can be set for motion detection. The shaded areas are inactive and the clear parts are the active cells. In this case, only activity in the area of the car will create an alarm. The cells are displayed only while the system is being set up. Once the set-up mode is exited, the complete picture is displayed as normal and it is not possible to see any of the cells.
The sensitivity of the cells can be adjusted to take local conditions into account. This control, though, is applied across all cells to the same extent. Some systems can be pre-set to different sensitivity levels, for instance to allow for day and night operation when the lighting levels may be different.
This type of system would not be suitable for an outdoor scene such as the one shown, because external light conditions change frequently. Clouds moving across the sky would cause changes in brightness and create alarms. This type is used in simple indoor situations, where the lighting conditions are constant and anything breaking the cells could be considered an alarm. The set-up can be refined to reduce unwanted activations. For instance, there may be two doors in the scene, only one of which needs to be monitored. In this case, the active part of the scene could be restricted accordingly. Note that with this type of system a change in any one of the active cells, or in all of them, will create an alarm.
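A cell-based detector of this kind, with cells that can be switched on or off and a single sensitivity applied to all of them, could be sketched roughly as below. The cell size, the mask layout and the function names are assumptions made for illustration, and the frame dimensions are assumed to be exact multiples of the cell size.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[int]]


def cell_means(frame: Frame, cell_h: int, cell_w: int) -> List[List[float]]:
    """Average brightness of each cell in the grid."""
    rows = len(frame) // cell_h
    cols = len(frame[0]) // cell_w
    means = []
    for r in range(rows):
        row = []
        for c in range(cols):
            total = 0
            for y in range(r * cell_h, (r + 1) * cell_h):
                for x in range(c * cell_w, (c + 1) * cell_w):
                    total += frame[y][x]
            row.append(total / (cell_h * cell_w))
        means.append(row)
    return means


def alarmed_cells(prev: Frame, curr: Frame,
                  active: Sequence[Sequence[bool]],
                  sensitivity: float,
                  cell_h: int = 2, cell_w: int = 2) -> List[Tuple[int, int]]:
    """Active cells whose mean brightness changed by more than `sensitivity`.

    The same threshold is used for every cell, as in the simple systems
    described in the text. Any non-empty result would be an alarm.
    """
    before = cell_means(prev, cell_h, cell_w)
    after = cell_means(curr, cell_h, cell_w)
    hits = []
    for r, mask_row in enumerate(active):
        for c, is_active in enumerate(mask_row):
            if is_active and abs(after[r][c] - before[r][c]) > sensitivity:
                hits.append((r, c))
    return hits


prev = [[50] * 4 for _ in range(4)]
curr = [[150, 150, 50, 50],
        [150, 150, 50, 50],
        [50, 50, 150, 150],
        [50, 50, 150, 150]]
only_bottom_right = [[False, False], [False, True]]
print(alarmed_cells(prev, curr, only_bottom_right, sensitivity=30))
```

With the mask shown, the change in the top-left cell is ignored because that cell is switched off, which is the "two doors, one monitored" case from the text.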
The next move towards reducing false alarms is to build in the computing power to process each cell individually and create algorithms that will intelligently analyse certain situations. In this way, decisions can be made according to the direction of movement. For instance, one cell may be declared as a pre-alarm cell and another as a detection cell. Pre-alarm cells do not create alarms. Instead, they instruct the system to associate detection in this area with detection in another. Activation of detection cells alone will not create an alarm. A combination of successive detections in adjacent cells will trigger a logical action dependent on the program. For example, if a detection cell is activated after a pre-alarm cell, an alarm will be created. However, movement in the reverse direction, detection before pre-alarm, will not create an alarm. In this way, persons leaving a building will not create an alarm but persons approaching it will do so. Also, persons moving down the right of the perimeter will not create an alarm.
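The pre-alarm and detection logic can be illustrated with a very small state machine. This is only a sketch: the event labels, the `window` parameter (how many frames a pre-alarm stays armed) and the function name are hypothetical, and a real system would track many cells rather than one pair.

```python
from typing import Iterable, List, Optional


def directional_alarms(events: Iterable[str], window: int = 5) -> List[int]:
    """Return frame indices at which an alarm is raised.

    events[i] is "pre" if the pre-alarm cell changed in frame i,
    "det" if the detection cell changed, and "" otherwise.
    A "det" raises an alarm only if a "pre" occurred no more than
    `window` frames earlier; "det" on its own does nothing.
    """
    armed_at: Optional[int] = None
    alarms: List[int] = []
    for i, event in enumerate(events):
        if event == "pre":
            armed_at = i                      # remember when we were armed
        elif event == "det":
            if armed_at is not None and i - armed_at <= window:
                alarms.append(i)              # pre-alarm then detection
            armed_at = None                   # either way, disarm
    return alarms


approaching = ["", "pre", "", "det", ""]   # pre-alarm cell first
leaving = ["", "det", "", "pre", ""]       # detection cell first
print(directional_alarms(approaching))
print(directional_alarms(leaving))
```

Only the approaching sequence produces an alarm. The time window is a design choice added here so that an old pre-alarm does not stay armed indefinitely; the text does not say how real systems handle this.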
Another factor that could be calculated in the processor is the number of cells that change simultaneously. This would then be used as a further part of the equation, so that an alarm would only be created if more than ‘x’ cells change simultaneously. This brings its own problems in some situations. Three dogs in the scene could activate the same number of cells as one person. A major problem with cell count is that the number of cells an object of a given size occupies depends on its distance from the camera.
Diagram 18.4 shows that a person in the foreground occupies eight cells while one in the background occupies less than half a cell. Similarly, a cat close to the camera would activate far more cells than a person in the background. Simple cell count systems may offer some reduction in false alarms but do not offer accurate size discrimination.
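The effect of perspective on cell count can be estimated with a simple pinhole-lens approximation (image size = focal length × object size ÷ distance). The sketch below is illustrative only: the 16 × 12 cell grid, the object dimensions and the distances are assumptions, and the lens and sensor figures are typical values for a 1/2" camera with a 25mm lens rather than data taken from the diagram.

```python
def cells_covered(obj_h_m: float, obj_w_m: float, distance_m: float,
                  focal_mm: float = 25.0,
                  sensor_w_mm: float = 6.4, sensor_h_mm: float = 4.8,
                  grid_cols: int = 16, grid_rows: int = 12) -> float:
    """Approximate (fractional) number of cells an object occupies."""
    # Size of the object's image on the sensor, in millimetres.
    img_h_mm = focal_mm * obj_h_m / distance_m
    img_w_mm = focal_mm * obj_w_m / distance_m
    # Size of one cell on the sensor.
    cell_h_mm = sensor_h_mm / grid_rows
    cell_w_mm = sensor_w_mm / grid_cols
    # An object cannot occupy more cells than the grid contains.
    rows = min(grid_rows, img_h_mm / cell_h_mm)
    cols = min(grid_cols, img_w_mm / cell_w_mm)
    return rows * cols


person_near = cells_covered(1.8, 0.5, distance_m=10)
person_far = cells_covered(1.8, 0.5, distance_m=100)
cat_near = cells_covered(0.3, 0.5, distance_m=5)
print(f"person at 10 m : {person_near:.1f} cells")
print(f"person at 100 m: {person_far:.2f} cells")
print(f"cat at 5 m     : {cat_near:.1f} cells")
```

Under these assumptions the nearby cat covers many more cells than the distant person, so no single cell-count threshold can accept one and reject the other.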
It was stated that the detection of movement was obtained by measuring the changes in video level (brightness) between successive frames. This is fine if a person in a dark suit passes through a very bright scene. The change in brightness will be dramatic and immediately evident to the processor. However, a person in a grey suit in a grey scene, with little contrast, will cause only a small change in the brightness levels. If the sensitivity of the system were set to detect the latter event, it would be over-responsive to insignificant changes in a bright scene. This is less important for indoor systems, but a significant factor in external systems where the light changes frequently and greatly. In addition, where the object is smaller than the cell, the brightness change will be a function of both the size of the object and the contrast between the object and the background. This becomes especially critical when detecting a person in the background, who may be only 10% of the screen height. This can be only 0.25% of the screen area. If the person is substantially smaller than the cell, the sensitivity would have to be set very high to detect the change, but this would cause many false alarms from objects that fill more of the cell or offer greater contrast, even though they are much smaller than a person.
Another problem with measuring brightness using large cells is that a small dark object such as a cat could cause the same brightness change as a large low contrast object such as a person.
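The ambiguity can be shown with simple arithmetic. If a cell reports only its average brightness, the change caused by an object is roughly the fraction of the cell it covers multiplied by its contrast with the background. The function name and the figures below are illustrative assumptions.

```python
import math


def cell_brightness_change(object_fraction: float, contrast: float) -> float:
    """Change in a cell's mean brightness when an object enters it.

    object_fraction: share of the cell area covered by the object (0 to 1).
    contrast: brightness difference between object and background.
    """
    return object_fraction * contrast


# Hypothetical figures: a small dark cat against a light background,
# and a large person in clothing close to the background brightness.
cat = cell_brightness_change(object_fraction=0.10, contrast=100.0)
person = cell_brightness_change(object_fraction=0.50, contrast=20.0)
print(cat, person, math.isclose(cat, person))
```

Both objects shift the cell average by about the same amount, so a threshold on that average cannot tell them apart.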
In external systems, cameras are mounted on brackets or towers. It is often impractical to ensure that they are absolutely rigid with no movement. The camera would only have to move a small amount, such as can happen in the wind, to cause a global change and register an alarm.
Changes In Light Levels
By processing separate cells and having the power to define better algorithms, other problems can be overcome. For instance, light changes may be ignored if all cells are affected to the same extent. Another method to allow for global light changes is to make one reference cell in which movement is unlikely. The other cells are then referenced to this to compensate for light levels. This latter method can impose limitations on the system set-up and is now infrequently used.
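One way to ignore a light change that affects all cells to the same extent is to remove the common shift before applying the threshold. The sketch below estimates the common shift as the median of the per-cell changes; that choice, like the names and values used, is an assumption for illustration and not a method taken from the text.

```python
from statistics import median
from typing import List, Sequence


def residual_changes(before: Sequence[float],
                     after: Sequence[float]) -> List[float]:
    """Per-cell brightness change with the scene-wide shift removed."""
    diffs = [a - b for b, a in zip(before, after)]
    common = median(diffs)        # estimate of the global light change
    return [d - common for d in diffs]


def alarmed(before: Sequence[float], after: Sequence[float],
            threshold: float) -> List[int]:
    """Indices of cells that changed by more than the scene as a whole."""
    return [i for i, r in enumerate(residual_changes(before, after))
            if abs(r) > threshold]


before = [50, 60, 70, 80]
cloud_only = [30, 40, 50, 60]          # every cell darkens by 20
cloud_and_intruder = [30, 40, 50, 10]  # last cell darkens by 70
print(alarmed(before, cloud_only, threshold=15))
print(alarmed(before, cloud_and_intruder, threshold=15))
```

The reference-cell method mentioned above would subtract the change seen in one designated cell instead of the median, which is why it depends on that cell never containing movement.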
All the systems described so far have only been able to set the overall sensitivity of all cells. This renders them quite unsuitable for outdoor use. The next need therefore is to be able to adjust the sensitivity of each cell individually. This obviously requires much more computing power but is an absolute prerequisite for any VMD to be used externally.
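Individual sensitivity amounts to giving each cell its own threshold instead of sharing one. A minimal sketch, with hypothetical names and values:

```python
from typing import List, Sequence


def alarmed_cells_per_cell(before: Sequence[float], after: Sequence[float],
                           thresholds: Sequence[float]) -> List[int]:
    """Indices of cells whose change exceeds that cell's own threshold."""
    return [i for i, (b, a, t) in enumerate(zip(before, after, thresholds))
            if abs(a - b) > t]


before = [100, 100, 100]
after = [110, 110, 110]      # the same change of 10 in every cell
# Hypothetical set-up: a sensitive background cell, an insensitive cell
# covering a busy foreground area, and one in between.
thresholds = [5, 20, 10]
print(alarmed_cells_per_cell(before, after, thresholds))
```

The same brightness change alarms only the cell that was set up to be sensitive, which a single overall sensitivity cannot achieve.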
Most simple VMD systems have one processor irrespective of the number of cameras. If it requires three frames to analyse a scene, then the processing time for one camera will be about 0.12 seconds. This must be multiplied by the number of cameras in the system. Therefore, with eight cameras the interval between successive analyses of any one camera will be about one second. For example, a 1/2" camera with a 25mm lens has a width of view of about 5m at 20m from the camera. A person could run across this field of view in less than the processing interval and not be detected.
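The figures in this paragraph can be checked with a few lines of arithmetic. The sketch assumes a frame rate of 25 frames per second, which is consistent with three frames taking 0.12 seconds, and a sensor width of 6.4mm for the 1/2" format; the function names are hypothetical.

```python
def revisit_interval_s(cameras: int, frames_per_analysis: int = 3,
                       frame_rate_hz: float = 25.0) -> float:
    """Time between successive analyses of one camera with a shared processor."""
    return cameras * frames_per_analysis / frame_rate_hz


def field_width_m(sensor_width_mm: float, focal_length_mm: float,
                  distance_m: float) -> float:
    """Width of the scene covered at a given distance (pinhole approximation)."""
    return sensor_width_mm / focal_length_mm * distance_m


one_camera = revisit_interval_s(1)
eight_cameras = revisit_interval_s(8)
width = field_width_m(6.4, 25.0, 20.0)
crossing_speed = width / eight_cameras   # speed needed to cross unseen

print(f"one camera    : {one_camera:.2f} s")
print(f"eight cameras : {eight_cameras:.2f} s")
print(f"field width   : {width:.2f} m at 20 m")
print(f"crossing speed: {crossing_speed:.1f} m/s")
```

Under these assumptions the required crossing speed comes out at a little over 5 m/s, which is a running pace, in line with the statement that a runner could cross between analyses.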
Limitations of Simple VMD Systems
The previous examples have served to show the principles of simple video motion detectors. Variations of these types are still available but their use is limited, and they should be used with great caution in anything but the most basic applications. However, they do have uses and can provide a very cost-effective method of motion detection when the situation is appropriate.
The limitations of the types described for demanding external situations are as follows.
- Will not cope with moderate changes in light levels.
- Sporadic generation of alarms in high contrast scenes.
- Will not cope with changing weather conditions.
- Lack of size discrimination means compromise in setting up.
- Non-uniform sensitivity with range.
- Will not cope with size variation due to perspective.
- Slow processing speed can miss moving action.
- Inability to discriminate between small, dark, high contrast objects and large, low contrast objects.
- Prone to false alarm due to camera shake.
- Cell measurements prevent accurate area discrimination.
- Restricted to small areas of view.
- Unlikely to detect a person at 10% of screen height.
- Only simple algorithms can be computed.
- Cannot distinguish between a person moving in a line and a waving object.
- Single processor increases time between frame comparisons.
This article is an extract from chapter 18 of 'The Principles & Practice of CCTV' which is recognised as the benchmark for CCTV installation in the UK.