Nick Bowden, Managing Director of Digifort UK, explores the cost and performance benefits of CCTV analytics technologies.
There are three broad types of video analytics technology available in server-based VMS CCTV solutions. In descending order of accuracy and capability, they are Neural Analytics, then Deep Learning and Artificial Intelligence (DL and AI) Analytics, and finally Binary Large Object, or BLOB.
The most accurate and capable analytics option is neural. It is also the most expensive option to deploy, because the software costs more and it requires high-performance hardware to run.
The other analytics options may be less capable, but they are perfectly suitable for many CCTV applications, where budgets are tighter.
Digifort is a technology partner of Nvidia.
Its analytics software is optimised to run on Nvidia Graphics Processing Units (GPUs). These are fitted in a server alongside the operating system (OS) processor.
VA server ‘performance’ is measured in CUDA cores, which is like brake horsepower (BHP) in cars. GPUs of 4000 CUDA cores or more are commonplace and affordable.
This GPU performance ‘budget’ is distributed across multiple analytics channels, and analytics functionality is allocated to the video channels that need it, with the flexibility to reallocate it to different video channels in the system if required.
NVRs, boxed analytics solutions and cameras with onboard analytics simply do not have this performance ‘grunt’ or system deployment flexibility.
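As a rough illustration of this ‘budget’ idea, the sketch below treats a GPU as a fixed pool of analytics channels that can be assigned to cameras and moved later. The class name, camera names and capacity figure are hypothetical and are not taken from Digifort's software.

```python
class AnalyticsBudget:
    """Hypothetical pool of analytics channels backed by one GPU."""

    def __init__(self, capacity):
        self.capacity = capacity   # total analytics channels the GPU can run
        self.assigned = set()      # cameras currently using a channel

    def assign(self, camera):
        """Give a camera an analytics channel if the budget allows it."""
        if camera in self.assigned:
            return True
        if len(self.assigned) >= self.capacity:
            return False           # budget exhausted
        self.assigned.add(camera)
        return True

    def reallocate(self, old_camera, new_camera):
        """Move an analytics channel from one camera to another."""
        if old_camera not in self.assigned:
            raise KeyError(old_camera)
        self.assigned.discard(old_camera)
        self.assigned.add(new_camera)

    def free(self):
        """Number of analytics channels still unallocated."""
        return self.capacity - len(self.assigned)


budget = AnalyticsBudget(capacity=2)
budget.assign("car park entrance")
budget.assign("loading bay")
print(budget.assign("reception"))   # False: no channel left
budget.reallocate("loading bay", "reception")
print(sorted(budget.assigned))
```

The point of the sketch is that analytics capacity belongs to the server rather than to any one camera, so moving it is a bookkeeping change rather than a hardware change.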
GPU boards are rapidly developing, with processing performance doubling each year, for the same cost.
We can therefore expect to benefit from yet more huge performance improvements and cost reductions in server-based CCTV systems going forward.
Also, dedicating the GPU cores to analytics and the server cores to the OS and video processing is good practice for optimal server performance, as each accesses its respective processor resources differently.
1. Neural Analytics
Neural analytics is a relative newcomer to mainstream CCTV.
Like human recognition, many different objects within a camera view are identified from a library of known objects, with specific new objects “introduced” to the system and other objects learnt by the system over time.
Rules can be applied for individual objects or combinations of objects with real-time alarms or events triggered for an operator response.
Digifort has three neural networks to choose from.
Neural analytics lends itself to ‘occupancy’ type applications, such as the number of cars in a car park or people in a queue.
It recognises the objects ‘seen’ in the camera view, or in a zone within it, and counts them.
Multiple zones from one or many cameras can be aggregated for a site count. Scene backgrounds are ignored, as they are not recognised objects, reducing false alarms.
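A minimal sketch of how zone counts might be aggregated into a site count is shown below. It assumes the analytics engine has already recognised the objects and reports each one as a label with a centre point. The data layout, function names and rectangular zones are illustrative assumptions, not Digifort's API.

```python
def zone_count(detections, zone, label):
    """Count detections of `label` whose centre lies inside a rectangular zone."""
    x0, y0, x1, y1 = zone
    return sum(
        1
        for d in detections
        if d["label"] == label and x0 <= d["x"] <= x1 and y0 <= d["y"] <= y1
    )


def site_count(cameras, label):
    """Sum the zone counts for one object type across every camera on a site."""
    total = 0
    for cam in cameras:
        for zone in cam["zones"]:
            total += zone_count(cam["detections"], zone, label)
    return total


# Hypothetical detections from two cameras covering one car park.
cameras = [
    {
        "zones": [(0, 0, 100, 100)],
        "detections": [
            {"label": "car", "x": 10, "y": 20},
            {"label": "person", "x": 50, "y": 50},
            {"label": "car", "x": 150, "y": 20},   # outside the zone
        ],
    },
    {
        "zones": [(0, 0, 50, 50), (60, 0, 100, 50)],
        "detections": [
            {"label": "car", "x": 25, "y": 25},
            {"label": "car", "x": 80, "y": 10},
        ],
    },
]

print(site_count(cameras, "car"))   # 3
```

Note that overlapping zones would count the same object twice in this sketch, so zones are assumed not to overlap.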
2. Deep Learning and Artificial Intelligence (DL and AI)
DL and AI analytics may also have a neural element and most commonly recognise people, vans, bikes, cars, trucks, groups of people, bags, cyclists and much more, including objects with a specific colour profile.
As a camera scene is ‘learnt’, the DL / AI analytics self-calibrates to the scene background, minimising false alarms.
Many rules can be applied individually or concurrently, such as presence, entry, exit, appearance, disappearance of an object; direction, tailgating filters; counting over a line; and stopped, loitering, abandoned and removed objects.
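To make one of these rules concrete, here is a simplified sketch of ‘counting over a line’. It assumes the analytics engine supplies a track, a list of successive (x, y) positions for one object, and it treats the counting line as infinite rather than as a finite segment. Both are simplifying assumptions, and the function names are hypothetical.

```python
def side(line, point):
    """Return >0, <0 or 0 depending on which side of the line the point lies."""
    (ax, ay), (bx, by) = line
    px, py = point
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def count_crossings(line, track):
    """Count how many times a tracked object's path changes side of the line."""
    crossings = 0
    prev = None
    for point in track:
        s = side(line, point)
        if s == 0:
            continue   # exactly on the line: wait for the next position
        if prev is not None and (s > 0) != (prev > 0):
            crossings += 1
        prev = s
    return crossings


line = ((0, 5), (10, 5))                    # horizontal counting line at y = 5
track = [(2, 1), (3, 4), (4, 6), (5, 9)]    # object moving across the line
print(count_crossings(line, track))         # 1
```

A directional rule could use the sign of the change (negative to positive, or the reverse) to count only objects crossing one way.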
Digifort’s analytics also uses a metadata reporting framework, which allows forensic searching of recorded video for objects other than those specified in the original settings.
Many NVRs, boxed analytics and embedded camera solutions use versions of this analytics type, usually without the neural element, but often lack the processing capability required to maximise their potential, as it is not practical or cost-effective to fit Nvidia GPUs into NVRs.
3. Binary Large Object / BLOB
This is the most basic level of analytics, recognising object size (number of pixels) and behaviour based on motion detection and some simple analytics like line crossing.
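The idea of recognising an object by its size in pixels can be sketched as follows. The example assumes motion detection has already produced a binary mask (1 for a changed pixel, 0 for background) and groups touching pixels into blobs. It is a generic illustration of the technique, not any vendor's implementation.

```python
from collections import deque


def find_blobs(mask, min_pixels=1):
    """Return the pixel count of each 4-connected blob of 1s in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill outwards from this pixel to measure the whole blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_pixels:
                sizes.append(size)
    return sizes


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
print(find_blobs(mask))                 # [4, 1, 1]
print(find_blobs(mask, min_pixels=2))   # [4]
```

The `min_pixels` threshold stands in for the size filter: small blobs such as noise are discarded, but the approach has no idea what the remaining blob actually is.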
Many NVRs use this type of analytics. It is a low-cost option, ideal for driving motion or event recording in a VMS system, to save on storage.
Neural analytics uses D1 (720×576 pixels) video streams for processing, even if the recorded ‘evidential’ stream in the VMS is 4MP, 8MP or more.
Some very specific analytics types use 1080p (1920×1080 pixels), usually when analysing human behaviour.
As an indication of capability, a 3000-core GPU costing under £500 will typically process around 40 neural channels.
A word of warning: some boxed analytics solutions only record the analytics processing stream, which might be at D1 or less, without a concurrent high-resolution (HR) stream for evidence.
This means that the only recorded video may be at low resolution, so do check if you go down a boxed analytics route.
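A quick calculation shows how much less data the analytics stream carries than the higher-resolution streams. The D1 and 1080p dimensions come from the text above. Treating ‘4MP’ as a nominal 4,000,000 pixels is an assumption, since actual sensor dimensions vary.

```python
def pixels(width, height):
    """Number of pixels in one frame."""
    return width * height


d1 = pixels(720, 576)          # 414,720 pixels per frame
full_hd = pixels(1920, 1080)   # 2,073,600 pixels per frame

print(full_hd / d1)                  # 5.0
print(round(4_000_000 / d1, 1))      # 9.6 (nominal 4MP, an assumption)
```

In other words, a 1080p frame holds five times as many pixels as a D1 frame, which is one reason analytics is processed on the smaller stream.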
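The figures above can be turned into rough per-channel numbers. The calculation below uses only the 3000 cores, 40 channels and £500 ceiling quoted in the text. The final estimate for a 4000-core GPU assumes capacity scales linearly with core count, which is an assumption for illustration and not a claim made by the article.

```python
CORES = 3000        # CUDA cores, from the text
CHANNELS = 40       # typical neural channels, from the text
MAX_COST_GBP = 500  # "under £500", used here as an upper bound

cores_per_channel = CORES / CHANNELS
max_cost_per_channel = MAX_COST_GBP / CHANNELS

print(cores_per_channel)       # 75.0 cores per neural channel
print(max_cost_per_channel)    # 12.5, i.e. under £12.50 of GPU per channel

# Assumption: linear scaling with core count (not stated in the article).
estimated_channels_4000 = int(4000 // cores_per_channel)
print(estimated_channels_4000)  # 53
```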
There is a place for each type of analytics when cost and performance are factored in.
However, neural analytics outperforms the other two in accuracy and capability, and it is future-proof, allowing for continuous software upgrades.
This article was originally published in the June Edition of Security Journal UK.