Solution
To overcome this challenge, DataArt’s team developed a continuous tracking system that included the following image processing steps:
STEP 1
The DataArt team chose the OpenCV library to work with video streamed from a camera at a resolution of roughly 1600x1200 pixels.
Each frame entered the data processing pipeline. In many videos, the objects change much more slowly than the frames do (e.g. a person can stay in the same place in line for a couple of minutes, while the video runs at ~30 frames per second), so the first step of the pipeline detected the difference in objects and scene between sequential frames. This reduced the number of frames passed on for further processing, which mattered because hardware resources were limited.
The system is flexible: it was designed to process either the entire video stream or a subset of frames selected according to specified criteria.
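The frame-selection idea in Step 1 can be sketched as follows. This is a minimal illustration, not DataArt's implementation: frames are plain nested lists of grayscale values so that the example runs without dependencies, and the names changed_fraction, select_changed_frames, pixel_delta, and min_fraction are hypothetical, as are their default values. On real video, the per-pixel comparison would more likely be done with OpenCV's cv2.absdiff on frames read from the camera.

```python
from typing import Iterable, Iterator, List, Optional

Frame = List[List[int]]  # grayscale frame: rows of 0-255 pixel intensities


def changed_fraction(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Return the fraction of pixels whose intensity changed by more than pixel_delta."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def select_changed_frames(
    frames: Iterable[Frame], min_fraction: float = 0.01
) -> Iterator[int]:
    """Yield the indices of frames that differ enough from the last kept frame."""
    reference: Optional[Frame] = None
    for index, frame in enumerate(frames):
        # The first frame is always kept; later frames must show enough change.
        if reference is None or changed_fraction(reference, frame) >= min_fraction:
            reference = frame
            yield index
```

The sketch compares each frame with the last frame that was kept rather than with the immediately preceding one. That is an assumption made here so that slow, gradual movement still accumulates into a detectable change; the source only states that differences between sequential frames were detected.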
STEP 2
Next, a neural network trained to detect people was applied to the image. In our case, the result of this step was the number of people found in a certain area of the image. For more complex cases, information on each person's location can also be used.
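Counting the people found in a certain area of the image might look like the sketch below. The source does not describe the network or its output format, so everything here is an assumption: the detector is taken to return bounding boxes with a confidence score, a person is counted when the center of the box falls inside a rectangular region, and the names Detection, count_people_in_region, and min_score are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected person: bounding box corners in pixels plus a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def count_people_in_region(
    detections: Iterable[Detection],
    region: Tuple[float, float, float, float],
    min_score: float = 0.5,
) -> int:
    """Count confident detections whose box center lies inside region (x1, y1, x2, y2)."""
    rx1, ry1, rx2, ry2 = region
    count = 0
    for det in detections:
        if det.score < min_score:
            continue  # skip low-confidence detections
        cx = (det.x1 + det.x2) / 2
        cy = (det.y1 + det.y2) / 2
        if rx1 <= cx <= rx2 and ry1 <= cy <= ry2:
            count += 1
    return count
```

Using the box center is only one possible rule. For a line that is filmed at an angle, a polygonal region or the bottom edge of the box (where the person stands) may be a better fit.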
STEP 3
In the final step, the number of people found was corrected using historical data: counts from the same time of day over the previous several weeks, and counts from the previous few minutes. This made it possible to recognize deviations and to display a more accurate, real-time count of the people actually in line.
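One way such a correction could work is sketched below. The source does not say how the historical data was combined, so this is an assumed approach: the raw count is smoothed with a median over the previous few minutes, and the result is flagged as a deviation when it differs from the typical same-time count of previous weeks by more than a chosen ratio. The names correct_count, recent_counts, same_time_history, and max_ratio are hypothetical.

```python
from statistics import median
from typing import Sequence, Tuple


def correct_count(
    raw_count: int,
    recent_counts: Sequence[int],
    same_time_history: Sequence[int],
    max_ratio: float = 2.0,
) -> Tuple[int, bool]:
    """Return (corrected count, deviation flag) for one new reading."""
    # Smooth out single-frame detector errors using the last few minutes.
    smoothed = median(list(recent_counts) + [raw_count])
    if not same_time_history:
        return round(smoothed), False
    # Typical count for this time slot over the previous weeks.
    typical = median(same_time_history)
    # Flag readings far above or far below what is usual for this time slot.
    deviates = smoothed > typical * max_ratio or smoothed * max_ratio < typical
    return round(smoothed), deviates
```

For example, a single reading of 3 people after several readings of about 10 is treated as a detector miss and corrected back to 10, while a sustained count of about 30 at a time when 10 is typical is kept and flagged as a deviation.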
