Christmas Lights, Pedestrians and Machine Learning – Part 2

We finished Part 1 having calculated that an adjusted total of 51,144 people visited our Christmas light display (many of them multiple times). This adjusted total was estimated by processing nearly 20,000 images through AWS Rekognition and applying a simple algorithm to the results.

As we alluded to, this number may have underestimated the total crowd, and the reason comes down to how well the Rekognition model performed on the data we captured. This isn't to say Rekognition isn't accurate; rather, the images we fed the model were not optimised. The example below shows the raw image we captured alongside the same image after we adjust the brightness, contrast and sharpness of the picture.

Raw (left) versus adjusted image (right)

Comparing the two versions, you can see there are a lot more people present than first apparent once the image parameters (brightness, contrast, sharpness) are adjusted.
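As a rough idea of what that adjustment involves (the enhancement factors below are illustrative only, not the exact values we used), Pillow's ImageEnhance module can apply the brightness, contrast and sharpness tweaks in a few lines:

```python
from PIL import Image, ImageEnhance

def enhance_frame(path, brightness=2.0, contrast=1.5, sharpness=2.0):
    """Brighten, add contrast to and sharpen a raw camera frame.

    The factors are illustrative only; in practice they would be tuned
    by eye against the night-time footage.
    """
    img = Image.open(path)
    img = ImageEnhance.Brightness(img).enhance(brightness)
    img = ImageEnhance.Contrast(img).enhance(contrast)
    img = ImageEnhance.Sharpness(img).enhance(sharpness)
    return img

enhance_frame("raw_frame.jpg").save("adjusted_frame.jpg")
```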

The Rekognition model was also able to detect more people in the adjusted image: 15 people in the raw image versus 22 in the adjusted one. The extra detections come mainly from areas that are dark in the raw image and become lighter and clearer after adjustment.
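For anyone curious how those counts come out of Rekognition, the sketch below shows the general shape of the call: `detect_labels` returns a "Person" label whose `Instances` list holds one bounding box per detected person (the file name and confidence threshold here are illustrative):

```python
import boto3

rekognition = boto3.client("rekognition")

def count_people(image_path, min_confidence=50):
    """Return the number of 'Person' instances Rekognition finds in an image."""
    with open(image_path, "rb") as f:
        response = rekognition.detect_labels(
            Image={"Bytes": f.read()},
            MinConfidence=min_confidence,
        )
    for label in response["Labels"]:
        if label["Name"] == "Person":
            return len(label["Instances"])
    return 0

print(count_people("adjusted_frame.jpg"))  # e.g. 22 for the adjusted image above
```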

Rekognition results: raw (left) vs adjusted image (right).

In both instances, the model doesn't pick up everyone present in the image, and this led us to try building a custom model for this scenario in an attempt to get better results.

Targeted Model Development

In some of our other ML/AI projects we have been using a tool called Darwin, developed by v7Labs. Darwin allows users to create and then refine ML models based on their own data. You can read more about Darwin here; below we'll dive into the process we used.

Using Darwin, the first step is to upload some images to train the model on, open them and start labelling. As can be seen in the raw image below, the camera and lighting conditions made it hard for even humans to detect pedestrians in the image.

Raw image in Darwin

Fortunately, Darwin has a built-in image manipulation tool, and by adjusting some of the parameters (contrast, brightness) the people in the image become easier for the human eye to see. Next we draw bounding boxes around all the people we can see. The image below shows the adjusted image, partly annotated (with labels drawn). Once all the people have been identified by the labeller (a person), the image is sent for peer review.

Adjusted image with several bounding boxes drawn.

Darwin recommends at least 500 samples before a model is trained. For this project we processed 1,500 images and labelled 5,214 people in them before training our first and second models.

Once the model had finished training, we tested it out by running both Rekognition and Darwin against the same images to establish which was more accurate. We limited the head-to-head comparison to the images from the 23rd and 24th of December, as they had the most people detected by Rekognition in the first instance.
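The head-to-head itself was just a per-image tally over the same set of frames. A minimal sketch of that tally is below; `count_people` is the Rekognition helper sketched earlier, while `count_people_darwin` is a hypothetical placeholder for whatever inference call you have against the trained Darwin model (V7's inference options aren't covered in this post), and the directory layout is assumed:

```python
from pathlib import Path

def count_people_darwin(image_path: str) -> int:
    """Hypothetical placeholder: substitute the real inference call
    for the trained Darwin model here (not shown in this post)."""
    raise NotImplementedError

def compare_models(image_dir: str) -> dict:
    """Tally per-image person counts for both models over the same frames."""
    totals = {"rekognition": 0, "darwin": 0}
    for path in sorted(Path(image_dir).glob("*.jpg")):
        totals["rekognition"] += count_people(str(path))    # Rekognition helper from earlier
        totals["darwin"] += count_people_darwin(str(path))  # Darwin model inference
    return totals

print(compare_models("frames/2021-12-23"))
```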

Adjusted daylight image

Starting with the 23rd of December, we had to split the data into two (2) groups: the first group being "Natural Daylight" and the second "No Natural Light". The reason for this split was to enable specific pre-processing (tweaking the brightness & contrast) of the dark images before passing them to the models for evaluation. If we applied the same image manipulation to the daylight images we ended up with "all white" pictures (like the adjusted daylight image above) with nothing visible (not great for counting people!).
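As a rough sketch of that split (the daylight cut-off hours and enhancement factors below are assumptions for illustration, not the exact values we used), frames can be routed by capture time so only the dark frames get the brightness/contrast boost:

```python
from datetime import datetime
from PIL import Image, ImageEnhance

def is_daylight(captured_at: datetime, first_light: int = 6, last_light: int = 20) -> bool:
    """Crude 'Natural Daylight' vs 'No Natural Light' split by hour of day."""
    return first_light <= captured_at.hour < last_light

def preprocess(path: str, captured_at: datetime) -> Image.Image:
    """Boost brightness and contrast only for frames captured after dark,
    so the daylight frames don't get blown out to 'all white'."""
    img = Image.open(path)
    if not is_daylight(captured_at):
        img = ImageEnhance.Brightness(img).enhance(2.0)
        img = ImageEnhance.Contrast(img).enhance(1.5)
    return img
```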

The graphs below summarise the results of the two (2) models for the 23rd and 24th of December 2021.

Graphed results of the model comparison

The table below summarises the results of both models.

Results table from the model comparison

As can be seen in both the graphs and the table, the specifically trained Darwin model was able to detect a much larger number of people in the adjusted images. A quick random sample shows the comparative results of the models, but also shows that neither model is 100% accurate in detecting all of the people present in the captured images. The images below show the base adjusted image, then the Rekognition detections, and then the Darwin detection results.

Now looping back to where we started: was the number of people detected higher or lower than our original estimates? Based on the targeted Darwin model's results, we can reasonably say the number of people who visited the lights display was higher than first estimated. The number of people detected in the samples from the 23rd and 24th of December was 2.1 times higher (see below) than Rekognition's count, and applying the 'overlap' factor of 50% would give over 30,000 people visiting on those two (2) nights!

Table Comparison
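Putting those two factors together, the revised estimate is just the Rekognition count scaled up by the roughly 2.1× uplift seen with the Darwin model and then halved by the 50% 'overlap' factor from Part 1. A minimal sketch of that arithmetic (the sample input is illustrative only, not our actual two-night total):

```python
DARWIN_UPLIFT = 2.1    # Darwin detected roughly 2.1x as many people as Rekognition
OVERLAP_FACTOR = 0.5   # the same 50% 'overlap' discount applied in Part 1

def estimate_visitors(rekognition_detections: int) -> float:
    """Scale the raw Rekognition detections up by the Darwin uplift,
    then apply the overlap discount to estimate visitors."""
    return rekognition_detections * DARWIN_UPLIFT * OVERLAP_FACTOR

# Illustrative input only; substitute the real two-night Rekognition total.
print(estimate_visitors(30_000))  # -> 31500.0 people across the two nights
```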

In Conclusion

Based on the images analysed using the two (2) different models, and many manual inspections of the images, we can conclude that there were a lot (very scientific term!) of visitors to our lights. The exact number is in the tens of thousands, but given the limitations of our camera location, lighting and algorithms we are not able to give a definitive number!

Next up, we’ll look at the lessons we’ve learnt and the changes we’ll make for the 2022 Christmas Light show!