Identifying the Risks for Online Businesses that Rely on AI Age-Estimation

New, simple-to-use AI image, video and voice generators are making AI-based age estimation obsolete

Dr Rachel O’Connell

The child, teen and adult females in the photos above do not exist; they are AI-generated. Social media companies are increasingly trialling AI age-estimation services to gauge the approximate ages of their users, and it is assumed that, in time, a proportion of platforms will adopt age-estimation tools. Here is why the approach does not work and is, in fact, dangerously inadequate.

What is image-, video- and voice-based age estimation? The AI-assessing-AI circularity problem

In a typical online age-estimation process, users signing up to a social media platform are asked to use their phone's camera to take a series of photos, which the age-estimation AI analyses before returning an estimated age to the app. Users may also be asked to say specific words or tilt their heads in a certain way, as a 'liveness' check that a real person is present in real time.
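As a concrete illustration, here is a minimal sketch of the analysis step, assuming the open-source DeepFace library and an illustrative file name; commercial vendors expose the same 'image in, estimated age out' pattern through their own proprietary APIs:

```python
# A minimal sketch of the estimation step, assuming the open-source
# DeepFace library (pip install deepface). "selfie.jpg" stands in for
# a frame captured from the user's camera during sign-up.
from deepface import DeepFace

# Recent DeepFace versions return one result dictionary per detected face.
results = DeepFace.analyze(img_path="selfie.jpg", actions=["age"])

for face in results:
    print("Estimated age:", face["age"])  # the platform gates access on this number
```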

Adults with a sexual interest in children defeating age-estimation on social media

It is widely understood by law enforcement agencies that criminals are early adopters of new technologies. This is especially the case for adults with a sexual interest in children, who in a growing number of cases are using AI generators to create child abuse images. These adults will also use AI-generated images of children to fool the age-estimation tools guarding sites used predominantly by children, in order to target and victimise those children. In other words, the age-estimation tools that platforms deploy to protect children will fail to detect a significant number, if not all, of these adults. Age estimation does not help platforms to better protect children online.

Children circumventing age-estimation tools on adult content sites

Age- and voice-estimation tools can also be circumvented by children seeking to access harmful adult content. Here are examples of AI-generated 18+ year old males and females of the kind that an under-16 could generate to fool age-estimation systems and gain access, for example, to adult content.

How concerned should parents, children, regulators and companies be?

Internet users, including parents, young people and children, should also be aware of the ease with which AI age-estimation services can be circumvented to impersonate children, creating a 'wolf in sheep's clothing' scenario. It is vital that children do not have a false sense of security: the person they are interacting with, who may have a child's profile picture and have passed an age-estimation check, may not in fact be a child.

The underlying aim of existing and emerging data protection regulations globally is for companies to reliably know the ages of their users and to obtain parental consent before processing younger users' data. Age estimation does not enable companies to meet either of these requirements.

Regulators are enforcing data protection laws and issuing substantial fines to companies that process children's data without parental consent. The UK data protection regulator fined TikTok £12.7M, the FTC fined Epic Games half a billion dollars, and the Irish data protection regulator fined Meta $400M for mishandling children's data.

An increasing number of regulations require companies to reliably know the ages of their users and to create age-appropriate spaces for younger users. Regulators and companies need to be aware of the ways in which age-estimation tools can be circumvented, and consider the associated regulatory exposure and liability of companies that provide and rely on these services.

The techie bit…

To the age-estimation AI, the image of a person's face is simply a pattern of pixels, and the pixels are numbers. Facial age-estimation technology is trained to spot patterns in those numbers, so it learns 'this pattern is what 16-year-olds usually look like'. These are exactly the patterns that widely available online AI image generators use to create images of people in specific age bands. So when an age-estimation service analyses an AI-generated image, it will not necessarily be able to detect that the image is AI-generated; instead, the AI that generated the image and the age-estimation AI will agree on the estimated age, to within a mean absolute error of up to two years.

We are increasingly familiar with AI tools and filters that show what you will look like as an older person, or what your child will look like. Such images can be converted into videos with completely realistic human-like features and movements.
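A minimal sketch of what 'pixels are numbers' means in practice, assuming the Pillow and NumPy libraries and an illustrative file name:

```python
# A minimal sketch, assuming Pillow and NumPy: to the model, a face
# photo is nothing more than an array of numbers.
import numpy as np
from PIL import Image

img = np.asarray(Image.open("face.jpg"))  # "face.jpg" is illustrative
print(img.shape, img.dtype)               # e.g. (224, 224, 3) uint8

# An age estimator is a learned function f(pixels) -> age. An image
# generator is trained on the same statistics, producing pixel patterns
# that score as the requested age band: the circularity problem.
```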

Similarly, AI voice age-estimation services are affected by the same circularity problem: they look for the very markers that AI voice generators use to create, for example, a natural-sounding child's voice that is indistinguishable from a living person's voice.
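The same point can be sketched for audio, assuming the librosa library and an illustrative file name; voice age estimators typically work from spectral features such as MFCCs and pitch, the very statistics a neural voice generator is trained to reproduce:

```python
# A minimal sketch, assuming librosa (pip install librosa) and an
# illustrative audio file. Voice age estimators typically classify
# spectral features; a child-voice TTS generator is optimised to
# reproduce the same feature distributions.
import librosa

y, sr = librosa.load("voice_sample.wav", sr=16000)             # audio as numbers
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)             # spectral envelope
f0, voiced, probs = librosa.pyin(y, fmin=80, fmax=400, sr=sr)  # pitch contour

# A classifier maps (mfcc, f0, ...) to an age band; generated speech that
# matches these distributions gives it no basis to detect the fake.
```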

Internet users with zero coding skills can use AI tools such as Midjourney to generate photos and videos of AI-generated 'people', combine them with ChatGPT-written scripts for the human-like avatar to speak in response to liveness-detection prompts, and add hyper-realistic AI voice generation that sounds like a child or an adult, all to circumvent AI age- and voice-estimation tools.

Deep fakes and Presentation Attack Detection failures

In 2020, iProov warned of the emerging threat of deep fakes being digitally injected into camera feeds to impersonate an individual during biometric verification, and so beat a biometric Presentation Attack Detection (PAD) system.

A large-scale real-world example is a scam against China's taxation system in 2021, in which high-resolution images of people were made to look live, with each "nodding, shaking, blinking and opening their mouths." Presentation attack detection (PAD) tests are designed to detect efforts to spoof liveness. Digital injection attacks bypass the camera on the device by feeding fake, AI-generated video directly to the AI for evaluation. In other words, the AI-generated human avatar speaks the words it is instructed to say, using a text-to-speech natural voice generator, and responds to commands to move in a particular way, using text-to-motion AI, to fool the age-estimation AI.

Security firm Sensity published a report this year explaining how it used deepfake videos to fool the liveness checks of nine of the ten most widely adopted biometric verification vendors for KYC, finding the vast majority severely vulnerable to deepfake attacks.

Herein lie the weaknesses of age- and voice-estimation tools. Age-estimation AI has trouble detecting whether the image, voice or video it is analysing is itself AI-generated. Determining whether the 'person' moving in response to a command is actually a live person has become, and will continue to become, increasingly difficult as AI, and efforts to circumvent age estimation, grow more sophisticated.

Companies that rely on age-estimation must recognise that this is a game of whac-a-mole, AI combating AI, and consider the impact on their levels of regulatory and liability exposure.

The role of standards and certification in the face of rapid innovation

Traditionally, new tools are evaluated from a cybersecurity perspective, the related threat vectors are identified, and methods to combat misuse are put in place. For example, children and adults were using latex costume masks to try to defeat age-estimation checks online; ISO/IEC 30107-3 Level 2 is the standard against which the efficacy of age-estimation tools in detecting latex masks is assessed.

Beyond latex masks

However, the explosion of AI tools in the last year means there is little or no need to rely on latex masks to fool age-estimation AI. The attack vectors have become far more sophisticated and will continue to evolve at an extremely rapid pace. Since 2012, AI computing power has doubled roughly every 3.4 months, while the ISO standards designed to test the efficacy of measures against AI-assisted criminal activity take one to three years to develop.
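A back-of-the-envelope calculation makes the mismatch concrete, taking the 3.4-month doubling figure at face value:

```python
# Back-of-the-envelope: AI compute growth over the lifetime of a
# standards cycle, using the 3.4-month doubling time cited above.
DOUBLING_MONTHS = 3.4

for years in (1, 3):
    factor = 2 ** (years * 12 / DOUBLING_MONTHS)
    print(f"{years} year(s) of drafting: ~{factor:,.0f}x more compute")

# Roughly 12x in one year and roughly 1,500x in three years: a standard
# can be outpaced long before it is published.
```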

Age-check certification schemes currently have little or no means of assessing whether the image or voice being analysed belongs to a human or to an AI-generated, lifelike child, teen or adult that moves and speaks the way the AI expects humans to. Certification schemes rely on standards and, as such, will not be able to keep pace with the exponential rate of AI development and adoption.

Increasing risks and costs

Companies in the biometric identity verification space are investing huge amounts of money in R&D to combat criminals' efforts to exploit these tools. Currently, the costs associated with ID verification are higher than those of AI age-estimation, because full ID verification, for example when opening a bank account, requires higher levels of assurance.

However, the same level of investment is required in age and voice estimation to combat circumvention and new exploits. Companies currently using age estimation at zero cost, in exchange for letting the age-estimation AI train on their data sets, and those paying a lower cost today, should brace themselves to absorb these additional costs in future, along with the need to re-check part or all of their user base at regular intervals as more is learned about exploits.

Increasingly, we are becoming familiar with age-estimation tools in automated check-out settings when buying, for example, a bottle of wine at a supermarket. These kiosks are located in public areas, have environmental checks in place, offer limited scope for circumvention, and are undoubtedly very convenient. However, while deploying age estimation in a real-world setting has benefits, it is not necessarily the right approach online.

What is needed is careful consideration of the pros and cons of the various age-assurance services and a thorough mapping of the potential unintended consequences of relying on AI-driven age estimation. AI age-estimation does not enable compliance with data protection laws, is easily defeated by the combination of AI circularity and ill-intentioned users, and presents a significant risk to children's safety online.

Author’s bio:

Dr Rachel O'Connell is the author of PAS 1296, the Age Checking Code of Practice, published by the British Standards Institution (BSI) and now becoming an ISO standard. Rachel is a member of the BSI technical committee IST/33/5 Identity Management and Privacy Technologies and its panel IST/33/5/5 Age Assurance Standards. Rachel's PhD examined the implications of online paedophile activity for investigative strategies, and its findings informed the drafting of the Sexual Offences Act 2003 (UK).
