
Team develops new deepfake detector designed to be less biased

Deepfake detection algorithms often perform differently based on race and gender, including a higher false positive rate for Black men than for white women. New algorithms developed at the University at Buffalo are designed to reduce those gaps. Credit: Siwei Lyu

The image spoke for itself. Siwei Lyu, a computer scientist and deepfake expert at the University at Buffalo, created a photo collage from hundreds of faces that his detection algorithms had incorrectly classified as fake, and the resulting composite clearly had a predominantly darker skin tone.

“The accuracy of a detection algorithm should be statistically independent of factors such as race,” Lyu says, “but obviously many existing algorithms, including ours, inherit a bias.”

Lyu, Ph.D., co-director of UB’s Center for Information Integrity, and his team have developed what they believe are the first deepfake detection algorithms specifically designed to be less biased.

Their two machine learning methods (one that makes the algorithms aware of demographics and another that blinds them to them) reduced disparities in accuracy between races and genders and, in some cases, even improved overall accuracy.

The research, published on the arXiv preprint server, was presented at the Winter Conference on Applications of Computer Vision (WACV), held from January 4 to 8.

Lyu, the study’s lead author, collaborated with his former student Shu Hu, Ph.D., now an assistant professor of computer science and information technology at Indiana University-Purdue University Indianapolis, and with George Chen, Ph.D., assistant professor of information systems at Carnegie Mellon University. Other contributors include Yan Ju, a Ph.D. student in Lyu’s Media Forensics Laboratory at UB, and postdoctoral researcher Shan Jia.

Ju, the study’s first author, says detection tools tend to be less scrutinized than the AI tools they keep in check, but that doesn’t mean they shouldn’t be held accountable as well.

“Deepfakes have been so disruptive to society that the research community was in a rush to find a solution,” says Ju, “but even though these algorithms were created for a good cause, we still need to be aware of their collateral consequences.”

Demographic-aware vs. demographic-agnostic

Recent studies have found large disparities in the error rates of deepfake detection algorithms (up to a 10.7% difference in one study) across different races. In particular, some algorithms have proven better at judging the authenticity of lighter-skinned subjects than of those with darker skin.

This can put certain groups at greater risk of having their real image labeled as fake, or perhaps even more harmful, having a manipulated image of them labeled as real.

The problem is not necessarily the algorithms themselves, but the data they have been trained on. Middle-aged white men are often overrepresented in these photo and video data sets, so the algorithms are better at analyzing them than at analyzing underrepresented groups, says Lyu, a SUNY Empire Professor in the Department of Computer Science and Engineering at UB’s School of Engineering and Applied Sciences.

“Let’s say one demographic group has 10,000 samples in the data set and the other only has 100. The algorithm will sacrifice accuracy on the smaller group to minimize errors on the larger group,” he adds. “This reduces overall errors, but at the expense of the smaller group.”
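To see why that happens, here is a minimal sketch, assuming hypothetical per-sample losses and the 10,000-to-100 split from Lyu’s example. It is illustrative only, not the team’s code: a plain average loss is dominated by the larger group, while a group-balanced average treats both groups equally.

```python
# Toy illustration of how imbalance skews training: with 10,000 samples in
# one group and 100 in another, the plain average loss is dominated by the
# larger group, so errors on the smaller group barely move the objective.
import torch

n_major, n_minor = 10_000, 100
loss_major = torch.rand(n_major)  # hypothetical per-sample losses, larger group
loss_minor = torch.rand(n_minor)  # hypothetical per-sample losses, smaller group

# Plain average: the larger group contributes ~99% of the objective.
plain_avg = torch.cat([loss_major, loss_minor]).mean()

# Group-balanced average: each group contributes equally, regardless of size.
balanced_avg = 0.5 * loss_major.mean() + 0.5 * loss_minor.mean()

print(f"plain average loss:  {plain_avg.item():.3f}")
print(f"group-balanced loss: {balanced_avg.item():.3f}")
```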

While other studies have attempted to make databases more demographically balanced (a time-consuming process), Lyu says his team’s study is the first attempt to actually improve the fairness of the algorithms themselves.

To explain his method, Lyu uses an analogy of a teacher evaluated by students’ test scores.

“If a teacher has 80 students with good results and 20 with bad results, they will still end up with a pretty good average,” he says. “So instead, we want to give a weighted average to students in the middle, forcing them to focus more on everyone instead of the dominant group.”

First, their demographic-aware method provided the algorithms with data sets that labeled subjects’ gender (male or female) and race (white, Black, Asian or other) and instructed them to minimize errors in the underrepresented groups.

“Basically, we are telling the algorithms that we care about overall performance, but we also want to ensure that the performance of each group meets certain thresholds, or at least does not fall too far below the overall performance,” Lyu says.
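A rough sketch of what such a demographic-aware objective could look like in PyTorch is shown below. The function name, the tolerance `eps`, and the penalty weight `lam` are illustrative assumptions, not the loss used in the paper; the sketch only captures the idea of combining overall performance with per-group thresholds.

```python
import torch
import torch.nn.functional as F

def demographic_aware_loss(logits, labels, group_ids, eps=0.05, lam=1.0):
    """Overall loss plus a penalty whenever a labeled group falls behind.

    eps and lam are illustrative hyperparameters, not values from the paper.
    """
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    overall = per_sample.mean()

    penalty = logits.new_zeros(())
    for g in torch.unique(group_ids):
        group_loss = per_sample[group_ids == g].mean()
        # Penalize only the amount by which this group's loss exceeds the
        # overall loss by more than the tolerance eps.
        penalty = penalty + torch.clamp(group_loss - overall - eps, min=0.0)

    return overall + lam * penalty

# Example: 2-class logits for eight samples from two labeled groups.
logits = torch.randn(8, 2)
labels = torch.randint(0, 2, (8,))
groups = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])
print(demographic_aware_loss(logits, labels, groups))
```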

However, data sets are typically not labeled by race and gender. Therefore, the team’s demographic-agnostic method classifies deepfake videos not based on the demographics of the subjects, but on characteristics of the video that are not immediately visible to the human eye.

“Maybe a group of videos in the data set corresponds to a particular demographic, or maybe it corresponds to some other characteristic of the video, but we don’t need demographic information to identify them,” Lyu says. “This way, we don’t have to select which groups should be highlighted. Everything is automated based on which groups make up that middle portion of the data.”
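One way to picture the demographic-agnostic idea is to group samples by learned features alone, without any race or gender labels, and then give extra weight to whichever group is currently handled worst. The k-means clustering, the group count, and the 50/50 blend below are assumptions for illustration; the paper’s actual procedure differs in its details.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def feature_grouped_loss(features, logits, labels, n_groups=4):
    """Cluster on features (no demographic labels), then upweight the
    worst-performing cluster alongside the overall average.

    Assumes features has shape (N, D) with enough samples that every
    cluster is non-empty.
    """
    clusters = KMeans(n_clusters=n_groups, n_init=10).fit_predict(
        features.detach().cpu().numpy()
    )
    clusters = torch.as_tensor(clusters, device=logits.device)

    per_sample = F.cross_entropy(logits, labels, reduction="none")
    group_losses = torch.stack(
        [per_sample[clusters == g].mean() for g in range(n_groups)]
    )
    # Blend the average cluster loss with the worst cluster's loss.
    return 0.5 * group_losses.mean() + 0.5 * group_losses.max()
```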

Improving fairness and accuracy

The team tested their methods using the popular FaceForensics++ data set and the state-of-the-art Xception detection algorithm. The methods improved all of the algorithm’s fairness metrics, such as equal false positive rates across races, with the demographic-aware method performing best.
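As a concrete reading of one such metric, the sketch below computes the gap in false positive rates (real faces flagged as fake) between groups. The labels, predictions, and group names are made up for illustration and are not data from the study.

```python
import numpy as np

def fpr_gap(y_true, y_pred, groups):
    """Largest difference in false positive rate between any two groups.

    y_true: 1 = fake, 0 = real; y_pred: predicted labels from the detector.
    """
    fprs = {}
    for g in np.unique(groups):
        real_in_group = (groups == g) & (y_true == 0)
        fprs[g] = float(np.mean(y_pred[real_in_group] == 1))  # real flagged as fake
    return max(fprs.values()) - min(fprs.values()), fprs

# Hypothetical example with two groups, "A" and "B".
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 1])
groups = np.array(["A", "A", "A", "A", "A", "B", "B", "B"])
print(fpr_gap(y_true, y_pred, groups))  # gap of 0.75 between groups A and B
```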

Most importantly, Lyu says, their methods increased the algorithm’s overall detection accuracy: from 91.49% to 94.17%.

However, when using the Xception algorithm with other data sets, or the FaceForensics++ data set with other algorithms, the methods, while still improving most fairness metrics, slightly reduced overall detection accuracy.

“There may be a small trade-off between performance and fairness, but we can ensure that the performance degradation is limited,” Lyu says. “Of course, the fundamental solution to the problem of bias is to improve the quality of the data sets, but for now we should build fairness into the algorithms themselves.”

More information:
Yan Ju et al, Improving fairness in deepfake detection, arXiv (2023). DOI: 10.48550/arxiv.2306.16635

Journal information:
arXiv

Provided by the University at Buffalo

Citation: Team develops new deepfake detector designed to be less biased (2024, January 16) retrieved January 16, 2024 from https://techxplore.com/news/2024-01-team-deepfake-detector-biased.html

This document is subject to copyright. Apart from any fair dealing for private study or research purposes, no part may be reproduced without written permission. The content is provided for informational purposes only.
