The term "deepfake" entered public discourse in 2017 when a Reddit user posted face-swapped pornographic videos using a technique derived from the autoencoder architectures that were then becoming standard in deep learning research. What began as a harassment tool has since become an entire technical subdomain, a multi-billion-dollar fraud category, a routine election interference vector, and a persistent challenge to the epistemic foundations of democratic societies. This review surveys the current state of the phenomenon, the evidence on harm, and the response architectures being attempted.
TECHNICAL BASIS
Most deepfake systems in public use today derive from two architectural families. The first is the generative adversarial network (GAN), introduced by Goodfellow et al. in 2014. A GAN trains two neural networks in opposition: a generator that produces synthetic samples and a discriminator that attempts to distinguish real samples from the generator's output. Training proceeds, in the idealized case, until the discriminator can no longer reliably distinguish real from generated samples. The architectural consequence is that GAN-generated content is, by construction, optimized to defeat a learned detector during training.
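For readers who want the formal statement, the two-network game described above is usually written as the minimax objective from Goodfellow et al. (2014). Here D(x) is the discriminator's estimated probability that x is a real sample, G(z) is the generator's output for a noise input z, and p_data and p_z are the data and noise distributions. Practical systems typically use modified variants of this loss, so it is best read as the reference formulation rather than what any particular deepfake tool optimizes.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator pushes V up by scoring real samples near 1 and generated samples near 0; the generator pushes V down by producing samples the discriminator scores as real.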
The second family is diffusion models, which generate samples by iteratively denoising from random noise toward coherent outputs. Diffusion has become dominant for image and video generation since roughly 2022 and is the basis for most widely deployed text-to-image systems including Stable Diffusion. Diffusion-based video generation is newer but advancing rapidly.
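The iterative-denoising idea can be illustrated with a deliberately tiny sketch. It assumes a one-dimensional "dataset" that is simply a Gaussian, so the ideal noise estimate has a closed form and no neural network is needed; real systems learn that estimate with a large network over images or video frames. The schedule endpoints, step count, and all names below are hypothetical choices for illustration, and the update rule is the standard DDPM-style ancestral sampling step, not the procedure of any specific product.

```python
import math
import random

# Toy 1-D setting (an assumption for illustration): the "data" distribution is
# a Gaussian N(MU, SIGMA^2), for which the ideal noise predictor is known exactly.
MU, SIGMA = 3.0, 0.5
T = 200  # number of denoising steps (hypothetical)

# Linear noise schedule; the endpoints are illustrative choices.
betas = [1e-4 + (0.1 - 1e-4) * t / (T - 1) for t in range(T)]
alphas = [1.0 - b for b in betas]
alpha_bars = []
running = 1.0
for a in alphas:
    running *= a
    alpha_bars.append(running)


def ideal_noise_estimate(x_t, t):
    """E[noise | x_t] for Gaussian data; a trained network would approximate this."""
    ab = alpha_bars[t]
    var_t = ab * SIGMA ** 2 + (1.0 - ab)
    return math.sqrt(1.0 - ab) * (x_t - math.sqrt(ab) * MU) / var_t


def sample(rng):
    """Start from pure noise and denoise one step at a time."""
    x = rng.gauss(0.0, 1.0)
    for t in range(T - 1, -1, -1):
        eps_hat = ideal_noise_estimate(x, t)
        x = (x - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps_hat) / math.sqrt(alphas[t])
        if t > 0:  # no fresh noise is added on the final step
            x += math.sqrt(betas[t]) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
samples = [sample(rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"mean={mean:.2f} std={std:.2f}")  # should land near MU and SIGMA
```

Each sample begins as unstructured noise and ends up distributed like the data, which is the same mechanism, at vastly larger scale, that turns noise into a photorealistic face.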
Voice cloning occupies a related technical space. Contemporary voice cloning systems can synthesize a target speaker's voice from a few seconds of reference audio, producing outputs that are difficult for both humans and automated speaker-verification systems to distinguish from the original.
The salient point for policy and social impact is that synthesis across all three modalities (face and body synthesis, audio cloning, and text generation) has matured to the point where high-quality outputs are achievable with modest compute and open-source tooling. The technical barrier to creating convincing synthetic media has collapsed.
THE HUMAN DETECTION THRESHOLD
Multiple studies have documented the degradation of human ability to distinguish synthetic from authentic content. Groh et al. showed that participants in a 2022 study identified deepfake videos at rates only modestly better than chance for state-of-the-art generations. Subsequent work has extended this finding to newer generation methods; the direction has been consistent.
The practical implication is that careful human observation, the traditional defense, can no longer be relied on. The human perceptual system, evolved over millions of years to read faces, is being outpaced by systems specifically trained on face data. This is not an argument that humans are bad at reading faces; it is an argument that deepfake generators are optimized against exactly the cues humans rely on, and have largely succeeded at erasing them.
THE DOCUMENTED INCIDENT RECORD
The catalog of high-impact deepfake incidents has grown steadily. A partial list of events that have received substantial primary-source documentation:
January 2024: New Hampshire Biden robocall. A synthetic voice clone of U.S. President Joe Biden was deployed in a robocall to New Hampshire Democratic voters, urging them not to vote in the primary. The FCC subsequently ruled that AI-generated voices in robocalls count as "artificial" voices under the Telephone Consumer Protection Act, making such calls illegal without the recipient's prior consent. The incident was the first widely recognized use of voice-cloning deepfakes in a U.S. election.
February 2024: Hong Kong video-call fraud. An employee at a multinational firm's Hong Kong office transferred the equivalent of approximately $25 million after participating in a video conference in which all other attendees, including the company's UK-based CFO, were deepfakes. Hong Kong police confirmed the incident in a public statement. The case is significant because it demonstrates that multi-participant deepfake video calls, convincing enough to authorize large transfers, are operationally achievable by criminal organizations.
January 2024: Non-consensual Taylor Swift imagery. Sexually explicit deepfake images of the musician Taylor Swift circulated on X (formerly Twitter), accumulating tens of millions of views before being removed. The incident triggered bipartisan legislative responses and industry policy reviews. It is significant for illustrating that even the most famous and well-resourced individuals have limited recourse against deepfake harassment once the content is in circulation.
September 2023: Slovakia election audio deepfake. A doctored audio recording, purporting to capture Progressive Slovakia leader Michal Šimečka discussing vote-buying with a journalist, circulated in the final 48 hours of the Slovak parliamentary campaign. Progressive Slovakia lost the election. The audio's causal contribution to the outcome cannot be determined, but the timing (deployed during a media moratorium that prevented effective rebuttal) was tactically significant.
Ongoing: Non-consensual intimate imagery (NCII). Multiple industry analyses have found that the overwhelming majority of deepfake videos in circulation are non-consensual intimate imagery, primarily targeting women. This is the category with the highest absolute volume of harm and the weakest legal response in most jurisdictions.
Ongoing: Business email compromise and voice-cloning fraud. The FBI's Internet Crime Complaint Center has reported rising losses from AI-enabled fraud, including voice cloning used to impersonate executives authorizing wire transfers. The specific dollar figures vary by reporting period and methodology, but the trend is unambiguously upward.
THE LIAR'S DIVIDEND
The most influential theoretical contribution to deepfake scholarship is the concept of the "liar's dividend," introduced by law professors Robert Chesney and Danielle Citron in a 2019 paper in the California Law Review. Chesney and Citron argue that the primary structural harm from deepfakes is not the fabrication of false content but the destruction of trust in authentic content. Once deepfakes are sufficiently convincing and sufficiently common, anyone caught on camera can plausibly claim the recording is synthetic, regardless of its actual provenance.
The liar's dividend operates independently of whether any specific deepfake exists. Its mechanism is the possibility of deepfakes, not the fact of them. As soon as a population is aware that photorealistic synthetic video is achievable, authentic recordings of consequential events become deniable. Video evidence of police conduct, corporate malfeasance, war crimes, and political statements all become subject to the universal defense: "that's a deepfake."
The empirical literature has begun to document this effect. Hameleers et al. showed in a 2024 study that subjects exposed to the concept of deepfakes subsequently discounted authentic video evidence at higher rates, even when the evidence was demonstrably real. The liar's dividend is not just a theoretical concern; it is changing how people evaluate all digital evidence.
Chesney and Citron's paper pre-dates the most recent generation of deepfake technology, and its predictions about the trajectory have proven conservative rather than alarmist. The paper is the most-cited theoretical work on deepfakes in the policy literature for good reason.
THE DETECTION ARMS RACE
The obvious response to synthetic media is detection: build classifiers that can identify whether a given image, video, or audio clip is machine-generated. A substantial research literature has accumulated on this approach, along with commercial products and government programs.
The structural problem is that detection is at a disadvantage by design. GANs are explicitly trained against a discriminator, and any published detector can be folded into training as an additional discriminator, so improvements in detection tend to feed improvements in generation. Diffusion models are not adversarially trained but can be fine-tuned against any fixed detector to similar effect. The arms race is asymmetric in favor of generation because the generator only needs to fool the detectors actually deployed, while a detector must generalize across an open-ended space of possible manipulations.
The empirical record has been grim. Detection accuracy on state-of-the-art generations is substantially lower than the 95%+ figures sometimes quoted for older deepfake detection benchmarks, and degrades further against adversarially crafted examples. Moreover, even accurate detection at the individual image level is operationally insufficient: social media platforms process billions of uploads daily, and running every upload through an expensive detection model is economically and technically prohibitive.
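A second reason that per-item accuracy does not translate into operational usefulness at platform scale is base-rate arithmetic, which a short calculation makes concrete. The sketch below is illustrative only: the function name and every input (upload volume, share of synthetic content, detector error rates) are hypothetical placeholders, not measured figures from any platform or study.

```python
def screening_outcomes(uploads, prevalence, true_positive_rate, false_positive_rate):
    """Return (true_alarms, false_alarms, precision) for a detector run on every upload."""
    synthetic = uploads * prevalence
    authentic = uploads - synthetic
    true_alarms = synthetic * true_positive_rate
    false_alarms = authentic * false_positive_rate
    precision = true_alarms / (true_alarms + false_alarms)
    return true_alarms, false_alarms, precision


# All inputs are made-up values chosen only to show the shape of the problem.
true_alarms, false_alarms, precision = screening_outcomes(
    uploads=1_000_000_000,     # hypothetical daily upload volume
    prevalence=0.001,          # hypothetical share of uploads that are synthetic
    true_positive_rate=0.95,   # hypothetical detector sensitivity
    false_positive_rate=0.05,  # hypothetical rate of authentic items wrongly flagged
)
print(f"correct flags: {true_alarms:,.0f}")
print(f"false flags:   {false_alarms:,.0f}")
print(f"share of flags that are correct: {precision:.1%}")
```

Under these assumed inputs, fewer than 2 in 100 flagged items would actually be synthetic, and tens of millions of authentic uploads would be flagged each day. Different assumptions change the numbers but not the dynamic: when synthetic content is rare relative to total volume, even a small false-positive rate dominates the output.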
Detection retains value as a component of layered defense but cannot by itself solve the deepfake problem. The policy literature has increasingly concluded that downstream detection is the wrong architectural layer and that the solution, if one exists, must operate at the point of capture rather than the point of distribution.
THE PROVENANCE ALTERNATIVE: C2PA
The most promising structural response to deepfakes is content provenance: a cryptographically verifiable record, created at the point of capture, that a given piece of media was produced by a specific device at a specific time, together with what, if anything, has been done to it since. The Coalition for Content Provenance and Authenticity (C2PA) is an industry consortium including Adobe, Microsoft, Intel, BBC, and Sony that has published an open technical specification for content provenance metadata.
The C2PA approach inverts the detection problem. Rather than trying to identify whether content is fake (which is structurally hard), C2PA makes it possible to verify whether content is authentic (which is cryptographically tractable). A C2PA-enabled camera embeds a signed manifest with each image: when and where it was captured, what device captured it, what processing has been applied. Viewers can verify the chain and trust or distrust accordingly.
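The sign-then-verify logic can be sketched in a few lines. This is not the C2PA format or API: the actual specification uses public-key signatures backed by certificate chains and embeds manifests in a binary container, whereas the sketch below substitutes a shared-secret HMAC from the Python standard library purely so that it runs without third-party packages. The key, field names, and function names are all hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in for a device signing key. Real provenance systems use an asymmetric
# key pair so that anyone can verify without being able to sign.
DEVICE_KEY = b"hypothetical-device-secret"


def make_manifest(media_bytes, device_id, captured_at, actions):
    """Bind a hash of the media to capture metadata and sign the result."""
    claim = {
        "content_sha256": hashlib.sha256(media_bytes).hexdigest(),
        "device_id": device_id,
        "captured_at": captured_at,
        "actions": actions,
    }
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    signature = hmac.new(DEVICE_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify(media_bytes, manifest):
    """Check that the claim is unaltered and still matches the media bytes."""
    payload = json.dumps(manifest["claim"], sort_keys=True).encode("utf-8")
    expected = hmac.new(DEVICE_KEY, payload, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, manifest["signature"])
    content_hash = hashlib.sha256(media_bytes).hexdigest()
    content_ok = manifest["claim"]["content_sha256"] == content_hash
    return signature_ok and content_ok


photo = b"raw image bytes"
manifest = make_manifest(photo, "camera-001", "2024-03-01T12:00:00Z", ["captured"])
print(verify(photo, manifest))                # True: untouched media, intact claim
print(verify(photo + b" edited", manifest))   # False: media no longer matches the claim
```

The design point is that verification never asks whether the content looks fake. It asks only whether the bytes and the metadata still match what was signed, which is a question with a definite answer.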
The limitations are significant. C2PA requires adoption at the hardware level (cameras) and at the distribution level (platforms that preserve and display the provenance metadata). It does not address content captured with non-C2PA devices, which will be the vast majority of content for the foreseeable future. It also addresses only origin, not semantic truth: a C2PA-verified photograph of a misleading scene is still misleading, even if it is authentic in the strict sense.
Despite these limitations, C2PA is the most coherent response that has emerged. The European Union's AI Act includes provisions encouraging provenance-based approaches, and several major platforms began supporting C2PA metadata in early 2024. Whether adoption reaches critical mass is an open question.
SOCIAL MEDIA AMPLIFICATION
Deepfakes do not spread on their own. They spread through social media platforms whose recommendation algorithms optimize for engagement, and synthetic media is disproportionately engaging. Emotionally provocative content performs better in engagement metrics regardless of its authenticity, and malicious deepfakes are typically engineered to be emotionally provocative.
This creates a structural problem for platforms. Their business models depend on engagement. Moderating deepfakes reduces engagement. Platforms have adopted labeling and detection policies, but enforcement has been inconsistent and the economic incentives work against aggressive removal. The speed asymmetry is also critical: a deepfake can go viral in minutes, while detection, review, and removal take hours or days. By the time content is removed, the harm has already been done.
The recommendation algorithms that amplify deepfakes are themselves AI systems optimizing for engagement signals that do not distinguish authentic from synthetic. There is no adversarial intent in this dynamic; it is a structural consequence of algorithmic optimization over a metric that correlates with harm.
EROSION OF EPISTEMIC TRUST
The deepest consequence of widespread deepfake capability is not the harm from any specific fabrication. It is the cumulative erosion of what Rainie and Anderson and others have called "epistemic trust": the baseline assumption that certain institutions, evidence types, and sources are reliable enough to serve as a shared reference for collective decision-making.
Epistemic trust is what allows courts to treat video as evidence, journalists to cite footage as documentation, and democratic publics to agree on basic facts even when they disagree on values. Deepfakes erode this trust at every level. The failure mode is not that people believe specific lies; it is that they retreat into sources that confirm their priors and dismiss contrary evidence as potentially synthetic. The result is not a single false reality; it is the absence of any shared reality at all.
Rini has argued, in a widely cited philosophy paper, that deepfakes undermine the epistemic practice of trusting testimony from remote sources, which has historically been a substantial component of how knowledge is acquired at scale. If testimony becomes unreliable across the board, the cost of establishing shared knowledge about the world rises sharply.
WHAT HELPS AT THE INDIVIDUAL LEVEL
Individual-level responses to a structural problem are inadequate, but they are not worthless. The most defensible recommendations from the literature are:
Lateral reading. Rather than evaluating a claim on the content itself, verify it across multiple independent sources before sharing. This is the single most effective media literacy technique documented in the empirical literature. A deepfake deployed by a single source is unlikely to be corroborated by others; cross-referencing is cheap and works.
Emotional pause. Deepfakes exploit emotional responses. Content that produces strong immediate outrage or alignment with prior beliefs should trigger additional scrutiny rather than immediate sharing. The emotional response is the exploit vector.
Provenance awareness. When C2PA or similar provenance metadata is available, use it. When it is not, note the absence. Media from unknown sources without any chain of custody should be treated with more skepticism than media from identified journalists or institutions.
Calibrated uncertainty. Accept that absolute certainty about digital media is no longer possible and operate with probabilistic rather than binary judgments. This is uncomfortable but it is the honest epistemic stance given the current technology.
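One way to make "probabilistic rather than binary" concrete is the odds form of Bayes' rule, in which each piece of evidence multiplies the odds that a clip is authentic. The sketch below is only an illustration of the reasoning pattern: the function name is hypothetical, and the likelihood ratios are invented numbers, not empirically estimated values.

```python
def update_probability(prior, likelihood_ratio):
    """Update P(authentic) using the odds form of Bayes' rule."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical evidence about a clip. Ratios above 1 favor "authentic" and
# ratios below 1 favor "synthetic"; the values are illustrative only.
evidence = {
    "corroborated by an independent outlet": 4.0,
    "no provenance metadata attached": 0.5,
    "first posted by an anonymous account": 0.5,
}

belief = 0.5  # start undecided
for description, ratio in evidence.items():
    belief = update_probability(belief, ratio)
    print(f"{description}: P(authentic) = {belief:.2f}")
```

With these made-up ratios the estimate rises on corroboration and falls back as the unfavorable signals accumulate, ending undecided. The specific values matter less than the habit of letting each signal adjust a degree of belief without forcing a verdict.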
CONCLUSION
Deepfakes have moved from a research curiosity to an operational capability across multiple domains of harm in less than a decade. The technical trajectory is toward higher quality and lower cost. Detection cannot keep up by design. The liar's dividend operates whether or not any specific deepfake is produced. The most promising structural response is content provenance, which requires broad institutional adoption that has not yet occurred.
The honest summary is that the epistemic cost of widespread synthetic media is already being paid and will continue to grow. The appropriate response combines individual media literacy, institutional provenance infrastructure, and regulatory frameworks that impose costs on malicious use. None of these alone is sufficient; all are necessary.