INSIGHTS | September 17, 2025

Deepfake Defense: From No-Cost Basics to Enterprise-Grade Controls

At CanSecWest 2025 I walked through a red team engagement where we used AI voice cloning to test an organization’s people and processes. The short version is this: a familiar voice is not identity. Treat voice as untrusted input and move verification into systems you control.

The financial exposure is no longer hypothetical. Deloitte estimates fraud losses in the United States could reach 40 billion dollars by 2027 as generative AI accelerates vishing and synthetic-media fraud.

Recent incidents back this up, including the 25 million dollar Hong Kong video-call heist tied to deepfakes of company leaders, and the Ferrari CEO impersonation attempt that an employee stopped by asking a question only the real executive could answer. Outside the enterprise, deepfake investment ads continue to run across Meta platforms, eroding trust in the familiar faces and voices your employees see every day. In government, an AI voice impersonating the U.S. Secretary of State contacted foreign ministers and U.S. officials in June 2025. Together, these incidents increase the chance that urgency and authority will be misread as authenticity.

What changed

Cloning a usable voice takes seconds of public audio and commodity tools. On the receiving end, most victims are reached by phone or collaboration apps, not polished video. The attacker leans on urgency, hierarchy, and insider context, then pushes to finish before the target can shift to a verified process. The fix is a workflow that forces that shift.

Caller ID helps less than many expect. STIR/SHAKEN can authenticate numbers on IP networks, but it is not proof of who is speaking, and non-IP gaps are still being addressed by the FCC. Treat attestation as a helpful signal, not a decision.
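
If your tooling surfaces attestation, the “signal, not a decision” rule can be written down directly. Below is a minimal sketch in Python: the attestation levels (A, B, C) are standard, but the numeric weights and the call_risk_hint helper are assumptions invented for illustration, not part of any carrier API.

```python
# Illustrative only: map STIR/SHAKEN attestation levels to an advisory
# risk hint. The weights are invented tuning values for this sketch.

ATTESTATION_HINT = {
    "A": 0.1,  # full attestation: carrier vouches for the customer and the number
    "B": 0.5,  # partial: carrier knows the customer, not the number
    "C": 0.9,  # gateway: carrier only knows where the call entered its network
}

def call_risk_hint(attestation: str | None) -> float:
    """Return an advisory risk hint in [0, 1]. None means no attestation
    was exposed (for example, a non-IP leg): unknown, not automatically bad."""
    if attestation is None:
        return 0.7  # lean cautious on unknowns, but this is still only a hint
    return ATTESTATION_HINT.get(attestation.upper(), 0.9)

# Whatever the hint says, identity is never decided here: sensitive requests
# still go through directory callback and a logged approval flow.
```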

The most targeted roles are accounts payable, vendor management, helpdesk, executive assistants, and HR. If you only harden one slice of the company, start there. Make it clear those teams can slow or refuse requests that arrive by phone or ad-hoc video without any penalty.

How the scam runs in practice

Attackers harvest voice and context from interviews, talks, and social videos. They start a call or meeting that looks routine, often with a look-alike number, and ask for a payment, a vendor add, or a credential reset.

If questioned, they add pressure and urgency, for example claiming a change must be made immediately so they can get into a meeting. The goal is to keep you in the real-time channel, where social pressure works and logging is weak. Your goal is to move the request into a tracked system where identity is verified and more than one person approves.

Meeting platforms are part of the story too. Treat unscheduled or “special” meetings the same way as calls. Use lobbies and named invites. If a payment, an entitlement change, or a vendor add is requested in a meeting, move it into the approval system and end the call.

Controls by maturity

Entry level

  • Ban voice-only approvals. No payments, access changes, or identity changes are executed from calls or voicemails. Put the request into a ticket or approval flow first.
  • Directory callback only. If “the CFO” calls, hang up and call back using the corporate directory contact, not the inbound number.
  • One-time challenge in a second channel. Send a short code in Slack or Teams and have the caller read it back; a minimal sketch follows this list. Avoid static passphrases that can leak.
  • Plain-language script. “Per policy I am moving this into our approval workflow and calling you back on the directory number. I will send you a code in Slack in 10 seconds.”
  • Show what fakes sound like. With consent, let staff hear an internal clone so they learn what “good enough” fakery sounds like. Real-time voice clones and text-to-speech clones differ in quality, and each leaves tells that listeners can learn to pick up.
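
The second-channel challenge is simple enough to automate. Here is a minimal sketch, assuming a Slack workspace, the slack_sdk package, and a bot token permitted to message users; SLACK_BOT_TOKEN and the user ID are placeholders.

```python
import os
import secrets

from slack_sdk import WebClient

def send_challenge(user_id: str) -> str:
    """DM a fresh one-time code to the employee the caller claims to be.
    The caller must read it back before the request proceeds."""
    code = f"{secrets.randbelow(10**6):06d}"  # new 6-digit code per call
    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
    client.chat_postMessage(
        channel=user_id,  # passing a Slack user ID sends a direct message
        text=f"One-time code for the request you just made by phone: {code}",
    )
    return code

# Usage: expected = send_challenge("U012ABCDEF")
# If the caller cannot read the code back, end the call and fall back to
# the directory callback, per policy.
```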

Bank instruction rule. Publish a standing rule that your bank instructions never change by phone or email. Put that statement on invoices and vendor onboarding materials so finance has a clear policy to point to.

Mid-maturity

  • Cross-channel confirmation by default. Every sensitive request is confirmed in a second, logged channel and attached to a ticket or approval artifact.
  • Approvals inside Slack or Teams. Use Microsoft Teams Approvals or Power Automate for sequential and multi-stage approvals, or Slack Workflow Builder for routed approvals with an audit trail; see the sketch after this list.
  • Surface STIR/SHAKEN as a signal. If your carrier exposes attestation, show it in softphones and train staff that it is advisory, not identity.
  • Targeted training. Focus on AP, vendor management, helpdesk, executive assistants, and HR.
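
For teams that want the approval prompt itself to live in Slack, here is a minimal sketch using slack_sdk and Block Kit buttons. The channel, ticket ID, and action IDs are placeholders, and a real deployment would also need an interactivity endpoint to record who clicked what; that endpoint is where the audit trail gets written.

```python
import os

from slack_sdk import WebClient

def post_approval_request(channel: str, ticket_id: str, summary: str) -> None:
    """Post an approve/reject prompt tied to a ticket, so the decision
    happens in a logged channel instead of on a phone call."""
    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
    client.chat_postMessage(
        channel=channel,
        text=f"Approval needed for {ticket_id}: {summary}",  # fallback text
        blocks=[
            {"type": "section",
             "text": {"type": "mrkdwn", "text": f"*{ticket_id}*: {summary}"}},
            {"type": "actions", "elements": [
                {"type": "button", "action_id": "approve_request",
                 "text": {"type": "plain_text", "text": "Approve"},
                 "style": "primary", "value": ticket_id},
                {"type": "button", "action_id": "reject_request",
                 "text": {"type": "plain_text", "text": "Reject"},
                 "style": "danger", "value": ticket_id},
            ]},
        ],
    )
```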

Lock down account recovery. Do not allow phone-based resets or knowledge questions for admins, finance approvers, or identity administrators. Require device-bound authenticators or passkeys and route all recovery into a ticket with second-party approval.
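
The same rule can be expressed as policy-as-code. A minimal sketch, assuming a directory lookup that returns a user’s roles; the role and method names are illustrative, not tied to any specific identity provider.

```python
# Illustrative recovery policy. Role and method names are assumptions
# for this sketch, not from a specific IdP.
PRIVILEGED_ROLES = {"admin", "finance_approver", "identity_admin"}
STRONG_METHODS = {"passkey", "device_bound_authenticator"}

def recovery_allowed(roles: set[str], method: str) -> bool:
    """Privileged accounts recover only via strong, device-bound methods,
    and the reset itself still routes to a ticket with a second approver."""
    if roles & PRIVILEGED_ROLES:
        return method in STRONG_METHODS  # never SMS, voice, or knowledge questions
    return method != "knowledge_questions"  # weak for everyone

# Usage: recovery_allowed({"finance_approver"}, "sms_code") -> False
```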

Enterprise-grade

  • Multi-party, cross-function approvals. Require at least two approvers from different functions for wires, new vendors, privileged access, and identity changes. Build this into Teams or Slack plus your ticketing or ERP.
  • Timed holds for high-risk actions. Add a 2 to 24 hour hold for first-time payees, large payments, and new vendors. Require re-affirmation during the hold; a minimal sketch follows this list.
  • Telco posture with specifics. Ask providers about STIR/SHAKEN coverage and expose attestation in your tooling. Track FCC work to close non-IP gaps and treat caller authentication as one input among many.
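
The hold logic is small enough to sketch. The durations and flag names below are illustrative tuning choices; the point is that release time comes from policy, not from how urgent the caller sounds.

```python
from datetime import datetime, timedelta, timezone

# Illustrative hold durations per risk flag; tune to your risk appetite.
HOLD_HOURS = {"first_time_payee": 24, "large_payment": 4, "new_vendor": 24}

def earliest_release(created: datetime, risk_flags: list[str]) -> datetime:
    """A flagged action releases only after its longest applicable hold,
    and only if it was re-affirmed in the approval flow during the hold."""
    hours = max((HOLD_HOURS.get(flag, 0) for flag in risk_flags), default=0)
    return created + timedelta(hours=hours)

# Usage: a wire to a first-time payee created now cannot release before
# now + 24 hours, regardless of pressure on the call.
now = datetime.now(timezone.utc)
print(earliest_release(now, ["first_time_payee", "large_payment"]))
```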

Executive exposure management. Where practical, limit long, clean public voice samples, and trim long monologues into shorter clips.

Incident response if you suspect a deepfake call

End the live channel and move to policy. Open a ticket, freeze any pending payment or access change, and notify finance leadership and security. If the window for a bank recall exists, contact your bank immediately. Save call metadata and collaboration logs. If recording calls is part of your workflow, make sure you follow local consent laws.

Do we need a deepfake detector?

Use detectors as a signal, not a gate. Accuracy varies in the wild, especially with compressed audio and human-in-the-loop attackers. Directory callbacks, second-channel challenges, multi-party approvals, and timed holds are what stop losses when detection misses.
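
“Signal, not a gate” reduces to one rule: a high detector score can escalate, but a low score never auto-approves. A minimal sketch, with a threshold and labels that are assumptions for illustration:

```python
# Illustrative triage. The 0.5 threshold and outcome labels are
# assumptions for this sketch, not from any particular detector.
def triage(detector_score: float, request_is_sensitive: bool) -> str:
    """detector_score: the model's estimated probability the audio is
    synthetic. A low score must never bypass the verification workflow."""
    if detector_score >= 0.5:
        return "escalate_to_security"   # suspicious audio: involve security now
    if request_is_sensitive:
        return "verification_workflow"  # clean-sounding audio still gets
                                        # callback, second channel, approvals
    return "normal_handling"
```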

Conclusion

Deepfakes are a forcing function for better workflow. Start by removing voice-only approvals and requiring directory callbacks and second-channel challenges. Add routed approvals inside Slack or Teams. For high-risk actions, require multiple approvers and a hold. If a perfect clone calls, these controls still give your team time to slow down and verify before money moves.

About the Author

Dave Falkenstein, Red Team Tech Lead, IOActive. Based on the CanSecWest 2025 talk “Deepfake Deception: Weaponizing AI-Generated Voice Clones in Social Engineering Attacks.”

Additional Resource

The Wall Street Journal | IOActive Senior Security Consultant David Falkenstein and The Wall Street Journal recently collaborated on a quiz that tests whether “your ears [can] distinguish a human voice from an AI deepfake…”

David used AI to clone the voices of Wall Street Journal colleagues from publicly available social media clips, using OpenAudio to run the experiment.

“Can your ears distinguish a human voice from an AI deepfake? Knowing the difference could save you from a phone scam that costs you thousands of dollars. …

… To test this, we enlisted David Falkenstein of corporate security firm IOActive to clone a few Wall Street Journal colleagues. He pulled down bits of our publicly available social-media and podcast audio, clips just 10 to 30 seconds in length. He used OpenAudio—easily accessible software that can run on a laptop—to make our voices say some pretty crazy things.”