
Audio-Deepfake-Detection-for-Real-Conversations

System Overview

What the project does: Detects AI‑generated (deepfake) speech in real‑world conversations using a lightweight end‑to‑end neural model (RawNet2) trained on the ASVspoof 2019 dataset.

Key features

– Raw‑waveform input (no spectrogram preprocessing)

– Sinc‑based filter layer, residual CNN blocks, and GRU for temporal modeling

– CPU‑compatible training/inference, quick proof‑of‑concept pipeline

– Modular code ready for ONNX/TorchScript conversion and API deployment
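The feature list above can be sketched as a minimal RawNet2-style model. The class below (`RawNetLite` is a hypothetical name, not the repository's actual code) shows the three pieces in simplified form: a sinc-based band-pass filter layer over the raw waveform, residual CNN blocks, and a GRU for temporal modeling:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SincConv(nn.Module):
    """Simplified sinc-based conv: learnable band-pass cutoffs in Hz."""
    def __init__(self, out_channels=20, kernel_size=251, sample_rate=16000):
        super().__init__()
        self.kernel_size = kernel_size
        self.sample_rate = sample_rate
        # Initialize band edges spread between 30 Hz and Nyquist
        self.low_hz = nn.Parameter(torch.linspace(30, sample_rate / 2 - 200, out_channels))
        self.band_hz = nn.Parameter(torch.full((out_channels,), 100.0))
        n = torch.arange(-(kernel_size // 2), kernel_size // 2 + 1).float()
        self.register_buffer("n", n / sample_rate)
        self.register_buffer("window", torch.hamming_window(kernel_size))

    def forward(self, x):                       # x: (B, 1, T)
        low = torch.abs(self.low_hz)
        high = torch.clamp(low + torch.abs(self.band_hz), max=self.sample_rate / 2)
        # Band-pass kernel = difference of two sinc low-pass kernels
        def lp(freq):
            f = freq.unsqueeze(1)
            return 2 * f * torch.sinc(2 * f * self.n)
        filters = (lp(high) - lp(low)) * self.window
        filters = filters / filters.abs().max(dim=1, keepdim=True).values
        return F.conv1d(x, filters.unsqueeze(1), padding=self.kernel_size // 2)

class ResBlock(nn.Module):
    """Residual CNN block with down-sampling via max-pool."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm1d(ch), nn.LeakyReLU(0.3),
            nn.Conv1d(ch, ch, 3, padding=1),
            nn.BatchNorm1d(ch), nn.LeakyReLU(0.3),
            nn.Conv1d(ch, ch, 3, padding=1))
        self.pool = nn.MaxPool1d(3)

    def forward(self, x):
        return self.pool(x + self.body(x))

class RawNetLite(nn.Module):
    """Raw waveform in, bona fide / spoof logits out."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.sinc = SincConv(out_channels=20)
        self.blocks = nn.Sequential(*[ResBlock(20) for _ in range(3)])
        self.gru = nn.GRU(20, 64, batch_first=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):                       # x: (B, T) raw waveform
        h = self.sinc(x.unsqueeze(1))
        h = F.max_pool1d(torch.abs(h), 3)
        h = self.blocks(h)
        _, hn = self.gru(h.transpose(1, 2))     # (1, B, 64) final hidden state
        return self.fc(hn[-1])                  # (B, n_classes) logits
```

Because the model consumes raw samples directly, no spectrogram front end is needed, and the small channel counts keep it practical on CPU.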

Tech stack

– Python 3.6+

– PyTorch (≥ 1.10)

– Librosa, NumPy, YAML, TensorBoardX

– Windows environment (CPU only)

Use case: Real‑time or near‑real‑time verification of spoken audio to flag synthetic speech in applications such as call‑center monitoring, voice‑assistant security, and forensic analysis of recorded conversations.
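A near-real-time check like the one described could be sketched as below. The function name `score_waveform`, the 4-second window, and the `DummyDetector` stand-in are all illustrative assumptions, not the project's actual API; in the real pipeline the waveform would come from `librosa.load(path, sr=16000)` and the model would be a trained checkpoint:

```python
import numpy as np
import torch
import torch.nn.functional as F

SAMPLE_RATE = 16000
CHUNK = 4 * SAMPLE_RATE           # fixed 4 s analysis window (assumed)

class DummyDetector(torch.nn.Module):
    """Stand-in for the trained detector; any (B, T) -> (B, 2) module works."""
    def forward(self, x):
        return torch.zeros(x.shape[0], 2)

def score_waveform(model, wav, threshold=0.5):
    """Average the per-chunk spoof probability over a mono float waveform.

    Class order [bona fide, spoof] is assumed.
    Returns (spoof_probability, flagged_as_synthetic).
    """
    model.eval()
    probs = []
    with torch.no_grad():
        for start in range(0, len(wav), CHUNK):
            piece = wav[start:start + CHUNK]
            if len(piece) < CHUNK:                      # zero-pad final chunk
                piece = np.pad(piece, (0, CHUNK - len(piece)))
            x = torch.from_numpy(piece).float().unsqueeze(0)   # (1, T)
            logits = model(x)                           # (1, 2)
            probs.append(F.softmax(logits, dim=1)[0, 1].item())
    spoof_prob = float(np.mean(probs))
    return spoof_prob, spoof_prob >= threshold
```

Chunking keeps latency bounded, which is what makes near-real-time flagging (e.g. in call-center monitoring) feasible on CPU.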

Architecture Details

The system is organized into two layers, reflecting the project's focus:

Backend Infrastructure

Loads raw audio, batches it for the model, and surfaces the scores; the modular code targets ONNX/TorchScript export and API deployment.
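Since the code is described as ready for TorchScript conversion, the export step could look like this minimal sketch (using a stand-in module, not the actual detector class):

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for the trained detector; any (B, T) -> (B, 2) module traces the same way.
model = nn.Sequential(nn.Linear(64000, 2))
model.eval()

example = torch.randn(1, 64000)            # one 4 s chunk at 16 kHz
scripted = torch.jit.trace(model, example)

path = os.path.join(tempfile.gettempdir(), "detector_ts.pt")
scripted.save(path)

# The TorchScript artifact reloads without the original Python class
# definitions, which is what makes it convenient to serve behind an API.
restored = torch.jit.load(path)
print(restored(example).shape)             # torch.Size([1, 2])
```

`torch.onnx.export` follows the same pattern for the ONNX path mentioned above.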

AI / Logic Core

The RawNet2 classifier, which scores each utterance as bona fide or spoofed.


Key Capabilities

  • Raw-waveform spoof classification (bona fide vs. deepfake)
  • CPU-only training and inference pipeline
  • Extensible, modular architecture ready for export and API deployment