Python
Azure-Document-Intelligence
Enterprise AI system to classify, split, and auto-route PDFs using Azure Document Intelligence and SharePoint.
System Overview
What the project does – An end‑to‑end AI pipeline that ingests a multi‑page PDF, classifies each page using a custom Azure Document Intelligence model, automatically splits the PDF into separate documents, and uploads each file to its appropriate SharePoint folder.
Key features – AI‑powered page‑level classification, multi‑document PDF splitting, no manual sorting, Microsoft Graph integration for SharePoint uploads, Azure AD OAuth authentication, and a modular Python‑based processing pipeline.
Tech stack – Azure Document Intelligence (custom classifier), Python 3.10+, PyMuPDF, Azure AD/MSAL, Microsoft Graph API, SharePoint Online, REST APIs.
Use case – Automates back‑office document routing for KYC, identity verification, loan onboarding, insurance/banking compliance, and any workflow requiring secure, automated segregation and storage of mixed‑type PDFs.
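The classify-then-split step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the classifier has already produced one label per page, and the names `PageRange`, `group_pages`, and `split_pdf` are hypothetical. Consecutive pages with the same label are treated as one document, which is a simplifying assumption (two back-to-back documents of the same type would be merged).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class PageRange:
    """A run of consecutive pages that share one classifier label (hypothetical type)."""
    label: str
    first_page: int  # 0-based, inclusive
    last_page: int   # 0-based, inclusive


def group_pages(labels: Sequence[str]) -> List[PageRange]:
    """Merge consecutive pages with the same label into a single range."""
    ranges: List[PageRange] = []
    for page_index, label in enumerate(labels):
        if ranges and ranges[-1].label == label:
            # Same label as the previous page: extend the current range.
            prev = ranges[-1]
            ranges[-1] = PageRange(label, prev.first_page, page_index)
        else:
            # Label changed (or first page): start a new range.
            ranges.append(PageRange(label, page_index, page_index))
    return ranges


def split_pdf(source_path: str, ranges: Sequence[PageRange], out_dir: str) -> List[str]:
    """Write each range to its own PDF with PyMuPDF and return the output paths."""
    import os
    import fitz  # PyMuPDF; imported lazily so group_pages works without it

    written: List[str] = []
    with fitz.open(source_path) as src:
        for n, page_range in enumerate(ranges, start=1):
            out = fitz.open()  # new, empty PDF
            out.insert_pdf(src, from_page=page_range.first_page, to_page=page_range.last_page)
            path = os.path.join(out_dir, f"{n:02d}_{page_range.label}.pdf")
            out.save(path)
            out.close()
            written.append(path)
    return written
```

For example, `group_pages(["passport", "passport", "bank_statement"])` yields two ranges, pages 0 to 1 and page 2, which `split_pdf` would then write out as two files.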
Architecture Details
The system chains several components into a single automated flow. The breakdown below is inferred from the project's focus:
Backend Infrastructure
Core execution layer for robust data processing and API handling.
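As one possible shape for the API handling in this layer, the sketch below acquires a token with MSAL and uploads a file to a SharePoint document library through Microsoft Graph. Several things here are assumptions rather than details from the project: the client-credentials flow, the use of `requests`, the path-based simple upload endpoint, and the helper names `build_upload_url`, `acquire_token`, and `upload_pdf`.

```python
from urllib.parse import quote

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def build_upload_url(drive_id: str, folder: str, filename: str) -> str:
    """Build a path-addressed Graph URL for a simple (single PUT) upload."""
    path = "/".join(part for part in (folder.strip("/"), filename) if part)
    # quote() keeps "/" as-is and percent-encodes spaces and other unsafe characters.
    return f"{GRAPH_ROOT}/drives/{drive_id}/root:/{quote(path)}:/content"


def acquire_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Get an app-only Graph token (assumes a client-credentials app registration)."""
    import msal  # imported lazily so the URL helper has no third-party dependency

    app = msal.ConfidentialClientApplication(
        client_id,
        authority=f"https://login.microsoftonline.com/{tenant_id}",
        client_credential=client_secret,
    )
    result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token request failed"))
    return result["access_token"]


def upload_pdf(token: str, drive_id: str, folder: str, local_path: str) -> dict:
    """PUT one local PDF into the given folder and return the created item as a dict."""
    import os
    import requests

    url = build_upload_url(drive_id, folder, os.path.basename(local_path))
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/pdf"}
    with open(local_path, "rb") as fh:
        resp = requests.put(url, headers=headers, data=fh, timeout=60)
    resp.raise_for_status()
    return resp.json()
```

The single PUT is only suitable for smaller files; Graph provides upload sessions for larger ones, which this sketch does not cover.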
AI / Logic Core
Intelligent decisioning via models or logical workflow rules.
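A workflow rule of this kind might look like the sketch below, which maps a classifier label to a SharePoint folder and falls back to a review folder when the model is unsure. The confidence threshold, the "Manual Review" folder, the example labels, and the function name `choose_folder` are all illustrative assumptions, not details taken from the project.

```python
from typing import Mapping

REVIEW_FOLDER = "Manual Review"  # hypothetical fallback folder


def choose_folder(
    label: str,
    confidence: float,
    folder_map: Mapping[str, str],
    min_confidence: float = 0.8,  # assumed threshold, tune per model
    review_folder: str = REVIEW_FOLDER,
) -> str:
    """Pick the destination folder for one split document."""
    if confidence < min_confidence:
        # Low-confidence predictions go to a person instead of being auto-filed.
        return review_folder
    # Labels without a configured folder are also sent to review.
    return folder_map.get(label, review_folder)


# Example mapping with made-up labels and folder paths.
FOLDERS = {
    "passport": "KYC/Passports",
    "bank_statement": "KYC/Bank Statements",
}
```

With this mapping, `choose_folder("passport", 0.95, FOLDERS)` selects `KYC/Passports`, while a prediction at 0.40 confidence or an unmapped label lands in the review folder.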
Tech Stack
Python, Integration, Automation, APIs
Key Capabilities
- Custom workflow execution
- Data transformation and routing
- Extensible architecture