System Overview

What the project does

An interactive Q&A system that lets users upload PDF, DOCX, or TXT documents and ask natural-language questions; the app extracts the text, creates semantic embeddings, stores them in a vector database, and generates answers using a locally run Mistral LLM via Ollama.

Key features

- Multi-format document upload and text extraction (PDF, Word, plain text)
- Semantic embeddings with Sentence-Transformers (`all-mpnet-base-v2`)
- Vector storage & similarity search using ChromaDB
- AI-generated answers via Mistral LLM (Ollama)
- FastAPI backend with `/query` endpoint and health check
- Streamlit web UI for easy interaction
- Modular utilities for extraction, embedding, and querying

Tech stack

- Frontend: Streamlit
- Backend: FastAPI (Uvicorn)
- LLM: Mistral (hosted locally through Ollama)
- Vector DB: ChromaDB
- Embeddings: Sentence-Transformers (`all-mpnet-base-v2`)
- Document parsing: PyMuPDF, python-docx
- Core libraries: LangChain, HuggingFace, Requests

Use case

Enables businesses, researchers, or anyone with unstructured documents to quickly build a searchable knowledge base and chatbot that can answer domain-specific questions without relying on external APIs.
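The extract → embed → store → retrieve flow described above can be sketched end to end in a few lines. This is a toy illustration, not the project's actual code: the bag-of-words `embed` stands in for Sentence-Transformers dense vectors, and the `VectorStore` class stands in for ChromaDB; all names here are illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real project uses
    # Sentence-Transformers (all-mpnet-base-v2) dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity, the same measure ChromaDB can use for ranking.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Stand-in for ChromaDB: stores (embedding, chunk) pairs."""
    def __init__(self):
        self._items = []

    def add(self, chunk: str):
        self._items.append((embed(chunk), chunk))

    def query(self, question: str, k: int = 2):
        q = embed(question)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

store = VectorStore()
for chunk in ["ChromaDB stores vectors.", "Mistral generates answers.", "Streamlit renders the UI."]:
    store.add(chunk)

context = store.query("Which component stores vectors?", k=1)
# In the real system, the retrieved context is then passed to Mistral via Ollama.
```

In the actual stack the retrieved chunks become the grounding context for the LLM; only the embedding model and storage backend differ from this sketch.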
Architecture Details
The system is organized into two layers that together form the question-answering flow:

Backend Infrastructure

FastAPI (served by Uvicorn) handles document uploads and the `/query` endpoint, routing extracted text into ChromaDB and incoming questions to the retrieval layer.

AI / Logic Core

Sentence-Transformers produces the embeddings, ChromaDB returns the most similar chunks for a question, and the locally hosted Mistral model (via Ollama) generates the final answer from that retrieved context.
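Concretely, the AI / Logic Core is retrieval-augmented generation: retrieved chunks are stitched into a prompt and sent to the locally hosted Mistral model through Ollama's HTTP API. A minimal sketch, assuming a default Ollama server on port 11434 with the `mistral` model pulled; the prompt template and function names are illustrative, not the project's actual code:

```python
import json
import urllib.request

def build_prompt(question: str, chunks: list) -> str:
    # Ground the model in the retrieved context so answers stay
    # domain-specific rather than free-form.
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def ask_mistral(question: str, chunks: list, host: str = "http://localhost:11434") -> str:
    # Ollama's non-streaming generate endpoint; requires a running
    # Ollama server (`ollama pull mistral` beforehand).
    payload = json.dumps({
        "model": "mistral",
        "prompt": build_prompt(question, chunks),
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the model behind a local HTTP endpoint is what lets the whole pipeline run without external APIs, as the use case above emphasizes.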
Key Capabilities
- ▹ End-to-end query workflow: upload, extract, embed, retrieve, answer
- ▹ Per-format text transformation and routing (PyMuPDF for PDF, python-docx for Word) into a shared embedding pipeline
- ▹ Extensible architecture: modular extraction, embedding, and querying utilities that can be swapped independently
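The transformation-and-routing capability amounts to dispatching each upload to the right extractor by file extension. A hedged sketch: the registry pattern and function names are illustrative, not the project's actual utilities, and the PDF and DOCX branches assume PyMuPDF (`fitz`) and python-docx are installed.

```python
from pathlib import Path

EXTRACTORS = {}

def extractor(*exts):
    # Registry decorator: a new format plugs in without touching
    # callers, which is what keeps the architecture extensible.
    def register(fn):
        for ext in exts:
            EXTRACTORS[ext] = fn
        return fn
    return register

@extractor(".txt")
def extract_txt(path: Path) -> str:
    return path.read_text(encoding="utf-8")

@extractor(".pdf")
def extract_pdf(path: Path) -> str:
    import fitz  # PyMuPDF; assumed installed
    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)

@extractor(".docx")
def extract_docx(path: Path) -> str:
    import docx  # python-docx; assumed installed
    return "\n".join(p.text for p in docx.Document(str(path)).paragraphs)

def extract(path: str) -> str:
    # Route by extension; unknown types fail loudly before ingestion.
    p = Path(path)
    try:
        return EXTRACTORS[p.suffix.lower()](p)
    except KeyError:
        raise ValueError(f"Unsupported file type: {p.suffix}") from None
```

Whatever the extractor, the output is plain text, so the downstream embedding and storage steps never need to know which format the document arrived in.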