A modular Python project combining:

- LangGraph Agent Flow: Scraper → Summarizer → Tagger → Connector → Publisher (sketched below)
- Ollama for Local Summarization: uses the `/api/generate` endpoint with streamed response handling
- Streamlit UI for End-User Display: browse markdown-formatted insights with tags and source URLs
- Markdown + JSON File Storage: insights saved as `.md` + `.meta.json` pairs
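To make the agent flow concrete, here is a minimal sketch of how `core/orchestrator.py` might wire the five agents together with LangGraph. The state schema, node names, and placeholder bodies are illustrative assumptions; the real agents live in `backend/agents/`.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END


class InsightState(TypedDict, total=False):
    url: str            # input URL
    raw_text: str       # scraper output
    summary: str        # summarizer output
    tags: list[str]     # tagger output
    related: list[str]  # connector output


# Placeholder node bodies; each agent returns a partial state update.
def scraper(state: InsightState) -> dict:
    return {"raw_text": f"(scraped content of {state['url']})"}


def summarizer(state: InsightState) -> dict:
    return {"summary": state["raw_text"][:200]}


def tagger(state: InsightState) -> dict:
    return {"tags": ["example"]}


def connector(state: InsightState) -> dict:
    return {"related": []}


def publisher(state: InsightState) -> dict:
    return {}  # the real publisher writes the .md + .meta.json pair


graph = StateGraph(InsightState)
for name, fn in [("scraper", scraper), ("summarizer", summarizer),
                 ("tagger", tagger), ("connector", connector),
                 ("publisher", publisher)]:
    graph.add_node(name, fn)

# Linear pipeline: scraper → summarizer → tagger → connector → publisher
graph.set_entry_point("scraper")
graph.add_edge("scraper", "summarizer")
graph.add_edge("summarizer", "tagger")
graph.add_edge("tagger", "connector")
graph.add_edge("connector", "publisher")
graph.add_edge("publisher", END)

pipeline = graph.compile()
result = pipeline.invoke({"url": "https://example.com"})
```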
```
backend/
├── app.py                 (FastAPI routes)
├── core/orchestrator.py   (LangGraph flow)
├── agents/                (scraper_agent, summarizer_agent, etc.)
├── protocols/             (MCP, A2A, ACP structures)
├── utils/
streamlit_ui/
├── app.py                 (Streamlit frontend)
docker-compose.yml
README.md
```
```
git clone https://github.com/yourusername/insight-platform.git
cd insight-platform
```

Start only the Ollama service first:

```
docker compose up ollama
```

Then pull the model:

```
curl -X POST http://localhost:11434/api/pull \
  -H "Content-Type: application/json" \
  -d '{"name": "llama3"}'
```

Then build and start the full stack:

```
docker compose up --build
```

- Backend available at: http://localhost:8000
- Streamlit UI at: http://localhost:8501
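For reference, the summarizer's streamed-response handling against Ollama's `/api/generate` endpoint can be sketched as follows. This is a minimal example using `requests`; the prompt text is a placeholder.

```python
import json
import requests

# Stream a completion from the local Ollama server; /api/generate
# returns one JSON object per line until "done" is true.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Summarize: ...", "stream": True},
    stream=True,
    timeout=120,
)
resp.raise_for_status()

summary_parts = []
for line in resp.iter_lines():
    if not line:
        continue
    chunk = json.loads(line)
    summary_parts.append(chunk.get("response", ""))
    if chunk.get("done"):
        break

summary = "".join(summary_parts)
print(summary)
```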
- Send a URL to `/run_pipeline` (example request below).
- The pipeline scrapes → summarizes → tags → connects → publishes.
- Visit the Streamlit UI or call the `/list_insights` API to view insights.
- Insights include: title, tags, content preview, and the original source URL.
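A quick way to exercise both endpoints from Python (the request payload shape is an assumption; check the route signatures in `backend/app.py`):

```python
import requests

BASE = "http://localhost:8000"

# Kick off the pipeline for one URL (the "url" field name is assumed).
run = requests.post(f"{BASE}/run_pipeline",
                    json={"url": "https://example.com/article"})
run.raise_for_status()

# List stored insights (title, tags, preview, source URL).
insights = requests.get(f"{BASE}/list_insights").json()
print(insights)
```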
- No CI/CD or Kubernetes deployment yet.
- Models must be pulled manually the first time.
- Local file storage only; no database integration yet.
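Since storage is local `.md` + `.meta.json` pairs, a published insight on disk can be sketched roughly like this. The directory, file naming, and metadata fields are illustrative assumptions, not the project's actual schema.

```python
import json
from pathlib import Path

INSIGHTS_DIR = Path("data/insights")  # assumed location


def publish(slug: str, markdown: str, meta: dict) -> None:
    """Write an insight as a .md + .meta.json pair."""
    INSIGHTS_DIR.mkdir(parents=True, exist_ok=True)
    (INSIGHTS_DIR / f"{slug}.md").write_text(markdown, encoding="utf-8")
    (INSIGHTS_DIR / f"{slug}.meta.json").write_text(
        json.dumps(meta, indent=2), encoding="utf-8"
    )


publish(
    "example-article",
    "# Example Article\n\nSummary text...",
    {"title": "Example Article", "tags": ["example"],
     "source_url": "https://example.com/article"},
)
```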
- CI/CD pipelines (GitHub Actions, Docker Registry)
- Full Kubernetes Helm Charts
- OAuth2 login on Streamlit UI
- Ollama model pre-pull automation
- Fork → Clone → Submit PR
- Focus on modular, readable, community-friendly code.