Sustainability Analyzer
AI-driven ESG report analysis — topic discovery, qualitative assessment, automated cross-company comparison.

Overview
An analysis pipeline that parses corporate ESG report PDFs and performs systematic, AI-assisted qualitative assessments. The Docling layout parser preserves table and text structure through hierarchical chunking. BGE-M3 embeddings (dense + sparse) enable hybrid semantic search. Claude auto-discovers 41 ESG topics and generates 415 checklist items for qualitative evaluation. Embeddings run locally, and LLM calls use free API tiers. Results are stored in structured form in PostgreSQL with pgvector.
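To make the hybrid search idea concrete, here is a minimal sketch of how a dense similarity and a sparse (lexical) match could be fused into one ranking score. It is an illustration only, not the project's actual retrieval code: the function names, the `alpha` weight, and the in-memory chunk dictionaries are all assumptions, and the real pipeline stores vectors in pgvector rather than in Python lists.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def lexical_score(query_weights, doc_weights):
    """Sparse match: sum of weight products over tokens shared by query and chunk."""
    return sum(w * doc_weights[t] for t, w in query_weights.items() if t in doc_weights)

def hybrid_score(q_dense, d_dense, q_sparse, d_sparse, alpha=0.7):
    """Weighted sum of dense and sparse scores. alpha is a hypothetical tuning knob."""
    return alpha * cosine(q_dense, d_dense) + (1 - alpha) * lexical_score(q_sparse, d_sparse)

def rank(query, chunks, alpha=0.7):
    """Return chunk ids ordered from best to worst hybrid score."""
    scored = [
        (hybrid_score(query["dense"], c["dense"], query["sparse"], c["sparse"], alpha), c["id"])
        for c in chunks
    ]
    return [cid for _, cid in sorted(scored, key=lambda s: s[0], reverse=True)]

# Toy data (hypothetical): 2-dimensional dense vectors and token -> weight maps.
query = {"dense": [1.0, 0.0], "sparse": {"emissions": 0.5}}
chunks = [
    {"id": "a", "dense": [1.0, 0.0], "sparse": {}},
    {"id": "b", "dense": [0.0, 1.0], "sparse": {"emissions": 1.0}},
    {"id": "c", "dense": [1.0, 0.0], "sparse": {"emissions": 1.0}},
]
ranking = rank(query, chunks)
```

A chunk that matches on both signals ranks above one that matches on only one of them, which is the practical benefit of combining dense and sparse retrieval.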
Features
Structure-Preserving PDF Parsing
Docling layout parser — hierarchical chunking that preserves tables and section boundaries
Hybrid Search
BGE-M3 Dense+Sparse embeddings — pgvector hybrid semantic search
Auto Topic Discovery
Claude Sonnet — auto-discovers 41 topics and 415 checklist items from ESG reports
AI Qualitative Assessment
Checklist-based evaluation per topic — cited evidence, scores, and commentary
Cross-Company Comparison
Sector-filtered topic × company matrix — auto-generated comparison tables
Local + Free
BGE-M3 on-device embeddings, Claude & Gemini free-tier usage
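As an illustration of the checklist-based assessment and the sector-filtered topic × company matrix listed above, the following sketch averages per-item checklist scores into a topic score and arranges the results as a comparison table. Everything here is hypothetical: the `Assessment` record, the 0 to 1 score scale, the averaging rule, and the tab-separated rendering are assumptions chosen for the example, not the project's schema or scoring method.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Assessment:
    """One company's checklist results for one topic (hypothetical record shape)."""
    company: str
    sector: str
    topic: str
    item_scores: list  # one score per checklist item, assumed to lie in [0, 1]

def topic_score(assessment):
    """Average the checklist item scores; None when no items were evaluated."""
    scores = assessment.item_scores
    return sum(scores) / len(scores) if scores else None

def build_matrix(assessments, sector):
    """Build {topic: {company: score}} for companies in the given sector only."""
    matrix = defaultdict(dict)
    for a in assessments:
        if a.sector == sector:
            matrix[a.topic][a.company] = topic_score(a)
    return dict(matrix)

def render(matrix):
    """Render the matrix as a tab-separated table; '-' marks missing cells."""
    companies = sorted({c for row in matrix.values() for c in row})
    lines = ["topic\t" + "\t".join(companies)]
    for topic in sorted(matrix):
        cells = []
        for company in companies:
            value = matrix[topic].get(company)
            cells.append("-" if value is None else f"{value:.2f}")
        lines.append(topic + "\t" + "\t".join(cells))
    return "\n".join(lines)

# Toy data (hypothetical companies, sectors, and scores).
assessments = [
    Assessment("A", "energy", "climate", [1.0, 0.5]),
    Assessment("B", "energy", "climate", [0.0, 1.0]),
    Assessment("C", "finance", "climate", [1.0]),
    Assessment("A", "energy", "water", [1.0, 1.0]),
]
matrix = build_matrix(assessments, "energy")
table = render(matrix)
```

Filtering by sector before building the matrix keeps the comparison between peers, and the missing-cell marker makes it visible when a company's report does not cover a topic.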