LLM-based chunking is worth the cost only for high-value, semantically complex content where retrieval accuracy directly impacts revenue (legal discovery, medical literature, financial analysis). The method prompts an LLM to extract "propositions" (self-contained semantic units), which preserves context 25-30% better than embedding-based chunking. But it's 50-100x more expensive.
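As a rough illustration of what proposition extraction looks like, here is a minimal sketch. The prompt wording and the `call_llm` helper (a stand-in for whatever LLM client you actually use) are assumptions, not a reference implementation:

```python
import json

PROPOSITION_PROMPT = """Decompose the passage below into a list of propositions.
Each proposition must be a single, self-contained statement: resolve pronouns,
name the subject explicitly, and avoid relying on surrounding sentences.
Return a JSON array of strings.

Passage:
{passage}
"""

def extract_propositions(passage: str, call_llm) -> list[str]:
    """Ask an LLM to rewrite a passage as self-contained propositions.

    `call_llm` is a hypothetical helper: it takes a prompt string and returns
    the model's text completion.
    """
    raw = call_llm(PROPOSITION_PROMPT.format(passage=passage))
    try:
        propositions = json.loads(raw)
    except json.JSONDecodeError:
        # Fall back to line splitting if the model ignores the JSON instruction.
        propositions = [line.strip("-* ").strip() for line in raw.splitlines() if line.strip()]
    return [p for p in propositions if p]
```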
For production, implement a hybrid pipeline: route only high-value documents through LLM proposition chunking, and use cheap embedding- or rule-based chunking for everything else.
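One possible shape for that routing, as a sketch under assumptions (the `high_value` flag, chunk size, and overlap are illustrative, not prescribed values):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Document:
    doc_id: str
    text: str
    high_value: bool  # set upstream for legal/medical/financial content

def chunk_document(
    doc: Document,
    proposition_chunker: Callable[[str], list[str]],
    max_chars: int = 1200,
    overlap: int = 200,
) -> list[str]:
    """Send high-value documents to the expensive LLM proposition chunker;
    use cheap fixed-size chunking with overlap for everything else."""
    if doc.high_value:
        return proposition_chunker(doc.text)
    chunks, start = [], 0
    while start < len(doc.text):
        chunks.append(doc.text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

This keeps the 50-100x cost multiplier confined to the small slice of the corpus where the accuracy gain actually pays for itself.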
Optimization tips:
Monitor proposition quality on a sample: if fewer than 80% of propositions are truly self-contained, your prompt needs tuning. Real-world benchmark: LLM chunking improved answer accuracy by 18% on multi-hop reasoning questions but increased indexing costs by 40x.
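A minimal sketch of that spot check, assuming an LLM-as-judge approach and the same hypothetical `call_llm` helper as above; the judge prompt and sample size are assumptions you should tune:

```python
import random

SELF_CONTAINED_CHECK = """Does the following statement stand on its own, with no
unresolved pronouns or missing referents? Answer only "yes" or "no".

Statement: {proposition}
"""

def proposition_quality(propositions: list[str], call_llm, sample_size: int = 50) -> float:
    """Estimate the fraction of propositions that are self-contained
    by spot-checking a random sample with an LLM judge."""
    sample = random.sample(propositions, min(sample_size, len(propositions)))
    passed = sum(
        1 for p in sample
        if call_llm(SELF_CONTAINED_CHECK.format(proposition=p)).strip().lower().startswith("yes")
    )
    return passed / len(sample) if sample else 0.0

# If the score drops below ~0.8, revisit the extraction prompt before scaling up.
```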