measurement
The system is configured with the all-mpnet-base-v2 model from the Sentence Transformers library for embeddings, Chroma for persistent vector storage, a chunk size of 512 tokens with a 100-token overlap, and a batch size of up to 10,000 chunks.
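The chunking and batching parameters above can be illustrated with a minimal sketch. It is not the source's implementation: the names `RagConfig`, `chunk_tokens`, and `batched` are hypothetical, tokens are assumed to be pre-split strings (the source does not say which tokenizer is used), and the embedding and Chroma calls are left out so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence


@dataclass(frozen=True)
class RagConfig:
    """Hypothetical container for the settings named in the description."""
    embedding_model: str = "all-mpnet-base-v2"
    vector_store: str = "chroma"
    chunk_size: int = 512        # tokens per chunk
    chunk_overlap: int = 100     # tokens shared by consecutive chunks
    max_batch_size: int = 10_000 # upper bound on chunks per batch


def chunk_tokens(tokens: Sequence[str], cfg: RagConfig) -> List[List[str]]:
    """Split a token sequence into overlapping fixed-size windows."""
    stride = cfg.chunk_size - cfg.chunk_overlap  # 412 with the defaults
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), stride):
        chunks.append(list(tokens[start:start + cfg.chunk_size]))
        # Stop once the current window has reached the end of the input.
        if start + cfg.chunk_size >= len(tokens):
            break
    return chunks


def batched(chunks: Sequence, cfg: RagConfig) -> Iterator[list]:
    """Yield successive batches of at most cfg.max_batch_size chunks."""
    for i in range(0, len(chunks), cfg.max_batch_size):
        yield list(chunks[i:i + cfg.max_batch_size])


if __name__ == "__main__":
    cfg = RagConfig()
    # Assumption: whitespace-style tokens stand in for real model tokens.
    tokens = [str(i) for i in range(1000)]
    chunks = chunk_tokens(tokens, cfg)
    print([len(c) for c in chunks])
```

With a 512-token window and a 100-token overlap, each new chunk starts 412 tokens after the previous one, so the last 100 tokens of one chunk reappear as the first 100 of the next.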
Sources
- Bridging the Gap Between LLMs and Evolving Medical Knowledge (arxiv.org)
Referenced by nodes (1)
- Chroma concept