procedure
The DocumentParser component uses the open-source Docling library to convert input documents (PDF, HTML, XLSX, CSV, and similar formats) into a unified intermediate representation, serialized as JSON or Markdown, while preserving layout, tables, and structural metadata.
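The conversion step described above might look roughly like the following sketch. It is not the DocumentParser implementation itself: the names `parse_document`, `is_supported`, and `SUPPORTED_SUFFIXES` are hypothetical, and the extension allow-list simply mirrors the formats named in the description. Only `DocumentConverter`, `convert`, `export_to_markdown`, and `export_to_dict` come from Docling, which is assumed to be installed.

```python
import json
from pathlib import Path

# Hypothetical allow-list mirroring the formats named in the description;
# Docling itself accepts more formats than these.
SUPPORTED_SUFFIXES = {".pdf", ".html", ".xlsx", ".csv"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the allow-list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def parse_document(path: str, output_format: str = "markdown") -> str:
    """Convert a document to Markdown or JSON text using Docling.

    A sketch under stated assumptions, not the component's actual code.
    """
    if output_format not in ("markdown", "json"):
        raise ValueError(f"unsupported output format: {output_format!r}")
    if not is_supported(path):
        raise ValueError(f"unsupported input format: {path!r}")

    # Imported lazily so the validation helpers work without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(path)
    document = result.document  # Docling's unified document representation

    if output_format == "markdown":
        return document.export_to_markdown()
    # export_to_dict() returns a plain dict, serialized here as JSON text.
    return json.dumps(document.export_to_dict(), indent=2)
```

Calling `parse_document("report.pdf", output_format="json")` would return the JSON form of the parsed document, while the default returns Markdown. Validating the arguments before importing Docling keeps input errors cheap to detect.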
Sources
- Efficient Knowledge Graph Construction and Retrieval from ... - arXiv (arxiv.org)
Referenced by nodes (1)
- JSON concept