Vertesia Launches New Semantic Document Preparation Service

June 3, 2025, 14:15

Revolutionary agentic service speeds development by transforming complex PDF documents
into richly structured XML, enabling GenAI models to accurately interpret content and deliver
reliable results

BROOKLYN, N.Y., June 3, 2025 /PRNewswire/ -- Vertesia, a unified, low-code platform for developing, deploying, and operating custom generative AI applications, today announced the launch of its Semantic DocPrep service, a cloud-based API service designed to eliminate hallucinations and speed the development of generative AI (GenAI) applications.

As enterprise adoption of GenAI expands, organizations face two persistent challenges: ensuring output accuracy and managing the burden of data preparation. According to Vertesia's research, up to 50% of GenAI development time is consumed by document preparation alone. Semantic DocPrep removes these barriers.

"The two concerns we hear most from enterprise leaders are consistent: 95% accuracy isn't good enough, and data preparation is a costly, time-consuming challenge," said Chris McLaughlin, Chief Revenue Officer at Vertesia. "Our Semantic DocPrep service was built to solve both – giving developers a set of APIs to automate document preparation and significantly improve the accuracy and relevancy of LLM outputs. It removes two major hurdles to building reliable, enterprise-grade GenAI applications."

With five patents pending, Vertesia's new Semantic DocPrep service works by converting even the most complex documents, such as invoices, annual reports, and regulatory filings, into richly structured, semantically tagged XML – without rewriting or altering the source. By preserving the original structure, relationships, and context, Vertesia ensures that large language models (LLMs) can accurately interpret documents without fabricating or misrepresenting information – dramatically improving the accuracy and reliability of model outputs.

Unlike conventional tools that flatten or rewrite inputs, Vertesia's approach deconstructs documents at the page level, automatically determining the most appropriate AI model based on that page's content — whether it's dense text, tabular data, images, or a mix. Some pages are best handled by LLMs, others by OCR or vision models. This hybrid method also forbids model rewrites, preserving the original text without corrections. The output is high-fidelity XML that precisely mirrors the original document and supports downstream processing with 100% accuracy.
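The page-level routing described above can be illustrated with a minimal sketch. It is not Vertesia's implementation: the `Page` fields, the `choose_processor` heuristic, and the processor labels are all hypothetical, chosen only to show the idea of selecting a processor per page based on its content.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical per-page summary; these fields are assumptions."""
    number: int
    text_chars: int    # characters of extractable text on the page
    table_count: int   # number of detected tables
    image_count: int   # number of embedded images


def choose_processor(page: Page) -> str:
    """Pick a processor label for one page (illustrative heuristic only)."""
    if page.text_chars == 0 and page.image_count > 0:
        return "ocr"     # scanned page with no text layer
    if page.table_count > 0 or page.image_count > 0:
        return "vision"  # layout-heavy page with tables or figures
    return "llm"         # plain dense text


def route_document(pages: list[Page]) -> dict[int, str]:
    """Map each page number to the processor chosen for it."""
    return {page.number: choose_processor(page) for page in pages}


pages = [Page(1, 3200, 0, 0), Page(2, 900, 2, 0), Page(3, 0, 0, 1)]
routing = route_document(pages)
print(routing)  # {1: 'llm', 2: 'vision', 3: 'ocr'}
```

A real router would likely rely on richer signals than three counters, but the shape is the same: classify each page independently, then dispatch it.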

Designed for developers building custom GenAI apps and Retrieval-Augmented Generation (RAG) systems, Semantic DocPrep fits seamlessly into modern AI pipelines. Developers send documents—PDFs generated from Word, PowerPoint, or other formats—via an API, and receive structured XML output that's ready for chunking, indexing, and model ingestion. No setup or model training is required.
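As a rough illustration of why structured XML is convenient for chunking, the sketch below splits a small XML document into one chunk per section using Python's standard library. The element names (`document`, `section`, `paragraph`, `table`) and the sample content are assumptions made for this example; the release does not specify the actual output schema of Semantic DocPrep.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample output; the real schema is not described in the release.
SAMPLE_XML = """<document>
  <section title="Summary">
    <paragraph>Revenue grew in 2024.</paragraph>
    <table><row><cell>Q1</cell><cell>10</cell></row></table>
  </section>
  <section title="Risks">
    <paragraph>Supply costs may rise.</paragraph>
  </section>
</document>"""


def chunk_by_section(xml_text: str) -> list[dict]:
    """Return one chunk per <section>, with its title and flattened text."""
    root = ET.fromstring(xml_text)
    chunks = []
    for section in root.iter("section"):
        # Gather all nested text and collapse runs of whitespace.
        text = " ".join(" ".join(section.itertext()).split())
        chunks.append({"title": section.get("title", ""), "text": text})
    return chunks


chunks = chunk_by_section(SAMPLE_XML)
for chunk in chunks:
    print(chunk["title"], "->", chunk["text"])
```

Because section boundaries come from the markup rather than from character counts, each chunk keeps its heading and table content together, which is the property a RAG indexer typically wants.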

Semantic DocPrep is part of Vertesia's broader platform, which provides the end-to-end infrastructure organizations need to build, deploy, and manage custom GenAI applications and agents at scale. From intelligent content pre-processing to agentic RAG, hybrid search and observability, Vertesia offers a unified foundation to accelerate GenAI development while maintaining control, accuracy, and performance. Pricing is designed to be affordable and starts well below other document processing services, while delivering higher output fidelity, precision, and control. Get started with a free trial or learn more by visiting: vertesiahq.com

About Vertesia
Vertesia is a unified, low-code platform for developing and deploying generative AI (GenAI) applications in days, not months. The unified platform enables customers to intelligently operate these solutions, giving new levels of visibility and ownership to the business and ensuring full governance and compliance. Simply put, Vertesia delivers GenAI at enterprise scale.

Media Contact:
Andrea LePain
andrea@emediajunction.com
617-894-1153

View original content to download multimedia: https://www.prnewswire.com/news-releases/vertesia-launches-new-semantic-document-preparation-service-302471462.html

SOURCE Vertesia