Beyond OCR: Benchmarking Question-Answering on Complex Industrial PDFs with TIA-pdf-QA-Bench

Jun 3, 2025

ThirdAI Automation Team

Document QA, Industrial Automation, OCR Benchmarking, RAG Systems, PDF Processing, ThirdAI

Introduction

A growing number of benchmark datasets have emerged for evaluating document understanding, particularly in areas like Optical Character Recognition (OCR), information extraction, and question answering (QA). However, many existing benchmarks rely on clean document formats or fail to evaluate end-to-end QA pipeline quality. This makes them insufficient for assessing real-world industrial documents where noise, formatting variability, and complex semantics are the norm.

At ThirdAI Automation, we address a fundamental question: How well can we answer questions based on complex, semi-structured PDF documents from industrial domains? To answer it, we created TIA-pdf-QA-Bench, a new benchmark that evaluates end-to-end QA performance over PDFs, with an emphasis on retrieval-augmented generation (RAG) pipelines.



Why Traditional OCR Benchmarks Fall Short

OCR performance is traditionally evaluated in isolation using word/character accuracy metrics. While useful for assessing text extraction fidelity, this approach misses a crucial downstream impact: How do OCR mistakes affect real use cases like question answering?

For example:

  • A single misrecognized term in a spec sheet might barely change the OCR accuracy score

  • The same error could derail a QA system that extracts critical parameters

  • What matters is not only whether the text is readable, but whether it is usable for the task
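
To make this gap concrete, here is a minimal Python sketch. The spec-sheet line, the one-digit OCR error, and the regex-based extract_pressure helper are all hypothetical illustrations, not material from the benchmark. The sketch computes a character error rate (edit distance divided by reference length) and then runs a naive parameter extractor on both strings.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance where insertion, deletion, and substitution each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)


def extract_pressure(text: str):
    """Naive stand-in for a QA step: pull the pressure value out of the text."""
    match = re.search(r"pressure:\s*(\d+)\s*bar", text)
    return match.group(1) if match else None


# Hypothetical spec-sheet line and an OCR output with one misread digit.
reference = "Rated supply voltage: 24 V DC, maximum operating pressure: 10 bar"
ocr_output = "Rated supply voltage: 24 V DC, maximum operating pressure: 70 bar"

cer = char_error_rate(reference, ocr_output)
print(f"Character error rate: {cer:.3f}")
print("Answer from reference text:", extract_pressure(reference))
print("Answer from OCR text:", extract_pressure(ocr_output))
```

Under these assumptions, one substituted digit keeps the character error rate below 2%, yet the extracted pressure value is wrong.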



The Real Challenge: Retrieval and Understanding

OCR is just the beginning. Extracted text must be chunked, linked, and indexed to enable effective retrieval and reasoning. Industrial documents present unique challenges:

  • Long documents with heterogeneous formatting

  • Tables, figures, and side-by-side layouts

  • Implicit references and domain-specific terminology

  • Dense hierarchical structures (specifications, standards)

In TIA-pdf-QA-Bench, we found that text chunking and representation structure strongly affect QA performance. Poor chunking leads to:

  • Missed answers

  • Irrelevant retrievals

  • Hallucinations in generative models
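
The effect of chunk boundaries can be sketched in a few lines of Python. The toy document, the chunk size, and both helper functions below are assumptions made for illustration. The heading-based splitter is only a simple structure-aware stand-in and is not necessarily how the pipelines in the benchmark chunk text.

```python
import re


def fixed_length_chunks(text: str, size: int) -> list[str]:
    """Cut the text every `size` characters, ignoring document structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def section_chunks(text: str) -> list[str]:
    """Start a new chunk at each numbered heading such as '4.2 Gearbox ratings'."""
    parts = re.split(r"^(?=\d+(?:\.\d+)*\s)", text, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]


# Hypothetical excerpt: the same parameter name appears under two headings.
doc = (
    "4.1 Motor ratings\n"
    "Rated torque: 12 Nm\n"
    "4.2 Gearbox ratings\n"
    "Rated torque: 85 Nm\n"
)

fixed = fixed_length_chunks(doc, size=30)
sections = section_chunks(doc)

for name, chunks in (("fixed-length", fixed), ("section-based", sections)):
    print(name)
    for chunk in chunks:
        print("  ", repr(chunk))
```

In this toy case, the fixed-length split separates "12 Nm" from its "Motor ratings" heading and places it in the same chunk as the "Gearbox ratings" heading, which is the kind of boundary that can produce an irrelevant retrieval or a wrong answer. The section-based split keeps each value with its own heading.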



About TIA-pdf-QA-Bench

TIA-pdf-QA-Bench evaluates QA quality on real-world industrial documents with these features:

  • Uses authentic PDFs from industrial partners and public sources

  • Simulates realistic QA scenarios (domain terminology, multi-hop reasoning)

  • Evaluates the end-to-end pipeline (OCR → preprocessing → retrieval → answer generation)

We tested multiple RAG pipelines using:

  • OCR tools: Tesseract, Azure OCR

  • Chunking strategies: Fixed-length vs. semantic

  • Retrieval methods: Dense vs. sparse
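
One way to organize such a comparison is to score every combination of components on the same question set. The Python sketch below is a hedged illustration, not the benchmark's actual harness: the exact-match metric, the QAExample type, the sample questions, and the build_pipeline factory are all assumptions, and the factory returns a stub rather than calling any real OCR or retrieval library.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass(frozen=True)
class QAExample:
    question: str
    answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predict: Callable[[str], str],
                         examples: list[QAExample]) -> float:
    """Fraction of questions whose predicted answer matches the reference."""
    if not examples:
        return 0.0
    hits = sum(normalize(predict(ex.question)) == normalize(ex.answer)
               for ex in examples)
    return hits / len(examples)


# Component options named in the post.
OCR_TOOLS = ("Tesseract", "Azure OCR")
CHUNKING = ("fixed-length", "semantic")
RETRIEVAL = ("dense", "sparse")


def build_pipeline(ocr: str, chunking: str, retrieval: str) -> Callable[[str], str]:
    """Hypothetical factory. A real version would wire OCR, chunking, retrieval,
    and answer generation together; this stub always answers 'unknown'."""
    def predict(question: str) -> str:
        return "unknown"
    return predict


# Hypothetical QA pairs, for illustration only.
examples = [
    QAExample("What is the maximum operating pressure?", "10 bar"),
    QAExample("What is the rated supply voltage?", "24 V DC"),
]

results = {
    config: exact_match_accuracy(build_pipeline(*config), examples)
    for config in product(OCR_TOOLS, CHUNKING, RETRIEVAL)
}

for (ocr, chunking, retrieval), accuracy in results.items():
    print(f"{ocr:10} | {chunking:12} | {retrieval:6} | accuracy={accuracy:.2f}")
```

Because the score is computed on final answers only, an error introduced at any stage (OCR, chunking, or retrieval) shows up in the same number, which is the point of evaluating the pipeline end to end.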



Key Insights

ThirdAI Automation's RAG framework achieved the highest QA accuracy in our benchmark.

Figure: ThirdAI's RAG framework outperformed alternatives in industrial document QA

More benchmark details will be released in an upcoming research paper!



What's Next?

TIA-pdf-QA-Bench advances realistic, task-oriented evaluation of document intelligence systems. We're expanding the benchmark with:

  • More document types

  • Richer annotations

  • Harder questions to identify failure cases

Example document from our benchmark:

Figure: Complex industrial documents in our benchmark require advanced understanding capabilities



Grow With Us

Working on industrial document QA or building PDF reasoning systems? Reach out to our team.

We're releasing an API for testing our OCR, chunking, and RAG functionality! Join the waitlist for early access.

