VECTOR FEED

Pipeline Backend

The Pipeline backend is a traditional computer vision pipeline composed of sequential expert models. It is the default backend and offers the best throughput for batch processing on GPU hardware.

Processing Stages

Pre-processing

Orientation Correction: Detects and corrects page rotation anomalies before parsing.
Language Detection: Identifies document language via fast-langdetect to select appropriate OCR models.
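
As a rough illustration of orientation correction, the sketch below snaps an estimated page rotation to the nearest right angle and derives the counter-rotation to apply. It assumes the detector reports an angle in degrees and that only multiples of 90 are corrected. Both are assumptions for illustration, and the function names are hypothetical rather than part of the backend.

```python
def snap_to_right_angle(angle_degrees: float) -> int:
    """Snap an estimated rotation to the nearest of 0, 90, 180 or 270 degrees."""
    quarter_turns = int(round(angle_degrees / 90.0)) % 4
    return quarter_turns * 90


def correction_for(angle_degrees: float) -> int:
    """Return the rotation (same direction convention) that makes the page upright."""
    return (360 - snap_to_right_angle(angle_degrees)) % 360


# A page detected at roughly 88.5 degrees is treated as rotated by 90,
# so a further 270 degrees brings it back to upright.
print(snap_to_right_angle(88.5), correction_for(88.5))
```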

Layout Analysis

Page Layout Detection: Model-based classification of page regions (text, title, image, table, formula).
Block Segmentation: Divides detected regions into individual content blocks.
Reading Order Reconstruction: Reassembles blocks in logical reading sequence, handling multi-column and complex layouts.
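
To make the multi-column case concrete, here is a minimal sketch of one simple reading-order heuristic: group blocks into columns by their left edge, then read each column top to bottom. This is not the backend's actual algorithm. The Block type, the column_gap threshold, and the assumption that no block spans several columns are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    text: str
    x0: float  # left edge
    y0: float  # top edge


def reading_order(blocks: List[Block], column_gap: float = 50.0) -> List[Block]:
    """Order blocks column by column, top to bottom within each column."""
    columns: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: b.x0):
        # A block joins the current column if its left edge is close to the
        # left edge of that column's first block; otherwise it starts a new one.
        if columns and block.x0 - columns[-1][0].x0 <= column_gap:
            columns[-1].append(block)
        else:
            columns.append([block])

    ordered: List[Block] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y0))
    return ordered


page = [
    Block("right-top", 320, 100),
    Block("left-bottom", 50, 300),
    Block("left-top", 50, 100),
    Block("right-bottom", 320, 300),
]
print([b.text for b in reading_order(page)])
```

A production implementation would also need to handle full-width titles, figures that interrupt columns, and skewed layouts, none of which this heuristic covers.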

OCR Engine

Performs text recognition across detected text regions with multi-language support. The OCR model is selected based on the detected document language.
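
Language-based model selection can be sketched as a lookup with a fallback. The model names and the mapping below are hypothetical placeholders, not the identifiers the backend actually uses. The only assumption about the language detector is that it yields a language code such as "en" or "zh-CN".

```python
# Hypothetical mapping from base language code to an OCR model name.
OCR_MODELS = {
    "en": "ocr-latin",
    "zh": "ocr-chinese",
    "ja": "ocr-japanese",
}
DEFAULT_OCR_MODEL = "ocr-multilingual"  # hypothetical fallback


def select_ocr_model(lang_code: str) -> str:
    """Pick an OCR model from a detected language code, ignoring region suffixes."""
    base = lang_code.lower().replace("_", "-").split("-")[0]
    return OCR_MODELS.get(base, DEFAULT_OCR_MODEL)


print(select_ocr_model("zh-CN"))
```

Falling back to a multilingual model keeps the pipeline running when detection returns a language with no dedicated model.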

Formula Recognition

Subsystem	Function
MFD (Math Formula Detection)	Identifies regions containing mathematical expressions
MFR (Math Formula Recognition)	Converts detected formula images to LaTeX notation
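
The two subsystems form a detect-then-recognize chain: MFD proposes regions, and MFR is run on each cropped region. The sketch below shows that control flow only. The detect and recognize callables stand in for the real models, and the list-of-rows image type is a simplification chosen to keep the example self-contained.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive
Image = List[List[int]]          # rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def recognize_formulas(
    image: Image,
    detect: Callable[[Image], List[Box]],   # stands in for MFD
    recognize: Callable[[Image], str],      # stands in for MFR
) -> List[Tuple[Box, str]]:
    """Run detection once, then recognition on every detected region."""
    return [(box, recognize(crop(image, box))) for box in detect(image)]
```

Keeping each box alongside its LaTeX string lets later stages place the formula back into the page layout.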

Table Recognition

Type	Method
Wired Tables	Grid line detection and cell extraction
Wireless Tables	Content-spacing analysis for boundary detection
OTSL Format	Optimized Table Structure Language, used to represent table structure for downstream processing
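
For readers unfamiliar with OTSL, the following sketch decodes a token sequence into cell spans. It assumes the token names from the published OTSL description: C starts a cell, L extends the cell to its left, U extends the cell above, X extends both, and NL ends a row. The exact token vocabulary this backend emits is an assumption, and the function names are hypothetical.

```python
from typing import List, Tuple


def parse_otsl(tokens: List[str]) -> List[List[str]]:
    """Split a flat OTSL token sequence into rows at each NL token."""
    rows: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        if tok == "NL":
            rows.append(current)
            current = []
        else:
            current.append(tok)
    if current:  # tolerate a missing trailing NL
        rows.append(current)
    return rows


def cell_spans(rows: List[List[str]]) -> List[Tuple[int, int, int, int]]:
    """Return (row, col, rowspan, colspan) for every cell that starts with C."""
    spans = []
    for r, row in enumerate(rows):
        for c, tok in enumerate(row):
            if tok != "C":
                continue
            # Count L tokens to the right: each one widens the cell by a column.
            colspan = 1
            while c + colspan < len(row) and row[c + colspan] == "L":
                colspan += 1
            # Count U tokens below: each one extends the cell by a row.
            rowspan = 1
            while (r + rowspan < len(rows)
                   and c < len(rows[r + rowspan])
                   and rows[r + rowspan][c] == "U"):
                rowspan += 1
            spans.append((r, c, rowspan, colspan))
    return spans


rows = parse_otsl("C L C NL C C U NL".split())
print(cell_spans(rows))
```

In this example the first cell spans two columns and the last cell of the first row spans two rows, which is the information needed to emit colspan and rowspan attributes in an HTML or Markdown rendering.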

middle_json Assembly

All model outputs are merged into a normalized middle_json intermediate representation, which is then passed to the Markdown generator.
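
The merge step can be pictured as attaching each model's output to the layout block it belongs to. The sketch below is illustrative only: the field names (page_index, blocks, text, latex, table) and the function signature are hypothetical and do not reflect the actual middle_json schema.

```python
import json
from typing import Any, Dict, List


def assemble_middle_json(
    page_index: int,
    layout_blocks: List[Dict[str, Any]],  # each has "id", "type", "bbox"
    ocr_text: Dict[str, str],             # block id -> recognized text
    formulas: Dict[str, str],             # block id -> LaTeX
    tables: Dict[str, Any],               # block id -> table structure
) -> Dict[str, Any]:
    """Merge per-model outputs into one page-level dictionary (hypothetical schema)."""
    blocks = []
    for block in layout_blocks:
        entry = {"id": block["id"], "type": block["type"], "bbox": block["bbox"]}
        if block["type"] in ("text", "title"):
            entry["text"] = ocr_text.get(block["id"], "")
        elif block["type"] == "formula":
            entry["latex"] = formulas.get(block["id"], "")
        elif block["type"] == "table":
            entry["table"] = tables.get(block["id"])
        blocks.append(entry)
    return {"page_index": page_index, "blocks": blocks}


page = assemble_middle_json(
    0,
    [
        {"id": "b1", "type": "title", "bbox": [50, 40, 500, 80]},
        {"id": "b2", "type": "formula", "bbox": [50, 100, 300, 140]},
    ],
    ocr_text={"b1": "Introduction"},
    formulas={"b2": "E = mc^2"},
    tables={},
)
print(json.dumps(page, indent=2))
```

Keeping blocks in reading order inside this structure means the Markdown generator can walk the list once and emit output without consulting the models again.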