Pipeline Backend

The Pipeline backend is a traditional computer vision pipeline composed of sequential expert models. It is the default backend and offers the best throughput for batch processing on GPU hardware.

[Figure: Pipeline Backend Flow]

Processing Stages

Pre-processing

  • Orientation Correction: Detects and corrects page rotation anomalies before parsing.
  • Language Detection: Identifies document language via fast-langdetect to select appropriate OCR models.

Layout Analysis

  • Page Layout Detection: Model-based classification of page regions (text, title, image, table, formula).
  • Block Segmentation: Divides detected regions into individual content blocks.
  • Reading Order Reconstruction: Reassembles blocks in logical reading sequence, handling multi-column and complex layouts.
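As an illustration of the reading-order step, here is a minimal sketch that assumes a simple fixed-column page model. The `Block` class, the `reading_order` function, and the "wider than one column means full-width" rule are hypothetical and chosen for illustration; the model-based reconstruction the backend actually uses is not described here and is likely more sophisticated.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical content block: a label plus a bounding box in page coordinates."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float


def reading_order(blocks, page_width, n_columns=2):
    """Order blocks column by column, top to bottom within each column.

    Assumption: a block wider than one column (for example a page-wide title)
    spans the page, so every block above it is read before it.
    """
    col_width = page_width / n_columns
    ordered = []
    pending = []  # single-column blocks seen since the last full-width block

    def flush():
        # Sort by the column containing the block centre, then by top edge.
        pending.sort(key=lambda b: (int(((b.x0 + b.x1) / 2) // col_width), b.y0))
        ordered.extend(pending)
        pending.clear()

    for block in sorted(blocks, key=lambda b: b.y0):
        if (block.x1 - block.x0) > col_width:
            flush()
            ordered.append(block)
        else:
            pending.append(block)
    flush()
    return ordered


page = [
    Block("right-top", 320, 100, 580, 200),
    Block("title", 20, 20, 580, 60),
    Block("left-bottom", 20, 220, 280, 400),
    Block("left-top", 20, 100, 280, 200),
    Block("right-bottom", 320, 220, 580, 400),
]
order = [b.label for b in reading_order(page, page_width=600)]
print(order)
```

On this two-column page the title comes first, followed by the left column from top to bottom and then the right column. A naive sort by vertical position alone would interleave the two columns, which is the failure this stage exists to avoid.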

OCR Engine

Performs text recognition across detected text regions with multi-language support. The OCR model is selected based on the detected document language.
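The language-based selection can be sketched as a lookup with a fallback. Everything named below (the model names, the `OCR_MODELS` table, `select_ocr_model`, and the confidence threshold) is an assumption made for illustration, not the backend's actual configuration. The language code and confidence score stand in for whatever the language detector returns.

```python
# Hypothetical mapping from language codes to OCR model names.
OCR_MODELS = {
    "en": "ocr_latin",
    "fr": "ocr_latin",
    "zh": "ocr_cjk",
    "ja": "ocr_cjk",
    "ar": "ocr_arabic",
}
DEFAULT_OCR_MODEL = "ocr_multilingual"  # hypothetical fallback model


def select_ocr_model(lang_code, confidence, min_confidence=0.5):
    """Pick an OCR model name for a detected language.

    Falls back to the default model when the detection is not confident
    enough or when the language has no dedicated model.
    """
    if confidence < min_confidence:
        return DEFAULT_OCR_MODEL
    # Normalize codes such as "zh-CN" or "zh_cn" to the base language "zh".
    base = lang_code.lower().replace("_", "-").split("-")[0]
    return OCR_MODELS.get(base, DEFAULT_OCR_MODEL)


print(select_ocr_model("zh-CN", 0.93))
print(select_ocr_model("en", 0.20))
```

Keeping a fallback model means that a low-confidence or unsupported detection degrades recognition quality rather than failing the page outright.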

Formula Recognition

Subsystem                        Function
MFD (Math Formula Detection)     Identifies regions containing mathematical expressions
MFR (Math Formula Recognition)   Converts detected formula images to LaTeX notation

Table Recognition

Type              Method
Wired Tables      Grid line detection and cell extraction
Wireless Tables   Content-spacing analysis for boundary detection
OTSL Format       Output Table Structure Language for downstream processing
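To make the content-spacing idea for wireless tables concrete, the sketch below infers column boundaries from the horizontal gaps between text boxes. The function name `column_boundaries`, the `(x0, x1)` input format, and the `min_gap` threshold are assumptions for illustration. This shows the general technique only and is not the backend's table model.

```python
def column_boundaries(spans, min_gap=10.0):
    """Return x positions separating columns in a table without ruling lines.

    spans: (x0, x1) horizontal extents of text boxes collected from all rows.
    A boundary is placed at the midpoint of every horizontal gap that is at
    least min_gap wide and that no text box crosses.
    """
    if not spans:
        return []

    # Merge overlapping extents so that each merged interval is a band of
    # x positions covered by text somewhere in the table.
    merged = []
    for x0, x1 in sorted(spans):
        if merged and x0 <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], x1)
        else:
            merged.append([x0, x1])

    # Every sufficiently wide gap between neighbouring bands is a boundary.
    boundaries = []
    for (_, left_end), (right_start, _) in zip(merged, merged[1:]):
        if right_start - left_end >= min_gap:
            boundaries.append((left_end + right_start) / 2)
    return boundaries


spans = [(10, 60), (12, 55), (100, 150), (105, 160), (200, 240), (243, 260)]
bounds = column_boundaries(spans)
print(bounds)
```

In the example, the 3-unit gap between the last two boxes is below the threshold, so they stay in the same column. Only the two wide gaps produce boundaries.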

middle_json Assembly

All model outputs are merged into a normalized middle_json intermediate representation, which is then passed to the Markdown generator.
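The assembly step can be pictured as a merge keyed by block id. The field names and structure below are hypothetical and do not reproduce the real middle_json schema. The sketch assumes that layout blocks arrive in reading order and that each model's output is a dict keyed by block id.

```python
import json


def assemble_page(page_index, layout_blocks, ocr_text, formulas, tables):
    """Merge per-model outputs for one page into a single JSON-serializable dict.

    layout_blocks: list of {"id", "type", "bbox"} dicts, assumed to be in reading order.
    ocr_text, formulas, tables: dicts mapping block id to that model's output.
    """
    blocks = []
    for block in layout_blocks:
        bid = block["id"]
        entry = {"id": bid, "type": block["type"], "bbox": block["bbox"]}
        if block["type"] == "formula":
            entry["latex"] = formulas.get(bid, "")
        elif block["type"] == "table":
            entry["table"] = tables.get(bid, "")
        elif block["type"] in ("text", "title"):
            entry["text"] = ocr_text.get(bid, "")
        blocks.append(entry)
    return {"page": page_index, "blocks": blocks}


layout = [
    {"id": 0, "type": "title", "bbox": [20, 20, 580, 60]},
    {"id": 1, "type": "text", "bbox": [20, 100, 280, 200]},
    {"id": 2, "type": "formula", "bbox": [20, 220, 280, 260]},
]
page_doc = assemble_page(
    page_index=0,
    layout_blocks=layout,
    ocr_text={0: "Results", 1: "Energy is conserved."},
    formulas={2: "E = mc^2"},
    tables={},
)
print(json.dumps(page_doc, indent=2))
```

A single normalized structure like this lets the Markdown generator walk the blocks in order without needing to know which model produced each field.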