VECTOR FEED

Architecture

VECTOR FEED employs a multi-backend architecture that dispatches document parsing tasks to specialized engines based on configuration. The central orchestration functions do_parse and aio_do_parse in cli/common.py manage the full lifecycle from raw document bytes to structured output.

System Architecture Overview

Backend Overview

Backend   Identifier  Description
Pipeline  pipeline    Traditional multi-model pipeline: layout detection, OCR, MFD/MFR, table recognition. Best for high-throughput batch processing on GPU.
VLM       vlm         End-to-end vision-language model (Qwen2-VL). Single model handles layout, text, formulas, and tables. Higher accuracy, GPU-intensive.
Hybrid    hybrid      VLM for coarse layout + pipeline expert models for refinement (table structure, formula verification).
Office    office      Direct converter for DOCX, PPTX, XLSX using python-pptx, python-docx, openpyxl. No ML inference required.
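
As a rough illustration of configuration-based dispatch, the four identifiers in the backend overview can be mapped to handler functions through a lookup table. This is a minimal sketch, not the project's actual code: the handler names (run_pipeline, run_vlm, run_hybrid, run_office), the dispatch function, and the returned dictionaries are all hypothetical placeholders.

```python
from typing import Callable, Dict

# Hypothetical handlers: each takes raw document bytes and returns a small
# dict standing in for the intermediate result. The real backend entry
# points are not shown in this document.
def run_pipeline(data: bytes) -> dict:
    return {"backend": "pipeline", "size": len(data)}

def run_vlm(data: bytes) -> dict:
    return {"backend": "vlm", "size": len(data)}

def run_hybrid(data: bytes) -> dict:
    return {"backend": "hybrid", "size": len(data)}

def run_office(data: bytes) -> dict:
    return {"backend": "office", "size": len(data)}

# Keys are the backend identifiers listed in the overview.
BACKENDS: Dict[str, Callable[[bytes], dict]] = {
    "pipeline": run_pipeline,
    "vlm": run_vlm,
    "hybrid": run_hybrid,
    "office": run_office,
}

def dispatch(backend: str, data: bytes) -> dict:
    """Look up the handler for a backend identifier and run it."""
    try:
        handler = BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; expected one of {sorted(BACKENDS)}"
        ) from None
    return handler(data)
```

A lookup table keeps the set of valid identifiers in one place, so an unknown identifier fails early with a clear error instead of falling through a chain of conditionals.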

Dispatch Flow

The backend dispatcher in utils/magic_model_utils.py classifies each page using a lightweight ML model to determine which backend handles it. Pages with heavy mathematical content or complex tables may be routed to VLM even when Pipeline is selected as the primary backend.
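
The per-page override described above could look roughly like the following. This is only a sketch under stated assumptions: PageStats, route_page, the two score fields, and the threshold values are invented for illustration, and the real classifier in utils/magic_model_utils.py may use entirely different signals.

```python
from dataclasses import dataclass

@dataclass
class PageStats:
    # Hypothetical per-page scores in [0, 1], e.g. from a lightweight classifier.
    formula_density: float
    table_complexity: float

def route_page(
    stats: PageStats,
    primary: str = "pipeline",
    formula_threshold: float = 0.5,  # assumed value, for illustration only
    table_threshold: float = 0.7,    # assumed value, for illustration only
) -> str:
    """Return the backend identifier that should handle this page."""
    heavy = (
        stats.formula_density >= formula_threshold
        or stats.table_complexity >= table_threshold
    )
    # Only the pipeline backend is overridden; other primaries are kept as-is.
    if primary == "pipeline" and heavy:
        return "vlm"
    return primary

pages = [PageStats(0.1, 0.2), PageStats(0.8, 0.1), PageStats(0.0, 0.9)]
routes = [route_page(p) for p in pages]
```

In this sketch the first page stays on the primary backend, while the formula-heavy and table-heavy pages are rerouted to vlm.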

CLI / Python API
    |
    v
do_parse() / aio_do_parse()  --  cli/common.py
    |
    v
magic_model_utils.py  --  per-page backend dispatch
    |
    +---> pipeline  -->  layout -> OCR -> MFD/MFR -> table -> middle_json
    |
    +---> vlm       -->  VLM inference -> middle_json
    |
    +---> hybrid    -->  VLM layout + expert model refinement -> middle_json
    |
    +---> office    -->  python-pptx/docx/openpyxl -> middle_json
    |
    v
middle_json  -->  union_make()  -->  Markdown / content_list
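
The last step of the flow, turning the shared intermediate representation into Markdown or a content list, might be sketched as follows. The schema used here (a "pages" list of "blocks" with "type" and "text" fields) is an assumption made for the example, and render_markdown and render_content_list are hypothetical names. Neither the real middle_json layout nor the signature of union_make() is given in this document.

```python
from typing import Dict, List

def render_markdown(middle: Dict) -> str:
    """Flatten a middle_json-like dict (assumed schema) into Markdown text."""
    parts: List[str] = []
    for page in middle.get("pages", []):
        for block in page.get("blocks", []):
            text = block.get("text", "").strip()
            if not text:
                continue  # skip empty blocks
            kind = block.get("type")
            if kind == "title":
                parts.append(f"# {text}")
            elif kind == "equation":
                parts.append(f"$$\n{text}\n$$")
            else:
                parts.append(text)
    return "\n\n".join(parts)

def render_content_list(middle: Dict) -> List[Dict]:
    """Emit one flat record per block, tagged with its page index."""
    items: List[Dict] = []
    for page_idx, page in enumerate(middle.get("pages", [])):
        for block in page.get("blocks", []):
            items.append({
                "page_idx": page_idx,
                "type": block.get("type", "text"),
                "text": block.get("text", ""),
            })
    return items

example = {
    "pages": [
        {"blocks": [{"type": "title", "text": "Intro"},
                    {"type": "text", "text": "Hello"}]},
        {"blocks": [{"type": "equation", "text": "E=mc^2"}]},
    ]
}
```

Because every backend converges on the same intermediate structure, a single renderer of this kind can serve all four backends.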

Singleton Model Management

Both Pipeline and VLM backends use thread-safe singleton patterns to prevent redundant model loading:

  • Pipeline: AtomModelSingleton caches atomic models (Layout, OCR, MFR, WirelessTable, WiredTable, etc.) keyed by configuration parameters (device, thresholds, language).
  • VLM: ModelSingleton caches VECTORFEEDClient instances keyed by (backend, model_path, server_url).
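
A thread-safe singleton cache keyed by configuration can be sketched as below. KeyedSingleton and its get method are hypothetical and do not reproduce AtomModelSingleton or ModelSingleton. The sketch only shows the general pattern: one shared instance, one lock, and a loader that runs at most once per key.

```python
import threading
from typing import Any, Callable, Dict, Hashable

class KeyedSingleton:
    """Process-wide cache: one instance, one loaded object per key."""

    _instance = None
    _instance_lock = threading.Lock()

    def __new__(cls):
        # Guard instance creation so concurrent callers share one object.
        with cls._instance_lock:
            if cls._instance is None:
                inst = super().__new__(cls)
                inst._models: Dict[Hashable, Any] = {}
                inst._lock = threading.Lock()
                cls._instance = inst
        return cls._instance

    def get(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        # The lock is held while loading, so a second thread asking for the
        # same key waits for the first load instead of repeating it.
        with self._lock:
            if key not in self._models:
                self._models[key] = loader()
            return self._models[key]
```

A caller would pass a hashable tuple such as (backend, model_path, server_url) as the key. Holding a single lock during loading is the simplest correct choice, at the cost of serializing loads for different keys. A per-key lock would remove that bottleneck.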

Data Layer

The data/ module provides DataReader / DataWriter abstractions supporting local filesystem, HTTP, and S3-compatible storage. The parsing logic is fully decoupled from storage.
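
The decoupling of parsing from storage can be illustrated with a pair of small interfaces. This is a sketch under assumptions: ByteReader, ByteWriter, LocalStore, MemoryStore, and convert are hypothetical names standing in for the DataReader / DataWriter abstractions, whose actual method signatures are not shown in this document.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict

class ByteReader(ABC):
    @abstractmethod
    def read(self, path: str) -> bytes: ...

class ByteWriter(ABC):
    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

class LocalStore(ByteReader, ByteWriter):
    """Reads and writes files under a root directory on the local filesystem."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def read(self, path: str) -> bytes:
        return (self.root / path).read_bytes()

    def write(self, path: str, data: bytes) -> None:
        target = self.root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

class MemoryStore(ByteReader, ByteWriter):
    """In-memory stand-in, useful for tests or as a model for a remote store."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self.blobs[path]

    def write(self, path: str, data: bytes) -> None:
        self.blobs[path] = data

def convert(reader: ByteReader, writer: ByteWriter, src: str, dst: str) -> None:
    # Stand-in for parsing: it only depends on the two interfaces,
    # never on where the bytes actually live.
    raw = reader.read(src)
    writer.write(dst, raw.upper())
```

Since convert sees only the interfaces, swapping local storage for an HTTP or S3-compatible implementation would not require changes to the parsing code.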