VECTOR FEED

Usage

CLI

The vector-feed CLI is the primary entry point for document conversion.

Basic Usage

vector-feed -p input.pdf
vector-feed -p input.pdf -o ./output

Backend Selection

vector-feed -p input.pdf --backend pipeline
vector-feed -p input.pdf --backend vlm
vector-feed -p input.pdf --backend hybrid

Key Flags

Flag            Description
-p, --path      Input file or directory path (PDF, images, DOCX, PPTX, XLSX)
-o, --output    Output directory
--backend       Inference backend: pipeline, vlm, hybrid, office
--lang          Target language (e.g., en, zh)
--formula       Enable formula recognition (LaTeX)
--table         Enable table recognition
--api-url       Remote API server URL (client-server mode)
--vlm-model     VLM model path or Hugging Face ID
--vlm-engine    VLM inference engine: vllm-engine, lmdeploy-engine, transformers-engine

Client-Server Mode (v3.0+)

The CLI acts as an orchestration client for an API server. With --api-url, it uses the server at that URL; without it, the CLI launches a LocalAPIServer internally.

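# Local mode (default): the CLI launches a LocalAPIServer internally
vector-feed -p input.pdf

# Remote server mode: use an already running API server
vector-feed -p input.pdf --api-url http://localhost:8000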
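
When conversions are driven from a script, the flags above can be assembled into an argument list and passed to subprocess. The following is a minimal sketch, not part of VECTOR FEED itself: the helper names build_command and run are hypothetical, only flags listed under Key Flags are used, and the vector-feed executable is assumed to be on PATH.

```python
import subprocess
from typing import List, Optional


def build_command(
    path: str,
    output: Optional[str] = None,
    backend: Optional[str] = None,
    api_url: Optional[str] = None,
) -> List[str]:
    """Assemble a vector-feed argument list (hypothetical helper)."""
    cmd = ["vector-feed", "-p", path]
    if output is not None:
        cmd += ["-o", output]
    if backend is not None:
        cmd += ["--backend", backend]
    if api_url is not None:
        # With --api-url the CLI uses a remote server; when the flag is
        # omitted, the CLI launches a LocalAPIServer internally.
        cmd += ["--api-url", api_url]
    return cmd


def run(cmd: List[str]) -> int:
    """Run the command and return its exit code (assumes vector-feed is on PATH)."""
    return subprocess.run(cmd, check=False).returncode
```

For example, build_command("input.pdf", api_url="http://localhost:8000") produces the same arguments as the remote server invocation above. Passing a list rather than a shell string avoids quoting problems with paths that contain spaces.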

Python API

from vector_feed.cli.common import do_parse

# Synchronous parsing
result = do_parse(
    data=b"...",                    # Raw document bytes
    output_dir="./output",
    backend="pipeline",
    lang="en",
    formula_enabled=True,
    table_enabled=True
)

# Returns: (middle_json, markdown_content)
middle_json, markdown = result

Async API

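import asyncio
from vector_feed.cli.common import aio_do_parse

async def main():
    # await is only valid inside a coroutine
    return await aio_do_parse(
        data=b"...",
        backend="vlm",
        vlm_model="Qwen/Qwen2-VL-7B-Instruct",
        vlm_engine="vllm-engine",
    )

result = asyncio.run(main())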
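
An async entry point is mainly useful when several documents are parsed at once. The sketch below shows one possible way to bound that concurrency with a semaphore. The helper parse_many is hypothetical, and the parse coroutine is passed in as a parameter; the sketch assumes aio_do_parse accepts data and the other options as keyword arguments, as in the call above, and that it is safe to call concurrently, which the source does not confirm.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def parse_many(
    parse: Callable[..., Awaitable[Any]],
    documents: List[bytes],
    max_concurrency: int = 2,
    **options: Any,
) -> List[Any]:
    """Parse several documents, running at most max_concurrency at a time.

    Hypothetical helper: `parse` is expected to be a coroutine function
    such as aio_do_parse that takes `data=` plus keyword options.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def parse_one(data: bytes) -> Any:
        # The semaphore blocks here once max_concurrency calls are in flight.
        async with semaphore:
            return await parse(data=data, **options)

    # gather returns results in the same order as the input documents.
    return await asyncio.gather(*(parse_one(d) for d in documents))
```

A call might look like asyncio.run(parse_many(aio_do_parse, [pdf_a, pdf_b], backend="vlm")), where pdf_a and pdf_b are raw document bytes. A small max_concurrency is a cautious default when the backend holds a model in GPU memory.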

REST API

VECTOR FEED provides a FastAPI-based server (vector-feed-api):

vector-feed-api --port 8000

Endpoints:

  • POST /v1/parse — Submit document for parsing
  • GET /v1/task/{task_id} — Poll task status and retrieve results
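
The two endpoints suggest a submit-then-poll workflow. The sketch below covers only the polling half, using the standard library. The response schema is not documented here, so the JSON body, the "status" key, and the terminal values "done" and "failed" are all assumptions to be adjusted to the server's real responses; fetch_task and wait_for_task are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict


def fetch_task(base_url: str, task_id: str) -> Dict[str, Any]:
    """GET /v1/task/{task_id} and decode the body (assumed to be JSON)."""
    url = f"{base_url.rstrip('/')}/v1/task/{task_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_task(
    fetch_status: Callable[[], Dict[str, Any]],
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the task reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        # ASSUMPTION: the payload has a "status" key with these terminal values.
        if payload.get("status") in ("done", "failed"):
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval)
```

With a server started as above, usage could be wait_for_task(lambda: fetch_task("http://localhost:8000", task_id)), where task_id comes from the POST /v1/parse response. Keeping the fetch function as a parameter lets the loop be exercised without a running server.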