VECTOR FEED

Usage

CLI

The vector-feed CLI is the primary entry point for document conversion.

Basic Usage

vector-feed -p input.pdf
vector-feed -p input.pdf -o ./output

Backend Selection

vector-feed -p input.pdf --backend pipeline
vector-feed -p input.pdf --backend vlm
vector-feed -p input.pdf --backend hybrid

Key Flags

Flag	Description
`-p, --path`	Input file/directory path (PDF, images, DOCX, PPTX, XLSX)
`-o, --output`	Output directory
`--backend`	Inference backend: `pipeline`, `vlm`, `hybrid`, `office`
`--lang`	Target language (e.g., `en`, `zh`)
`--formula`	Enable formula recognition (LaTeX)
`--table`	Enable table recognition
`--api-url`	Remote API server URL (client-server mode)
`--vlm-model`	VLM model path or HuggingFace ID
`--vlm-engine`	VLM inference engine: `vllm-engine`, `lmdeploy-engine`, `transformers-engine`
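
When the CLI is driven from a script, it can help to assemble the argument list programmatically rather than by string concatenation. The sketch below uses only flags listed in the table above; `build_command` is a hypothetical helper, not part of VECTOR FEED, and it assumes the `vector-feed` executable is on `PATH`.

```python
import shlex
from typing import List, Optional

# Backend names copied from the flag table above.
BACKENDS = {"pipeline", "vlm", "hybrid", "office"}


def build_command(
    path: str,
    output: Optional[str] = None,
    backend: Optional[str] = None,
    lang: Optional[str] = None,
    api_url: Optional[str] = None,
) -> List[str]:
    """Build an argv list for the vector-feed CLI (hypothetical helper)."""
    if backend is not None and backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    cmd = ["vector-feed", "-p", path]
    # Append only the options the caller actually set.
    for flag, value in (
        ("-o", output),
        ("--backend", backend),
        ("--lang", lang),
        ("--api-url", api_url),
    ):
        if value is not None:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    cmd = build_command("input.pdf", output="./output", backend="vlm", lang="en")
    # Print a shell-safe rendering of the command for inspection.
    print(shlex.join(cmd))
    # To execute it: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` without `shell=True` avoids quoting problems with paths that contain spaces.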

Client-Server Mode (v3.0+)

The CLI acts as an orchestration client. With `--api-url`, it sends requests to the given remote API server; without it, it launches a `LocalAPIServer` internally.

# Remote server mode
vector-feed -p input.pdf --api-url http://localhost:8000
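
The remote-or-local choice described above can be pictured as a small context manager. This is only an illustrative sketch, not VECTOR FEED's actual implementation: `server_url`, `start_local`, and `stop_local` are hypothetical names, and the real `LocalAPIServer` lifecycle may differ.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Optional


@contextmanager
def server_url(
    api_url: Optional[str],
    start_local: Callable[[], str],
    stop_local: Callable[[], None],
) -> Iterator[str]:
    """Yield the base URL the client should talk to (hypothetical sketch)."""
    if api_url:
        # Remote mode: use the server the caller pointed at; nothing to manage.
        yield api_url
        return
    # Local mode: start a server, hand out its URL, and always shut it down.
    url = start_local()
    try:
        yield url
    finally:
        stop_local()
```

The `finally` clause ensures that a locally started server is stopped even if parsing raises an exception.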

Python API

from vector_feed.cli.common import do_parse

# Synchronous parsing
result = do_parse(
    data=b"...",                    # Raw document bytes
    output_dir="./output",
    backend="pipeline",
    lang="en",
    formula_enabled=True,
    table_enabled=True
)

# Returns: (middle_json, markdown_content)
middle_json, markdown = result
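
Since the example above passes raw bytes through `data=`, a thin wrapper that reads a file from disk is often convenient. The sketch below is a hypothetical helper (`parse_file` is not part of the library). It takes the parse function as an argument so it can be exercised without VECTOR FEED installed, and it assumes the callable returns the `(middle_json, markdown_content)` tuple shown above.

```python
from pathlib import Path
from typing import Any, Callable, Tuple


def parse_file(
    path: str,
    parse: Callable[..., Tuple[Any, str]],
    **options: Any,
) -> Tuple[Any, str]:
    """Read a document from disk and pass its bytes to a do_parse-style callable."""
    source = Path(path)
    if not source.is_file():
        raise FileNotFoundError(f"no such document: {source}")
    # The example above passes raw bytes via data=, so read in binary mode.
    middle_json, markdown = parse(data=source.read_bytes(), **options)
    return middle_json, markdown


# Intended usage (requires VECTOR FEED to be installed):
#   from vector_feed.cli.common import do_parse
#   middle_json, markdown = parse_file(
#       "input.pdf", do_parse, output_dir="./output", backend="pipeline"
#   )
```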

Async API

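import asyncio

from vector_feed.cli.common import aio_do_parse

async def main():
    # aio_do_parse is a coroutine, so await it from inside an event loop
    result = await aio_do_parse(
        data=b"...",                    # Raw document bytes
        backend="vlm",
        vlm_model="Qwen/Qwen2-VL-7B-Instruct",
        vlm_engine="vllm-engine",
    )
    return result

result = asyncio.run(main())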

REST API

VECTOR FEED ships a FastAPI-based server, started with the vector-feed-api command:

vector-feed-api --port 8000

Endpoints:

POST /v1/parse — Submit document for parsing
GET /v1/task/{task_id} — Poll task status and retrieve results
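
The submit-then-poll flow implied by these two endpoints can be sketched with the standard library alone. The request body for POST /v1/parse is not documented here, so the example covers only the polling half. The JSON response, the `status` field, and the `done` / `failed` values are assumptions to be checked against the server's actual schema.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict


def fetch_task(base_url: str, task_id: str) -> Dict[str, Any]:
    """GET /v1/task/{task_id} and decode the body (a JSON body is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/v1/task/{task_id}") as resp:
        return json.load(resp)


def wait_for_task(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reports a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch()
        # "status", "done" and "failed" are hypothetical; check the real schema.
        if task.get("status") in ("done", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish in time")
        sleep(interval)


# Intended usage against a running server (task_id comes from POST /v1/parse):
#   task = wait_for_task(lambda: fetch_task("http://localhost:8000", task_id))
```

Injecting `fetch` and `sleep` keeps the polling loop independent of the HTTP layer, so it can be tested with canned responses.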