Deployment
Docker
Global (HuggingFace models)
git clone https://github.com/opendatalab/VECTOR-FEED.git
cd VECTOR-FEED
docker compose --profile global up
China (ModelScope mirrors)
docker compose --profile china up
Multi-GPU
VECTOR FEED supports distributed inference across multiple GPUs. The Pipeline backend splits batch processing across available devices:
# Auto-detect available GPUs
vector-feed -p ./batch/ --backend pipeline
# Manual device specification
CUDA_VISIBLE_DEVICES=0,1,2,3 vector-feed -p ./batch/ --backend pipeline
Hardware Acceleration
| Platform | Support | Notes |
|---|---|---|
| NVIDIA CUDA | Full | Volta, Ampere, Hopper architectures |
| Apple Silicon (MPS) | Full | Native Metal Performance Shaders |
| CPU-only | Full | Slower, suitable for Office backend |
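The table above and the multi-GPU note earlier both come down to two questions: which accelerator is present, and how a batch is divided among devices. The following is a minimal sketch of both, not VECTOR FEED's actual implementation. It assumes PyTorch is the underlying runtime (the source does not say so), and `detect_platform`, `visible_gpu_ids`, and `shard_round_robin` are hypothetical helper names introduced only for illustration.

```python
import os
from typing import List, Mapping, Optional, Sequence


def detect_platform() -> str:
    """Return "cuda", "mps", or "cpu".

    ASSUMPTION: PyTorch is used for detection; falls back to "cpu"
    when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps only exists in newer PyTorch releases
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def visible_gpu_ids(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Parse CUDA_VISIBLE_DEVICES into a list of device ids.

    Returns an empty list when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def shard_round_robin(items: Sequence[str], n_shards: int) -> List[List[str]]:
    """Deal items into n_shards lists, one item at a time (hypothetical strategy)."""
    if n_shards < 1:
        raise ValueError("n_shards must be at least 1")
    return [list(items[i::n_shards]) for i in range(n_shards)]
```

For example, with `CUDA_VISIBLE_DEVICES=0,1`, `shard_round_robin(files, len(visible_gpu_ids()))` would hand every other file to each of the two devices. Round-robin is only one possible policy; the real Pipeline backend may balance work differently.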
REST API Server
Start Server
vector-feed-api --port 8000
Client Usage
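import requests
# Upload the document; the context manager closes the file handle afterwards
with open("document.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:8000/v1/parse",
        files={"file": f},
        params={"backend": "pipeline"},
    )
response.raise_for_status()  # fail early on HTTP errors
task_id = response.json()["task_id"]
# Poll for results (repeat this request until the task has finished)
result = requests.get(f"http://localhost:8000/v1/task/{task_id}")
result.raise_for_status()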
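The client snippet above issues a single status request, whereas a long-running parse usually needs repeated polling with a timeout. The sketch below shows one way to do that. The source does not document the response schema of `/v1/task/{task_id}`, so the `status` field and the terminal values `"completed"` and `"failed"` are assumptions, and `poll_until`, `fetch_task`, and `is_finished` are hypothetical helper names.

```python
import time
from typing import Any, Callable, Dict

# ASSUMPTION: the task endpoint returns JSON with a "status" field and these
# terminal values. Adjust to whatever the server actually returns.
TERMINAL_STATES = {"completed", "failed"}


def is_finished(payload: Dict[str, Any]) -> bool:
    """True once the (assumed) status field reports a terminal state."""
    return payload.get("status") in TERMINAL_STATES


def fetch_task(task_id: str, base_url: str = "http://localhost:8000") -> Dict[str, Any]:
    """Fetch the current task record from the endpoint shown in the README."""
    import requests  # imported lazily so poll_until has no third-party dependency

    resp = requests.get(f"{base_url}/v1/task/{task_id}", timeout=30)
    resp.raise_for_status()
    return resp.json()


def poll_until(
    fetch: Callable[[], Dict[str, Any]],
    is_done: Callable[[Dict[str, Any]], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() every `interval` seconds until is_done(payload) is true.

    Raises TimeoutError if the deadline passes before the task finishes.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch()
        if is_done(payload):
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task did not finish within {timeout} seconds")
        sleep(interval)
```

With a `task_id` from the upload step, usage would look like `result = poll_until(lambda: fetch_task(task_id), is_finished)`. Passing `fetch` and `is_done` as callables keeps the loop independent of the HTTP library and of the exact response format.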