
VSS Feature Store

A data management system for systematically storing and retrieving video analysis results such as captions, ASR (speech recognition) transcripts, and summaries


System Introduction

VSS Feature Store is a video analysis result management system built on Feast.

Key Features:

  • Integrated management of video analysis results by model, configuration, and generation time
  • Data reproduction at specific points in time through Point-in-Time queries
  • Entity Discovery through metadata search
  • Support for registering results from external models (GPT-4, Claude, Gemini)

System Architecture


Components

| Component | Role | Notes |
| --- | --- | --- |
| VSS Framework | Video analysis execution | VLM (captions), ASR (speech recognition), LLM (summaries) |
| Milvus DB | Vector search system | Dedicated to VSS services (RAG, Video QA) |
| Feature Store | Data version management | Point-in-Time queries; research/analysis use |
| Metadata Store | Entity Discovery | UUID/model/config/time-based search |
| Airflow | Auto-synchronization | Milvus → Feature Store (10-minute cycle) |

Data Flow

Generation

  • VSS Framework receives video input and generates captions (VLM), ASR, summaries (LLM)
  • Results are stored in Milvus DB for real-time service provision

Collection

  • Automatic: Airflow synchronizes Milvus → Feature Store every 10 minutes
  • Manual: Direct registration of external model results (GPT-4, Claude, etc.) via API

Consumption

  • Researchers/engineers query data through Feature Store API
  • Model comparison, dataset curation, experiment reproduction, etc.
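
The collection step above merges automatic Airflow syncs with manual registrations, so duplicate rows for the same video/model/config must not accumulate. The actual Airflow DAG and Feast writer are not shown in this document; the sketch below only illustrates the dedup idea, and the row fields (uuid, model, config, generated_at) follow the keys used throughout this page while the dedup rule itself is an assumption.

```python
def dedup_rows(rows):
    """Keep one row per (uuid, model, config, generated_at) key (illustrative)."""
    seen = {}
    for row in rows:
        key = (row["uuid"], row["model"], row["config"], row["generated_at"])
        seen[key] = row  # later rows with the same key overwrite earlier ones
    return list(seen.values())

rows = [
    {"uuid": "yt-video-001", "model": "gpt-4o", "config": "coarse", "generated_at": "2025-12-01T10:00:00"},
    {"uuid": "yt-video-001", "model": "gpt-4o", "config": "coarse", "generated_at": "2025-12-01T10:00:00"},
    {"uuid": "yt-video-001", "model": "gpt-4o", "config": "fine",   "generated_at": "2025-12-01T10:05:00"},
]
print(len(dedup_rows(rows)))  # the two identical rows collapse, leaving 2
```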

Why Feature Store?

Problem

Different captions are generated for the same video depending on various conditions:

  • Model: GPT-4o, Claude-3.5, Gemini, etc.
  • Configuration (Config): prompt, temperature, chunk_duration, etc.
  • Generation Time: model updates, reprocessing

Importance of Config

Even with the same model and the same video, config settings can produce completely different results!

Example: scene_detection: fine-grained vs. coarse-grained
→ the same 120-second video yields 45 segments vs. 15

Configs are simple to write, and users are free to define their own.
See Configuration for details
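
To see why configs must be tracked alongside the model and video, consider how two configs for the same model can be told apart. This is only a sketch: the field names and the hashing scheme below are illustrative assumptions, not the Feature Store's actual key format.

```python
import hashlib
import json

def config_id(config: dict) -> str:
    """Derive a stable identifier from a config dict (illustrative only)."""
    canonical = json.dumps(config, sort_keys=True)  # key order must not matter
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

fine   = {"scene_detection": "fine-grained",   "chunk_duration": 2.5, "temperature": 0.2}
coarse = {"scene_detection": "coarse-grained", "chunk_duration": 8.0, "temperature": 0.2}

# Same model, same video: the two configs still get distinct IDs, so their
# caption sets (e.g. 45 vs. 15 segments for a 120-second video) never collide.
print(config_id(fine) != config_id(coarse))  # True
```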

Solution

Through Feature Store:

  1. Prevent duplicate processing: Centrally manage high-cost computation results
  2. Consistent access: All researchers/engineers use the same data
  3. Experiment reproduction: Accurately reproduce past experiments with Point-in-Time queries
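
The point-in-time idea behind item 3 can be illustrated in plain Python. The real system does this through Feast's point-in-time queries; the record shape below is an assumption made only for the sketch.

```python
from datetime import datetime

def as_of(records, query_time):
    """Return the latest record whose event_time is at or before query_time."""
    eligible = [r for r in records if r["event_time"] <= query_time]
    return max(eligible, key=lambda r: r["event_time"]) if eligible else None

records = [
    {"event_time": datetime(2025, 11, 1), "caption": "v1 caption"},
    {"event_time": datetime(2025, 12, 1), "caption": "v2 caption (model updated)"},
]

# An experiment run in mid-November reproduces exactly the data it saw then,
# even though a newer caption exists now:
print(as_of(records, datetime(2025, 11, 15))["caption"])  # v1 caption
```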

Quick Start

Installation

```bash
# 1. Virtual Environment Setup
uv venv
source .venv/bin/activate

# 2. Install Libraries
uv pip install -e "packages/mantis_common" -e "services/svc-vss/data/feast"

# 3. Environment Variable Setup
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
export AWS_ENDPOINT_URL_S3=http://andrew-minio-2025-12-03-svc.dev.idc.k8s:9000
export FEAST_S3_ENDPOINT_URL=http://andrew-minio-2025-12-03-svc.dev.idc.k8s:9000

# 4. Run MCP Server (Verify)
python services/svc-vss/data/feast/mcp_server.py

# 5. Function Testing
python services/svc-vss/data/feast/test_feast.py
```

Basic Usage

```python
import api
from constants import FEATURE_VIEW_VIDEO_DESCRIPTION

# 0) Target Feature View
fv = FEATURE_VIEW_VIDEO_DESCRIPTION

# 1) Register New Captions (required step)
result = api.register_captions(
    feature_view=fv,
    uuid='yt-video-001',
    mp4_source='https://www.youtube.com/watch?v=wjZofJX0v4M',
    model='gpt-4o',
    segments=[
        {'index': 0, 'start': 0.0,  'end': 10.0, 'text': 'Intro segment...'},
        {'index': 1, 'start': 10.0, 'end': 20.0, 'text': 'Content segment...'},
    ],
)

# 2) Retrieve Captions (verify registration result)
captions = api.get_captions(
    feature_view=fv,
    uuid='yt-video-001',
    model='gpt-4o',
)

# 3) Search Metadata (optional)
metadata = api.search_metadata(
    feature_view=fv,
    models=['gpt-4o'],
)
```

See Getting Started for details


API Reference

Caption Retrieval

| API | Description |
| --- | --- |
| search_metadata | Search video metadata (UUID, model, date filters) |
| get_captions | Retrieve captions for a single video |
| get_captions_batch | Batch retrieval of captions for multiple videos |
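
A batch call like get_captions_batch presumably returns rows for several videos at once, and downstream code often needs them regrouped per video. A minimal sketch of that regrouping follows; the flat row shape is an assumption, modeled on the segment fields used in Basic Usage.

```python
from collections import defaultdict

def group_by_uuid(rows):
    """Group flat caption rows into {uuid: [segments sorted by start]}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["uuid"]].append(row)
    for segments in grouped.values():
        segments.sort(key=lambda s: s["start"])
    return dict(grouped)

rows = [
    {"uuid": "yt-video-002", "start": 10.0, "text": "..."},
    {"uuid": "yt-video-001", "start": 0.0,  "text": "..."},
    {"uuid": "yt-video-001", "start": 10.0, "text": "..."},
]
print(sorted(group_by_uuid(rows)))  # ['yt-video-001', 'yt-video-002']
```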

Caption Registration

| API | Description |
| --- | --- |
| register_captions | Register captions for a single video (GPT-4, Claude, etc.) |
| register_captions_batch | Batch registration of captions for multiple videos (JSON file) |
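
register_captions_batch takes a JSON file, whose exact schema is not documented on this page. A plausible payload, mirroring the register_captions keyword arguments shown in Basic Usage, might be built like this; every field name beyond that example is a hypothetical assumption.

```python
import json
import tempfile

# Hypothetical batch payload: one entry per video, echoing the
# register_captions arguments (uuid, model, segments) from Basic Usage.
batch = [
    {
        "uuid": "yt-video-001",
        "model": "gpt-4o",
        "segments": [
            {"index": 0, "start": 0.0, "end": 10.0, "text": "Intro segment..."},
        ],
    },
    {
        "uuid": "yt-video-002",
        "model": "claude-3.5",
        "segments": [
            {"index": 0, "start": 0.0, "end": 12.0, "text": "Opening scene..."},
        ],
    },
]

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(batch, f)
    path = f.name

# The resulting file path would then be handed to register_captions_batch.
print(len(json.load(open(path))))  # 2 videos in the batch file
```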

Video Management

| API | Description |
| --- | --- |
| get_video | Retrieve single video information |
| get_all_videos | Retrieve all video information |

MCP Integration

Use Feature Store with natural language through MCP (Model Context Protocol):

"Find all GPT-4 captions registered in December"
"Show captions with coarse config for video WuFL2bJm2yo"

See MCP Guide for details


Next Steps


Reference Materials

📄 Detailed Design Document

For complete design documentation of VSS Feature Store, refer to this Notion page:

📚 Feast Learning Materials

Understanding Feast, the core technology behind the Feature Store:


Support