Configuration¶
Caption generation settings management
Overview¶
With the same model and the same video, different config settings can produce completely different results.
Factors affected by Config:
- Engine/model to use
- Scene detection method (fine-grained vs coarse-grained)
- Summarization settings
- Other custom parameters
Config Flexibility
Users define config files freely.
No complex parameters are required; a config can contain only the core settings.
VSS Framework Configuration Example¶
The VSS framework supports the detailed settings listed below. You do not need to set every item; use only the ones you need.
| Config | Description |
|---|---|
| prompt | Default prompt to use for caption generation |
| caption_summarization_prompt | Prompt used when creating caption summaries from segment-by-segment caption/ASR results |
| max_tokens | Maximum token count for captions (1-1024) |
| temperature | Sampling temperature for captions (0-1) |
| top_p | Top-p sampling mass for captions (0-1) |
| top_k | Top-k candidate token count for captions (1-1000) |
| summarize_top_p | Top-p in summarization stage (0-1) |
| summarize_temperature | Temperature in summarization stage (0-1) |
| summarize_max_tokens | Maximum token count in summarization stage |
| enable_audio | Whether to enable audio stream ASR |
| enable_reasoning | Whether to enable reasoning mode |
| chunk_duration | Length of each video chunk (seconds) |
| chunk_overlap_duration | Overlap between consecutive chunks (seconds) |
| summarize_batch_size | Number of caption_summary items passed to the summarization LLM in one batch |
| segment_source | Segment generation criteria (start/end specified externally) |
[!TIP] In practice a config can be much simpler than this table suggests, as the Gemini examples under Config Examples show.
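The value ranges in the table can be checked before a config is used. The following is a minimal sketch, not part of the VSS framework: `RANGES` and `check_ranges` are hypothetical names, the bounds are copied from the table above, and the prompt text is a placeholder.

```python
# Hypothetical range check; the bounds are copied from the table above.
RANGES = {
    "max_tokens": (1, 1024),
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "top_k": (1, 1000),
    "summarize_top_p": (0.0, 1.0),
    "summarize_temperature": (0.0, 1.0),
}


def check_ranges(config: dict) -> list[str]:
    """Return one message per out-of-range value; absent keys are skipped."""
    problems = []
    for key, (low, high) in RANGES.items():
        if key not in config:
            continue  # every item is optional
        value = config[key]
        if not low <= value <= high:
            problems.append(f"{key}={value} is outside [{low}, {high}]")
    return problems


# A config that sets only a few items (placeholder prompt text).
minimal = {"prompt": "Describe the scene.", "max_tokens": 512, "temperature": 0.2}
print(check_ranges(minimal))          # []
print(check_ranges({"top_k": 5000}))  # ['top_k=5000 is outside [1, 1000]']
```

Keys without a documented range, such as summarize_max_tokens, are deliberately left unchecked.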
Config File Management¶
Storage Location¶
/gpfs/public/artifacts/feature_store/vss_feature_store/config/
├── gemini_coarse.yaml
├── gemini_fine.yaml
└── custom_config.yaml
Feature Store Integration¶
When calling register_captions, pass the absolute path of the config file as the config parameter. The system automatically copies the file to the management path (/gpfs/public/artifacts/feature_store/vss_feature_store/config), and only the filename is stored in Feature Store.
# 1. During registration: pass the absolute path (automatic copy occurs)
api.register_captions(
    uuid='...',
    model='gemini-2.5-pro',
    config='/gpfs/public/my_configs/gemini_fine.yaml',  # ← Absolute path
    ...
)
# 2. During retrieval: query with the saved filename
captions = api.get_captions(
    uuid='...',
    model='gemini-2.5-pro',
    config='gemini_fine.yaml'  # ← Filename only
)
Auto-copy Mechanism
If the management directory has no file with the same name as the one at the given absolute path, the file is copied there automatically. If a file with that name already exists, the existing file is used. This keeps all configuration files in one centrally managed location.
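The copy-if-missing behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the Feature Store's actual code: `ensure_managed_copy` is a hypothetical helper, and the demo runs in a temporary directory instead of the real management path.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_managed_copy(config_path: str, managed_dir: str) -> str:
    """Copy config_path into managed_dir unless that filename is already there.

    Returns the bare filename, which is what the text says gets stored.
    """
    source = Path(config_path)
    if not source.is_absolute():
        raise ValueError(f"expected an absolute path, got {config_path!r}")
    target = Path(managed_dir) / source.name
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # first registration: copy the file
    return source.name                # later registrations reuse the copy


# Demo in a throwaway directory (stand-in for the real management path).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "my_configs" / "gemini_fine.yaml"
    source.parent.mkdir()
    source.write_text("main_engine:\n  name: gemini\n", encoding="utf-8")
    managed = Path(tmp) / "config"
    print(ensure_managed_copy(str(source), str(managed)))  # gemini_fine.yaml
    print((managed / "gemini_fine.yaml").exists())         # True
```

In this sketch, as in the description above, a file with the same name that already exists in the management directory is reused and never overwritten, so two different configs should not share a filename.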
Config Examples¶
gemini_fine.yaml (Fine-grained)¶
Fine-grained scene detection
main_engine:
  name: gemini
scene_detection:
  type: fine-grained
  method: gemini-2.5-pro
summarization:
  llm: gemini-2.5-pro
Characteristics:
- Fine-grained scene detection
- More segments generated
- Detailed analysis
gemini_coarse.yaml (Coarse-grained)¶
Broad range scene detection
main_engine:
  name: gemini
scene_detection:
  type: coarse-grained
  method: gemini-2.5-pro
summarization:
  llm: gemini-2.5-pro
Characteristics:
- Coarse-grained scene detection
- Fewer segments
- High-level summary
gpt4_custom.yaml (Custom Configuration)¶
Custom configuration example
main_engine:
  name: openai
  model: gpt-4o
scene_detection:
  type: fine-grained
  fps: 1
processing:
  max_segments: 50
  overlap: true
output:
  format: detailed
Characteristics:
- Users define only the parameters they need
- Can be written simply or in detail
Config Comparison Example¶
When processing the same video (uuid=abc123) with the same model (gemini-2.5-pro):
| Config | Scene Detection | Segment Count | Characteristics |
|---|---|---|---|
| gemini_fine | fine-grained | 45 | Detailed analysis |
| gemini_coarse | coarse-grained | 15 | High-level summary |
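To see which settings separate two configs, the files can be compared key by key. The sketch below is an illustration only: `flatten` and `diff_configs` are hypothetical helpers, and the two dictionaries are hand-written copies of gemini_fine.yaml and gemini_coarse.yaml, not files loaded from disk.

```python
def flatten(config: dict, prefix: str = "") -> dict:
    """Turn nested keys into dotted paths, e.g. 'scene_detection.type'."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def diff_configs(left: dict, right: dict) -> dict:
    """Map each differing dotted key to its (left, right) values."""
    a, b = flatten(left), flatten(right)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


# Hand-written copies of the two example configs.
fine = {
    "main_engine": {"name": "gemini"},
    "scene_detection": {"type": "fine-grained", "method": "gemini-2.5-pro"},
    "summarization": {"llm": "gemini-2.5-pro"},
}
coarse = {
    "main_engine": {"name": "gemini"},
    "scene_detection": {"type": "coarse-grained", "method": "gemini-2.5-pro"},
    "summarization": {"llm": "gemini-2.5-pro"},
}

print(diff_configs(fine, coarse))
# {'scene_detection.type': ('fine-grained', 'coarse-grained')}
```

The two example configs differ in a single field, scene_detection.type, which is the setting behind the different segment counts in the table.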
Same Time, Different Results¶
# 2 versions registered at 2024-12-01 10:00:00
metadata = api.search_metadata(
    uuid='abc123',
    model='gemini-2.5-pro',
    time_after='2024-12-01T09:00:00',
    time_before='2024-12-01T11:00:00'
)
# Output:
# config segment_count
# gemini_fine.yaml 45
# gemini_coarse.yaml 15
Config Selection Guide¶
Fine-grained¶
Use Cases:
- Detailed behavior analysis
- Frame-by-frame change tracking
- Educational content
Advantages: High precision
Disadvantages: Many segments, long processing time
Coarse-grained¶
Use Cases:
- Entire video summary
- Quick preview
- Scene transition-focused analysis
Advantages: Fast processing, few segments
Disadvantages: Possible loss of detailed information
Config Registration and Retrieval¶
During Registration¶
# Register with Fine config
result = api.register_captions(
    feature_view='caption_summary',
    uuid='abc123',
    model='gemini-2.5-pro',
    config='gemini_fine.yaml',  # ← Specify the config
    segments=[...]
)
# Register with Coarse config (same video)
result = api.register_captions(
    feature_view='caption_summary',
    uuid='abc123',
    model='gemini-2.5-pro',
    config='gemini_coarse.yaml',  # ← Different config
    segments=[...]
)
During Retrieval¶
# Retrieve Fine version
captions_fine = api.get_captions(
    feature_view='caption_summary',
    uuid='abc123',
    model='gemini-2.5-pro',
    config='gemini_fine.yaml'
)
# Retrieve Coarse version
captions_coarse = api.get_captions(
    feature_view='caption_summary',
    uuid='abc123',
    model='gemini-2.5-pro',
    config='gemini_coarse.yaml'
)
# Compare
print(f"Fine segments: {len(captions_fine)}")      # 45
print(f"Coarse segments: {len(captions_coarse)}")  # 15
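One way to picture why both versions coexist is to treat the config filename as part of the lookup key, next to uuid and model. The toy class below is only a mental model under that assumption: `CaptionStore` is hypothetical and in-memory, it is not how Feature Store is implemented, and the segment strings are placeholders sized to match the counts in the comparison table.

```python
from pathlib import PurePosixPath


class CaptionStore:
    """Toy in-memory stand-in keyed by (uuid, model, config filename)."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, str, str], list[str]] = {}

    def register(self, uuid: str, model: str, config: str, segments: list[str]) -> None:
        # Only the filename is kept, even if a full path is passed.
        key = (uuid, model, PurePosixPath(config).name)
        self._rows[key] = list(segments)

    def get(self, uuid: str, model: str, config: str) -> list[str]:
        return self._rows[(uuid, model, PurePosixPath(config).name)]


store = CaptionStore()
# Placeholder segments; the counts follow the comparison table.
store.register("abc123", "gemini-2.5-pro", "/gpfs/public/my_configs/gemini_fine.yaml",
               [f"segment {i}" for i in range(45)])
store.register("abc123", "gemini-2.5-pro", "gemini_coarse.yaml",
               [f"segment {i}" for i in range(15)])

print(len(store.get("abc123", "gemini-2.5-pro", "gemini_fine.yaml")))    # 45
print(len(store.get("abc123", "gemini-2.5-pro", "gemini_coarse.yaml")))  # 15
```

Because the config filename is part of the key in this model, registering the coarse version does not overwrite the fine one.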
Best Practices¶
1. Clear Naming¶
2. Config Version Control¶
# Manage config files with Git
cd /gpfs/public/artifacts/feature_store/vss_feature_store/config/
git init
git add *.yaml
git commit -m "Initial configs"
3. Add Comments¶
# Gemini Fine-grained Config
# Created: 2024-12-01
# Purpose: Detailed scene-by-scene analysis
main_engine:
  name: gemini
# Fine-grained setting for detailed analysis
scene_detection:
  type: fine-grained
  method: gemini-2.5-pro
4. Config Documentation¶
Manage descriptions of each config file in a separate README:
config/
├── gemini_fine.yaml
├── gemini_coarse.yaml
└── README.md # ← Description and usage guide for each config
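If each config starts with a comment header like the one in "Add Comments", a first draft of the README can be generated from those headers. This is a sketch under that assumption: `header_comments` and `build_readme` are hypothetical helpers, and the Markdown layout they emit is only one possible choice.

```python
from pathlib import Path


def header_comments(yaml_text: str) -> list[str]:
    """Collect the leading '#' comment lines of a config, without the '#'."""
    lines = []
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#"):
            break  # the header ends at the first non-comment line
        lines.append(stripped.lstrip("#").strip())
    return lines


def build_readme(config_dir: str) -> str:
    """Return README text with one section per *.yaml file in config_dir."""
    sections = ["# Configs", ""]
    for path in sorted(Path(config_dir).glob("*.yaml")):
        sections.append(f"## {path.name}")
        header = header_comments(path.read_text(encoding="utf-8"))
        sections.extend(header or ["(no header comment)"])
        sections.append("")
    return "\n".join(sections)


sample = (
    "# Gemini Fine-grained Config\n"
    "# Created: 2024-12-01\n"
    "# Purpose: Detailed scene-by-scene analysis\n"
    "main_engine:\n"
    "  name: gemini\n"
)
print(header_comments(sample))
# ['Gemini Fine-grained Config', 'Created: 2024-12-01', 'Purpose: Detailed scene-by-scene analysis']
```

The generated text is a starting point; usage notes for each config still have to be written by hand.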
Key Points¶
Important
- Configs are freely defined by users
- A simple structure is sufficient (as few as 3-5 fields)
- The same video and the same model can still produce completely different results depending on the config
Next Steps¶
- API Reference - Config parameter usage
- Architecture - Config storage structure
- Getting Started - Config application examples