# Kling AI + fal.ai YouTube Automation Pipeline Research

> Date: 2026-03-27
> Purpose: reference for designing an automated pipeline from script → per-scene video clips → final video

---

## 1. Current Kling Model Versions and Endpoints (fal.ai, as of 2026-03)

### Model lineup (newest first)

| Model | fal.ai endpoint | Features | Price per second |
|------|------|------|------|
| **Kling 3.0 O3 Pro** (Omni) | `fal-ai/kling-video/o3/pro/text-to-video` | Multi-character consistency, reference video, native audio | ~$0.39/s |
| **Kling 3.0 V3 Pro** | `fal-ai/kling-video/v3/pro/text-to-video` | Multi-shot storyboarding, 1080p, 3-15 s, up to 6 shots per generation | ~$0.39/s (with audio) |
| **Kling 3.0 V3 Standard** | `fal-ai/kling-video/v3/standard/text-to-video` | Fast iteration, cost-efficient | ~$0.168/s |
| **Kling 2.6 Pro** | `fal-ai/kling-video/v2.6/pro/text-to-video` | 1080p, up to 3 minutes (via scene extension), Elements feature | ~$0.18/s |
| **Kling 2.5 Turbo Pro** | `fal-ai/kling-video/v2.5-turbo/pro/text-to-video` | Fast processing, strong prompt adherence | Mid-range |
| **Kling 2.1 Standard** | `fal-ai/kling-video/v2.1/standard/image-to-video` | Cost-efficient image-to-video | Low |

### 💡 Recommended combinations for a YouTube B-roll pipeline
- **Iteration/testing**: V3 Standard or 2.5 Turbo (fast and cheap)
- **Final output**: V3 Pro or 2.6 Pro (1080p, commercial use allowed)
- **When character consistency is needed**: O3 Pro + Elements

---

## 2. Full fal.ai Python API Implementation

### 2-1. Installation and authentication

```bash
pip install fal-client
export FAL_KEY="your-api-key-here"
```

### 2-2. Synchronous call (single scene)

```python
import fal_client
import os

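# fal_client reads the key from the FAL_KEY environment variable exported in 2-1.
# Fail early instead of hardcoding a key in source.
if not os.environ.get("FAL_KEY"):
    raise RuntimeError("FAL_KEY is not set")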

result = fal_client.subscribe(
    "fal-ai/kling-video/v2.6/pro/text-to-video",
    arguments={
        "prompt": "A lone astronaut walks across a red Martian desert, dust swirling around boots, slow tracking shot from left to right, cinematic wide angle, golden hour lighting, photorealistic",
        "duration": "5",               # "5" or "10" (seconds)
        "aspect_ratio": "16:9",        # "16:9", "9:16", "1:1"
        "negative_prompt": "blurry, watermark, text overlay, distortion, cartoon",
        "cfg_scale": 0.5,              # 0.0-1.0, higher = more prompt-adherent
    }
)

video_url = result["video"]["url"]
print(f"Generated video: {video_url}")
```

### 2-3. Asynchronous calls (parallel scenes, the key technique)

```python
import asyncio

import fal_client  # reads the API key from the FAL_KEY environment variable

async def generate_scene_async(scene: dict, idx: int) -> dict:
    """Generate a single scene asynchronously."""
    handler = await fal_client.submit_async(
        "fal-ai/kling-video/v2.6/pro/text-to-video",
        arguments={
            "prompt": scene["prompt"],
            "duration": str(scene.get("duration", 5)),  # "5" or "10" on this endpoint (see 2-2)
            "aspect_ratio": "16:9",
            "negative_prompt": "blurry, watermark, text, distortion, low quality",
        }
    )
    
    result = await handler.get()
    video_url = result["video"]["url"]
    
    return {
        "scene_idx": idx,
        "scene_id": scene["id"],
        "video_url": video_url,
        "duration": scene.get("duration", 5),
    }

async def generate_all_scenes_parallel(scenes: list) -> list:
    """Generate all scenes in parallel (at most MAX_CONCURRENT at a time)."""
    MAX_CONCURRENT = 5  # stay under fal.ai's concurrency limit
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    
    async def generate_with_semaphore(scene, idx):
        async with semaphore:
            try:
                return await generate_scene_async(scene, idx)
            except Exception as e:
                print(f"Scene {idx} failed: {e}")
                return {"scene_idx": idx, "scene_id": scene["id"], "error": str(e)}
    
    tasks = [generate_with_semaphore(s, i) for i, s in enumerate(scenes)]
    results = await asyncio.gather(*tasks)
    
    # Drop failed scenes (already logged above) and return the rest in index order
    return sorted([r for r in results if "error" not in r], key=lambda x: x["scene_idx"])

# Usage example
scenes = [
    {"id": "scene_01", "prompt": "Close-up of coffee beans falling in slow motion, dark background, dramatic lighting, macro lens", "duration": 5},
    # 10 rather than 7: the v2.6 endpoint used here accepts only "5" or "10"
    {"id": "scene_02", "prompt": "Aerial view of a city at dawn, fog rolling over buildings, golden light breaking through clouds, cinematic drone shot", "duration": 10},
    {"id": "scene_03", "prompt": "Person typing on a laptop in a modern cafe, shallow depth of field, warm ambient light, tracking shot", "duration": 5},
]

results = asyncio.run(generate_all_scenes_parallel(scenes))
```

### 2-4. Queue API + webhook (large batches)

```python
import fal_client

# Submit now, poll for the result later
handler = fal_client.submit(
    "fal-ai/kling-video/v2.6/pro/text-to-video",
    arguments={"prompt": "...", "duration": "5", "aspect_ratio": "16:9"},
    webhook_url="https://your-server.com/webhook/video-done"  # optional
)

request_id = handler.request_id
print(f"Job ID: {request_id}")

# Check the status later
status = fal_client.status("fal-ai/kling-video/v2.6/pro/text-to-video", request_id)
print(status)  # IN_QUEUE / IN_PROGRESS / COMPLETED

# Fetch the result once the job has completed
result = fal_client.result("fal-ai/kling-video/v2.6/pro/text-to-video", request_id)
video_url = result["video"]["url"]
```

### 2-5. Image-to-video (hybrid approach for cost optimization)

```python
import fal_client

# Generate a still image with FLUX first (much cheaper)
image_result = fal_client.subscribe(
    "fal-ai/flux/schnell",  # or fal-ai/flux-1/dev
    arguments={"prompt": "A serene mountain lake at sunrise, photorealistic, 16:9"}
)
image_url = image_result["images"][0]["url"]

# Animate the generated image with Kling
video_result = fal_client.subscribe(
    "fal-ai/kling-video/v2.6/pro/image-to-video",
    arguments={
        "image_url": image_url,
        "prompt": "Camera slowly zooms in, ripples on water surface, birds fly across, gentle wind in trees",
        "duration": "5",
        "motion_strength": 0.6,  # 0.0-1.0; confirm this parameter against the endpoint's input schema
    }
)
video_url = video_result["video"]["url"]
```

---

## 3. End-to-End Pipeline Structure (Recommended Architecture)

```
script.txt
    │
    ▼
[Phase 1] Script parsing + scene splitting
    │  - Generate an English prompt per scene with an LLM (GPT-4o / Claude)
    │  - Classify scene type: B-roll, talking-head, concept, product
    │  - Rate scene importance (decides Kling vs Flux)
    │
    ▼
[Phase 2] Video generation (parallel)
    │  ├── Key scenes → Kling v3 Standard (fast draft)
    │  └── Background/supporting scenes → Flux image (cheap) → Kling i2v (optional)
    │
    ▼
[Phase 3] Clip download + validation
    │  - Quality check (resolution, duration)
    │  - Regenerate failed scenes (fallback)
    │
    ▼
[Phase 4] FFmpeg assembly
    │  - Concat clips in order
    │  - Overlay the TTS voiceover
    │  - Add BGM (sidechained)
    │  - Burn in subtitles (SRT)
    │
    ▼
final_video.mp4
```

### File layout (recommended)

```
pipeline/
├── main.py                  # orchestrator
├── config.py                # FAL_KEY, model selection, parameters
├── phases/
│   ├── script_parser.py     # script parsing + scene extraction
│   ├── prompt_generator.py  # video prompt generation with an LLM
│   ├── video_generator.py   # fal.ai Kling API calls (async)
│   ├── media_processor.py   # FFmpeg assembly
│   └── uploader.py          # YouTube upload (optional)
├── episodes/
│   └── ep001/
│       ├── script.txt
│       ├── scenes.json       # parsed scenes + prompts
│       ├── clips/            # downloaded .mp4 clips
│       └── output.mp4        # final video
└── utils/
    ├── downloader.py         # URL → local file
    └── quality_check.py      # clip validation
```
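
Because generated video URLs expire (see section 11), `utils/downloader.py` is worth writing early. The following is a minimal sketch using only the standard library; the function name `download_clip` and the `scene_XXX.mp4` naming scheme are assumptions made for illustration, not part of any fal.ai API.

```python
import shutil
import urllib.request
from pathlib import Path


def download_clip(video_url: str, clips_dir: str, scene_idx: int) -> Path:
    """Stream one generated clip into clips_dir as scene_XXX.mp4 and return its path."""
    out_dir = Path(clips_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"scene_{scene_idx:03d}.mp4"
    tmp = target.with_suffix(".part")

    # Write under a temporary name first so an interrupted download never
    # leaves a truncated file that looks like a finished clip.
    with urllib.request.urlopen(video_url, timeout=60) as resp, open(tmp, "wb") as fh:
        shutil.copyfileobj(resp, fh)

    if tmp.stat().st_size == 0:
        tmp.unlink()
        raise ValueError(f"empty download for scene {scene_idx}")
    tmp.replace(target)
    return target
```

Calling `download_clip(result["video_url"], "episodes/ep001/clips", result["scene_idx"])` right after each generation finishes keeps the clip order recoverable from the file names.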

---

## 4. LLM-Based Automatic Prompt Generation

### 4-1. Scene-parsing prompt (Korean script → English scene list)

```python
SCENE_EXTRACTION_PROMPT = """
You are a video production expert. Convert this script into a list of visual B-roll scenes.

Script:
{script_text}

Rules:
- Each scene should be 5-10 seconds of footage
- No faces, no text overlays, no people reading scripts  
- Focus on visuals that illustrate the topic
- Output as JSON array

Output format:
[
  {{
    "scene_id": "s01",
    "script_excerpt": "Matching excerpt from the script (original language)",
    "scene_type": "b-roll|concept|environment|product",
    "importance": "high|medium|low",
    "visual_prompt": "Detailed English prompt for video generation",
    "duration": 5
  }}
]
"""
```
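
Note that the JSON schema above uses `scene_id` and `visual_prompt`, while the generation code in 2-3 reads `scene["id"]` and `scene["prompt"]`. A small adapter can bridge the two. The sketch below is one possible shape, assuming the LLM returns a bare JSON array (no markdown fence around it); `normalize_scenes` and `ALLOWED_DURATIONS` are hypothetical names, and the duration snapping reflects the "5" or "10" values documented for the v2.6 endpoint in 2-2.

```python
import json

ALLOWED_DURATIONS = (5, 10)  # clip lengths the v2.6 examples in this document accept


def normalize_scenes(llm_output: str) -> list[dict]:
    """Parse the LLM's JSON array and map it to the keys the generator code expects."""
    raw = json.loads(llm_output)
    if not isinstance(raw, list):
        raise ValueError("expected a JSON array of scenes")

    scenes = []
    for item in raw:
        prompt = str(item.get("visual_prompt", "")).strip()
        if not prompt:
            continue  # skip scenes the LLM left without a usable prompt
        requested = int(item.get("duration", 5))
        # Snap the requested length to the nearest allowed clip length.
        duration = min(ALLOWED_DURATIONS, key=lambda d: abs(d - requested))
        scenes.append({
            "id": item.get("scene_id", f"s{len(scenes) + 1:02d}"),
            "prompt": prompt,
            "duration": duration,
            "importance": item.get("importance", "medium"),
            "scene_type": item.get("scene_type", "b-roll"),
        })
    return scenes
```

The output list can be passed directly to `generate_all_scenes_parallel` from 2-3 and to `select_model_for_scene` in 6-1.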

### 4-2. Prompt-quality enhancement wrapper

```python
PROMPT_ENHANCEMENT_SYSTEM = """
You are an expert at writing prompts for Kling AI video generation.
Convert rough scene descriptions into professional video generation prompts.

Prompt structure:
[Shot type] + [Subject/Action] + [Environment] + [Camera movement] + [Lighting] + [Style]

Rules:
- Be specific about camera movements
- Include lighting conditions  
- Specify cinematic style
- Add motion descriptors
- Keep under 200 words
- No faces, no text, no watermarks
- Write in English only
"""

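def enhance_prompt(rough_description: str, llm_client) -> str:
    """Turn a rough scene description into a full video prompt.

    llm_client is a placeholder for whatever chat client is in use; the
    .chat(messages) call and .content attribute must be adapted to that SDK.
    """
    response = llm_client.chat([
        {"role": "system", "content": PROMPT_ENHANCEMENT_SYSTEM},
        {"role": "user", "content": f"Create a video prompt for: {rough_description}"}
    ])
    return response.content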
```

---

## 5. Prompt Engineering Guide (for YouTube B-roll)

### 5-1. Recommended prompt structure

```
[Shot type] of [Subject + specific details], [Environment + atmosphere],
[Camera movement], [Lighting condition], [Style keywords], [Negative elements to avoid]
```

**Example:**
```
Extreme close-up of water droplets falling onto a glass surface in slow motion, 
dark studio background with subtle gradient, camera locked static with macro lens blur, 
dramatic backlit rim lighting creating prismatic refraction, 
ultra-high definition product photography style, 4K cinematic
```

### 5-2. Prompt templates by scene type

#### People/characters (no face: body and hands only)
```
Close-up of hands typing on a mechanical keyboard, modern minimalist desk setup with soft natural light, 
camera slowly racks focus from keyboard to blurred background, warm afternoon lighting through window, 
cinematic shallow depth of field, professional lifestyle photography style
```

#### Background/environment scenes
```
Aerial establishing shot of [location] at [time of day], [weather condition], 
slow drone pull-back revealing the [feature], [lighting type] casting [shadow/glow effect], 
cinematic wide angle, IMAX quality
```

#### Concept visualization (abstract/data)
```
Abstract visualization of [concept]: glowing [color] particles forming [shape/pattern] 
in dark void, camera slowly orbits around the formation, subtle depth of field, 
sci-fi documentary style, 4K render
```

#### Product/object
```
360-degree rotating view of [product] on [surface], [background color] minimalist backdrop, 
soft diffused studio lighting with subtle highlights, camera tracks smoothly around subject, 
product photography commercial style, shallow depth of field
```

#### Nature/landscape
```
Time-lapse of [natural scene], [weather/atmospheric effect], 
camera [movement type], [color grading style] with [mood descriptor] atmosphere, 
cinematic nature documentary style, high dynamic range
```

### 5-3. Camera movement reference

| Movement | English prompt phrases | Effect |
|--------|-------------|------|
| Push in | `slow dolly push-in`, `camera creeps forward` | Focus, tension |
| Pull back | `camera pulls back revealing`, `dolly out` | Reveals the whole space |
| Horizontal pan | `slow pan left to right`, `sweeping pan` | Landscapes, wide views |
| Tracking | `tracking shot following subject`, `lateral tracking` | Emphasizes motion |
| Drone/crane rise | `crane shot rising`, `ascending aerial` | Grandeur, sense of scale |
| Orbit | `orbiting around subject`, `360-degree orbit` | Product or character introduction |
| Handheld | `handheld documentary feel`, `slight camera shake` | Realism, tension |
| Static | `static locked shot`, `fixed tripod shot` | Stability, focus |
| Rack focus | `rack focus from foreground to background` | Relationships, transitions |
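
The table and the 5-1 structure can be combined into a small prompt builder so that camera phrases stay consistent across scenes. This is only a sketch: the dictionary keys and the `build_prompt` helper are hypothetical names, and the phrases are the first entry of each table row.

```python
CAMERA_MOVES = {
    "push_in": "slow dolly push-in",
    "pull_back": "camera pulls back revealing",
    "pan": "slow pan left to right",
    "tracking": "tracking shot following subject",
    "crane_up": "crane shot rising",
    "orbit": "orbiting around subject",
    "handheld": "handheld documentary feel",
    "static": "static locked shot",
    "rack_focus": "rack focus from foreground to background",
}


def build_prompt(shot: str, subject: str, environment: str, move: str,
                 lighting: str, style: str) -> str:
    """Assemble a prompt in the 5-1 order: shot, subject, environment, camera, lighting, style."""
    if move not in CAMERA_MOVES:
        raise KeyError(f"unknown camera move: {move}")
    return (f"{shot} of {subject}, {environment}, "
            f"{CAMERA_MOVES[move]}, {lighting}, {style}")
```

Restricting each prompt to a single key from `CAMERA_MOVES` also avoids the combined-movement distortion described in section 11.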

### 5-4. Keeping the style consistent

```python
# Style suffix shared by every scene in the video
STYLE_SUFFIX = "cinematic color grading, shallow depth of field, film grain, anamorphic lens flare, professional grade footage"

# Append the same style suffix to every scene prompt
def add_style_consistency(prompt: str, style: str = STYLE_SUFFIX) -> str:
    return f"{prompt}, {style}"

# Use one negative prompt across all scenes as well
GLOBAL_NEGATIVE = "blurry, low quality, watermark, text overlay, cartoon, animation, CGI looking, artificial, overexposed, noise"
```

### 5-5. Elements to avoid (negative prompts)

```python
# Negative prompts by scene type
NEGATIVE_PROMPTS = {
    # General-purpose B-roll
    "b-roll": "blurry motion, watermark, text, subtitles, UI elements, artificial CGI look, overexposed highlights, film grain artifacts",
    # People scenes (no faces)
    "person": "faces, portrait, person looking at camera, selfie, close-up face",
    # Concept visualization
    "concept": "realistic human figures, text labels, chart overlays, low resolution render",
    # Product
    "product": "reflections of photographer, studio equipment visible, incorrect proportions, morphing shapes",
}
```

---

## 6. Cost Optimization Strategy

### 6-1. Model selection by scene importance

```python
def select_model_for_scene(scene: dict) -> str:
    """Pick a model based on the scene's importance and type."""
    
    importance = scene.get("importance", "medium")
    scene_type = scene.get("scene_type", "b-roll")
    
    # Scenes that need top quality (intro, outro, key points)
    if importance == "high":
        return "fal-ai/kling-video/v3/pro/text-to-video"
    
    # Concept visualization (mid tier)
    elif scene_type == "concept" and importance == "medium":
        return "fal-ai/kling-video/v2.6/pro/text-to-video"
    
    # Background/supporting scenes (Flux image → simple animation)
    elif importance == "low":
        return "FLUX_THEN_KEN_BURNS"  # sentinel value, handled separately (see 6-2)
    
    # Default (Standard model)
    else:
        return "fal-ai/kling-video/v3/standard/text-to-video"

def estimate_cost(scenes: list) -> float:
    """Estimate the total cost of a scene list in USD."""
    # Per-second rates from the table in section 1
    costs = {
        "fal-ai/kling-video/v3/pro/text-to-video": 0.392,
        "fal-ai/kling-video/v3/standard/text-to-video": 0.168,
        "fal-ai/kling-video/v2.6/pro/text-to-video": 0.18,
        "fal-ai/kling-video/v2.1/standard/image-to-video": 0.08,
        "FLUX_THEN_KEN_BURNS": 0.005,  # Flux image + FFmpeg effect only
    }
    
    total = 0
    for scene in scenes:
        model = select_model_for_scene(scene)
        duration = scene.get("duration", 5)
        total += costs.get(model, 0.168) * duration
    
    return total
```

### 6-2. Hybrid approach (Flux + Ken Burns effect)

```python
import subprocess
import urllib.request

def create_kenburns_clip(image_url: str, duration: int = 5, output_path: str = "output.mp4") -> str:
    """
    Low-cost option: apply a Ken Burns zoom to a Flux image with FFmpeg.
    Cost: ~$0.005 (roughly 1/30 of the Kling V3 Standard per-second rate)
    """
    # Download the image to a temporary local file
    local_image, _ = urllib.request.urlretrieve(image_url)
    
    # Ken Burns effect (slow zoom-in). zoompan defaults to 1280x720 output,
    # so set s= and fps= on the filter instead of upscaling afterwards.
    fps = 25
    vf = (
        "scale=8000:-1,"
        f"zoompan=z='min(zoom+0.0015,1.5)':d={duration * fps}"
        ":x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)'"
        f":s=1920x1080:fps={fps}"
    )
    cmd = [
        "ffmpeg", "-loop", "1", "-i", local_image,
        "-vf", vf,
        "-c:v", "libx264", "-t", str(duration), "-pix_fmt", "yuv420p",
        "-y", output_path
    ]
    subprocess.run(cmd, check=True)
    return output_path
```

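### 6-3. Scene length optimization

```python
# Scene length guidelines for cost control.
# Note: 3 s and 7 s need a model with free-form durations (V3 supports 3-15 s);
# the v2.6 endpoint in section 2 accepts only "5" or "10".
DURATION_GUIDE = {
    "intro": 5,        # keep the intro short
    "main_point": 7,   # key points run slightly longer
    "b-roll_bg": 5,    # background B-roll defaults to 5 s
    "concept": 5,      # concept visualization
    "transition": 3,   # transitions as short as possible
    "outro": 5,        # outro
}

# 5 s vs 10 s = 2x the cost
# The estimates below assume 20-25 scenes for a 7-minute video
# (25 scenes x 5 s is only about 2 minutes of generated footage)
# All 5 s + Standard model = 25 * 5 * $0.168 = $21.00
# All 10 s + Pro model = 25 * 10 * $0.39 = $97.50
# Hybrid (5 high on Pro, 10 medium on Standard, 10 low on Flux), 5 s each
#   = 5*5*$0.39 + 10*5*$0.168 + 10*5*$0.005 = ~$18.40 (about $28 if the Pro scenes run 10 s)
```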
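
The three scenarios can be checked with a few lines of arithmetic. The sketch below uses the per-second rates quoted in this document and assumes every hybrid scene is 5 s long; `RATES` and `plan_cost` are illustrative names.

```python
RATES = {"pro": 0.39, "standard": 0.168, "flux_kenburns": 0.005}  # USD per second


def plan_cost(plan: list[tuple[str, int, int]]) -> float:
    """Sum the cost over (tier, scene_count, seconds_per_scene) rows."""
    return round(sum(RATES[tier] * count * seconds for tier, count, seconds in plan), 2)


all_standard = plan_cost([("standard", 25, 5)])
all_pro_10s = plan_cost([("pro", 25, 10)])
hybrid = plan_cost([("pro", 5, 5), ("standard", 10, 5), ("flux_kenburns", 10, 5)])
print(all_standard, all_pro_10s, hybrid)  # 21.0 97.5 18.4
```

Changing the seconds column shows how sensitive the total is to clip length: running the five Pro scenes at 10 s instead of 5 s raises the hybrid figure to about $28.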

### 6-4. Retry/fallback strategy

```python
import asyncio
from typing import Optional

import fal_client

# GLOBAL_NEGATIVE is the shared negative prompt defined in 5-4

async def generate_with_fallback(scene: dict, primary_model: str, fallback_model: str) -> Optional[dict]:
    """Try the primary model, then fall back to a cheaper model on failure."""
    
    for attempt, model in enumerate([primary_model, fallback_model], 1):
        try:
            result = await fal_client.subscribe_async(
                model,
                arguments={
                    "prompt": scene["prompt"],
                    "duration": str(scene.get("duration", 5)),
                    "aspect_ratio": "16:9",
                    "negative_prompt": GLOBAL_NEGATIVE,
                }
            )
            return {"video_url": result["video"]["url"], "model_used": model, "attempt": attempt}
            
        except Exception as e:
            print(f"Attempt {attempt} failed with {model}: {e}")
            if attempt == 2:
                return None
            await asyncio.sleep(5)  # wait before trying the fallback model
    
    return None
```

---

## 7. FFmpeg Clip Assembly

```python
import subprocess
from pathlib import Path

def concat_clips(clip_paths: list, output_path: str, add_transitions: bool = True) -> str:
    """Join clips in order."""
    
    if add_transitions:
        # Crossfades need a filter graph that joins clips pairwise.
        # build_crossfade_command is a helper this document does not define.
        cmd = build_crossfade_command(clip_paths, output_path)
        subprocess.run(cmd, check=True)
        return output_path
    
    # Plain concat: write absolute paths, because the concat demuxer resolves
    # relative paths against the list file's directory.
    concat_file = Path("temp_concat.txt")
    with open(concat_file, "w") as f:
        for path in clip_paths:
            f.write(f"file '{Path(path).resolve()}'\n")
    
    cmd = [
        "ffmpeg", "-f", "concat", "-safe", "0", "-i", str(concat_file),
        "-c", "copy", "-y", output_path
    ]
    try:
        subprocess.run(cmd, check=True)
    finally:
        concat_file.unlink()  # remove the list file even if ffmpeg fails
    return output_path

def add_voiceover(video_path: str, audio_path: str, output_path: str, 
                  video_volume: float = 0.15, voice_volume: float = 1.0) -> str:
    """Mix a TTS voiceover over the video's own audio.

    This is a fixed-volume mix (background lowered to video_volume), not true
    sidechain ducking. The input video must already contain an audio stream.
    """
    cmd = [
        "ffmpeg", "-i", video_path, "-i", audio_path,
        "-filter_complex",
        f"[0:a]volume={video_volume}[bgm];[1:a]volume={voice_volume}[voice];[bgm][voice]amix=inputs=2[aout]",
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy", "-c:a", "aac", "-b:a", "192k",
        "-y", output_path
    ]
    subprocess.run(cmd, check=True)
    return output_path
```
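
`build_crossfade_command` is called above but never defined. One way to fill it in is with FFmpeg's `xfade` filter, chaining the clips pairwise. The sketch below is an assumption about how that helper could look, not the original author's implementation. Unlike the call in `concat_clips`, it also takes the clip durations (obtainable with `ffprobe`), because each `xfade` offset depends on the length of everything joined so far. It handles video only, and `xfade` expects all inputs to share resolution and frame rate.

```python
def build_crossfade_command(clip_paths: list[str], durations: list[float],
                            output_path: str, fade: float = 0.5) -> list[str]:
    """Build an ffmpeg command that chains video clips with xfade transitions."""
    if len(clip_paths) != len(durations) or len(clip_paths) < 2:
        raise ValueError("need at least two clips and one duration per clip")

    cmd = ["ffmpeg"]
    for path in clip_paths:
        cmd += ["-i", path]

    filters = []
    prev = "[0:v]"
    for i in range(1, len(clip_paths)):
        # Each join overlaps by `fade` seconds, so the i-th transition starts
        # at the summed length of clips 0..i-1 minus i overlaps.
        offset = sum(durations[:i]) - i * fade
        label = f"[v{i}]"
        filters.append(
            f"{prev}[{i}:v]xfade=transition=fade:duration={fade}:offset={offset:.3f}{label}"
        )
        prev = label

    cmd += ["-filter_complex", ";".join(filters), "-map", prev,
            "-c:v", "libx264", "-pix_fmt", "yuv420p", "-y", output_path]
    return cmd
```

For three clips of 5 s, 5 s and 10 s with a 0.5 s fade, the transitions start at 4.5 s and 9.0 s, and the output runs 19 s. Audio would need a parallel `acrossfade` chain, which is left out here.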

---

## 8. Key Reference Repositories

### 8-1. faceless-video-pipeline
**URL**: https://github.com/mmagdyelsafty/faceless-video-pipeline

**Structure**:
```
pipeline.py          # main orchestrator (4 stages)
core/
  slide_builder.py   # JSON → HTML slides
  export_slides.py   # HTML → PNG (Playwright)
tts/
  format_script.py   # script → TTS format
video/
  build_video.py     # PNGs + audio → video
```

**Pipeline flow**: Script → TTS patches → HTML Slides → PNG → Video assembly → YouTube metadata

**How it differs from our pipeline**: that repo is built on static slides, while ours is built on dynamic video generated with Kling AI.

### 8-2. n8n workflow template
**URL**: https://n8n.io/workflows/3442-fully-automated-ai-video-generation-and-multi-platform-publishing/

Combines Flux and Kling to generate short social-media videos automatically and publish them to multiple platforms.
It can also turn Hacker News / Reddit posts into video content.

### 8-3. Kling 3.0 official example code (fal.ai)

```python
import fal_client

# Kling 3.0 V3 Pro - Multi-shot storyboarding
result = fal_client.subscribe(
    "fal-ai/kling-video/v3/pro/text-to-video",
    arguments={
        "prompt": """Shot 1: Wide establishing shot of a medieval castle at dusk, golden hour light.
Shot 2: Medium shot tracking through the castle gates, torches flickering.
Shot 3: Close-up of a knight's gauntlet gripping a sword handle.""",
        "duration": "10",
        "aspect_ratio": "16:9",
    }
)

# Kling 3.0 O3 Pro - Character consistency (Elements)
result = fal_client.subscribe(
    "fal-ai/kling-video/o3/pro/text-to-video",
    arguments={
        "prompt": "A knight wearing weathered armor walks through a dark forest",
        "elements": [
            {"url": "https://your-reference-image.jpg", "label": "knight_character"}
        ],
        "duration": "5",
        "aspect_ratio": "16:9",
    }
)
```

---

## 9. Scene Extension (Building a 3-Minute Video)

```python
import fal_client

async def extend_video_sequence(initial_prompt: str, num_extensions: int = 5) -> list:
    """
    Chain clips using Kling 2.6 Pro scene extension:
    10 s clip → last frame → next 10 s clip, repeated.
    Up to 3 minutes (18 clips × 10 s).
    num_extensions is the total number of clips, including the first one.
    """
    clips = []
    last_frame_url = None
    
    for i in range(num_extensions):
        if last_frame_url is None:
            # First scene: text-to-video
            result = await fal_client.subscribe_async(
                "fal-ai/kling-video/v2.6/pro/text-to-video",
                arguments={"prompt": initial_prompt, "duration": "10", "aspect_ratio": "16:9"}
            )
        else:
            # Following scenes: image-to-video (keeps continuity)
            continuation_prompt = f"{initial_prompt}, continuing seamlessly from previous shot"
            result = await fal_client.subscribe_async(
                "fal-ai/kling-video/v2.6/pro/image-to-video",
                arguments={
                    "image_url": last_frame_url,
                    "prompt": continuation_prompt,
                    "duration": "10",
                }
            )
        
        video_url = result["video"]["url"]
        clips.append(video_url)
        
        # Extract the last frame with FFmpeg. extract_last_frame is not defined in
        # this document; it must return a URL the image-to-video endpoint can fetch.
        last_frame_url = extract_last_frame(video_url)
    
    return clips
```
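
The `extract_last_frame` helper used above is left undefined. A possible sketch follows, under these assumptions: `ffmpeg` is on the PATH and can read the clip URL directly, and the extracted frame is re-hosted with `fal_client.upload_file` so the image-to-video endpoint can fetch it. `last_frame_command` is a hypothetical helper name.

```python
import subprocess
import tempfile
from pathlib import Path


def last_frame_command(video_path: str, image_path: str) -> list[str]:
    """ffmpeg command that seeks to about 1 s before the end and keeps the final frame."""
    return [
        "ffmpeg", "-sseof", "-1", "-i", video_path,
        # -update 1 overwrites one image per decoded frame, so the file that
        # remains on disk is the last frame of the clip.
        "-update", "1", "-q:v", "2", "-y", image_path,
    ]


def extract_last_frame(video_url: str) -> str:
    """Save the clip's last frame as a JPEG and return a hosted URL for it."""
    import fal_client  # imported here so the command builder works without the SDK

    image_path = str(Path(tempfile.mkdtemp()) / "last_frame.jpg")
    subprocess.run(last_frame_command(video_url, image_path), check=True)
    return fal_client.upload_file(image_path)
```

Both steps block, so inside `extend_video_sequence` it may be preferable to call this through `asyncio.to_thread(extract_last_frame, video_url)`.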

---

## 10. Practical Cost Simulation

### Based on a 10-minute YouTube video

| Scene group | Model | Scenes | Seconds per scene | Rate (/s) | Subtotal |
|--------|------|-----|--------|---------|------|
| Intro/outro | V3 Pro | 4 | 5s | $0.39 | $7.80 |
| Core B-roll | V3 Standard | 12 | 5s | $0.168 | $10.08 |
| Supporting background | Flux → i2v Standard | 10 | 5s | $0.08 | $4.00 |
| Transitions | Flux + Ken Burns | 8 | 3s | $0.005 | $0.12 |
| **Total** | | **34 scenes** | | | **~$22** |

- TTS (ElevenLabs): ~$2
- LLM prompt generation: ~$0.50
- **Total production cost for a 10-minute video: ~$24-25**

---

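## 11. Caveats and Known Issues

### fal.ai API
- **Rate limit**: concurrent requests are capped → limit parallelism to about 5 with a semaphore
- **Generation time**: Kling Pro takes 5-30 minutes per scene (using the Pro priority queue is recommended)
- **File retention**: generated video URLs expire after some time → download immediately
- **Error types**: NSFW filter, prompt rejection, timeout → each needs its own handling

### Prompts
- Complex simultaneous camera moves (e.g. 360-degree orbit plus zoom-in) → risk of distortion
- Mixing "Golden hour" with "Studio lighting" → confuses the model
- Object appearance drifting in longer scenes (morphing) → mitigate with the Elements feature
- Optimal prompt length: 50-150 words (elements get ignored when prompts run too long)

### FFmpeg assembly
- Frame rates can differ between clips → normalize before concat
- Audio codec mismatches → force `-c:a aac -ar 44100`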
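
The two FFmpeg items above can be handled in one re-encode pass per clip before concatenation. The sketch below only builds the command; the target values (30 fps, 1920x1080) and the name `normalize_command` are assumptions, and clips that carry no audio stream at all would still need a silent track added separately.

```python
def normalize_command(src: str, dst: str, fps: int = 30,
                      width: int = 1920, height: int = 1080) -> list[str]:
    """Re-encode one clip to a fixed frame rate, size, pixel format and audio format."""
    vf = f"fps={fps},scale={width}:{height},format=yuv420p"
    return [
        "ffmpeg", "-i", src,
        "-vf", vf,
        "-c:v", "libx264",
        "-c:a", "aac", "-ar", "44100",
        "-y", dst,
    ]
```

Running `subprocess.run(normalize_command(clip, out), check=True)` on every clip means the stream-copy concat in section 7 sees identical codec parameters throughout.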