Hey folks, I’m an AI enthusiast (not a developer) who’s been experimenting with video generation tools lately — Runway, Pika, Luma, etc. One thing I’ve noticed is: each model shines in different situations (realism, anime, cinematic shots, smooth motion). But as a casual user, I never know which model is best for my prompt until I try them all, which is slow and expensive.
It got me thinking: what if there was a tool that worked like Perplexity, but for video generation?
The Core Idea
- You give a natural prompt (like “a cinematic drone shot over snowy mountains at sunrise”).
- The system rewrites/optimizes that prompt into versions that each model “understands” better.
- It then fans out the prompt to multiple video models via API.
- An evaluation layer decides which output is the best match for your request (or shows a side-by-side comparison).
Basically: a meta-layer orchestrator that sits on top of video generation models.
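To make that concrete, here is a rough Python sketch of how I imagine the orchestration layer fitting together. Everything in it is hypothetical: `rewrite_prompt_for`, `generate_runway`, `generate_pika`, and the prompt templates are placeholder names, not real SDK calls, and the evaluation step is just a stub. The point is only the shape: rewrite the prompt once per model, fan out the calls in parallel, then pick (or compare) the results.

```python
import asyncio

# Hypothetical per-model prompt templates: each model responds better to
# slightly different phrasing, so the orchestrator rewrites the user's
# prompt before dispatching it.
PROMPT_TEMPLATES = {
    "runway": "Cinematic, photoreal. {prompt}. Smooth camera motion.",
    "pika": "{prompt}, highly detailed, stylized, dynamic lighting.",
}


def rewrite_prompt_for(model: str, prompt: str) -> str:
    """Rewrite the raw user prompt into a model-specific version."""
    return PROMPT_TEMPLATES[model].format(prompt=prompt)


async def generate_runway(prompt: str) -> str:
    """Placeholder for a real Runway API call; returns a path/URL to a clip."""
    await asyncio.sleep(0)  # pretend we submitted a job and polled for the result
    return "runway_result.mp4"


async def generate_pika(prompt: str) -> str:
    """Placeholder for a real Pika API call."""
    await asyncio.sleep(0)
    return "pika_result.mp4"


GENERATORS = {"runway": generate_runway, "pika": generate_pika}


async def fan_out(user_prompt: str) -> dict[str, str]:
    """Rewrite the prompt per model and run all generations concurrently."""
    tasks = {
        name: asyncio.create_task(gen(rewrite_prompt_for(name, user_prompt)))
        for name, gen in GENERATORS.items()
    }
    return {name: await task for name, task in tasks.items()}


def pick_best(user_prompt: str, results: dict[str, str]) -> str:
    """Evaluation-layer stub: score each clip against the prompt and pick one.
    In practice this could be a frame/text similarity score or human votes."""
    return max(results, key=lambda name: 0.0)  # placeholder scoring


if __name__ == "__main__":
    prompt = "a cinematic drone shot over snowy mountains at sunrise"
    outputs = asyncio.run(fan_out(prompt))
    print("best match:", pick_best(prompt, outputs))
```

The interesting (hard) parts are obviously the two things this sketch fakes: good per-model prompt rewriting and a scoring function that actually matches human taste.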
Why It Matters
- Removes “model-picking anxiety” — you don’t need to know whether Runway or Pika is better for your style.
- Lets people see outputs across engines quickly.
- Builds a feedback loop: over time, the system learns what you (and the community) like.
What I’m Looking For
I don’t have the engineering chops to build this, but I’d love to:
- Hear whether this is technically feasible in practice.
- Learn how one might start with a small experiment (e.g. Runway + Pika only); there’s a rough sketch of one possible starting point right after this list.
- Find potential collaborators who are interested in hacking on this.
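On the small-experiment point: one low-effort place to start might be the evaluation layer, because you can prototype it on any two clips you already generated by hand. The sketch below is an assumption on my part, not a known-good recipe: it samples frames from each clip with OpenCV and scores them against the original prompt with an off-the-shelf CLIP model from Hugging Face, and the clip with the higher average frame/text similarity “wins.” The file names are stand-ins for clips you would download manually from each service.

```python
import cv2
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def sample_frames(path: str, every_n: int = 30) -> list[Image.Image]:
    """Grab every Nth frame from a video file as a PIL image."""
    cap = cv2.VideoCapture(path)
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:
            # OpenCV gives BGR arrays; CLIP expects RGB images.
            frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
        i += 1
    cap.release()
    return frames


def clip_score(prompt: str, path: str) -> float:
    """Average CLIP image-text similarity between the prompt and sampled frames."""
    frames = sample_frames(path)
    inputs = processor(text=[prompt], images=frames, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape: (num_frames, 1)
    return logits.mean().item()


prompt = "a cinematic drone shot over snowy mountains at sunrise"
# Placeholder file names: clips downloaded by hand from each service.
for clip in ["runway_result.mp4", "pika_result.mp4"]:
    print(clip, clip_score(prompt, clip))
```

Whether a simple similarity score like this actually tracks what people prefer is exactly the kind of question I’d want to test with collaborators.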
If this sounds fun to you, I’d love to chat!