I’m working on a project involving:
- Face swapping (e.g., swapping faces in images while preserving expressions)
- Celebrity face matching (similarity scoring across datasets)
- Text-driven background replacement
- Image-to-video generation (e.g., animating a single image with text prompts)
- Dual-image video generation (e.g., creating a handshake video from two portraits)

I’ve researched tools and models such as DeepFaceLab, StyleGAN3, RunwayML, and Disco Diffusion, but I’m struggling to find direct comparisons for these niche tasks. Have you come across studies, benchmarks, or repositories that evaluate models on these specific tasks?
Additionally, if you’ve built similar pipelines, which open-source tools/libraries (e.g., PyTorch3D, Transformers) did you find most effective?
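
To make the face-matching item concrete, here is a minimal sketch of what I mean by similarity scoring. It assumes each face has already been turned into a fixed-length embedding by some face-recognition model (which model is exactly what I'm asking about), and that cosine similarity is an acceptable score. The function names `cosine_similarity` and `rank_matches`, the gallery keys, and the toy 3-dimensional vectors are all made up for illustration; real embeddings would have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def rank_matches(query, gallery, top_k=3):
    """Return the top_k (name, score) pairs from the gallery, best match first."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real face embeddings (hypothetical data).
gallery = {
    "celebrity_a": [1.0, 0.0, 0.0],
    "celebrity_b": [0.0, 1.0, 0.0],
    "celebrity_c": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]

matches = rank_matches(query, gallery, top_k=2)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

If a different metric (for example Euclidean distance on normalized embeddings) works better for cross-dataset matching in your experience, I'd be interested to hear that too.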