ViT+CLIP+NeRF: Few-Shot Learning, Putting NeRF on a Diet
Is anyone interested in the computer vision field?
Our team's project goal is to implement this paper: Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis.
In this project, we will implement a 3D neural scene representation (NeRF: Neural Radiance Fields) estimated from only a few images. The approach is based on extracting semantic information with a pre-trained visual encoder such as CLIP, a Vision Transformer.
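The core idea above can be sketched as a semantic consistency loss: a view rendered from a novel pose and a ground-truth view of the same scene should have similar CLIP image embeddings. Below is a minimal NumPy sketch of that loss; the function name is our own, the cosine-distance form follows the paper's description, and in the real pipeline the embedding vectors would come from CLIP's image encoder rather than being passed in directly.

```python
import numpy as np

def semantic_consistency_loss(emb_rendered: np.ndarray, emb_target: np.ndarray) -> float:
    """Cosine distance between two image embeddings (e.g., CLIP features).

    A sketch of DietNeRF's semantic consistency loss: embeddings of a
    rendered novel view and a reference view of the same scene should
    point in the same direction, so we penalize 1 - cosine similarity.
    """
    # Normalize each embedding to unit length.
    a = emb_rendered / np.linalg.norm(emb_rendered)
    b = emb_target / np.linalg.norm(emb_target)
    # Identical directions give loss 0; orthogonal embeddings give loss 1.
    return 1.0 - float(np.dot(a, b))
```

In training, this loss would be added to NeRF's usual photometric (MSE) reconstruction loss, with gradients flowing through the differentiable renderer into the scene representation.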
1. Languages and Skills
Languages/Frameworks : Python, PyTorch, JAX
Other Skills : Git, GitHub, cloud computing experience
2. Model / Baseline Code
ViT, CLIP : https://github.com/openai/CLIP
Meta-learning NeRF (Learned Initialization paper; JAX-based NeRF code) : https://github.com/tancik/learnit
NeRF (PyTorch-based NeRF code) : https://github.com/yenchenlin/nerf-pytorch
@valhalla Thank you for the fast response. Could I reply after checking the paper in more detail? For now, I expect that both CLIP and NeRF are available as JAX code, but connecting CLIP with NeRF and training the unified model on new data will be a challenge for our team. We will also work out the implementation details from the paper.