Hello, I was recently tasked with pipelining an object detection transformer (DETR). Being new to implementing deep learning models, I have relied heavily on tutorials and examples to get my work done. However, it has been quite a challenge to find examples of model pipeline parallelism, specifically for image-based models, which has left me in a bind. I have been stuck on this problem for weeks. Any kind of direction (even tutorials!) would help my case.
I've never dealt with DETR itself, but I just did a search. Surprisingly little information…
I think the general parallelization/acceleration techniques for each library should work for the parts that use torch and transformers, but it might be faster to mention someone who is an expert in this area. (@+hf_username)
Explanation of End-to-End Object Detection with Transformers (DETR) #object-detection - Qiita (in Japanese)
DETR (End-to-End Object Detection with Transformers) object detection #Windows - Qiita (in Japanese)
Thank you for your reply. I have already explored most of what you mentioned. What I am after is model pipelining, which doesn't seem possible with automatic methods (torch.export). The error messages generally ask me to resort to manual splitting or to resolving graph breaks (which seems quite complex).
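For context, the automatic path I tried looks roughly like this (only a sketch, assuming torch.distributed.pipelining and a dummy batch for tracing; the exact split point is just illustrative):

import torch
from transformers import DetrForObjectDetection
from torch.distributed.pipelining import pipeline, SplitPoint

detr = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")
detr.eval()

# A dummy micro-batch of preprocessed images, used only for tracing
example_pixel_values = torch.randn(2, 3, 800, 800)

# Ask the tracer (torch.export under the hood) to split the full model
# right before the transformer encoder; this tracing step is where the
# graph-break errors show up for me.
pipe = pipeline(
    detr,
    mb_args=(),
    mb_kwargs={"pixel_values": example_pixel_values},
    split_spec={"model.encoder": SplitPoint.BEGINNING},
)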
Some of my code for splitting the model (in case I am missing something):
import torch
import torch.nn as nn
from transformers import DetrForObjectDetection
from torch.distributed.pipelining import pipeline, ScheduleGPipe, SplitPoint

detr = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")
detr.eval()

class Stage1(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = detr.model.backbone

    def forward(self, x):
        # Process the input through the backbone and return features
        features = self.backbone(x)
        return features

class Stage2(nn.Module):
    def __init__(self):
        super().__init__()
        self.input_projection = detr.model.input_projection
        self.encoder = detr.model.encoder
        self.decoder = detr.model.decoder
        self.class_labels_classifier = detr.class_labels_classifier
        self.bbox_predictor = detr.bbox_predictor

    def forward(self, features):
        # Process features through the projection, transformer, and prediction heads
        # NOTE: the real DetrModel forward also handles pixel masks, feature
        # flattening, position embeddings, and learned object queries, which this
        # simplified wiring leaves out.
        input_projection_output = self.input_projection(features)
        encoder_output = self.encoder(input_projection_output)
        decoder_output = self.decoder(encoder_output)
        # Both prediction heads take the decoder hidden states
        logits = self.class_labels_classifier(decoder_output)
        pred_boxes = self.bbox_predictor(decoder_output)
        return logits, pred_boxes

# Creating instances of the classes
stage_1 = Stage1()
stage_2 = Stage2()

# Wrapping stages into a sequential model for pipeline execution
stages = torch.nn.Sequential(stage_1, stage_2)

# Each rank builds only the stage it owns (rank 0 -> Stage1, rank 1 -> Stage2)
stage_index = args.rank

# Create the pipeline, splitting before submodule "1" (i.e. Stage2)
pipe = pipeline(
    stages,
    mb_args=(mb_inputs,),
    split_spec={"1": SplitPoint.BEGINNING},
)

# Build the stage owned by this rank and create a GPipe schedule for execution
stage = pipe.build_stage(stage_index, device=torch.device(f"cuda:{stage_index}"))
schedule = ScheduleGPipe(stage, args.chunks)
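From the docs, the manual splitting that the error messages point to seems to mean skipping the tracer entirely and wrapping each hand-written stage in a PipelineStage. Below is only a rough sketch of my understanding (assuming two ranks with one GPU each, and that Stage1/Stage2, mb_inputs and args are defined as above), not something I have fully working:

import torch
import torch.distributed as dist
from torch.distributed.pipelining import PipelineStage, ScheduleGPipe

dist.init_process_group(backend="nccl")
rank = dist.get_rank()
world_size = dist.get_world_size()   # expected to be 2 here
device = torch.device(f"cuda:{rank}")

# Build only the module this rank owns and move it to the local GPU
stage_module = Stage1() if rank == 0 else Stage2()
stage_module.to(device)

# Manual stage construction: no tracing / torch.export involved
# (depending on the torch version, PipelineStage may also want example
# inputs for shape inference)
stage = PipelineStage(
    stage_module,
    stage_index=rank,
    num_stages=world_size,
    device=device,
)

schedule = ScheduleGPipe(stage, n_microbatches=args.chunks)

with torch.no_grad():
    if rank == 0:
        schedule.step(mb_inputs.to(device))   # first stage feeds the inputs
    else:
        outputs = schedule.step()             # last stage returns the outputs

The appeal of this route is that graph breaks stop mattering because nothing is traced; the catch is that the hand-written stage forwards then have to reproduce all of DetrModel's own glue (pixel masks, position embeddings, object queries) correctly, which is the part I'm unsure about.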