Keypoint Detection Accuracy is Very Low

Unfortunately, I can't say much about my dataset, but I am trying to predict hundreds of keypoints/landmarks on a given image or video feed.

I’m having great difficulty with my model architecture: I have not been able to get an accuracy greater than 20%.
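The post doesn't say how the accuracy figure is computed. If it comes from Keras's built-in `accuracy` metric, that metric is designed for classification and may not say much about coordinate regression. A minimal sketch of a distance-based alternative, assuming keypoints are stored as (x, y) pixel pairs; the function names, the sample points, and the 10-pixel threshold are all hypothetical:

```python
import math

def keypoint_errors(pred, true):
    """Euclidean distance in pixels between each predicted and true (x, y) pair."""
    return [math.dist(p, t) for p, t in zip(pred, true)]

def fraction_within(pred, true, threshold):
    """Fraction of keypoints whose error is at most `threshold` pixels."""
    errors = keypoint_errors(pred, true)
    return sum(e <= threshold for e in errors) / len(errors)

# Hypothetical example: three keypoints on a 512x512 image.
true = [(100.0, 100.0), (200.0, 50.0), (300.0, 400.0)]
pred = [(103.0, 104.0), (200.0, 50.0), (330.0, 440.0)]

errors = keypoint_errors(pred, true)
mean_error = sum(errors) / len(errors)          # average miss distance in pixels
hit_rate = fraction_within(pred, true, 10.0)    # share of points within 10 px
```

This is plain Python for readability; a vectorized version in NumPy or TensorFlow would be the usual choice inside a training loop.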

My model is based on similar ones I found on GitHub and in a few academic papers; however, they were all predicting dozens of points versus my hundreds. I only saw two model architectures across these sources:

import tensorflow as tf
from tensorflow.keras import layers

# First architecture
model = tf.keras.models.Sequential([
	layers.Conv2D(32, (3,3), padding='same', input_shape=(512,512,1)),
	layers.LeakyReLU(),
	layers.MaxPool2D((2,2)),
	layers.Conv2D(64, (3,3), padding='same'),
	layers.LeakyReLU(),
	layers.MaxPool2D((2,2)),
	layers.Flatten(),
	layers.BatchNormalization(),
	layers.Dense(128),
	layers.ReLU(),
	layers.Dropout(0.5),
	layers.Dense(64),
	layers.ReLU(),
	layers.Dropout(0.5),
	layers.Dense(501)
])

# Second architecture
model = tf.keras.models.Sequential([
	layers.Conv2D(32, (5,5), input_shape=(512,512,1), strides=1),
	layers.Conv2D(32, (3,3), strides=1),
	layers.MaxPool2D((2,2), padding="valid"),
	layers.BatchNormalization(),
	layers.Dropout(0.2),
	layers.Conv2D(64, (5,5), strides=2),
	layers.Conv2D(64, (5,5), strides=2),
	layers.AveragePooling2D((2,2), padding="valid"),
	layers.Flatten(),
	layers.Dense(128),
	layers.ReLU(),
	layers.Dropout(0.5),
	layers.Dense(501),
	layers.Softmax()
])
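
As a rough sanity check on the first architecture, the parameter count can be worked out by hand. This is only a sketch under the standard Keras conventions (one bias per filter or unit, four values per feature for BatchNormalization); the helper names are hypothetical:

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return in_features * units + units

# padding='same' keeps the 512x512 size; each 2x2 max-pool halves height and width.
side = 512 // 2 // 2               # 128
flat_features = side * side * 64   # size of the Flatten output

params = {
    "conv_32": conv2d_params(3, 3, 1, 32),
    "conv_64": conv2d_params(3, 3, 32, 64),
    # gamma, beta, moving mean, moving variance for every flattened feature
    "batch_norm": 4 * flat_features,
    "dense_128": dense_params(flat_features, 128),
    "dense_64": dense_params(128, 64),
    "dense_501": dense_params(64, 501),
}
total = sum(params.values())

for name, count in params.items():
    print(f"{name:>10}: {count:>12,}")
print(f"{'total':>10}: {total:>12,}")
```

Under these assumptions, almost all of the weights sit in the first Dense layer after Flatten, which may be worth keeping in mind given the dataset size mentioned below.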

My dataset is quite large, roughly 15K samples including augmented data. I'm just looking for feedback.
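
Since the count includes augmented data, one thing that may be worth double-checking is that any geometric augmentation was applied to the labels as well as the pixels. A minimal sketch for a horizontal flip, assuming keypoints are stored as (x, y) pixel indices and the image is a list of rows; the function name and sample data are hypothetical:

```python
def hflip_with_keypoints(image, keypoints):
    """Mirror an image (list of pixel rows) left to right and move the keypoints with it."""
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # A pixel at column x lands at column (width - 1 - x); y is unchanged.
    flipped_keypoints = [(width - 1 - x, y) for x, y in keypoints]
    return flipped_image, flipped_keypoints

# Hypothetical 2x4 image with one marked pixel and a keypoint on it.
image = [
    [0, 0, 9, 0],
    [0, 0, 0, 0],
]
keypoints = [(2, 0)]

flipped_image, flipped_keypoints = hflip_with_keypoints(image, keypoints)
```

If the landmarks have left/right meaning (for example, a left and a right eye), the label order may also need swapping after a flip; whether that applies here depends on the data, which the post doesn't describe.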

I’ve put this in the research section because I noticed that there are very few models out there for keypoint detection; object detection seems to be much more popular.