Hello everyone!

I wanted to ask for a reference, guide, or teaching material that shows how to use a geographical distance matrix in a neural network. Roughly, the idea is this: I have *n* categories, each with multiple data points. For every pair of categories I compute a distance, which gives an *n*×*n* matrix whose diagonal entries are 0 (distance to itself) and whose off-diagonal entry (*i*, *j*) is the distance between category *i* and category *j*. I want to use this matrix to make the model attend more to (or introduce autocorrelation between) categories that are geographically close to one another, i.e., if two categories *i* and *j* have a small distance between them, the model should be biased toward giving them more similar parameters.
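To make the idea concrete, here is a minimal NumPy sketch of one way it could look. Everything in it is an assumption for illustration: each category is taken to have a single latitude/longitude, distances are great-circle (haversine) distances, closeness is turned into a weight with a Gaussian kernel, and the "similar parameters" bias is written as a graph-Laplacian-style penalty on a per-category parameter vector. The function names, the coordinates, and the `length_scale_km` value are all made up.

```python
import numpy as np

def haversine_matrix(lat_deg, lon_deg, radius_km=6371.0):
    """Pairwise great-circle distances (km) between n points, shape (n, n)."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against tiny floating-point overshoots outside [0, 1].
    return 2 * radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def similarity_weights(dist, length_scale_km):
    """Gaussian kernel: small distance -> weight near 1, large -> near 0."""
    w = np.exp(-(dist / length_scale_km) ** 2)
    np.fill_diagonal(w, 0.0)  # no self-edges
    return w

def smoothness_penalty(params, weights):
    """0.5 * sum_ij W_ij * ||p_i - p_j||^2 for per-category vectors p_i."""
    diff = params[:, None, :] - params[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return 0.5 * (weights * sq_dist).sum()

# Hypothetical example: three categories, the first two close together.
lat = [52.5, 52.4, 40.4]
lon = [13.4, 13.1, -3.7]
dist = haversine_matrix(lat, lon)
weights = similarity_weights(dist, length_scale_km=500.0)

rng = np.random.default_rng(0)
params = rng.normal(size=(3, 4))  # stand-in for learned per-category parameters
penalty = smoothness_penalty(params, weights)
```

In a real model, `params` would be something like a learned embedding per category, and the penalty (times some coefficient) would be added to the training loss so that nearby categories are pulled toward similar values.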

I kept the example a bit vague on purpose, as the application is very field-specific (computational historical linguistics) and I expect people would get more confused if I described the actual problem.

Maybe there is an easier way to use geographical information than a distance matrix; I would be happy to hear your takes on it. I have seen people suggest computing the distance from each data point to some reference point (e.g., the closest hospital or school when predicting housing prices), but distance to an arbitrary independent point would be meaningless for my problem.

Thanks a lot in advance!