How to compute Pairwise L1 Distance matrix on very large images in neighborhood only?

Multi tool use
How to compute Pairwise L1 Distance matrix on very large images in neighborhood only?
I am working on Deep learning approach for my project. And I need to calculate Distance Matrix on 4D Tensor which will be of size N x 128 x 64 x 64 (Batch Size x Channels x Height x Width). The distance matrix for this type of tensor will of size N x 128 x 4096 x 4096 and it will be impossible to fit this type of tensor in GPU, even on CPU it will require lot of memory. So, I would like to calculate the distance matrix only in some neighborhood pixels (say within radius of 5) and consider this rectangular matrix for further processing in neural network. With this approach my distance matrix will be of size N x 128 x 4096 x 61. It will take less memory in comparison to full distance matrix.
Precisely, I am trying to implement the Convolution Random Walk Networks for Semantic Segmentation. This network needs to calculate the Pairwise L1 Distance for features.
Architecture
Just to add this type of Distance Matrix is usually calculated for image segmentation via spectral clustering.
For Example
X = [[a,b],[c,d]]
L1_dist = [ [0, |a-b|, |a-c|, 0],
[|a-b|, 0, 0, |b-d|],
[|a-c|, 0, 0, |c-d| ],
[0, |b-d|, |c-d|, 0 ]
]
Final_L1_dist = [ [0, |a-b|, |a-c|], // "a" is near to b and c. Including self element i.e. a
[|a-b|, 0, |b-d|], // "b" is near to a and d.
[|a-c|, 0, |c-d| ], // "c" is near to a and d.
[|b-d|, |c-d|, 0 ] // "d" is near to b and c.
]
I would appreciate, if some one can help me to find an efficient way to compute such a matrix.
Thanks
why your output has channel dim of 128? doesn't the L1 distance sum the abs diff along channel to get the L1 norm of
|a-b|
?– Shai
Jul 1 at 6:55
|a-b|
@Shai, in the architecture I am trying to implement, pairwise distance L1 is calculated per channel wise. thus output has same number of input channels.
– anil2185
Jul 1 at 7:06
1 Answer
1
As far as I understand, the goal is to apply minus operation to each pixel and its surrounding neighbors. This sounds like convolution to me.
Consider the following convolution process (assume padding='SAME'
):
padding='SAME'
The 3x3 kernel calculates, for each pixel, the difference between the center pixel and its left one. For other neighbors, consider the following kernels:
Thus the goal can be achieved via the following:
num_channels
tf.tile
tf.nn.depthwise_conv2d
tf.abs
NxCx(HW)x1
For efficient for
loop, you may consider using tf.map_fn
.
for
tf.map_fn
Thanks for the response. But I am unable to follow your approach. it would be helpful if you can explain more.
– anil2185
Jun 29 at 22:01
Sorry for the confusion. I think I misunderstood your question. May I ask why there is no
|b-c|
or |a-d|
in your example? Is it because their distance is sqrt(2)
instead of 1
?– Richard_wth
Jun 30 at 10:46
|b-c|
|a-d|
sqrt(2)
1
Specifically, I marked them zero. I would like to find difference only between nearest pixels. In the example I am considering only Horizontal & Vertical pixels (4-Neighbours) as nearest pixels. thus I consider them as zero in final matrix.
– anil2185
Jun 30 at 11:33
I have modified my answer. Hope it is clear now.
– Richard_wth
Jul 2 at 7:04
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Hello and welcome to Stack Overflow! When you ask questions here, it's really helpful (and would probably get you an answer faster) if you provide a minimal, complete and verifiable, example (MCVE): a starting point for the people here that want to help you. This way we can just copy/paste the code and go straight into figuring out how to optimize the code, without having to write one from scratch.
– Andreas Storvik Strauman
Jun 28 at 22:21