Information-theoretically, compression is a measure of informational distance. If text A and text B, concatenated together, compress better than text A and text C, it means A and B share more repeated patterns and are closer. All we need are some distance functions. In a way you can think of it like Levenshtein distance, but one that can take into account inputs of very different sizes, repetitions, changes in order, big inserts, etc.
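To make that concrete, here is a rough sketch of one such distance function, the normalized compression distance (NCD), using Python's zlib as the compressor. The function names and sample texts are made up for illustration, and zlib is just an assumption; any real compressor would do, with different results.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (max compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between x and y.

    Near 0 when the two inputs share most of their patterns,
    near 1 when concatenating them barely helps compression.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical sample texts: a and b are near-duplicates, c is unrelated.
    a = b"the quick brown fox jumps over the lazy dog. " * 10
    b = b"the quick brown fox leaps over the lazy cat. " * 10
    c = b"compression approximates kolmogorov complexity in practice, " * 10

    print("ncd(a, b) =", ncd(a, b))
    print("ncd(a, c) =", ncd(a, c))
```

With these inputs A+B should score as closer than A+C. Note that zlib only looks back 32 KB, so for large inputs a compressor with a bigger window would probably be a better fit.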
Reminds me of a mostly-joke Ruby project I did a decade ago: https://github.com/oripekelman/simple_similarity