Similarity/Proximity Measures between Nodes

François Fouss; Marco Saerens; Masashi Shimbo

doi:10.1017/CBO9781316418321.003

2 - Similarity/Proximity Measures between Nodes

Published online by Cambridge University Press: 05 July 2016

François Fouss: Affiliation:
Université Catholique de Louvain, Belgium
Marco Saerens: Affiliation:
Université Catholique de Louvain, Belgium
Masashi Shimbo: Affiliation:
Nara Institute of Science and Technology, Japan

Summary

Introduction

This chapter is concerned with similarity, and its dual, dissimilarity, between nodes of a graph. The need to quantify the similarity between objects arises in many situations, not only in network analysis. Indeed, similarity has been an important and widely used concept in many fields of research for years.

The concept of similarity has its origins in, among other fields, psychology, in the work of Gustav Fechner in the 1860s. It has evolved over the years, and many similarity measures have been proposed in various fields, among them feature contrast models [778], mutual information [384], cosine coefficients [289], and information content [666] (see [212] for a survey). The core idea behind a similarity measure is to exploit relevant information to determine the extent to which two objects are similar in some sense [212, 688, 761]. The simple intuitions behind the concept of similarity are summarized by Lin [535]:

▸ The similarity between two objects is related to their commonality. The more commonality they share, the more similar they are.
▸ Symmetrically, the similarity between two objects is related to the differences between them. The more differences they have, the less similar they are.
▸ The maximum similarity between two objects is reached when the two objects are identical, no matter how much commonality they share.

Notice, however, that some popular similarity measures do not satisfy all of these intuitions. For instance, inner product similarity does not meet the third one unless it is normalized (in which case it is equivalent to cosine similarity).
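The point about the inner product can be checked on a small numeric case. The sketch below is only an illustration: the vectors `x` and `y` and the helper names `inner_product` and `cosine` are assumptions made for this example, not taken from the chapter. It shows that, without normalization, a vector can score higher against a longer vector than against itself, whereas the normalized version attains its maximum of 1 for identical vectors.

```python
import math

def inner_product(x, y):
    """Plain (unnormalized) inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def cosine(x, y):
    """Inner product normalized by the two vector lengths (cosine similarity)."""
    return inner_product(x, y) / math.sqrt(inner_product(x, x) * inner_product(y, y))

# Illustrative vectors (assumed for this example).
x = [1.0, 2.0]
y = [3.0, 6.0]  # same direction as x, but three times longer

# Unnormalized: x scores higher against y than against itself,
# so identity does not give the maximum similarity.
self_score = inner_product(x, x)   # 1*1 + 2*2 = 5
other_score = inner_product(x, y)  # 1*3 + 2*6 = 15

# Normalized: the similarity of x with itself is the upper bound 1.
cos_self = cosine(x, x)
cos_other = cosine(x, y)

print(self_score, other_score)
print(cos_self, cos_other)
```

Note that cosine similarity also assigns the maximal value to `x` and `y` here, because it ignores vector length and depends only on direction.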

To measure the similarity between nodes of a graph, two complementary sources of information can be used:

▸ the features (or attributes) of the nodes, or
▸ the structure of the graph.

The former refers to the fact that two nodes of the graph are considered to be similar if they share many common features, while the latter refers to the fact that two nodes of the graph are considered to be similar if they are “structurally close” in some sense in the network. Both kinds of information can be combined, of course.
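The two sources of information can be contrasted on a toy example. The sketch below is an assumption-laden illustration, not the chapter's method: the node names, the feature sets, the adjacency sets, the use of the Jaccard coefficient for both views, and the convex combination with weight `alpha` are all hypothetical choices made for this example.

```python
def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical node attributes (feature view).
features = {
    "A": {"python", "graphs"},
    "B": {"python", "graphs", "statistics"},
    "C": {"cooking"},
}

# Hypothetical undirected graph stored as adjacency sets (structural view).
neighbors = {
    "A": {"D", "E"},
    "B": {"F"},
    "C": {"D", "E"},
    "D": {"A", "C"},
    "E": {"A", "C"},
    "F": {"B"},
}

def feature_similarity(i, j):
    """Similar if the two nodes share many features."""
    return jaccard(features[i], features[j])

def structural_similarity(i, j):
    """Similar if the two nodes share many neighbors (one simple notion of 'structurally close')."""
    return jaccard(neighbors[i], neighbors[j])

def combined_similarity(i, j, alpha=0.5):
    """Assumed combination rule: convex mix of the two views with weight alpha."""
    return alpha * feature_similarity(i, j) + (1 - alpha) * structural_similarity(i, j)

for pair in [("A", "B"), ("A", "C")]:
    print(pair, feature_similarity(*pair), structural_similarity(*pair), combined_similarity(*pair))
```

In this toy graph, A and B are close in terms of features but share no neighbors, while A and C share all their neighbors but no features, so the two views rank the pairs in opposite order.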

Type: Chapter
Information: Algorithms and Models for Network Data and Link Analysis, pp. 59-101

DOI: https://doi.org/10.1017/CBO9781316418321.003

Publisher: Cambridge University Press

Print publication year: 2016
