Generalization error bounds for graph embedding using negative sampling: linear vs hyperbolic

Suzuki, Atsushi, Nitanda, Atsushi, Wang, Jing, Xu, Linchuan, Yamanishi, Kenji and Cavazza, Marc (2021) Generalization error bounds for graph embedding using negative sampling: linear vs hyperbolic. In: Advances in Neural Information Processing Systems (NeurIPS 2021). Curran Associates Inc. - Neural Information Processing Systems Foundation Inc. (NeurIPS) - ACM, New York, US, 1243-1255. ISBN 978-1713845393

PDF (Conference paper): 48652 WANG_Generalization_Error_Bounds_For_Graph_Embedding_Using_Negative_Sampling_Linear_Vs_Hyperbolic_(AAM)_2021.pdf - Accepted Version (389kB)

Abstract

Graph embedding, which represents real-world entities in a mathematical space, has enabled numerous applications such as analyzing natural languages, social networks, biochemical networks, and knowledge bases. It has been experimentally shown that graph embedding in hyperbolic space can represent hierarchical tree-like data more effectively than embedding in linear space, owing to hyperbolic space's exponential growth property. However, since the theoretical comparison has been limited to ideal noiseless settings, the potential for hyperbolic space's growth property to worsen the generalization error on practical data has not been analyzed. In this paper, we provide a generalization error bound applicable to graph embedding in both linear and hyperbolic spaces under the various negative sampling settings that appear in graph embedding. Our bound states that the error is polynomial in the embedding space's radius in linear space and exponential in the radius in hyperbolic space, which implies that hyperbolic space's exponential growth property worsens the error. Using our bound, we clarify the data size condition under which graph embedding in hyperbolic space can represent a tree better than in Euclidean space, by discussing the bias-variance trade-off. Our bound also shows that imbalanced data distribution, which often appears in graph embedding, can worsen the error.
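The exponential growth property the abstract refers to can be seen directly in the hyperbolic distance function. As an illustration only (the paper's own model and notation may differ), the following sketch computes geodesic distance in the Poincaré ball, a common model for hyperbolic graph embedding: points separated by the same Euclidean gap are hyperbolically much farther apart near the boundary of the ball than near the origin, so the representable "volume" grows exponentially with radius.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points inside the open unit Poincare ball.

    d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    sq_norm = lambda x: sum(c * c for c in x)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)

# Same Euclidean gap (0.05), very different hyperbolic distances:
near_origin = poincare_distance([0.0, 0.0], [0.05, 0.0])
near_boundary = poincare_distance([0.90, 0.0], [0.95, 0.0])
print(near_origin, near_boundary)  # the near-boundary distance is far larger
```

This asymmetry is why tree-like hierarchies embed with low distortion in hyperbolic space, and also why, per the paper's bound, the generalization error can grow exponentially with the embedding radius there.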

Item Type: Conference Proceedings
Title of Proceedings: Advances in Neural Information Processing Systems (NeurIPS 2021)
Additional Information: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), Virtual-only Conference, 6th - 14th Dec, 2021.
Uncontrolled Keywords: hyperbolic, representation learning, machine learning
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Faculty / School / Research Centre / Research Group: Faculty of Engineering & Science
Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS)
Last Modified: 19 Nov 2024 16:53
URI: http://gala.gre.ac.uk/id/eprint/48652

