dc.creator: Granata, D
dc.creator: Carnevale, V
dc.date.accessioned: 2021-01-26T20:17:13Z
dc.date.available: 2021-01-26T20:17:13Z
dc.date.issued: 2016-08-11
dc.identifier.issn: 2045-2322
dc.identifier.doi: http://dx.doi.org/10.34944/dspace/5017
dc.identifier.other: 27510265 (pubmed)
dc.identifier.uri: http://hdl.handle.net/20.500.12613/5035
dc.description.abstract: © The Author(s) 2016. The collective behavior of a large number of degrees of freedom can often be described by a handful of variables. This observation justifies the use of dimensionality reduction approaches to model complex systems and motivates the search for a small set of relevant "collective" variables. Here, we analyze this issue by focusing on the optimal number of variables needed to capture the salient features of a generic dataset and develop a novel estimator for the intrinsic dimension (ID). By approximating geodesics with minimum distance paths on a graph, we analyze the distribution of pairwise distances around the maximum and exploit its dependency on the dimensionality to obtain an ID estimate. We show that the estimator does not depend on the shape of the intrinsic manifold and is highly accurate, even for exceedingly small sample sizes. We apply the method to several relevant datasets from image recognition databases and protein multiple sequence alignments and discuss possible interpretations for the estimated dimension in light of the correlations among input variables and of the information content of the dataset.
dc.format.extent: 31377-
dc.language.iso: en
dc.relation.haspart: Scientific Reports
dc.relation.isreferencedby: Springer Science and Business Media LLC
dc.rights: CC BY
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Algorithms
dc.subject: Databases, Protein
dc.subject: Pattern Recognition, Automated
dc.subject: Proteins
dc.subject: Sequence Alignment
dc.title: Accurate Estimation of the Intrinsic Dimension Using Graph Distances: Unraveling the Geometric Complexity of Datasets
dc.type: Article
dc.type.genre: Journal Article
dc.relation.doi: 10.1038/srep31377
dc.ada.note: For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
dc.creator.orcid: Carnevale, Vincenzo|0000-0002-0447-1278
dc.date.updated: 2021-01-26T20:17:08Z
refterms.dateFOA: 2021-01-26T20:17:14Z


Files in this item

Name: Accurate Estimation of the ...
Size: 1.691Mb
Format: PDF

Except where otherwise noted, this item's license is described as CC BY