{"id":"RDUUNC_143dd03795ab0c68351356561ed7834e","dc:title":"A novel distance that reduces information loss in continuous characters with few observations","dc:creator":"Balseiro, Diego","dc:date":"2024","dc:description":["The calculation of pairwise distances is a fundamental step in many statistical analyses in biology and paleontology. The most commonly used distances work with a single observation per object and character, but there are scenarios where multiple observations are available per object. In these situations, the information for the character spans an interval, and pairs of objects can have overlapping intervals, which further complicates the distance calculation. Some coefficients can deal with this wealth of information but are either too coarse to provide detailed results or too computationally demanding for even moderately large data sets. Here, we present the Distance Between Intervals (DBI) as a novel semi-metric distance that can accommodate both singular and multiple observations per object by analyzing them as intervals. The DBI ranges from 0 to 1 when there is an overlap between the objects and from 1 to infinity when there is no overlap between them. It is easy to calculate and can be applied to a wide variety of data types. Both simulated and empirical test cases show that the DBI correctly ranks pairs of objects by their level of overlap and non-overlap, while other distances struggle to do so. Therefore, the DBI can provide a finer level of definition than other available distances for empirical data sets, while generally agreeing with the broad results they provide. An implementation of DBI is provided for the R programming language.","En biolog\u00eda y paleontolog\u00eda, el c\u00e1lculo de distancias pareadas es un paso fundamental en muchos an\u00e1lisis estad\u00edsticos. 
Los coeficientes de distancia m\u00e1s comunes utilizan un \u00fanico valor por objeto y car\u00e1cter, pero hay escenarios donde hay m\u00faltiples observaciones por objeto. En estas situaciones, la informaci\u00f3n para el car\u00e1cter abarca un intervalo y los intervalos de un par de objetos pueden superponerse, complicando a\u00fan m\u00e1s el c\u00e1lculo de la distancia. Existen coeficientes que pueden manejar una gran cantidad de informaci\u00f3n por objeto, pero por la baja resoluci\u00f3n de sus resultados son poco detallados o bien tienen un costo computacional demasiado elevado, incluso para conjuntos de datos moderadamente grandes. Aqu\u00ed presentamos la Distancia Entre Intervalos (DBI por sus siglas en ingl\u00e9s) como una nueva distancia semim\u00e9trica que puede trabajar con objetos con una o m\u00e1s observaciones al analizarlos como intervalos. La DBI var\u00eda entre 0 y 1 cuando los intervalos de los objetos se superponen y de 1 a infinito cuando no hay superposici\u00f3n entre ellos. El coeficiente es f\u00e1cil de calcular y se puede aplicar a una amplia variedad de tipos de datos. Simulaciones computacionales y bases de datos emp\u00edricas muestran que DBI es mejor para reconocer las diferencias entre objetos seg\u00fan su variabilidad. Por lo tanto, la DBI puede proporcionar un mayor nivel de definici\u00f3n que otras distancias disponibles en sus resultados, mientras que est\u00e1 de acuerdo con la tendencia general de los resultados que brindan. 
Se proporciona una implementaci\u00f3n de DBI para el lenguaje de programaci\u00f3n R."],"dc:format":["application\/x-rar-compressed","application\/octet-stream","text\/plain"],"dc:language":["eng"],"dc:type":"dataset","dc:subject":["Distance coefficient","Distance matrix","Continuous characters","Intervals","Overlap","Coeficiente de distancia","Matriz de distancia","Caracteres continuos","Intervalos","Superposici\u00f3n"],"dc:rights":["info:eu-repo\/semantics\/openAccess"],"dc:identifier":"https:\/\/repositoriosdigitales.mincyt.gob.ar\/vufind\/Record\/RDUUNC_143dd03795ab0c68351356561ed7834e"}