{"id":"CONICETDig_ff4369dae9a7ce9fb0696af3a144ea7b","dc:title":"Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae): Phylogenetic datasets for phylogenetic analyses, and phylogenetic trees","dc:creator":"Ramirez, Martin Javier","dc:date":"2024","dc:description":["We combined the COI sequence data with legacy multigene sequence data to create a new, taxon-rich phylogeny for the Amaurobioidinae. We used sequences for four loci that have been used in previous studies on the subfamily: two mitochondrial loci, COI (658bp) and ribosomal subunit 16S (16S, 410bp); and two nuclear loci, Histone H3 (H3, 327bp) and ribosomal subunit 28S (28S, 839bp). We complemented the Amaurobioidinae data with sequences from several non-amaurobioidine anyphaenids and two clubionids as outgroups. Sequence alignment was performed using the MAFFT (ver. 7.308) plugin in Geneious, allowing MAFFT to automatically select an appropriate alignment strategy based on the properties of each locus, or with the online MAFFT server (https:\/\/mafft.cbrc.jp), which consistently selected the L-INS-i algorithm. Finally, alignments of the four loci were concatenated to construct a 2234 bp multigene sequence matrix containing 692 taxa, with about 55% missing\/gap data (\u201cfull\u201d matrix henceforth). To ensure that excessive missing data did not affect the resulting topology, we also constructed a reduced matrix by removing additional COI-only specimens so that each species and morphotype was represented by just one or two specimens for which all loci were available (where possible). After realignment, this reduced matrix was 2235 bp long, included 167 taxa, and had about 22% missing\/gap data (\u201creduced\u201d matrix henceforth). Phylogenetic analyses under maximum likelihood, including model selection, were then conducted with IQ-TREE 2. 
We performed phylogenetic analyses on both concatenated matrices (the full matrix and the reduced matrix) and on each individual locus. For model selection, we provided an initial scheme that partitioned the matrix by locus, and further partitioned the protein-coding loci (COI and H3) by codon position. We used ModelFinder and searched for the best partition scheme, all in IQ-TREE. The best models (partitions) for the full dataset were: GTR+F+I+G4 (16S), GTR+F+I+I+R4 (28S), TVM+F+I+I+R2 (COI-1), TIM2+F+R4 (COI-2), GTR+F+R5 (COI-3), TVMe+G4 (H3-1-H3-2), SYM+G4 (H3-3); and for the reduced dataset: GTR+F+I+G4 (16S), GTR+F+I+G4 (28S), GTR+F+I+G4 (COI-2), GTR+F+I+G4 (COI-3), TVM+F+I+G4 (COI-1, H3-2), GTR+F+I+G4 (H3-1), GTR+F+I+G4 (H3-3). For each dataset, once the best models and partitions were defined, we executed 10 independent replicates of tree calculations followed by 1000 ultrafast bootstrap replicates, and the replicate with the highest likelihood was chosen. Phylogenetic analyses under parsimony were performed with TNT, under equal weights, using the \u201cnew technology\u201d search with default values, asking for 10 independent hits to the minimal length, and submitting the resulting trees to a round of TBR branch swapping."],"dc:format":["application\/octet-stream"],"dc:language":["eng"],"dc:type":"dataset","dc:rights":["info:eu-repo\/semantics\/openAccess","https:\/\/creativecommons.org\/licenses\/by-nc-sa\/2.5\/ar\/"],"dc:relation":["info:eu-repo\/grantAgreement\/Ministerio de Ciencia. Tecnolog\u00eda e Innovaci\u00f3n Productiva. Agencia Nacional de Promoci\u00f3n Cient\u00edfica y Tecnol\u00f3gica\/2017-2689","info:eu-repo\/grantAgreement\/Ministerio de Ciencia, Tecnolog\u00eda e Innovaci\u00f3n Productiva. Agencia Nacional de Promoci\u00f3n Cient\u00edfica y Tecnol\u00f3gica. 
Fondo para la Investigaci\u00f3n Cient\u00edfica y Tecnol\u00f3gica\/2017-2689"],"dc:identifier":"https:\/\/repositoriosdigitales.mincyt.gob.ar\/vufind\/Record\/CONICETDig_ff4369dae9a7ce9fb0696af3a144ea7b"}