The increasing number of techniques introduced to describe organisms and taxa produces
multivariate datasets, often composed of relatively independent descriptors. Handling several descriptors can be laborious
and is often unnecessary when their information is not congruent with that of other datasets used in the same study. On the
other hand, different levels of correlation between single descriptors and a whole dataset may provide useful scientific hints.
The DADI (Distance-based Analysis for (optimal) Descriptor Identification) algorithm is proposed to allow a rapid and
complete comparison of descriptors coming from two different datasets with the same number of objects. DADI was
employed to select FTIR (Fourier Transform Infrared Spectroscopy) spectral wavelengths according to their correlation
with the 26S rDNA sequences of strains belonging to a yeast genus.
This procedure made it possible to define a set of optimal wavelengths, with an overall increase in the correlation between
FTIR and 26S data.
DADI can identify the FTIR wavenumbers that best fit the chosen reference, thereby defining the descriptors to be
used in FTIR and possibly in other metabolomic analyses.
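
The general idea of ranking single descriptors by their agreement with a reference dataset can be illustrated with a minimal sketch. It is not the DADI implementation. It assumes that each descriptor (for example, one FTIR spectral point) is turned into a pairwise absolute-difference distance matrix over the objects, and that this matrix is compared with a reference distance matrix (for example, one derived from 26S rDNA sequences) through the Pearson correlation of their upper triangles. The function names `pairwise_distances`, `upper_triangle`, `rank_descriptors` and `select_top` are hypothetical, as are the choices of distance and correlation measure.

```python
import numpy as np


def pairwise_distances(values):
    """Absolute-difference distance matrix for one descriptor.

    `values` holds one measurement per object (assumed distance measure).
    """
    v = np.asarray(values, dtype=float)
    return np.abs(v[:, None] - v[None, :])


def upper_triangle(matrix):
    """Return the strictly upper-triangular entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def rank_descriptors(data, reference_dist):
    """Correlate each descriptor's distance matrix with the reference.

    data           : array of shape (objects, descriptors)
    reference_dist : array of shape (objects, objects)
    Returns one Pearson correlation per descriptor; NaN when a
    descriptor (or the reference) has no variation in its distances.
    """
    data = np.asarray(data, dtype=float)
    ref = upper_triangle(np.asarray(reference_dist, dtype=float))
    scores = []
    for col in range(data.shape[1]):
        dist = upper_triangle(pairwise_distances(data[:, col]))
        if dist.std() == 0 or ref.std() == 0:
            scores.append(float("nan"))
        else:
            scores.append(float(np.corrcoef(dist, ref)[0, 1]))
    return np.array(scores)


def select_top(scores, k):
    """Indices of the k best-correlated descriptors (NaN scores sort last)."""
    return np.argsort(-np.asarray(scores, dtype=float))[:k]
```

Under these assumptions, both datasets must describe the same objects in the same order, which matches the requirement that the two datasets contain the same number of objects. The descriptors returned by `select_top` would then play the role of the candidate optimal set.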