Identifying biological traces at crime scenes is a central challenge in forensic science. Determining the origin and nature of these traces can be critical to understanding the circumstances of a crime and guiding the investigation. Among the key questions, distinguishing the body fluids present in a stain is paramount. Because the fluids of interest share part of their protein composition, identifying them, or detecting complex mixtures, is a daunting task. Meeting this challenge requires rigorous statistical analysis of the proteomic profiles obtained, cross-referenced with knowledge from the scientific literature. This is an essential condition for proposing a robust analytical strategy to investigators.
This study aims to develop three distinct tests based on proteomic data for the identification of human body fluids. The first test builds panels of peptide biomarkers specific to each fluid. To account for compounds common to several fluids, the second test quantitatively evaluates peptide abundance ratios according to the fluid of origin. Finally, the third test uses artificial intelligence, in the form of a multi-output Random Forest, to differentiate pure fluids from mixtures and to identify the fluids present in these mixtures.
To this end, samples from 20 individuals per fluid (blood, saliva, semen, urine and vaginal fluid) and 78 mixtures were analyzed following a nanoLC-MS/MS bottom-up proteomics strategy. The results highlight the promising potential of proteomics for forensic body fluid identification. The evaluation of the three tests demonstrates their complementarity, as they focus on distinct features derived from a single experiment. Although the strategy was developed in high-resolution data-dependent acquisition (DDA) mode, narrowing the peptide panel should allow it to be transferred efficiently to a robust, high-throughput targeted analysis.
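As an illustration of the multi-output formulation behind the third test, the sketch below shows one plausible way to represent samples as one binary target per fluid and to read a prediction back as "pure" or "mixture". It is a minimal sketch under stated assumptions, not the study's implementation: the one-output-per-fluid encoding, the fluid ordering, and the helper names `encode_sample` and `decode_prediction` are hypothetical and do not come from the study.

```python
from typing import Dict, List, Set

# The five fluids named in the abstract; the ordering is an arbitrary choice.
FLUIDS: List[str] = ["blood", "saliva", "semen", "urine", "vaginal fluid"]


def encode_sample(fluids_present: Set[str]) -> List[int]:
    """Turn the set of fluids in a sample into one 0/1 target per fluid.

    Assumption: each classifier output answers "is this fluid present?".
    """
    unknown = fluids_present - set(FLUIDS)
    if unknown:
        raise ValueError(f"unknown fluid(s): {sorted(unknown)}")
    return [1 if fluid in fluids_present else 0 for fluid in FLUIDS]


def decode_prediction(outputs: List[int]) -> Dict[str, object]:
    """Interpret one multi-output prediction (hypothetical helper).

    Exactly one positive output is read as a pure fluid,
    several positive outputs as a mixture.
    """
    if len(outputs) != len(FLUIDS):
        raise ValueError("expected one output per fluid")
    detected = [fluid for fluid, flag in zip(FLUIDS, outputs) if flag == 1]
    if not detected:
        kind = "none detected"
    elif len(detected) == 1:
        kind = "pure"
    else:
        kind = "mixture"
    return {"kind": kind, "fluids": detected}
```

Under this encoding, a stack of such target vectors (one row per sample) could serve as the two-dimensional label array for a multi-output classifier; scikit-learn's `RandomForestClassifier`, for instance, accepts targets of shape `(n_samples, n_outputs)`. Whether the study encoded its labels this way is not stated in the abstract.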