[Bjonnh.net]# _

authors Adriano Rutz, Maria Sorokina, Jakub Galgonek, Daniel Mietchen, Egon Willighagen, Arnaud Gaudry, James G. Graham, Ralf Stephan, Roderic Page, Jiří Vondrášek, Christoph Steinbeck, Guido F. Pauli, Jean-Luc Wolfender, Jonathan Bisson, Pierre-Marie Allard
journal bioRxiv
subjects Wikidata Natural products Databases Chemoinformatics

Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges to data access within the discipline and to integration and interoperability between related domains. The fundamental elements of exchange are referenced structure-organism pairs that establish relationships between distinct molecular structures and the living organisms from which they were identified. Consolidating and sharing such information via an open platform has strong transformative potential for natural products research and beyond. This is the ultimate goal of the newly established LOTUS initiative, which has now completed the first steps toward the harmonization, curation, validation and open dissemination of 700,000+ referenced structure-organism pairs. LOTUS data is hosted on Wikidata and regularly mirrored on https://lotus.naturalproducts.net. Data sharing within the Wikidata framework broadens data access and interoperability, opening new possibilities for community curation and evolving publication models. Furthermore, embedding LOTUS data into the vast Wikidata knowledge graph will facilitate new biological and chemical insights. The LOTUS initiative represents an important advancement in the design and deployment of a comprehensive and collaborative natural products knowledge base.
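
As an illustration of what a referenced structure-organism pair looks like in Wikidata, here is a minimal sketch of a SPARQL query built in Python. It is not taken from the LOTUS project itself. It assumes the Wikidata properties P235 (InChIKey), P703 (found in taxon), P225 (taxon name) and P248 (stated in), and relies on the prefixes predefined by the Wikidata Query Service. The function name `build_pairs_query` is hypothetical, and the query is only built here, not sent anywhere.

```python
def build_pairs_query(limit: int = 10) -> str:
    """Build a SPARQL query for referenced structure-organism pairs.

    Illustrative sketch only: it assumes the Wikidata properties
    P235 (InChIKey), P703 (found in taxon), P225 (taxon name) and
    P248 (stated in). It is not the query used by the LOTUS project.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"""
SELECT ?compound ?inchikey ?taxon ?taxon_name ?reference WHERE {{
  ?compound wdt:P235 ?inchikey ;
            p:P703 ?statement .
  ?statement ps:P703 ?taxon ;
             prov:wasDerivedFrom/pr:P248 ?reference .
  ?taxon wdt:P225 ?taxon_name .
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Print the query so it can be pasted into a SPARQL endpoint.
    print(build_pairs_query(limit=5))
```

Each result row would then correspond to one pair: a structure, the organism it was found in, and the reference supporting that statement.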

categories publications science

authors Adriano Rutz, Miwa Dounoue-Kubo, Simon Ollivier, Jonathan Bisson, Mohsen Bagheri, Tongchai Saesong, Samad Nejad Ebrahimi, Kornkanok Ingkaninan, Jean-Luc Wolfender, Pierre-Marie Allard
journal Frontiers in Plant Science
subjects Pharmacognosy Taxonomy Mass spectrometry Annotation

Mass spectrometry (MS) offers unrivalled sensitivity for the metabolite profiling of complex biological matrices encountered in natural products (NP) research. The massive and complex sets of spectral data generated by such platforms require computational approaches for their interpretation. Within such approaches, computational metabolite annotation automatically links spectral data to candidate structures via a score, which is usually established between the acquired data and experimental or theoretical spectral databases (DB). This process leads to various candidate structures for each MS feature. However, at this stage, obtaining a high annotation confidence level remains a challenge, notably due to the extensive chemodiversity of specialized metabolomes. The design of a metascore is a way to capture complementary experimental attributes and improve the annotation process. Here, we show that integrating the taxonomic position of the biological source of the analyzed samples and candidate structures enhances confidence in metabolite annotation. A script is proposed to automatically input such information at various granularity levels (species, genus, and family) and complement the score obtained between experimental spectral data and output of available computational metabolite annotation tools (ISDB-DNP, MS-Finder, Sirius). In all cases, the consideration of the taxonomic distance allowed an efficient re-ranking of the candidate structures, leading to a systematic enhancement of the recall and precision rates of the tools (1.5- to 7-fold increase in the F1 score). Our results clearly demonstrate the importance of considering taxonomic information in the process of specialized metabolite annotation. This requires access to structural data systematically documented with biological origin, both for new and previously reported NPs. In this respect, the establishment of an open structural DB of specialized metabolites and their associated metadata, particularly biological sources, is timely and critical for the NP research community.
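
The re-ranking idea can be sketched in a few lines of Python. This is a toy illustration, not the published script: the bonus weights in `RANK_BONUS`, the class and function names, and the example candidates are all hypothetical. It assumes spectral scores are normalized to [0, 1] and that each candidate structure carries the taxonomy of the organism it was reported from.

```python
from dataclasses import dataclass

# Hypothetical bonus per finest shared taxonomic rank (not from the paper).
RANK_BONUS = {"family": 0.25, "genus": 0.5, "species": 1.0}


@dataclass(frozen=True)
class Taxon:
    family: str
    genus: str
    species: str


@dataclass(frozen=True)
class Candidate:
    name: str
    spectral_score: float  # assumed normalized to [0, 1]
    source: Taxon          # organism in which the structure was reported


def taxonomic_bonus(sample: Taxon, source: Taxon) -> float:
    """Return the bonus for the finest rank shared by sample and source."""
    for rank in ("species", "genus", "family"):
        if getattr(sample, rank) == getattr(source, rank):
            return RANK_BONUS[rank]
    return 0.0


def rerank(sample: Taxon, candidates: list[Candidate]) -> list[Candidate]:
    """Sort candidates by spectral score plus taxonomic bonus, best first."""
    return sorted(
        candidates,
        key=lambda c: c.spectral_score + taxonomic_bonus(sample, c.source),
        reverse=True,
    )


if __name__ == "__main__":
    sample = Taxon("Gentianaceae", "Gentiana", "Gentiana lutea")
    candidates = [
        Candidate("candidate_a", 0.80, Taxon("Fabaceae", "Glycine", "Glycine max")),
        Candidate("candidate_b", 0.70, Taxon("Gentianaceae", "Gentiana", "Gentiana acaulis")),
        Candidate("candidate_c", 0.60, Taxon("Gentianaceae", "Swertia", "Swertia chirayita")),
    ]
    print([c.name for c in rerank(sample, candidates)])
    # ['candidate_b', 'candidate_c', 'candidate_a']
```

In this toy case the candidate with the best spectral score drops to last place because its reported source is taxonomically unrelated to the sample, which is the effect the metascore is meant to capture.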

categories publications science

authors Pierre-Marie Allard, Jonathan Bisson, Antonio Azzollini, Guido F. Pauli, Geoffrey A Cordell, Jean-Luc Wolfender
journal Current Opinion in Biotechnology
subjects Pharmacognosy Computers Data Digital Nanopublications Mining Contextualization
Graphical abstract representing the need for mining and contextualization

Humans have co-evolved alongside numerous other organisms, some having a profound effect on health and nutrition. As the earliest pharmaceutical subject, pharmacognosy has evolved into a meta-discipline devoted to natural biomedical agents and their functional properties. While the acquisition of expanding data volumes is ongoing, contextualization is lagging. Thus, we assert that the establishment of an integrated and open database ecosystem will nurture the discipline. After proposing an epistemological framework of knowledge acquisition in pharmacognosy, this study focuses on recent computational and analytical approaches. It then elaborates on the flux of research data, where good practices could foster the implementation of more integrated systems, which will in turn help shape the future of pharmacognosy and determine its constitutional societal relevance.

categories publications science

authors Pierre-Marie Allard, Tiphaine Péresse, Jonathan Bisson, Katia Gindro, Laurence Marcourt, Van Cuong Pham, Fanny Roussi, Marc Litaudon, Jean-Luc Wolfender
journal Analytical Chemistry
subjects Pharmacognosy Phytochemistry MS Fragmentation Molecular Networking Virtual pharmacognosy

Dereplication represents a key step for rapidly identifying known secondary metabolites in complex biological matrices. In this context, liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) is increasingly used and, via untargeted data-dependent MS/MS experiments, massive amounts of detailed information on the chemical composition of crude extracts can be generated. An efficient exploitation of such data sets requires automated data treatment and access to dedicated fragmentation databases. Various novel bioinformatics approaches such as molecular networking (MN) and in-silico fragmentation tools have emerged recently and provide new perspectives for early metabolite identification in natural products (NPs) research. Here we propose an innovative dereplication strategy based on the combination of MN with an extensive in-silico MS/MS fragmentation database of NPs. Using two case studies, we demonstrate that this combined approach offers a powerful tool to navigate through the chemistry of complex NP extracts, dereplicate metabolites, and annotate analogues of database entries.
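
To give a rough idea of how a molecular network connects MS/MS spectra, here is a simplified Python sketch. It uses a plain cosine similarity on binned peaks, whereas molecular networking tools typically use a modified cosine that also accounts for precursor mass differences. The function names, the bin width, the similarity threshold, and the example spectra are all assumptions made for illustration, not values from the paper.

```python
import math
from itertools import combinations

Peaks = list[tuple[float, float]]  # (m/z, intensity) pairs


def bin_spectrum(peaks: Peaks, bin_width: float = 0.01) -> dict[int, float]:
    """Sum peak intensities into fixed-width m/z bins."""
    binned: dict[int, float] = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a: dict[int, float], b: dict[int, float]) -> float:
    """Plain cosine similarity between two binned spectra."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_network(
    spectra: dict[str, Peaks], threshold: float = 0.7
) -> list[tuple[str, str, float]]:
    """Return an edge for every pair of spectra scoring at or above threshold."""
    binned = {name: bin_spectrum(peaks) for name, peaks in spectra.items()}
    edges = []
    for x, y in combinations(sorted(binned), 2):
        score = cosine(binned[x], binned[y])
        if score >= threshold:
            edges.append((x, y, score))
    return edges


if __name__ == "__main__":
    # Toy spectra: s1 and s2 share most fragments, s3 shares none.
    spectra = {
        "s1": [(100.0, 1.0), (150.0, 0.5), (200.0, 0.2)],
        "s2": [(100.0, 0.9), (150.0, 0.6)],
        "s3": [(300.0, 1.0)],
    }
    for x, y, score in build_network(spectra):
        print(f"{x} -- {y}: {score:.2f}")
```

In a dereplication setting, some nodes of such a network would carry annotations from a fragmentation database, and their unannotated neighbours become candidates for analogues of those entries.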

categories publications science