LOTUS

Beyond drug discovery: Breaking the boundaries of natural products information

Jonathan Bisson, Adriano Rutz, Guido Pauli, Jean-Luc Wolfender, Pierre-Marie Allard


ACS Spring 2022

Expanding Cheminformatics to Industries Adjacent to Small Molecule Drug Discovery

Situation

Formerly at the University of Illinois Chicago

Now at Collaborative Drug Discovery Inc. [1]

graphical abstract
1. (Sometimes strong) Opinions expressed are solely my own and do not express the views or opinions of my current or previous employer.

Who is behind it

Core team

Adriano Rutz, Jonathan Bisson, Pierre-Marie Allard

The first contributors

Maria Sorokina, Jiří Vondrášek, Daniel Mietchen, Egon Willighagen, Roderic Page, Ralf Stephan, Christoph Steinbeck, Jakub Galgonek, James Graham, Guido Pauli, Arnaud Gaudry, Jean-Luc Wolfender

And all the contributors on Wikidata, too many to cite.

Funding

  • NIH U41 AT008706 / P50 AT000155 (NCCIH/ODS)

  • ChemBioSys (Project-ID 239748522, SFB 1127)

  • Alfred P. Sloan G-2019-11458

  • Elixir CZ MEYS LM2018131

The world of Natural Products

magic tree

Taxonomy is painful

Taxo

Classification

What is LOTUS?

graphical abstract

Over 750,000 entries added as of today:

  • 250,000+ unique structures

  • 30,000+ organisms

  • 75,000+ references

LOTUS curation?

cleaning
R
python

LOTUS bot

botedits
kotlin

Arabidopsis example

arabidopsis

Beta-sitosterol example

sitosterol

Queries

Original question

"Hey Wikidata, which organisms are known to contain quercetin?"

SPARQL translation

SELECT DISTINCT ?parent_taxon ?parent_taxonname ?taxon ?taxonname WHERE {
  VALUES ?classes { wd:Q11173 wd:Q59199015 }
  ?compound wdt:P31 ?classes;
            wdt:P235 "REFJWTPEDVJJIY-UHFFFAOYSA-N".
  ?taxon wdt:P171 ?parent_taxon.
  {
    ?compound p:P703 ?stmt.
    ?stmt ps:P703 ?taxon.
    {
      ?stmt prov:wasDerivedFrom ?ref.
      ?ref pr:P248 ?art.
      ?art wdt:P356 ?art_doi.
    }
  }
  ?taxon wdt:P225 ?taxonname.
  ?parent_taxon wdt:P225 ?parent_taxonname.
}
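
The query above is tied to one compound through its InChIKey. As a minimal sketch, the same pattern could be generated for any compound by substituting that key. The helper name `build_taxa_query` and the format check are assumptions added for illustration, not part of the LOTUS tooling; the prefixes (`wd:`, `wdt:`, `p:`, `ps:`, `pr:`, `prov:`) are expected to be predefined by the query service.

```python
import re
from string import Template

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, joined by hyphens.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")

# Same graph pattern as the slide's query; only the InChIKey is a placeholder.
TAXA_QUERY = Template("""\
SELECT DISTINCT ?parent_taxon ?parent_taxonname ?taxon ?taxonname WHERE {
  VALUES ?classes { wd:Q11173 wd:Q59199015 }
  ?compound wdt:P31 ?classes;
            wdt:P235 "$inchikey".
  ?taxon wdt:P171 ?parent_taxon.
  {
    ?compound p:P703 ?stmt.
    ?stmt ps:P703 ?taxon.
    {
      ?stmt prov:wasDerivedFrom ?ref.
      ?ref pr:P248 ?art.
      ?art wdt:P356 ?art_doi.
    }
  }
  ?taxon wdt:P225 ?taxonname.
  ?parent_taxon wdt:P225 ?parent_taxonname.
}
""")


def build_taxa_query(inchikey):
    """Return the 'which organisms contain this compound' query for one InChIKey.

    Hypothetical helper: rejects anything that does not look like an InChIKey
    so that arbitrary text is never pasted into the query.
    """
    if not INCHIKEY_PATTERN.fullmatch(inchikey):
        raise ValueError(f"not a valid InChIKey: {inchikey!r}")
    return TAXA_QUERY.substitute(inchikey=inchikey)


if __name__ == "__main__":
    # InChIKey taken from the query on the slide.
    print(build_taxa_query("REFJWTPEDVJJIY-UHFFFAOYSA-N"))
```

The printed text can be pasted into the Wikidata query editor or sent to the SPARQL endpoint by any HTTP client.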

Organisms containing quercetin

quercetin

Structures found in Quercus spp.

Compounds structurally similar to antibiotics

Sorted by parent taxon.

Antibios

Finding overlooked sources of chemicals

Which Zephyranthes species lack compounds known from at least two sister species?

Zephyr
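
The logic behind this question can be sketched in a few lines, assuming the compound-organism pairs have already been retrieved. The function name `overlooked_candidates` and the toy data are hypothetical and contain no real occurrence records; "sister species" is simplified here to "other species of the same genus".

```python
from collections import Counter


def overlooked_candidates(compounds_by_species, min_sisters=2):
    """Map each species to the compounds it lacks but that are reported
    in at least `min_sisters` other species of the same genus."""
    result = {}
    for species, known in compounds_by_species.items():
        # Count, per compound, how many *other* species report it.
        counts = Counter(
            compound
            for other, compounds in compounds_by_species.items()
            if other != species
            for compound in set(compounds)
        )
        missing = sorted(
            compound
            for compound, n in counts.items()
            if n >= min_sisters and compound not in known
        )
        if missing:
            result[species] = missing
    return result


# Made-up data for illustration only.
toy_genus = {
    "species A": {"compound 1", "compound 2"},
    "species B": {"compound 1", "compound 3"},
    "species C": {"compound 2", "compound 3"},
    "species D": {"compound 1"},
}

if __name__ == "__main__":
    for species, compounds in overlooked_candidates(toy_genus).items():
        print(species, "->", ", ".join(compounds))
```

Species with few reported compounds but many hits in this list are candidates for closer study, since absence from the data often means "not yet looked for" rather than "not present".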

Finding sources in the world

Maps

Pigments (Layla Michán)

graphical abstract

All natural products that are pigments

pigments

Fragrances, aromas

graphical abstract

Compounds similar to Eugenol

eugenol

Cosmetics and allergies

Sources of compounds similar to Dehydroabietic acid

Sachem

sachem

Sachem lets us link Wikidata with PubChem (another integration follows in a few slides), DrugBank, ChEMBL and ChEBI. It also enables substructure and similarity searches.
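
To make "similarity search" concrete, here is a minimal sketch of the Tanimoto coefficient, a common way to score how alike two structures are from their fingerprint features. This is not Sachem's implementation: the function names, the threshold and the integer feature sets are assumptions for illustration, and a real search would derive fingerprints from the structures with a cheminformatics toolkit.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by all features present."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as zero
    return len(a & b) / len(union)


def rank_by_similarity(query, library, threshold=0.5):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = ((name, tanimoto(query, features)) for name, features in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Made-up fingerprints: each integer stands for one structural feature.
toy_query = {1, 2, 3, 4}
toy_library = {
    "candidate a": {1, 2, 3, 4},
    "candidate b": {1, 2, 3, 5},
    "candidate c": {7, 8},
}

if __name__ == "__main__":
    for name, score in rank_by_similarity(toy_query, toy_library):
        print(f"{name}: {score:.2f}")
```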

Trees

Sitosterol interpreted biological tree

sitosterol tree 1

Sitosterol interpreted biological tree

sitosterol tree 2

Sitosterol interpreted biological tree

sitosterol tree 3

Sitosterol interpreted biological tree

sitosterol tree 4

Sitosterol interpreted biological tree

sitosterol tree 5

Tmap

Integrations

PubChem Integration

Work by Tiejun Cheng, Evan Bolton and Adriano Rutz.

pubchem1

PubChem Integration

Google Sheets (in beta test)

New chemical classification

A new rule-based approach to classify compounds using Wikidata by Ralf Stephan.

Perspectives

workflow 2

Perspectives

workflow 3

We need YOU!

Perspectives for industry

Enrich

Integration: data lakes, databases and knowledge graphs

Protocols and data are open. The SPARQL endpoint allows integration.
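
As one possible integration path, a SPARQL endpoint can return results in the standard SPARQL JSON results format, which flattens easily into rows for a database or data lake. The sketch below assumes that format; the function names and the sample response (including the placeholder entity IRIs) are hypothetical and not real LOTUS data.

```python
import csv
import io
import json


def bindings_to_rows(sparql_json):
    """Flatten a SPARQL JSON result (text or parsed dict) into column names and row dicts."""
    data = json.loads(sparql_json) if isinstance(sparql_json, str) else sparql_json
    columns = data["head"]["vars"]
    rows = [
        # Unbound variables are simply absent from a binding; use "" for them.
        {col: binding.get(col, {}).get("value", "") for col in columns}
        for binding in data["results"]["bindings"]
    ]
    return columns, rows


def rows_to_csv(columns, rows):
    """Serialize the rows as CSV text, ready to load into another system."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up response in the SPARQL JSON results layout.
sample_response = """
{
  "head": {"vars": ["taxon", "taxonname"]},
  "results": {"bindings": [
    {"taxon": {"type": "uri", "value": "http://example.org/entity/T1"},
     "taxonname": {"type": "literal", "value": "Example taxon"}},
    {"taxon": {"type": "uri", "value": "http://example.org/entity/T2"}}
  ]}
}
"""

if __name__ == "__main__":
    cols, rows = bindings_to_rows(sample_response)
    print(rows_to_csv(cols, rows))
```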

Share

Contribute to Open Data and gain visibility. This is the same model as the bio-assay data that companies share on PubChem.

Join

Combine the skills, tools and knowledge of other sectors in unprecedented ways.

LOTUS