A KNIME-Based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules
2016-06-01, Copmans, Daniëlle, Meinl, Thorsten, Dietz, Christian, van Leeuwen, Matthijs, Ortmann, Julia, Berthold, Michael R., de Witte, Peter A. M.
Recently, the photomotor response (PMR) of zebrafish embryos was reported as a robust behavior that is useful for high-throughput neuroactive drug discovery and mechanism prediction. Given the complexity of the PMR, there is a need for rapid and easy analysis of the behavioral data. In this study, we developed an automated analysis workflow using the KNIME Analytics Platform and made it freely accessible. This workflow allows us to simultaneously calculate a behavioral fingerprint for all analyzed compounds and to further process the data. To further characterize the potential of PMR for mechanism prediction, we performed PMR analysis of 767 neuroactive compounds covering 14 different receptor classes using the KNIME workflow. We observed a true positive rate of 25% and a false negative rate of 75% in our screening conditions. Among the true positives, all receptor classes were represented, thereby confirming the utility of the PMR assay to identify a broad range of neuroactive molecules. By hierarchical clustering of the behavioral fingerprints, different phenotypical clusters were observed that suggest the utility of PMR for mechanism prediction for adrenergics, dopaminergics, serotonergics, metabotropic glutamatergics, opioids, and ion channel ligands.
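The hierarchical clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance metric or linkage the study used, so Euclidean distance and average linkage are assumptions here, and the function `average_linkage`, the compound names, and the fingerprint values are all hypothetical.

```python
from itertools import combinations
from math import dist


def average_linkage(fingerprints, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    # Start with every compound in its own cluster (clusters hold compound names).
    clusters = [[name] for name in fingerprints]

    def cluster_distance(a, b):
        # Average Euclidean distance over all cross-cluster pairs (assumed metric).
        total = sum(dist(fingerprints[x], fingerprints[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical fingerprints: made-up values, not data from the study.
fingerprints = {
    "compound_a": (0.9, 0.1, 0.2),
    "compound_b": (0.8, 0.2, 0.1),
    "compound_c": (0.1, 0.9, 0.7),
    "compound_d": (0.2, 0.8, 0.9),
}
clusters = average_linkage(fingerprints, n_clusters=2)
```

Compounds whose fingerprints lie close together end up in the same cluster, which is the property that a mechanism prediction based on phenotypical clusters relies on.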
KNIME-CDK : Workflow-driven cheminformatics
2013, Beisken, Stephan, Meinl, Thorsten, Wiswedel, Bernd, Figueiredo, Luis F. de, Berthold, Michael R., Steinbeck, Christoph
Cheminformaticians have to routinely process and analyse libraries of small molecules. Among other things, that includes the standardization of molecules, calculation of various descriptors, visualisation of molecular structures, and downstream analysis. For this purpose, scientific workflow platforms such as the Konstanz Information Miner can be used if provided with the right plug-in. A workflow-based cheminformatics tool provides the advantage of ease-of-use and interoperability between complementary cheminformatics packages within the same framework, hence facilitating the analysis process.
KNIME-CDK comprises functions for molecule conversion to/from common formats, generation of signatures, fingerprints, and molecular properties. It is based on the Chemistry Development Toolkit and uses the Chemical Markup Language for persistence. A comparison with the cheminformatics plug-in RDKit shows that KNIME-CDK supports a similar range of chemical classes and adds new functionality to the framework. We describe the design and integration of the plug-in, and demonstrate the usage of the nodes on ChEBI, a library of small molecules of biological interest.
KNIME-CDK is an open-source plug-in for the Konstanz Information Miner, a free workflow platform. KNIME-CDK is built on top of the open-source Chemistry Development Toolkit and allows for efficient cross-vendor structural cheminformatics. Its ease-of-use and modularity enable researchers to automate routine tasks and data analysis, bringing complementary cheminformatics functionality to the workflow environment.
Maximum-Score Diversity Selection for Early Drug Discovery
2011-02-28, Meinl, Thorsten, Ostermann, Claude, Berthold, Michael R.
Diversity selection is a common task in early drug discovery. One drawback of current approaches is that usually only the structural diversity is taken into account and activity information is ignored. In this article we present a modified version of diversity selection - which we term "Maximum-Score Diversity Selection" - that additionally takes the estimated or predicted activities of the molecules into account. We show that finding an optimal solution to this problem is computationally very expensive (it is NP-hard) and therefore heuristic approaches are needed.
After a discussion of existing approaches we present our new method which is computationally far more efficient but at the same time produces comparable results. We conclude by validating these theoretical differences on several datasets.
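Because the exact problem is NP-hard, a heuristic has to trade activity against diversity step by step. The sketch below shows one generic greedy heuristic of that kind. It is not the method proposed in the article, whose details the abstract does not give. The function `greedy_select`, the `weight` parameter, the additive objective, and the example molecules are all assumptions made for illustration.

```python
from math import dist


def greedy_select(candidates, k, weight=0.5):
    """Pick k items, trading off activity score against distance to earlier picks.

    candidates maps a name to (score, descriptor vector); both are hypothetical.
    """
    if not candidates or k <= 0:
        return []
    # Seed the selection with the highest-scoring candidate.
    selected = [max(candidates, key=lambda name: candidates[name][0])]
    while len(selected) < min(k, len(candidates)):

        def gain(name):
            score, vec = candidates[name]
            # Diversity term: distance to the nearest already-selected item.
            nearest = min(dist(vec, candidates[s][1]) for s in selected)
            return weight * score + (1 - weight) * nearest

        remaining = [name for name in candidates if name not in selected]
        selected.append(max(remaining, key=gain))
    return selected


# Made-up scores and 2D descriptors, for illustration only.
candidates = {
    "mol_1": (0.95, (0.0, 0.0)),
    "mol_2": (0.90, (0.1, 0.0)),
    "mol_3": (0.60, (1.0, 1.0)),
    "mol_4": (0.20, (0.0, 1.0)),
}
picked = greedy_select(candidates, k=2)
```

In this example the second pick is `mol_3` rather than the higher-scoring `mol_2`, because `mol_2` is nearly identical to the seed and adds little diversity.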
Screening Chemicals for Receptor-Mediated Toxicological and Pharmacological Endpoints : Using Public Data to Build Screening Tools within a KNIME Workflow
2015, Steinmetz, Fabian P., Mellor, Claire L., Meinl, Thorsten, Cronin, Mark T. D.
Assessing compounds for their pharmacological and toxicological properties is of great importance for industry and regulatory agencies. In this study an approach using open source software and open access databases to build screening tools for receptor-mediated effects is presented. The retinoic acid receptor (RAR), as a pharmacologically and toxicologically relevant target, was chosen for this study. RAR agonists are used in the treatment of a number of dermal conditions and specific types of cancer, such as acute promyelocytic leukemia. However, when administered chronically, there is strong evidence that RAR agonists cause hepatosteatosis and liver injury. After compiling information on ligand-protein-interactions, common substructures and physico-chemical properties of ligands were identified manually and coded into SMARTS strings. Based on these SMARTS strings and calculated physico-chemical features, a rule-based screening workflow was built within the KNIME platform. The workflow was evaluated on two datasets: one with RAR agonists exclusively and another large, chemically diverse dataset containing only a few RAR agonists. Possible modifications and applications of screening workflows, dependent on their purpose, are presented.
KNIME - The Konstanz information miner : Version 2.0 and Beyond
2009, Berthold, Michael R., Cebron, Nicolas, Dill, Fabian, Gabriel, Thomas R., Kötter, Tobias, Meinl, Thorsten, Ohl, Peter, Thiel, Kilian, Wiswedel, Bernd
The Konstanz Information Miner is a modular environment, which enables easy visual assembly and interactive execution of a data pipeline. It is designed as a teaching, research and collaboration platform, which enables simple integration of new algorithms and tools as well as data manipulation or visualization methods in the form of new modules or nodes. In this paper we describe some of the design aspects of the underlying architecture, briefly sketch how new nodes can be incorporated, and highlight some of the new features of version 2.0.
Flexible and transparent computational workflows for the prediction of target organ toxicity
2013, Richarz, Andrea-Nicole, Enoch, Steven J., Hewitt, Mark, Madden, Judith C., Przybylak, Katarzyna, Yang, Chihae, Berthold, Michael R., Meinl, Thorsten, Ohl, Peter, Cronin, Mark T. D.
In silico modeling of target organ toxicity has been held back in part by an inability to capture all relevant information into a meaningful reductionist approach. It has also been considered at times too simplistic, using data of often variable quality and seldom allowing the user to assess the relevance to the intended use. The purpose of this study was to develop a novel computational toxicology workflow system, to allow the users greater control and understanding of the target organ toxicity prediction. The workflows were built on the KNIME open-access platform which allows pipelining via a graphical user interface. Various building blocks, known as nodes, were incorporated, to access chemical inventories and/or databases, to profile structures and calculate properties and to report prediction results. The “basic” user sees a web-interface, whilst a “trained” user can go behind this to interrogate the nodes and, if required, link to additional data sources or investigate and update the models. The workflow was developed to address in particular the prediction of target organ toxicity of cosmetic ingredients. It comprises an inventory of over 4,400 unique chemical structures (cosmetic ingredients and related substances). The database contains repeat dose toxicity data for over 1,100 compounds including NOEL values. Thus, a user is able to search for similar compounds in the inventory file or database. The compound is then profiled using relevant structural alerts and chemotypes, currently comprising 108 alerts for protein reactivity, 85 for DNA binding, 32 for phospholipidosis and 16 for other liver toxicity endpoints. The workflows are flexible and transparent, and they are successful in guiding a user through the process of making a prediction of target organ toxicity. Supported by the EU FP7 COSMOS Project.
Parallel and Distributed Data Pipelining with KNIME
2007, Sieb, Christoph, Meinl, Thorsten, Berthold, Michael R.
In recent years a new category of data analysis applications has evolved, known as data pipelining tools, which enable even nonexperts to perform complex analysis tasks on potentially huge amounts of data. Due to the complex and computing-intensive analysis processes and methods used, it is often neither sufficient nor possible to simply rely on the increase of performance of single processors. Promising solutions to this problem are parallel and distributed approaches that can accelerate the analysis process. In this paper we discuss the parallelization and distribution potential of pipelining tools by demonstrating several parallel and distributed implementations in the open source pipelining platform KNIME. We verify the practical applicability in a number of real-world experiments.
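One common way to parallelize a pipeline step is to split the input table into chunks, process the chunks concurrently, and concatenate the results in their original order. The sketch below illustrates that general idea only. It does not reproduce the KNIME implementations described in the paper, and the helper names (`split_into_chunks`, `process_chunk`, `run_parallel`) and the squaring step are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n_chunks):
    """Split rows into at most n_chunks contiguous chunks of near-equal size."""
    if not rows:
        return []
    size = -(-len(rows) // n_chunks)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def process_chunk(chunk):
    """Stand-in for a per-row analysis step (here: squaring each value)."""
    return [value * value for value in chunk]


def run_parallel(rows, n_workers=4):
    """Process chunks concurrently and return results in the input order."""
    chunks = split_into_chunks(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map yields results in the order of the submitted chunks.
        results = list(pool.map(process_chunk, chunks))
    return [value for chunk in results for value in chunk]


squares = run_parallel(list(range(10)))
```

Threads are used here only to keep the sketch self-contained. For CPU-bound steps in CPython, a process pool or a distributed executor would be the more realistic choice.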