{ "cells": [ { "cell_type": "markdown", "id": "d3e87432-1c82-4149-86d4-c0cce97f9c92", "metadata": {}, "source": [ "# Welcome to drugforge\n", "\n", "Welcome to the drugforge tutorial series! \n", "\n", "This notebook walks you through some of the base-level abstractions used in our workflows and gets you comfortable with the style of the package! " ] }, { "cell_type": "markdown", "id": "80d2b3cd-c190-4fcd-b089-ed230238700b", "metadata": {}, "source": [ "## Making your first Ligand\n", "\n", "We aim to provide high-level abstractions that allow conceptual operations on common objects in drug discovery without worrying about implementation details.\n", "\n", "Nothing could be more fundamental to drug discovery than a ligand, so let's start there! `drugforge` has a `Ligand` schema that acts as a metadata-rich, serializable wrapper around a small molecule (backed by an SDF string). This is essential for allowing " ] }, { "cell_type": "code", "execution_count": null, "id": "5fff7910-ffe3-475a-a1cb-fa564614b994", "metadata": {}, "outputs": [], "source": [ "from drugforge.data.schema.ligand import Ligand\n", "\n", "# make a ligand from a SMILES string \n", "\n", "lig = Ligand.from_smiles(\"CC(Cc1ccc(cc1)C(C(=O)O)C)C\", compound_name=\"ibuprofen\") # compound name is mandatory " ] }, { "cell_type": "code", "execution_count": null, "id": "ff3c979f-6461-4a9f-bd61-9327952c7172", "metadata": {}, "outputs": [], "source": [ "# we can compute common properties of our ligand\n", "print(lig.inchi)\n", "print(lig.inchikey)\n", "print(lig.num_poses)\n", "print(lig.smiles)" ] }, { "cell_type": "code", "execution_count": null, "id": "9a25e5e8-cf0d-4414-9c22-b0bf9a51db12", "metadata": {}, "outputs": [], "source": [ "# our representation is fully serializable as JSON, backed by a stored SDF string\n", "lig.json()" ] }, { "cell_type": "code", "execution_count": null, "id": "8dc86cef-2e7e-4aba-a4e5-734b5ac34bdf", "metadata": {}, "outputs": [], "source": [ "# serialize to JSON\n", 
"lig.to_json_file(\"my_ligand.json\")" ] }, { "cell_type": "code", "execution_count": null, "id": "d11b3c97-845e-4bf7-a4bc-469a4897ec3d", "metadata": {}, "outputs": [], "source": [ "# deserialize \n", "lig2 = Ligand.from_json_file(\"my_ligand.json\")\n", "# check for equality\n", "lig == lig2" ] }, { "cell_type": "code", "execution_count": null, "id": "f8fdfa8c-1e14-4225-9420-80a9d05effde", "metadata": {}, "outputs": [], "source": [ "# you can save it as an SDF file \n", "lig.to_sdf(\"my_sdf.sdf\")" ] }, { "cell_type": "markdown", "id": "b16aaf4c-9347-4786-a6cf-87f6b3dabc4e", "metadata": {}, "source": [ "These abstractions enable remote transmission of ligands, easy metadata tracking and simple equality testing between small molecules. All of our workflows make ample use of these abstractions to avoid extensive metadata interrogation at each step and allow easy flow through of identifiers. " ] }, { "cell_type": "code", "execution_count": null, "id": "81949701-b0d5-4e64-bda8-fe757352ace2", "metadata": {}, "outputs": [], "source": [ "# we can also easily make OpenEye molecules from ligands to work with OpenEye components. \n", "oemol = lig.to_oemol()\n", "oemol" ] }, { "cell_type": "code", "execution_count": null, "id": "ff1cadee-473d-48cf-8e13-4357430d1db2", "metadata": {}, "outputs": [], "source": [ "# we can also easily make RDKit molecules from ligands \n", "rdkit_mol = lig.to_rdkit()\n", "rdkit_mol" ] }, { "cell_type": "markdown", "id": "67d11408-d165-41c7-a1cf-53e76ca6f303", "metadata": {}, "source": [ "These translations allow easy use with chemoinformatics, structure based drug design toolkits and molecular simulation engines of all kinds. " ] }, { "cell_type": "markdown", "id": "04fece93-02d5-4886-bdfb-741e0f56b407", "metadata": {}, "source": [ "## Making your first Target\n", "\n", "Most drug discovery campaigns need a target! So how does `drugforge` handle these? 
A `Target` is a metadata-rich, serializable wrapper around a PDB file, in much the same way as a `Ligand` wraps a small molecule. \n", "\n", "For this example we will use an ASAP target, the SARS-CoV-2 nsp3 Mac1 macrodomain, which removes ADP-ribose from viral and host cell proteins. The removal of this post-translational modification reduces the inflammatory and antiviral responses to infection, facilitating replication (see [here](https://www.mdpi.com/2076-0817/11/1/94) for a review).\n", "\n", "See [SARS-CoV-2 nsp3 Mac1 targeting opportunity](https://asapdiscovery.notion.site/Targeting-Opportunity-SARS-CoV-2-nsp3-Mac1-macrodomain-47af24638b994e8ba786303ec743926e) for more information on Mac1. \n", "\n", "\n", "**NOTE: A target is designed to wrap only the protein component of a PDB file.** To work with a protein-ligand complex, you should use a `Complex` object (see later). Making a `Target` will automatically remove the small molecule components from a PDB file. \n" ] }, { "cell_type": "code", "execution_count": null, "id": "4de54f16-9448-4ad8-b48a-51de47890364", "metadata": {}, "outputs": [], "source": [ "# first let's grab a file from the `drugforge` test suite\n", "from drugforge.data.testing.test_resources import fetch_test_file" ] }, { "cell_type": "code", "execution_count": null, "id": "04cf14bd-482b-4e64-88ea-8cc30216406d", "metadata": {}, "outputs": [], "source": [ "from drugforge.data.schema.target import Target" ] }, { "cell_type": "code", "execution_count": null, "id": "db408aea-02cf-45b3-9757-37d609e78621", "metadata": {}, "outputs": [], "source": [ "protein = fetch_test_file(\"SARS2_Mac1A-A1013.pdb\")\n", "print(type(protein)) # it's a path to a real file" ] }, { "cell_type": "code", "execution_count": null, "id": "a6d094a4-6d54-4cfe-96a7-6a59feae69da", "metadata": {}, "outputs": [], "source": [ "mac1_target = Target.from_pdb(protein, target_name=\"Mac1A\")" ] }, { "cell_type": "code", "execution_count": null, "id": "a00c2bf0-975c-46ea-a03f-616074d68059", "metadata": {}, 
"outputs": [], "source": [ "# serialize to JSON\n", "mac1_target.to_json_file(\"target.json\")" ] }, { "cell_type": "code", "execution_count": null, "id": "0d295267-a88d-4034-a57c-267c8969f570", "metadata": {}, "outputs": [], "source": [ "# deserialize from JSON\n", "t2 = Target.from_json_file(\"target.json\")\n", "t2 == mac1_target" ] }, { "cell_type": "code", "execution_count": null, "id": "9dd4b029-376e-495c-90de-19c61958c139", "metadata": {}, "outputs": [], "source": [ "# also to a PDB file, only protein components included\n", "mac1_target.to_pdb(\"my_pdb.pdb\")" ] }, { "cell_type": "markdown", "id": "b1657f74-3b1e-499d-a318-bde4aec594cf", "metadata": {}, "source": [ "## Making your first Complex\n", "\n", "We have looked at `Targets` and `Ligands` now what about combining them? A complex is just that, a combination of a ligand and target object for easy handling of both small molecule and protein elements\n" ] }, { "cell_type": "code", "execution_count": null, "id": "360753cc-3932-45be-9298-ca2770269e81", "metadata": {}, "outputs": [], "source": [ "from drugforge.data.schema.complex import Complex" ] }, { "cell_type": "code", "execution_count": null, "id": "1f1337c1-a39e-4010-94fa-5b8e2383f616", "metadata": {}, "outputs": [], "source": [ "complx = Complex.from_pdb(protein, target_kwargs={\"target_name\": \"Mac1A\"}, ligand_kwargs={\"compound_name\": \"A1013\"})" ] }, { "cell_type": "code", "execution_count": null, "id": "0904d92a-17f3-48fb-ab8b-3cc07e98756e", "metadata": {}, "outputs": [], "source": [ "complx.ligand" ] }, { "cell_type": "code", "execution_count": null, "id": "82de2010-fde0-42ab-9b40-d3f1d5dc8372", "metadata": {}, "outputs": [], "source": [ "complx.target" ] }, { "cell_type": "code", "execution_count": null, "id": "4fc1a495-2a62-4f89-a650-b87c7c6a5335", "metadata": {}, "outputs": [], "source": [ "# can be serialized as one file with JSON \n", "complx.to_json_file(\"my_complex.json\")" ] }, { "cell_type": "code", "execution_count": null, "id": 
"edce296f-f4c1-41c9-9dd6-977088167818", "metadata": {}, "outputs": [], "source": [ "c2 = Complex.from_json_file(\"my_complex.json\")" ] }, { "cell_type": "code", "execution_count": null, "id": "351a2621-e80e-4dfd-9d88-2ce3f3f285ce", "metadata": {}, "outputs": [], "source": [ "c2 == complx" ] }, { "cell_type": "code", "execution_count": null, "id": "b37ee60d-17d2-48d8-8701-67a1a0d39aa4", "metadata": {}, "outputs": [], "source": [ "# you can make a combined OpenEye molecule easily\n", "complx.to_combined_oemol()" ] }, { "cell_type": "code", "execution_count": null, "id": "c0373a3c-7e56-4d7f-8671-89cc90a8c7bd", "metadata": {}, "outputs": [], "source": [ "# or save as a PDB file, protein and ligand included\n", "complx.to_pdb(\"my_complex.pdb\")" ] }, { "cell_type": "markdown", "id": "d1f7ae62-416a-4723-903a-39a1f1f2a129", "metadata": {}, "source": [ "## Summary\n", "\n", "Hopefully this has given you a nice introduction to the base level abstractions used by the `asapdiscovery` repo. Continue on to the next tutorials for more fun stuff. " ] }, { "cell_type": "code", "execution_count": null, "id": "e74160f7-3795-4c73-b10c-24ec6d310347", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 5 }