drugforge.data.util.utils

Functions

cdd_to_schema(cdd_csv[, out_json, out_csv])

Convert a CDD-downloaded and filtered CSV file into a JSON file containing

cdd_to_schema_pair(cdd_csv[, out_json, out_csv])

Convert a CDD-downloaded and filtered CSV file into a JSON file containing an EnantiomerPairList.

cdd_to_schema_v2(cdd_csv, target_prop[, ...])

Convert a CDD-downloaded and filtered CSV file into a JSON file containing a list[ExperimentalCompoundData].

check_empty_dataframe(df[, logger, fail, ...])

check_filelist_has_elements(filelist[, tag])

Check that a glob or list of files actually contains some elements - if not, raise an error.
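The check described above can be sketched as follows. This is only an illustration, not the library's implementation: the name `check_filelist_sketch`, the choice of `ValueError`, and the message wording are all assumptions.

```python
def check_filelist_sketch(filelist, tag="files"):
    """Hypothetical stand-in for check_filelist_has_elements."""
    # A glob may be a lazy generator (e.g. pathlib.Path.glob), so
    # materialize it before counting its elements.
    files = list(filelist)
    if len(files) == 0:
        # Assumption: the real function's error type and message may differ.
        raise ValueError(f"No {tag} found.")
    return files
```

Returning the materialized list is a convenience of this sketch, so that a generator is not consumed by the check alone.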

check_name_length_and_truncate(name[, ...])

combine_files(paths, output_file)

construct_regex_function(pat[, fail_val, ...])

Construct a function that searches for the given regex pattern, either returning fail_val or raising an error if no match is found.
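A minimal sketch of this closure pattern, assuming that leaving `fail_val` as `None` means "raise on no match". The name `make_regex_searcher` is hypothetical, and the real function takes further parameters that are elided in the signature above and not reproduced here.

```python
import re


def make_regex_searcher(pat, fail_val=None):
    """Hypothetical illustration of construct_regex_function's behavior."""
    regex = re.compile(pat)  # compile once, reuse on every call

    def searcher(s):
        match = regex.search(s)
        if match:
            return match.group()
        # Assumption: fail_val=None selects the "raise an error" behavior.
        if fail_val is None:
            raise ValueError(f"No match found for pattern {pat!r} in {s!r}.")
        return fail_val

    return searcher
```

Under these assumptions, `make_regex_searcher(r"[A-Z]{3}-\d+", fail_val="")` returns a function that yields the matched text, or an empty string when nothing matches.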

download_file(url, path)

Download a file and save it locally.

extract_compounds_from_filenames(fn_list, ...)

Extract a list of (xtal, compound_id) from fn_list.

filter_molecules_dataframe(mol_df[, ...])

Filter a dataframe of molecules to retain those specified.

get_path_string(module)

Get the absolute path as a string to an imported module.

get_sdf_fn_from_dataset(dataset, fragalysis_dir)

is_valid_smiles(smiles)

parse_fluorescence_data_cdd(mol_df[, ...])

Filter a dataframe of molecules to retain those specified. Required columns are:

seqres_to_res_list(seqres_str)

See https://www.wwpdb.org/documentation/file-format-content/format33/sect3.html#SEQRES for the SEQRES record format. Parameters: seqres_str.
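For orientation, the SEQRES record layout in the linked specification places residue names from column 20 onward. The sketch below extracts them from a block of SEQRES lines. It is an assumption about what a residue list looks like, not the library's code: the name `seqres_residues_sketch` is hypothetical, and chains are not separated or filtered.

```python
def seqres_residues_sketch(seqres_str):
    """Hypothetical parser: collect residue names from SEQRES records."""
    residues = []
    for line in seqres_str.splitlines():
        # Skip anything that is not a SEQRES record.
        if not line.startswith("SEQRES"):
            continue
        # Columns 1-19 hold the record name, serial number, chain ID and
        # residue count; residue names start at column 20 (index 19).
        residues.extend(line[19:].split())
    return residues
```

The real function may handle chain IDs differently, for example by keeping only one chain.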

strip_smiles_salts(smiles)

Strip salts from a SMILES string.