drugforge.data.util.utils.extract_compounds_from_filenames

drugforge.data.util.utils.extract_compounds_from_filenames(fn_list, xtal_pat, compound_pat, fail_val=None)

Extract a list of (xtal, compound_id) tuples from the filenames in fn_list.

Parameters:
  • fn_list (List[str]) – List of filenames

  • xtal_pat (Union[str, function]) – Regex pattern or function for extracting the crystal structure ID from a filename. If a function is passed, it is expected to return a single str giving the xtal name.

  • compound_pat (Union[str, function]) – Regex pattern or function for extracting the compound ID from a filename. If a function is passed, it is expected to return a single str giving the compound_id.

  • fail_val (str, optional) – Value to return from a regex search when no match is found. If None (default), a ValueError is raised from the search instead.

Returns:

List of (xtal, compound_id) tuples

Return type:

List[Tuple[str, str]]
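
The behavior described above can be illustrated with a minimal, self-contained sketch. This is not the drugforge implementation: `extract_compounds_sketch` and `_make_extractor` are hypothetical stand-ins, the filenames and regex patterns are invented for illustration, and returning the whole regex match (group 0) is an assumption, since the documentation does not say which part of the match is returned.

```python
import re
from typing import Callable, List, Optional, Tuple, Union

# A pattern argument is either a regex string or a callable taking a filename.
Extractor = Union[str, Callable[[str], str]]


def _make_extractor(pat: Extractor, fail_val: Optional[str]) -> Callable[[str], str]:
    """Hypothetical helper: turn a regex or a callable into a filename -> str function."""
    if callable(pat):
        # A callable is used as-is and is expected to return a single str.
        return pat

    compiled = re.compile(pat)

    def extract(fn: str) -> str:
        match = compiled.search(fn)
        if match:
            # Assumption: the whole match is the extracted ID.
            return match.group(0)
        if fail_val is None:
            raise ValueError(f"No match for {pat!r} in {fn!r}")
        return fail_val

    return extract


def extract_compounds_sketch(
    fn_list: List[str],
    xtal_pat: Extractor,
    compound_pat: Extractor,
    fail_val: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Sketch of the documented behavior: one (xtal, compound_id) tuple per filename."""
    xtal_func = _make_extractor(xtal_pat, fail_val)
    compound_func = _make_extractor(compound_pat, fail_val)
    return [(xtal_func(fn), compound_func(fn)) for fn in fn_list]


# Invented filenames and patterns, for illustration only.
fns = [
    "Mpro-P0008_0A_ERI-UCB-ce40166b-17_prepped.pdb",
    "unrelated_file.pdb",
]
xtal_regex = r"Mpro-[A-Za-z0-9]+_\d[A-Z]"
compound_regex = r"[A-Z]{3}-[A-Z]{3}-[0-9a-f]{8}-\d+"

pairs = extract_compounds_sketch(fns, xtal_regex, compound_regex, fail_val="NA")
print(pairs)
# [('Mpro-P0008_0A', 'ERI-UCB-ce40166b-17'), ('NA', 'NA')]
```

In this sketch, leaving fail_val as None makes the second filename raise a ValueError rather than yield placeholder values. Passing a function such as `lambda fn: fn.split("_")[0]` in place of either pattern bypasses the regex search entirely.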