drugforge.spectrum.seq_alignment.Alignment

class drugforge.spectrum.seq_alignment.Alignment(blast_match: DataFrame, query: str, dir_save: Path)

Bases: object

__init__(blast_match: DataFrame, query: str, dir_save: Path)

An alignment object

Parameters:

blast_match (pd.DataFrame) – DataFrame with BLAST results
query (str) – Descriptor of query sequence
dir_save (Path) – Path for directory where results will be saved

Methods

`__init__`(blast_match, query, dir_save)	An alignment object
`csv_align_data`(input_alignment, output_file, ...)
`fasta_align_data`(input_alignment, output_file)	Modify sequences in multi-seq alignment to remove gap characters '-'
`multi_seq_alignment`(alignment_file)
`select_checkbox`()
`select_keyword`(match_string, selection_file)
`select_taxonomy`(match_string, selection_file)
`view_alignment`([fontsize, plot_width, ...])	Bokeh sequence alignment view

static fasta_align_data(input_alignment, output_file)

Modify sequences in multi-seq alignment to remove gap characters ‘-’
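The gap-removal step described above can be illustrated with a minimal, standard-library sketch. This is not the drugforge implementation: the helper name `strip_gaps` is hypothetical, and it assumes the alignment is plain FASTA text in which header lines start with `>` and every other line holds sequence characters.

```python
def strip_gaps(fasta_text: str) -> str:
    """Remove '-' gap characters from sequence lines of FASTA text.

    Hypothetical helper for illustration only; header lines (starting
    with '>') are passed through unchanged.
    """
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            cleaned.append(line)                   # keep the header as-is
        else:
            cleaned.append(line.replace("-", ""))  # drop alignment gaps
    return "\n".join(cleaned) + "\n"


aligned = ">seq1\nMK-TA--YI\n>seq2\nMKQTAGGYI\n"
print(strip_gaps(aligned))
```

Note that after gaps are removed the sequences are generally no longer the same length, so the output is a set of unaligned sequences rather than an alignment.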

view_alignment(fontsize='11pt', plot_width=800, file_name='alignment', color_by_group=False, start_idx=0, skip=4, max_mismatch=2, reorder='')

Bokeh sequence alignment view. From: https://dmnfarrell.github.io/bioinformatics/bokeh-sequence-aligner

Parameters:

fontsize (str, optional) – Size of amino acid one-letter IDs, by default “11pt”
plot_width (int, optional) – Width of alignment plot, by default 800
file_name (str, optional) – Suffix for HTML file, by default “alignment”
color_by_group (bool, optional) – View mode where matching amino acids are colored, by default False
start_idx (int, optional) – Index of first amino acid of reference sequence, by default 0
skip (int, optional) – Skip for displayed indexes of reference sequence, by default 4
max_mismatch (int, optional) – How many mismatches are tolerated for highlighted group match, by default 2

Returns:

Bokeh Column of layouts and the path to the saved HTML file.

Return type:

(bokeh.Column, str)
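One plausible reading of how `start_idx` and `skip` interact is sketched below: every `skip`-th column of the reference sequence gets an index label, and labels are offset by `start_idx`. This is an assumption based only on the parameter descriptions, not on the drugforge source, and the function name `reference_tick_labels` is hypothetical.

```python
def reference_tick_labels(seq_length: int, start_idx: int = 0, skip: int = 4):
    """Return (column positions, displayed labels) for a reference sequence.

    Illustrative assumption: a label is shown every `skip` columns, and the
    first residue is numbered `start_idx`.
    """
    positions = list(range(0, seq_length, skip))   # columns that get a label
    labels = [start_idx + p for p in positions]    # residue numbers shown
    return positions, labels


positions, labels = reference_tick_labels(10, start_idx=100, skip=4)
print(positions)  # [0, 4, 8]
print(labels)     # [100, 104, 108]
```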