rfmix_reader.get_prefixes

rfmix_reader.get_prefixes(file_prefix, verbose=True)

Retrieve and clean file prefixes for specified file types.

This function searches for files with a given prefix, cleans the prefixes, and constructs a list of dictionaries mapping specific file types to their corresponding file paths.

Parameters:
  • file_prefix (str) – The prefix used to identify relevant files. This can be a directory or a common prefix for the files.

  • verbose (bool) – If True, print progress information; if False, run silently. Default: True.

Returns:

A list of dictionaries where each dictionary maps file types (e.g., “fb.tsv”, “rfmix.Q”) to their corresponding file paths.

Return type:

list of dict

Raises:

FileNotFoundError – If no files matching the given prefix are found.

Example

Given a directory structure:

/data/
    chr1.fb.tsv
    chr1.rfmix.Q
    chr2.fb.tsv
    chr2.rfmix.Q

Calling get_prefixes("/data/") will return:

[
    {'fb.tsv': '/data/chr1.fb.tsv', 'rfmix.Q': '/data/chr1.rfmix.Q'},
    {'fb.tsv': '/data/chr2.fb.tsv', 'rfmix.Q': '/data/chr2.rfmix.Q'}
]
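Because the return value is a plain list of dictionaries, it can be consumed with ordinary list and dict operations. The snippet below is a small illustration that hard-codes the result from the example above instead of calling the library; the variable names file_sets, fb_paths, and q_paths are chosen here for illustration only.

```python
# Return value copied from the example above (not produced by a live call).
file_sets = [
    {"fb.tsv": "/data/chr1.fb.tsv", "rfmix.Q": "/data/chr1.rfmix.Q"},
    {"fb.tsv": "/data/chr2.fb.tsv", "rfmix.Q": "/data/chr2.rfmix.Q"},
]

# Collect the paths for each file type, preserving the order of the list.
fb_paths = [entry["fb.tsv"] for entry in file_sets]
q_paths = [entry["rfmix.Q"] for entry in file_sets]

for fb_path, q_path in zip(fb_paths, q_paths):
    print(fb_path, q_path)
```

Each dictionary holds the paths that share one prefix, so the two lists stay aligned by index.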

Notes

  • This function assumes that the files follow a naming convention where the prefix is followed by a file type extension associated with RFMix (e.g., “.fb.tsv”, “.rfmix.Q”).

  • The function uses the glob module to search for files and the Path class from the pathlib module for path manipulations.
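The naming convention and the glob-based search described in these notes can be sketched roughly as follows. This is a hypothetical stand-in, not the library's implementation: the function name sketch_get_prefixes, the SUFFIXES tuple, and the directory-versus-prefix handling are all assumptions made for illustration, and the real helper _clean_prefixes may behave differently.

```python
from glob import glob
from pathlib import Path

# Assumed RFMix output suffixes, taken from the examples on this page.
SUFFIXES = ("fb.tsv", "rfmix.Q")


def sketch_get_prefixes(file_prefix):
    """Hypothetical sketch of prefix discovery; not the library's code."""
    base = Path(file_prefix)
    # A directory means "every file inside it"; otherwise match by common prefix.
    pattern = str(base / "*") if base.is_dir() else f"{file_prefix}*"

    prefixes = set()
    for name in glob(pattern):
        for suffix in SUFFIXES:
            if name.endswith("." + suffix):
                # Strip ".<suffix>" to recover the shared prefix.
                prefixes.add(name[: -len(suffix) - 1])

    if not prefixes:
        raise FileNotFoundError(f"No files found for prefix: {file_prefix}")

    # One dictionary per prefix, mapping each file type to its path.
    return [
        {suffix: f"{prefix}.{suffix}" for suffix in SUFFIXES}
        for prefix in sorted(prefixes)
    ]
```

The sketch builds a path for every suffix without checking that each file exists, so a prefix with only one of the two files would still yield both keys. Whether the library validates this is not stated here.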

Dependencies

  • pathlib.Path

  • glob.glob

  • os.path.join

  • _clean_prefixes: A helper function to clean and sort file prefixes.