rfmix_reader.read_rfmix

rfmix_reader.read_rfmix(file_prefix, binary_dir='./binary_files', generate_binary=False, verbose=True)

Read RFMix files into data frames and a Dask array.

Parameters:
  • file_prefix (str) – Path prefix to the set of RFMix files. All chromosomes are loaded at once.

  • binary_dir (str, optional) – Path prefix to the binary version of RFMix (*fb.tsv) files. Default is “./binary_files”.

  • generate_binary (bool, optional) – True to generate the binary files. Default is False.

  • verbose (bool, optional) – True for progress information; False otherwise. Default is True.

  • bed_format (bool, optional) – True to output the haplotypes in BED format. Default is False.
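A minimal usage sketch based only on the signature and parameters listed above. The wrapper name load_local_ancestry is hypothetical, and the import sits inside the function so the sketch can be defined even where rfmix_reader is not installed. Whether the binary files need regenerating on each call is not stated here, so generate_binary=True is an assumption suited to a first run.

```python
def load_local_ancestry(file_prefix, binary_dir="./binary_files"):
    """Hypothetical thin wrapper around rfmix_reader.read_rfmix."""
    # Lazy import: only required when the function is actually called.
    from rfmix_reader import read_rfmix

    loci, rf_q, admix = read_rfmix(
        file_prefix,
        binary_dir=binary_dir,
        generate_binary=True,  # assumption: first run, binary files not built yet
        verbose=True,
    )
    return loci, rf_q, admix
```

Calling the wrapper with the path prefix of an RFMix output set should return the three objects described under Returns.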

Return type:

Tuple[DataFrame, DataFrame, Array]

Returns:

  • loci (pandas.DataFrame) – Loci information for the FB data.

  • rf_q (pandas.DataFrame) – Global ancestry by chromosome from RFMix.

  • admix (dask.array.Array) – Local ancestry per population (columns pop1*nsamples … popX*nsamples). Columns follow the order of the populations in rf_q.
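The column layout of admix can be illustrated with a small index calculation. This is a sketch that assumes the columns form one contiguous block of nsamples per population, as the description above suggests; the helper name, population labels, and sample count are hypothetical.

```python
def column_to_population(col, populations, n_samples):
    """Map an admix column index to (population, sample index).

    Assumes columns are grouped into one block of n_samples per population,
    in the same population order as rf_q.
    """
    if not 0 <= col < len(populations) * n_samples:
        raise IndexError(f"column {col} is out of range")
    pop_idx, sample_idx = divmod(col, n_samples)
    return populations[pop_idx], sample_idx


populations = ["AFR", "EUR"]  # hypothetical population labels
n_samples = 3                 # hypothetical number of samples
layout = [column_to_population(c, populations, n_samples) for c in range(6)]
print(layout)
```

Under this assumption, the first three columns belong to the first population and the next three to the second.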

Notes

Local ancestry output will be either 0, 1, 2, or math.nan:

  • 0 – No alleles are associated with this ancestry

  • 1 – One allele is associated with this ancestry

  • 2 – Both alleles are associated with this ancestry
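Because the values count alleles (0, 1, or 2) and may be math.nan, any summary should skip the missing entries. The sketch below works on a plain Python list standing in for one sample's values for a single ancestry; the function name and the example values are hypothetical, and it is not part of rfmix_reader.

```python
import math


def ancestry_fraction(dosages):
    """Fraction of alleles assigned to one ancestry, ignoring NaN entries."""
    called = [d for d in dosages if not math.isnan(d)]
    if not called:
        return math.nan  # nothing to summarise
    # Each non-missing locus contributes two alleles.
    return sum(called) / (2 * len(called))


dosages = [0, 1, 2, math.nan, 2]  # hypothetical values along a chromosome
fraction = ancestry_fraction(dosages)
print(fraction)
```

For a real Dask array, the same idea would apply along the locus axis, with NaN-aware reductions in place of the list comprehension.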