rfmix_reader.admix_to_bed_chromosome

rfmix_reader.admix_to_bed_chromosome(loci, rf_q, admix, sample_num, verbose=True)

Converts loci and admixture data to BED (Browser Extensible Data) format for a specific chromosome.

This function processes genetic loci data along with admixture proportions and returns a BED-format DataFrame for a specific chromosome.

Parameters:
  • loci (DataFrame) – A DataFrame containing genetic loci information. Expected to have columns for chromosome, position, and other relevant genetic markers.

  • rf_q (DataFrame) – A DataFrame containing sample and population information. Used to derive sample IDs and population names.

  • admix (Array) – A Dask Array containing admixture proportions. The shape should be compatible with the number of loci and populations.

  • sample_num (int) – The sample column to include in the data; the first population is used.

  • verbose (bool) – True for progress information; False otherwise. Default: True.

Returns:

A DataFrame (pandas or cudf) in BED-like format with columns: ‘chromosome’, ‘start’, ‘end’, and ancestry data columns.

Return type:

DataFrame

Notes

  • The function internally calls _generate_bed() to perform the actual BED formatting.

  • Column names in the output are formatted as “{sample}_{population}”.

  • The output includes data for all chromosomes present in the input loci DataFrame.

  • Large datasets may require significant processing time and disk space.
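
The column layout described above can be sketched in plain Python. This is only an illustration of the naming pattern, not the library's implementation: the sample IDs, population names, and the sample-major ordering of the ancestry columns are all assumptions made for the example.

```python
# Hypothetical sample IDs and population names, for illustration only.
sample_ids = ["sample1", "sample2"]
populations = ["AFR", "EUR"]

# Leading BED-like columns named in the Returns section.
bed_columns = ["chromosome", "start", "end"]

# One ancestry column per sample/population pair, named "{sample}_{population}".
# The sample-major ordering used here is an assumption.
ancestry_columns = [f"{s}_{p}" for s in sample_ids for p in populations]

header = bed_columns + ancestry_columns
print(header)
```

In the real output, sample IDs and population names are derived from rf_q rather than hard-coded.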

Example

>>> from rfmix_reader import read_rfmix, admix_to_bed_chromosome
>>> loci, rf_q, admix = read_rfmix(prefix_path)
>>> admix_to_bed_chromosome(loci, rf_q, admix, sample_num=0)