dnarecords.helper
DNARecords helper utilities.
Module Contents
Classes
DNARecordsUtils: Utility class to provide common functionalities used in other modules.
- class dnarecords.helper.DNARecordsUtils[source]
Utility class to provide common functionalities used in other modules.
- static spark_session() -> pyspark.sql.SparkSession [source]
Gets the current Spark session, or builds a new one if none exists.
Ensures sparktfrecord libraries are available in the session.
- Returns
a spark session with sparktfrecord libraries available.
- Return type
SparkSession
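A minimal usage sketch, assuming the `dnarecords` package and its Spark dependencies are installed. The wrapper name `show_spark_version` is hypothetical and only illustrates that the returned object is a regular `SparkSession`.

```python
def show_spark_version() -> str:
    """Return the Spark version of the session provided by DNARecordsUtils."""
    # Imported lazily so this sketch can be loaded without dnarecords installed.
    from dnarecords.helper import DNARecordsUtils

    # Reuses the current session if there is one; otherwise a new one is built.
    spark = DNARecordsUtils.spark_session()
    return spark.version
```

Because the method is static, no `DNARecordsUtils` instance is needed.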
- static init_hail() -> ModuleType [source]
Initializes Hail, ensuring sparktfrecord libraries are available in the session.
- Returns
the hail module (with Hail initialized).
- Return type
ModuleType
- static dnarecords_tree(dnarecords_path) -> Dict[str, str] [source]
DNARecords directory structure.
Gets a dictionary with the full structure of a DNARecords dataset given a root path.
swrec -> <dnarecords_path>/data/swrec (sample wise dna tfrecords)
vwrec -> <dnarecords_path>/data/vwrec (variant wise dna tfrecords)
swpar -> <dnarecords_path>/data/swpar (sample wise dna parquet files)
vwpar -> <dnarecords_path>/data/vwpar (variant wise dna parquet files)
skeys -> <dnarecords_path>/meta/skeys (sample wise key mapping)
vkeys -> <dnarecords_path>/meta/vkeys (variant wise key mapping)
swpfs -> <dnarecords_path>/meta/swpfs (sample wise parquet files index)
vwpfs -> <dnarecords_path>/meta/vwpfs (variant wise parquet files index)
swrfs -> <dnarecords_path>/meta/swrfs (sample wise tfrecords index)
vwrfs -> <dnarecords_path>/meta/vwrfs (variant wise tfrecords index)
swpsc -> <dnarecords_path>/meta/swpsc (sample wise parquet schema)
vwpsc -> <dnarecords_path>/meta/vwpsc (variant wise parquet schema)
swrsc -> <dnarecords_path>/meta/swrsc (sample wise tfrecord schema)
vwrsc -> <dnarecords_path>/meta/vwrsc (variant wise tfrecord schema)
- Returns
a dictionary with the structure of the DNARecords dataset.
- Return type
Dict[str,str]
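The mapping documented for `dnarecords_tree` can be sketched in plain Python. This is an illustrative stand-in, not the library's implementation: the function name `sketch_dnarecords_tree` and the two key tuples are hypothetical, and the paths are joined exactly as written in the listing, with no normalization of the root path.

```python
from typing import Dict

# Keys stored under <dnarecords_path>/data (hypothetical helper constant).
DATA_KEYS = ("swrec", "vwrec", "swpar", "vwpar")
# Keys stored under <dnarecords_path>/meta (hypothetical helper constant).
META_KEYS = (
    "skeys", "vkeys",
    "swpfs", "vwpfs",
    "swrfs", "vwrfs",
    "swpsc", "vwpsc",
    "swrsc", "vwrsc",
)


def sketch_dnarecords_tree(dnarecords_path: str) -> Dict[str, str]:
    """Build the key -> path mapping described in the documentation."""
    tree = {key: f"{dnarecords_path}/data/{key}" for key in DATA_KEYS}
    tree.update({key: f"{dnarecords_path}/meta/{key}" for key in META_KEYS})
    return tree


tree = sketch_dnarecords_tree("/tmp/my_dataset")
print(tree["swrec"])  # /tmp/my_dataset/data/swrec
print(tree["vkeys"])  # /tmp/my_dataset/meta/vkeys
```

With the library installed, the equivalent call would be `DNARecordsUtils.dnarecords_tree("/tmp/my_dataset")`, where `/tmp/my_dataset` is a placeholder root path.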