Commit afacd4f

hyanwong authored and jeromekelleher committed
Clarify mapping order in map_to_vcf_model docs
I think it is correct that the order is by individual_id, if not specified, right @benjeffery ?
1 parent 995a7be commit afacd4f

1 file changed: +8 -6 lines changed

python/tskit/trees.py

Lines changed: 8 additions & 6 deletions
@@ -11048,28 +11048,30 @@ def map_to_vcf_model(
 mapping is created by first checking if the tree sequence contains individuals.
 If it does, the mapping is created using the individuals in the tree sequence.
 By default only the sample nodes of the individuals are included in the mapping,
-unless `include_non_sample_nodes` is set to True, in which case all nodes
+unless ``include_non_sample_nodes`` is set to True, in which case all nodes
 belonging to the individuals are included. Any individuals without any nodes
 will have no nodes in their row of the mapping, being essentially of zero ploidy.
 If no individuals are present, the mapping is created using only the sample nodes
 and the specified ploidy.

 As the tskit data model allows non-integer positions, site positions and contig
 length are transformed to integer values suitable for VCF output. The
-transformation is done using the `position_transform` function, which must
+transformation is done using the ``position_transform`` function, which must
 return an integer numpy array the same dimension as the input. By default,
 this is set to ``numpy.round()`` which will round values to the nearest integer.

-If neither `name_metadata_key` nor `individual_names` is specified, the
+If neither ``name_metadata_key`` nor ``individual_names`` is specified, the
 individual names are set to ``"tsk_{individual_id}"`` for each individual. If
-no individuals are present, the individual names are set to "tsk_{i}" with
-`0 <= i < num_sample_nodes/ploidy`.
+no individuals are present, the individual names are set to ``"tsk_{i}"`` with
+``0 <= i < num_sample_nodes/ploidy``.

 A warning is emitted if any sample nodes do not have an individual ID.

 :param list individuals: Specific individual IDs to include in the VCF. If not
 specified and the tree sequence contains individuals, all individuals are
-included at least one node.
+included that are associated with least one sample node (or at least one of
+any node if ``include_non_sample_nodes`` is True), and the mapping arrays
+will be in ascending order of the ID of the individual in the tree sequence.
 :param int ploidy: The ploidy, or number of nodes per individual. Only used when
 the tree sequence does not contain individuals. Cannot be used if the tree
 sequence contains individuals. Defaults to 1 if not specified.
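
The default ordering and naming described in the revised ``individuals`` parameter text can be illustrated with a small pure-Python sketch. This is not tskit's implementation: the function ``sketch_vcf_mapping`` and its list-based inputs are hypothetical stand-ins for the node table columns, and the only assumption taken from tskit is that a node with no individual has the ID -1.

```python
def sketch_vcf_mapping(node_individual, node_is_sample, include_non_sample_nodes=False):
    """Hypothetical illustration of the documented default behaviour.

    node_individual[n] is the individual ID of node n, or -1 if the node
    belongs to no individual. node_is_sample[n] is True for sample nodes.
    """
    nodes_by_individual = {}
    for node_id, individual_id in enumerate(node_individual):
        if individual_id == -1:
            continue  # node is not attached to any individual
        if node_is_sample[node_id] or include_non_sample_nodes:
            nodes_by_individual.setdefault(individual_id, []).append(node_id)

    # Only individuals with at least one qualifying node are kept, and the
    # rows are returned in ascending order of individual ID.
    individual_ids = sorted(nodes_by_individual)
    names = [f"tsk_{i}" for i in individual_ids]
    rows = [nodes_by_individual[i] for i in individual_ids]
    return individual_ids, names, rows


# Nodes 0 and 1 belong to individual 2, nodes 2 and 3 to individual 0,
# node 4 to individual 1, and node 5 to no individual.
node_individual = [2, 2, 0, 0, 1, -1]
node_is_sample = [True, True, True, False, False, True]

ids, names, rows = sketch_vcf_mapping(node_individual, node_is_sample)
```

In this sketch individual 1 is dropped by default because its only node is not a sample, and it reappears when ``include_non_sample_nodes=True`` is passed.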
