Convert data to binary format for DATASET_BINARY
Convert various data formats to the binary format (-1, 0, 1) required by DATASET_BINARY. The function offers flexible conversion options and can detect whether the data is already in the correct format.
Usage
convert_to_binary(
  data,
  force_convert = FALSE,
  verbose = TRUE,
  negative_range = NULL,
  zero_range = NULL,
  positive_range = NULL,
  lower_inclusive = TRUE,
  upper_inclusive = TRUE,
  auto_detect = TRUE
)
Arguments
- data
A data frame with the first column as ID and subsequent columns as data to convert
- force_convert
Logical, whether to force conversion even if data appears to be already in binary format
- verbose
Logical, whether to print conversion messages
- negative_range
Numeric vector of length 2, range for converting to -1 (e.g., c(0, 0.2))
- zero_range
Numeric vector of length 2, range for converting to 0 (e.g., c(0.2, 0.8))
- positive_range
Numeric vector of length 2, range for converting to 1 (e.g., c(0.8, 1))
- lower_inclusive
Logical, whether the lower bound of the middle range (zero_range) is inclusive (default: TRUE)
- upper_inclusive
Logical, whether the upper bound of the middle range (zero_range) is inclusive (default: TRUE)
- auto_detect
Logical, whether to automatically detect data range and set ranges (default: TRUE)
Details
The function handles several data types with flexible range control:
Boolean/logical data: TRUE->1, FALSE->0
0-1 range data: Custom ranges for -1, 0, 1 conversion
Non-negative data: Custom ranges for -1, 0, 1 conversion
Already binary data: No conversion if already in -1, 0, 1 format
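The "already binary" case in the list above can be pictured with a small base R sketch. The helper name looks_binary is hypothetical and is not part of the package; the sketch assumes only what the Arguments section states, namely that the first column holds IDs and every later column holds data. How the real function performs its check is not documented here.

```r
# Hypothetical helper, not the package's implementation.
# Returns TRUE when every data column (all columns after the ID column)
# is numeric and contains only -1, 0, 1 or NA.
looks_binary <- function(data) {
  values <- unlist(data[-1], use.names = FALSE)
  is.numeric(values) && all(values %in% c(-1, 0, 1, NA))
}

already <- data.frame(ID = c("A", "B"), Asia = c(-1, 1), Europe = c(0, NA))
ratios  <- data.frame(ID = c("A", "B"), Asia = c(0.3, 0.8), Europe = c(1, 0.2))

looks_binary(already)  # TRUE: nothing to convert
looks_binary(ratios)   # FALSE: 0.3, 0.8 and 0.2 still need mapping
```

Under this sketch, logical columns are not numeric, so they count as needing conversion, in line with the TRUE->1, FALSE->0 rule.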
Range parameters work as follows:
Values in negative_range -> -1
Values in zero_range -> 0 (with inclusive/exclusive boundary control)
Values in positive_range -> 1
Values outside all ranges -> NA (with warning)
Boundary control for the middle range (zero_range):
If lower_inclusive=TRUE: lower bound is inclusive
If lower_inclusive=FALSE: lower bound is exclusive
If upper_inclusive=TRUE: upper bound is inclusive
If upper_inclusive=FALSE: upper bound is exclusive
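The range rules and the boundary control above can be sketched in base R for a single numeric vector. This illustrates the described behaviour under stated assumptions and is not the package's code: map_ranges is a hypothetical name, and the sketch assumes that an outer range receives a shared endpoint only when the middle range does not claim it, which matches the [0, 0.2), [0.2, 0.8], (0.8, 1] labels printed in the Examples.

```r
# Hypothetical sketch of the range mapping; not the package's implementation.
map_ranges <- function(x, negative_range, zero_range, positive_range,
                       lower_inclusive = TRUE, upper_inclusive = TRUE) {
  # Middle range: each bound is closed or open depending on its flag.
  above_lower <- if (lower_inclusive) x >= zero_range[1] else x > zero_range[1]
  below_upper <- if (upper_inclusive) x <= zero_range[2] else x < zero_range[2]
  in_zero <- above_lower & below_upper

  # Outer ranges are tested as closed intervals (assumption).
  in_negative <- x >= negative_range[1] & x <= negative_range[2]
  in_positive <- x >= positive_range[1] & x <= positive_range[2]

  out <- rep(NA_integer_, length(x))
  out[which(in_negative)] <- -1L
  out[which(in_positive)] <- 1L
  # Assigned last, so the middle range wins any endpoint it includes.
  out[which(in_zero)] <- 0L

  # Non-missing inputs that matched no range stay NA, with a warning.
  unmatched <- is.na(out) & !is.na(x)
  if (any(unmatched)) {
    warning(sum(unmatched), " value(s) fall outside all ranges; returning NA")
  }
  out
}

asia <- c(0, 0.3, 0.8)
map_ranges(asia, c(0, 0.2), c(0.2, 0.8), c(0.8, 1))
# -1 0 0: 0.8 belongs to the closed middle range
map_ranges(asia, c(0, 0.2), c(0.2, 0.8), c(0.8, 1), upper_inclusive = FALSE)
# -1 0 1: 0.8 now falls to the positive range
```

The first call reproduces the Asia column of the custom-range example in the Examples section.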
This allows flexible scenarios like:
Standard: negative->-1, middle->0, positive->1
Inverted: negative->0, middle->-1, positive->1
Custom: any range mapping to -1, 0, 1
Examples
# Convert 0-1 range data with default ranges
data <- data.frame(ID = c("A", "B", "C"),
                   Asia = c(0, 0.3, 0.8),
                   Europe = c(1, 0.7, 0.2))
convert_to_binary(data)
#> Converting data to binary format (-1, 0, 1)...
#>              Range  Asia Europe
#>             <char> <int>  <int>
#> 1:  [0, 0.5) -> -1     2      1
#> 2: [0.5, 0.5] -> 0     0      0
#> 3:   (0.5, 1] -> 1     1      2
#>   ID Asia Europe
#> 1  A   -1      1
#> 2  B   -1      1
#> 3  C    1     -1
# Convert with custom ranges and boundary control
convert_to_binary(data,
                  negative_range = c(0, 0.2),
                  zero_range = c(0.2, 0.8),
                  positive_range = c(0.8, 1),
                  lower_inclusive = TRUE,  # [0.2, ...]
                  upper_inclusive = TRUE)  # [..., 0.8]
#> Converting data to binary format (-1, 0, 1)...
#>              Range  Asia Europe
#>             <char> <int>  <int>
#> 1:  [0, 0.2) -> -1     1      0
#> 2: [0.2, 0.8] -> 0     2      2
#> 3:   (0.8, 1] -> 1     0      1
#>   ID Asia Europe
#> 1  A   -1      1
#> 2  B    0      0
#> 3  C    0      0
# Convert boolean data
data_bool <- data.frame(ID = c("A", "B", "C"),
                        Present = c(TRUE, FALSE, TRUE),
                        Absent = c(FALSE, TRUE, FALSE))
convert_to_binary(data_bool)
#> Converting data to binary format (-1, 0, 1)...
#>   ID Present Absent
#> 1  A       1      0
#> 2  B       0      1
#> 3  C       1      0