| Title: | Jensen-Shannon Divergence Estimation, Confidence Intervals, and Distribution Plots |
|---|---|
| Description: | Estimates Jensen-Shannon divergence (JSD) for quantifying distributional differences between two groups on a given variable. Supports both continuous and discrete variables, with tools for point estimation, bootstrap confidence intervals, and visualization of raw group-specific distributions. |
| Authors: | Yueqin Hu [aut, cre], Yiran Zhou [aut], Wenjuan Liu [aut] |
| Maintainer: | Yueqin Hu <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.0 |
| Built: | 2026-06-01 08:34:55 UTC |
| Source: | https://github.com/cran/jsdtools |
Fixed integration range for continuous JSD
fixed_range(x, y, qrange = c(0.001, 0.999), extend = 3)

| Argument | Description |
|---|---|
| x | Numeric vector for group 1. |
| y | Numeric vector for group 2. |
| qrange | Quantile range used to determine the main data span. |
| extend | Extension multiplier based on IQR. |

Named numeric vector with elements 'L' and 'U'.
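The arguments suggest a range built from quantiles of the data and widened by a multiple of the IQR. A minimal sketch under that assumption follows; the helper name `fixed_range_sketch`, the pooling of both groups, and the exact widening rule are hypothetical and not taken from the package source.

```r
# Hypothetical sketch: the exact rule used by fixed_range() is an assumption.
fixed_range_sketch <- function(x, y, qrange = c(0.001, 0.999), extend = 3) {
  pooled <- c(x, y)
  pooled <- pooled[is.finite(pooled)]
  # Main data span from the pooled quantiles
  q <- stats::quantile(pooled, probs = qrange, names = FALSE)
  # Widen both ends by a multiple of the pooled IQR (assumed rule)
  pad <- extend * stats::IQR(pooled)
  c(L = q[1] - pad, U = q[2] + pad)
}

set.seed(1)
rng <- fixed_range_sketch(rnorm(200), rnorm(200, mean = 1))
rng
```

Fixing one range for both groups means the two densities can later be evaluated and integrated over the same interval.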
Unified front-end for JSD estimation for continuous and discrete variables.
jsd(x, y, type = c("auto", "continuous", "discrete"), base = 2, ...)

| Argument | Description |
|---|---|
| x | First vector. |
| y | Second vector. |
| type | One of '"auto"', '"continuous"', or '"discrete"'. |
| base | Logarithm base. Defaults to 2. Use 'exp(1)' for nats. |
| ... | Additional arguments passed to the type-specific estimator. |

An object of class '"jsd_estimate"'.
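For reference, the standard definition of the quantity being estimated can be written as below, with the logarithm base $b$ playing the role of the 'base' argument. The symbols $P$, $Q$, $M$, and $b$ are notation introduced here, not taken from the package documentation.

```latex
\mathrm{JSD}_b(P \,\|\, Q)
  = \tfrac{1}{2}\,\mathrm{KL}_b(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}_b(Q \,\|\, M),
  \qquad M = \tfrac{1}{2}(P + Q),

\mathrm{KL}_b(P \,\|\, M) = \sum_i p_i \log_b \frac{p_i}{m_i}
  \quad \text{(an integral over densities in the continuous case)},

0 \le \mathrm{JSD}_b(P \,\|\, Q) \le \log_b 2,
  \qquad \mathrm{JSD}_e = \mathrm{JSD}_2 \cdot \ln 2 .
```

With the default base of 2 the divergence lies between 0 and 1 bit; in nats the upper bound is $\ln 2 \approx 0.693$.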
Unified front-end for JSD confidence interval estimation for continuous and discrete variables.
jsd_ci(x, y, type = c("auto", "continuous", "discrete"), B = 1000, conf_level = 0.95, base = 2, seed = NULL, ...)

| Argument | Description |
|---|---|
| x | First vector. |
| y | Second vector. |
| type | One of '"auto"', '"continuous"', or '"discrete"'. |
| B | Number of bootstrap replicates. |
| conf_level | Confidence level. Defaults to 0.95. |
| base | Logarithm base. Defaults to 2. Use 'exp(1)' for nats. |
| seed | Optional random seed. |
| ... | Additional arguments passed to the type-specific bootstrap estimator. |

An object of class '"jsd_ci"'.
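The roles of 'B', 'conf_level', and 'seed' can be illustrated with a generic two-sample percentile bootstrap. This is only a sketch: the manual does not say which interval type the package computes, so the percentile method, the independent resampling of each group, and the names `boot_ci_sketch` and `abs_mean_diff` are all assumptions.

```r
# Hypothetical sketch of a percentile bootstrap; not the package's code.
boot_ci_sketch <- function(x, y, stat, B = 1000, conf_level = 0.95, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Resample each group independently, with replacement
  reps <- replicate(B, stat(sample(x, replace = TRUE), sample(y, replace = TRUE)))
  alpha <- 1 - conf_level
  list(
    estimate = stat(x, y),
    ci = stats::quantile(reps, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
  )
}

# Toy statistic standing in for a JSD estimator
abs_mean_diff <- function(a, b) abs(mean(a) - mean(b))

set.seed(42)
x <- rnorm(100)
y <- rnorm(100, mean = 0.5)
res <- boot_ci_sketch(x, y, abs_mean_diff, B = 500, seed = 1)
res$ci
```

Passing the same seed reproduces the same interval, which is the usual reason for exposing a seed argument on a resampling routine.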
Computes Jensen-Shannon divergence (JSD) between two numeric vectors using kernel density estimation (KDE) and numerical integration.
jsd_continuous(x, y, L = NULL, U = NULL, base = 2, bw = "nrd0", kernel = "gaussian", grid_n = 4096, qrange = c(0.001, 0.999), extend = 3, eps = 1e-12, renormalize = TRUE, na_rm = TRUE)

| Argument | Description |
|---|---|
| x | Numeric vector for group 1. |
| y | Numeric vector for group 2. |
| L | Optional lower integration bound. |
| U | Optional upper integration bound. |
| base | Logarithm base. Defaults to 2. Use 'exp(1)' for nats. |
| bw | Bandwidth passed to [stats::density()]. |
| kernel | Kernel passed to [stats::density()]. |
| grid_n | Number of grid points used for KDE. |
| qrange | Quantile range used when 'L' and 'U' are not supplied. |
| extend | Extension multiplier for the automatically chosen range. |
| eps | Small constant for numerical stability. |
| renormalize | Logical; renormalize estimated densities over the grid? |
| na_rm | Logical; remove missing values? |

An object of class '"jsd_estimate"'.
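The description above (KDE followed by numerical integration) can be sketched in base R as follows. This is an illustration of the general technique, not the package's implementation: the function name `jsd_kde_sketch`, the trapezoidal rule, and the way 'eps' enters the logarithm are assumptions.

```r
# Hypothetical sketch of KDE-based JSD; not the package's implementation.
jsd_kde_sketch <- function(x, y, L, U, base = 2, grid_n = 4096, eps = 1e-12) {
  # Evaluate both KDEs on the same equally spaced grid over [L, U]
  dx <- stats::density(x, bw = "nrd0", kernel = "gaussian", n = grid_n, from = L, to = U)
  dy <- stats::density(y, bw = "nrd0", kernel = "gaussian", n = grid_n, from = L, to = U)
  grid <- dx$x
  # Trapezoidal rule on the shared grid (assumed integration scheme)
  trapz <- function(f) sum(diff(grid) * (f[-length(f)] + f[-1]) / 2)
  # Renormalize so each density integrates to 1 over [L, U]
  p <- dx$y / trapz(dx$y)
  q <- dy$y / trapz(dy$y)
  m <- (p + q) / 2
  # eps guards against log(0) where a density underflows to zero
  kl <- function(a) trapz(a * log((a + eps) / (m + eps), base = base))
  0.5 * kl(p) + 0.5 * kl(q)
}

set.seed(1)
x <- rnorm(500)
y <- rnorm(500, mean = 2)
jsd_same <- jsd_kde_sketch(x, x, L = -6, U = 8)  # identical samples
jsd_diff <- jsd_kde_sketch(x, y, L = -6, U = 8)  # shifted samples
c(jsd_same, jsd_diff)
```

Evaluating both densities on one grid is what makes the pointwise mixture `m` well defined; this is the reason a common range 'L', 'U' is needed.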
Bootstrap confidence interval for continuous JSD
jsd_continuous_ci(x, y, B = 1000, conf_level = 0.95, L = NULL, U = NULL, base = 2, bw = "nrd0", kernel = "gaussian", grid_n = 4096, qrange = c(0.001, 0.999), extend = 3, eps = 1e-12, renormalize = TRUE, seed = NULL, na_rm = TRUE, na_rm_failed = TRUE)

| Argument | Description |
|---|---|
| x | Numeric vector for group 1. |
| y | Numeric vector for group 2. |
| B | Number of bootstrap replicates. |
| conf_level | Confidence level. Defaults to 0.95. |
| L | Optional lower integration bound. |
| U | Optional upper integration bound. |
| base | Logarithm base. Defaults to 2. Use 'exp(1)' for nats. |
| bw | Bandwidth passed to [stats::density()]. |
| kernel | Kernel passed to [stats::density()]. |
| grid_n | Number of grid points used for KDE. |
| qrange | Quantile range used when 'L' and 'U' are not supplied. |
| extend | Extension multiplier for the automatically chosen range. |
| eps | Small constant for numerical stability. |
| renormalize | Logical; renormalize estimated densities over the grid? |
| seed | Optional random seed. |
| na_rm | Logical; remove missing values? |
| na_rm_failed | Logical; drop failed bootstrap draws when summarizing? |

An object of class '"jsd_ci"'.
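The 'na_rm_failed' argument implies that individual bootstrap draws can fail. One common way to handle this, shown here as a sketch, is to record a failed draw as NA and summarize over the successful draws only. The helper `safe_stat`, the toy statistic `fragile_sd_ratio`, and the reading of 'na_rm_failed = TRUE' as "drop NA draws" are assumptions.

```r
# Hypothetical sketch of tolerating failed bootstrap draws; not the package's code.
safe_stat <- function(stat, ...) {
  # Record NA instead of stopping when a draw errors
  tryCatch(stat(...), error = function(e) NA_real_)
}

# Toy statistic that fails when a resample has zero variance
fragile_sd_ratio <- function(a, b) {
  if (stats::sd(a) == 0 || stats::sd(b) == 0) stop("degenerate resample")
  stats::sd(a) / stats::sd(b)
}

set.seed(7)
x <- c(1, 1, 1, 2)  # small sample, so constant resamples are likely
y <- c(3, 4, 5, 6)
reps <- replicate(200, safe_stat(fragile_sd_ratio,
                                 sample(x, replace = TRUE),
                                 sample(y, replace = TRUE)))
n_failed <- sum(is.na(reps))
# Summarize over successful draws only
ci <- stats::quantile(reps, probs = c(0.025, 0.975), na.rm = TRUE, names = FALSE)
c(n_failed = n_failed, lower = ci[1], upper = ci[2])
```

Without `na.rm = TRUE`, `stats::quantile()` stops with an error when any replicate is NA, so a single failed draw would otherwise abort the summary.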
Computes Jensen-Shannon divergence (JSD) between two discrete variables using empirical probability mass functions.
jsd_discrete(x, y, support = NULL, base = 2, eps = 1e-12, add_smoothing = FALSE, na_rm = TRUE)

| Argument | Description |
|---|---|
| x | Vector for group 1. Can be numeric, factor, character, or logical. |
| y | Vector for group 2. Can be numeric, factor, character, or logical. |
| support | Optional support values. If 'NULL', the union of observed values in 'x' and 'y' is used. |
| base | Logarithm base. Defaults to 2. Use 'exp(1)' for nats. |
| eps | Small constant for numerical stability. |
| add_smoothing | Logical; add 1 to each cell count? |
| na_rm | Logical; remove missing values? |

An object of class '"jsd_estimate"'.
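The discrete estimator described above can be sketched in a few lines of base R: tabulate both groups on a common support, optionally add 1 to each cell, convert to proportions, and average the two KL terms against the mixture. The function name `jsd_pmf_sketch` and the way 'eps' enters the logarithm are assumptions; the package's actual code may differ.

```r
# Hypothetical sketch of JSD from empirical PMFs; not the package's implementation.
jsd_pmf_sketch <- function(x, y, support = NULL, base = 2, eps = 1e-12,
                           add_smoothing = FALSE) {
  x <- as.character(x[!is.na(x)])
  y <- as.character(y[!is.na(y)])
  # Default support: union of observed values
  if (is.null(support)) support <- sort(unique(c(x, y)))
  support <- as.character(support)
  cx <- as.numeric(table(factor(x, levels = support)))
  cy <- as.numeric(table(factor(y, levels = support)))
  if (add_smoothing) {  # add-one (Laplace) smoothing
    cx <- cx + 1
    cy <- cy + 1
  }
  p <- cx / sum(cx)
  q <- cy / sum(cy)
  m <- (p + q) / 2
  # eps keeps the ratio finite; cells with zero mass contribute 0
  kl <- function(a) sum(a * log((a + eps) / (m + eps), base = base))
  0.5 * kl(p) + 0.5 * kl(q)
}

g1 <- c("a", "a", "b", "c")
g2 <- c("a", "b", "b", "c")
jsd_equal    <- jsd_pmf_sketch(g1, g1)                        # identical groups
jsd_partial  <- jsd_pmf_sketch(g1, g2)                        # overlapping groups
jsd_disjoint <- jsd_pmf_sketch(c("a", "a"), c("b", "b"))      # no shared values
c(jsd_equal, jsd_partial, jsd_disjoint)
```

In base 2 the result runs from 0 for identical empirical distributions up to about 1 when the two groups share no values.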
Bootstrap confidence interval for discrete JSD
jsd_discrete_ci(x, y, B = 1000, conf_level = 0.95, support = NULL, base = 2, eps = 1e-12, add_smoothing = FALSE, seed = NULL, na_rm = TRUE, na_rm_failed = TRUE)

| Argument | Description |
|---|---|
| x | Vector for group 1. |
| y | Vector for group 2. |
| B | Number of bootstrap replicates. |
| conf_level | Confidence level. Defaults to 0.95. |
| support | Optional support values. |
| base | Logarithm base. Defaults to 2. Use 'exp(1)' for nats. |
| eps | Small constant for numerical stability. |
| add_smoothing | Logical; add 1 to each cell count? |
| seed | Optional random seed. |
| na_rm | Logical; remove missing values? |
| na_rm_failed | Logical; drop failed bootstrap draws when summarizing? |

An object of class '"jsd_ci"'.
Plot two continuous distributions
plot_continuous(x, y, group_names = c("Group 1", "Group 2"), bins = 30, style = c("both", "hist", "density"), main = "Two-group raw distributions", xlab = "Value", ylab = "Density", col_x = rgb(0.2, 0.4, 0.8, 0.4), col_y = rgb(0.8, 0.2, 0.2, 0.4), line_col_x = "#2F5FB3", line_col_y = "#CC3333", lwd = 2, na_rm = TRUE, show_jsd = TRUE, jsd_digits = 3)

| Argument | Description |
|---|---|
| x | Numeric vector for group 1. |
| y | Numeric vector for group 2. |
| group_names | Group labels. |
| bins | Approximate number of histogram bins. |
| style | One of '"both"', '"hist"', or '"density"'. |
| main | Plot title. |
| xlab | X-axis label. |
| ylab | Y-axis label. |
| col_x | Fill color for group 1. |
| col_y | Fill color for group 2. |
| line_col_x | Line color for group 1 density. |
| line_col_y | Line color for group 2 density. |
| lwd | Line width for density curves. |
| na_rm | Logical; remove missing values? |
| show_jsd | Logical; whether to display JSD on the plot. |
| jsd_digits | Number of digits for displayed JSD. |

Invisibly returns plotting data.
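To show what the 'style = "both"' layout amounts to, here is a rough base-graphics equivalent: two semi-transparent histograms on shared breaks with density curves on top. It reuses the default colors and labels from the signature above, but the construction itself (shared `pretty()` breaks, a null PDF device) is an assumption, not the package's plotting code.

```r
# Hypothetical base-graphics sketch; not the package's plotting code.
set.seed(3)
x <- rnorm(300)
y <- rnorm(300, mean = 1)

# Shared breaks so both histograms use identical bins
breaks <- pretty(range(c(x, y)), n = 30)
hx <- hist(x, breaks = breaks, plot = FALSE)
hy <- hist(y, breaks = breaks, plot = FALSE)
dx <- density(x)
dy <- density(y)
ymax <- max(hx$density, hy$density, dx$y, dy$y)

pdf(NULL)  # null device for non-interactive runs; remove for on-screen plotting
plot(hx, freq = FALSE, col = rgb(0.2, 0.4, 0.8, 0.4), border = NA,
     xlim = range(breaks), ylim = c(0, ymax),
     main = "Two-group raw distributions", xlab = "Value", ylab = "Density")
plot(hy, freq = FALSE, col = rgb(0.8, 0.2, 0.2, 0.4), border = NA, add = TRUE)
lines(dx, col = "#2F5FB3", lwd = 2)
lines(dy, col = "#CC3333", lwd = 2)
legend("topright", legend = c("Group 1", "Group 2"),
       fill = c(rgb(0.2, 0.4, 0.8, 0.4), rgb(0.8, 0.2, 0.2, 0.4)), bty = "n")
invisible(dev.off())
```

Using the same breaks for both groups keeps the bars directly comparable, and plotting on the density scale puts histograms and KDE curves on one axis.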
Plot two discrete distributions with overlap
plot_discrete(x, y, support = NULL, group_names = c("Group 1", "Group 2"), main = "Two-group discrete distributions", xlab = "Value", ylab = "Proportion", col_x = adjustcolor("#2F5FB3", alpha.f = 0.2), col_y = adjustcolor("#CC3333", alpha.f = 0.2), overlap_col = adjustcolor("grey55", alpha.f = 0.35), line_col_x = "#2F5FB3", line_col_y = "#CC3333", lwd = 2, pch = 16, cex_pt = 1.1, las = 1, bar_width = 0.2, show_jsd = TRUE, jsd_digits = 3, na_rm = TRUE)

| Argument | Description |
|---|---|
| x | Vector for group 1. |
| y | Vector for group 2. |
| support | Optional support values. |
| group_names | Group labels. |
| main | Plot title. |
| xlab | X-axis label. |
| ylab | Y-axis label. |
| col_x | Color for group 1. |
| col_y | Color for group 2. |
| overlap_col | Fill color for overlap bars. |
| line_col_x | Line color for group 1. |
| line_col_y | Line color for group 2. |
| lwd | Line width. |
| pch | Point character. |
| cex_pt | Point size. |
| las | Axis label style for x-axis. |
| bar_width | Width of overlap bars. |
| show_jsd | Logical; whether to display JSD on the plot. |
| jsd_digits | Number of digits for displayed JSD. |
| na_rm | Logical; remove missing values? |

Invisibly returns plotting data.
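The manual does not define the "overlap" drawn by this function. One natural reading, used in the sketch below, is the probability mass the two groups share at each support value, min(p, q). That interpretation, the example data, and the base-graphics layout are all assumptions.

```r
# Hypothetical sketch; treating "overlap" as min(p, q) is an assumption.
x <- c(1, 1, 2, 3, 3, 3)
y <- c(1, 2, 2, 2, 3, 4)
support <- sort(union(x, y))
p <- as.numeric(table(factor(x, levels = support))) / length(x)
q <- as.numeric(table(factor(y, levels = support))) / length(y)
overlap <- pmin(p, q)  # shared probability mass at each support value

pdf(NULL)  # null device for non-interactive runs; remove for on-screen plotting
pos <- seq_along(support)
plot(pos, p, type = "b", pch = 16, col = "#2F5FB3", lwd = 2, xaxt = "n",
     ylim = c(0, max(p, q)), xlab = "Value", ylab = "Proportion",
     main = "Two-group discrete distributions")
lines(pos, q, type = "b", pch = 16, col = "#CC3333", lwd = 2)
# Overlap bars of total width 0.2, centered on each support value
rect(pos - 0.1, 0, pos + 0.1, overlap,
     col = adjustcolor("grey55", alpha.f = 0.35), border = NA)
axis(1, at = pos, labels = support, las = 1)
invisible(dev.off())

sum(overlap)
```

Under this reading the total overlap equals one minus the total variation distance between the two empirical distributions, so it complements the JSD value shown on the plot.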
Unified front-end for plotting continuous or discrete two-group raw distributions.
plot_dist(x, y, type = c("auto", "continuous", "discrete"), ...)

| Argument | Description |
|---|---|
| x | First vector. |
| y | Second vector. |
| type | One of '"auto"', '"continuous"', or '"discrete"'. |
| ... | Additional arguments passed to the type-specific plotting function. |

Invisibly returns plotting data.