Beta Distribution

The Beta distribution is an absolutely continuous probability distribution with support $S = [0,1]$, parameterized by two shape parameters, $\alpha > 0$ and $\beta > 0$.

Usage

Beta(shape1 = 1, shape2 = 1)

# S4 method for class 'Beta,numeric'
d(distr, x, log = FALSE)

# S4 method for class 'Beta,numeric'
p(distr, q, lower.tail = TRUE, log.p = FALSE)

# S4 method for class 'Beta,numeric'
qn(distr, p, lower.tail = TRUE, log.p = FALSE)

# S4 method for class 'Beta,numeric'
r(distr, n)

# S4 method for class 'Beta'
mean(x)

# S4 method for class 'Beta'
median(x)

# S4 method for class 'Beta'
mode(x)

# S4 method for class 'Beta'
var(x)

# S4 method for class 'Beta'
sd(x)

# S4 method for class 'Beta'
skew(x)

# S4 method for class 'Beta'
kurt(x)

# S4 method for class 'Beta'
entro(x)

# S4 method for class 'Beta'
finf(x)

llbeta(x, shape1, shape2)

# S4 method for class 'Beta,numeric'
ll(distr, x)

ebeta(x, type = "mle", ...)

# S4 method for class 'Beta,numeric'
mle(
  distr,
  x,
  par0 = "same",
  method = "L-BFGS-B",
  lower = 1e-05,
  upper = Inf,
  na.rm = FALSE
)

# S4 method for class 'Beta,numeric'
me(distr, x, na.rm = FALSE)

# S4 method for class 'Beta,numeric'
same(distr, x, na.rm = FALSE)

vbeta(shape1, shape2, type = "mle")

# S4 method for class 'Beta'
avar_mle(distr)

# S4 method for class 'Beta'
avar_me(distr)

# S4 method for class 'Beta'
avar_same(distr)

Arguments

shape1, shape2: numeric. The strictly positive shape parameters, $\alpha$ and $\beta$.
distr: an object of class Beta.
x: For the density function, x is a numeric vector of quantiles. For the moments functions, x is an object of class Beta. For the log-likelihood and the estimation functions, x is the sample of observations.
log, log.p: logical. Should the logarithm of the density (log) or of the probability (log.p) be returned?
q: numeric. Vector of quantiles.
lower.tail: logical. If TRUE (default), probabilities are $P(X \leq x)$, otherwise $P(X > x)$.
p: numeric. Vector of probabilities.
n: number of observations. If length(n) > 1, the length is taken to be the number required.
type: character, case ignored. The estimator type (mle, me, or same).
...: extra arguments.
par0, method, lower, upper: arguments passed to optim for the mle optimization. See Details.
na.rm: logical. Should the NA values be removed?
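
The lower.tail and log.p arguments read like their counterparts in base R's distribution functions. As a rough illustration, assuming p() follows the same conventions as stats::pbeta() (an assumption, not something this page confirms), the sketch below shows how the two tails and the log scale relate for a Beta(3, 5) distribution. Only base R is used.

```r
q <- c(0.3, 0.8, 0.5)

# Assumed analogue of p(distr, q): lower tail, P(X <= q)
lower <- pbeta(q, shape1 = 3, shape2 = 5)

# Assumed analogue of lower.tail = FALSE: upper tail, P(X > q)
upper <- pbeta(q, shape1 = 3, shape2 = 5, lower.tail = FALSE)

# Assumed analogue of log.p = TRUE: log of the lower-tail probability
log_lower <- pbeta(q, shape1 = 3, shape2 = 5, log.p = TRUE)

# The two tails sum to one, and exponentiating recovers the probability.
all.equal(lower + upper, rep(1, 3))
all.equal(exp(log_lower), lower)
```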

Value

Each type of function returns a different type of object:

Distribution Functions: When supplied with one argument (distr), the d(), p(), qn(), r(), and ll() functions return the density, cumulative probability, quantile, random sample generator, and log-likelihood functions, respectively. When supplied with both arguments (distr and x), they evaluate the aforementioned functions directly.
Moments: Returns a numeric, either vector or matrix depending on the moment and the distribution. The moments() function returns a list with all the available methods.
Estimation: Returns a list with the estimates of the unknown parameters. Note that in distribution families like the binomial, multinomial, and negative binomial, the size is not returned, since it is considered known.
Variance: Returns a named matrix. The asymptotic covariance matrix of the estimator.
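
For the MLE, the asymptotic covariance matrix is the inverse of the Fisher information matrix of a single observation. The sketch below illustrates that relationship in base R, using the standard trigamma expression for the Beta Fisher information. The helper name beta_fisher is hypothetical and not part of the package.

```r
# Hypothetical helper: Fisher information of one Beta(shape1, shape2) observation.
beta_fisher <- function(shape1, shape2) {
  t0 <- trigamma(shape1 + shape2)
  I <- matrix(c(trigamma(shape1) - t0, -t0,
                -t0, trigamma(shape2) - t0),
              nrow = 2, byrow = TRUE)
  dimnames(I) <- list(c("shape1", "shape2"), c("shape1", "shape2"))
  I
}

I <- beta_fisher(3, 5)
V <- solve(I)  # inverse Fisher information: asymptotic covariance of the MLE
V
```

With shape1 = 3 and shape2 = 5, I and V should agree with the finf(D) and avar_mle(D) outputs shown in the Examples section.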

Details

The probability density function (PDF) of the Beta distribution is given by: $$ f(x; \alpha, \beta) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}, \quad \alpha\in\mathbb{R}_+, \, \beta\in\mathbb{R}_+,$$ for $x \in S = [0, 1]$, where $B(\alpha, \beta)$ is the Beta function: $$ B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} dt.$$
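
As a quick sanity check of the formula above, the following sketch evaluates the density directly from its definition and compares it with stats::dbeta(). It relies only on base R (beta() and dbeta()); the helper name beta_pdf is hypothetical and not part of the package.

```r
# Hypothetical helper: evaluate the Beta PDF directly from its definition.
beta_pdf <- function(x, shape1, shape2) {
  x^(shape1 - 1) * (1 - x)^(shape2 - 1) / beta(shape1, shape2)
}

x <- c(0.3, 0.8, 0.5)
manual <- beta_pdf(x, shape1 = 3, shape2 = 5)
builtin <- dbeta(x, shape1 = 3, shape2 = 5)

# Both approaches should agree up to floating-point error.
all.equal(manual, builtin)
```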

The MLE of the Beta distribution parameters is not available in closed form and has to be approximated numerically, which is done with optim(). Instead of solving a bivariate optimization problem with respect to $(\alpha, \beta)$, the optimization can be performed on the parameter sum $\alpha_0 := \alpha + \beta \in (0, +\infty)$. The default method is L-BFGS-B with lower bound 1e-5 and upper bound Inf. The par0 argument can be either a numeric (satisfying lower <= par0 <= upper) or a character specifying the closed-form estimator used to initialize the algorithm ("me" or "same", the default).
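
To make the numerical step concrete, here is a minimal sketch of a Beta MLE in base R. It assumes a plain bivariate optimization over (shape1, shape2) with optim() and L-BFGS-B, started at the method-of-moments estimator. This is a swapped-in simplification, not the package's implementation, which can work on the parameter sum instead. The names negloglik and beta_mle_sketch are hypothetical.

```r
# Hypothetical helper: negative log-likelihood of an i.i.d. Beta sample.
negloglik <- function(par, x) {
  -sum(dbeta(x, shape1 = par[1], shape2 = par[2], log = TRUE))
}

# Sketch only: bivariate L-BFGS-B, started at the method-of-moments estimator.
beta_mle_sketch <- function(x, lower = 1e-5) {
  m <- mean(x)
  v <- mean((x - m)^2)
  k <- m * (1 - m) / v - 1                     # moment estimate of shape1 + shape2
  par0 <- pmax(c(m * k, (1 - m) * k), lower)   # keep the start inside the bounds
  fit <- optim(par0, negloglik, x = x, method = "L-BFGS-B",
               lower = c(lower, lower), upper = c(Inf, Inf))
  list(shape1 = fit$par[1], shape2 = fit$par[2])
}

set.seed(1)
x <- rbeta(500, shape1 = 3, shape2 = 5)
est <- beta_mle_sketch(x)
est
```

Starting from a closed-form estimator mirrors the role of par0 described above: a good initial value usually leaves the optimizer only a few iterations of refinement.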

References

Tamae, H., Irie, K. & Kubokawa, T. (2020), A score-adjusted approach to closed-form estimators for the gamma and beta distributions, Japanese Journal of Statistics and Data Science 3, 543–561.
Papadatos, N. (2022), On point estimators for gamma and beta distributions, arXiv preprint arXiv:2205.10799.

Examples

# -----------------------------------------------------
# Beta Distribution Example
# -----------------------------------------------------

# Create the distribution
a <- 3
b <- 5
D <- Beta(a, b)

# ------------------
# dpqr Functions
# ------------------

d(D, c(0.3, 0.8, 0.5)) # density function
#> [1] 2.268945 0.107520 1.640625
p(D, c(0.3, 0.8, 0.5)) # distribution function
#> [1] 0.3529305 0.9953280 0.7734375
qn(D, c(0.4, 0.8)) # inverse distribution function
#> [1] 0.3205858 0.5167578
x <- r(D, 100) # random generator function

# alternative way to use the function
df <- d(D) ; df(x) # df is a function itself
#>   [1] 1.24976524 2.30385545 2.30436535 2.24366783 2.23826444 1.71563415
#>   [7] 2.28263219 2.10668839 0.92787399 2.09917742 0.87045906 2.30370469
#>  [13] 1.48498302 1.75732014 2.17337196 1.79839550 2.03897019 0.57529221
#>  [19] 2.08534655 1.75118856 1.63522573 2.29181156 2.14188666 2.28679623
#>  [25] 0.62262194 1.70675168 2.19031209 0.87536478 2.30028329 1.86864335
#>  [31] 2.27587144 1.88068569 0.94438439 0.96563800 2.26940820 1.80165906
#>  [37] 0.25464248 2.28467252 1.12007652 2.27401345 1.98190602 1.65113317
#>  [43] 1.97108943 2.07624080 2.18326721 2.19801819 2.09421572 1.07112503
#>  [49] 2.30443077 1.17180825 2.06449080 2.25139933 1.68945212 0.05499604
#>  [55] 1.87985716 0.79420668 2.07331701 0.93008267 0.07703167 1.84906139
#>  [61] 0.51237483 1.87774483 2.19057215 2.24206926 1.63973348 2.30240492
#>  [67] 2.20344361 2.12521282 1.52874921 1.27431968 2.20775377 1.30479661
#>  [73] 1.79648555 1.84284598 1.90451186 2.29945722 1.19903507 2.21674330
#>  [79] 2.30117732 1.93620566 0.13667862 1.39705135 1.06884493 0.59395349
#>  [85] 1.56541121 1.76378323 1.02060804 0.36888836 1.87454563 2.18034118
#>  [91] 1.54011318 2.18363285 0.95997386 1.53127942 2.23343322 2.28408194
#>  [97] 1.37190707 1.33874755 2.29566516 2.06006047

# ------------------
# Moments
# ------------------

mean(D) # Expectation
#> [1] 0.375
var(D) # Variance
#> [1] 0.02604167
sd(D) # Standard Deviation
#> [1] 0.1613743
skew(D) # Skewness
#> [1] 0.3098387
kurt(D) # Excess Kurtosis
#> [1] 0.04
entro(D) # Entropy
#> [1] -0.4301508
finf(D) # Fisher Information Matrix
#>            shape1      shape2
#> shape1  0.2617971 -0.13313701
#> shape2 -0.1331370  0.08818594

# List of all available moments
mom <- moments(D)
mom$mean # expectation
#> [1] 0.375

# ------------------
# Point Estimation
# ------------------

ll(D, x)
#> [1] 39.25421
llbeta(x, a, b)
#> [1] 39.25421

ebeta(x, type = "mle")
#> $shape1
#> [1] 3.60909
#> 
#> $shape2
#> [1] 4.991035
#> 
ebeta(x, type = "me")
#> $shape1
#> [1] 3.545896
#> 
#> $shape2
#> [1] 4.931701
#> 
ebeta(x, type = "same")
#> $shape1
#> [1] 3.584778
#> 
#> $shape2
#> [1] 4.985779
#> 

mle(D, x)
#> $shape1
#> [1] 3.60909
#> 
#> $shape2
#> [1] 4.991035
#> 
me(D, x)
#> $shape1
#> [1] 3.545896
#> 
#> $shape2
#> [1] 4.931701
#> 
same(D, x)
#> $shape1
#> [1] 3.584778
#> 
#> $shape2
#> [1] 4.985779
#> 
e(D, x, type = "mle")
#> $shape1
#> [1] 3.60909
#> 
#> $shape2
#> [1] 4.991035
#> 

mle("beta", x) # the distr argument can be a character
#> $shape1
#> [1] 3.60909
#> 
#> $shape2
#> [1] 4.991035
#> 

# ------------------
# Estimator Variance
# ------------------

vbeta(a, b, type = "mle")
#>          shape1   shape2
#> shape1 16.44844 24.83272
#> shape2 24.83272 48.83039
vbeta(a, b, type = "me")
#>          shape1   shape2
#> shape1 17.64848 26.56970
#> shape2 26.56970 51.39394
vbeta(a, b, type = "same")
#>          shape1   shape2
#> shape1 16.57719 24.96198
#> shape2 24.96198 49.01071

avar_mle(D)
#>          shape1   shape2
#> shape1 16.44844 24.83272
#> shape2 24.83272 48.83039
avar_me(D)
#>          shape1   shape2
#> shape1 17.64848 26.56970
#> shape2 26.56970 51.39394
avar_same(D)
#>          shape1   shape2
#> shape1 16.57719 24.96198
#> shape2 24.96198 49.01071

v(D, type = "mle")
#>          shape1   shape2
#> shape1 16.44844 24.83272
#> shape2 24.83272 48.83039