Consider a country that is deciding whether to buy a vaccine now (action a₁) or wait for a possibly better one in the pipeline (action a₂). Let's say the efficacy of the vaccine in question is w. The country might define the "loss" of taking action a as,
$$
\ell(w, a) = \begin{cases} 10(1 - w), & \text{if } a = \text{BUY} \\ 100, & \text{otherwise} \end{cases}
$$
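As a minimal sketch, the loss above can be written as a small Python function. The action label `"WAIT"` and the range check on `w` are assumptions made for illustration; the text only names the BUY case explicitly.

```python
def loss(w: float, action: str) -> float:
    """Loss from the vaccine example: 10 * (1 - w) for BUY, 100 otherwise."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("efficacy w must lie in [0, 1]")
    # Any action other than "BUY" (e.g. the hypothetical label "WAIT")
    # falls into the "otherwise" branch of the loss.
    return 10.0 * (1.0 - w) if action == "BUY" else 100.0


# Tabulate the loss of both actions for a few efficacy values.
for w in (0.0, 0.5, 1.0):
    print(w, loss(w, "BUY"), loss(w, "WAIT"))
```

Under this reading of the formula, buying costs at most 10 while waiting always costs 100, so the interesting question in this example is how the loss of buying shrinks as the efficacy grows.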
In statistical inference, the goal is not to make a decision but to provide a summary of the statistical evidence. This would be the task of first figuring out the unknown parameter θ (the efficacy w in the vaccine example). Based on that statistical summary, we would then want to make a decision.
Decision theory combines the statistical knowledge gained from the sample with other relevant aspects of the problem to make the best decision. Two such aspects are:
Knowledge of possible consequences (quantified in the loss function)
Prior information
The Bayesian expected loss of taking an action a is the expectation of the loss under the posterior π⋆,
$$
\rho(\pi^\star, a) = \mathbb{E}_{\pi^\star}\left[\ell(\theta, a)\right]
$$
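To make the Bayesian expected loss concrete, here is a rough sketch that estimates it by Monte Carlo for the vaccine loss. The Beta(8, 2) posterior, the sample size, and the `"WAIT"` label are assumptions chosen purely for illustration; nothing in the text fixes a particular posterior.

```python
import random


def loss(theta: float, action: str) -> float:
    """Vaccine-example loss: 10 * (1 - theta) for BUY, 100 otherwise."""
    return 10.0 * (1.0 - theta) if action == "BUY" else 100.0


def expected_loss(action: str, posterior_draws: list) -> float:
    """Monte Carlo estimate of rho(pi*, a) = E_{pi*}[loss(theta, a)]."""
    return sum(loss(t, action) for t in posterior_draws) / len(posterior_draws)


# Hypothetical posterior (an assumption): theta ~ Beta(8, 2), so E[theta] = 0.8.
rng = random.Random(0)
posterior_draws = [rng.betavariate(8, 2) for _ in range(20_000)]

rho_buy = expected_loss("BUY", posterior_draws)
rho_wait = expected_loss("WAIT", posterior_draws)

# The BUY loss is linear in theta, so the exact value is 10 * (1 - E[theta]).
exact_rho_buy = 10.0 * (1.0 - 8 / (8 + 2))

print("estimated rho(BUY):", rho_buy)
print("exact rho(BUY):    ", exact_rho_buy)
print("rho(WAIT):         ", rho_wait)
```

Because the loss of buying is linear in θ, the Monte Carlo estimate can be checked against the closed form; for losses that are not linear, sampling (or numerical integration) would be the more general route.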
A frequentist decision theorist evaluates, for every θ, the risk of a decision rule δ(x) (which directly gives us an action in the no-data case) as the expected loss over the data X,
$$
R(\theta, \delta) = \mathbb{E}_X\left[\ell(\theta, \delta(X))\right]
$$
So for a no-data problem, R(θ,δ)=ℓ(θ,δ). The Bayes risk is then just the expectation of the risk under the prior π,
$$
r(\pi, \delta) = \mathbb{E}_\pi\left[R(\theta, \delta)\right]
$$
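The risk and Bayes risk definitions can be sketched numerically for the vaccine example. Everything about the data model here is an assumption for illustration: a hypothetical trial with `n` participants where the number of protected participants is X ~ Binomial(n, θ), a threshold rule that buys when at least `min_successes` are protected, and a uniform prior on θ approximated on a midpoint grid.

```python
from math import comb


def loss(theta: float, action: str) -> float:
    """Vaccine-example loss: 10 * (1 - theta) for BUY, 100 otherwise."""
    return 10.0 * (1.0 - theta) if action == "BUY" else 100.0


def binom_pmf(x: int, n: int, theta: float) -> float:
    """P(X = x) for X ~ Binomial(n, theta)."""
    return comb(n, x) * theta**x * (1.0 - theta) ** (n - x)


def make_threshold_rule(min_successes: int):
    """Hypothetical decision rule: BUY if at least min_successes are protected."""
    def rule(x: int) -> str:
        return "BUY" if x >= min_successes else "WAIT"
    return rule


def risk(theta: float, rule, n: int) -> float:
    """Frequentist risk R(theta, delta) = E_X[loss(theta, delta(X))]."""
    return sum(binom_pmf(x, n, theta) * loss(theta, rule(x)) for x in range(n + 1))


def bayes_risk(rule, n: int, grid_size: int = 1000) -> float:
    """Bayes risk r(pi, delta) under a uniform prior, via the midpoint rule."""
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    return sum(risk(t, rule, n) for t in thetas) / grid_size


n = 20
rule = make_threshold_rule(12)
for theta in (0.3, 0.6, 0.9):
    print("R(theta=%.1f, delta) = %.3f" % (theta, risk(theta, rule, n)))
print("Bayes risk under a uniform prior:", bayes_risk(rule, n))
```

The risk is a whole function of θ, whereas the Bayes risk collapses it to one number by averaging over the prior. A rule that ignores the data (for example `make_threshold_rule(0)`, which always buys) recovers the no-data case where the risk equals the loss.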
Regarding randomized decision functions, leaving decisions up to chance seems ridiculous in practice. We will rarely use a randomized rule, but it is often a useful tool for analysis.
Decision Principles
The Conditional Bayes Principle: Pick a Bayes action a which minimizes ρ.
$$
a^\star = \arg\min_{a} \left\{ \rho(\pi^\star, a) = \mathbb{E}_{\pi^\star}\left[\ell(\theta, a)\right] \right\}
$$
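As a small illustration of the conditional Bayes principle, the sketch below computes the expected loss of each action under a discrete posterior and picks the minimizer. The three efficacy levels, their posterior probabilities, and the `"WAIT"` label are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def loss(theta: float, action: str) -> float:
    """Vaccine-example loss: 10 * (1 - theta) for BUY, 100 otherwise."""
    return 10.0 * (1.0 - theta) if action == "BUY" else 100.0


# Hypothetical discrete posterior over efficacy (an assumption for illustration).
posterior = {0.5: 0.2, 0.7: 0.5, 0.9: 0.3}
actions = ("BUY", "WAIT")


def expected_loss(action: str) -> float:
    """rho(pi*, a): posterior-weighted average of the loss."""
    return sum(prob * loss(theta, action) for theta, prob in posterior.items())


rho = {a: expected_loss(a) for a in actions}
bayes_action = min(actions, key=expected_loss)  # argmin over actions

print(rho)
print("Bayes action:", bayes_action)
```

With these numbers the posterior mean efficacy is 0.72, so ρ for BUY is 10 × 0.28 = 2.8 against 100 for waiting, and the Bayes action is to buy.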
Frequentist Decision Principles: These are harder to reason about because there can be many non-dominated decision rules, i.e. no single rule has the lowest risk for every θ. Using risk functions alone to pick a decision rule is therefore hard in practice, so additional principles are needed to guide the choice.
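Bayes Risk: This is a single number, so we just pick the decision rule that minimizes it.
$$
\delta_\pi^\star = \arg\min_{\delta} \left\{ r(\pi, \delta) = \mathbb{E}_\pi\left[R(\theta, \delta)\right] \right\}
$$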
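Minimax: Pick the rule δ⋆ that minimizes the worst-case risk supθ∈Θ R(θ,δ⋆), possibly through a randomized decision rule. This is a worst-case rule.
$$
\inf_{\delta} \sup_{\theta} \left\{ R(\theta, \delta) = \mathbb{E}_X\left[\ell(\theta, \delta(X))\right] \right\}
$$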
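To see why randomized rules show up in minimax analysis, consider a toy no-data problem. This is a sketch under stated assumptions: the two states, two actions, and the loss table are all hypothetical and unrelated to the vaccine example, and the minimizer is found by a simple grid search over the probability of playing `a1`.

```python
# Hypothetical no-data problem: two states of nature, two actions.
LOSS = {
    ("theta1", "a1"): 0.0, ("theta2", "a1"): 4.0,
    ("theta1", "a2"): 3.0, ("theta2", "a2"): 0.0,
}
STATES = ("theta1", "theta2")


def risk(state: str, p_a1: float) -> float:
    """Risk of the randomized rule that plays a1 with probability p_a1."""
    return p_a1 * LOSS[(state, "a1")] + (1.0 - p_a1) * LOSS[(state, "a2")]


def worst_case_risk(p_a1: float) -> float:
    """sup over states of the risk for a given randomization probability."""
    return max(risk(s, p_a1) for s in STATES)


# Grid search over randomization probabilities (inf over rules).
grid = [i / 1000 for i in range(1001)]
minimax_p = min(grid, key=worst_case_risk)
minimax_risk = worst_case_risk(minimax_p)

# Pure rules correspond to p_a1 = 0 (always a2) and p_a1 = 1 (always a1).
best_pure_risk = min(worst_case_risk(0.0), worst_case_risk(1.0))

print("minimax probability of a1:", minimax_p)
print("minimax worst-case risk:  ", minimax_risk)
print("best pure worst-case risk:", best_pure_risk)
```

In this table the worst-case risk is equalized where 3(1 − p) = 4p, i.e. p = 3/7, giving a worst-case risk of 12/7, which is lower than the value 3 achieved by the best non-randomized rule. The grid search only approximates that point.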
Invariance
This is similar to other frequentist principles for inference, such as maximum likelihood estimation, unbiasedness, minimum variance, and least squares risk.