Five-Level Portfolio Optimization: From Abstention to Multi-Objective AI
Author: Ivchenko, Oleh
Affiliation: Odessa National Polytechnic University
Series: AI Portfolio Optimisation
Year: 2025
Abstract
The Decision Readiness Levels (DRL) framework prescribes one of five optimization strategies for each pharmaceutical portfolio segment, conditioned on that segment’s Decision Readiness Index (DRI) score. This paper provides a complete specification of DRL-1 through DRL-5: the conditions under which each level is appropriate, the optimization methods employed at each level, the mathematical formulations, and the implementation considerations. We demonstrate that the five-level taxonomy covers the full spectrum from information-constrained abstention (DRL-1) to sophisticated multi-objective AI optimization (DRL-5), providing a principled path from conservative to aggressive portfolio management as information conditions improve. Worked examples illustrate strategy selection and expected outcomes at each level.
1. Introduction
Portfolio optimization has produced a rich body of methods: linear programming, mean-variance optimization, CVaR minimization, genetic algorithms, multi-objective evolutionary methods. What has been lacking is a principled framework for deciding which method is appropriate when. The common practice of selecting an optimization algorithm based on organizational habit, available software, or analyst preference — rather than on the actual information environment — is a significant source of portfolio management error.
The Decision Readiness Levels (DRL) framework addresses this gap. DRL maps DRI scores (as defined in Article 2) to five strategy tiers, each with a specific optimization algorithm, mathematical formulation, and applicability condition. The result is a decision tree that is explicit, auditable, and calibrated to information quality.
```mermaid
flowchart TD
    A[Portfolio Segment] --> B{Compute DRI Score}
    B -->|DRI < 0.20| C[DRL-1: Abstention<br/>Freeze allocations<br/>Collect data]
    B -->|0.20 ≤ DRI < 0.40| D[DRL-2: Proportional Rebalancing<br/>Ordinal ranking rules]
    B -->|0.40 ≤ DRI < 0.60| E[DRL-3: Linear Programming<br/>Constrained LP optimization]
    B -->|0.60 ≤ DRI < 0.80| F[DRL-4: CVaR Optimization<br/>Stochastic risk-aware LP]
    B -->|DRI ≥ 0.80| G[DRL-5: Multi-Objective AI<br/>NSGA-II + preference learning]
    C --> H[Monitor & Improve DRI]
    H --> B
    style C fill:#ffcccc
    style D fill:#ffe0cc
    style E fill:#fff4cc
    style F fill:#ccf0cc
    style G fill:#cce0ff
```
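The threshold bands in the decision tree reduce to a simple lookup. The helper below (`drl_level` is a name of our choosing, not part of the HPF-P platform) maps a DRI score to its DRL tier:

```python
def drl_level(dri: float) -> int:
    """Map a Decision Readiness Index score in [0, 1] to a DRL tier.

    Thresholds follow the decision tree: DRL-1 below 0.20, then
    0.20-wide bands up to DRL-5 at DRI >= 0.80.
    """
    if not 0.0 <= dri <= 1.0:
        raise ValueError("DRI must lie in [0, 1]")
    if dri < 0.20:
        return 1  # abstention
    if dri < 0.40:
        return 2  # proportional rebalancing
    if dri < 0.60:
        return 3  # linear programming
    if dri < 0.80:
        return 4  # CVaR optimization
    return 5      # multi-objective AI
```

Note that the boundaries are inclusive on the lower side, matching the "0.20 ≤ DRI < 0.40" style of the trigger conditions in Sections 2–6.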
2. DRL-1: Abstention
2.1 Trigger Condition
DRL-1 applies when DRI < 0.20. At this level, information quality is so low that any optimization — however conservative — is more likely to degrade portfolio performance than improve it. The only rational strategy is to maintain current allocations unchanged.
2.2 Rationale
The abstention decision is counterintuitive in a culture that equates action with competence. However, the operations research literature provides strong theoretical support: when the uncertainty set is unbounded or uncharacterizable, robust optimization degenerates and expected utility maximization is undefined. In practical terms: if you cannot reliably estimate demand, costs, or risks for a portfolio segment, any optimization model you build is fitting noise.
2.3 Implementation
DRL-1 implementation requires:
- Freezing all allocation changes for the affected segments
- Initiating active data collection to improve DRI dimensions
- Setting a monitoring schedule to re-evaluate DRI at defined intervals
- Documenting the DRL-1 designation for audit purposes
DRL-1 is not permanent. The correct response to a DRL-1 designation is to identify which DRI dimensions are lowest and invest in improving them. Article 2 provides guidance on dimension-specific improvement strategies.
2.4 Example
A pharmaceutical company managing orphan disease products in a conflict-affected region has DRI = 0.12 for this segment: R1 = 0.20 (most patient records inaccessible), R5 = 0.05 (distribution network destroyed), R3 = 0.10 (supplier status unknown). The HPF system designates DRL-1 and halts all optimization attempts while humanitarian data collection efforts proceed.
3. DRL-2: Proportional Rebalancing
3.1 Trigger Condition
DRL-2 applies when 0.20 ≤ DRI < 0.40. At this level, some reliable information is available but insufficient to support model-based optimization. Simple proportional rules can extract value without requiring reliable forecasts.
3.2 Mathematical Formulation
The DRL-2 strategy applies a proportional rebalancing rule:
$$x_i^{new} = x_i^{old} \cdot \frac{\hat{r}_i}{\sum_j x_j^{old} \hat{r}_j} \cdot X_{total}$$
where $x_i$ is the allocation to product $i$, $\hat{r}_i$ is a simple rank-order estimate of relative performance (not a precise forecast), and $X_{total}$ is the total budget. Allocations are adjusted proportionally to relative performance rankings, subject to minimum and maximum bounds.
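A minimal NumPy sketch of this rule (the function name and bound handling are our own; redistributing the residual left over after clipping is omitted for brevity):

```python
import numpy as np

def rebalance(x_old, r_hat, x_total, lo, hi):
    """DRL-2 proportional rebalancing: weight each allocation by its
    rank-order performance estimate r_hat, normalise to the total
    budget x_total, then clip to the allocation bounds [lo, hi]."""
    x_old = np.asarray(x_old, dtype=float)
    r_hat = np.asarray(r_hat, dtype=float)
    weights = x_old * r_hat
    x_new = x_total * weights / weights.sum()
    # Clipping can leave the total slightly off x_total; in practice
    # the residual is redistributed among the unclipped products.
    return np.clip(x_new, lo, hi)
```

For example, two products with equal current allocations and performance ranks 2:1 end up split 2:1 across the budget, before any bounds bind.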
3.3 Key Properties
DRL-2 does not require:
- Precise demand forecasts
- Accurate cost models
- Risk quantification
It requires only:
- Relative ranking of products by recent performance indicators
- Minimum and maximum allocation bounds
- Total budget
This minimal data requirement makes DRL-2 appropriate for high-uncertainty environments where some ordinal information is available but cardinal estimates are unreliable.
3.4 Example
A segment with DRI = 0.33 (R1 = 0.65, R2 = 0.28, others low due to post-shock recovery) can be managed with DRL-2: products with the best recent sell-through ratios receive proportionally more allocation, without requiring precise forecast models.
4. DRL-3: Linear Programming
4.1 Trigger Condition
DRL-3 applies when 0.40 ≤ DRI < 0.60. At this level, demand forecasts and cost estimates are available with meaningful accuracy, supporting constrained linear optimization.
4.2 Mathematical Formulation
$$\max_{x} \sum_i p_i x_i$$
Subject to: $$\sum_i c_i x_i \leq B \quad \text{(budget constraint)}$$ $$l_i \leq x_i \leq u_i \quad \forall i \quad \text{(allocation bounds)}$$ $$\sum_i x_i = X_{total} \quad \text{(total allocation)}$$ $$Ax \leq b \quad \text{(category constraints)}$$
where $p_i$ is the expected profit per unit for product $i$, $x_i$ is the allocation, $c_i$ is the unit cost, $B$ is the total budget, and $A, b$ encode category-level constraints.
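This LP can be sketched with `scipy.optimize.linprog`; all numbers below are illustrative, not drawn from the worked examples, and the category constraints $Ax \leq b$ are omitted:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical three-product segment.
p = np.array([12.0, 8.0, 5.0])    # expected profit per unit
c = np.array([30.0, 20.0, 10.0])  # unit cost
B = 20_000.0                      # budget
X_total = 800.0                   # total units to allocate
lo = np.array([50.0, 50.0, 50.0])
hi = np.array([600.0, 600.0, 600.0])

res = linprog(
    -p,                                     # maximize p.x = minimize -p.x
    A_ub=c.reshape(1, -1), b_ub=[B],        # budget constraint
    A_eq=np.ones((1, 3)), b_eq=[X_total],   # total allocation
    bounds=list(zip(lo, hi)),
)
x_opt = res.x  # optimal allocation; here the budget binds exactly
```

In this instance the optimum pushes the highest-margin product as far as the budget allows, giving x_opt ≈ [575, 50, 175] with the budget constraint tight at B.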
4.3 Data Requirements
DRL-3 requires:
- Point estimates of demand (not distribution estimates — those come at DRL-4)
- Unit cost estimates with accuracy within ±15%
- Hard constraints (budget, storage capacity, regulatory minimums)
4.4 Risk Handling
At DRL-3, risk is handled conservatively through constraint tightening: budget constraints are set at 90% of actual budget, minimum allocations are set slightly above regulatory minimums, and demand estimates are discounted by a fixed percentage (default 10%) to provide a buffer against forecast error. This approach is less sophisticated than DRL-4’s CVaR optimization but appropriate for the available information quality.
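These adjustments amount to a pre-processing step applied before the LP is solved. The helper below is a hypothetical sketch using the default factors stated above (90% budget, 10% demand discount):

```python
def tighten(budget, demand_forecast, budget_factor=0.90, demand_discount=0.10):
    """DRL-3 conservative constraint tightening: optimize against a
    reduced budget and discounted demand estimates to buffer against
    forecast error. Defaults follow the text."""
    eff_budget = budget * budget_factor
    eff_demand = [d * (1.0 - demand_discount) for d in demand_forecast]
    return eff_budget, eff_demand
```

The tightened values then replace $B$ and the demand inputs in the LP of Section 4.2.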
4.5 Example
A mature OTC portfolio recovering from a supply disruption has DRI = 0.52: sufficient demand history is now available (6 months post-disruption), costs are known, regulatory status is clear, but market conditions remain somewhat uncertain. LP optimization identifies an allocation that increases expected profit by 8% over the status quo while respecting all operational constraints.
5. DRL-4: CVaR Optimization
5.1 Trigger Condition
DRL-4 applies when 0.60 ≤ DRI < 0.80. At this level, risk distributions are estimable with meaningful accuracy, supporting risk-aware optimization through Conditional Value at Risk (CVaR) minimization.
5.2 Mathematical Formulation
$$\min_{x, \zeta} \left[ -\mu(x) + \lambda \cdot \mathrm{CVaR}_\alpha(x) \right]$$
where:
$$\mathrm{CVaR}_\alpha(x) = \zeta + \frac{1}{(1-\alpha)n} \sum_{s=1}^{n} \max(L_s(x) - \zeta, 0)$$
with $L_s(x) = -r_s^T x$ being the loss under scenario $s$, $\alpha$ the confidence level (default 0.95), $\lambda$ a risk aversion parameter, and $\mu(x)$ the expected return.
Scenarios are generated from the historical demand distribution, augmented with stress scenarios corresponding to observed risk events in R3.
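For a fixed allocation $x$, the minimizing $\zeta$ is the empirical $\alpha$-quantile (VaR) of the scenario losses, so the objective can be evaluated directly from a scenario matrix. A NumPy sketch (function names and the scenario-matrix layout are our own):

```python
import numpy as np

def cvar(losses, alpha=0.95):
    """Empirical CVaR: zeta set to the alpha-quantile (VaR) of the
    scenario losses, per the Rockafellar-Uryasev formulation, which
    averages the worst (1 - alpha) fraction of losses."""
    losses = np.sort(np.asarray(losses, dtype=float))
    n = len(losses)
    zeta = losses[int(np.ceil(alpha * n)) - 1]  # empirical VaR
    tail = np.maximum(losses - zeta, 0.0)
    return zeta + tail.sum() / ((1.0 - alpha) * n)

def objective(x, r_scenarios, alpha=0.95, lam=1.0):
    """DRL-4 objective -mu(x) + lambda * CVaR_alpha(x), with scenario
    losses L_s(x) = -r_s^T x (rows of r_scenarios are scenarios)."""
    losses = -(r_scenarios @ x)
    return losses.mean() + lam * cvar(losses, alpha)
```

For example, with losses [1, 2, 3, 4] and α = 0.75, the VaR is 3 and the CVaR is 4 (the mean of the worst 25%). In the full DRL-4 problem this objective is linearized and minimized over $x$ as an LP.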
5.3 Data Requirements
DRL-4 requires:
- Demand distributions (not just point estimates)
- Risk scenario catalog with occurrence probabilities
- Covariance structure of returns across portfolio segments
5.4 Practical Advantage
CVaR optimization is particularly valuable when tail risks are material. A pharmaceutical portfolio exposed to supply concentration risk (few suppliers) or demand concentration risk (few therapeutic categories) has fat-tailed loss distributions that mean-variance approaches systematically underestimate. CVaR explicitly optimizes the tail, making it appropriate for pharmaceutical portfolios with structural fragility.
5.5 Example
A specialty pharmaceutical portfolio has DRI = 0.71. Risk scenarios have been constructed from observed supply disruptions and demand shocks over the past three years. CVaR optimization at α = 0.95 identifies an allocation that reduces the expected shortfall in the worst 5% of scenarios by 23% while maintaining 97% of the expected return achievable with LP.
6. DRL-5: Multi-Objective AI Optimization
6.1 Trigger Condition
DRL-5 applies when DRI ≥ 0.80. At this level, information quality is sufficient to support multi-objective optimization with AI-generated preference models.
6.2 Mathematical Formulation
DRL-5 employs a multi-objective evolutionary algorithm (MOEA) to generate the Pareto-optimal frontier across multiple objectives:
$$\text{Pareto-optimize: } F(x) = [f_1(x), f_2(x), \ldots, f_k(x)]$$
where the objectives $f_1, \ldots, f_k$ may include:
- Expected portfolio return
- Portfolio CVaR (tail risk)
- Supply chain resilience score
- Regulatory compliance buffer
- Strategic diversity index
The MOEA generates a set of Pareto-optimal solutions, which are then ranked using a learned preference model (trained on historical decision-maker choices) to select a preferred solution.
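Pareto filtering itself is easy to state. The sketch below is a naive O(n²) dominance check, not NSGA-II's fast non-dominated sort; it assumes every objective is expressed as a minimization, so maximized objectives such as expected return would be negated first:

```python
import numpy as np

def pareto_front(F):
    """Return a boolean mask of the non-dominated rows of F, where each
    row is one solution's objective vector and all objectives are
    minimized. A point is dominated if another point is <= in every
    objective and strictly < in at least one."""
    F = np.asarray(F, dtype=float)
    n = F.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        if not mask[i]:
            continue
        dominated_by = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if dominated_by.any():
            mask[i] = False
    return mask
```

For two objectives, the points (1, 2) and (2, 1) are mutually non-dominated, while (2, 2) is dominated by both. The learned preference model then ranks the surviving front.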
6.3 AI Integration
At DRL-5, machine learning models contribute in three ways:
- Demand forecasting: Deep learning models (LSTM or Transformer-based) provide distribution forecasts with uncertainty quantification.
- Risk modeling: Anomaly detection and causal inference identify emerging risks before they manifest in standard metrics.
- Preference learning: Inverse reinforcement learning infers decision-maker preferences from historical choices, enabling preference-aware Pareto solution selection.
6.4 Example
A major portfolio segment covering high-volume chronic disease medications has DRI = 0.87 following a stable 18-month post-disruption period. DRL-5 optimization generates 2,400 Pareto-optimal solutions across four objectives. The preference learning model identifies a solution that achieves 94% of maximum expected return, 89% of minimum CVaR, and maximizes supply resilience — a combination that matches the decision-maker’s revealed preferences from the past 24 months.
```mermaid
graph LR
    subgraph DRL1["DRL-1 (DRI < 0.20)"]
        A1[No optimization<br/>Freeze status quo]
    end
    subgraph DRL2["DRL-2 (0.20–0.40)"]
        A2[Proportional<br/>Rebalancing]
    end
    subgraph DRL3["DRL-3 (0.40–0.60)"]
        A3[Linear<br/>Programming LP]
    end
    subgraph DRL4["DRL-4 (0.60–0.80)"]
        A4[CVaR Stochastic<br/>Optimization]
    end
    subgraph DRL5["DRL-5 (DRI ≥ 0.80)"]
        A5[NSGA-II Multi-<br/>Objective AI]
    end
    DRL1 -->|DRI improves| DRL2
    DRL2 -->|DRI improves| DRL3
    DRL3 -->|DRI improves| DRL4
    DRL4 -->|DRI improves| DRL5
    DRL5 -->|DRI drops| DRL4
```
7. Strategy Comparison and Selection Guidance
| Factor | DRL-1 | DRL-2 | DRL-3 | DRL-4 | DRL-5 |
|---|---|---|---|---|---|
| Min DRI | — | 0.20 | 0.40 | 0.60 | 0.80 |
| Data requirement | None | Ordinal rankings | Point estimates | Distributions | Full uncertainty quant. |
| Computation time | Trivial | Seconds | Minutes | Hours | Hours–Days |
| Expected improvement | 0% | 3–8% | 8–15% | 12–22% | 18–30% |
| Risk of backfiring | N/A | Low | Medium | Low (explicit risk) | Low (explicit risk) |
Expected improvement figures are illustrative estimates from HPF-P pilot deployments and should be validated in specific organizational contexts.
```mermaid
xychart-beta
    title "Expected Portfolio Improvement by DRL Level"
    x-axis ["DRL-1", "DRL-2", "DRL-3", "DRL-4", "DRL-5"]
    y-axis "Expected Improvement (%)" 0 --> 35
    bar [0, 5.5, 11.5, 17, 24]
    line [0, 5.5, 11.5, 17, 24]
```
8. Conclusion
The five-level DRL framework provides a complete and principled taxonomy for pharmaceutical portfolio optimization strategy selection. By mapping information quality (DRI) to optimization method (DRL), the framework ensures that decision-making resources are allocated appropriately: conservative approaches where information is limited, sophisticated AI methods where information quality justifies them.
The key insight is that optimization method selection should be a function of information availability, not of organizational preference or algorithmic fashion. DRL operationalizes this principle in a form that is quantitative, auditable, and directly implementable in the HPF-P platform.
References
- Rockafellar, R. T., & Uryasev, S. (2000). Optimization of conditional value-at-risk. Journal of Risk, 2(3), 21–41.
- Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182–197.
- Abbeel, P., & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. ICML.
- Markowitz, H. (1952). Portfolio selection. Journal of Finance, 7(1), 77–91.