Five-Level Portfolio Optimization: From Abstention to Multi-Objective AI
Author: Ivchenko, Oleh
Affiliation: Odessa National Polytechnic University
Series: AI Portfolio Optimisation
Year: 2025
Abstract
The Decision Readiness Levels (DRL) framework prescribes one of five optimization strategies for each pharmaceutical portfolio segment, conditioned on that segment’s Decision Readiness Index (DRI) score. This paper provides a complete specification of DRL-1 through DRL-5: the conditions under which each level is appropriate, the optimization methods employed at each level, the mathematical formulations, and the implementation considerations. We demonstrate that the five-level taxonomy covers the full spectrum from information-constrained abstention (DRL-1) to sophisticated multi-objective AI optimization (DRL-5), providing a principled path from conservative to aggressive portfolio management as information conditions improve. Worked examples illustrate strategy selection and expected outcomes at each level.
1. Introduction
Portfolio optimization has produced a rich body of methods: linear programming, mean-variance optimization, CVaR minimization, genetic algorithms, multi-objective evolutionary methods. What has been lacking is a principled framework for deciding which method is appropriate when. The common practice of selecting an optimization algorithm based on organizational habit, available software, or analyst preference — rather than on the actual information environment — is a significant source of portfolio management error.
The Decision Readiness Levels (DRL) framework addresses this gap. DRL maps DRI scores (as defined in Article 2) to five strategy tiers, each with a specific optimization algorithm, mathematical formulation, and applicability condition. The result is a decision tree that is explicit, auditable, and calibrated to information quality.
```mermaid
flowchart TD
    A[Portfolio Segment] --> B{Compute DRI Score}
    B -->|DRI < 0.20| C[DRL-1: Abstention<br/>Freeze allocations<br/>Collect data]
    B -->|0.20 ≤ DRI < 0.40| D[DRL-2: Proportional Rebalancing<br/>Ordinal ranking rules]
    B -->|0.40 ≤ DRI < 0.60| E[DRL-3: Linear Programming<br/>Constrained LP optimization]
    B -->|0.60 ≤ DRI < 0.80| F[DRL-4: CVaR Optimization<br/>Stochastic risk-aware LP]
    B -->|DRI ≥ 0.80| G[DRL-5: Multi-Objective AI<br/>NSGA-II + preference learning]
    C --> H[Monitor & Improve DRI]
    H --> B
    style C fill:#ffcccc
    style D fill:#ffe0cc
    style E fill:#fff4cc
    style F fill:#ccf0cc
    style G fill:#cce0ff
```
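The threshold bands in the decision tree reduce to a simple lookup. The helper below (`drl_level` is a name of our choosing, not part of the HPF-P platform) maps a DRI score to its DRL tier:

```python
def drl_level(dri: float) -> int:
    """Map a Decision Readiness Index score in [0, 1] to a DRL tier.

    Thresholds follow the decision tree: DRL-1 below 0.20, then
    0.20-wide bands up to DRL-5 at DRI >= 0.80.
    """
    if not 0.0 <= dri <= 1.0:
        raise ValueError("DRI must lie in [0, 1]")
    if dri < 0.20:
        return 1  # abstention
    if dri < 0.40:
        return 2  # proportional rebalancing
    if dri < 0.60:
        return 3  # linear programming
    if dri < 0.80:
        return 4  # CVaR optimization
    return 5      # multi-objective AI
```

Note that the boundaries are inclusive on the lower side, matching the "0.20 ≤ DRI < 0.40" style of the trigger conditions in Sections 2–6.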
2. DRL-1: Abstention
2.1 Trigger Condition
DRL-1 applies when DRI < 0.20. At this level, information quality is so low that any optimization — however conservative — is more likely to degrade portfolio performance than improve it. The only rational strategy is to maintain current allocations unchanged.
2.2 Rationale
The abstention decision is counterintuitive in a culture that equates action with competence. However, the operations research literature provides strong theoretical support: when the uncertainty set is unbounded or uncharacterizable, robust optimization degenerates and expected utility maximization is undefined. In practical terms: if you cannot reliably estimate demand, costs, or risks for a portfolio segment, any optimization model you build is fitting noise.
2.3 Implementation
DRL-1 implementation requires:
- Freezing all allocation changes for the affected segments
- Initiating active data collection to improve DRI dimensions
- Setting a monitoring schedule to re-evaluate DRI at defined intervals
- Documenting the DRL-1 designation for audit purposes
DRL-1 is not permanent. The correct response to a DRL-1 designation is to identify which DRI dimensions are lowest and invest in improving them. Article 2 provides guidance on dimension-specific improvement strategies.
2.4 Example
A pharmaceutical company managing orphan disease products in a conflict-affected region has DRI = 0.12 for this segment: R1 = 0.20 (most patient records inaccessible), R5 = 0.05 (distribution network destroyed), R3 = 0.10 (supplier status unknown). The HPF system designates DRL-1 and halts all optimization attempts while humanitarian data collection efforts proceed.
3. DRL-2: Proportional Rebalancing
3.1 Trigger Condition
DRL-2 applies when 0.20 ≤ DRI < 0.40. At this level, some reliable information is available but insufficient to support model-based optimization. Simple proportional rules can extract value without requiring reliable forecasts.
3.2 Mathematical Formulation
The DRL-2 strategy applies a proportional rebalancing rule:
$$x_i^{new} = x_i^{old} \cdot \frac{\hat{r}_i}{\sum_j x_j^{old} \hat{r}_j} \cdot X_{total}$$
where $x_i$ is the allocation to product $i$, $\hat{r}_i$ is a simple rank-order estimate of relative performance (not a precise forecast), and $X_{total}$ is the total budget. Allocations are adjusted proportionally to relative performance rankings, subject to minimum and maximum bounds.
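A minimal NumPy sketch of this rule (the function name and bound handling are our own; redistributing the residual left over after clipping is omitted for brevity):

```python
import numpy as np

def rebalance(x_old, r_hat, x_total, lo, hi):
    """DRL-2 proportional rebalancing: weight each allocation by its
    rank-order performance estimate r_hat, normalise to the total
    budget x_total, then clip to the allocation bounds [lo, hi]."""
    x_old = np.asarray(x_old, dtype=float)
    r_hat = np.asarray(r_hat, dtype=float)
    weights = x_old * r_hat
    x_new = x_total * weights / weights.sum()
    # Clipping can leave the total slightly off x_total; in practice
    # the residual is redistributed among the unclipped products.
    return np.clip(x_new, lo, hi)
```

For example, two products with equal current allocations and performance ranks 2:1 end up split 2:1 across the budget, before any bounds bind.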
3.3 Key Properties
DRL-2 does not require:
- Precise demand forecasts
- Accurate cost models
- Risk quantification
It requires only:
- Relative ranking of products by recent performance indicators
- Minimum and maximum allocation bounds
- Total budget
This minimal data requirement makes DRL-2 appropriate for high-uncertainty environments where some ordinal information is available but cardinal estimates are unreliable.
3.4 Example
A segment with DRI = 0.33 (R1 = 0.65, R2 = 0.28, others low due to post-shock recovery) can be managed with DRL-2: products with the best recent sell-through ratios receive proportionally more allocation, without requiring precise forecast models.
4. DRL-3: Linear Programming
4.1 Trigger Condition
DRL-3 applies when 0.40 ≤ DRI < 0.60. At this level, demand forecasts and cost estimates are available with meaningful accuracy, supporting constrained linear optimization.
4.2 Mathematical Formulation
$$\max_{x} \sum_i p_i x_i$$
Subject to: $$\sum_i c_i x_i \leq B \quad \text{(budget constraint)}$$ $$l_i \leq x_i \leq u_i \quad \forall i \quad \text{(allocation bounds)}$$ $$\sum_i x_i = X_{total} \quad \text{(total allocation)}$$ $$Ax \leq b \quad \text{(category constraints)}$$
where $p_i$ is the expected profit per unit for product $i$, $x_i$ is the allocation, $c_i$ is the unit cost, $B$ is the total budget, and $A, b$ encode category-level constraints.
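This LP can be sketched with `scipy.optimize.linprog`; all numbers below are illustrative, not drawn from the worked examples, and the category constraints $Ax \leq b$ are omitted:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical three-product segment.
p = np.array([12.0, 8.0, 5.0])    # expected profit per unit
c = np.array([30.0, 20.0, 10.0])  # unit cost
B = 20_000.0                      # budget
X_total = 800.0                   # total units to allocate
lo = np.array([50.0, 50.0, 50.0])
hi = np.array([600.0, 600.0, 600.0])

res = linprog(
    -p,                                     # maximize p.x = minimize -p.x
    A_ub=c.reshape(1, -1), b_ub=[B],        # budget constraint
    A_eq=np.ones((1, 3)), b_eq=[X_total],   # total allocation
    bounds=list(zip(lo, hi)),
)
x_opt = res.x  # optimal allocation; here the budget binds exactly
```

In this instance the optimum pushes the highest-margin product as far as the budget allows, giving x_opt ≈ [575, 50, 175] with the budget constraint tight at B.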
4.3 Data Requirements
DRL-3 requires:
- Point estimates of demand (not distribution estimates — those come at DRL-4)
- Unit cost estimates with accuracy within ±15%
- Hard constraints (budget, storage capacity, regulatory minimums)
4.4 Risk Handling
At DRL-3, risk is handled conservatively through constraint tightening: budget constraints are set at 90% of actual budget, minimum allocations are set slightly above regulatory minimums, and demand estimates are discounted by a fixed percentage (default 10%) to provide a buffer against forecast error. This approach is less sophisticated than DRL-4’s CVaR optimization but appropriate for the available information quality.
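These adjustments amount to a pre-processing step applied before the LP is solved. The helper below is a hypothetical sketch using the default factors stated above (90% budget, 10% demand discount):

```python
def tighten(budget, demand_forecast, budget_factor=0.90, demand_discount=0.10):
    """DRL-3 conservative constraint tightening: optimize against a
    reduced budget and discounted demand estimates to buffer against
    forecast error. Defaults follow the text."""
    eff_budget = budget * budget_factor
    eff_demand = [d * (1.0 - demand_discount) for d in demand_forecast]
    return eff_budget, eff_demand
```

The tightened values then replace $B$ and the demand inputs in the LP of Section 4.2.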
4.5 Example
A mature OTC portfolio recovering from a supply disruption has DRI = 0.52: sufficient demand history is now available (6 months post-disruption), costs are known, regulatory status is clear, but market conditions remain somewhat uncertain. LP optimization identifies an allocation that increases expected profit by 8% over the status quo while respecting all operational constraints.
5. DRL-4: CVaR Optimization
5.1 Trigger Condition
DRL-4 applies when 0.60 ≤ DRI < 0.80. At this level, risk distributions are estimable with meaningful accuracy, supporting risk-aware optimization through Conditional Value at Risk (CVaR) minimization.
5.2 Mathematical Formulation
$$\min_{x, \zeta} \left[ -\mu(x) + \lambda \cdot \mathrm{CVaR}_\alpha(x) \right]$$
where:
$$\mathrm{CVaR}_\alpha(x) = \zeta + \frac{1}{(1-\alpha)n} \sum_{s=1}^{n} \max(L_s(x) - \zeta, 0)$$
with $L_s(x) = -r_s^T x$ being the loss under scenario $s$, $\alpha$ the confidence level (default 0.95), $\lambda$ a risk aversion parameter, and $\mu(x)$ the expected return.
Scenarios are generated from the historical demand distribution, augmented with stress scenarios corresponding to observed risk events in R3.
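For a fixed allocation $x$, the minimizing $\zeta$ is the empirical $\alpha$-quantile (VaR) of the scenario losses, so the objective can be evaluated directly from a scenario matrix. A NumPy sketch (function names and the scenario-matrix layout are our own):

```python
import numpy as np

def cvar(losses, alpha=0.95):
    """Empirical CVaR: zeta set to the alpha-quantile (VaR) of the
    scenario losses, per the Rockafellar-Uryasev formulation, which
    averages the worst (1 - alpha) fraction of losses."""
    losses = np.sort(np.asarray(losses, dtype=float))
    n = len(losses)
    zeta = losses[int(np.ceil(alpha * n)) - 1]  # empirical VaR
    tail = np.maximum(losses - zeta, 0.0)
    return zeta + tail.sum() / ((1.0 - alpha) * n)

def objective(x, r_scenarios, alpha=0.95, lam=1.0):
    """DRL-4 objective -mu(x) + lambda * CVaR_alpha(x), with scenario
    losses L_s(x) = -r_s^T x (rows of r_scenarios are scenarios)."""
    losses = -(r_scenarios @ x)
    return losses.mean() + lam * cvar(losses, alpha)
```

For example, with losses [1, 2, 3, 4] and α = 0.75, the VaR is 3 and the CVaR is 4 (the mean of the worst 25%). In the full DRL-4 problem this objective is linearized and minimized over $x$ as an LP.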
5.3 Data Requirements
DRL-4 requires:
- Demand distributions (not just point estimates)
- Risk scenario catalog with occurrence probabilities
- Covariance structure of returns across portfolio segments
5.4 Practical Advantage
CVaR optimization is particularly valuable when tail risks are material. A pharmaceutical portfolio exposed to supply concentration risk (few suppliers) or demand concentration risk (few therapeutic categories) has fat-tailed loss distributions that mean-variance approaches systematically underestimate. CVaR explicitly optimizes the tail, making it appropriate for pharmaceutical portfolios with structural fragility.
5.5 Example
A specialty pharmaceutical portfolio has DRI = 0.71. Risk scenarios have been constructed from observed supply disruptions and demand shocks over the past three years. CVaR optimization at α = 0.95 identifies an allocation that reduces the expected shortfall in the worst 5% of scenarios by 23% while maintaining 97% of the expected return achievable with LP.
6. DRL-5: Multi-Objective AI Optimization
6.1 Trigger Condition
DRL-5 applies when DRI ≥ 0.80. At this level, information quality is sufficient to support multi-objective optimization with AI-generated preference models.
6.2 Mathematical Formulation
DRL-5 employs a multi-objective evolutionary algorithm (MOEA) to generate the Pareto-optimal frontier across multiple objectives:
$$\text{Pareto-optimize: } F(x) = [f_1(x), f_2(x), \ldots, f_k(x)]$$
where the objectives $f_1, \ldots, f_k$ may include:
- Expected portfolio return
- Portfolio CVaR (tail risk)
- Supply chain resilience score
- Regulatory compliance buffer
- Strategic diversity index
The MOEA generates a set of Pareto-optimal solutions, which are then ranked using a learned preference model (trained on historical decision-maker choices) to select a preferred solution.
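Pareto filtering itself is easy to state. The sketch below is a naive O(n²) dominance check, not NSGA-II's fast non-dominated sort; it assumes every objective is expressed as a minimization, so maximized objectives such as expected return would be negated first:

```python
import numpy as np

def pareto_front(F):
    """Return a boolean mask of the non-dominated rows of F, where each
    row is one solution's objective vector and all objectives are
    minimized. A point is dominated if another point is <= in every
    objective and strictly < in at least one."""
    F = np.asarray(F, dtype=float)
    n = F.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        if not mask[i]:
            continue
        dominated_by = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if dominated_by.any():
            mask[i] = False
    return mask
```

For two objectives, the points (1, 2) and (2, 1) are mutually non-dominated, while (2, 2) is dominated by both. The learned preference model then ranks the surviving front.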
6.3 AI Integration
At DRL-5, machine learning models contribute in three ways:
- Demand forecasting: Deep learning models (LSTM or Transformer-based) provide distribution forecasts with uncertainty quantification.
- Risk modeling: Anomaly detection and causal inference identify emerging risks before they manifest in standard metrics.
- Preference learning: Inverse reinforcement learning infers decision-maker preferences from historical choices, enabling preference-aware Pareto solution selection.
6.4 Example
A major portfolio segment covering high-volume chronic disease medications has DRI = 0.87 following a stable 18-month post-disruption period. DRL-5 optimization generates 2,400 Pareto-optimal solutions across four objectives. The preference learning model identifies a solution that achieves 94% of maximum expected return, 89% of minimum CVaR, and maximizes supply resilience — a combination that matches the decision-maker’s revealed preferences from the past 24 months.
```mermaid
graph LR
    subgraph DRL1["DRL-1 (DRI < 0.20)"]
        A1[No optimization<br/>Freeze status quo]
    end
    subgraph DRL2["DRL-2 (0.20–0.40)"]
        A2[Proportional<br/>Rebalancing]
    end
    subgraph DRL3["DRL-3 (0.40–0.60)"]
        A3[Linear<br/>Programming LP]
    end
    subgraph DRL4["DRL-4 (0.60–0.80)"]
        A4[CVaR Stochastic<br/>Optimization]
    end
    subgraph DRL5["DRL-5 (DRI ≥ 0.80)"]
        A5[NSGA-II Multi-<br/>Objective AI]
    end
    DRL1 -->|DRI improves| DRL2
    DRL2 -->|DRI improves| DRL3
    DRL3 -->|DRI improves| DRL4
    DRL4 -->|DRI improves| DRL5
    DRL5 -->|DRI drops| DRL4
```
7. Strategy Comparison and Selection Guidance
| Factor | DRL-1 | DRL-2 | DRL-3 | DRL-4 | DRL-5 |
|---|---|---|---|---|---|
| Min DRI | — | 0.20 | 0.40 | 0.60 | 0.80 |
| Data requirement | None | Ordinal rankings | Point estimates | Distributions | Full uncertainty quant. |
| Computation time | Trivial | Seconds | Minutes | Hours | Hours–Days |
| Expected improvement | 0% | 3–8% | 8–15% | 12–22% | 18–30% |
| Risk of backfiring | N/A | Low | Medium | Low (explicit risk) | Low (explicit risk) |
Expected improvement figures are illustrative estimates from HPF-P pilot deployments and should be validated in specific organizational contexts.
```mermaid
xychart-beta
    title "Expected Portfolio Improvement by DRL Level"
    x-axis ["DRL-1", "DRL-2", "DRL-3", "DRL-4", "DRL-5"]
    y-axis "Expected Improvement (%)" 0 --> 35
    bar [0, 5.5, 11.5, 17, 24]
    line [0, 5.5, 11.5, 17, 24]
```
8. Conclusion
The five-level DRL framework provides a complete and principled taxonomy for pharmaceutical portfolio optimization strategy selection. By mapping information quality (DRI) to optimization method (DRL), the framework ensures that decision-making resources are allocated appropriately: conservative approaches where information is limited, sophisticated AI methods where information quality justifies them.
The key insight is that optimization method selection should be a function of information availability, not of organizational preference or algorithmic fashion. DRL operationalizes this principle in a form that is quantitative, auditable, and directly implementable in the HPF-P platform.
References
- Rockafellar, R. T., & Uryasev, S. (2000). Optimization of conditional value-at-risk. Journal of Risk, 2(3), 21–41.
- Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182–197.
- Abbeel, P., & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. ICML.
- Markowitz, H. (1952). Portfolio selection. Journal of Finance, 7(1), 77–91.