30 Articles · 4 Research Phases · 2026 · In Progress
Abstract
How do large language models remember? The key-value cache is the dominant memory structure in transformer inference, yet its behaviour, limitations, and optimisation remain poorly systematised. This series provides a rigorous, benchmark-driven investigation of AI memory systems: from KV-cache fundamentals and attention memory patterns through compression techniques, architectural comparisons, and distributed caching infrastructure, to the economics of context caching and emerging paradigms that blur the line between attention memory and retrieval-augmented memory. Across 30 articles organised in four phases — Foundation & Benchmarking, Optimisation Techniques, Infrastructure, and Economics & Emerging Directions — the series builds a unified evidence base for understanding, measuring, and improving how transformer models store, retrieve, and forget information.
Articles Technical Research · 29 published
All Articles
1 KV-Cache Fundamentals — How Transformers Remember (and Forget) DOI 2/10 70
Badge · Metric · Value · Status · Description
[s] · Reviewed Sources · 7% · ○ · ≥80% from editorially reviewed sources
[t] · Trusted · 100% · ✓ · ≥80% from verified, high-quality sources
[a] · DOI · 100% · ✓ · ≥80% have a Digital Object Identifier
[b] · CrossRef · 14% · ○ · ≥80% indexed in CrossRef
[i] · Indexed · 93% · ✓ · ≥80% have metadata indexed
[l] · Academic · 14% · ○ · ≥80% from journals/conferences/preprints
[f] · Free Access · 0% · ○ · ≥80% are freely accessible
[r] · References · 14 refs · ✓ · Minimum 10 references required
[w] · Words [REQ] · 2,794 · ✓ · Minimum 2,000 words for a full research article. Current: 2,794
[d] · DOI [REQ] · ✓ · ✓ · Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19112532
[o] · ORCID [REQ] · ✓ · ✓ · Author ORCID verified for academic identity
[p] · Peer Reviewed [REQ] · — · ✗ · Peer reviewed by an assigned reviewer
[h] · Freshness [REQ] · 57% · ✗ · ≥80% of references from 2025–2026. Current: 57%
[c] · Data Charts · 0 · ○ · Original data charts from reproducible analysis (min 2). Current: 0
[g] · Code · — · ○ · Source code available on GitHub
[m] · Diagrams · 3 · ✓ · Mermaid architecture/flow diagrams. Current: 3
[x] · Cited by · 0 · ○ · Referenced by 0 other hub article(s)
Score = Ref Trust (82 × 60%) + Required (3/5 × 30%) + Optional (1/4 × 10%)
Technical Research · Mar 19, 2026 · 14 min read
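The score line above can be reproduced with a short calculation. The sketch below assumes the printed formula is complete, that the five required and four optional badges enter as the fractions shown, and that the result is rounded half up to give the displayed score. The function name `hub_score` and the rounding rule are illustrative assumptions, not taken from the site.

```python
from fractions import Fraction
from math import floor


def hub_score(ref_trust, required_met, optional_met,
              required_total=5, optional_total=4):
    """Weighted score: 60% reference trust, 30% required badges, 10% optional badges.

    ref_trust is the 0-100 reference-trust value printed in the score line.
    Exact fractions are used so that .5 cases are not disturbed by float error.
    """
    total = (
        Fraction(ref_trust) * Fraction(60, 100)
        + Fraction(required_met, required_total) * 30
        + Fraction(optional_met, optional_total) * 10
    )
    # Round half up (assumed rule; the page does not state how it rounds).
    return floor(total + Fraction(1, 2))


if __name__ == "__main__":
    # Article 1: Ref Trust 82, 3 of 5 required, 1 of 4 optional.
    print(hub_score(82, 3, 1))  # 70
```

For article 1 the weighted sum is 49.2 + 18 + 2.5 = 69.7, which rounds to the displayed score of 70.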
2 Attention Memory Patterns — What Models Actually Store in KV-Cache DOI 2/10 70
[s] Reviewed Sources 11% ○ · [t] Trusted 100% ✓ · [a] DOI 95% ✓ · [b] CrossRef 11% ○ · [i] Indexed 100% ✓ · [l] Academic 11% ○ · [f] Free Access 11% ○ · [r] References 19 refs ✓
[w] Words [REQ] 2,736 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19116558 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 39% ✗
[c] Data Charts 0 ○ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (82 × 60%) + Required (3/5 × 30%) + Optional (1/4 × 10%)
Technical Research · Mar 19, 2026 · 14 min read
3 Context Window Utilization — How Much of the Window Do Models Really Use? DOI 2/10 65
[s] Reviewed Sources 7% ○ · [t] Trusted 93% ✓ · [a] DOI 80% ✓ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 7% ○ · [f] Free Access 13% ○ · [r] References 15 refs ✓
[w] Words [REQ] 2,874 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19160303 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 77% ✗
[c] Data Charts 0 ○ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (74 × 60%) + Required (3/5 × 30%) + Optional (1/4 × 10%)
Technical Research · Mar 22, 2026 · 14 min read
4 Long-Context Retrieval Benchmarks — Needle-in-Haystack and Beyond DOI 10/10 61
[s] Reviewed Sources 17% ○ · [t] Trusted 83% ✓ · [a] DOI 25% ○ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 67% ○ · [f] Free Access 92% ✓ · [r] References 12 refs ✓
[w] Words [REQ] 2,043 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19163187 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 10% ✗
[c] Data Charts 0 ○ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (67 × 60%) + Required (3/5 × 30%) + Optional (1/4 × 10%)
Technical Research · Mar 22, 2026 · 10 min read
5 Memory Degradation Curves — How Accuracy Decays with Context Length DOI 2/10 69
[s] Reviewed Sources 0% ○ · [t] Trusted 87% ✓ · [a] DOI 80% ✓ · [b] CrossRef 0% ○ · [i] Indexed 93% ✓ · [l] Academic 0% ○ · [f] Free Access 13% ○ · [r] References 15 refs ✓
[w] Words [REQ] 2,523 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19170557 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 92% ✓
[c] Data Charts 0 ○ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (70 × 60%) + Required (4/5 × 30%) + Optional (1/4 × 10%)
Technical Research · Mar 22, 2026 · 13 min read
6 KV-Cache Compression Benchmarks — Quantization vs Eviction vs Pruning DOI 8/10 72
[s] Reviewed Sources 0% ○ · [t] Trusted 93% ✓ · [a] DOI 87% ✓ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 0% ○ · [f] Free Access 13% ○ · [r] References 15 refs ✓
[w] Words [REQ] 2,393 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19176966 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 92% ✓
[c] Data Charts 0 ○ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (75 × 60%) + Required (4/5 × 30%) + Optional (1/4 × 10%)
Technical Research · Mar 23, 2026 · 12 min read
7 Cross-Architecture Memory Comparison — Llama vs Mistral vs Gemma vs Qwen DOI 6/10 63
[s] Reviewed Sources 0% ○ · [t] Trusted 79% ○ · [a] DOI 64% ○ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 7% ○ · [f] Free Access 36% ○ · [r] References 14 refs ✓
[w] Words [REQ] 2,222 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19183148 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 58% ✗
[c] Data Charts 5 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (66 × 60%) + Required (3/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 23, 2026 · 11 min read
8 Prompt Caching Efficiency — Measuring Reuse Across Real Workloads DOI 1/10 72
[s] Reviewed Sources 9% ○ · [t] Trusted 91% ✓ · [a] DOI 73% ○ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 9% ○ · [f] Free Access 18% ○ · [r] References 11 refs ✓
[w] Words [REQ] 2,628 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19187992 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 89% ✓
[c] Data Charts 5 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (72 × 60%) + Required (4/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 23, 2026 · 13 min read
9 Multi-Turn Memory — How Conversation History Degrades Model Performance DOI 2/10 55
[s] Reviewed Sources 0% ○ · [t] Trusted 86% ✓ · [a] DOI 7% ○ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 71% ○ · [f] Free Access 93% ✓ · [r] References 14 refs ✓
[w] Words [REQ] 1,597 ✗ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19195991 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 0% ✗
[c] Data Charts 5 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (63 × 60%) + Required (2/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 23, 2026 · 8 min read
10 Meta-Analysis of Context Benchmarks — Building a Unified Evaluation Framework DOI 1/10 64
[s] Reviewed Sources 19% ○ · [t] Trusted 94% ✓ · [a] DOI 6% ○ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 81% ✓ · [f] Free Access 75% ○ · [r] References 16 refs ✓
[w] Words [REQ] 2,526 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19199439 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 14% ✗
[c] Data Charts 5 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (68 × 60%) + Required (3/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 24, 2026 · 13 min read
11 Paged Attention and Virtual Memory for LLM Inference DOI 2/10 61
[s] Reviewed Sources 17% ○ · [t] Trusted 75% ○ · [a] DOI 33% ○ · [b] CrossRef 17% ○ · [i] Indexed 92% ✓ · [l] Academic 50% ○ · [f] Free Access 67% ○ · [r] References 12 refs ✓
[w] Words [REQ] 2,912 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19203099 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 40% ✗
[c] Data Charts 4 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (63 × 60%) + Required (3/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 24, 2026 · 15 min read
12 Grouped-Query Attention — Cache-Efficient Architecture Design DOI 1/10 69
[s] Reviewed Sources 5% ○ · [t] Trusted 95% ✓ · [a] DOI 90% ✓ · [b] CrossRef 5% ○ · [i] Indexed 90% ✓ · [l] Academic 10% ○ · [f] Free Access 19% ○ · [r] References 21 refs ✓
[w] Words [REQ] 2,403 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19209159 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 42% ✗
[c] Data Charts 5 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (76 × 60%) + Required (3/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 24, 2026 · 12 min read
13 Speculative Decoding and Cache Reuse DOI 4/10 63
[s] Reviewed Sources 0% ○ · [t] Trusted 94% ✓ · [a] DOI 6% ○ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 83% ✓ · [f] Free Access 94% ✓ · [r] References 18 refs ✓
[w] Words [REQ] 2,662 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19210815 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 13% ✗
[c] Data Charts 4 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (67 × 60%) + Required (3/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 24, 2026 · 13 min read
14 Semantic Prompt Caching — Beyond Exact Match DOI 3/10 63
[s] Reviewed Sources 0% ○ · [t] Trusted 91% ✓ · [a] DOI 9% ○ · [b] CrossRef 0% ○ · [i] Indexed 100% ✓ · [l] Academic 73% ○ · [f] Free Access 91% ✓ · [r] References 11 refs ✓
[w] Words [REQ] 2,328 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19211071 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 0% ✗
[c] Data Charts 4 ✓ · [g] Code — ○ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (66 × 60%) + Required (3/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 24, 2026 · 12 min read
15 Token Pruning and Attention Sparsity DOI 1/10 72
[s] Reviewed Sources 75% ○ · [t] Trusted 75% ○ · [a] DOI 81% ✓ · [b] CrossRef 75% ○ · [i] Indexed 75% ○ · [l] Academic 75% ○ · [f] Free Access 88% ✓ · [r] References 16 refs ✓
[w] Words [REQ] 2,298 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19269070 · [o] ORCID [REQ] ✗ ✗ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 92% ✓
[c] Data Charts 0 ○ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (82 × 60%) + Required (3/5 × 30%) + Optional (2/4 × 10%)
Technical Research · Mar 28, 2026 · 11 min read
— Production Cache Monitoring — Metrics and Capacity Planning (Draft — in preparation)
— Cross-Model Cache Transfer and Universal Formats (Draft — in preparation)
16 Cross-Layer KV-Cache Sharing DOI 2/10 54
[s] Reviewed Sources 15% ○ · [t] Trusted 35% ○ · [a] DOI 85% ✓ · [b] CrossRef 15% ○ · [i] Indexed 30% ○ · [l] Academic 15% ○ · [f] Free Access 25% ○ · [r] References 20 refs ✓
[w] Words [REQ] 2,141 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19291014 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 76% ✗
[c] Data Charts 4 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (47 × 60%) + Required (3/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 28, 2026 · 11 min read
17 Sliding Window and Compressive Caching for Infinite Context DOI 2/10 61
[s] Reviewed Sources 26% ○ · [t] Trusted 35% ○ · [a] DOI 83% ✓ · [b] CrossRef 26% ○ · [i] Indexed 35% ○ · [l] Academic 22% ○ · [f] Free Access 30% ○ · [r] References 23 refs ✓
[w] Words [REQ] 2,250 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19299498 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 80% ✓
[c] Data Charts 4 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (49 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 28, 2026 · 11 min read
18 Flash Attention's Role in Memory-Efficient Inference DOI 4/10 68
[s] Reviewed Sources 55% ○ · [t] Trusted 55% ○ · [a] DOI 75% ○ · [b] CrossRef 55% ○ · [i] Indexed 50% ○ · [l] Academic 25% ○ · [f] Free Access 45% ○ · [r] References 20 refs ✓
[w] Words [REQ] 2,893 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19303451 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 80% ✓
[c] Data Charts 5 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (60 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 29, 2026 · 14 min read
19 Distributed KV-Cache in Multi-GPU Serving DOI 3/10 75
[s] Reviewed Sources 65% ○ · [t] Trusted 65% ○ · [a] DOI 82% ✓ · [b] CrossRef 65% ○ · [i] Indexed 65% ○ · [l] Academic 47% ○ · [f] Free Access 47% ○ · [r] References 17 refs ✓
[w] Words [REQ] 2,267 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19310103 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 86% ✓
[c] Data Charts 4 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (72 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 29, 2026 · 11 min read
20 Disaggregated Prefill and Decode Architectures DOI 2/10 71
[s] Reviewed Sources 56% ○ · [t] Trusted 56% ○ · [a] DOI 81% ✓ · [b] CrossRef 56% ○ · [i] Indexed 56% ○ · [l] Academic 50% ○ · [f] Free Access 19% ○ · [r] References 16 refs ✓
[w] Words [REQ] 2,157 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19316904 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 85% ✓
[c] Data Charts 4 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (66 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 29, 2026 · 11 min read
21 Cache-Aware Request Scheduling and Batching DOI 1/10 74
[s] Reviewed Sources 50% ○ · [t] Trusted 72% ○ · [a] DOI 67% ○ · [b] CrossRef 50% ○ · [i] Indexed 56% ○ · [l] Academic 72% ○ · [f] Free Access 67% ○ · [r] References 18 refs ✓
[w] Words [REQ] 2,876 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19325142 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 80% ✓
[c] Data Charts 5 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (70 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 30, 2026 · 14 min read
22 Memory Hierarchy — DRAM, HBM, and SSD-Backed Caches DOI 1/10 53
[s] Reviewed Sources 15% ○ · [t] Trusted 54% ○ · [a] DOI 38% ○ · [b] CrossRef 15% ○ · [i] Indexed 23% ○ · [l] Academic 54% ○ · [f] Free Access 69% ○ · [r] References 13 refs ✓
[w] Words [REQ] 1,733 ✗ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19329971 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 80% ✓
[c] Data Charts 4 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (45 × 60%) + Required (3/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 30, 2026 · 9 min read
23 Cache Coherence in Multi-Tenant Deployments DOI 1/10 68
[s] Reviewed Sources 40% ○ · [t] Trusted 60% ○ · [a] DOI 65% ○ · [b] CrossRef 40% ○ · [i] Indexed 45% ○ · [l] Academic 60% ○ · [f] Free Access 50% ○ · [r] References 20 refs ✓
[w] Words [REQ] 2,358 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19336721 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 82% ✓
[c] Data Charts 4 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (61 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 30, 2026 · 12 min read
24 Production Cache Monitoring — Metrics and Capacity Planning DOI 1/10 69
[s] Reviewed Sources 48% ○ · [t] Trusted 57% ○ · [a] DOI 57% ○ · [b] CrossRef 52% ○ · [i] Indexed 61% ○ · [l] Academic 57% ○ · [f] Free Access 35% ○ · [r] References 23 refs ✓
[w] Words [REQ] 2,607 ✓ · [d] DOI [REQ] ✓ ✓ DOI: 10.5281/zenodo.19340506 · [o] ORCID [REQ] ✓ ✓ · [p] Peer Reviewed [REQ] — ✗ · [h] Freshness [REQ] 85% ✓
[c] Data Charts 5 ✓ · [g] Code ✓ ✓ · [m] Diagrams 3 ✓ · [x] Cited by 0 ○
Score = Ref Trust (62 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 30, 2026 · 13 min read
25 The Economics of Context Caching — Cost Models and Break-Even DOI 1/10 87
Badge · Metric · Value · Status · Description
[s] · Reviewed Sources · 83% · ✓ · ≥80% from editorially reviewed sources
[t] · Trusted · 94% · ✓ · ≥80% from verified, high-quality sources
[a] · DOI · 83% · ✓ · ≥80% have a Digital Object Identifier
[b] · CrossRef · 83% · ✓ · ≥80% indexed in CrossRef
[i] · Indexed · 86% · ✓ · ≥80% have metadata indexed
[l] · Academic · 83% · ✓ · ≥80% from journals/conferences/preprints
[f] · Free Access · 60% · ○ · ≥80% are freely accessible
[r] · References · 35 refs · ✓ · Minimum 10 references required
[w] · Words [REQ] · 2,944 · ✓ · Minimum 2,000 words for a full research article. Current: 2,944
[d] · DOI [REQ] · ✓ · ✓ · Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19343122
[o] · ORCID [REQ] · ✓ · ✓ · Author ORCID verified for academic identity
[p] · Peer Reviewed [REQ] · — · ✗ · Peer reviewed by an assigned reviewer
[h] · Freshness [REQ] · 84% · ✓ · ≥80% of references from 2025–2026. Current: 84%
[c] · Data Charts · 4 · ✓ · Original data charts from reproducible analysis (min 2). Current: 4
[g] · Code · ✓ · ✓ · Source code available on GitHub
[m] · Diagrams · 3 · ✓ · Mermaid architecture/flow diagrams. Current: 3
[x] · Cited by · 0 · ○ · Referenced by 0 other hub article(s)
Score = Ref Trust (92 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 31, 2026 · 15 min read
26 Cache-Augmented Retrieval — RAG Meets KV-Cache DOI 1/10 62
Badge · Metric · Value · Status · Description
[s] · Reviewed Sources · 30% · ○ · ≥80% from editorially reviewed sources
[t] · Trusted · 55% · ○ · ≥80% from verified, high-quality sources
[a] · DOI · 40% · ○ · ≥80% have a Digital Object Identifier
[b] · CrossRef · 30% · ○ · ≥80% indexed in CrossRef
[i] · Indexed · 35% · ○ · ≥80% have metadata indexed
[l] · Academic · 55% · ○ · ≥80% from journals/conferences/preprints
[f] · Free Access · 85% · ✓ · ≥80% are freely accessible
[r] · References · 20 refs · ✓ · Minimum 10 references required
[w] · Words [REQ] · 3,487 · ✓ · Minimum 2,000 words for a full research article. Current: 3,487
[d] · DOI [REQ] · ✓ · ✓ · Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19348524
[o] · ORCID [REQ] · ✓ · ✓ · Author ORCID verified for academic identity
[p] · Peer Reviewed [REQ] · — · ✗ · Peer reviewed by an assigned reviewer
[h] · Freshness [REQ] · 85% · ✓ · ≥80% of references from 2025–2026. Current: 85%
[c] · Data Charts · 4 · ✓ · Original data charts from reproducible analysis (min 2). Current: 4
[g] · Code · ✓ · ✓ · Source code available on GitHub
[m] · Diagrams · 3 · ✓ · Mermaid architecture/flow diagrams. Current: 3
[x] · Cited by · 0 · ○ · Referenced by 0 other hub article(s)
Score = Ref Trust (50 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 31, 2026 · 17 min read
27 Retrieval-Augmented Memory vs Pure Attention Memory DOI 1/10 61
Badge · Metric · Value · Status · Description
[s] · Reviewed Sources · 16% · ○ · ≥80% from editorially reviewed sources
[t] · Trusted · 68% · ○ · ≥80% from verified, high-quality sources
[a] · DOI · 26% · ○ · ≥80% have a Digital Object Identifier
[b] · CrossRef · 16% · ○ · ≥80% indexed in CrossRef
[i] · Indexed · 21% · ○ · ≥80% have metadata indexed
[l] · Academic · 74% · ○ · ≥80% from journals/conferences/preprints
[f] · Free Access · 89% · ✓ · ≥80% are freely accessible
[r] · References · 19 refs · ✓ · Minimum 10 references required
[w] · Words [REQ] · 2,202 · ✓ · Minimum 2,000 words for a full research article. Current: 2,202
[d] · DOI [REQ] · ✓ · ✓ · Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19354653
[o] · ORCID [REQ] · ✓ · ✓ · Author ORCID verified for academic identity
[p] · Peer Reviewed [REQ] · — · ✗ · Peer reviewed by an assigned reviewer
[h] · Freshness [REQ] · 87% · ✓ · ≥80% of references from 2025–2026. Current: 87%
[c] · Data Charts · 4 · ✓ · Original data charts from reproducible analysis (min 2). Current: 4
[g] · Code · ✓ · ✓ · Source code available on GitHub
[m] · Diagrams · 3 · ✓ · Mermaid architecture/flow diagrams. Current: 3
[x] · Cited by · 0 · ○ · Referenced by 0 other hub article(s)
Score = Ref Trust (49 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 31, 2026 · 11 min read
28 Biological Memory Models and Their AI Analogues DOI 1/10 51
Badge · Metric · Value · Status · Description
[s] · Reviewed Sources · 6% · ○ · ≥80% from editorially reviewed sources
[t] · Trusted · 41% · ○ · ≥80% from verified, high-quality sources
[a] · DOI · 12% · ○ · ≥80% have a Digital Object Identifier
[b] · CrossRef · 6% · ○ · ≥80% indexed in CrossRef
[i] · Indexed · 6% · ○ · ≥80% have metadata indexed
[l] · Academic · 71% · ○ · ≥80% from journals/conferences/preprints
[f] · Free Access · 88% · ✓ · ≥80% are freely accessible
[r] · References · 17 refs · ✓ · Minimum 10 references required
[w] · Words [REQ] · 2,727 · ✓ · Minimum 2,000 words for a full research article. Current: 2,727
[d] · DOI [REQ] · ✓ · ✓ · Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19360007
[o] · ORCID [REQ] · ✓ · ✓ · Author ORCID verified for academic identity
[p] · Peer Reviewed [REQ] · — · ✗ · Peer reviewed by an assigned reviewer
[h] · Freshness [REQ] · 93% · ✓ · ≥80% of references from 2025–2026. Current: 93%
[c] · Data Charts · 4 · ✓ · Original data charts from reproducible analysis (min 2). Current: 4
[g] · Code · ✓ · ✓ · Source code available on GitHub
[m] · Diagrams · 3 · ✓ · Mermaid architecture/flow diagrams. Current: 3
[x] · Cited by · 0 · ○ · Referenced by 0 other hub article(s)
Score = Ref Trust (33 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Mar 31, 2026 · 14 min read
29 The Future of AI Memory — From Fixed Windows to Persistent State DOI 1/10 56
Badge · Metric · Value · Status · Description
[s] · Reviewed Sources · 5% · ○ · ≥80% from editorially reviewed sources
[t] · Trusted · 55% · ○ · ≥80% from verified, high-quality sources
[a] · DOI · 20% · ○ · ≥80% have a Digital Object Identifier
[b] · CrossRef · 5% · ○ · ≥80% indexed in CrossRef
[i] · Indexed · 10% · ○ · ≥80% have metadata indexed
[l] · Academic · 80% · ✓ · ≥80% from journals/conferences/preprints
[f] · Free Access · 95% · ✓ · ≥80% are freely accessible
[r] · References · 20 refs · ✓ · Minimum 10 references required
[w] · Words [REQ] · 2,000 · ✓ · Minimum 2,000 words for a full research article. Current: 2,000
[d] · DOI [REQ] · ✓ · ✓ · Zenodo DOI registered for persistent citation. DOI: 10.5281/zenodo.19363248
[o] · ORCID [REQ] · ✓ · ✓ · Author ORCID verified for academic identity
[p] · Peer Reviewed [REQ] · — · ✗ · Peer reviewed by an assigned reviewer
[h] · Freshness [REQ] · 88% · ✓ · ≥80% of references from 2025–2026. Current: 88%
[c] · Data Charts · 4 · ✓ · Original data charts from reproducible analysis (min 2). Current: 4
[g] · Code · ✓ · ✓ · Source code available on GitHub
[m] · Diagrams · 3 · ✓ · Mermaid architecture/flow diagrams. Current: 3
[x] · Cited by · 0 · ○ · Referenced by 0 other hub article(s)
Score = Ref Trust (41 × 60%) + Required (4/5 × 30%) + Optional (3/4 × 10%)
Technical Research · Apr 1, 2026 · 10 min read
29 published · 1,513 total views · 358 min total reading · Mar 2026 – Apr 2026 published
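The Score lines in the listing above combine three weighted terms. The sketch below is one reading of that formula, not the hub's actual implementation: it assumes the Required and Optional fractions are scaled to percentages before weighting and that the displayed score is rounded to the nearest integer. Both assumptions reproduce the scores shown for articles 23 to 29. The function name `hub_score` and its parameters are hypothetical, and the Ref Trust value is taken as given because this page does not say how it is derived.

```python
def hub_score(ref_trust: float, required_met: int, optional_met: int,
              required_total: int = 5, optional_total: int = 4) -> float:
    """Weighted score on a 0-100 scale (assumed reading of the Score lines)."""
    # Assumption: badge fractions such as 4/5 are scaled to percentages.
    required_pct = 100 * required_met / required_total
    optional_pct = 100 * optional_met / optional_total
    # Weights as printed in the Score lines: 60% / 30% / 10%.
    return 0.60 * ref_trust + 0.30 * required_pct + 0.10 * optional_pct


# Article 23: Ref Trust 61, 4 of 5 required badges, 3 of 4 optional badges.
# 36.6 + 24.0 + 7.5 = 68.1
print(round(hub_score(61, 4, 3)))  # prints 68
```

With these inputs the result matches the 68 shown next to article 23; the same call with a Ref Trust of 92 gives 87, as listed for article 25.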
Scope
Foundation & Benchmarking (1–10): KV-cache fundamentals, attention memory patterns, context window utilisation, long-context retrieval benchmarks, memory degradation curves, compression benchmarks, cross-architecture comparison, prompt caching efficiency, multi-turn memory, and a meta-analysis of context benchmarks
Optimisation Techniques (11–18): Paged attention, grouped-query attention, speculative decoding, semantic prompt caching, token pruning, cross-layer cache sharing, sliding window and compressive caching, flash attention
Infrastructure (19–24): Distributed KV-cache, disaggregated prefill/decode, cache-aware scheduling, memory hierarchy (DRAM/HBM/SSD), cache coherence in multi-tenant systems, production monitoring
Economics & Emerging (25–30): Cost models, cache-augmented retrieval (RAG meets KV-cache), cross-model cache transfer, retrieval-augmented vs attention memory, biological memory analogues, the future of AI memory
Editorial Standards
Each article follows academic conventions: formal abstract, structured methodology, empirical evidence or rigorous literature synthesis, proper citations with DOIs, and reproducible analysis where applicable. The series maintains a monochrome design language consistent with the Stabilarity Research Hub editorial identity.
Submit
Researchers and practitioners working on KV-cache optimisation, long-context inference, or memory-efficient architectures are invited to suggest topics, share benchmarks, or propose guest contributions. Contact via the GitHub repository.