🧭 Should Defence-Adjacent Startups Use Claude? A Clear Framework
This weekend TechCrunch published a detailed piece addressing the most practical question surfacing in developer forums and startup communities throughout the week: for startups working at the intersection of technology and defence, from logistics optimisation to satellite data analysis to veteran services, does the Pentagon's supply chain designation change the risk calculus of building on Claude? The answer is more nuanced than a simple yes or no, and the piece provides a useful framework for founders and engineering leads trying to make an informed decision rather than a reactive one.
A practical framework for the decision
- Direct DoD contractor? Review carefully: if your company holds or seeks direct DoD prime or sub-contracts, your compliance team needs to assess whether the designation requires certification of non-use — this is the narrow category actually affected
- Dual-use civilian/defence work? Likely unaffected: startups working in sectors with both civilian and defence applications — cybersecurity, logistics, geospatial — and not holding direct DoD contracts are not within the scope of the designation
- Defence-adjacent civilian work? No change: veteran services, government procurement platforms, defence industry HR and training tools — none of these fall within the supply chain risk framework unless they involve direct DoD contracts
- The practical hedge: for any startup in an uncertain position, building a model-agnostic abstraction layer (so Claude, GPT-4, or Gemini can be swapped without a re-architecture) is good engineering practice regardless — and becomes specifically valuable in a volatile regulatory environment
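The abstraction layer described in the last bullet can be as thin as a single interface that the rest of the codebase depends on. A minimal sketch follows; the class and method names (`ChatModel`, `complete`, the backend classes) are illustrative placeholders, not a real library, and the production calls are indicated only in comments:

```python
from typing import Protocol


class ChatModel(Protocol):
    """Provider-agnostic interface the application codes against."""

    def complete(self, system: str, user: str) -> str: ...


class ClaudeBackend:
    def complete(self, system: str, user: str) -> str:
        # In production: call the Anthropic Messages API here.
        return f"[claude] {user}"


class OpenAIBackend:
    def complete(self, system: str, user: str) -> str:
        # In production: call the OpenAI chat completions endpoint here.
        return f"[gpt] {user}"


def answer(model: ChatModel, question: str) -> str:
    # Application logic sees only the ChatModel interface, so swapping
    # providers is a configuration change, not a re-architecture.
    return model.complete("You are a helpful assistant.", question)


print(answer(ClaudeBackend(), "ping"))  # [claude] ping
```

The point of the `Protocol` is that no business-logic module ever imports a provider SDK directly; only the backend classes do.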
Bottom line for most startups: the designation affects a narrow slice of the AI vendor ecosystem that is in direct contractual relationship with the DoD. For the vast majority of startups — including those with defence sector customers — Claude's commercial availability is unchanged. The correct response to uncertainty is a documented risk assessment, not a platform switch.
startups
enterprise
compliance
AI policy
retrospective
🧭 Prompt Caching in Production — A Deep Dive for Cost-Conscious Developers
With millions of new users arriving daily and API traffic at record levels, prompt caching — Anthropic's mechanism for reusing the computed key-value state of a large system prompt across multiple requests — has become one of the most discussed cost and latency optimisations in the developer community. The feature has been generally available since late 2025 and the documentation is thorough, but real-world production usage is surfacing patterns and edge cases that are worth understanding before designing your caching strategy. This entry summarises the practical lessons that have emerged from developers sharing their experiences in the Anthropic developer community and on social media.
How prompt caching works and when it pays off
- What gets cached: the prefix of your prompt — typically the system prompt and any large context block — is cached server-side after its first use; subsequent requests that share the identical prefix pay a reduced input token price and see dramatically lower time-to-first-token
- Cache hit conditions: the cached prefix must be byte-identical — any change, including whitespace, breaks the cache; design your system prompt to be stable and put all dynamic content after the cache breakpoint (the `cache_control` marker in the request)
- Cost structure: cached input tokens are priced at approximately 10% of the standard input token price, while the initial cache write is priced at 125% — so the 25% write premium is already recovered on the first cache hit, and the economics improve substantially with every subsequent hit
- Cache lifetime: caches expire after 5 minutes of non-use; for high-traffic applications this is rarely an issue, but low-traffic applications with large system prompts may see sporadic cache misses
- Best use cases: large RAG context blocks, long coding context files, multi-document analysis with a shared document set, and any application where the system prompt exceeds ~1,000 tokens and requests are frequent
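The multipliers in the cost bullet make the break-even arithmetic easy to check. A few lines, measuring everything in units of one uncached request over the cached prefix:

```python
def cumulative_cost(requests: int,
                    write_mult: float = 1.25,
                    read_mult: float = 0.10) -> float:
    """Cost of `requests` identical-prefix calls, in units of one
    uncached request: one cache write, then cache reads."""
    if requests == 0:
        return 0.0
    return write_mult + read_mult * (requests - 1)


for n in (1, 2, 10, 100):
    print(f"{n:>3} requests: cached {cumulative_cost(n):6.2f} "
          f"vs uncached {float(n):6.2f}")
# 1 request:  1.25 vs 1.00  (caching alone costs more)
# 2 requests: 1.35 vs 2.00  (first hit already pays back the premium)
# 100 requests: 11.15 vs 100.00 (~89% saved on the cached prefix)
```

Note this only models the cached prefix; per-request dynamic tokens and output tokens are billed at standard rates either way.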
Quick win: if your application has a system prompt longer than 2,000 tokens and handles more than a few requests per hour, enabling prompt caching is one of the highest-ROI optimisations available — typically reducing both cost and latency by 50–80% on the cached portion of the prompt.
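Enabling the feature is a small request change: mark the stable prefix with a `cache_control` block and keep dynamic content after it. The sketch below builds the request parameters without making a network call; the model name and prompt text are placeholders:

```python
# Sketch: marking a stable system prompt cacheable in an Anthropic
# Messages API request. Model name and prompt text are placeholders.

LARGE_SYSTEM_PROMPT = "You are a support assistant. " + "Reference material. " * 200


def build_cached_request(user_query: str) -> dict:
    """Build Messages API kwargs with the system prompt marked cacheable."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name
        "max_tokens": 1024,
        # The stable prefix carries the cache_control marker; everything
        # up to and including this block is eligible for caching.
        "system": [
            {
                "type": "text",
                "text": LARGE_SYSTEM_PROMPT,
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # Dynamic, per-request content goes after the cache boundary.
        "messages": [{"role": "user", "content": user_query}],
    }


# In production, pass these kwargs to the SDK:
#   client = anthropic.Anthropic()
#   response = client.messages.create(**build_cached_request(query))
req = build_cached_request("Where is my order?")
print(req["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```

After a real call, the response's usage fields (`cache_creation_input_tokens`, `cache_read_input_tokens`) show whether a given request wrote to or read from the cache, which is the quickest way to verify your prefix is actually stable.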
prompt caching
API
cost optimisation
best practices
retrospective