# Tampa Dynamics — Full Content

Founder-led, engineering-first consultancy that designs and builds secure, cloud-native platforms and AI workflows for regulated industries (healthcare, legal, financial services).

Site: https://tampadynamics.com
Generated: 2026-05-21T21:59:57.853Z

This file concatenates every published article, guide, case study, and comparison on the site. Each entry is preceded by its canonical URL.

# Blog

---

## AI Document Analysis for Regulated Industries: A Production Architecture Guide
Source: https://tampadynamics.com/blog/ai-document-analysis

> How to design AI document analysis pipelines that hold up under HIPAA, SOC 2, and legal review. Extraction, RAG, accuracy thresholds, hallucination mitigation, and the architectural decisions that determine whether your system passes audit.

Date: 2026-02-14

Document analysis is one of the highest-value applications of AI in regulated industries — and one of the most frequently misunderstood. Teams often come in expecting a solution that "reads documents and answers questions." What they get, if the system is not designed carefully, is a pipeline that appears to work in demos and fails in production on the edge cases that matter most.

This guide covers what document analysis actually involves at the system design level, where the real complexity lives, and how to make architecture decisions that hold up under compliance scrutiny.

---

## What Document Analysis Actually Means

"Document analysis" is not a single capability. It is a family of distinct tasks, and conflating them is the source of most project failures.

**Extraction** is pulling structured data from unstructured text — dates, names, dollar amounts, clause identifiers, diagnosis codes. The input is a document; the output is a structured record.

**Classification** is assigning a document or section to a category — contract type, claim status, document priority. The input is text or a document; the output is a label.

**Understanding** — the capability most teams actually want — is answering questions about document content, summarizing complex documents, identifying inconsistencies, or reasoning across multiple documents. This is where large language models are most useful and where hallucination risk is highest.

Each of these tasks has different accuracy characteristics, different failure modes, and different implementation requirements. A system that needs to extract structured fields from a known document template has very different architecture requirements than one that needs to answer open-ended questions about a collection of contracts.

Before designing anything, define which of these tasks your system needs to do, in what combination, with what accuracy threshold, and under what regulatory constraints.

---

## Rule-Based vs. ML-Based Approaches

The default assumption in 2026 is that ML — specifically LLMs — is the right tool for all document analysis tasks. That assumption is worth interrogating.

**Rule-based extraction** using regular expressions, template matching, or structured parsers is still the right choice when:

- Document structure is consistent and predictable (e.g., specific form types, standard templates)
- The extracted fields are well-defined and have predictable formats
- Auditability requires deterministic, inspectable logic
- The document volume justifies the upfront engineering investment

A prior authorization form that always places the diagnosis code in the same position does not need a language model. A deterministic parser is faster, cheaper, more accurate on that specific template, and easier to audit.
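As a minimal sketch of such a deterministic parser, assume a hypothetical form whose text contains a line labeled `Diagnosis Code:` followed by an ICD-10-style code. The label, the function name, and the simplified pattern are assumptions for illustration, not a real payer template.

```python
import re
from typing import Optional

# Simplified ICD-10-CM shape: letter, digit, alphanumeric, optional ".xxxx".
# Assumption: real validation would also check the official code set.
DIAGNOSIS_LINE = re.compile(
    r"^Diagnosis Code:\s*([A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?)\s*$",
    re.MULTILINE,
)

def extract_diagnosis_code(form_text: str) -> Optional[str]:
    """Return the diagnosis code from the labeled line, or None if absent."""
    match = DIAGNOSIS_LINE.search(form_text)
    return match.group(1) if match else None

# Hypothetical form text with made-up values.
sample_form = "Patient: Jane Doe\nDiagnosis Code: E11.9\nRequested Service: MRI\n"
code = extract_diagnosis_code(sample_form)
```

Because the logic is a single inspectable pattern, an auditor can read exactly why a value was or was not extracted, and a form that does not match returns `None` instead of a guess.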

**ML-based approaches** — including LLMs and fine-tuned models — are appropriate when:

- Document structure varies significantly across instances
- The task requires semantic understanding, not just pattern matching
- Documents contain natural language reasoning that must be interpreted, not just extracted

**The practical recommendation** is a layered architecture: rule-based extraction for structured, predictable fields; ML models for classification and semantic tasks; LLMs for understanding and generation tasks that cannot be reduced to extraction or classification. Reserve the most expensive, least deterministic components for the tasks where they are genuinely necessary.

---

## OCR and Document Preprocessing

Language models do not read PDFs. They read text. Before any ML-based document analysis can occur, your documents need to be converted to clean text, and that conversion step is where many production systems degrade.

### OCR Quality Is a Limiting Factor

For scanned documents — common in healthcare (faxed records, scanned intake forms) and legal (historical contracts, court filings) — OCR quality directly determines downstream accuracy. A language model cannot reason correctly about text that has been garbled by a poor OCR pass.

Key OCR considerations:

- **Engine selection** — AWS Textract, Google Document AI, and Azure Document Intelligence each have different accuracy profiles across document types. Evaluate on your actual document corpus, not benchmarks.
- **Document quality preprocessing** — Deskewing, denoising, contrast normalization, and resolution normalization upstream of OCR materially improve output quality.
- **Table and form detection** — General OCR reads text linearly. Documents with tables, checkboxes, and multi-column layouts require layout-aware extraction to preserve the semantic relationships between fields.
- **Confidence scoring** — Production OCR pipelines should expose per-field confidence scores and route low-confidence extractions to human review rather than passing them silently to downstream components.
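The confidence-scoring point above can be sketched as a simple routing step. The `OcrField` structure and the 0.90 threshold are hypothetical; each OCR engine reports confidence in its own format, and the threshold should be tuned on your own corpus.

```python
from dataclasses import dataclass

@dataclass
class OcrField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the OCR engine

def route_fields(fields, threshold=0.90):
    """Split fields into (accepted, needs_review) by confidence threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review

# Made-up OCR output: the second field was read with low confidence.
fields = [
    OcrField("date_of_service", "2026-01-15", 0.98),
    OcrField("member_id", "A1234S678", 0.71),
]
accepted, needs_review = route_fields(fields)
```

Here `member_id` lands in the review queue rather than flowing silently into downstream components.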

### Text Normalization and Chunking

After OCR, raw text typically requires normalization — handling line breaks, hyphenation artifacts, header/footer stripping, and encoding issues — before it is useful for ML processing.

For RAG systems specifically, chunking strategy is a significant architectural decision. Chunks that are too small lose context; chunks that are too large dilute relevance scores and consume more of the model's context window. The right strategy depends on document structure: paragraph-based chunking for narrative documents, section-based chunking for structured reports, and hierarchical chunking for documents with clear heading hierarchies.
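A minimal sketch of paragraph-based chunking, assuming paragraphs are separated by blank lines and that chunk size is measured in characters. The function name and the `max_chars` parameter are illustrative; production systems often measure size in tokens instead.

```python
def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Group consecutive paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow, so close the chunk here.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

doc = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
chunks = chunk_paragraphs(doc, max_chars=40)
```

With a 40-character limit, the first two paragraphs share a chunk and the third starts a new one. Chunk boundaries always fall between paragraphs, never inside one.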

---

## RAG Architecture for Document Q&A

Retrieval-Augmented Generation (RAG) is the standard architecture for document question-answering in production systems. Rather than loading entire documents into a model's context window — which has cost, latency, and context length limitations — RAG retrieves the specific passages most relevant to a query and passes only those to the model.

### The Core Pipeline

A RAG document analysis pipeline consists of:

1. **Ingestion** — Documents are preprocessed, OCR'd if necessary, chunked, and converted to embeddings using an embedding model (text-embedding-3-large, Cohere embed-v3, or similar). Embeddings are stored in a vector database (Pinecone, pgvector, OpenSearch, Weaviate).

2. **Retrieval** — At query time, the user query is embedded using the same model, and the vector store returns the k most semantically similar chunks.

3. **Augmentation** — Retrieved chunks are assembled into a prompt context and passed to a language model along with the query and any system instructions.

4. **Generation** — The language model produces an answer grounded in the retrieved context.
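The four steps above can be sketched end to end with toy components. This is an illustration under loud assumptions: `embed` is a word-count stand-in for a real embedding model, the in-memory `index` stands in for a vector database, and the generation call is left out entirely.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Ingestion: embed each chunk and keep it next to its text.
chunks = [
    "The agreement renews automatically every twelve months.",
    "Either party may terminate with ninety days written notice.",
    "Invoices are payable within thirty days of receipt.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query: str, k: int = 2) -> list[str]:
    """2. Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(query: str, context: list[str]) -> str:
    """3. Augmentation: assemble the retrieved context and the question."""
    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
    return f"Answer using only the context below.\n\n{numbered}\n\nQuestion: {query}"

question = "How many days notice to terminate?"
top = retrieve(question, k=1)
prompt = build_prompt(question, top)
# 4. Generation would send `prompt` to a language model (not shown here).
```

Numbering the context passages in the prompt gives the model something concrete to cite, which matters for the attribution requirements discussed later.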

### Hybrid Search

Pure vector similarity search has known failure modes: it can miss exact matches, struggle with proper nouns and identifiers, and rank tangentially related content highly based on surface-level semantic similarity. Production systems typically combine dense vector search with sparse keyword search (BM25) in a hybrid retrieval step. This captures both semantic relevance and keyword precision.
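One common way to combine the two result lists is reciprocal rank fusion (RRF). The guide does not prescribe a fusion method, so treat this as one reasonable option. The chunk IDs are hypothetical, and `k=60` is the conventional default constant for RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)

dense_hits = ["c3", "c1", "c7"]    # hypothetical vector-search order
sparse_hits = ["c1", "c9", "c3"]   # hypothetical BM25 order
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

RRF works on ranks rather than raw scores, so it sidesteps the problem that cosine similarities and BM25 scores are on incompatible scales. Chunks that appear in both lists rise to the top.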

### Re-ranking

After initial retrieval, a cross-encoder re-ranker evaluates each retrieved chunk against the query with more precision than the initial embedding similarity. Re-ranking improves precision at the cost of latency. For regulated workflows where accuracy is more important than speed, the trade-off is usually worth it.

### Attribution

Every answer generated by a RAG system should be traceable to its source chunks. This means:

- Returning source document identifiers and chunk positions alongside generated answers
- Displaying citations in the UI so users can verify claims against source documents
- Logging which chunks were retrieved and which contributed to the final answer — this is your audit trail

Attribution is not optional in regulated industries. An AI that produces correct-looking answers without provenance is not useful for legal review, clinical decision support, or financial due diligence.
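A minimal sketch of what an attributed answer might look like as a data structure. The class names, field names, and the document identifier are hypothetical; the point is that citations travel with the answer rather than being reconstructed afterward.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Citation:
    document_id: str
    chunk_index: int   # position of the chunk within the document
    quote: str         # passage shown to the user for verification

@dataclass
class AttributedAnswer:
    text: str
    citations: list[Citation] = field(default_factory=list)

    def has_provenance(self) -> bool:
        """True only if at least one source passage backs the answer."""
        return len(self.citations) > 0

answer = AttributedAnswer(
    text="The contract renews automatically each year.",
    citations=[Citation("msa-2024-017", 4, "renews automatically every twelve months")],
)
```

A UI layer can then refuse to display, or visibly flag, any answer where `has_provenance()` is false.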

---

## Use Cases by Vertical

### Legal: Contract Review and Due Diligence

Legal document analysis typically involves:

- **Clause extraction and classification** — Identifying indemnification clauses, limitation of liability language, auto-renewal provisions, and non-standard terms across large contract sets
- **Obligation and deadline extraction** — Pulling dates, notice periods, and party-specific obligations into structured summaries
- **Inconsistency detection** — Flagging conflicts between document sections or between a contract and a template standard
- **Due diligence Q&A** — Answering questions across a data room of hundreds of documents during M&A or financing processes

The accuracy requirement in legal is extremely high. A system that misses a jurisdiction-specific limitation clause in a commercial contract creates real liability. Human review of AI-flagged issues is not optional — the AI's role is to triage and surface, not to conclude.

Attorney-client privilege considerations also shape system architecture. Legal documents in a RAG system must not be retrievable across client matter boundaries. Strict tenant isolation at the vector store and data layer is required.

### Healthcare: Prior Authorization and Clinical Documentation

Healthcare document analysis use cases include:

- **Prior authorization support** — Extracting relevant clinical criteria from patient records and matching them against payer requirements to support authorization requests
- **Clinical documentation assistance** — Extracting structured information from unstructured clinical notes to populate fields in downstream systems
- **Referral and discharge summary processing** — Parsing incoming referral documents to route and triage efficiently

HIPAA applies to the entire pipeline. The PHI in clinical documents must be handled with the same controls as any other PHI: access-controlled storage, audit logging of every retrieval, BAA with all vendors whose infrastructure processes the documents, and de-identification before data reaches any vendor that cannot provide a BAA.

### Finance: Due Diligence and Regulatory Filing Analysis

Financial services document analysis includes:

- **SEC filing analysis** — Extracting financial figures, risk factors, and forward-looking statements from 10-Ks and 10-Qs
- **Loan document review** — Identifying covenant terms, trigger conditions, and non-standard provisions across credit agreements
- **Regulatory correspondence** — Classifying and routing regulatory notices and examination findings

Financial document analysis has its own auditability requirements: investment decisions supported by AI analysis may need to demonstrate that the supporting information was accurate and appropriately sourced.

---

## Accuracy vs. Cost Trade-offs

Every document analysis system involves trade-offs between accuracy, latency, and cost. These trade-offs need to be explicit, not implicit.

**Embedding model quality** varies significantly. Higher-quality embedding models improve retrieval precision but increase per-document indexing cost and per-query latency. Evaluate on your document corpus before committing to a model.

**Generation model selection** is the largest cost variable. GPT-4o, Claude 3.5 Sonnet, and their peers produce higher-quality answers on complex documents than smaller models, but at significantly higher per-query cost. For high-volume, lower-complexity extractions, a smaller model or a fine-tuned model may provide adequate accuracy at a fraction of the cost.

**Chunk count and context length** — retrieving more chunks per query improves recall but increases prompt size, cost, and the risk of the model being confused by tangential content.

The right architecture is not the one that maximizes accuracy on all tasks — it is the one that applies the right level of capability to each task, with human review at the points where errors have the most consequence.

---

## Hallucination Risks and Mitigation

Hallucination — the model generating plausible-sounding but incorrect content — is the central reliability problem in LLM-based document analysis. In regulated industries, a hallucinated clause interpretation or fabricated clinical detail can cause direct harm.

Mitigation strategies, in order of effectiveness:

**Constrain the generation task.** Extraction tasks with explicit output schemas (JSON with defined fields) hallucinate far less than open-ended summarization tasks. Where possible, decompose complex Q&A into a series of constrained extraction sub-tasks.
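As a sketch of enforcing an explicit output schema, the check below rejects model output that adds, omits, or mistypes a field. The schema and field names are hypothetical, and a production system would more likely use a schema library than hand-written checks.

```python
import json

# Hypothetical schema: field name -> required Python type.
CLAUSE_SCHEMA = {"clause_type": str, "effective_date": str, "notice_days": int}

def parse_extraction(raw: str) -> dict:
    """Parse model output and reject anything outside the schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(CLAUSE_SCHEMA):
        mismatched = sorted(set(data) ^ set(CLAUSE_SCHEMA))
        raise ValueError(f"unexpected or missing fields: {mismatched}")
    for name, expected in CLAUSE_SCHEMA.items():
        if not isinstance(data[name], expected):
            raise ValueError(f"{name} must be {expected.__name__}")
    return data

# Made-up model output that conforms to the schema.
raw_output = '{"clause_type": "termination", "effective_date": "2026-03-01", "notice_days": 90}'
record = parse_extraction(raw_output)
```

Schema validation confirms the shape of the output, not its truth. A well-formed record can still contain a wrong value, which is why the later mitigations remain necessary.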

**Ground answers in retrieved text.** Instruct the model to answer only based on provided context and to explicitly state when the context does not contain sufficient information to answer. Evaluate whether models follow this instruction reliably on your task.

**Verify claims against source text.** Post-generation verification — checking that specific claims in the output can be found verbatim or near-verbatim in the source chunks — catches fabrications that the model produced despite constrained prompting.
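A minimal sketch of post-generation verification using only the standard library. The function names and the 0.9 ratio are assumptions. The near-verbatim test here is deliberately strict: it requires one contiguous matching run to cover most of the claim, so a changed number in the middle of a sentence fails the check.

```python
import difflib
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def claim_is_supported(claim: str, chunks: list[str], min_ratio: float = 0.9) -> bool:
    """True if the claim appears verbatim or near-verbatim in some chunk."""
    target = normalize(claim)
    if not target:
        return False
    for chunk in chunks:
        source = normalize(chunk)
        if target in source:
            return True
        # Near-verbatim: the longest contiguous match covers most of the claim.
        matcher = difflib.SequenceMatcher(None, target, source, autojunk=False)
        block = matcher.find_longest_match(0, len(target), 0, len(source))
        if block.size / len(target) >= min_ratio:
            return True
    return False

source_chunks = ["Either party may terminate with ninety days written notice."]
ok = claim_is_supported("terminate with ninety days written notice", source_chunks)
bad = claim_is_supported("terminate with thirty days written notice", source_chunks)
```

The second claim differs from the source by a single word, yet that word changes the obligation. Looser fuzzy matching would accept it, so the threshold should be validated against examples like this one.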

**Human review at high-stakes decision points.** No mitigation strategy eliminates hallucination. For decisions with significant consequences — a contract interpretation that will be executed, a clinical documentation entry that will affect care — human review is not a fallback. It is a required step in the workflow design.

---

## Compliance Considerations

### Data Retention and Storage

Documents ingested into a document analysis system need retention policies. In regulated industries, this means:

- Defining retention periods based on document type and regulatory requirements
- Implementing deletion capabilities that cover both raw documents and their derived embeddings
- Ensuring deletion of a document removes it from the vector store as well (a frequently missed step — deleting the source document does not automatically delete its embeddings)
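The deletion requirement above can be sketched with an in-memory stand-in for a document store and a vector store. The class and method names are hypothetical; the relevant idea is that embeddings are keyed by their source document ID so a single delete can cascade.

```python
class DocumentStore:
    """In-memory stand-in for a document store plus a vector store."""

    def __init__(self):
        self.documents = {}   # document_id -> raw text
        self.embeddings = {}  # (document_id, chunk_index) -> vector

    def add(self, document_id, text, vectors):
        self.documents[document_id] = text
        for i, vector in enumerate(vectors):
            self.embeddings[(document_id, i)] = vector

    def delete(self, document_id):
        """Delete the raw document and every embedding derived from it."""
        self.documents.pop(document_id, None)
        stale = [key for key in self.embeddings if key[0] == document_id]
        for key in stale:
            del self.embeddings[key]
        return len(stale)  # number of embeddings removed, useful for audit logs

store = DocumentStore()
store.add("doc-1", "full text", [[0.1, 0.2], [0.3, 0.4]])
store.add("doc-2", "other text", [[0.5, 0.6]])
removed = store.delete("doc-1")
```

Returning the count of removed embeddings gives the caller something to log as evidence that the derived data was actually purged.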

### Access Controls

Document-level access controls in a RAG system are more complex than in a traditional document management system. You need access controls that operate at the retrieval layer — not just at the document storage layer — so that a query from User A cannot surface documents that User A does not have rights to see.

This typically means:

- Tagging chunks at indexing time with access control metadata (document owner, matter, tenant, sensitivity classification)
- Filtering retrieval results by the requesting user's access rights before chunks are passed to the model
- Auditing which documents were retrieved for each query
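A minimal sketch of retrieval-layer filtering, assuming chunks were tagged with tenant and matter metadata at indexing time. The `Chunk` and `User` structures and the identifiers are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant: str
    matter: str
    text: str

@dataclass(frozen=True)
class User:
    user_id: str
    tenant: str
    matters: frozenset  # matters this user is allowed to see

def filter_by_access(user, candidates):
    """Keep only chunks in the user's tenant and permitted matters."""
    return [
        c for c in candidates
        if c.tenant == user.tenant and c.matter in user.matters
    ]

# Made-up retrieval results spanning two tenants and two matters.
candidates = [
    Chunk("c1", "firm-a", "matter-100", "Indemnification clause text"),
    Chunk("c2", "firm-a", "matter-200", "Settlement terms text"),
    Chunk("c3", "firm-b", "matter-100", "Unrelated client memo text"),
]
user = User("u1", "firm-a", frozenset({"matter-100"}))
visible = filter_by_access(user, candidates)
```

In practice the same predicate is usually pushed into the vector store query as a metadata filter, so that restricted chunks never leave the store and the top-k results are not thinned out after the fact.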

### Audit Logging

Every document retrieval and every AI generation event is an auditable action in regulated workflows. Your audit log should record the query, the retrieved document identifiers, the model and version used, and the generated output. This log is your evidence that the system operated correctly if the output is ever challenged.
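As a sketch, one audit entry per retrieval-and-generation event might be serialized as a JSON line. The fields for query, retrieved document identifiers, model, version, and output come from the description above; the timestamp and user ID are added assumptions, and all values shown are made up.

```python
import json
from datetime import datetime, timezone

def audit_record(user_id, query, document_ids, model, model_version, output):
    """Build one JSON line describing a retrieval plus generation event."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "query": query,
        "retrieved_document_ids": document_ids,
        "model": model,
        "model_version": model_version,
        "output": output,
    }
    return json.dumps(entry, sort_keys=True)

line = audit_record(
    user_id="u1",
    query="What is the notice period?",
    document_ids=["msa-2024-017"],
    model="example-model",       # hypothetical model name
    model_version="2026-01",     # hypothetical version label
    output="Ninety days.",
)
```

Queries and outputs can themselves contain PHI or privileged content, so the log needs the same access controls and retention policy as the documents it describes.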

---

## Human-in-the-Loop Design Patterns

The framing that AI replaces human review is the wrong model for regulated industries. The right frame is that AI changes the nature of human review — reducing the time spent on mechanical scanning and increasing the time spent on judgment.

Effective human-in-the-loop patterns for document analysis:

**Triage and prioritization** — AI classifies documents by urgency, complexity, or risk level. Humans then review them in that priority order rather than working through the queue sequentially.

**Flagging, not concluding** — AI identifies sections or provisions that warrant attention. Humans evaluate the flagged items. The AI does not render a final judgment; it guides human attention.

**Confidence-gated automation** — High-confidence extractions (e.g., standard date fields from a consistent form) proceed automatically. Low-confidence extractions route to a human review queue. Thresholds are calibrated based on the cost of errors.
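One way to calibrate such a threshold is a break-even calculation. It assumes the confidence score is a well-calibrated probability of being correct, which should be verified on held-out data, and the cost figures below are hypothetical units.

```python
def break_even_confidence(review_cost: float, error_cost: float) -> float:
    """Minimum confidence at which automating is no costlier than review.

    Automate when (1 - p) * error_cost <= review_cost,
    which rearranges to p >= 1 - review_cost / error_cost.
    """
    if error_cost <= review_cost:
        return 0.0  # an error costs no more than a review, so always automate
    return 1.0 - review_cost / error_cost

def should_automate(confidence: float, review_cost: float, error_cost: float) -> bool:
    return confidence >= break_even_confidence(review_cost, error_cost)

# Hypothetical costs: a human review costs 2 units, an undetected error 50.
threshold = break_even_confidence(review_cost=2.0, error_cost=50.0)
```

With these numbers the break-even confidence is 0.96. Raising the error cost pushes the threshold toward 1.0, which matches the intuition that high-stakes fields should almost always be reviewed.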

**Active review interfaces** — Rather than presenting AI output as a finished product, present it as an annotated draft. Reviewers can accept, reject, or modify each AI-generated annotation. This surfaces model errors, creates training data for improvement, and ensures the human genuinely engages with the output.

The design of the review interface is as important as the design of the underlying AI pipeline. A system that makes it easy for reviewers to rubber-stamp AI output is not a safe human-in-the-loop system.

---

## Building Document Analysis Systems That Hold Up

Document analysis in regulated industries is an engineering problem more than it is an AI problem. The AI components — embedding models, language models, vector stores — are available and capable. The harder work is designing pipelines with appropriate accuracy controls, building attribution into the output from the start, enforcing document-level access controls at the retrieval layer, and designing review interfaces that make human oversight practical rather than performative.

If your team is evaluating or building a document analysis system for legal, healthcare, or financial workflows, [an architecture review](/contact) is a structured way to identify the decisions that will be expensive to change later. We also cover the overlap between document analysis and broader AI system design in our [healthcare AI consulting](/healthcare-ai-consulting) and [legal AI consulting](/legal-ai-consulting) practices.

---

## AI for Small Business: Practical Use Cases That Don't Require a Data Science Team
Source: https://tampadynamics.com/blog/ai-for-small-business

> Practical AI use cases for small businesses — from document processing to customer support automation. No machine learning expertise required.

Date: 2026-01-28

Most small business owners have absorbed two years of AI headlines and are left with a version of the same question: what, specifically, is this supposed to do for my business?

The honest answer is narrower than the headlines suggest. AI is not going to transform your operations overnight, and the use cases that actually work in practice are more specific and more modest than the ones described in product marketing. But there are genuine time savings and quality improvements available to businesses with fewer than 50 employees, with no data science background required, using tools that cost less than a full-time hire.

This post describes the use cases that are actually viable for small businesses right now, what each one requires to implement, and where the pitfalls are.

---

## The Gap Between AI Hype and Small Business Reality

Enterprise AI implementations make the news. A hospital system deploying AI-assisted radiology, a law firm using AI for contract review at scale, a logistics company optimizing routes across millions of data points — these are real use cases, but they involve specialized models, significant integration work, and teams with engineering capacity to build and maintain them.

Small businesses operate in a different context. You have limited engineering resources, limited budget for tooling, existing software that was not built to integrate with anything, and workflows that live in a combination of email threads, spreadsheets, and institutional knowledge.

The use cases that work for small businesses share a few characteristics. They address a task that is repetitive and time-consuming but does not require deep domain judgment. They produce outputs that a human reviews before they matter. And they use general-purpose AI capabilities — language understanding, document parsing, summarization — that do not require custom model training.

---

## Use Cases That Actually Work

### Document Processing and Data Extraction

If your business handles paper or PDF-based documents — invoices, contracts, intake forms, insurance documents, applications — there is almost certainly an AI tool that can reduce the manual data entry burden.

Document AI tools (Google Document AI, AWS Textract, and the document processing capabilities built into platforms like Zapier or Make) can extract structured data from unstructured documents with reasonable accuracy. An insurance agency that was manually keying data from carrier documents into a management system can often automate 70-80% of that extraction.

What you actually need to implement this:
- A consistent document format, or a set of templates that account for format variation
- A review step where a human spot-checks the extracted data — especially for high-stakes fields like dollar amounts, dates, and names
- A destination system that can receive the extracted data (most practice management, CRM, and accounting platforms have APIs or native integrations)

What you should not expect: perfect accuracy without review. Document extraction tools are accurate enough to eliminate most manual keying, but not accurate enough to operate without a human quality check on anything that matters.

### Email Draft Generation

Email is the highest-volume writing task for most small businesses, and it is also one of the clearest applications of current AI capabilities. Tools like Gmail's Help me write, Copilot in Outlook, or standalone tools built on GPT-4 or Claude can draft responses to customer inquiries, follow-up sequences, proposal emails, and client communications.

The workflow that works: provide the AI with context (the incoming email, the key points you need to address, your preferred tone), let it draft, review and edit, send. For experienced users, this shifts email from a writing task to an editing task, which is faster.

The failure mode: treating AI drafts as final output without review. AI email drafts are consistently fluent and often completely wrong about specific details — pricing, availability, commitments you may or may not have made. The review step is not optional.

A secondary use case: summarizing email threads. If you have been cc'd on a 30-message thread and need to understand where things stand, asking an AI to summarize the thread is faster than reading it top-to-bottom, and the summary is usually accurate enough to be useful.

### Customer FAQ Automation

If your business receives the same 10-20 questions repeatedly — business hours, pricing, process, requirements, turnaround time — an AI-powered FAQ tool can handle the first response layer for a significant portion of inbound inquiries.

The implementation options range from simple (a website chatbot trained on your FAQ content, using a tool like Intercom, Tidio, or Freshdesk) to more involved (a custom RAG system that queries your knowledge base and routes complex questions to humans).

For most small businesses, the simple implementation is the right starting point. The chatbot handles the routine questions that were previously answered by whoever checked email. Complex or novel questions escalate to a human. The chatbot is honest about its limitations and does not try to answer what it does not know.

The critical design principle: the chatbot should not be trying to do everything. Define the scope tightly. What are the 15 questions you receive most often? Train the system on those, and make the escalation path to a human clear and easy. A chatbot that tries to answer everything and answers many things wrong is worse than no chatbot.

### Meeting Notes and Summaries

If your business involves regular client meetings, team standups, or sales calls, transcription and summarization tools have become genuinely useful. Otter.ai, Fireflies.ai, and the built-in transcription capabilities in Zoom and Teams can produce accurate transcripts, summarize action items, and generate structured meeting notes automatically.

The time savings compound. A 60-minute client meeting that previously required 20-30 minutes of note-writing can produce a summarized document automatically, with action items identified and attributed to specific people.

The implementation is straightforward — most of these tools integrate directly with your video conferencing platform. The primary consideration is disclosure: many states and most professional contexts require that participants be informed when a meeting is being recorded and transcribed. Make this part of your meeting opening.

### Invoice Processing and Accounts Payable

For businesses that receive a significant number of vendor invoices — construction, retail, restaurants, professional services with multiple vendors — AI-assisted invoice processing can reduce the hours spent on manual entry.

Tools like Dext (formerly Receipt Bank), Hubdoc, or the built-in AI capabilities in QuickBooks and Xero can extract line items, amounts, dates, and vendor information from invoices with reasonable accuracy, and route them to the appropriate GL code based on learned patterns.

This is not fully automated accounts payable — someone still needs to approve invoices and catch anomalies. But shifting from manual entry to review-and-approve significantly reduces the time cost for businesses processing 20 or more invoices per month.

---

## Common Pitfalls

**Data quality problems surface immediately.** AI tools are sensitive to inconsistency in the data they process. If your customer data has inconsistent naming conventions, if your document formats vary widely, or if your knowledge base has contradictory information, AI tools will reflect those inconsistencies in their outputs. Before deploying any AI automation, clean the data it will work with.

**Integration complexity is usually underestimated.** The AI tool itself is often straightforward. Connecting it to your existing systems — your CRM, your accounting software, your industry-specific platform — is where projects stall. Many small business platforms expose limited APIs or require middleware. Budget time and potentially budget for an integration resource.

**Cost is not always as low as advertised.** Most AI tools have free or low-cost entry tiers, but production usage scales with volume. A chatbot handling 500 conversations per month costs more than a chatbot handling 50. Document processing tools often price per page. Model API costs for custom implementations can be significant at volume. Run the numbers at your actual expected volume, not the minimum-tier pricing.

**Accuracy thresholds vary by task.** For email drafting, 80% accuracy is fine — you are reviewing every email anyway. For invoice processing, an error rate that results in miscoded expenses compounds into significant accounting problems over time. Match your accuracy expectations to the stakes of the task, and design review workflows accordingly.

---

## How to Evaluate AI Vendors

The AI tool market is crowded and the marketing is often indistinguishable across vendors. Evaluating vendors effectively means looking past the demo.

**Ask for a pilot on your actual data.** Any vendor worth considering will let you test with a sample of the specific document types, emails, or use cases you intend to automate. If the tool cannot handle your real-world inputs, the demo is irrelevant.

**Understand the data handling model.** Where does your data go? Is it used to train the vendor's models? Who has access to it? For businesses handling client data, proprietary information, or anything sensitive, these questions matter before you integrate a tool.

**Evaluate the integration path before committing.** Do they have a native integration with your existing systems? If not, is there a documented API? What does the actual integration work look like, and who will do it?

**Check what happens when it is wrong.** Every AI tool will produce incorrect outputs sometimes. The question is whether the tool makes errors visible, whether there is a review step in the workflow, and what the process is for correcting errors and feeding that correction back into the system.

---

## Build vs. Buy vs. Off-the-Shelf

Small businesses have three options for AI implementation, and they are not equally appropriate for every use case.

**Off-the-shelf tools** (Otter.ai, Intercom, Dext, Gmail's AI features) are the right starting point for use cases where general-purpose tools cover your needs. Low cost, low implementation effort, limited customization.

**Configured platforms** (Zapier AI, Make.com, HubSpot AI features) sit in the middle. They require more setup and some technical knowledge, but they allow you to connect multiple systems and customize workflows in ways that off-the-shelf tools do not support.

**Custom builds** (bespoke AI integrations, custom RAG systems, purpose-built document processing pipelines) make sense when your use case is specific enough that no off-the-shelf tool covers it, or when the volume or accuracy requirements exceed what general tools can deliver. Custom builds require technical resources and ongoing maintenance — they are not a small business starting point, but they are sometimes the right answer for a specific high-value workflow.

The decision framework: start with off-the-shelf and validate that the use case actually saves time and produces acceptable quality. Move to custom builds only when you have evidence that the use case has meaningful value and that general tools cannot deliver the quality or integration you need.

---

## When to Get Outside Help

For businesses that have identified a high-value use case but hit a wall on implementation, deciding when to bring in outside help is a practical question. A few indicators:

- The use case requires integrating with a system that has a non-obvious API or no native integration with AI tools
- Data quality issues are significant enough that they need remediation before automation can work
- The workflow involves sensitive data (client information, financial records, medical information) where vendor selection and security configuration need careful attention
- You have tried an off-the-shelf tool and it does not produce sufficient accuracy for your specific document types or use case

Bringing in a development partner for AI implementation does not mean a large engagement. A focused scoping conversation — describing what you are trying to automate, what your existing systems are, and what your data looks like — is enough to determine whether there is a tractable path and what it realistically costs.

---

## Frequently Asked Questions

### Do I need technical expertise on my team to use AI tools?

For off-the-shelf tools — no. Email AI, meeting transcription, basic chatbots, and accounting AI features are designed for non-technical users. For configured platform integrations, some technical comfort helps. For custom builds, you need development resources either internally or through a partner.

### Will AI replace my employees?

Not the ones doing complex, judgment-intensive work. AI is most useful for automating the repetitive, time-consuming portions of knowledge work — the email drafting, the data entry, the note-taking — which frees people to do the parts of their job that require judgment and relationships. The businesses that use AI well tend to redeploy time, not reduce headcount.

### How do I know if a use case will actually save time?

Track the time cost of the task you are trying to automate before you implement anything. If someone is spending 8 hours per week on manual invoice entry, an automation that reduces that to 2 hours of review is worth real money. If a task consumes 30 minutes per week, automation may not be worth the implementation effort.
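The arithmetic can be sketched in a few lines. The 8-hour and 2-hour figures come from the example above; the hourly labor cost and annual tool cost are hypothetical placeholders to replace with your own numbers.

```python
def annual_savings(hours_before, hours_after, hourly_cost, annual_tool_cost, weeks=52):
    """Net yearly value of an automation, in the same currency as hourly_cost."""
    hours_saved = (hours_before - hours_after) * weeks
    return hours_saved * hourly_cost - annual_tool_cost

# 8 hours of manual entry reduced to 2 hours of review each week.
# Assumed: labor at 30 per hour and a tool that costs 1,200 per year.
net = annual_savings(8, 2, hourly_cost=30, annual_tool_cost=1200)
```

Running the same function on a 30-minute weekly task shows why small tasks rarely justify the implementation effort: the labor saved may not cover the tool cost.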

---

If you are evaluating AI for a specific workflow and are not sure whether the use case is tractable or which approach makes sense, [an architecture conversation](/contact) with our team can give you a practical answer quickly — no sales cycle, no vague roadmap.

---

## AI Procurement Checklist for Healthcare CIOs
Source: https://tampadynamics.com/blog/ai-procurement-checklist-healthcare

> A practical checklist for evaluating AI vendors and AI projects in healthcare — the questions to ask before money moves and the red flags to watch for in vendor responses.

Date: 2026-05-04

AI procurement in healthcare is harder than software procurement was a decade ago. The data path is more complex, the regulatory exposure is broader, and the vendor ecosystem includes everything from established health IT firms to two-person startups with a Cohere wrapper.

This is the checklist we wish more CIOs had when we walk into a procurement conversation. It is organized by the questions auditors and Security Officers actually ask, not by the marketing categories vendors use.

## Data path

1. **Where does PHI go?** Trace every system the data touches: ingestion, embedding, vector storage, model invocation, post-processing, logging, analytics. Each is on the BAA or it is not.
2. **What model is the vendor using, and does it have a BAA?** "We use AI" is not an answer. The specific model and the specific provider matter. AWS Bedrock under the AWS BAA, Azure OpenAI under Microsoft's BAA, and Google Cloud's Vertex under Google's BAA are different vendors with different documentation. Get specifics.
3. **Where does the model run?** A model "deployed in your VPC" is different from a model "accessed via the vendor's API endpoint." Both can be appropriate. Mixing them up is not.
4. **Does the vendor train on customer data?** Default policy at major LLM providers is that enterprise tier customers' data is not used for training, but verify in the contract, not the marketing page.

## Tenant isolation

5. **How is your data separated from other customers' data?** Index per customer, namespace per customer, or shared resources with metadata filters. Each has different security implications. Get specifics.
6. **Can you produce evidence of isolation?** Vendor diagrams are not evidence. Test data, configuration, and logs are.
7. **What happens if another customer is breached?** A vendor whose architecture has a single shared resource is one breach away from being yours. A vendor with hard isolation is not.
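
One way to turn item 6 into evidence is a canary test: seed a uniquely marked synthetic record for one tenant, query as a different tenant, and confirm the marker never comes back. The sketch below is a minimal illustration against an in-memory stand-in; `InMemoryStore`, `canaryLeaks`, and the marker format are hypothetical names, and a real test would run against the vendor's actual retrieval endpoint.

```typescript
// Hypothetical shapes for illustration; not any vendor's actual API.
interface StoredChunk {
  tenantId: string;
  text: string;
}

// Minimal in-memory stand-in for a vendor retrieval endpoint.
class InMemoryStore {
  private chunks: StoredChunk[] = [];

  add(chunk: StoredChunk): void {
    this.chunks.push(chunk);
  }

  // The tenant filter is applied to every search, before any text matching.
  search(tenantId: string, term: string): StoredChunk[] {
    return this.chunks.filter(
      (c) => c.tenantId === tenantId && c.text.indexOf(term) !== -1
    );
  }
}

// Seeds a canary for `owner` and reports whether `other` can retrieve it.
function canaryLeaks(store: InMemoryStore, owner: string, other: string): boolean {
  const marker = `CANARY-${owner}-7f3a`;
  store.add({ tenantId: owner, text: `synthetic record ${marker}` });
  return store.search(other, marker).length > 0;
}

const store = new InMemoryStore();
const leaked = canaryLeaks(store, "tenant-a", "tenant-b");
console.log(leaked ? "isolation FAILED" : "no cross-tenant leakage observed");
```

A passing canary is evidence, not proof. It is most useful when the vendor lets you run it yourself and shows you the logs for the query.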

## Audit logging

8. **What does the vendor log?** Every model invocation, with prompts, retrieved context, and outputs, is the right answer. "Aggregate metrics" is not.
9. **Can you access the audit log?** Some vendors keep the AI audit log internal. For HIPAA work, you need to be able to query "every interaction touching this patient" without a support ticket.
10. **How long is the audit log retained?** Six years is the HIPAA minimum. Some vendors retain shorter; some retain forever (which has its own discovery exposure).
11. **What is in the audit log for tool calls?** If the AI calls back into your systems, every tool invocation needs a log entry with parameters and results.

## Right-to-deletion

12. **If a patient asks for deletion, what is the path?** PHI in source systems, PHI in the vector index, PHI in audit logs. Each needs a path. "We can delete records on request" without specifics is not enough.
13. **If you terminate the contract, what happens to your data?** Data return, data destruction, attestation. Spell out the timeline.

## Access control

14. **How are users authenticated to the AI system?** Federation with your IdP (Entra, Okta, Cognito) is the default expectation. Vendor-managed user accounts are a flag.
15. **How does role-based access work?** A clinician seeing patient AI summaries should see only their patients. A pharmacist seeing prior-auth recommendations should see only the cases assigned to them. Test the access model with realistic scenarios.
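
The scenarios in item 15 can be written down as an executable check. This is a sketch under assumed shapes: `AppUser`, `AiOutput`, and the two roles are hypothetical, and a real system would enforce the rule in the data layer, not in a filter function.

```typescript
// Hypothetical roles and record shapes, for illustration only.
type Role = "clinician" | "pharmacist";
type OutputKind = "patient-summary" | "prior-auth";

interface AppUser {
  id: string;
  role: Role;
  assignedSubjectIds: string[]; // patients or cases assigned to this user
}

interface AiOutput {
  id: string;
  kind: OutputKind;
  subjectId: string;
}

// Each role sees one kind of output, and only for subjects assigned to the user.
function visibleOutputs(user: AppUser, outputs: AiOutput[]): AiOutput[] {
  const allowedKind: OutputKind =
    user.role === "clinician" ? "patient-summary" : "prior-auth";
  return outputs.filter(
    (o) =>
      o.kind === allowedKind &&
      user.assignedSubjectIds.indexOf(o.subjectId) !== -1
  );
}

const outputs: AiOutput[] = [
  { id: "o1", kind: "patient-summary", subjectId: "patient-1" },
  { id: "o2", kind: "patient-summary", subjectId: "patient-2" },
  { id: "o3", kind: "prior-auth", subjectId: "case-9" },
];
const clinician: AppUser = { id: "u1", role: "clinician", assignedSubjectIds: ["patient-1"] };
const pharmacist: AppUser = { id: "u2", role: "pharmacist", assignedSubjectIds: ["case-9"] };

const clinicianIds = visibleOutputs(clinician, outputs).map((o) => o.id);
const pharmacistIds = visibleOutputs(pharmacist, outputs).map((o) => o.id);
console.log(clinicianIds, pharmacistIds);
```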

## Outputs

16. **Is every output cited?** Free-form answers without source citations are not auditable. Compliance teams should reject them.
17. **What happens when retrieval comes back empty?** A safe system tells the user it does not know. A dangerous system makes something up.
18. **What is the human-in-the-loop boundary?** For any decision with clinical or regulatory consequence, there should be a human review step. Confirm where it sits and how it is logged.
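
Items 16 and 17 describe behavior that is straightforward to demonstrate. The sketch below shows one possible shape: refuse when retrieval is empty, and attach a citation for every chunk that was used. `answerWithCitations`, `SourceChunk`, and the `generate` callback are hypothetical names standing in for the vendor's own pipeline.

```typescript
// Hypothetical shapes for illustration.
interface SourceChunk {
  documentId: string;
  chunkId: string;
  text: string;
}

interface CitedAnswer {
  text: string;
  citations: string[]; // "documentId#chunkId" for each chunk used
  refused: boolean;
}

// Refuses instead of generating when there is nothing to ground the answer in.
function answerWithCitations(
  question: string,
  chunks: SourceChunk[],
  generate: (question: string, chunks: SourceChunk[]) => string
): CitedAnswer {
  if (chunks.length === 0) {
    return {
      text: "No supporting source was found for this question.",
      citations: [],
      refused: true,
    };
  }
  return {
    text: generate(question, chunks),
    citations: chunks.map((c) => `${c.documentId}#${c.chunkId}`),
    refused: false,
  };
}

// Stub generator so the example runs without a model.
const stubGenerate = (_question: string, chunks: SourceChunk[]) =>
  `Based on ${chunks.length} source(s).`;

const emptyResult = answerWithCitations("What is the renewal date?", [], stubGenerate);
const groundedResult = answerWithCitations(
  "What is the renewal date?",
  [{ documentId: "contract-12", chunkId: "c-4", text: "Renews on 1 July." }],
  stubGenerate
);
console.log(emptyResult.refused, groundedResult.citations);
```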

## Operations

19. **Who can change the prompts and retrieval logic?** Vendor-side changes that alter system behavior should produce audit entries. Silent prompt updates that change AI behavior are a regulatory liability.
20. **What is the model upgrade path?** When the underlying model changes — and it will — does behavior change? Is there an evaluation harness? Will you be notified before deployment?
21. **What is the SLA for incident response?** AI systems break in ways traditional software does not. Hallucinations, retrieval failures, prompt injection. The vendor's response capability matters.
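
The evaluation harness mentioned in item 20 does not have to be elaborate to be useful. A minimal sketch, assuming a fixed golden set and a simple substring check (both are simplifications; `GoldenCase`, `passRate`, and `safeToUpgrade` are hypothetical names):

```typescript
// Hypothetical golden-set shape: an input and a phrase the output must contain.
interface GoldenCase {
  input: string;
  mustInclude: string;
}

type Model = (input: string) => string;

// Fraction of golden cases whose output contains the required phrase.
function passRate(model: Model, cases: GoldenCase[]): number {
  if (cases.length === 0) return 0;
  const passed = cases.filter(
    (c) => model(c.input).indexOf(c.mustInclude) !== -1
  ).length;
  return passed / cases.length;
}

// The candidate may not score lower than the current model, within a tolerance.
function safeToUpgrade(
  current: Model,
  candidate: Model,
  cases: GoldenCase[],
  tolerance = 0
): boolean {
  return passRate(candidate, cases) >= passRate(current, cases) - tolerance;
}

const goldenSet: GoldenCase[] = [
  { input: "case-1", mustInclude: "denied" },
  { input: "case-2", mustInclude: "approved" },
];

// Stub models standing in for the current and the upgraded model version.
const currentModel: Model = (input) =>
  input === "case-1" ? "Claim denied: missing form" : "Claim approved";
const candidateModel: Model = () => "Claim approved";

console.log(safeToUpgrade(currentModel, candidateModel, goldenSet));
```

The question for the vendor is whether something like this runs before every model change, and whether you see the results.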

## Contract terms

22. **BAA signed, with all sub-processors named.** The LLM provider, the vector database, the embedding endpoint, the logging vendor — each is a sub-processor.
23. **Indemnification for breach caused by vendor's AI.** Carve-outs for "AI-generated content" that shift liability back to you should be flagged.
24. **Audit rights.** The contract should let you audit the vendor's controls on reasonable notice, without having to negotiate permission each time.

## Red flags

A few responses that should slow procurement down:

- "We use the latest models" without specifying which.
- "Your data is secure" without explaining how.
- "We have HIPAA compliance" without naming services and BAAs.
- Vague answers to "what is in the audit log."
- An inability to demonstrate tenant isolation.
- Refusal to answer detailed technical questions in front of a Security Officer.

These are not always disqualifying. They often indicate a vendor whose security posture is less mature than they realize. A good procurement process surfaces them before contract.

## What this checklist does not cover

Outcomes. The checklist above ensures the vendor can be deployed responsibly. Whether the vendor solves the problem you bought them to solve is a separate question. We have seen vendors clear every item on this list and still produce a system that does not work for the intended workflow. The checklist is necessary, not sufficient.

The right pairing: this checklist for the security and compliance review, plus a pilot deployment with measurable success criteria for the operational evaluation. Both questions need an answer before procurement closes.

## Where we fit

If you are working through this checklist for a vendor evaluation, or if you are deciding between buying and building, an architecture review is the lowest-risk way to get a second opinion. We do this for healthcare CIOs regularly. [Get in touch](/contact) if you have a procurement decision in front of you.

---

## Five Questions Every AI System Should Be Able to Answer
Source: https://tampadynamics.com/blog/audit-ready-ai-five-questions

> If your AI system cannot answer these five questions in seconds, it is not audit-ready — and that gap will surface at the worst possible moment.

Date: 2026-03-21

Every AI system in production will eventually produce an output that someone wants to investigate. A patient will ask why a recommendation was made. A regulator will ask what data was used. An auditor will ask whether the system has been operated within policy. A breach investigator will ask what data the model saw during a specific window.

When that happens, you will need to answer five questions. The teams whose AI systems can answer them in seconds keep operating. The teams whose systems cannot, do not.

## 1. Who used the AI, when, and what did they do?

The most basic audit question. "Show me every AI interaction by user X between dates A and B."

The answer requires the audit log to capture the requesting user identity — not the AI service account, but the human user the request originated from — alongside every model invocation. The log has to be queryable by user, by date, by tenant.

Most AI systems we see in early stages capture the AI's outputs but not the calling user. The user identity has to be threaded through the application layer to the AI invocation layer and persisted with each turn. Adding it later is a rebuild.
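
A minimal sketch of what threading the user identity looks like, assuming a hypothetical `RequestContext` that the application layer passes down to the model call. The in-memory array stands in for a real audit store, and all names here are illustrative.

```typescript
// Hypothetical shapes; field names are illustrative, not a standard schema.
interface RequestContext {
  userId: string; // the human user, not the AI service account
  tenantId: string;
}

interface AuditEntry {
  userId: string;
  tenantId: string;
  timestamp: string; // ISO 8601 in UTC, so string order matches time order
  prompt: string;
  output: string;
}

const auditEntries: AuditEntry[] = [];

// Wraps a model call so every successful call through it leaves an audit entry.
function invokeWithAudit(
  ctx: RequestContext,
  prompt: string,
  model: (prompt: string) => string,
  now: Date
): string {
  const output = model(prompt);
  auditEntries.push({
    userId: ctx.userId,
    tenantId: ctx.tenantId,
    timestamp: now.toISOString(),
    prompt,
    output,
  });
  return output;
}

// "Show me every AI interaction by user X between dates A and B."
function interactionsByUser(userId: string, from: Date, to: Date): AuditEntry[] {
  const start = from.toISOString();
  const end = to.toISOString();
  return auditEntries.filter(
    (e) => e.userId === userId && e.timestamp >= start && e.timestamp <= end
  );
}

// Stub model so the example runs without an LLM.
const echoModel = (prompt: string) => `draft for: ${prompt}`;
invokeWithAudit({ userId: "u-1", tenantId: "t-1" }, "summarize visit", echoModel, new Date("2026-03-01T10:00:00Z"));
invokeWithAudit({ userId: "u-2", tenantId: "t-1" }, "draft letter", echoModel, new Date("2026-03-02T10:00:00Z"));

const marchForUser1 = interactionsByUser(
  "u-1",
  new Date("2026-03-01T00:00:00Z"),
  new Date("2026-03-31T23:59:59Z")
);
console.log(marchForUser1.length);
```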

## 2. What did the AI see?

For any given output, what was in the context window when the model produced it?

This is the question that exposes RAG systems built without retrieval logging. The model produced output; the output references a fact; the fact came from somewhere. That somewhere has to be in the audit log, with chunk-level detail.

A complete answer captures: the system prompt, the conversation history, the retrieved chunks (by document and chunk ID), the user message, and the full input as sent to the model. Not a summary. The full input.
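
As an illustration, the record for one turn might look like the sketch below. The `ContextSnapshot` shape and the way the full input is assembled are assumptions; what matters is that the stored `fullInput` is the exact string sent to the model and that each retrieved chunk is identifiable by document and chunk ID.

```typescript
// Hypothetical shapes for illustration.
interface RetrievedChunk {
  documentId: string;
  chunkId: string;
  text: string;
}

interface ContextSnapshot {
  systemPrompt: string;
  history: string[];
  retrieved: RetrievedChunk[];
  userMessage: string;
  fullInput: string; // exactly what was sent to the model, not a summary
}

// Builds the model input and the audit snapshot from the same parts,
// so the logged input cannot drift from what the model actually saw.
function buildSnapshot(
  systemPrompt: string,
  history: string[],
  retrieved: RetrievedChunk[],
  userMessage: string
): ContextSnapshot {
  const contextLines = retrieved.map(
    (c) => `[${c.documentId}#${c.chunkId}] ${c.text}`
  );
  const fullInput = [systemPrompt, ...history, ...contextLines, userMessage].join("\n");
  return { systemPrompt, history, retrieved, userMessage, fullInput };
}

const snapshot = buildSnapshot(
  "You are a careful assistant.",
  ["User: hello"],
  [{ documentId: "doc-7", chunkId: "c-2", text: "Policy renews annually." }],
  "When does the policy renew?"
);
console.log(snapshot.fullInput);
```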

## 3. What did the AI produce, and what happened next?

The model output, plus the action that was taken. For a recommendation that was approved by a human, the audit log shows the AI's draft, the human's approval, any edits made, and the final action. For a recommendation that was rejected, the same chain.

If the AI's output triggered a downstream system call — wrote to the case record, sent an email, updated a billing line — that call is in the audit log too. The chain has to be reconstructable end-to-end.

## 4. For this specific subject (patient, customer, case), every interaction.

Tenant-scoped queries. "Show me every AI interaction that touched this patient's record, ever."

This requires the tenant identifier to be in every audit row, with indexes that support tenant-scoped queries. What the identifier is depends on the domain: the patient identifier in healthcare, the matter identifier in legal, and the account or customer identifier in financial services.

The query has to return everything: model invocations, retrieval calls, tool calls, approval decisions. If a piece of the data path bypassed the audit log, that is the gap an investigation will find.
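
A sketch of the idea, using an in-memory index keyed by subject. The `SubjectIndexedLog` class and `AuditRow` shape are hypothetical; in a real store the equivalent is an index on the subject identifier column.

```typescript
// Hypothetical event kinds and row shape, for illustration.
type EventKind = "model_invocation" | "retrieval" | "tool_call" | "approval";

interface AuditRow {
  subjectId: string; // patient, matter, or account identifier
  kind: EventKind;
  at: string; // ISO 8601 UTC timestamp
  detail: string;
}

// Keeps rows grouped by subject so a per-subject query does not scan everything.
class SubjectIndexedLog {
  private bySubject: Record<string, AuditRow[]> = {};

  append(row: AuditRow): void {
    const rows = this.bySubject[row.subjectId] || [];
    rows.push(row);
    this.bySubject[row.subjectId] = rows;
  }

  // Every event that touched this subject, oldest first.
  forSubject(subjectId: string): AuditRow[] {
    const rows = this.bySubject[subjectId] || [];
    return rows.slice().sort((a, b) => (a.at < b.at ? -1 : a.at > b.at ? 1 : 0));
  }
}

const subjectLog = new SubjectIndexedLog();
subjectLog.append({ subjectId: "patient-1", kind: "model_invocation", at: "2026-03-02T09:00:00.000Z", detail: "summary draft" });
subjectLog.append({ subjectId: "patient-2", kind: "retrieval", at: "2026-03-02T09:01:00.000Z", detail: "chart lookup" });
subjectLog.append({ subjectId: "patient-1", kind: "retrieval", at: "2026-03-02T08:59:00.000Z", detail: "chart lookup" });
subjectLog.append({ subjectId: "patient-1", kind: "approval", at: "2026-03-02T09:05:00.000Z", detail: "approved with edits" });

const patientTimeline = subjectLog.forSubject("patient-1").map((r) => r.kind);
console.log(patientTimeline);
```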

## 5. Was anything outside policy?

The hardest question. "Were there any AI interactions during this window that should not have happened?"

Answering it requires policy to be expressed in something more than prose. The audit log captures the inputs, outputs, and actions; an automated evaluation runs against the log to flag interactions that violated guardrails (PHI sent to an unapproved endpoint, retrieval crossing tenant boundaries, outputs without citations, model versions outside the approved set, tool calls outside the allowed scope).

Most organizations treat policy review as manual log inspection. That works at small scale and breaks at production scale. The systems that scale have policy expressed as code that runs against the audit log, with alerts and dashboards on policy violations.
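
A minimal sketch of policy as code, assuming a simplified `Interaction` record and four of the guardrails listed above. The rule names and field names are hypothetical; the point is that each rule is a small function that can run over every row in the log.

```typescript
// Hypothetical, simplified audit row and policy shapes.
interface Interaction {
  id: string;
  endpoint: string;
  modelVersion: string;
  requestTenant: string;
  retrievedTenants: string[]; // tenant of each retrieved chunk
  citations: string[];
}

interface Policy {
  approvedEndpoints: string[];
  approvedModelVersions: string[];
}

interface Violation {
  interactionId: string;
  rule: string;
}

interface Rule {
  name: string;
  violatedBy: (i: Interaction, p: Policy) => boolean;
}

const rules: Rule[] = [
  { name: "unapproved-endpoint", violatedBy: (i, p) => p.approvedEndpoints.indexOf(i.endpoint) === -1 },
  { name: "unapproved-model-version", violatedBy: (i, p) => p.approvedModelVersions.indexOf(i.modelVersion) === -1 },
  { name: "cross-tenant-retrieval", violatedBy: (i) => i.retrievedTenants.some((t) => t !== i.requestTenant) },
  { name: "missing-citations", violatedBy: (i) => i.citations.length === 0 },
];

// Runs every rule against every interaction and collects the violations.
function evaluateLog(log: Interaction[], policy: Policy): Violation[] {
  const violations: Violation[] = [];
  for (const interaction of log) {
    for (const rule of rules) {
      if (rule.violatedBy(interaction, policy)) {
        violations.push({ interactionId: interaction.id, rule: rule.name });
      }
    }
  }
  return violations;
}

const policy: Policy = {
  approvedEndpoints: ["bedrock-us-east-1"],
  approvedModelVersions: ["model-v3"],
};
const sampleLog: Interaction[] = [
  { id: "i-1", endpoint: "bedrock-us-east-1", modelVersion: "model-v3", requestTenant: "t-1", retrievedTenants: ["t-1"], citations: ["doc-1#c-1"] },
  { id: "i-2", endpoint: "bedrock-us-east-1", modelVersion: "model-v3", requestTenant: "t-1", retrievedTenants: ["t-1", "t-2"], citations: [] },
];

const violations = evaluateLog(sampleLog, policy);
console.log(violations);
```

The same evaluation can feed alerts and dashboards, which is what makes it workable at production volume.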

## What "audit-ready" actually requires

To answer all five questions, a production AI system needs:

- **Audit log capture at the model invocation layer.** Every model call, with full input and output, in a structured format.
- **User identity threaded through every layer.** The application user is captured at the model layer, not invented or replaced with a service account.
- **Tenant identifier on every row.** With appropriate indexes for tenant-scoped queries.
- **Tool calls captured.** Every tool invocation with parameters and results.
- **Approval chains captured.** For human-in-the-loop decisions, the human's identity, decision, and any edits.
- **Retention sized to obligations.** Six years for HIPAA, longer for some workloads, with object lock or equivalent immutability.
- **Policy expressed as code.** Automated evaluation of the log against the policy set.

This is not light infrastructure. For an AI system handling regulated workloads, the audit infrastructure is often comparable in size and effort to the AI itself. The teams that ship the AI without this infrastructure ship faster initially and slower in the long run, because the audit work has to happen eventually.

## When the question gets asked

The question that prompts this post: "Can your AI system answer these five questions today?"

If the answer is yes, the system is audit-ready. The compliance review will be a confirmation, not a discovery process. The breach investigation, if one happens, will be hours of work, not weeks.

If the answer is no — not all five, or not in seconds, or only with engineering effort — the gap is a project. The cheapest time to close it is before the audit. The most expensive time is during.

We have walked into both situations. The retrofitted audit log is always more painful than the day-one one. We say this not as theory but as engineers who have done both.

## Where we fit

If you are operating an AI system in a regulated environment and any of the five questions feel uncertain, that is the project. We do audit-readiness reviews of existing AI systems regularly. They produce a written gap analysis, a prioritized remediation plan, and an estimate. [Reach out](/contact) if your AI is in production and the audit conversation is starting to come up.

---

## AWS Amplify Gen 2 in Production: Architecture Decisions That Matter
Source: https://tampadynamics.com/blog/aws-amplify-gen2-architecture

> A practical guide to AWS Amplify Gen 2 for production applications — authentication, data modeling, custom resolvers, and the limitations to know before you build.

Date: 2026-01-07

AWS Amplify Gen 2 is a meaningful rearchitecting of the platform. The TypeScript-first, code-first model eliminates most of the friction that made Gen 1 difficult to work with in complex production environments — the YAML configuration fragmentation, the opaque CLI-generated resources, the difficulty of customizing beyond what the CLI anticipated.

But Amplify Gen 2 is still Amplify, which means it makes strong assumptions about your architecture, and those assumptions are not always the right ones for every application. Before you commit to Amplify for a production deployment, you need to understand both what it does well and where it runs out of road.

This is a practical guide written for teams evaluating Amplify Gen 2 for a Next.js application that needs to survive production.

---

## What Changed from Gen 1 to Gen 2

The most significant change is the move from a CLI-driven, YAML-based configuration model to a TypeScript-first, code-first model. In Gen 1, you ran `amplify add auth` and got a generated configuration file that was difficult to read, harder to modify, and nearly impossible to version meaningfully. Drift between environments was common and painful.

In Gen 2, your entire backend is defined in TypeScript files in your repository:

```typescript
// amplify/auth/resource.ts
import { defineAuth } from "@aws-amplify/backend";

export const auth = defineAuth({
  loginWith: {
    email: true,
  },
  multiFactor: {
    mode: "OPTIONAL",
    totp: true,
  },
});
```

This is a genuine improvement. Your backend infrastructure is now a first-class part of your codebase — version controlled, code-reviewed, and deployable through the same pipeline as your application code. The configuration is readable by engineers who were not present when it was created.

Gen 2 also ships with a unified data modeling layer built on AppSync and DynamoDB, a cleaner authentication integration with Cognito, and sandbox environments for local development that provision real AWS resources in an isolated account context.

What did not fundamentally change: Amplify is still an opinionated platform built on AppSync + DynamoDB for data, Cognito for auth, and CloudFront + S3 (or Lambda) for hosting. If your requirements live comfortably in that stack, Gen 2 is a significant improvement. If your requirements push outside it, you will hit the same walls.

---

## Authentication Setup

Amplify Gen 2 authentication is built on Cognito, and the Gen 2 configuration model gives you reasonable control over the Cognito setup through the TypeScript definition layer.

### Standard Configuration

The baseline auth setup handles email/password authentication, MFA, and the standard social providers (Google, Facebook, Apple):

```typescript
import { defineAuth, secret } from "@aws-amplify/backend";

export const auth = defineAuth({
  loginWith: {
    email: {
      verificationEmailStyle: "CODE",
      verificationEmailSubject: "Verify your email",
    },
    externalProviders: {
      google: {
        clientId: secret("GOOGLE_CLIENT_ID"),
        clientSecret: secret("GOOGLE_CLIENT_SECRET"),
      },
      callbackUrls: ["https://yourapp.com/auth/callback"],
      logoutUrls: ["https://yourapp.com"],
    },
  },
  multiFactor: {
    mode: "REQUIRED",
    totp: true,
    sms: true,
  },
  userAttributes: {
    givenName: { required: true, mutable: true },
    familyName: { required: true, mutable: true },
  },
});
```

### Custom Auth Flows

Cognito supports custom authentication challenges — magic links, device-based authentication, custom verification logic — through Lambda triggers. Gen 2 exposes these through a `triggers` configuration:

```typescript
import { defineAuth, defineFunction } from "@aws-amplify/backend";

export const auth = defineAuth({
  loginWith: { email: true },
  triggers: {
    createAuthChallenge: defineFunction({
      entry: "./create-auth-challenge.ts",
    }),
    defineAuthChallenge: defineFunction({
      entry: "./define-auth-challenge.ts",
    }),
    verifyAuthChallengeResponse: defineFunction({
      entry: "./verify-auth-challenge.ts",
    }),
  },
});
```

This is how you implement magic link authentication or hardware key flows. The trigger functions are standard Lambda functions with Cognito event shapes. The integration is cleaner in Gen 2 than Gen 1, but the Cognito trigger model itself is unchanged — the same constraints around session management and challenge sequencing apply.

### Production Auth Considerations

A few things that matter in production and are easy to miss:

**Token expiry configuration** — Amplify defaults are not always appropriate for your application. The ID token, access token, and refresh token have separate expiry windows. Define these explicitly based on your security requirements. For applications handling sensitive data, shorter-lived tokens with well-implemented refresh flows are preferable to long-lived sessions.

**Advanced Security Features** — Cognito's advanced security features (compromised credential detection, adaptive authentication) are not enabled by default and are not configurable through the standard Amplify Gen 2 auth definition at the time of writing. Enabling them requires a custom CDK construct via `defineBackend`. This is not a blocking issue, but it is something to account for if you are building a security-sensitive application.

**User pool limits** — Cognito scales well but has default limits on API calls per second that can affect authentication under high concurrency. If you are building for significant scale, review the Cognito service quotas early.

---

## Data Modeling with DynamoDB

Amplify Gen 2 data modeling is built on AppSync with DynamoDB as the underlying store. The TypeScript model definition is the most significant improvement over Gen 1's approach of annotating a GraphQL SDL schema with directives:

```typescript
// amplify/data/resource.ts
import { defineData, a } from "@aws-amplify/backend";

const schema = a.schema({
  Organization: a
    .model({
      name: a.string().required(),
      plan: a.enum(["starter", "professional", "enterprise"]),
      members: a.hasMany("User", "organizationId"),
      createdAt: a.datetime(),
    })
    .authorization((allow) => [
      allow.owner(),
      allow.group("admin"),
    ]),

  User: a
    .model({
      email: a.string().required(),
      organizationId: a.id().required(),
      organization: a.belongsTo("Organization", "organizationId"),
      role: a.enum(["member", "admin", "viewer"]),
    })
    .authorization((allow) => [
      allow.owner(),
      allow.authenticated().to(["read"]),
    ]),
});

export const data = defineData({
  schema,
  authorizationModes: {
    defaultAuthorizationMode: "userPool",
  },
});
```

The authorization model is one of the stronger aspects of Amplify Gen 2. Field-level and model-level authorization rules compile down to DynamoDB condition expressions and VTL resolvers, which means the enforcement is at the data layer — not just in your application logic.

### What DynamoDB Access Patterns Require

Amplify abstracts DynamoDB but does not eliminate its fundamental access pattern constraints. DynamoDB performs well for queries by partition key, poorly for ad-hoc queries across arbitrary dimensions.

The Amplify data layer adds a set of secondary indexes automatically, but complex relational queries require either:

1. Designing your schema to match the access patterns you need (the correct DynamoDB approach)
2. Layering a custom query with a Lambda resolver for complex filtering
3. Accepting that some queries will result in a full table scan through AppSync filters (which works at small scale and fails badly at large scale)

If your application requires complex relational queries (reporting, multi-dimensional filtering, analytics), DynamoDB through Amplify is the wrong storage layer for those use cases. A common production pattern is DynamoDB for transactional data plus Aurora Serverless or RDS (typically reached through RDS Proxy) for reporting workloads, connected via custom resolvers.
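
To make option 1 concrete, here is a sketch of designing keys around a known access pattern ("orders for a customer since a date"). The `CUSTOMER#` and `ORDER#` key layout is a common single-table convention, not something Amplify generates, and the array filter only models the semantics of a key-based query; it is not a DynamoDB client call.

```typescript
// Hypothetical item shape with explicit partition (pk) and sort (sk) keys.
interface OrderItem {
  pk: string;
  sk: string;
  total: number;
}

// All orders for one customer share a partition and sort by date within it.
function orderKeys(customerId: string, isoDate: string, orderId: string) {
  return {
    pk: `CUSTOMER#${customerId}`,
    sk: `ORDER#${isoDate}#${orderId}`,
  };
}

// Models a key-based query: exact match on pk, range condition on sk.
// In DynamoDB this reads only the matching partition instead of scanning the table.
function queryOrdersSince(table: OrderItem[], customerId: string, isoDate: string): OrderItem[] {
  const pk = `CUSTOMER#${customerId}`;
  const skStart = `ORDER#${isoDate}`;
  return table.filter((item) => item.pk === pk && item.sk >= skStart);
}

const k1 = orderKeys("c1", "2026-01-05", "o1");
const k2 = orderKeys("c1", "2026-02-10", "o2");
const k3 = orderKeys("c2", "2026-02-11", "o3");
const table: OrderItem[] = [
  { pk: k1.pk, sk: k1.sk, total: 40 },
  { pk: k2.pk, sk: k2.sk, total: 75 },
  { pk: k3.pk, sk: k3.sk, total: 20 },
];

const recentOrders = queryOrdersSince(table, "c1", "2026-02-01");
console.log(recentOrders.map((o) => o.sk));
```

A question like "all orders over a given total, across all customers" has no key to lean on in this layout. That is the kind of query that ends up as a scan or belongs in a relational store.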

---

## Custom Business Logic: Lambda Resolvers and Custom Queries

Amplify Gen 2 supports two paths for custom business logic: Lambda functions as custom resolvers in the AppSync layer, and HTTP endpoints via the Amplify function definition.

### Custom Resolvers

Custom resolvers replace the auto-generated AppSync resolvers for specific operations. Use them when the default CRUD behavior is insufficient — when you need to enforce business rules, trigger side effects, or integrate with external services:

```typescript
// amplify/data/resource.ts
import { a, defineFunction } from "@aws-amplify/backend";

const schema = a.schema({
  // ...
  createOrderWithInventoryCheck: a
    .mutation()
    .arguments({
      productId: a.id().required(),
      quantity: a.integer().required(),
    })
    .returns(a.ref("Order"))
    .handler(
      a.handler.function(
        defineFunction({ entry: "./create-order-handler.ts" })
      )
    )
    .authorization((allow) => [allow.authenticated()]),
});
```

The handler function receives the AppSync event shape and has access to the full AWS SDK. This is where you put logic that the auto-generated resolvers cannot express — inventory checks, external payment API calls, conditional workflows.
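
A sketch of what the inventory check inside `create-order-handler.ts` might look like. The event type below is a trimmed-down, locally defined approximation of the AppSync resolver event (only `arguments` and `identity.sub`), and the in-memory `stock` map is a stand-in for a real table. A production handler would be async and would use a conditional write so that concurrent orders cannot oversell.

```typescript
// Locally defined, simplified event shape (an assumption, not the full AppSync type).
interface CreateOrderArgs {
  productId: string;
  quantity: number;
}

interface ResolverEvent {
  arguments: CreateOrderArgs;
  identity?: { sub: string };
}

interface Order {
  id: string;
  productId: string;
  quantity: number;
  owner: string;
}

// In-memory stand-in for an inventory table.
const stock: Record<string, number> = { "prod-1": 5 };
let nextOrderNumber = 1;

function createOrderWithInventoryCheck(event: ResolverEvent): Order {
  const identity = event.identity;
  if (!identity) {
    throw new Error("Unauthenticated");
  }
  const { productId, quantity } = event.arguments;
  if (!(quantity > 0) || Math.floor(quantity) !== quantity) {
    throw new Error("Quantity must be a positive integer");
  }
  const available = stock[productId] || 0;
  if (quantity > available) {
    throw new Error(`Insufficient inventory for ${productId}`);
  }
  // Reserve the stock, then create the order record.
  stock[productId] = available - quantity;
  const id = `order-${nextOrderNumber}`;
  nextOrderNumber += 1;
  return { id, productId, quantity, owner: identity.sub };
}

const firstOrder = createOrderWithInventoryCheck({
  arguments: { productId: "prod-1", quantity: 2 },
  identity: { sub: "user-1" },
});
console.log(firstOrder.id, stock["prod-1"]);
```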

### Function Definitions

For operations that live outside the AppSync layer — scheduled jobs, event-driven processing, webhook handlers — Amplify Gen 2 provides function definitions that deploy as Lambda functions and can be connected to EventBridge, SQS, or HTTP endpoints:

```typescript
// amplify/functions/process-webhook/resource.ts
import { defineFunction, secret } from "@aws-amplify/backend";

export const processWebhook = defineFunction({
  name: "process-webhook",
  entry: "./handler.ts",
  timeoutSeconds: 30,
  environment: {
    STRIPE_WEBHOOK_SECRET: secret("STRIPE_WEBHOOK_SECRET"),
  },
});
```

The function connects to API Gateway through the backend definition, giving you a deployable webhook endpoint with the Amplify environment variables and IAM context already configured.

---

## Deployment Pipelines

Amplify Gen 2 integrates with Amplify Hosting for CI/CD. The deployment model provisions sandbox environments (development) from developer machines and deploys production through connected branches in Amplify Hosting.

### Branch-Based Environments

Each Git branch can map to an isolated Amplify environment with its own provisioned resources. This gives you true environment parity between staging and production — the same AppSync API, the same Cognito user pool configuration, the same DynamoDB tables, just with separate provisioned instances.

The configuration in `amplify.yml`:

```yaml
version: 1
backend:
  phases:
    build:
      commands:
        - npm ci
        - npx ampx pipeline-deploy --branch $AWS_BRANCH --app-id $AWS_APP_ID
frontend:
  phases:
    build:
      commands:
        - npm run build
  artifacts:
    baseDirectory: .next
    files:
      - "**/*"
  cache:
    paths:
      - node_modules/**/*
```

### Custom Domains and Edge Configuration

Custom domain configuration in Amplify Hosting is functional but limited compared to a CloudFront distribution you control directly. Advanced edge behaviors — custom cache policies, origin request policies, Lambda@Edge functions — require either going through the Amplify console (which exposes a subset of CloudFront options) or managing the CloudFront distribution outside Amplify.

For Next.js applications that require sophisticated edge configuration — geo-based routing, A/B testing at the edge, advanced cache invalidation — this is a meaningful constraint.

---

## Limitations and When Not to Use Amplify

Amplify Gen 2 is the right tool for applications that map cleanly onto its assumptions. It is the wrong tool when those assumptions conflict with your requirements.

**Complex relational data models.** If your application is fundamentally relational — complex joins, ad-hoc reporting, transactions across multiple entities — DynamoDB is not the right storage layer. Amplify does not support PostgreSQL or MySQL as a primary data store through its standard data layer.

**Strict infrastructure control requirements.** Some organizations require specific VPC configurations, custom KMS key management, fine-grained IAM policies that differ from what Amplify provisions, or integration with existing AWS infrastructure that predates the Amplify deployment. Amplify supports CDK customization via `defineBackend`, but the further you go down that path, the more you are managing raw CDK rather than Amplify.

**Multi-region deployments.** Amplify Hosting is primarily a single-region deployment model. Global multi-region active-active architectures are not a natural fit for Amplify's deployment model.

**High-throughput APIs with complex rate limiting requirements.** AppSync is a capable GraphQL endpoint, but organizations with sophisticated API rate limiting requirements, complex quota management, or high per-second throughput needs may find that a custom API Gateway + Lambda setup gives them more control.

**Teams that need to own the infrastructure.** Amplify abstracts significant infrastructure complexity. That is its value proposition. For teams that need deep operational visibility into their infrastructure, that abstraction can be an obstacle rather than an asset.

---

## Production Readiness Considerations

### Logging and Observability

Amplify deploys CloudWatch logging for AppSync and Lambda functions by default. For production applications, this is necessary but not sufficient.

A useful production observability stack on Amplify:
- **CloudWatch Logs Insights** for structured log querying across Lambda and AppSync
- **CloudWatch Alarms** on error rates, latency percentiles, and function throttles
- **X-Ray** for distributed tracing across AppSync resolvers and Lambda functions — enable this at the AppSync level and instrument Lambda handlers
- **RUM** (Real User Monitoring via CloudWatch) for frontend performance data

The Amplify console exposes a subset of these metrics. For serious production monitoring, go directly to CloudWatch and build the dashboards and alarms you need there.

### Custom Domain Setup

Amplify Hosting manages ACM certificate provisioning for custom domains. The setup is straightforward for domains managed in Route 53. For domains managed externally, you will need to add CNAME validation records manually, and the console workflow for this is functional but not fast.

One practical consideration: Amplify uses CloudFront under the hood, but the distribution is managed by Amplify. If you have existing CloudFront behavior configurations you want to apply, check what Amplify exposes through the console before assuming it is configurable.

### Environment Variable Management

Amplify Gen 2 introduced the `secret()` function for referencing sensitive values. Secrets are stored as encrypted parameters in AWS Systems Manager Parameter Store and injected at build time and runtime, so they never appear in your TypeScript code or your environment variable files.

For non-secret configuration that varies by environment (feature flags, API endpoint URLs, tier-specific configuration), use Amplify environment variables configured per branch in the Amplify console or in `amplify.yml`. Do not put environment-specific configuration in your TypeScript backend definition files — it defeats the purpose of having branch-based environment parity.

---

## Frequently Asked Questions

### Should I use Amplify Gen 2 or CDK directly?

If you want the managed CI/CD, the integrated auth and data layers, and the Gen 2 TypeScript configuration model, Amplify Gen 2 is reasonable. If you have complex infrastructure requirements or need precise control over every AWS resource, CDK directly gives you more flexibility at the cost of more responsibility. Teams that start with Amplify often migrate specific concerns to CDK via `defineBackend` as the application grows — this is a supported pattern.

### Can I use Amplify Gen 2 with an existing Next.js application?

Yes. The Amplify backend is separate from your Next.js application. You can add Amplify to an existing Next.js project, connect it to Amplify Hosting, and adopt the auth and data layers incrementally. The most common starting point for an existing application is adding Amplify Hosting for CI/CD, then optionally adopting the auth and data layers.

### How does Amplify Gen 2 handle database migrations?

It does not, in the traditional sense. DynamoDB is schemaless, so there is no migration runner. Schema changes in Amplify Gen 2 that affect DynamoDB (new fields, new indexes) are applied by deploying the updated schema definition. Removing a field from the schema does not delete data from DynamoDB — it stops the auto-generated resolvers from reading it. Managing backward compatibility during schema evolution is your responsibility.
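
In practice, "your responsibility" usually means reading defensively. A minimal sketch, assuming a `role` field that was added after some rows were written and a hypothetical `legacyTitle` field that was later removed from the schema but still exists on older items:

```typescript
type Role = "member" | "admin" | "viewer";

// Raw stored item. Older rows may lack `role`; rows written before a field
// was removed from the schema may still carry it (here, `legacyTitle`).
interface StoredUser {
  email: string;
  role?: string;
  legacyTitle?: string;
}

interface UserView {
  email: string;
  role: Role;
}

// Maps missing or unrecognized values to a default instead of failing.
function normalizeRole(value: string | undefined): Role {
  if (value === "member" || value === "admin" || value === "viewer") {
    return value;
  }
  return "member"; // assumed default for rows written before `role` existed
}

// Reads only the fields the current schema knows about and ignores the rest.
function toUserView(item: StoredUser): UserView {
  return { email: item.email, role: normalizeRole(item.role) };
}

const oldRow: StoredUser = { email: "a@example.com", legacyTitle: "Director" };
const newRow: StoredUser = { email: "b@example.com", role: "admin" };
console.log(toUserView(oldRow), toUserView(newRow));
```

The alternative is a one-off backfill script that rewrites old items. Which one fits depends on table size and how long old shapes need to be tolerated.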

---

Amplify Gen 2 is worth evaluating seriously for Next.js applications that align with its model. The TypeScript-first approach is a genuine improvement, and the integrated auth and data layers save real time compared to assembling those components manually. The constraints are real but predictable — knowing them in advance lets you design around them or make an informed decision to use a different stack.

If you are evaluating Amplify Gen 2 for a production application and want a second opinion on whether the stack fits your requirements, [start with an architecture conversation](/contact). The answer is usually clear within a focused discussion.

---

## BAA-Ready AI: What to Ask Vendors
Source: https://tampadynamics.com/blog/baa-ready-ai-vendor-questions

> The specific questions that separate AI vendors who can support a HIPAA workload from vendors who say they can. A practical guide for healthcare buyers in early evaluation.

Date: 2026-04-02

"We support HIPAA" is a sentence vendors say. What it means in practice ranges from "we have a BAA template ready to send" to "we have never seen a healthcare customer and we hope it works out."

This is the list of questions we use during vendor evaluation for AI products that will touch PHI. The answers you get separate vendors who have done this work from vendors who think they can.

## On the BAA itself

**Will you sign your BAA, or ours?** Most established health-tech vendors have a BAA they will sign. Some require theirs. A vendor who balks at signing any BAA is not a healthcare vendor.

**What is the BAA's scope?** Does it cover all of the vendor's services or just specific ones? Does it cover sub-processors? Does it carve out any data uses (analytics, model improvement) that a covered entity should refuse?

**What sub-processors are in scope?** Every third party in the data path needs BAA coverage. The LLM provider, the embedding endpoint, the vector database, the logging service. Get the list.

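**What is the breach notification timeline?** HIPAA requires a business associate to notify the covered entity without unreasonable delay, and no later than 60 days after discovering a breach. Vendor BAAs commonly specify 24, 48, or 72 hours. Anything longer than 72 hours should be flagged.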

## On the data path

**Where does PHI go from the moment we send it?** The vendor should be able to draw a diagram of every system the data touches, with each labeled as "BAA-covered" or "not." If they cannot, they have not thought through the data path.

**Where is data stored?** AWS region, Azure region, GCP region. Multi-region is fine; "we don't know" is not. Some healthcare contracts require US-based storage; some require specific regions.

**Where do models run?** A model accessed via the vendor's API endpoint is not the same as a model deployed in your VPC. Both can be appropriate; the vendor should be clear which they offer.

**What happens to PHI at rest?** Encryption with which key? Customer-managed keys? Vendor-managed keys with attestation? The detail matters.
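
One way to make the vendor's answers in this section comparable is to record them as a data-path inventory and check it mechanically. The sketch below is illustrative only: the `DataPathHop` fields, the hop names, and the sample answers are assumptions, not any real vendor's architecture.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataPathHop:
    """One system that PHI touches. All field names here are illustrative."""
    name: str
    baa_covered: bool
    region: Optional[str]  # None means the vendor could not say
    customer_managed_key: bool


def find_gaps(hops: List[DataPathHop]) -> List[str]:
    """Return human-readable findings for hops that need follow-up."""
    findings = []
    for hop in hops:
        if not hop.baa_covered:
            findings.append(f"{hop.name}: no BAA coverage")
        if hop.region is None:
            findings.append(f"{hop.name}: storage region unknown")
    return findings


# Hypothetical vendor answers, recorded during an evaluation call.
vendor_path = [
    DataPathHop("api-gateway", True, "us-east-1", True),
    DataPathHop("llm-provider", True, "us-east-1", False),
    DataPathHop("logging-service", False, None, False),
]

gaps = find_gaps(vendor_path)
```

Any non-empty `gaps` list is a set of follow-up questions for the vendor, not a verdict on its own.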

## On training data

**Is our data used to train models?** The default for enterprise tiers at major LLM providers (Anthropic, OpenAI, AWS Bedrock) is no. Confirm the contract reflects this. Then verify that the vendor itself, not just its model provider, does not train on your data.

**Does the vendor fine-tune models on customer data?** If yes, what is the data lifecycle? If a customer leaves, can the contributions to fine-tuned models be removed? Often the answer is "not really," which is a deal-breaker for sensitive workloads.

**Does the vendor use your data for anything other than serving you?** "Quality improvement," "feature development," "product analytics" are all answers that should slow procurement down. Get specifics.

## On audit and logging

**What is logged for every model invocation?** The right answer includes: requesting user identity, prompt, retrieved context, model output, tool calls, timestamps. "Aggregate analytics" is not the right answer for HIPAA work.

**Can we access the audit log directly?** Some vendors keep the AI audit log internal. For HIPAA, you need to be able to query it without a support ticket. API access, with rate limits and quotas, is the expectation.

**How long are audit logs retained?** Six years minimum for HIPAA. Some vendors keep less; some keep indefinitely. Both have implications.

**Are audit logs encrypted at rest? With customer-managed keys?** The audit log is PHI. Treat it that way.
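
As a rough illustration of the per-invocation record described in this section, the sketch below models one log entry and a check for missing fields. The class name, field names, and JSON layout are assumptions for illustration; a vendor's actual schema will differ.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class InvocationRecord:
    """One audit entry per model call (illustrative field names)."""
    user_id: str
    prompt: str
    retrieved_context: List[str]
    model_output: str
    tool_calls: List[str] = field(default_factory=list)
    timestamp: str = field(default_factory=_utc_now)


REQUIRED_FIELDS = (
    "user_id", "prompt", "retrieved_context",
    "model_output", "tool_calls", "timestamp",
)


def to_log_line(record: InvocationRecord) -> str:
    """Serialize a record as one JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


def missing_fields(raw: dict) -> List[str]:
    """List required fields absent from a parsed log entry."""
    return [name for name in REQUIRED_FIELDS if name not in raw]


record = InvocationRecord(
    user_id="clinician_42",
    prompt="Summarize the discharge note.",
    retrieved_context=["doc-17#section-2"],
    model_output="Patient discharged in stable condition.",
)
line = to_log_line(record)
```

Running `missing_fields` over a sample export from the vendor's audit API is a quick way to see whether "we log everything" holds up. Because each line contains the prompt and output, the log itself has to be stored as PHI.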

## On tenant isolation

**How is our data isolated from other customers'?** Index-per-tenant, namespace-per-tenant, or shared resources with metadata filters. Each has different security implications. A vendor who answers "logically separated" without specifics is a flag.

**Can you demonstrate isolation?** The vendor should be able to show the architecture, the access control layer, and the controls that enforce isolation. Marketing diagrams are not demonstration.

**What happens if another customer experiences a breach?** Worst case, what is your exposure? A vendor whose architecture has a single shared resource with weak isolation is one breach away from being yours.
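
For the shared-resource-with-metadata-filter model, the control that matters is where the tenant filter comes from. The following sketch uses a toy in-memory index (substring match stands in for vector search) to show the filter being derived from the authenticated session rather than from request input. `SharedIndex` and `tenant_search` are hypothetical names, not a real vector database API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str


class SharedIndex:
    """Toy stand-in for a shared index; search is substring match only."""

    def __init__(self) -> None:
        self._chunks: List[Chunk] = []

    def add(self, chunk: Chunk) -> None:
        self._chunks.append(chunk)

    def search(self, query: str, filters: Dict[str, str]) -> List[Chunk]:
        return [
            c for c in self._chunks
            if query.lower() in c.text.lower()
            and all(getattr(c, key) == value for key, value in filters.items())
        ]


def tenant_search(index: SharedIndex, authenticated_tenant_id: str, query: str) -> List[Chunk]:
    """Apply the tenant filter from the session, never from caller-supplied input."""
    if not authenticated_tenant_id:
        raise PermissionError("no tenant on session")
    return index.search(query, {"tenant_id": authenticated_tenant_id})


index = SharedIndex()
index.add(Chunk("tenant-a", "Discharge summary for tenant A"))
index.add(Chunk("tenant-b", "Discharge summary for tenant B"))

results = tenant_search(index, "tenant-a", "discharge")
```

If every query path in the vendor's product goes through a wrapper like `tenant_search`, a missing filter is a code defect that can be tested for. If callers can pass their own filters, isolation depends on every caller getting it right.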

## On model behavior

**Which model is being used?** Specifics. "Claude Sonnet" is more useful than "an LLM." "Claude 3.5 Sonnet on AWS Bedrock" is the level of specificity that matters.

**When does the model change?** Vendors update underlying models. Will you be notified? Is there an evaluation step? Will behavior change without your knowledge?

**Are the prompts proprietary?** If the vendor will not show you the prompts, they will not be able to explain the model's behavior, and neither will you. For HIPAA work where every output may need to be defended, this matters.

**What guardrails are in place?** Content filters, PII detection, prompt-injection protection. The vendor should have answers.

## On outputs

**Are outputs cited?** Every claim the model makes should reference the source documents that informed it. Free-form outputs without citations are not auditable.

**What happens when retrieval comes back empty?** The system tells the user it does not know. Or the system makes something up. The first is acceptable for healthcare; the second is not.

**Is there a human-in-the-loop boundary?** For any clinical decision, a clinician should review and approve. The vendor's product should support this. If the workflow is "AI takes action automatically," that is a flag for HIPAA workloads.
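
The three behaviors in this section (cited outputs, an explicit "not found" on empty retrieval, and a review boundary) can be sketched as a thin wrapper around the model call. Everything here is an assumption for illustration: `generate` is a hypothetical callable standing in for the vendor's model invocation, and the `Answer` fields are made up.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Passage:
    source_id: str
    text: str


@dataclass(frozen=True)
class Answer:
    text: str
    citations: Tuple[str, ...]
    pending_clinician_review: bool


NO_ANSWER = "I could not find this in the available documents."


def answer_question(
    question: str,
    passages: List[Passage],
    generate: Callable[[str, List[Passage]], str],
) -> Answer:
    """Refuse when retrieval is empty; otherwise return a cited draft for review."""
    if not passages:
        # The model is never called without source material.
        return Answer(NO_ANSWER, (), False)
    draft = generate(question, passages)
    citations = tuple(p.source_id for p in passages)
    return Answer(draft, citations, True)


def stub_generate(question: str, passages: List[Passage]) -> str:
    """Stand-in for a real model call; it only echoes the first passage."""
    return f"Based on the record: {passages[0].text}"


cited = answer_question(
    "What is the allergy status?",
    [Passage("chart-9#allergies", "No known drug allergies.")],
    stub_generate,
)
```

The design choice worth asking a vendor about is the early return: whether the "no passages" case is handled in code before the model is invoked, or left to the prompt.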

## On customer support

**Has the vendor passed a real procurement at a covered entity?** Reference customers. Names. The vendor should be willing to put you in touch with at least one customer in healthcare.

**What is the SLA for security incidents?** AI systems break in novel ways. Hallucinations, prompt injection, retrieval failures. The vendor's response capability matters as much as their preventative posture.

**Who is your security contact?** A specific person. A specific email. A specific escalation path. Not "support@vendor.com."

## Red flags

A few responses that should slow procurement down materially:

- "We use the latest models" without specifying.
- "Your data is secure" without explanation.
- "We're working on HIPAA" — not the same as HIPAA today.
- "We can sign a BAA but it covers fewer services than we offer" — the carve-outs are where the risk lives.
- "Our model provider handles that" — they may, but the contract is between you and the vendor in front of you.
- Vague answers to detailed technical questions, especially in front of a Security Officer.

Some red flags are disqualifying. Some indicate a vendor whose security posture is less mature than they realize, where the right move is a longer evaluation and a stricter SOW.

## The deeper test

Beyond the checklist, the question that often separates ready vendors from unready ones: how the vendor talks about regulatory work.

Vendors who have done HIPAA work talk about it as a set of practical engineering and operational decisions. Encryption keys, audit log schemas, retention policies, sub-processor lists. The work is detailed but tractable.

Vendors who have not done HIPAA work tend to talk about it abstractly. "We're compliant." "We support HIPAA." "Our infrastructure is secure." The lack of specifics is the signal.

If you are working through this checklist and the vendor's answers are consistently abstract, that is the answer. Move on.

## Where we fit

We regularly do BAA-readiness reviews for healthcare clients evaluating AI vendors. The conversation often produces a shorter shortlist than the original RFP, but the vendors that survive are the ones that will clear the rest of the procurement process. [Get in touch](/contact) if you have a vendor evaluation in front of you.

---

## Building Compliant AI Workflows for Regulated Industries
Source: https://tampadynamics.com/blog/building-compliant-ai-workflows

> How to integrate AI into healthcare, legal, and compliance-focused systems while maintaining security, auditability, and regulatory compliance.

Date: 2024-11-30

Integrating AI into regulated industries requires more than just connecting an LLM to your application. It demands careful attention to data handling, audit trails, and human oversight.

## The Challenge

Organizations in healthcare, legal, and financial services face unique constraints when adopting AI:

- **Data residency and privacy** — PHI, PII, and privileged information can't flow through arbitrary third-party services
- **Auditability** — Every AI-assisted decision needs a clear trail for compliance reviews
- **Human oversight** — AI should augment human judgment, not replace it without review

## Our Approach

At Tampa Dynamics, we architect AI workflows with these principles from day one:

### 1. Data Never Leaves Your Control

We design systems where sensitive data stays within your infrastructure. AI models can be self-hosted, or we use privacy-preserving patterns that anonymize data before it reaches external APIs.
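
As a minimal sketch of the anonymize-before-send pattern, the code below swaps a few identifier formats for placeholders and keeps the mapping local so the response can be re-identified inside your boundary. The regular expressions are illustrative assumptions and catch only obvious formats; pattern matching alone does not amount to HIPAA de-identification.

```python
import re
from typing import Dict, Tuple

# Illustrative patterns only; real PHI detection needs far broader coverage.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace matches with placeholders; the mapping never leaves your systems."""
    mapping: Dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        def substitute(match: "re.Match[str]", label: str = label) -> str:
            token = f"[{label}_{len(mapping) + 1}]"
            mapping[token] = match.group(0)
            return token
        text = pattern.sub(substitute, text)
    return text, mapping


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back into a response received from the external API."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


note = "Patient reachable at 813-555-0142 or jane@example.com, SSN 123-45-6789."
redacted, mapping = redact(note)
```

Only `redacted` would be sent to the external service. `mapping` stays in your infrastructure and is applied with `restore` to whatever comes back.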

### 2. Every Decision is Logged

Our systems capture:
- What data was sent to the AI
- What response was received
- Who reviewed the output
- What action was taken

### 3. Guardrails by Default

We implement validation layers that catch potential issues before they reach end users—whether that's checking for hallucinated information or ensuring outputs meet compliance standards.

## Getting Started

If you're exploring AI adoption in a regulated environment, we'd recommend starting with a focused pilot:

1. Identify a specific workflow that's manual and time-consuming
2. Define clear success metrics and compliance requirements
3. Build with audit logging and human review from the start
4. Iterate based on real-world feedback

Ready to discuss your AI strategy? [Request an architecture review](/contact) to explore what's possible.

---

## Custom Software for Healthcare Providers: When It Makes Sense and How to Do It Right
Source: https://tampadynamics.com/blog/custom-software-for-healthcare

> A practical guide to custom healthcare software development — covering use cases, HIPAA requirements, integration complexity, and what distinguishes successful projects.

Date: 2026-02-13

Most healthcare technology decisions are not binary choices between custom software and nothing. They are choices between custom software and a commercial product that may or may not fit the specific clinical workflow, data environment, or integration requirement at hand.

The question "should we build custom software?" almost always deserves a more specific framing: "Is the workflow we need to support well-served by available commercial products, or does our specific combination of clinical context, data model, and integration requirements make custom development the better investment?"

This guide is for healthcare operators, clinical IT leaders, and engineering teams working through that evaluation.

---

## When Custom Software Beats Commercial

Commercial healthcare software is good at solving the problems that most organizations share: scheduling, billing, basic EHR data capture, standard reporting. The further your requirements deviate from the median, the worse commercial software performs.

Custom software is likely the better investment when:

**Your workflow is genuinely non-standard.** Specialty practices, research environments, and care delivery models that differ from the ambulatory or inpatient norm often find that commercial tools approximate their workflows without quite fitting. Workarounds in clinical workflows have patient safety implications. Custom software that fits the actual workflow is safer than a commercial product that requires the workflow to adapt to it.

**Your data sensitivity requires architectural control.** Some organizations handle data that demands tighter controls than commercial platforms provide — not because commercial platforms are insecure, but because the organization's compliance posture, legal exposure, or regulatory environment requires specific architectural patterns that multi-tenant commercial products cannot accommodate. Research data, behavioral health records, and substance use disorder treatment records (42 CFR Part 2) are examples where standard commercial platforms create compliance complexity.

**Integration requirements are unusual.** If you need to integrate with proprietary systems, legacy infrastructure, or non-standard EHR configurations that commercial vendors do not prioritize, custom software may be the only practical path. Commercial platforms optimize their integrations for the most common EHRs and the most standard interface patterns. Edge cases in their integration support are often expensive or impossible to address without vendor involvement.

**You are building a product, not just solving an internal problem.** Healthcare providers that are building clinical workflow tools to offer to affiliated organizations, health systems deploying internally developed tools across a network, or care delivery companies with proprietary clinical intelligence are building products. Products require control over the roadmap, the data model, and the integration architecture that commercial software does not provide.

---

## Common Use Cases for Custom Healthcare Software

### Clinical Workflow Tools

Specialty-specific workflow tools that extend or sit alongside an EHR — rather than replacing it. Examples include:

- Pre-procedure checklists and documentation workflows that the EHR supports generically but not specifically enough for the clinical protocol
- Care coordination tools that aggregate data from multiple source systems into a single interface for a specific team or role
- Clinical decision support tools that apply organization-specific protocols to patient data
- Population health management tools that operate on data extracted from the EHR and normalized for the organization's specific patient cohort

These tools typically integrate with the EHR rather than replacing it, and they require careful attention to data synchronization, conflict resolution, and the user experience of moving between systems.

### Patient Communication and Engagement

Patient-facing applications for appointment scheduling, secure messaging, care instructions, and remote patient monitoring are an area where commercial products are abundant but often generic. Custom development makes sense when:

- The care model is distinct enough that generic patient portal features do not serve it
- The organization needs integration between patient-facing functionality and proprietary backend systems
- Branding and user experience are strategic differentiators
- The engagement model involves workflows (remote monitoring, chronic disease management, post-procedure follow-up) that commercial platforms support poorly

Patient-facing healthcare applications have a different UX bar than clinician-facing tools. Patients are not trained on software; they use it infrequently and often under stress. The design investment in patient-facing custom software is higher, not lower, than for internal tools.

### Analytics and Clinical Reporting

Healthcare providers often have significant data assets — years of clinical records, operational data, billing data — that commercial analytics platforms do not integrate or model correctly. Custom analytics platforms are appropriate when:

- The analysis requires joining clinical, operational, and financial data in ways that commercial BI tools do not support
- The clinical metrics being tracked are organization-specific and not handled by standard quality reporting platforms
- Machine learning on proprietary clinical data is part of the strategy

Analytics platforms that process PHI have the same HIPAA obligations as any other system in the environment. De-identification for analytics is a significant architectural consideration — see the HIPAA compliant app development guide for technical patterns.

### Administrative Automation

Revenue cycle, prior authorization, referral management, and credentialing are workflow areas with significant administrative burden. Custom automation is appropriate when:

- The existing process is handled manually in ways that create errors, delays, or staff burden
- Commercial products for the specific workflow are not available or are a poor fit
- Integration with the organization's specific systems creates more complexity than commercial platforms can handle

---

## HIPAA Technical Requirements That Shape Development

Every custom healthcare application that creates, receives, maintains, or transmits PHI is a HIPAA-covered system. The technical requirements are not optional add-ons — they are architectural constraints that need to be designed in from the start, not retrofitted.

The technical safeguards most likely to affect system design:

**Access control architecture** — HIPAA's minimum necessary standard requires that access to PHI be scoped to what is actually needed for the specific purpose. This is not an RBAC checkbox; it is an access control design problem. Role-based access is a minimum. Healthcare applications typically require attribute-based access control that considers facility assignments, care team membership, and the patient's relationship to the accessing clinician.
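
To make the difference between a role check and an attribute-based check concrete, here is a small sketch. The role names, the `Clinician` and `PatientRecord` shapes, and the all-three-conditions policy are assumptions; an actual policy would come from the organization's compliance and clinical leadership.

```python
from dataclasses import dataclass
from typing import FrozenSet

CLINICAL_ROLES = frozenset({"physician", "nurse"})


@dataclass(frozen=True)
class Clinician:
    user_id: str
    role: str
    facilities: FrozenSet[str]


@dataclass(frozen=True)
class PatientRecord:
    patient_id: str
    facility: str
    care_team: FrozenSet[str]


def can_view(clinician: Clinician, record: PatientRecord) -> bool:
    """Role alone is not enough: facility and care-team membership must also match."""
    has_role = clinician.role in CLINICAL_ROLES
    same_facility = record.facility in clinician.facilities
    on_care_team = clinician.user_id in record.care_team
    return has_role and same_facility and on_care_team


record = PatientRecord("pt_001", "tampa-general", frozenset({"dr_lee"}))
treating = Clinician("dr_lee", "physician", frozenset({"tampa-general"}))
unrelated = Clinician("dr_kim", "physician", frozenset({"tampa-general"}))
```

Both clinicians hold the same role at the same facility, so a role-only check would admit both. Only `treating` passes `can_view`, because only that clinician is on the patient's care team.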

**Audit logging** — Every access, modification, and export of PHI must be logged with sufficient detail to reconstruct the complete history of any record. Audit logs must be retained for six years, stored separately from application logs, and protected against modification.
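
One common way to make an audit log tamper-evident is to chain entries by hash, so that editing any earlier entry invalidates everything after it. The sketch below shows the idea with an in-memory list. It is an illustration under stated assumptions (hypothetical event fields, SHA-256 over a JSON encoding), and it detects modification rather than preventing it, so it complements write-once storage and access controls rather than replacing them.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict[str, str]) -> str:
    payload = prev_hash + json.dumps(event, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log: List[dict], event: Dict[str, str]) -> None:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, event),
    })


def verify_chain(log: List[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: List[dict] = []
append_event(audit_log, {"user": "dr_lee", "action": "view", "record": "pt_001"})
append_event(audit_log, {"user": "dr_lee", "action": "export", "record": "pt_001"})
```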

**Encryption** — PHI at rest must be encrypted with documented key management. PHI in transit must use TLS 1.2 or higher. Encryption keys must be managed separately from encrypted data.

**Automatic session termination** — Active clinical sessions must terminate automatically after a configurable inactivity period. Automatic logoff is an addressable implementation specification under 45 CFR §164.312(a)(2)(iii), so it must either be implemented or replaced with a documented, reasonable alternative.
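
A server-side inactivity check can be sketched in a few lines. The names below (`Session`, `SessionManager`, the 15-minute timeout) are assumptions for illustration; the clock value is passed in explicitly so the logic is easy to test, and a real service would typically supply `time.monotonic()`.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user_id: str
    last_activity: float  # seconds on a monotonic clock
    terminated: bool = False


class SessionManager:
    def __init__(self, idle_timeout_seconds: float) -> None:
        self.idle_timeout_seconds = idle_timeout_seconds

    def is_expired(self, session: Session, now: float) -> bool:
        idle_for = now - session.last_activity
        return session.terminated or idle_for >= self.idle_timeout_seconds

    def touch(self, session: Session, now: float) -> bool:
        """Record activity. Returns False and terminates the session if it idled out."""
        if self.is_expired(session, now):
            session.terminated = True
            return False
        session.last_activity = now
        return True


# Hypothetical 15-minute timeout; the right value is a policy decision.
manager = SessionManager(idle_timeout_seconds=900)
session = Session("nurse_17", last_activity=0.0)
```

Enforcing the timeout on the server on every request, rather than only with a browser timer, means a stale tab cannot resume a session that should already be closed.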

**BAA requirements for the full vendor stack** — Every vendor whose infrastructure processes PHI needs a BAA. This includes cloud providers, database hosting, authentication services, monitoring and observability tools, and any AI or LLM services integrated into the application. Mapping your vendor stack to BAA coverage is an early architectural requirement, not a legal afterthought.

The HIPAA Security Rule uses required and addressable specifications. Neither category is optional — addressable specifications must either be implemented or documented with a compliant alternative. If your team is unfamiliar with this distinction, reviewing the full guidance before design begins is worth the time.

---

## EHR Integration Complexity

EHR integration is consistently the most underestimated source of complexity in healthcare software projects. Understanding what you are getting into before committing to an integration approach is essential.

### FHIR as the Standard Path

HL7 FHIR (Fast Healthcare Interoperability Resources) is the current standard for healthcare data exchange, and the 21st Century Cures Act mandated FHIR-based APIs for certified EHR systems. In theory, this means a standardized path to EHR data. In practice:

- FHIR implementations vary significantly across EHR vendors — the same FHIR resource may be populated differently, contain different optional fields, or have different behavior in edge cases
- FHIR APIs typically expose a subset of EHR data, not all of it. Custom clinical data, proprietary fields, and legacy data structures may not be available via FHIR
- FHIR access often requires going through the EHR vendor's developer program, which involves its own approval process and timeline

FHIR is the right starting point for EHR integration. It is not a guarantee of a smooth integration.
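
In practice, this variability means treating every optional FHIR element as possibly absent. The sketch below reads a FHIR R4 `Patient` resource defensively; the two sample payloads are made-up shapes, not output from any particular EHR, and the summary fields are chosen only for illustration.

```python
from typing import Any, Dict, Optional


def display_name(patient: Dict[str, Any]) -> Optional[str]:
    """Best-effort name from Patient.name; returns None if nothing usable exists."""
    for name in patient.get("name", []):
        if name.get("text"):
            return name["text"]
        given = " ".join(name.get("given", []))
        family = name.get("family", "")
        full = f"{given} {family}".strip()
        if full:
            return full
    return None


def summarize_patient(patient: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Extract a few fields without assuming any optional element is present."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a Patient resource")
    return {
        "id": patient.get("id"),
        "name": display_name(patient),
        "birth_date": patient.get("birthDate"),
    }


# Two hypothetical shapes for the same kind of resource.
shape_a = {
    "resourceType": "Patient",
    "id": "a1",
    "name": [{"family": "Doe", "given": ["Jane", "Q"]}],
    "birthDate": "1980-01-01",
}
shape_b = {
    "resourceType": "Patient",
    "id": "b2",
    "name": [{"text": "Doe, Jane"}],
}

summary_a = summarize_patient(shape_a)
summary_b = summarize_patient(shape_b)
```

Both payloads are valid FHIR, yet they yield differently formatted names and one has no birth date. Downstream code has to decide what to do in each case rather than assume a single shape.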

### Proprietary APIs and HL7 v2

Many EHR systems have proprietary APIs that expose more data than FHIR or support write operations that FHIR does not. Older interfaces, particularly in larger health system environments, may still use HL7 v2 — a message-based format that predates FHIR by decades and has its own significant implementation variability.

Integrating with HL7 v2 interfaces requires an interface engine (Mirth Connect, Rhapsody, or similar) to translate between v2 messages and your application's data model. This adds infrastructure, operational overhead, and expertise requirements.
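
For a sense of what the interface engine is translating, the sketch below pulls three fields out of the PID segment of a v2 message. It assumes the default delimiters (`|` for fields, `^` for components) and ignores repetitions and escape sequences, so it is a reading aid rather than a substitute for an interface engine. The sample message is synthetic.

```python
from typing import Dict, Optional


def parse_pid(message: str) -> Optional[Dict[str, str]]:
    """Extract patient ID (PID-3), name (PID-5), and birth date (PID-7)."""
    # Segments are separated by carriage returns; tolerate newlines as well.
    for segment in message.replace("\n", "\r").split("\r"):
        fields = segment.split("|")
        if fields[0] != "PID":
            continue

        def field_at(index: int) -> str:
            return fields[index] if len(fields) > index else ""

        name_parts = field_at(5).split("^")
        return {
            "patient_id": field_at(3).split("^")[0],
            "family_name": name_parts[0],
            "given_name": name_parts[1] if len(name_parts) > 1 else "",
            "birth_date": field_at(7),
        }
    return None


# Synthetic ADT message; no real patient data.
sample = (
    "MSH|^~\\&|SENDER|FACILITY|RECEIVER|FACILITY|20260213120000||ADT^A01|MSG0001|P|2.5\r"
    "PID|1||12345^^^HOSP^MR||DOE^JANE||19800101|F"
)

patient = parse_pid(sample)
```

Even this small example shows where site-to-site variability comes from: which identifier appears first in PID-3, and how names are split into components, both differ between senders.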

### Integration Scoping

Before committing to EHR integration, define:

- Which specific EHR platform(s) you are integrating with (not "major EHRs" — specific systems and versions)
- Which data flows you need (read vs. write vs. bidirectional, which specific resources)
- Which interface type is available and supported (FHIR R4, SMART on FHIR, HL7 v2, proprietary API)
- What the EHR vendor's developer program requirements and timelines look like
- Whether a third-party integration platform (Health Gorilla, 1upHealth, Redox) simplifies the integration at acceptable cost

Discovering integration constraints after development has started is expensive. Discovery is part of the project, not a pre-project exercise to skip.

---

## PHI Data Handling Architecture

Custom healthcare software requires a documented, deliberate PHI data architecture. The core pattern:

**PHI lives in a designated, access-controlled data store.** This is not your general-purpose application database — it is a separate store with tighter access controls, encryption at rest with managed keys, and audit logging on every access.

**Application logic fetches PHI only when a specific clinical purpose requires it.** The application does not load full patient records by default; it fetches the minimum PHI necessary for the current operation.

**PHI identifiers and PHI content are separated.** Operational systems work with patient IDs and record identifiers. PHI content — the actual clinical data — is retrieved only at the point of rendering, under the access controls and audit logging of the PHI store.

**Non-production environments never use real PHI.** Development, staging, and QA environments use synthetic data or de-identified data. Exposing real PHI in development environments, including through production database snapshots, expands your HIPAA scope to systems and people who should not be in scope.
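
The first three points can be sketched as a small PHI store interface that requires a purpose, returns only the requested fields, and logs every read. This is a toy in-memory version with synthetic data; `PhiStore`, its method signature, and the log fields are assumptions for illustration.

```python
from typing import Dict, Iterable, List


class PhiStore:
    """Toy in-memory PHI store; a real one would be a separate encrypted database."""

    def __init__(self, records: Dict[str, Dict[str, str]]) -> None:
        self._records = records
        self.access_log: List[Dict[str, object]] = []

    def fetch(
        self,
        patient_id: str,
        fields: Iterable[str],
        user_id: str,
        purpose: str,
    ) -> Dict[str, str]:
        """Return only the requested fields and log the access before reading."""
        if not purpose:
            raise ValueError("a purpose is required for every PHI read")
        requested = sorted(set(fields))
        self.access_log.append({
            "user_id": user_id,
            "patient_id": patient_id,
            "fields": requested,
            "purpose": purpose,
        })
        record = self._records.get(patient_id, {})
        return {name: record[name] for name in requested if name in record}


# Synthetic record, consistent with the rule about non-production data.
store = PhiStore({
    "pt_001": {"name": "Test Patient", "dob": "1980-01-01", "allergies": "penicillin"},
})

allergy_view = store.fetch(
    "pt_001", ["allergies"], user_id="nurse_17", purpose="medication-administration"
)
```

Because callers must name the fields they need, loading a full patient record becomes a deliberate, visible choice in code review instead of the default.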

---

## Patient-Facing vs. Clinician-Facing Design Considerations

Healthcare software has two distinct user populations with fundamentally different design requirements.

**Clinician-facing tools** are used by trained professionals in high-stakes, time-pressured environments. The design priorities are efficiency, information density, and error prevention. Clinicians learn software through training and repeated use; they tolerate complexity if it supports efficiency. The cost of a usability failure is clinical workflow disruption and, in some contexts, patient safety risk.

**Patient-facing tools** are used by people with no specialized training, often in stressful situations, sometimes on mobile devices, sometimes in low-bandwidth environments. The design priorities are clarity, accessibility, and trust. The cost of a usability failure is patient disengagement or incorrect understanding of clinical information.

These are different product design problems. Teams that try to serve both populations with the same design language typically under-serve one of them. If your application serves both, design the two experiences separately and integrate at the data layer.

---

## Build vs. Buy Decision Framework

The build vs. buy decision in healthcare is not a simple cost comparison. The relevant factors:

**Workflow fit** — How well does the commercial product fit your specific clinical workflow, without modification? If the answer is "well enough with workarounds," model what those workarounds cost in user time, error rate, and staff frustration over three years.

**Integration compatibility** — Can the commercial product integrate with your specific EHR, at the data flows you need, on a timeline that matches your project? Integration promises from vendors deserve skepticism until confirmed in technical detail.

**Data control** — Does the commercial product allow you to control your data in the way your compliance posture requires? Data portability, deletion capability, and the vendor's data use policies are all relevant.

**Roadmap dependency** — Commercial products build features on their roadmap, not yours. If your requirements are evolving and you need control over the feature set, a commercial product may become a constraint faster than you expect.

**Total cost of ownership** — Commercial licensing costs are visible. The cost of living with a poor fit — workarounds, training, data cleanup, integration maintenance — is less visible but often larger.

Custom software has higher upfront costs and requires ongoing development investment. Commercial software has lower upfront costs and defers that investment into licensing, configuration, and the cost of workarounds. Neither is universally better; the right answer depends on how well the commercial product fits your specific requirements.

---

## Common Project Failure Patterns

Healthcare software projects fail in predictable ways:

**Underscoped EHR integration.** Integration complexity is discovered during development rather than during scoping. Timeline extends, scope contracts, and the application ships with integration gaps that limit its utility.

**Compliance retrofitted rather than designed in.** HIPAA requirements are added to an existing design rather than shaping the design from the start. The result is expensive rework of access control, audit logging, and data handling architecture.

**Clinician input deferred.** The application is designed by IT and product teams, validated with clinicians late in the process, and fails to reflect actual clinical workflow. Adoption suffers.

**Scope expansion without timeline adjustment.** Clinical requirements grow during development as the organization discovers that the original scope does not fully solve the problem. Timelines do not adjust proportionally.

**Vendor accountability gaps.** Development vendor delivers code that passes acceptance testing but does not reflect production-grade security controls. PHI handling, access control, and audit logging are not tested at the level required for a HIPAA-compliant system.

---

## What Good Healthcare Software Vendors Actually Deliver

A vendor that has done this work before will:

- Start with a thorough discovery phase that covers clinical workflow, EHR integration specifics, HIPAA obligations, and BAA requirements for the full vendor stack
- Design the access control and audit logging architecture before writing application code
- Use de-identified or synthetic data in all non-production environments
- Provide a clear data flow diagram that maps all PHI through the system
- Treat security review and compliance documentation as part of the deliverable, not an afterthought
- Structure the engagement so that clinical workflow validation happens early, not at the end

If a vendor is proposing to build a HIPAA-covered application without asking detailed questions about your PHI data flows, access control requirements, and BAA coverage for the full stack, that is a signal worth paying attention to.

---

## The Right Conversation to Start With

Custom healthcare software projects that succeed share a common pattern: the technical requirements — HIPAA architecture, EHR integration constraints, access control model, audit logging design — were treated as engineering problems to be solved, not compliance documents to be produced.

If you are evaluating custom software development for a clinical workflow, patient engagement, or analytics use case, [an architecture review](/contact) is a structured engagement to work through the technical requirements before committing to a development plan. Our [healthcare software development practice](/services/healthcare-software) and [healthcare AI consulting](/healthcare-ai-consulting) work covers the full range of custom clinical system design.

The conversation starts with your specific clinical context, not with a sales pitch about technology capabilities.

---

## DynamoDB Access Patterns for High-Performance Applications
Source: https://tampadynamics.com/blog/dynamodb-patterns

> A practical guide to DynamoDB data modeling — covering single-table design, access pattern planning, GSIs, sparse indexes, and the patterns that prevent expensive rework.

Date: 2026-01-14

DynamoDB is one of the most performant and scalable databases available on AWS. It is also one of the most expensive to retrofit when the data model is wrong. The reason is the same in both cases: DynamoDB is built around access patterns. You define the access patterns first, model the data to support them efficiently, and then query exactly as the model expects. Do this correctly and you get single-digit-millisecond latency at any scale. Design it in reverse (start with the data model and figure out access patterns later) and you eventually hit a wall that requires either rebuilding the data model or replacing DynamoDB with a relational database.

This guide covers the patterns that matter, starting with the cardinal rule.

---

## The Cardinal Rule: Access Patterns Before Data Model

In a relational database, you design a normalized schema and then write queries. The query layer is flexible — you can join tables, filter on any column, sort by arbitrary fields, and add indexes later if queries are slow.

In DynamoDB, the query layer is not flexible. A query must specify the partition key and can optionally narrow the result with a condition on the sort key. Secondary indexes (GSIs and LSIs) expand what you can query, but LSIs must be defined at table creation time, and every GSI you add later must be backfilled, paid for, and maintained. Ad-hoc queries that were not anticipated in the data model either require expensive scans or are impossible.

This means the design process is inverted. Before modeling any data, document every access pattern your application will use:

```
1. Get user by user_id
2. Get all orders for a user, sorted by date (descending)
3. Get all orders with status=PENDING across all users (admin use)
4. Get order by order_id
5. Get all items in an order
6. Get all orders containing a specific product_id
```

Write these down. All of them. Then design the data model to support every pattern with a primary key query or a GSI query. If a pattern cannot be supported this way, it either needs to be rethought or moved to a different data store.

---

## Primary Key Design

A DynamoDB primary key is either a simple key (partition key only) or a composite key (partition key + sort key). Most production tables use composite keys.

**Partition key (PK)** — Determines which physical partition holds the item. DynamoDB distributes items across partitions based on the partition key hash. All queries must specify the partition key.

**Sort key (SK)** — Enables range queries within a partition. Items with the same partition key and different sort keys are stored together, sorted lexicographically. Sort key queries support begins_with, between, and comparison operators.

The most important property of a good partition key: high cardinality with even distribution. A partition key with low cardinality (e.g., a status field with three possible values) concentrates traffic on a small number of partitions, creating hot partitions that hit throughput limits. A partition key with high cardinality (e.g., user_id or order_id) distributes traffic evenly.
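
The effect of cardinality on distribution is easy to see in a small simulation. The sketch below hashes keys into a fixed number of buckets; MD5 and the 16-partition count are stand-ins chosen for illustration, since DynamoDB's internal hash function and partition layout are not something you control or observe directly.

```python
import hashlib
from collections import Counter
from typing import Iterable


def partition_for(key: str, partitions: int) -> int:
    """Map a key to a bucket by hash (a stand-in for DynamoDB's internal hashing)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def partition_load(keys: Iterable[str], partitions: int) -> Counter:
    """Count how many items land in each bucket."""
    return Counter(partition_for(key, partitions) for key in keys)


STATUSES = ["PENDING", "SHIPPED", "DELIVERED"]

# 9,000 items keyed by a three-value status field versus by a unique user ID.
status_keys = [STATUSES[i % 3] for i in range(9000)]
user_keys = [f"user_{i}" for i in range(9000)]

by_status = partition_load(status_keys, 16)
by_user = partition_load(user_keys, 16)
```

With the status key, all 9,000 items land in at most three buckets, so at least one bucket holds 3,000 or more. With the user ID key, the same items spread across all 16 buckets.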

---

## Single-Table Design

Single-table design is the dominant pattern in production DynamoDB systems. The idea: store all entity types in a single table, using the primary key structure to differentiate entities and support multiple access patterns.

This is counterintuitive to engineers with a relational background, where entities live in their own tables. The reason single-table works in DynamoDB is that DynamoDB queries are partition-scoped — items with the same partition key are stored and retrieved together efficiently. Storing related data under a single partition key, using the sort key to differentiate it, enables fetching an entity and its related data in a single query.

### A Concrete Example

Consider an order management system with Users, Orders, and Order Items.

```
Table: OrdersTable

# User entity
PK: USER#user_123     SK: #METADATA#user_123
Attributes: name, email, created_at

# Order entity (under the user)
PK: USER#user_123     SK: ORDER#2026-02-17#order_456
Attributes: total, status, shipping_address

# Order item entity (under the order)
PK: ORDER#order_456   SK: ITEM#item_789
Attributes: product_id, quantity, unit_price

# Order entity (accessible by order_id directly)
PK: ORDER#order_456   SK: #METADATA#order_456
Attributes: user_id, total, status, created_at
```

This structure supports:

- **Get user**: `PK = USER#user_123, SK = #METADATA#user_123`
- **Get all orders for user**: `PK = USER#user_123, SK begins_with ORDER#`
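- **Get orders for user in date range**: `PK = USER#user_123, SK between ORDER#2026-01-01 and ORDER#2026-03-01` (the upper bound is the day after the range ends, because a key such as `ORDER#2026-02-28#order_456` sorts after the bare prefix `ORDER#2026-02-28`)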
- **Get order by ID**: `PK = ORDER#order_456, SK = #METADATA#order_456`
- **Get all items in order**: `PK = ORDER#order_456, SK begins_with ITEM#`

This is the core value of single-table design: multiple access patterns served by a single table with no joins.
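The access patterns above can be checked without a table. The sketch below models one partition as a sorted list of sort keys and implements `begins_with` and `between` on top of it. It is an in-memory illustration, not DynamoDB's API, and the two extra orders (`order_111`, `order_789`) are made up for the example.

```python
from bisect import bisect_left, bisect_right

# Items reduced to their keys: (PK, SK)
ITEMS = [
    ("USER#user_123", "#METADATA#user_123"),
    ("USER#user_123", "ORDER#2026-01-09#order_111"),
    ("USER#user_123", "ORDER#2026-02-17#order_456"),
    ("USER#user_123", "ORDER#2026-02-28#order_789"),
    ("ORDER#order_456", "#METADATA#order_456"),
    ("ORDER#order_456", "ITEM#item_789"),
]

def sort_keys(pk):
    """All sort keys in one partition, in the order a Query would return them."""
    return sorted(sk for p, sk in ITEMS if p == pk)

def begins_with(pk, prefix):
    return [sk for sk in sort_keys(pk) if sk.startswith(prefix)]

def between(pk, low, high):
    """Inclusive range on the sort key, like `SK BETWEEN :low AND :high`."""
    keys = sort_keys(pk)
    return keys[bisect_left(keys, low):bisect_right(keys, high)]

all_orders = begins_with("USER#user_123", "ORDER#")
# A bare date as the upper bound sorts before every key written on that date
feb_missing_last_day = between("USER#user_123", "ORDER#2026-02-01", "ORDER#2026-02-28")
# Using the following day as the upper bound picks up the 28th as well
feb_complete = between("USER#user_123", "ORDER#2026-02-01", "ORDER#2026-03-01")
```

Python compares strings by code point, which matches DynamoDB's byte-wise ordering for ASCII keys like these. The range example shows why the upper bound of a date range needs care when the sort key carries a suffix after the date.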

### Item Collections

Items that share a partition key form an item collection. In a single-table design, a user's item collection might contain their profile, their orders, and their addresses — all stored under `PK = USER#user_id`.

Item collection size is limited to 10GB if a local secondary index exists on the table. For most applications, this limit is not reached, but if individual item collections can grow large (e.g., a user with millions of orders), design with this limit in mind.

---

## Global Secondary Indexes

A Global Secondary Index (GSI) is a separate index with its own partition key and sort key, built from a subset of the table's attributes. GSIs enable access patterns that the base table's primary key does not support.

GSI reads are always eventually consistent: strongly consistent reads are available on the base table, but not on GSIs. GSIs also add storage cost and write throughput cost, because every write to the base table that affects a GSI attribute triggers a write to the GSI.

### GSI Overloading

A single GSI can support multiple access patterns if you use the same overloading pattern as the base table. This is GSI overloading.

Example: You need to support two additional access patterns:
1. Get all PENDING orders (across all users) — admin view
2. Look up a user by email address

Rather than creating two GSIs, create one with `GSI_PK` and `GSI_SK` attributes:

```
# For the order entity, populate GSI attributes for status-based lookup
GSI_PK: STATUS#PENDING     GSI_SK: ORDER#2026-02-17#order_456

# For the user entity, populate GSI attributes for email lookup
GSI_PK: EMAIL#user@example.com    GSI_SK: #METADATA#user_123
```

Now a single GSI supports both patterns. The overloaded GSI pattern keeps the number of indexes minimal while supporting a wide range of access patterns.
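In application code, overloading usually comes down to one function that decides which GSI attributes each entity type gets. A minimal sketch follows, assuming a hypothetical `entity_type` discriminator attribute and hypothetical field names (`created_date`, `order_id`, `user_id`). Lowercasing the email is also an assumption, added so lookups are case-insensitive.

```python
def gsi_keys(item: dict) -> dict:
    """Return the overloaded GSI_PK / GSI_SK attributes for an item, by entity type."""
    kind = item["entity_type"]  # hypothetical discriminator attribute
    if kind == "ORDER":
        return {
            "GSI_PK": f"STATUS#{item['status']}",
            "GSI_SK": f"ORDER#{item['created_date']}#{item['order_id']}",
        }
    if kind == "USER":
        return {
            "GSI_PK": f"EMAIL#{item['email'].lower()}",
            "GSI_SK": f"#METADATA#{item['user_id']}",
        }
    return {}  # other entity types carry no GSI attributes and stay out of the index

order = {"entity_type": "ORDER", "status": "PENDING",
         "created_date": "2026-02-17", "order_id": "order_456"}
user = {"entity_type": "USER", "email": "User@Example.com", "user_id": "user_123"}
order_keys = gsi_keys(order)
user_keys = gsi_keys(user)
```

Merging the returned attributes into the item before each write keeps the index definition in one place instead of scattering key formats across the write path.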

---

## Sparse Indexes

A sparse index is a GSI that only indexes a subset of items — specifically, only items that have the GSI's partition key attribute defined.

If the `GSI_PK` attribute is only populated on items with a specific status — say, PENDING orders — then the GSI only contains those items. Queries against the sparse index automatically filter to that subset without needing a filter expression.

```
# Only PENDING orders have this attribute set
PENDING_ORDER_GSI_PK: "PENDING"   # Only set on orders with status=PENDING

# COMPLETED orders do not have this attribute at all
# → They are not in the GSI
```

Querying the GSI for `PK = PENDING` returns only pending orders, efficiently, without scanning completed orders. When an order is fulfilled and its status changes to COMPLETED, the attribute is removed, and the item is automatically removed from the GSI.

Sparse indexes are useful for any pattern that involves "get all X where Y is true" where Y is a state that applies to a minority of items.
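Keeping a sparse index correct means the index attribute has to change in the same write as the status. The sketch below builds update parameters that do both at once. The parameter names mirror the keyword arguments of boto3's `update_item`, but nothing here calls AWS, and `status_update` is an illustrative helper. The caller would still supply `Key`.

```python
def status_update(new_status: str) -> dict:
    """Build update parameters that keep the sparse-index attribute in step with status."""
    params = {
        "ExpressionAttributeNames": {"#s": "status"},  # "status" is a DynamoDB reserved word
        "ExpressionAttributeValues": {":s": new_status},
    }
    if new_status == "PENDING":
        # Setting the attribute puts the item into the sparse index
        params["UpdateExpression"] = "SET #s = :s, PENDING_ORDER_GSI_PK = :p"
        params["ExpressionAttributeValues"][":p"] = "PENDING"
    else:
        # Removing the attribute drops the item out of the sparse index
        params["UpdateExpression"] = "SET #s = :s REMOVE PENDING_ORDER_GSI_PK"
    return params

to_completed = status_update("COMPLETED")
to_pending = status_update("PENDING")
```

Because the status change and the attribute removal are a single update, the item cannot end up marked COMPLETED while still sitting in the pending index.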

---

## Relationship Patterns

### 1:1 Relationships

Store as separate items sharing a partition key, or as attributes on a single item if the data is always accessed together and the total item size remains under 400KB.

### 1:N Relationships

Use the parent entity's ID as the partition key and the child entity's ID (or a sortable attribute) as the sort key. This supports fetching all children of a parent in a single query.

```
PK: ACCOUNT#account_123
SK: TRANSACTION#2026-02-17T14:23:11Z#txn_456
```

For large collections, where the parent may have millions of children, consider whether you actually need to fetch all children or only recent children. The sort key's range query capability is particularly useful here — `SK begins_with TRANSACTION#2026-02` fetches only February transactions, for example.

### M:N Relationships

Many-to-many relationships require explicit join items. If users can belong to many teams, and teams have many users:

```
# User → Teams lookup
PK: USER#user_123     SK: TEAM#team_456
Attributes: role, joined_at

# Team → Users lookup (a second item with PK and SK swapped; a GSI keyed on SK then PK is the alternative)
PK: TEAM#team_456     SK: USER#user_123
Attributes: role, joined_at
```

This pattern duplicates the relationship data, which is the DynamoDB approach to supporting queries in both directions without joins.

---

## Transactional Writes

DynamoDB supports ACID transactions across up to 100 items in a single TransactWriteItems call. This enables:

- Creating multiple related items atomically (e.g., creating an order and decrementing inventory in a single transaction)
- Conditional writes that fail if a precondition is not met (e.g., only create an item if an item with that key does not already exist)
- Consistent multi-item updates that should not be partially applied

Transactions in DynamoDB cost twice the write capacity of non-transactional writes (the overhead of the coordination mechanism). For operations that genuinely require atomicity, this is the right tool. For operations that do not, pay the lower cost of standard writes.

### Optimistic Locking with Version Numbers

For concurrent update scenarios, DynamoDB's conditional write expressions enable optimistic locking without a separate locking mechanism:

```
# Write condition: only update if version_number matches expected value
ConditionExpression: "version_number = :expected_version"
UpdateExpression: "SET version_number = :new_version, ..."
```

If two processes try to update the same item concurrently, the second write fails the condition check. The second process then retries with the current state. This is the standard pattern for preventing lost updates in DynamoDB.
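The read, check, write, retry loop is easier to see in a small in-memory model. This is a sketch of the pattern only: `VersionedStore` and `ConditionFailed` are stand-ins invented for the example. Against a real table, the conditional write would be an `update_item` call, and the conflict surfaces as a `ConditionalCheckFailedException` error.

```python
class ConditionFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""

class VersionedStore:
    """In-memory model of a table that enforces `version_number = :expected_version`."""
    def __init__(self):
        self._items = {}

    def put_new(self, key, attrs):
        self._items[key] = {**attrs, "version_number": 1}

    def get(self, key):
        return dict(self._items[key])  # copy, so callers hold a snapshot

    def conditional_update(self, key, changes, expected_version):
        current = self._items[key]
        if current["version_number"] != expected_version:
            raise ConditionFailed(key)
        self._items[key] = {**current, **changes, "version_number": expected_version + 1}

def update_with_retry(store, key, mutate, max_attempts=3):
    """Read, apply `mutate` to the snapshot, write conditionally; re-read and retry on conflict."""
    for _ in range(max_attempts):
        snapshot = store.get(key)
        try:
            store.conditional_update(key, mutate(snapshot), snapshot["version_number"])
            return store.get(key)
        except ConditionFailed:
            continue
    raise RuntimeError(f"gave up after {max_attempts} conflicting writes")

store = VersionedStore()
store.put_new("ORDER#order_456", {"total": 100})
stale = store.get("ORDER#order_456")  # a second writer reads version 1
update_with_retry(store, "ORDER#order_456", lambda o: {"total": o["total"] + 10})
try:
    store.conditional_update("ORDER#order_456", {"total": 0}, stale["version_number"])
    rejected = False
except ConditionFailed:
    rejected = True  # the stale write cannot clobber the newer update
```

The important detail is that the retry re-reads the item before recomputing the change. Retrying with the old snapshot would fail the condition again.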

---

## DynamoDB Streams for Event-Driven Patterns

DynamoDB Streams captures a time-ordered sequence of item-level changes (inserts, updates, deletes) and makes them available for downstream processing. This enables event-driven architectures without polling.

Common patterns built on DynamoDB Streams:

**Derived data maintenance.** When an order item is updated, a stream processor recalculates the order total and updates the parent order item. The parent converges on its children's state shortly after each change, without requiring the write path to do both updates.

**Cross-region replication.** Stream processors read changes from a primary region and replicate them to secondary regions. (AWS Global Tables is a managed version of this pattern.)

**Audit logging.** Every item change is captured in the stream and written to an audit log store. This is a clean separation between the application write path and the audit trail — the application writes to DynamoDB, the stream processor writes to the audit log, and neither path knows about the other.

**Search index synchronization.** Item changes in DynamoDB trigger an OpenSearch index update via a stream processor. The operational database and the search index stay in sync without coupling the write path to the search index write.

Stream records are available for 24 hours. Consumers must process them within that window or miss them. For audit and compliance use cases, ensure your stream consumer has adequate error handling and retry logic.
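As an illustration of the audit logging pattern, the sketch below turns a batch of stream records into audit entries. It reads the record fields a Lambda consumer receives (`eventID`, `eventName`, `dynamodb.Keys`, `dynamodb.SequenceNumber`). The in-memory `seen_event_ids` set is an assumption for the example: a real consumer would need a durable store for deduplication, and writing the entries somewhere is left out.

```python
def audit_entries(event: dict, seen_event_ids: set) -> list:
    """Turn a batch of stream records into audit entries, skipping redelivered records."""
    entries = []
    for record in event.get("Records", []):
        if record["eventID"] in seen_event_ids:
            continue  # batches can be retried, so the same record may arrive twice
        seen_event_ids.add(record["eventID"])
        change = record["dynamodb"]
        entries.append({
            "event_id": record["eventID"],
            "action": record["eventName"],  # INSERT, MODIFY, or REMOVE
            # Keys arrive in DynamoDB JSON, e.g. {"PK": {"S": "USER#user_123"}}
            "keys": {name: next(iter(typed.values())) for name, typed in change["Keys"].items()},
            "sequence_number": change["SequenceNumber"],
        })
    return entries

sample_event = {"Records": [{
    "eventID": "1",
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"PK": {"S": "ORDER#order_456"}, "SK": {"S": "ITEM#item_789"}},
        "SequenceNumber": "111",
    },
}]}
seen = set()
first = audit_entries(sample_event, seen)
second = audit_entries(sample_event, seen)  # simulated redelivery of the same batch
```

Making the consumer idempotent matters because a failed batch is retried as a whole, including records that were already handled.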

---

## TTL for Automatic Expiration

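DynamoDB's Time to Live (TTL) feature automatically deletes items after the time stored in a designated timestamp attribute (a Number holding Unix epoch seconds) has passed. TTL deletions are background operations: they do not consume write capacity, and they occur within approximately 48 hours of the expiry time rather than exactly at it. Until the deletion happens, expired items can still appear in reads, so filter on the TTL attribute where that matters.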

TTL is appropriate for:

- **Session data** — Session records that should expire after a fixed inactivity period
- **Temporary state** — Pending verifications, one-time tokens, in-progress operations with timeouts
- **Caching** — Items used as a DynamoDB cache layer, where stale data should be removed automatically

TTL deletions appear in DynamoDB Streams, which means TTL can be used to trigger downstream cleanup operations — deleting related items in other tables or updating derived data when the primary item expires.

For compliance use cases where data must be retained for a defined period and then deleted, TTL combined with a stream processor provides a clean deletion mechanism with a downstream audit log of the deletion event.
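A short sketch of the three pieces of code TTL usually requires: computing the attribute value, ignoring items that have expired but not yet been deleted, and recognizing TTL deletions in the stream. The attribute name `expires_at` is a hypothetical choice. The `userIdentity` check reflects how DynamoDB marks stream records for deletions performed by the TTL process.

```python
import time
from typing import Optional

def expires_at(ttl_seconds: int, now: Optional[float] = None) -> int:
    """TTL attribute value: Unix epoch time in whole seconds."""
    return int(time.time() if now is None else now) + ttl_seconds

def is_live(item: dict, now: Optional[float] = None) -> bool:
    """Treat items past their TTL as gone, even if DynamoDB has not deleted them yet."""
    current = time.time() if now is None else now
    return item.get("expires_at", float("inf")) > current

def is_ttl_delete(record: dict) -> bool:
    """True when a stream REMOVE record came from the TTL process rather than a caller."""
    identity = record.get("userIdentity") or {}
    return (record.get("eventName") == "REMOVE"
            and identity.get("type") == "Service"
            and identity.get("principalId") == "dynamodb.amazonaws.com")

# A 30-minute session created at a fixed, made-up clock value
session = {"PK": "SESSION#abc", "expires_at": expires_at(1800, now=1_700_000_000)}
```

Passing `now` explicitly keeps the helpers testable. A retention-driven deletion log would call `is_ttl_delete` in the stream consumer to separate expirations from user-initiated deletes.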

---

## Capacity Planning: On-Demand vs. Provisioned

DynamoDB offers two billing modes:

**On-demand** — You pay per request. DynamoDB scales with traffic automatically, with no capacity configuration, though per-partition and account-level throughput limits still apply. No capacity planning required. Higher per-request cost than provisioned at sustained load.

**Provisioned** — You specify the read and write capacity units (RCUs and WCUs) the table should maintain. Lower per-unit cost than on-demand at predictable load. Requires capacity planning and auto-scaling configuration to handle traffic spikes.

For most applications, on-demand mode is the right default:

- No risk of throttling from under-provisioning
- No wasted spend from over-provisioning
- Zero capacity planning required
- Appropriate for variable or unpredictable traffic

Switch to provisioned mode when: you have a well-characterized, stable traffic pattern and the per-request cost difference justifies the operational overhead of capacity management. For high-throughput applications with millions of requests per day, the cost difference is significant.
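The break-even point between the two modes is simple arithmetic once you have prices. The sketch below computes the average utilization above which a provisioned capacity unit costs less than paying per request. The prices in the example are placeholders, not real AWS pricing, and the calculation assumes requests small enough to cost one unit each.

```python
def break_even_utilization(on_demand_price_per_million: float,
                           provisioned_price_per_unit_hour: float) -> float:
    """Average utilization above which a provisioned capacity unit is cheaper than on-demand."""
    # One provisioned unit running flat out serves 3,600 standard requests per hour
    on_demand_cost_at_full_use = 3600 * on_demand_price_per_million / 1_000_000
    return provisioned_price_per_unit_hour / on_demand_cost_at_full_use

# Placeholder prices for illustration only; look up current pricing for your region
utilization = break_even_utilization(on_demand_price_per_million=1.00,
                                     provisioned_price_per_unit_hour=0.0006)
print(f"provisioned wins above {utilization:.0%} average utilization")
```

If your measured average utilization of provisioned capacity would sit well above the break-even figure, provisioned mode is worth the operational overhead. If traffic is spiky and average utilization is low, on-demand stays cheaper.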

---

## Cost Optimization Patterns

**Keep items small.** DynamoDB bills on the size of the items read and written, so large items cost more per operation. Storing large blobs in S3 and keeping only the S3 reference in DynamoDB reduces item size and read/write cost.

**Use batch operations.** BatchGetItem and BatchWriteItem cut network round trips for bulk operations, although each item still consumes the same capacity as an individual call. TransactWriteItems is more expensive than batch writes for non-transactional operations, so use it only when atomicity is actually required.

**Prefer eventual consistency where possible.** Eventually consistent reads cost half the RCUs of strongly consistent reads. For read patterns that do not require seeing the most recent write (e.g., displaying a list that updates periodically), eventual consistency is appropriate and less expensive.

**Archive infrequently accessed data.** Items that are rarely accessed but must be retained (e.g., historical order records, archived documents) can be moved to S3 and accessed via Athena, reducing DynamoDB storage costs.
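The cost effects described above follow from how capacity units are counted: reads are metered in 4 KB blocks and writes in 1 KB blocks, with multipliers for consistency and transactions. A small calculator makes the trade-offs visible. The function names are illustrative.

```python
import math

READ_MULTIPLIER = {"eventual": 0.5, "strong": 1, "transactional": 2}

def read_units(item_bytes: int, consistency: str = "eventual") -> float:
    """Capacity units for one read: 4 KB blocks, scaled by the consistency mode."""
    blocks = max(1, math.ceil(item_bytes / 4096))
    return blocks * READ_MULTIPLIER[consistency]

def write_units(item_bytes: int, transactional: bool = False) -> int:
    """Capacity units for one write: 1 KB blocks, doubled inside a transaction."""
    blocks = max(1, math.ceil(item_bytes / 1024))
    return blocks * (2 if transactional else 1)

# A 9,000-byte item vs. a 300-byte item that points at the same payload in S3
large_read = read_units(9000, "strong")
small_read = read_units(300, "strong")
large_write = write_units(9000)
small_write = write_units(300)
```

Moving the payload to S3 drops the strongly consistent read from three units to one and the write from nine units to one, which is the saving the first pattern describes.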

---

## When NOT to Use DynamoDB

DynamoDB is the wrong choice when:

**Your access patterns are not known upfront.** If you need ad-hoc queries across arbitrary fields — analytical queries, exploratory data access, complex filtering — DynamoDB will frustrate you. Use a relational database (Aurora PostgreSQL) or a purpose-built analytics store (Redshift, Athena over S3).

**You need complex transactions across many items.** DynamoDB's 100-item transaction limit and lack of multi-table join capability make it unsuitable for systems with complex relational constraints — financial ledgers with multi-table consistency requirements, inventory systems with cascading updates across many entities.

**Your data model is highly relational and frequently changing.** Single-table DynamoDB models for complex domains with many entity types and many access patterns are difficult to design correctly and difficult to evolve. If the access patterns are genuinely unpredictable, the flexibility of a relational database is worth the scaling trade-off.

**You need full-text search or fuzzy matching.** DynamoDB supports exact key lookups and range queries. Full-text search, stemming, and fuzzy matching require OpenSearch or a purpose-built search service.

DynamoDB is excellent for: user and session data, time-series event data, leaderboards and rankings, operational data stores with well-defined access patterns, and high-throughput write workloads. It is not a general-purpose replacement for relational databases.

---

## Building the Right Model Before You Build the System

The upfront investment in DynamoDB access pattern analysis pays back multifold. An hour spent writing out every access pattern before touching the data model prevents weeks of rework when the application is in production and a new access pattern requires restructuring the table.

If you are building an application on AWS and working through the data architecture — whether that is DynamoDB, Aurora, or a hybrid — [an architecture review](/contact) covers this territory. Our [cloud architecture practice](/services/cloud-architecture) and [AI development work](/services/ai-development) both involve DynamoDB design as a regular part of system design engagements.

The goal is always the same: model the data correctly the first time, so the system does not need to be rebuilt when it scales.

---

## FHIR vs HL7: A Practical Comparison for Healthcare Software Teams
Source: https://tampadynamics.com/blog/fhir-vs-hl7

> A technical comparison of FHIR and HL7 v2 for engineering teams building healthcare integrations. Covers data models, interoperability use cases, EHR compatibility, and implementation considerations.

Date: 2026-02-04

If you are building software that integrates with hospitals, clinics, health systems, or any EHR platform, you will encounter both HL7 v2 and FHIR. Understanding the difference between them is not just an academic exercise — choosing the wrong integration approach for a given environment can cost months of development time and produce an integration that no one can maintain.

This guide is written for engineering teams who need to make concrete decisions about healthcare integrations, not for readers who want a survey of standards bodies and working group history.

---

## HL7 v2: What It Is and Why It Still Dominates

HL7 v2 is a messaging standard that has been in production use since the late 1980s. It is pipe-delimited, segment-based, and looks like this:

```
MSH|^~\&|SendingApp|SendingFacility|ReceivingApp|ReceivingFacility|20260217142311||ADT^A01|MSG000001|P|2.5
EVN|A01|20260217142311
PID|1||12345^^^Hospital^MR||Smith^John^A||19800315|M|||123 Main St^^Tampa^FL^33601^US|||||||12345-6789
PV1|1|I|2NORTH^201^01^Hospital||||1234^Attending^Doctor|||||||||||V|2026001234
```

Each line is a segment, identified by its three-letter prefix (MSH, EVN, PID, PV1). Within a segment, fields are separated by pipes, and components within a field are separated by carets. The message type in MSH-9 (ADT^A01, in this case an admit notification) tells the receiving system what to do with the message. HL7 v2 has message types for every significant clinical event: admissions (ADT), laboratory results (ORU), orders (ORM), pharmacy (RDE), scheduling (SIU), and more.
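To show how the delimiters nest, here is a deliberately small parser for the sample message. It is a sketch for reading the structure, not a substitute for a real HL7 library: it ignores repetitions, escape sequences, and repeated segments, and `parse_segments` and `field` are illustrative names.

```python
MESSAGE = "\r".join([
    r"MSH|^~\&|SendingApp|SendingFacility|ReceivingApp|ReceivingFacility|20260217142311||ADT^A01|MSG000001|P|2.5",
    "EVN|A01|20260217142311",
    "PID|1||12345^^^Hospital^MR||Smith^John^A||19800315|M|||123 Main St^^Tampa^FL^33601^US|||||||12345-6789",
])

def parse_segments(message: str) -> dict:
    """Split a message into segments (one per line) and each segment into pipe-delimited fields."""
    segments = {}
    # Segments are terminated by carriage returns; tolerate newlines from copy-paste
    for line in filter(None, message.replace("\n", "\r").split("\r")):
        fields = line.split("|")
        if fields[0] == "MSH":
            # MSH-1 is the field separator itself, so shift the list to keep MSH-n at index n
            fields.insert(1, "|")
        segments.setdefault(fields[0], fields)  # keeps the first occurrence of each segment
    return segments

def field(segments: dict, segment_id: str, index: int) -> str:
    """Return field `index` of a segment (1-based, as in HL7 notation), or '' if absent."""
    fields = segments[segment_id]
    return fields[index] if index < len(fields) else ""

segments = parse_segments(MESSAGE)
message_type, trigger = field(segments, "MSH", 9).split("^")
family, given, middle = field(segments, "PID", 5).split("^")
birth_date = field(segments, "PID", 7)
```

The MSH adjustment is the detail that most often trips up hand-written parsers: because the first field of MSH is the separator character, a naive split leaves every MSH index off by one.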

HL7 v2 is not a modern standard. It predates JSON, REST, and modern API design patterns by decades. But it is the integration backbone of essentially every hospital built before 2015, and many built after. Epic, Cerner (Oracle Health), MEDITECH, and McKesson — the EHRs running most US hospitals — all speak HL7 v2 natively through their integration engines.

The reason HL7 v2 persists is not technical merit. It persists because hospitals have integration engines (Mirth Connect, Rhapsody, Infor Cloverleaf, Iguana) that process millions of HL7 v2 messages per day across connections that have been stable for years. Replacing that infrastructure is expensive and risky, and there is no compelling reason to do it when the existing system works.

### What "Works" Actually Means

HL7 v2 interoperability is often described as "minimal interoperability" in the standards community, and this is fair. The standard permits significant field-level variation. The same concept may be encoded differently across hospitals, or even across departments in the same hospital. Z-segments — custom extensions — are common and non-portable.

What this means in practice: building an HL7 v2 integration involves not just implementing the standard but negotiating with the specific hospital's implementation of the standard, which will have its own field population patterns, message volume characteristics, and historical quirks. You do not implement HL7 v2 once and connect to every hospital. You implement HL7 v2 and then tune for each site.

This site-by-site variation is the primary cost driver for HL7 v2 integrations. Budget for discovery and customization at each new site.

---

## FHIR: What It Actually Is

FHIR (Fast Healthcare Interoperability Resources, pronounced "fire") is a standard developed by HL7 International starting around 2011, with R4, the first release to include normative content, published in 2019. It is fundamentally different from HL7 v2 in both design and intent.

FHIR is REST-based and resource-oriented. Clinical data is modeled as typed resources (Patient, Observation, Encounter, Medication, DiagnosticReport, and more than a hundred others), exchanged as JSON or XML, and accessed through a RESTful API:

```
GET /Patient/12345
GET /Observation?patient=12345&code=http://loinc.org|2339-0
POST /DocumentReference
```

A FHIR Patient resource looks like this (abbreviated):

```json
{
  "resourceType": "Patient",
  "id": "12345",
  "identifier": [
    {
      "system": "http://hospital.org/mrn",
      "value": "MRN-789456"
    }
  ],
  "name": [
    {
      "family": "Smith",
      "given": ["John", "A"]
    }
  ],
  "birthDate": "1980-03-15",
  "gender": "male",
  "address": [
    {
      "line": ["123 Main St"],
      "city": "Tampa",
      "state": "FL",
      "postalCode": "33601"
    }
  ]
}
```

FHIR resources reference each other by ID, support standard CRUD operations, and are designed to be queried using a standardized search API. The model is far more developer-friendly than HL7 v2 if you are building a new integration and your target system supports it.
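Working with FHIR from application code mostly means building search URLs and reading JSON. The sketch below does both with the standard library. The base URL `https://fhir.example.org/r4` is a placeholder, and the helper names are illustrative. Note that the `system|code` token in a search needs URL encoding.

```python
from urllib.parse import urlencode

def search_url(base_url: str, resource_type: str, **params: str) -> str:
    """Build a FHIR search URL with properly encoded query parameters."""
    return f"{base_url.rstrip('/')}/{resource_type}?{urlencode(params)}"

def display_name(patient: dict) -> str:
    """Given names followed by family name, from the first name entry."""
    name = patient["name"][0]
    return " ".join(name.get("given", []) + [name["family"]])

def identifier_value(patient: dict, system: str):
    """Value of the identifier issued by `system`, or None if the patient has none."""
    for identifier in patient.get("identifier", []):
        if identifier.get("system") == system:
            return identifier.get("value")
    return None

url = search_url("https://fhir.example.org/r4", "Observation",
                 patient="12345", code="http://loinc.org|2339-0")

patient = {
    "resourceType": "Patient",
    "id": "12345",
    "identifier": [{"system": "http://hospital.org/mrn", "value": "MRN-789456"}],
    "name": [{"family": "Smith", "given": ["John", "A"]}],
}
mrn = identifier_value(patient, "http://hospital.org/mrn")
```

Looking identifiers up by `system` rather than by position matters in practice, because a Patient resource from a production EHR typically carries several identifiers.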

### SMART on FHIR

SMART on FHIR (Substitutable Medical Applications, Reusable Technologies) is a framework layered on top of FHIR that adds OAuth 2.0-based authorization and a standardized launch context. It enables applications to be launched from within EHRs with the appropriate patient and user context, and to request scoped permissions to read or write FHIR resources.

SMART on FHIR is the mechanism by which third-party applications integrate with modern EHR APIs. If you are building an application that will be accessed from within Epic, Cerner, or another SMART-enabled EHR, SMART on FHIR is not optional — it is the integration model.

---

## Key Differences: A Side-by-Side View

| Dimension | HL7 v2 | FHIR R4/R5 |
|---|---|---|
| Protocol | TCP/MLLP (or HTTPS) | HTTPS REST |
| Format | Pipe-delimited segments | JSON or XML |
| API style | Message-based (push) | Resource-based (request/response) |
| Query support | Limited (predefined message types) | Rich search API with many parameters |
| Versioning | v2.1 through v2.9 | DSTU2, STU3, R4, R4B, R5 |
| EHR compatibility | Universal (all legacy EHRs) | Modern EHRs (Epic, Cerner, etc.) via APIs |
| Standardization | Significant site variation | More consistent, still some variation |
| Developer experience | Requires specialized knowledge | REST-standard, more accessible |
| Real-time events | Push model via MLLP | Subscriptions (topic-based in R4B and R5), webhooks |

---

## FHIR R4 vs R5

R4 is the current normative standard and is what you will encounter in production EHR APIs. Epic, Cerner, and most payers that offer FHIR APIs support R4.

R5 was released in 2023 and introduces meaningful improvements: a more robust subscription model, improved versioning support, enhanced search capabilities, and refinements to several resource types. However, as of early 2026, R5 is not widely implemented in production EHR systems. Building a new integration against R5 will limit your compatible systems.

The practical guidance: target R4 for any integration you are building now. Monitor R5 adoption, particularly in the subscription and event notification use cases where R5 improvements are most significant.

---

## The 21st Century Cures Act and FHIR Mandates

The 21st Century Cures Act (2016) and the HHS Interoperability and Information Blocking rules (finalized 2020, with compliance dates phased in from 2021 through the end of 2022) created a regulatory mandate for FHIR-based data access.

The key requirements:

**Certified EHR Technology** — EHRs that seek ONC certification (required for the CMS Promoting Interoperability program, formerly Meaningful Use) must support FHIR R4 APIs for patient data access, specifically through SMART on FHIR-enabled apps.

**US Core profiles** — The regulations mandate support for the US Core Implementation Guide, which defines the minimum data elements and FHIR profiles that must be supported. US Core R4 profiles specify required fields and search parameters for Patient, Condition, AllergyIntolerance, Immunization, MedicationRequest, and other core resources.

**Information blocking prohibitions** — Health systems, EHR vendors, and health IT developers are prohibited from practices that unreasonably restrict the access, exchange, or use of electronic health information. This creates a legal environment where refusing to provide FHIR API access is increasingly difficult to justify.

The practical implication for software teams: large health systems now have a regulatory obligation to expose FHIR APIs, and Epic and Cerner have both implemented FHIR R4 APIs across their customer base. Accessing data from a modern EHR is often cleaner through FHIR than through legacy HL7 v2 interfaces, and the regulatory trend continues to push in that direction.

---

## When You Will Encounter Each Standard

**HL7 v2 environments** — Community hospitals and critical access hospitals running older EHR versions, radiology and laboratory information systems, older pharmacy systems, long-term care facilities, legacy data migration projects, and any integration with a hospital's existing interface engine. If the hospital is running any version of Epic before 2018, Cerner before approximately 2020, or MEDITECH for most use cases, expect HL7 v2 to be the primary interface mechanism.

**FHIR environments** — Epic MyChart APIs, Epic's third-party developer program (formerly App Orchard), Cerner HealtheIntent, payer data exchange (CMS interoperability rules require affected payers to support FHIR R4 APIs), government programs (CMS Blue Button 2.0 for Medicare data), patient-facing applications accessing data from modern EHR patient portals, and most new health tech integrations designed after 2021.

**Both simultaneously** — Large health systems with a mix of modern and legacy infrastructure, organizations aggregating data from multiple facilities, and applications that need both real-time event processing (where HL7 v2 ADT feeds are common) and on-demand data access (where FHIR APIs are preferred).

---

## Implementation Complexity

### HL7 v2 Integration Complexity

Building an HL7 v2 integration without an integration engine is painful. The MLLP transport protocol (a TCP-based framing protocol) is not HTTP, which means your existing HTTP client libraries do not work out of the box. Message parsing requires either a specialized library or custom parsing logic.

Libraries worth evaluating:
- **Java**: HAPI HL7 (the de facto standard, actively maintained)
- **Python**: `hl7apy`, `python-hl7`
- **Node.js**: `node-hl7-client`
- **.NET**: NHapi
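The MLLP framing mentioned above is small enough to show in full: each message is wrapped in a start byte (0x0B) and a two-byte trailer (0x1C 0x0D). The sketch below wraps outgoing messages and pulls complete frames out of a receive buffer. It assumes UTF-8 payloads, which is a simplification, since the actual character set is negotiated per interface. The function names are illustrative.

```python
START_BLOCK = b"\x0b"      # vertical tab
END_BLOCK = b"\x1c\x0d"    # file separator followed by carriage return

def mllp_wrap(message: str) -> bytes:
    """Frame one HL7 message for sending over an MLLP connection."""
    return START_BLOCK + message.encode("utf-8") + END_BLOCK

def mllp_unwrap(buffer: bytes):
    """Extract complete frames from a receive buffer; return (messages, unconsumed bytes)."""
    messages = []
    while True:
        start = buffer.find(START_BLOCK)
        end = buffer.find(END_BLOCK, start + 1)
        if start == -1 or end == -1:
            return messages, buffer  # nothing complete left; keep the remainder for later
        messages.append(buffer[start + 1:end].decode("utf-8"))
        buffer = buffer[end + len(END_BLOCK):]

# Two complete frames followed by the start of a third, as a TCP read might deliver them
received = mllp_wrap("MSH|first\rPID|1") + mllp_wrap("MSH|second") + b"\x0bMSH|par"
messages, remainder = mllp_unwrap(received)
```

Returning the unconsumed bytes matters because TCP does not preserve message boundaries. A single read can contain several frames or only part of one.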

The more significant challenge is integration engine interoperability. Most hospitals route HL7 v2 messages through an integration engine (Mirth Connect is open source and common; Rhapsody and Infor Cloverleaf are common in large health systems). You will often need to work with the hospital's IT team or their integration engine vendor to establish the connection, configure message filtering, and handle acknowledgment patterns.

Factor in time for the hospital IT approval process. Even straightforward HL7 v2 integrations routinely take three to six months from initial conversation to live data flow, because health system IT approval processes are cautious by design.

### FHIR Implementation Complexity

FHIR is technically more accessible than HL7 v2 — it is REST over HTTPS, using JSON, with a well-documented API. The technical implementation is straightforward compared to MLLP and HL7 v2 parsing.

The complexity in FHIR integrations comes from three places:

**Profile compliance** — FHIR resources have a base definition and then profiles (US Core, QI-Core, Da Vinci, etc.) that add constraints and required elements. A Patient resource from Epic may differ from a Patient resource from Cerner in which extensions are populated, which identifiers are present, and which search parameters are supported. The FHIR specification permits significant optional variation; actual interoperability requires working within the profiles that both systems support.

**OAuth 2.0 and SMART flows** — SMART on FHIR adds authorization complexity, particularly around the launch context flow (EHR-launched vs. standalone launch), scope management, and token refresh. For patient-facing applications, the authorization flow involves the patient authenticating with the health system's patient portal, which varies across EHRs.

**Epic and Cerner API-specific behaviors** — In practice, you are often integrating not with abstract FHIR but with Epic's FHIR API or Cerner's FHIR API. Each has quirks: rate limits, non-standard extensions, specific search parameter support, and their own developer registration and access approval processes. Epic's sandbox environment is accessible through their open.epic.com developer portal; Cerner's through the code.cerner.com portal.

---

## Security Considerations for Both

### HL7 v2 Security

HL7 v2 over MLLP has minimal native security. The historical assumption was that HL7 v2 traffic ran over private hospital networks, not the public internet. Modern deployments should use:

- **MLLP over TLS** (commonly called MLLPS or secure MLLP) for encrypted transport when messages traverse any network segment you do not fully control
- **VPN tunnels** for connections between facilities or between a vendor system and a hospital's internal network
- **IP allowlisting** at the firewall level to restrict which systems can send and receive HL7 v2 messages

Even over private networks, HL7 v2 messages contain PHI in plaintext. Network access controls and encryption in transit are necessary even in internal environments.

### FHIR Security

FHIR over HTTPS provides transport encryption. But several additional security considerations apply:

**Token scope management** — SMART on FHIR scopes define what the application can access. Request only the scopes you need. Broad scopes (`patient/*.*`) are appropriate in some development scenarios; they are not appropriate in production.

**Token storage** — Access tokens and refresh tokens must be stored securely. In a web application context, this means server-side storage or secure httpOnly cookies — not localStorage or sessionStorage, which are accessible to JavaScript and vulnerable to XSS.

**Audit logging** — Access to FHIR APIs that return PHI needs to be logged at the application level, not just at the transport level. Your application needs an audit trail of which patient data was accessed, by whom, and for what purpose — independent of what the EHR logs on its end.

**PKCE** — SMART App Launch 2.0 requires PKCE (Proof Key for Code Exchange) with the `S256` challenge method for apps using the authorization code flow, confidential clients included. If you are integrating with a server that still implements SMART 1.0, use PKCE anyway whenever the server supports it.
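Generating the PKCE values takes only a few lines. The sketch below follows the S256 method from RFC 7636: the challenge is the base64url-encoded SHA-256 hash of the verifier, without padding. The function names are illustrative, and the rest of the authorization flow (redirect, token exchange) is omitted.

```python
import base64
import hashlib
import secrets

def s256_challenge(verifier: str) -> str:
    """code_challenge = base64url(SHA-256(code_verifier)) with the '=' padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

def make_pkce_pair():
    """Return (code_verifier, code_challenge) for one authorization request."""
    verifier = secrets.token_urlsafe(64)  # 86 URL-safe characters; RFC 7636 allows 43 to 128
    return verifier, s256_challenge(verifier)

verifier, challenge = make_pkce_pair()
```

The app sends `code_challenge` and `code_challenge_method=S256` with the authorization request, keeps the verifier server-side or in memory, and sends `code_verifier` with the token request. Generate a fresh pair for every authorization attempt.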

---

## Practical Guidance: Which to Use When

**Use HL7 v2 when:**
- Your integration target is a community hospital or facility running a legacy EHR
- You need real-time ADT feed processing (admit/discharge/transfer events)
- The hospital IT team has an existing integration engine and prefers to route messages through it
- You are building a device integration (many medical devices and monitoring systems still output HL7 v2)
- The data you need is only available through the legacy interface

**Use FHIR when:**
- Your integration target is a modern EHR (Epic post-2018, Cerner, athenahealth, modern eClinicalWorks)
- You are building a patient-facing application that accesses data the patient has a right to access
- You need on-demand data access (query by patient, query by date range, search by condition code)
- You are building for payer data exchange under the CMS Interoperability Rule
- You are starting a new integration and have the option to choose

**Use both when:**
- You are aggregating data from a mixed environment (modern and legacy EHRs)
- You need real-time event notifications (HL7 v2 ADT) plus on-demand data access (FHIR queries)
- You are building a platform that needs to connect to any hospital, regardless of their technology generation

---

## Frequently Asked Questions

### Can we translate HL7 v2 messages into FHIR resources?

Yes, and there are published mappings for the most common conversions (ADT to FHIR Encounter/Patient, ORU to FHIR DiagnosticReport/Observation). The HL7 community maintains a ConceptMap and StructureMap library for common v2-to-FHIR translations. Integration engines like Mirth Connect have modules for FHIR transformation. The translations are not perfect — HL7 v2 and FHIR do not have 1:1 field correspondence for all data — but they are workable for most common use cases.
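As a flavor of what such a translation involves, here is a minimal sketch that maps a PID segment to a FHIR Patient. It is not the published mapping: it covers only the first identifier, name, birth date, sex, and address, the sex-to-gender table is a simplifying assumption, and `mrn_system` is a URI you would have to agree on with the sending site.

```python
GENDER = {"M": "male", "F": "female", "O": "other", "U": "unknown"}  # simplified mapping

def pid_to_patient(pid_segment: str, mrn_system: str) -> dict:
    """Map a PID segment to a minimal FHIR Patient resource."""
    fields = pid_segment.split("|")

    def components(index: int, count: int) -> list:
        """Caret-separated components of field `index`, padded or trimmed to `count`."""
        raw = fields[index] if index < len(fields) else ""
        parts = raw.split("^")
        return parts[:count] + [""] * (count - len(parts))

    (mrn,) = components(3, 1)
    family, given, middle = components(5, 3)
    (dob,) = components(7, 1)
    (sex,) = components(8, 1)
    street, _other, city, state, postal = components(11, 5)

    patient = {
        "resourceType": "Patient",
        "identifier": [{"system": mrn_system, "value": mrn}],
        "name": [{"family": family, "given": [g for g in (given, middle) if g]}],
        "gender": GENDER.get(sex, "unknown"),
    }
    if len(dob) >= 8:
        patient["birthDate"] = f"{dob[:4]}-{dob[4:6]}-{dob[6:8]}"  # YYYYMMDD -> YYYY-MM-DD
    if street or city:
        patient["address"] = [{"line": [street], "city": city,
                               "state": state, "postalCode": postal}]
    return patient

PID = "PID|1||12345^^^Hospital^MR||Smith^John^A||19800315|M|||123 Main St^^Tampa^FL^33601^US|||||||12345-6789"
patient = pid_to_patient(PID, "http://hospital.org/mrn")
```

Even this small mapping shows where the lack of 1:1 correspondence comes from: the v2 field says nothing about which identifier system it belongs to, so the FHIR `system` has to come from site-specific configuration.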

### Is FHIR replacing HL7 v2?

Gradually, and not uniformly. FHIR is the direction the industry is moving, accelerated by regulatory mandates. But HL7 v2 will remain in production healthcare systems for many years because replacing integration infrastructure at hospitals is slow and expensive. Any team building healthcare integrations should be comfortable with both.

### What is CDA and how does it relate to FHIR?

CDA (Clinical Document Architecture) is an HL7 standard for structured clinical documents — discharge summaries, referral letters, care plans. It uses XML and was the basis for Meaningful Use Stage 2 requirements (via C-CDA, Consolidated CDA). FHIR has largely superseded CDA for new implementations, but C-CDA documents are still widely used for transitions of care (hospital discharge to primary care, for example). Many FHIR integrations include the ability to generate or consume C-CDA documents for document-based workflows even when data APIs are FHIR-based.

---

If your team is building a healthcare integration and needs clarity on which approach fits your specific environment, or if you are designing a system that needs to handle both HL7 v2 and FHIR at scale, [start with an architecture conversation](/contact). We work in this space regularly and can give you a direct answer based on your actual target environment.

---

## Fintech Software Development: Compliance, Security, and Scale
Source: https://tampadynamics.com/blog/fintech-development-guide

> A technical guide to fintech software development — covering regulatory frameworks, security architecture, payment processing, and the engineering patterns that matter in financial services.

Date: 2026-01-21

Fintech development is not regular software development with a few extra security requirements. The regulatory environment, the sensitivity of financial data, the liability exposure of processing errors, and the fraud surface area create a distinct set of engineering constraints that shape architecture decisions from the beginning.

This guide is for engineering leaders, CTOs, and founders building financial services software — covering the regulatory landscape, core security architecture, and the engineering patterns that distinguish fintech systems from general-purpose software.

---

## The Regulatory Landscape

Fintech operates in a fragmented regulatory environment. The specific regulations that apply depend on the product type, the customers served, and the states and countries where the product operates. Getting this wrong is not just a compliance problem — it is a legal and operational risk.

### PCI DSS

The Payment Card Industry Data Security Standard applies to any system that stores, processes, or transmits cardholder data. PCI DSS is not a government regulation — it is a contractual requirement imposed by the card networks (Visa, Mastercard, etc.) through payment processor agreements.

The most important decision in PCI DSS compliance is scope reduction: minimize the systems that touch cardholder data. Using a payment processor like Stripe and never allowing raw card data to reach your servers reduces your PCI scope dramatically. Stripe handles card data; your system handles tokens. The compliance burden of a tokenized integration is far lower than processing card data directly.
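A tokenized integration can be backed by a defensive check that keeps stray card numbers out of logs and databases. The sketch below is illustrative only: `luhn_valid` and `redact_pans` are hypothetical helper names, not part of any processor's SDK, and a Luhn check will occasionally flag digit strings that are not card numbers.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or dashes.
_PAN_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_pans(text: str) -> str:
    """Replace anything that looks like a card number with a placeholder."""
    def _replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED_PAN]" if luhn_valid(digits) else match.group()
    return _PAN_CANDIDATE.sub(_replace, text)
```

A guard like this does not reduce PCI scope by itself. It is a safety net for the case where card data reaches a system that was designed never to see it.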

If your product requires storing cardholder data — recurring billing without a vault, legacy hardware integrations — PCI DSS compliance is a significant engineering and audit commitment. Level 1 compliance requires an annual assessment by a Qualified Security Assessor and quarterly network scans.

### SOC 2

SOC 2 Type II is the de facto standard for fintech companies selling to enterprise customers or other businesses. It is an audit of your security controls over a defined period (typically six to twelve months), measured against up to five Trust Services Criteria: security, availability, processing integrity, confidentiality, and privacy. Security is required in every SOC 2 report; the other four are included based on what your product does and what your customers ask for.

SOC 2 is not a specific technical standard — it is a framework that requires you to define your controls and demonstrate that you operate them consistently. Engineering teams are responsible for implementing the controls; compliance teams manage the audit process. Common technical controls that SOC 2 assesses:

- Logical access controls and multi-factor authentication
- Encryption at rest and in transit
- Audit logging and monitoring
- Change management and deployment controls
- Incident response procedures

SOC 2 readiness is not a last-minute project. The "Type II" designation means the audit covers a period of time, not a point-in-time assessment. You need to operate your controls consistently for the full audit period before an auditor can issue the report. (Strictly speaking, SOC 2 is an attestation report rather than a certification, although it is often described as one.) Starting SOC 2 readiness work six months before you need the report is late.

### GLBA

The Gramm-Leach-Bliley Act applies to financial institutions and requires safeguards for customer financial information. The FTC Safeguards Rule, amended in 2021 with most of the new requirements taking effect in June 2023, specifies more concrete technical requirements: encryption, access controls, multi-factor authentication, audit logging, and a written information security program.

If your product is a financial institution — or if you handle customer financial data on behalf of a financial institution — GLBA applies and its technical requirements need to be reflected in your architecture.

### State Licensing and Money Transmission

Payment products, lending products, and money transmission products require state-specific licenses in most US states. The licensing requirements vary significantly by state and by product type. This is a legal and compliance problem, not an engineering problem — but the engineering team needs to understand which states the product is licensed to operate in, because that determines where customers can be onboarded.
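In code, the licensing constraint usually surfaces as an onboarding gate. The following is a minimal sketch under stated assumptions: the `LICENSED_STATES` table and its contents are hypothetical, and in a real system the data would come from whatever source your compliance team maintains.

```python
# Hypothetical licence table: product type -> US states where onboarding is permitted.
LICENSED_STATES = {
    "money_transmission": {"FL", "TX", "NY"},
    "consumer_lending": {"FL", "TX"},
}


def can_onboard(product: str, state_code: str) -> bool:
    """Return True only if the product is licensed in the customer's state."""
    return state_code.upper() in LICENSED_STATES.get(product, set())
```

Defaulting to "not licensed" for unknown products or states keeps a missing table entry from silently opening onboarding.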

### Open Banking and CFPB 1033

The CFPB's Section 1033 rule, finalized in 2024, establishes consumer rights to access their financial data and requirements for financial institutions to provide data access to authorized third parties. If your product aggregates financial data or relies on consumer-permissioned data access, Section 1033 creates a more standardized regulatory framework for how that access must be provided and how the data must be handled.

---

## Payment Processing Architecture

The payment processing architecture decision is one of the most consequential in fintech development.

### Direct Integration with Stripe or Similar

For most fintech products, a direct integration with Stripe (or its equivalents — Adyen, Braintree, Square) is the right starting point. These platforms provide:

- Card tokenization — raw card data never touches your servers
- ACH and bank transfer support
- Subscription billing management
- Fraud detection and chargeback handling
- Regulatory compliance for the card processing layer

The engineering cost is low relative to alternatives. The trade-offs: per-transaction fees, dependency on a third-party platform, and limited control over the payments experience.

### Banking as a Service (BaaS)

Products that need to hold customer funds, issue cards, or operate as banking-like products (neobanks, earned wage access, embedded finance) use Banking as a Service providers such as Column, Treasury Prime, Unit, or Stripe Treasury. (Synapse, once a prominent provider in this category, filed for bankruptcy in 2024, so it is no longer an option.) BaaS platforms sit between your application and a licensed bank partner, providing the financial infrastructure under a banking license your company does not hold.

BaaS architecture adds complexity: you are now integrating with an intermediary that has its own API contracts, reliability characteristics, and regulatory requirements. The BaaS provider's compliance requirements flow through to your application — KYC/AML requirements, transaction monitoring, and reporting obligations are all part of the integration.

### Plaid and Open Banking Data

Products that aggregate financial accounts — personal finance management, underwriting, lending — use Plaid or similar aggregators to access consumer bank account data via OAuth-based connections. The architecture consideration: Plaid's data is delayed (not real-time), coverage varies by institution, and the data model requires normalization before use. Plan the data access, refresh, and normalization pipeline before designing features that depend on financial data freshness.

---

## Security Architecture for Financial Data

### Encryption and Key Management

Financial data — account numbers, routing numbers, transaction history, identity documents — is high-value for attackers and carries regulatory consequences if exposed. The encryption baseline:

- AES-256 for data at rest
- TLS 1.3 for data in transit
- Field-level encryption for the most sensitive fields (account numbers, SSNs) beyond full-database encryption

Key management is where financial services engineering teams most commonly fall short. Encrypting data with keys that are stored adjacent to the encrypted data — in the same database, the same secrets manager, the same environment — provides weaker protection than it appears. AWS KMS with customer-managed keys provides separation between key access and data access. Envelope encryption adds another layer.

Define key rotation policies, document who holds each key, and test the key rotation procedure before an incident requires it.
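A rotation policy is easier to enforce when it is checked by code rather than by memory. The sketch below assumes a hypothetical `KeyRecord` inventory that you maintain yourself; it does not call any KMS API, and the field names are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class KeyRecord:
    key_id: str
    owner: str                # team or person accountable for the key
    last_rotated: datetime    # timezone-aware timestamp of the last rotation
    max_age_days: int         # rotation policy for this key


def keys_due_for_rotation(keys, now=None):
    """Return the IDs of keys whose age exceeds their rotation policy."""
    now = now or datetime.now(timezone.utc)
    return [
        key.key_id
        for key in keys
        if now - key.last_rotated > timedelta(days=key.max_age_days)
    ]
```

Running a check like this on a schedule turns the documented policy into an alert, which is one way to find out about an overdue key before an incident does.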

### Secrets Management

Financial applications integrate with a large number of external services: payment processors, data aggregators, identity verification providers, banking partners. Each integration has API keys and credentials. These must live in a dedicated secrets manager (AWS Secrets Manager or an equivalent service), never in environment files, application code, or version control.

Rotation of third-party credentials needs to be automated or at minimum procedurally enforced. Credentials that cannot be rotated quickly are a liability when a team member leaves or a key is inadvertently exposed.

---

## Fraud Detection and Anomaly Detection

Fraud is an operational reality in financial services, not an edge case to handle later. The architecture decisions that support fraud detection:

**Event streaming for real-time analysis.** Fraud signals — unusual transaction velocities, geographic anomalies, device fingerprint changes — are time-sensitive. A batch analytics architecture that processes transactions hours later cannot support real-time fraud intervention. Event streaming (Kinesis, Kafka) enables real-time signal processing.

**Behavioral baseline modeling.** Fraud detection at the account level requires a behavioral baseline — what is normal for this user? Establishing baselines for transaction amounts, frequencies, geographic patterns, and session behavior enables anomaly scoring relative to the individual baseline, not just population-level thresholds.

**Risk scoring at decision points.** Integrate risk scores into the transaction processing flow at the point where intervention is still possible: before a transaction is authorized, before a withdrawal is initiated, at account creation for identity verification. Post-hoc fraud detection that occurs after funds have moved has far less operational value.

**Human review queues for medium-risk events.** Fully automated fraud decisions are appropriate at the two ends of the risk range: automatic approval for low-risk events and automatic blocking for high-confidence fraud signals. A human review queue for medium-risk events, where the cost of a false positive is high and automated decision confidence is lower, is a standard production pattern.
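The baseline and routing ideas above can be sketched together. This is a deliberately simple illustration, not a production fraud model: the z-score stands in for whatever scoring you actually use, and the `review_at` and `block_at` thresholds are hypothetical values that would need tuning against real data.

```python
from statistics import mean, stdev


def anomaly_score(history, amount):
    """Z-score of a new amount against the user's own transaction history."""
    if len(history) < 2:
        return 0.0  # not enough data for a baseline; rely on other signals
    spread = stdev(history)
    if spread == 0:
        return 0.0 if amount == history[0] else float("inf")
    return abs(amount - mean(history)) / spread


def route_transaction(score, review_at=3.0, block_at=6.0):
    """Map a score to a decision: approve, queue for human review, or block."""
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "review"
    return "approve"
```

The point of the two functions is the separation: scoring is relative to the individual user, while routing is a policy decision that can change without retraining or rewriting the scorer.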

---

## KYC/AML Requirements and Implementation

Know Your Customer (KYC) and Anti-Money Laundering (AML) requirements apply to financial institutions, money services businesses, and many fintech products. The regulatory obligation is real; the implementation is an engineering problem.

### Identity Verification

KYC requires verifying that users are who they claim to be. The standard implementation uses a managed identity verification provider — Persona, Alloy, Jumio, or Stripe Identity — to collect and verify government-issued identity documents and match them against the presented identity. Using a managed provider limits the sensitivity of the documents your system directly handles and offloads the compliance burden of keeping document verification capabilities current.

What your system needs to store: the verification result (pass/fail/review), the provider's reference ID, the timestamp, and the identity information required for your downstream compliance purposes. Raw identity documents should generally not be stored by your application — let the IDV provider hold them under their compliance controls.
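As a rough sketch of that storage shape, the record below keeps the verification outcome and the provider's reference, and nothing else. The class and field names are assumptions made for illustration; they do not correspond to any specific provider's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class VerificationStatus(Enum):
    PASS = "pass"
    FAIL = "fail"
    REVIEW = "review"


@dataclass(frozen=True)
class KycRecord:
    user_id: str
    provider: str                # name of the IDV vendor
    provider_reference_id: str   # vendor's ID for the check; documents stay with the vendor
    status: VerificationStatus
    verified_at: datetime


def may_transact(record: KycRecord) -> bool:
    """Only a passed verification unlocks money movement."""
    return record.status is VerificationStatus.PASS
```

Because the record holds a reference rather than the documents themselves, a compromise of this table exposes verification outcomes but not passport or licence images.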

### Transaction Monitoring

AML requires monitoring transactions for suspicious activity and filing Suspicious Activity Reports (SARs) when warranted. The technical implementation:

- Define transaction monitoring rules appropriate for your product and customer risk profile
- Log all transaction events with sufficient detail to reconstruct the full activity picture
- Implement automated flagging for rule-based triggers (cash structuring patterns, transactions with sanctioned countries, velocity thresholds)
- Build a case management workflow for investigating flagged transactions

OFAC screening — checking transactions and counterparties against the Office of Foreign Assets Control sanctions list — is a required step in payment processing. This is typically handled through your payment processor or a dedicated compliance API rather than building your own screening logic against the OFAC SDN list.
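To make the rule-based flagging concrete, here is a minimal sketch of one structuring rule: several cash deposits that each stay under the US $10,000 currency reporting threshold but add up to more than it within a short window. The window length and minimum count are hypothetical parameters, and real monitoring programs combine many such rules.

```python
from datetime import datetime, timedelta

CTR_THRESHOLD_CENTS = 1_000_000  # $10,000 expressed in cents


def flag_structuring(deposits, window=timedelta(days=2), min_count=3):
    """deposits: list of (timestamp, amount_cents) cash deposits for one account.

    Flags when at least `min_count` sub-threshold deposits inside one window
    add up to more than the reporting threshold.
    """
    below = sorted(d for d in deposits if d[1] < CTR_THRESHOLD_CENTS)
    for i, (start, _) in enumerate(below):
        in_window = [amt for ts, amt in below[i:] if ts - start <= window]
        if len(in_window) >= min_count and sum(in_window) > CTR_THRESHOLD_CENTS:
            return True
    return False
```

A flag from a rule like this should open a case in the investigation workflow rather than trigger an automatic filing, since the decision to file a SAR involves human judgment.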

---

## Audit Logging for Financial Transactions

Financial services audit logging has more prescriptive requirements than general software audit logging. Every financial transaction must be recorded with sufficient detail to reconstruct its complete history: who initiated it, what was the state at initiation, what approvals or verification steps occurred, what was the outcome, and any subsequent modifications or reversals.

Audit logs in financial systems must be:

- **Immutable** — Write-once, with no application-level ability to delete or modify records
- **Complete** — Every state transition of a financial record must be captured, not just the final state
- **Attributable** — Every action linked to an authenticated identity
- **Retained** — Financial records typically require multi-year retention; confirm requirements for your specific product and jurisdiction

The practical architecture is a dedicated audit log store (separate from the operational database) that the application can write to but cannot modify or delete from. Use a write-only IAM role for the audit log writer, a separate read-only role for audit review, and an entirely separate admin path for the rare cases where audit log correction is legally required.
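The append-only contract can be illustrated with a small in-memory stand-in. Note that the hash chaining shown here is an additional tamper-evidence technique, not something the architecture above requires, and `AppendOnlyAuditLog` is a hypothetical class; a real store would also record timestamps and enforce immutability at the storage and IAM layers.

```python
import hashlib
import json


class AppendOnlyAuditLog:
    """In-memory stand-in for a write-once audit store."""

    def __init__(self):
        self._entries = []

    def append(self, actor, transaction_id, from_state, to_state):
        """Record one state transition, linked to the previous entry by hash."""
        prev_hash = self._entries[-1]["hash"] if self._entries else "0" * 64
        body = {
            "actor": actor,
            "transaction_id": transaction_id,
            "from_state": from_state,
            "to_state": to_state,
            "prev_hash": prev_hash,
        }
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self._entries.append({**body, "hash": digest})

    def verify(self):
        """Recompute the chain; an edited entry, or one removed from the middle, breaks it."""
        prev_hash = "0" * 64
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

The class exposes no update or delete method, which mirrors the write-only role, and recording every transition rather than the final state is what satisfies the completeness requirement.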

---

## Multi-Currency and International Considerations

If your product operates internationally or handles multiple currencies, the complexity of the data model increases significantly.

**Currency representation.** Store monetary amounts as integers in the smallest currency unit (cents for USD, pence for GBP), not as floating-point numbers. Floating-point arithmetic with financial values produces precision errors that accumulate across transactions. The currency is stored as a separate field alongside the amount.

**Exchange rate handling.** If your product converts between currencies, the exchange rate used at the time of a transaction must be stored with the transaction record. Exchange rates change; knowing the rate that was applied is required for reconciliation and dispute resolution.

**Regulatory variation.** Different countries have different transaction reporting requirements, different KYC standards, and different data residency requirements. International expansion is not just a product question — it is a compliance and engineering question that requires country-specific architectural considerations.
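The currency representation and exchange rate points can be sketched with integers and `Decimal`. This is a minimal example under stated assumptions: the `Money` type, the `MINOR_UNITS` table, and the banker's rounding choice are illustrative, and your rounding rule should follow whatever your ledger and partners require.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_EVEN

# Minor-unit exponents for a few ISO 4217 currencies (JPY has no minor unit).
MINOR_UNITS = {"USD": 2, "GBP": 2, "EUR": 2, "JPY": 0}


@dataclass(frozen=True)
class Money:
    amount_minor: int   # integer count of the smallest currency unit
    currency: str       # ISO 4217 code


def convert(money: Money, to_currency: str, rate: Decimal):
    """Convert and return (result, rate) so the applied rate can be stored."""
    major = Decimal(money.amount_minor).scaleb(-MINOR_UNITS[money.currency])
    target = (major * rate).scaleb(MINOR_UNITS[to_currency])
    rounded = int(target.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))
    return Money(rounded, to_currency), rate
```

Returning the rate alongside the converted amount nudges callers to persist both in the transaction record. Note that not every currency has two decimal places, which is why the exponent is looked up rather than hard-coded.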

---

## Open Banking and API-First Architecture

Open banking — the practice of financial institutions exposing customer financial data through standardized APIs — is both a regulatory trend and a product opportunity.

For fintech products that depend on financial data access, the API-first design principle matters: design your product assuming that data comes from APIs, not from proprietary scraping or batch file transfers. This positions the product for the regulatory direction of travel and provides a cleaner, more maintainable data access architecture.

For fintech products that are financial institutions (or adjacent to them), exposing a well-designed API is increasingly a regulatory requirement and a commercial differentiator. Design the API surface with the same engineering rigor as your core product — versioning, consistent authentication, rate limiting, comprehensive documentation.

---

## Build vs. Partner Decisions

Fintech companies face a recurring decision for capabilities adjacent to the core product: build them in-house or integrate a partner.

**Build:** Core IP, proprietary workflows, and capabilities that differentiate the product in the market. The risk scoring model, the underwriting logic, the investment portfolio optimization — whatever makes your product distinct.

**Partner:** Regulated, commoditized infrastructure — card processing, ACH rails, banking ledgers, KYC identity verification, OFAC screening. Building these from scratch means acquiring the licenses, compliance expertise, and infrastructure management burden that managed providers have already absorbed.

The pattern that works in fintech: narrow the custom-built surface area to the product's actual differentiation, and use managed partners for regulated infrastructure. The failure pattern is building regulated infrastructure from scratch — card processing, identity verification, AML transaction monitoring — without the compliance expertise to operate it correctly.

---

## Engineering for Financial Services

The distinguishing characteristic of well-engineered fintech systems is not the sophistication of the technology stack — it is the rigor applied to correctness, auditability, and failure handling.

Money movement errors have real consequences: customer harm, regulatory scrutiny, reputational damage. Systems that process financial transactions need to be designed for idempotency (duplicate transaction prevention), reconciliation (state recovery after failures), and complete audit trails — not as features added after launch, but as first-class engineering concerns from the start.
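Idempotency is the easiest of these to show in a few lines. The toy service below is a sketch, not a production design: `PaymentService` is a hypothetical name, state lives in memory, and a real implementation would make the key check and the write atomic, typically with a unique constraint in the database.

```python
class PaymentService:
    """Toy service that applies each idempotency key at most once."""

    def __init__(self):
        self._results = {}    # idempotency_key -> stored response
        self.balance_cents = 0

    def credit(self, idempotency_key: str, amount_cents: int) -> dict:
        # A retry with a known key returns the original result without
        # moving money a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.balance_cents += amount_cents
        result = {"status": "applied", "balance_cents": self.balance_cents}
        self._results[idempotency_key] = result
        return result
```

The client generates the key once per logical operation and reuses it on every retry, so a timeout followed by a resend cannot double-credit the account.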

If you are building a fintech product and want a structured review of your architecture (payment processing, fraud controls, data security, or compliance infrastructure), [an architecture review](/contact) is where that conversation starts. Our [custom software development practice](/services/custom-software) and [compliance engineering work](/services/compliance-engineering) cover regulated financial system design.

---

## HIPAA Compliant App Development: A Technical Guide for Engineering Teams
Source: https://tampadynamics.com/blog/hipaa-compliant-app-development

> A practical, architecture-level guide to HIPAA compliant app development. Covers technical safeguards, PHI data flows, audit logging, encryption, BAA obligations, and common mistakes that cause compliance failures.

Date: 2026-02-16

Building a HIPAA compliant application is not primarily a legal exercise. It is an engineering discipline.

The regulation defines outcomes — confidentiality, integrity, availability of protected health information — but leaves implementation to you. That flexibility is also where most teams get into trouble. Without a clear architectural framework, "HIPAA compliance" becomes a checklist of surface-level controls that looks good in a vendor assessment and fails badly in an audit or breach investigation.

This guide is written for CTOs, engineering leads, and product owners who are building or rebuilding healthcare software and need to understand what HIPAA technical safeguards actually require — not at the legal summary level, but at the level of system design.

---

## What HIPAA Technical Safeguards Actually Require

The HIPAA Security Rule organizes requirements into three categories: administrative safeguards, physical safeguards, and technical safeguards. Engineering teams own the technical safeguards, and that category is more specific than most developers realize.

The Security Rule (45 CFR §164.312) defines five technical safeguard standards:

1. **Access control** — Unique user identification, emergency access procedures, automatic logoff, and encryption/decryption
2. **Audit controls** — Hardware, software, and procedural mechanisms to record and examine activity in systems that contain PHI
3. **Integrity** — Mechanisms to authenticate that PHI has not been improperly altered or destroyed
4. **Transmission security** — Guard against unauthorized access to PHI transmitted over electronic networks
5. **Person or entity authentication** — Verify that a person or entity seeking access to PHI is who they claim to be

Each of these has required specifications (mandatory) and addressable specifications (implement if reasonable and appropriate, or document why an equivalent alternative was used). The common mistake is treating addressable as optional. It is not. You must either implement it or document a compliant alternative — and that documentation will be examined if you are audited.

### What "Encryption" Actually Means Under HIPAA

HIPAA does not mandate a specific encryption algorithm, but it does reference NIST guidance. In practice, this means:

- AES-256 for data at rest
- TLS 1.2 or higher for data in transit (TLS 1.3 strongly preferred)
- Key management must be documented — who holds the keys, how rotation works, what happens during personnel changes

Encrypting your database and using HTTPS is necessary but not sufficient. If your encryption keys are stored in the same environment as the encrypted data, the protection is weaker than it appears. Key management is where many implementations fall short.

---

## Architecture Patterns for HIPAA Compliant Software

### Separate PHI Storage from Operational Data

The most durable pattern in HIPAA software architecture is to treat PHI as a distinct data tier with its own access controls, encryption, and audit logging — separate from your general application database.

This means:

- PHI lives in a dedicated data store with row-level or field-level encryption
- Application logic fetches PHI only when explicitly required for a specific operation
- PHI identifiers (patient IDs, record IDs) are separate from PHI content

A common implementation uses a dedicated encrypted database — RDS with encryption at rest, for example — while operational data (scheduling, billing metadata, workflow state) lives in a separate store. The application joins these only at the point of rendering, and the join itself is logged.

This pattern reduces the blast radius of a breach. If your operational database is compromised, it contains identifiers but not PHI content. It also makes access control simpler: you can apply strict IAM policies to the PHI store without restricting access to general operational data.

### Access Control: Role-Based Is Not Enough

Role-based access control (RBAC) is the baseline, but healthcare applications typically require attribute-based access control (ABAC) or a hybrid. The difference matters:

- **RBAC**: A user with the "clinician" role can access patient records
- **ABAC**: A user with the "clinician" role can access patient records for patients currently assigned to their care team, within the facilities where they hold active credentials

Pure RBAC grants overly broad access. A clinician at a large hospital system should not be able to query PHI for patients in facilities they have no relationship to. HIPAA's minimum necessary standard requires that access be scoped to what is actually needed for the current purpose.

Designing for ABAC upfront is significantly easier than retrofitting it. The key components:

```
AccessDecision = f(
  subject.role,
  subject.department,
  subject.facility_assignments[],
  resource.patient_id,
  resource.facility_id,
  resource.sensitivity_flags[],
  action.purpose_of_use
)
```

This decision function lives in your authorization layer — not scattered across individual API handlers. Every PHI access request passes through it, and the decision (including the parameters that drove it) is logged.
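A simplified version of that decision function might look like the following. It is a sketch under stated assumptions: the attribute names follow the pseudocode above where possible, `care_team_patients` is a hypothetical attribute added to express care-team assignment, the department check is omitted for brevity, and the specific rules are illustrative rather than a recommended policy.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Subject:
    role: str
    facility_assignments: frozenset
    care_team_patients: frozenset


@dataclass(frozen=True)
class Resource:
    patient_id: str
    facility_id: str
    sensitivity_flags: frozenset = field(default_factory=frozenset)


def access_decision(subject, resource, purpose_of_use):
    """Return (allowed, reason); the caller logs both together with the inputs."""
    if subject.role != "clinician":
        return False, "role_not_permitted"
    if resource.facility_id not in subject.facility_assignments:
        return False, "facility_not_assigned"
    if resource.patient_id not in subject.care_team_patients:
        return False, "patient_not_on_care_team"
    if resource.sensitivity_flags and purpose_of_use != "treatment":
        return False, "sensitive_record_requires_treatment_purpose"
    return True, "allowed"
```

Returning a reason code along with the boolean is what makes the logged decision useful later: an auditor can see not only that access was denied but which attribute drove the denial.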

### Audit Logging: What to Log and How

Audit logging is one of the most commonly under-implemented HIPAA controls. The regulation requires that you record and examine activity in systems containing PHI. In practice, this means your audit log needs to capture:

- **Who** — Authenticated user identity (not just user ID, but enough to uniquely identify a person)
- **What** — The specific PHI record accessed, modified, or deleted
- **When** — Timestamp with sufficient precision (millisecond-level for most systems)
- **How** — The operation type (read, write, export, print, share)
- **From where** — Source IP, device identifier, and application context
- **Why** — Purpose of use where the system can determine it

A minimal audit log record looks like this:

```json
{
  "event_id": "evt_01HX...",
  "timestamp": "2026-02-16T14:23:11.847Z",
  "actor": {
    "user_id": "usr_abc123",
    "email": "provider@clinic.org",
    "role": "physician",
    "session_id": "sess_xyz789"
  },
  "resource": {
    "type": "patient_record",
    "record_id": "pt_def456",
    "facility_id": "fac_ghi789"
  },
  "action": "read",
  "purpose_of_use": "treatment",
  "source_ip": "10.0.1.45",
  "user_agent": "Mozilla/5.0 ...",
  "result": "success"
}
```

Critical implementation requirements:

- **Audit logs are append-only.** The application user cannot delete or modify audit records. Use a separate write-only connection or a dedicated audit service.
- **Audit logs are separate from application logs.** Mixing them makes audits difficult and creates risk that log rotation or deletion touches audit records.
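- **Audit logs are retained for six years.** HIPAA's six-year retention requirement (45 CFR §164.316(b)(2)) is written for required documentation, and it is commonly applied to audit logs as well. Architect your storage with this retention window in mind.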
- **Audit logs are themselves PHI-adjacent.** They may contain information that reveals PHI existence. Protect them accordingly.
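A minimal sketch of producing the record shown above is given here. The function names are hypothetical, the stream stands in for whatever append-only sink you use, and `isoformat` renders UTC as `+00:00` rather than the `Z` suffix in the sample record.

```python
import json
import uuid
from datetime import datetime, timezone


def build_audit_event(actor, resource, action, purpose_of_use, source_ip, result):
    """Assemble one audit record with a unique ID and a millisecond UTC timestamp."""
    return {
        "event_id": f"evt_{uuid.uuid4().hex}",
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "actor": actor,
        "resource": resource,
        "action": action,
        "purpose_of_use": purpose_of_use,
        "source_ip": source_ip,
        "result": result,
    }


def write_audit_event(stream, event):
    """Append the event as one JSON line; the stream is opened for append elsewhere."""
    stream.write(json.dumps(event, sort_keys=True) + "\n")
```

Keeping construction and writing separate lets the writer run under a dedicated write-only credential while handlers only ever call the builder.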

---

## PHI Data Flows: Designing Systems That Minimize Exposure

Every system that handles PHI should have a documented data flow diagram. Not as a compliance artifact — as an engineering tool. Knowing exactly where PHI enters, where it is stored, how it moves, and where it exits is the foundation of a defensible architecture.

### The Principle of PHI Minimization

Before designing a feature, ask: does this component actually need PHI, or does it only need a pseudonymous identifier?

A scheduling system does not need a patient's full medical history to book an appointment. It needs a patient identifier, a provider, and a time slot. The clinical record system can be queried separately, under stricter controls, only when a clinician is actively rendering care.

PHI minimization in practice:

- **Tokenization** — Replace PHI fields with non-sensitive tokens in operational systems. The token mapping lives in a separate, access-controlled store.
- **De-identification** — For analytics, reporting, and ML training data, de-identify records to the Safe Harbor or Expert Determination standard before they leave the PHI boundary.
- **Data masking** — In non-production environments (development, staging, QA), PHI should be masked or replaced with synthetic data. Developers should never need real PHI to do their work.
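Tokenization, the first technique in the list above, can be sketched as a small vault. This is an in-memory illustration with hypothetical names; in practice the mapping would live in a separate, encrypted, access-controlled store, and every call to `detokenize` would be authorized and audit-logged.

```python
import secrets


class TokenVault:
    """Maps opaque tokens to PHI values; stands in for a separate, access-controlled store."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)  # random, not derived from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]
```

Because tokens are random rather than hashes of the value, holding a token reveals nothing about the underlying PHI without access to the vault.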

### Where PHI Leaves Your System

Outbound PHI flows are where many organizations have the least visibility. Common unintended PHI exits:

- **Third-party analytics and error tracking** — If your error monitoring SDK captures request bodies or user context, it may be capturing PHI. This requires either a BAA with your monitoring vendor or ensuring PHI is scrubbed before it reaches those systems.
- **Log aggregation** — Application logs that include request parameters or response payloads may contain PHI. Structured logging with explicit field exclusions is safer than unstructured log strings.
- **Client-side data** — React Query caches, localStorage, browser session storage — all of these can hold PHI. Design your frontend state management to hold PHI only as long as needed and to clear it on logout or session expiry.
- **PDF generation and file exports** — Export pipelines are often an afterthought. Every generated document containing PHI needs to be accounted for, stored securely, and its access logged.

---

## BAA Obligations and What They Mean for Your Stack

A Business Associate Agreement is a contractual requirement, not a technical control. Your vendor choices still matter, because not every vendor will sign a BAA, and that determines which services are allowed to touch PHI.

### Who Needs a BAA

Any vendor, service provider, or contractor that creates, receives, maintains, or transmits PHI on your behalf is a Business Associate and requires a BAA. In a modern SaaS application, this list is longer than most teams initially expect:

- Cloud infrastructure provider (AWS, Azure, GCP — all offer BAAs)
- Database hosting (e.g., RDS, managed PostgreSQL services)
- Authentication provider (Auth0, Okta, Cognito — check BAA availability per tier)
- Error monitoring and observability (Datadog, Sentry — BAAs are available but often require enterprise tiers)
- Email delivery (if PHI is included in transactional email)
- AI and LLM providers (this is where most health tech teams have the largest gap)

### The AI/LLM BAA Problem

If you are integrating AI into a healthcare application, you need to understand which AI providers offer BAAs and under what conditions.

Amazon Bedrock is covered under the standard AWS BAA as a HIPAA-eligible service. Azure OpenAI Service offers a BAA through the Microsoft Products and Services Agreement. OpenAI's consumer products are not covered by a BAA and should not be used with PHI; for the API, a BAA has to be arranged with OpenAI directly before any PHI is sent. Anthropic's API does not currently offer a BAA for the standard tier.

This is not a comprehensive or permanent list — BAA availability changes as providers update their commercial terms. But the principle is stable: **if PHI will be sent to or processed by a service, that service needs a BAA before you write the first line of integration code.**

The architectural implication is that your AI integration layer needs to distinguish between what is and is not PHI. If your RAG system retrieves clinical documents to answer a query, those documents may be PHI. If you are sending them to an LLM, that LLM provider needs a BAA. If no BAA is available, you need an architecture that de-identifies or synthesizes the context before it leaves your HIPAA boundary.

---

## Common Mistakes That Cause HIPAA Failures in Software

These are the patterns we see most often when reviewing healthcare application architectures:

**Broad database access credentials.** The application service account has read/write access to the entire database, including all PHI tables. When that credential is compromised — through a misconfigured environment, a leaked secret, or an SSRF vulnerability — the entire PHI store is exposed. Instead: least-privilege database credentials, scoped to the minimum required tables and operations for each service.

**PHI in URLs and query parameters.** Patient IDs, record identifiers, or any PHI appearing in URL paths or query strings will end up in web server access logs, browser history, and HTTP referer headers. Use POST bodies for PHI, or use opaque identifiers that cannot be reverse-mapped without authenticated database access.

**Shared sessions without proper isolation.** Multi-tenant systems where session state or cache entries are not fully isolated by tenant. This is a standard software engineering problem, but the consequence in healthcare is PHI cross-contamination between organizations.

**Logging PHI in application logs.** Structured logging is good. Logging request bodies, response payloads, or user objects that contain PHI is not. Every logging call that touches user-supplied data needs to go through a sanitization function that strips PHI fields before writing.

**Missing automatic session termination.** HIPAA requires automatic logoff. Workstations left logged in with an active clinical session are a physical and technical risk. The implementation is straightforward — an inactivity timer that terminates the session after a configurable interval — but it is frequently omitted from initial builds.

**Treating development environments as outside scope.** Development and staging environments that use real PHI are in scope for HIPAA. Using production database snapshots for local development, without de-identifying the data first, exposes PHI on developer workstations that are rarely subject to the same controls as production infrastructure.

**Insufficient encryption key management.** Encrypting the database but storing the encryption key in the same AWS account, in an environment variable, or in the same secrets manager instance as the application credentials — this is encryption theater. Key management needs to be a distinct architectural concern, with access to keys separated from access to the encrypted data.
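Of the mistakes above, automatic session termination is the quickest to sketch. The example below is a minimal server-side illustration with hypothetical names; the timeout value is a policy decision, and the injectable clock exists only to make the behaviour testable.

```python
import time


class Session:
    """Session that expires after a period of inactivity."""

    def __init__(self, timeout_seconds: float, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_activity = clock()
        self._terminated = False

    def touch(self) -> bool:
        """Record activity; return False if the session has already expired."""
        if self.is_expired():
            self._terminated = True
            return False
        self._last_activity = self._clock()
        return True

    def is_expired(self) -> bool:
        """True once the inactivity interval has elapsed or the session was terminated."""
        return self._terminated or (self._clock() - self._last_activity) >= self._timeout
```

Enforcing the timeout on the server matters because a client-side timer alone can be bypassed; the frontend timer is a usability feature on top of this check, not a replacement for it.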

---

## What a Compliant Architecture Review Looks Like

When we work with engineering teams on HIPAA compliant app development, the starting point is a structured review of the existing or proposed system design — not a compliance checklist, but an architecture conversation.

A useful review covers:

- **PHI inventory** — What data qualifies as PHI, where it is created in the system, where it is stored, and every path by which it moves or exits
- **Access control model** — How users are authenticated, how authorization decisions are made, and whether the model supports minimum necessary access in practice
- **Audit log completeness** — What is currently logged, whether it is sufficient to reconstruct the history of any PHI record, and whether the log store is adequately protected
- **Encryption posture** — Encryption at rest and in transit, key management, and whether encryption is applied at the right layer for your threat model
- **Third-party data flows** — Every vendor that touches PHI, whether BAAs are in place, and whether the data minimization principle is applied before PHI reaches external services
- **AI integration risk** — If AI is part of the system, how PHI is handled within the AI pipeline, which providers are in scope, and what guardrails are in place
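
Parts of this review can be captured as data rather than prose. The sketch below models the third-party data flow portion of a PHI inventory and flags vendors that receive PHI without a BAA. The `PhiDataFlow` shape, the field names, and the sample vendors are hypothetical; a real inventory would carry far more detail.

```typescript
// Hypothetical inventory record: one entry per external service in the system.
interface PhiDataFlow {
  vendor: string;      // external service receiving data
  phiFields: string[]; // PHI elements sent; empty if PHI is minimized away
  baaSigned: boolean;  // whether a BAA is in place with this vendor
}

// Returns the vendors that receive PHI without a signed BAA.
function findBaaGaps(flows: PhiDataFlow[]): string[] {
  return flows
    .filter((flow) => flow.phiFields.length > 0 && !flow.baaSigned)
    .map((flow) => flow.vendor);
}

// Sample data, invented for illustration.
const flows: PhiDataFlow[] = [
  { vendor: "cloud-hosting", phiFields: ["name", "dob", "diagnosis"], baaSigned: true },
  { vendor: "llm-provider", phiFields: ["clinical_note"], baaSigned: false },
  { vendor: "error-tracker", phiFields: [], baaSigned: false },
];

const gaps = findBaaGaps(flows); // only "llm-provider" is flagged
```

The error tracker is not flagged because no PHI reaches it, which is the data minimization principle doing its job. The LLM provider is flagged because it receives clinical notes with no BAA in place.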

The output is a clear picture of where the architecture is solid and where there are gaps — with specific, prioritized recommendations for addressing them. It is not a certification and it is not a legal opinion. It is an engineering assessment.

If your team is building a healthcare application and you want a structured review of your architecture before you go further, that is the conversation we are set up to have.

---

## Frequently Asked Questions

### Does HIPAA require a specific encryption standard?

HIPAA does not mandate a specific algorithm, but it references NIST guidance, and in practice AES-256 for data at rest and TLS 1.2 or higher for data in transit are the accepted standards. The more important question is often not which algorithm you use but how you manage the keys — who controls them, how they are rotated, and what happens when they are compromised.

### What is the difference between required and addressable HIPAA specifications?

Required specifications are mandatory — you must implement them. Addressable specifications must be implemented if reasonable and appropriate given your organization's risk assessment, or you must document why an equivalent alternative was used instead. Addressable does not mean optional. During an audit or breach investigation, you will be expected to demonstrate that you considered each addressable specification and made a documented, defensible decision.

### Can we use a HIPAA compliant cloud provider and consider ourselves covered?

No. A cloud provider offering a BAA and HIPAA-eligible services means the shared responsibility model applies — the provider is responsible for the physical infrastructure and some platform controls, but you are responsible for everything you build on top of it. Your application access controls, audit logging, encryption key management, and secure coding practices are entirely your responsibility regardless of what your cloud provider does.

### Do we need a BAA with our AI or LLM provider?

Yes, if PHI will be sent to or processed by that provider. This includes PHI used as context in prompts, PHI retrieved from your knowledge base and passed to a model, or PHI included in documents that are analyzed by the model. Review each AI provider's BAA availability before integrating them into a system that handles PHI. If a BAA is not available, you need an architecture that prevents PHI from reaching that provider.

### How long do audit logs need to be retained?

HIPAA requires that documentation of policies, procedures, and actions be retained for six years from the date of creation or the date when it was last in effect, whichever is later. The rule does not name a separate period for audit logs, so six years is the retention window most organizations apply to them. Architect your log storage with this in mind: cold storage for older logs is acceptable, but retrieval needs to be practical if you are ever audited or investigating an incident.
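
The six-year window reduces to simple date arithmetic. The sketch below computes the earliest date a record could be deleted, using the later of its creation date and the date it was last in effect. Function names are illustrative, and applying the window to individual log records is an assumption about how a team might operationalize the rule.

```typescript
const RETENTION_YEARS = 6;

// Earliest date a record may be deleted: six years after the later of
// its creation date and the date it was last in effect.
function retainUntil(createdAt: Date, lastInEffect: Date = createdAt): Date {
  const start =
    createdAt.getTime() >= lastInEffect.getTime() ? createdAt : lastInEffect;
  const end = new Date(start.getTime());
  end.setUTCFullYear(end.getUTCFullYear() + RETENTION_YEARS);
  return end;
}

// True once the retention window for the record has fully elapsed.
function canDelete(createdAt: Date, now: Date, lastInEffect?: Date): boolean {
  return now.getTime() >= retainUntil(createdAt, lastInEffect).getTime();
}
```

A lifecycle job could use `canDelete` as its final gate, with a separate and much shorter threshold deciding when records move to cold storage.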

---

## Build Healthcare Software That Is Defensible, Not Just Documented

The difference between healthcare applications that survive audits and those that do not is not usually the legal documentation — it is the engineering. Systems that log the right things, scope access correctly, keep PHI out of places it should not be, and handle third-party integrations with appropriate controls are genuinely more defensible than systems that rely on compliance documents to paper over architectural gaps.

If you are building a HIPAA-compliant application and want an architecture review from engineers who work in this space, [start with a conversation](/contact). If you are specifically evaluating AI capabilities within a HIPAA-compliant framework, our [healthcare AI consulting](/healthcare-ai-consulting) practice works through exactly these design decisions.

The engagement is structured and time-bounded — a focused review to understand your systems, constraints, and goals, with clear deliverables. No pitch deck, no vague roadmap. If we are a fit, you will know exactly what the next steps look like.

---

## How to Choose a Software Development Partner: A Practical Evaluation Guide
Source: https://tampadynamics.com/blog/how-to-choose-software-development-partner

> A practical guide to evaluating and selecting a software development partner — covering technical due diligence, contract structure, engagement models, and red flags to watch for.

Date: 2025-12-09

Most companies that select a software development partner badly do so for one of two reasons: they evaluate vendors on the wrong criteria, or they do not know what they actually need until they are already deep into an engagement that is not working.

This guide is written for technical leaders, product owners, and founders who are evaluating software development partners — for a new build, a platform migration, or ongoing engineering capacity. It covers how to structure the evaluation, what to look for and what to avoid, and how to negotiate contracts that protect your interests without creating adversarial dynamics.

---

## Define What You Actually Need Before You Evaluate Vendors

The most consequential decision in vendor selection happens before you talk to any vendor: defining the engagement type you actually need. Most buyers conflate three distinct models, and confusing them results in mismatched partnerships from the start.

### Staff Augmentation

Staff augmentation means embedding individual engineers into your existing team. The vendor provides people; your team provides direction, architecture, and management. The augmented engineers work within your processes, your tools, and under your technical leadership.

**When it works:** You have a well-functioning engineering team with strong technical leadership and need to increase capacity. The backlog is defined. The architecture is established. You need execution, not design.

**When it fails:** You bring in augmented engineers because you lack technical leadership and expect them to provide strategic direction. Individual contributors cannot substitute for technical leadership, regardless of their seniority level.

### Project-Based Engagement

A project engagement scopes a defined deliverable — a feature set, a platform migration, an application build — with a beginning and an end. The vendor takes ownership of delivery within an agreed scope. You define the outcomes; the vendor determines how to get there.

**When it works:** You can define the scope clearly enough to contract around it. The deliverable has clear acceptance criteria. You have the bandwidth to review and approve work during the engagement.

**When it fails:** The scope is underspecified, requirements change substantially mid-engagement, or the client expects project-priced delivery with an open-ended scope. This is the source of most fixed-price software project disputes.

### Product Partnership

A product partnership is a longer-term relationship where the development partner functions as an extension of your product and engineering organization — contributing to architecture, roadmap, and strategic decisions, not just execution.

**When it works:** You are building a technology-intensive product and lack the internal capacity to own the technical strategy and execution. You want a partner who understands the business context, not just the ticket.

**When it fails:** You are not ready to give the partner enough context and authority to actually make good decisions. Treating a product partnership like a staff augmentation arrangement — managing to individual tickets without sharing strategic context — produces good task completion and poor architecture.

Being honest with yourself about which of these you need, before you issue an RFP or schedule discovery calls, will filter your vendor options significantly and save considerable time.

---

## Technical Evaluation Criteria

Most vendor evaluations focus on portfolio work, pricing, and team bios. These are necessary but insufficient. The technical evaluation criteria that actually surface differences:

### Architecture Interviews

Ask the technical lead who will own your engagement to walk through how they would approach a specific technical problem from your domain. Not a whiteboard exercise with a contrived problem — a real design question from your actual system.

What you are evaluating: do they ask the right clarifying questions before proposing an approach? Do they identify the tradeoffs in different approaches rather than presenting one solution as obviously correct? Do they demonstrate familiarity with the specific constraints of your domain (regulatory, operational, scale)?

A vendor who jumps immediately to a specific technology answer without understanding your constraints is either not thinking carefully or is selling you the solution they already know how to build. Neither is what you want.

### Code Sample Review

Ask for a code sample from a previous engagement in the relevant stack. Review it with someone who can evaluate it technically: look at structure, naming conventions, error handling, test coverage, and documentation. Code that is hard to read, poorly tested, and undocumented is what you should expect to find in your own codebase six months after the engagement ends.

If the vendor refuses to show code samples citing NDA constraints, that is understandable — ask if they have open source contributions or can share a sanitized example. If they cannot produce anything reviewable, you cannot evaluate their technical craft.

### Documentation Quality

Ask to see a technical specification or architecture document from a previous engagement. The quality of their documentation tells you a great deal about how they think and communicate, and it predicts what you will receive at handoff.

Documentation that is vague, diagram-heavy with little explanatory text, or organized around the solution rather than the problem is a preview of what you will get at the end of your engagement. Clear, substantive technical writing that explains decisions and their rationale is a strong signal.

---

## Reference Checks That Surface Real Information

Standard reference checks are almost entirely useless. Vendors only share references who will give positive reviews. Asking a reference "How was it working with them?" produces an answer that was scripted before the call.

References are useful when you ask specific questions that require specifics in response:

**"Describe the most difficult moment in the engagement and how the vendor handled it."** Every real engagement has a difficult moment — a technical dead end, a scope dispute, a missed milestone. A reference who cannot describe one is either not remembering the engagement honestly or the relationship is too surface-level to be useful.

**"What would you do differently if you were starting the engagement again?"** This question surfaces the structural things the reference wishes had been true at the start — clearer scope, different contract structure, different team composition. The answer tells you what you should negotiate for before signing.

**"Did the code they delivered live up to what you were shown in the sales process? What were the gaps?"** This directly addresses the demo-to-delivery gap that is the most common disappointment in software development engagements.

**"Would you use them again, and for what type of work specifically?"** Some vendors are excellent for certain types of work and not others. The specificity of "yes, for X but not for Y" is far more useful than a generic recommendation.

Try to find references beyond the list the vendor provides. LinkedIn, industry networks, and common customers are all valid paths to additional references. For vendors with many satisfied customers, corroborating references are easy to find; a vendor whose only reachable references are the ones they control is a yellow flag.

---

## Contract Structures

The contract structure determines the alignment of incentives between you and the vendor. There is no universally correct structure — the right structure depends on how well you can define scope and how much risk each party can absorb.

### Time and Materials

You pay for time actually worked at an agreed rate. Scope can change; cost tracks actual effort. The risk of overruns sits with you; the vendor has less incentive to be efficient with time.

Time and materials is appropriate when scope is genuinely uncertain — early-stage exploration, research-heavy work, iterative product development where the requirements will evolve. It requires that you trust the vendor's time reporting and have enough visibility into the work to evaluate whether the effort is appropriate.

Protect yourself in T&M contracts with: weekly or bi-weekly timesheet review, a defined escalation process for cost overruns, and milestone-based check-in points where both parties can reassess whether the engagement is on the right track.

### Fixed Price

You pay a contracted amount for a defined deliverable. The vendor bears the risk of underestimating; you bear the risk of scope creep and the friction of formal change orders for anything outside the original scope.

Fixed price is only appropriate when scope can be defined precisely enough to actually hold the contract to it. Attempting fixed price on a project with poorly understood requirements produces disputes, quality shortcuts, and adversarial dynamics as the vendor tries to stay profitable within scope and you try to get everything you expected.

The contract language that matters in a fixed-price engagement: the definition of "done," the change order process (how scope changes are estimated and approved), and the acceptance testing criteria. Vague definitions of completion in a fixed-price contract will be exploited by either party when the engagement is under pressure.

### Milestone-Based

A hybrid: fixed-price milestones with clearly defined deliverables at each stage. The advantage is that you pay for tangible progress rather than time, while limiting fixed-price exposure to a bounded scope at each milestone rather than the full project.

Milestone-based contracts require that each milestone be defined precisely enough to determine whether it has been achieved. "Working authentication system" is not a milestone definition. "Users can register with email/password, receive a verification email, complete verification, and log in to the application — verified against the acceptance tests defined in Attachment A" is.

### IP Ownership and Source Code

Regardless of the contract structure, clarify IP ownership explicitly. The work should be unambiguously yours — including all source code, documentation, databases, and configuration. Ensure the contract specifies:

- Work product IP transfers to you upon payment (work-for-hire)
- The vendor may not reuse your code in other engagements
- Open source components are identified, and their licenses are compatible with your use
- Source code is delivered in a repository you control, not the vendor's

Some vendors retain ownership of "pre-existing IP" they bring into the engagement — generic frameworks, internal libraries, common utilities. This is reasonable to allow; make sure you have a perpetual, irrevocable license to use any pre-existing IP incorporated into your work product.

**Source code escrow** is worth considering for critical systems where you are dependent on a vendor relationship. An escrow agreement deposits source code, documentation, and build instructions with a third-party escrow agent; the code is released to you under defined conditions (vendor insolvency, end of engagement, failure to maintain). The overhead is modest; the protection is real for systems where continuity is essential.

---

## Red Flags

These are patterns that reliably correlate with poor engagement outcomes:

**Guaranteed timelines presented before discovery.** Any vendor that quotes a completion date before understanding your requirements in depth is either working from a template that does not match your situation or telling you what you want to hear. Credible estimates require scope understanding. A rough order of magnitude before discovery is fine; a confident commitment is not.

**No post-launch support plan.** Software development does not end at launch. Production systems require ongoing maintenance, bug fixes, dependency updates, security patches, and monitoring. A vendor who does not address post-launch support in their proposal is either expecting you to handle it (fine if planned for) or has not thought about it (not fine).

**Vague deliverables.** Proposals that describe deliverables in terms of effort ("200 hours of development") rather than outcomes ("a working authentication system with documented test coverage") create misaligned expectations. Effort describes input; deliverables describe output. Hold vendors to outcome-defined deliverables.

**The bait-and-switch team.** A vendor presents senior engineers in the sales process and then staffs the engagement with junior engineers under minimal senior oversight. This is common enough that it deserves explicit protection in the contract: name the key personnel on the engagement and require your approval for any substitution.

**No questions about your users.** A vendor who does not ask about the actual users of the system, their workflows, and how they will interact with what is being built will often produce a system that is technically functional and practically unusable. Product thinking and user-centricity are not bonus features — they are part of building software that succeeds.

**Offshore delivery presented as equivalent to onshore.** It may be equivalent for your situation, or it may not be. Offshore and nearshore models introduce coordination overhead, time zone friction, and sometimes quality variability that is not captured in the hourly rate comparison. We discuss this further below.

---

## Offshore vs. Nearshore vs. Domestic: An Honest Comparison

The cost differential between offshore, nearshore, and domestic development is real. So are the tradeoffs.

### Offshore (India, Eastern Europe, Southeast Asia)

**Advantages:** Lowest hourly rates, large talent pools, established firms with mature delivery processes for certain types of work.

**Real tradeoffs:** Time zone overlap of 0-4 hours with US Eastern is a genuine coordination overhead. Asynchronous collaboration requires discipline from both sides. Code quality variance is wide — the best offshore shops produce excellent work; the market is crowded with firms that compete on price and deliver accordingly. Senior technical leadership is often thinner than represented, and the team you work with may change more frequently than a domestic arrangement.

**Best fit:** Well-defined, execution-heavy work with stable requirements and strong technical oversight from your side. Not a fit for early-stage product exploration, domain-complex regulatory systems, or engagements where you need your development team to make architectural decisions independently.

### Nearshore (Latin America, Eastern Time Zone overlap)

**Advantages:** Significant time zone overlap with the US (usually 1-3 hours off East Coast), lower cost than domestic, and a growing pool of strong engineering talent in Colombia, Brazil, Mexico, and Argentina.

**Real tradeoffs:** The nearshore market is less mature than offshore; quality variance is also significant. The best nearshore shops are excellent; the market is not uniformly developed. English proficiency varies.

**Best fit:** Teams that value real-time collaboration during the US business day but have cost constraints that make domestic rates difficult. The time zone alignment advantage over offshore is genuine and undervalued.

### Domestic (US-based)

**Advantages:** Full time zone alignment, easier reference checking and relationship validation, stronger accountability mechanisms, often stronger product and communication skills, easier to integrate with your internal team.

**Real tradeoffs:** Highest hourly rates. The US market for experienced engineers is expensive.

**Best fit:** Complex, domain-intensive work in regulated industries. Systems where domain knowledge (healthcare, financial services, legal) is as important as raw technical skill. Engagements where real-time collaboration and rapid iteration are core to the process. Cases where the cost of poor architecture is higher than the cost of premium engineering rates.

The decision is rarely as simple as a direct hourly rate comparison. Model the total cost including coordination overhead, quality risk, and the cost of rework before concluding that the lower hourly rate produces lower total cost.

---

## Questions to Ask in the Evaluation Process

These are useful as a structured starting point for vendor discussions:

- Who will be working on this engagement day-to-day? Can I meet the actual team, not the sales team?
- How do you handle scope changes? Walk me through a specific example from a past engagement.
- What does your testing practice look like? What test coverage do you target, and how do you handle regression?
- What does handoff look like? What documentation will I have at the end of the engagement?
- What is your on-call and incident response process for production issues during the engagement?
- Have you worked in our regulatory environment before? What specific requirements did that create?
- What is the one thing that is most likely to cause this engagement to be harder than expected, and how would we get ahead of it?

The last question is the most revealing. A vendor who cannot answer it has not thought carefully about your engagement. A vendor who gives a thoughtful answer about specific risks and their mitigation demonstrates the kind of clear-eyed thinking you want building your system.

---

## Frequently Asked Questions

### How long should a vendor evaluation take?

For a significant engagement — a new product build, a major platform migration — four to eight weeks from first contact to signed contract is reasonable if you move with intention. Shorter timelines often produce hasty decisions; longer ones often reflect unclear internal decision-making more than vendor complexity.

### Should we issue an RFP?

RFPs are useful for commoditized work where you can specify requirements precisely and evaluate vendors on a comparable basis. For complex software development, RFPs often produce proposals that optimize for winning the evaluation rather than solving your problem. A structured conversation and a paid discovery engagement produce more useful information than an RFP response.

### What is a paid discovery engagement?

A bounded, paid engagement — typically 2-4 weeks — where the vendor develops a detailed technical proposal, architecture recommendation, or product specification based on deep collaboration with your team. It costs money but produces something real: a concrete plan that you can evaluate, compare, and potentially take to a different vendor. It also reveals whether the working relationship functions before you commit to a larger engagement.

### How do we evaluate a vendor for a domain we don't know well?

Bring in a technical advisor or consultant to assist with the evaluation — someone who can review code samples, conduct technical interviews, and assess architecture proposals with domain expertise. This is a much smaller investment than the cost of a poorly selected primary vendor.

---

Selecting a software development partner is a significant decision, and the evaluation process deserves proportionate effort. The frameworks here are a starting point, not a complete system — every engagement has specifics that matter.

If you are evaluating partners for a system in a regulated industry and want a direct conversation about whether we are a fit for your situation, [start here](/contact). We will tell you honestly if we are not.

---

## Next.js 16 Best Practices for Production Apps
Source: https://tampadynamics.com/blog/nextjs-16-best-practices

> Modern patterns and practices for building fast, maintainable Next.js 16 applications with React 19, Server Components, and the App Router.

Date: 2025-11-28

Next.js 16 represents a maturation of the App Router paradigm introduced in Next.js 13. Combined with React 19's stable Server Components, it's now the default choice for production React applications. Here's what we've learned building regulated, production-grade apps with this stack.

## Server Components by Default

The mental model shift is complete: components are Server Components unless you explicitly opt into client-side rendering.

### When to Use Server Components

- **Data fetching** — Fetch directly in your components, no useEffect or client-side loading states
- **Static content** — Marketing pages, documentation, blog posts
- **Sensitive operations** — API keys and database queries stay on the server

```tsx
// This runs on the server - no "use client" needed
async function RecentPosts() {
  const posts = await db.posts.findMany({ take: 5 })
  return (
    <ul>
      {posts.map(post => <li key={post.id}>{post.title}</li>)}
    </ul>
  )
}
```

### When to Use Client Components

Add `"use client"` only when you need:

- **Interactivity** — onClick, onChange, form submissions
- **Browser APIs** — localStorage, geolocation, window
- **State and effects** — useState, useEffect, useRef
- **Third-party client libraries** — Many UI libraries require client context

## The Component Boundary Pattern

Structure your app so client components are leaves, not roots:

```
Page (Server)
├── Header (Server)
│   └── MobileMenu (Client) ← interactivity isolated here
├── Content (Server)
│   └── ContactForm (Client) ← form state isolated here
└── Footer (Server)
```

This keeps your JavaScript bundle small and your initial page load fast.

## Data Fetching Patterns

### Parallel Data Fetching

Don't waterfall your requests:

```tsx
// Bad - sequential, even though the second call doesn't need the first
const user = await getUser(id)
const recentPosts = await getRecentPosts()

// Good - independent requests run in parallel
const [user, recentPosts] = await Promise.all([
  getUser(id),
  getRecentPosts(), // only valid because it doesn't depend on user
])
```

### Streaming with Suspense

Wrap slow data fetches in Suspense to stream content progressively:

```tsx
import { Suspense } from 'react'

export default function Dashboard() {
  return (
    <div>
      <h1>Dashboard</h1>
      <QuickStats /> {/* Fast - renders immediately */}
      <Suspense fallback={<Skeleton />}>
        <SlowAnalytics /> {/* Streams in when ready */}
      </Suspense>
    </div>
  )
}
```

## Caching and Revalidation

Next.js 16 gives you granular control over caching:

```tsx
// Revalidate every 60 seconds
const timed = await fetch(url, { next: { revalidate: 60 } })

// Revalidate on-demand with tags
const tagged = await fetch(url, { next: { tags: ['posts'] } })
// Then call revalidateTag('posts', 'max') when content changes
// (Next.js 16 expects a cacheLife profile as the second argument)
```

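For database queries, `unstable_cache` still works in 16. Despite the name it has not been stabilized, and the `use cache` directive is its intended replacement, so treat this as a transitional pattern: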

```tsx
import { unstable_cache } from 'next/cache'

const getCachedUser = unstable_cache(
  async (id: string) => db.users.findUnique({ where: { id } }),
  ['user'],
  { revalidate: 300 }
)
```

## Metadata and SEO

Use the metadata API for type-safe, dynamic SEO:

```tsx
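import type { Metadata } from 'next'

export async function generateMetadata(
  { params }: { params: Promise<{ slug: string }> }
): Promise<Metadata> {
  // params is a Promise in Next.js 15+ and must be awaited
  const { slug } = await params
  const post = await getPost(slug)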
  return {
    title: post.title,
    description: post.excerpt,
    openGraph: {
      images: [post.coverImage],
    },
  }
}
```

## Error Handling

Create error boundaries at route segment levels:

```
app/
├── error.tsx        # Catches errors in this segment
├── global-error.tsx # Catches root layout errors
└── dashboard/
    └── error.tsx    # Dashboard-specific error UI
```

Always provide a meaningful recovery path:

```tsx
'use client'

export default function Error({
  error,
  reset,
}: {
  error: Error & { digest?: string }
  reset: () => void
}) {
  return (
    <div>
      <h2>Something went wrong</h2>
      <button onClick={() => reset()}>Try again</button>
    </div>
  )
}
```

## Performance Checklist

Before deploying, verify:

1. **No unnecessary "use client"** — Audit components for actual client needs
2. **Images optimized** — Use `next/image` with proper sizing
3. **Fonts optimized** — Use `next/font` for zero layout shift
4. **Bundle analyzed** — Run `@next/bundle-analyzer` to catch bloat
5. **Core Web Vitals** — Test LCP, CLS, and INP in production conditions
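
For the last item, it can help to encode the pass/fail boundaries so they can be checked in CI or against field data. The sketch below uses the "good" and "poor" boundaries from Google's published Core Web Vitals guidance; the function and type names are illustrative, and how you collect the measurements is left open.

```typescript
type Rating = "good" | "needs-improvement" | "poor";

// Boundaries from Google's Core Web Vitals guidance.
// LCP and INP are in milliseconds; CLS is unitless.
const THRESHOLDS = {
  lcpMs: { good: 2500, poor: 4000 },
  inpMs: { good: 200, poor: 500 },
  cls: { good: 0.1, poor: 0.25 },
} as const;

type Metric = keyof typeof THRESHOLDS;

// Classifies a single measurement against the boundaries for its metric.
function rate(metric: Metric, value: number): Rating {
  const { good, poor } = THRESHOLDS[metric];
  if (value <= good) return "good";
  if (value <= poor) return "needs-improvement";
  return "poor";
}
```

Google assesses these metrics at the 75th percentile of page loads, so the value passed in should usually be a p75 from real-user data rather than a single lab run.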

## Security Considerations

For regulated industries, remember:

- **Server Actions validate input** — Never trust client data
- **Environment variables** — Use `NEXT_PUBLIC_` only for truly public values
- **CSP headers** — Configure in `next.config.js` or middleware
- **API routes authenticate** — Check sessions/tokens on every request
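
To make the first point concrete, here is a minimal sketch of the kind of validation a Server Action might run before touching any business logic. The `ContactInput` shape, the field names, and the length limit are assumptions for illustration. Many teams use a schema library for this; the hand-rolled version below only keeps the example dependency-free.

```typescript
// Hypothetical contact-form payload; field names are illustrative.
interface ContactInput {
  email: string;
  message: string;
}

type ParseResult =
  | { ok: true; value: ContactInput }
  | { ok: false; errors: string[] };

// Validates untrusted input and returns either a typed value or a list of errors.
function parseContactInput(raw: Record<string, unknown>): ParseResult {
  const errors: string[] = [];
  const rawEmail = raw["email"];
  const rawMessage = raw["message"];
  const email = typeof rawEmail === "string" ? rawEmail.trim() : "";
  const message = typeof rawMessage === "string" ? rawMessage.trim() : "";

  // Deliberately simple shape check, not full RFC 5322 validation.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) errors.push("email is invalid");
  if (message.length === 0) errors.push("message is required");
  if (message.length > 2000) errors.push("message is too long");

  return errors.length > 0
    ? { ok: false, errors }
    : { ok: true, value: { email, message } };
}
```

Inside a Server Action, the raw record could come from `Object.fromEntries(formData)`, with the action returning the error list to the form whenever `ok` is false.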

## Our Stack

At Tampa Dynamics, we pair Next.js 16 with:

- **Tailwind CSS 4** — Faster builds, CSS-first configuration
- **TypeScript** — Strict mode, always
- **Velite** — Type-safe MDX content
- **AWS Amplify Gen 2** — Deployment with preview environments

This combination gives us fast iteration with production-grade reliability for our healthcare, legal, and compliance-focused clients.

---

Building a Next.js application for a regulated industry? [Let's talk](/contact) about architecture patterns that meet your compliance requirements.

---

## Offshore vs. Onshore Software Development: Honest Trade-offs
Source: https://tampadynamics.com/blog/offshore-vs-onshore-development

> An honest comparison of offshore and onshore software development — covering cost, quality, communication, IP risk, compliance considerations, and when each model works.

Date: 2025-12-16

The offshore vs. onshore question gets more attention than it deserves when framed purely as a cost question. Hourly rate differences are real. But the total cost of a development engagement — including management overhead, rework, timezone friction, communication failures, and the opportunity cost of delayed or incorrect software — rarely matches the hourly rate math.

This is not an argument against offshore development. It is an argument for choosing a development model based on the actual trade-offs for your specific context, rather than on hourly rate arbitrage or reflexive preference.

---

## The Real Cost Difference

The hourly rate comparison between offshore and onshore teams is straightforward: senior engineers in Eastern Europe, India, and Latin America typically bill at 30-60% of equivalent US rates. At face value, this creates significant cost savings on a project of any meaningful size.

The total cost picture is more complex.

**Management overhead.** Offshore teams require more active management than co-located onshore teams. Specification clarity, technical direction, quality review, and issue escalation all take more time when the team cannot walk over to your desk. A reasonable estimate for an offshore engagement is that someone on your side will spend 20-40% of their time managing the engagement — time that has a cost.

**Specification and rework cost.** Software quality is closely correlated with specification clarity. Offshore teams working from underspecified requirements produce output that requires significant rework. The cost of rework (not just the engineering time, but also the schedule impact and the re-specification work) often exceeds the original cost savings. This is not unique to offshore teams, but the feedback loop is longer and misalignment compounds faster at a distance.

**Timezone friction cost.** A 9-12 hour timezone difference between a US-based product team and an offshore engineering team means there is a limited window for real-time collaboration. Decisions that would take 10 minutes in a room together take a day — one side sends a question at end of day, the other responds first thing the next morning. Over a six-month engagement, this friction is significant. It is manageable with deliberate process design, but it has a cost.
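
The size of that collaboration window is easy to estimate. The sketch below assumes both teams work the same local hours (9:00 to 17:00 by default) and that the working day is no longer than 12 hours; the function name and the sample UTC offsets are illustrative, and daylight saving shifts are ignored.

```typescript
// Hours per day during which both teams are at work.
// Offsets are hours relative to UTC, e.g. -5 for US Eastern standard time.
function overlapHours(
  offsetA: number,
  offsetB: number,
  startLocal = 9,
  endLocal = 17,
): number {
  const dayLength = endLocal - startLocal;
  // Fold the difference between the zones into the range 0..12 hours.
  const raw = Math.abs(offsetA - offsetB) % 24;
  const gap = raw > 12 ? 24 - raw : raw;
  return Math.max(0, dayLength - gap);
}

const easternVsIndia = overlapHours(-5, 5.5);   // 10.5 hours apart
const easternVsPoland = overlapHours(-5, 1);    // 6 hours apart
const easternVsArgentina = overlapHours(-5, -3); // 2 hours apart
```

Under these assumptions a US Eastern team shares no standard working hours with a team in India, two hours with a team in Poland, and six with a team in Argentina, which is why any overlap at a large offset has to come from one side shifting its day.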

**Integration and knowledge transfer cost.** When the engagement ends or the team changes, knowledge that was held informally in an offshore team does not transfer automatically. Documentation quality, code comment practices, and knowledge management all affect how much value you retain.

The realistic cost comparison is not $X/hour vs. $Y/hour. It is (offshore rate × hours) + management overhead + rework cost + friction cost vs. (onshore rate × hours) + lower overhead. Depending on the engagement type, the gap narrows considerably.
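The comparison above can be written out as a small calculation. This is a minimal sketch, not a benchmark: every rate, hour count, and overhead fraction below is a hypothetical input chosen for illustration, and the `Engagement` and `total_cost` names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Engagement:
    """Inputs for one development model. All values are hypothetical."""
    hourly_rate: float          # vendor rate in dollars per hour
    build_hours: float          # estimated engineering hours
    rework_fraction: float      # extra hours spent on rework, as a share of build_hours
    manager_fraction: float     # share of one internal manager's time spent on the engagement
    manager_hourly_cost: float  # loaded cost of that manager in dollars per hour
    duration_hours: float       # manager working hours over the life of the engagement
    friction_cost: float = 0.0  # flat estimate of the cost of delayed decisions


def total_cost(e: Engagement) -> float:
    """Vendor cost plus rework, internal management overhead, and friction."""
    vendor = e.hourly_rate * e.build_hours
    rework = vendor * e.rework_fraction
    management = e.manager_fraction * e.manager_hourly_cost * e.duration_hours
    return vendor + rework + management + e.friction_cost


offshore = Engagement(hourly_rate=60, build_hours=2000, rework_fraction=0.25,
                      manager_fraction=0.30, manager_hourly_cost=120,
                      duration_hours=1000, friction_cost=20_000)
onshore = Engagement(hourly_rate=150, build_hours=2000, rework_fraction=0.10,
                     manager_fraction=0.10, manager_hourly_cost=120,
                     duration_hours=1000)

rate_ratio = offshore.hourly_rate / onshore.hourly_rate
cost_ratio = total_cost(offshore) / total_cost(onshore)
print(f"rate ratio: {rate_ratio:.2f}, total cost ratio: {cost_ratio:.2f}")
```

With these made-up inputs the offshore rate is 40% of the onshore rate, but the total cost comes out at roughly 60%. The useful part is the structure of the calculation; substitute your own estimates before drawing any conclusion.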

---

## Quality Variability in Offshore Markets

Offshore development is not a monolithic category. Quality varies enormously — across countries, across vendors, and within vendors across individual engineers.

The offshore markets with the largest talent pools and the most mature software development industries are India, Eastern Europe (Poland, Romania, Ukraine, Czech Republic), and Latin America (Brazil, Argentina, Colombia, Mexico). Each has a different distribution of engineering talent, different typical specializations, and different cost ranges.

The variability within a single market is high. A top-tier software consultancy in Warsaw and a staff augmentation shop billing at a similar rate may produce dramatically different work. Evaluating offshore teams requires the same rigor as evaluating onshore vendors — technical interviews, code reviews, reference checks, and a structured trial engagement before committing to a full project.

The failure mode is selecting an offshore team based primarily on cost and presentation quality, without adequate technical vetting. This happens often enough that "we tried offshore and it didn't work" is a familiar story, frequently told by organizations that made the selection decision carelessly.

---

## Communication and Timezone Challenges

Most offshore engagement post-mortems trace their problems back to one practical engineering challenge: timezone management.

**Async-first development works when it is designed intentionally.** Teams that treat offshore development as an opportunity to run asynchronous workflows — detailed written specifications, documented decisions, structured code review, daily written status — can function effectively across large timezone gaps. Teams that try to replicate synchronous onshore workflows at a distance pay the friction cost heavily.

**Minimal overlap windows create bottlenecks.** When a US-based product team and an Eastern European engineering team share a 1-2 hour daily overlap, every decision that cannot be made asynchronously waits for the next overlap window. Architectural decisions, scope questions, and ambiguous requirements can sit for 24 hours per iteration.

**Communication quality matters more than communication quantity.** The discipline of writing clear specifications, documenting architectural decisions, and creating written acceptance criteria is more important in an offshore engagement than in a co-located one. Organizations that are not already good at written communication struggle significantly with offshore development.

**Latin America nearshore reduces timezone friction** for US-based teams. Colombia (UTC-5), Argentina (UTC-3), and Mexico (UTC-6) share substantial working hours with US time zones. The timezone friction that makes India or Eastern Europe challenging largely disappears. This is the primary advantage of the nearshore model for US companies.
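The overlap figures in this section can be checked with simple interval arithmetic. The sketch below is a simplification under stated assumptions: both teams work a hypothetical 09:00 to 17:00 local day, daylight saving time is ignored, and the `overlap_hours` function name is invented for this example.

```python
def overlap_hours(offset_a: float, offset_b: float,
                  start_local: float = 9.0, end_local: float = 17.0) -> float:
    """Hours per day during which two teams are both inside local working hours.

    Offsets are hours relative to UTC (for example -5 for UTC-5).
    Assumes both teams keep the same local window, that the window is at most
    12 hours long, and that daylight saving time can be ignored.
    """
    # Convert each team's local working window to UTC.
    a_start, a_end = start_local - offset_a, end_local - offset_a
    b_start, b_end = start_local - offset_b, end_local - offset_b

    best = 0.0
    # Compare against team B's previous, same, and next working day so that
    # windows crossing midnight UTC are handled.
    for day_shift in (-24.0, 0.0, 24.0):
        lo = max(a_start, b_start + day_shift)
        hi = min(a_end, b_end + day_shift)
        best = max(best, hi - lo)
    return best


us_east = -5  # US Eastern standard time
for name, offset in [("Colombia", -5), ("Argentina", -3), ("Mexico", -6),
                     ("Poland", 1), ("India", 5.5)]:
    print(name, overlap_hours(us_east, offset))
```

Under these assumptions, the Latin American offsets listed above share six to eight working hours with US Eastern time, Poland shares two, and India shares none. Shifted schedules change the numbers, which is why overlap is something to design deliberately.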

---

## IP and Data Security Considerations

Intellectual property exposure is a real risk in offshore development, not a reflexive fear. The risk is not that offshore developers are inherently less trustworthy — it is that legal recourse for IP theft across international jurisdictions is slow, expensive, and uncertain compared to domestic enforcement.

Practical IP protection measures for offshore engagements:

**Contractual protections with clear jurisdiction.** IP assignment clauses, non-disclosure agreements, and non-compete provisions need to be in place before any engagement begins. The governing law matters — contracts governed by US law with arbitration in the US are easier to enforce than contracts governed by local law in the vendor's country.

**Code access control.** Offshore team members should have access to the specific repositories and systems necessary for their work, and no more. Access should be revoked immediately when engineers leave the vendor team. This is true for all engagements, but the consequence of broad access rights is higher when legal recourse for misuse is uncertain.

**Avoiding proprietary algorithm exposure.** If your core IP is a specific algorithm, model, or business logic implementation, consider whether that component can be designed to remain onshore — with offshore teams working on adjacent, less sensitive components.

**Background checks.** Enterprise-level offshore vendors conduct background checks on their engineers; smaller shops may not. Ask specifically about their screening processes.

---

## Compliance Implications: PHI and Financial Data Offshore

This is the point where regulated industries diverge significantly from general software development in offshore vs. onshore analysis.

**PHI offshore is a HIPAA problem.** If your offshore development team has access to Protected Health Information — even in development or testing environments — those team members and their employer are potentially Business Associates under HIPAA. A BAA with an offshore development vendor is possible to obtain, but less common and less straightforward than with US-based vendors. More importantly, HIPAA enforcement across international borders is effectively non-existent. The risk profile of a data breach involving offshore contractors who had access to PHI is different from a breach involving onshore contractors.

The correct answer for healthcare software development is not to avoid offshore development — it is to ensure that offshore teams never have access to real PHI. Development and staging environments must use synthetic or de-identified data. This should be true for onshore teams as well, but the consequences of failure are more significant offshore.

**Financial data offshore creates similar considerations.** PCI DSS compliance requires documented controls over who has access to cardholder data environments. SOC 2 requires documented access controls. Offshore teams with access to production financial data need to be within the scope of these controls — which requires that your offshore vendor maintains equivalent compliance controls. Verifying this is non-trivial.

**The practical solution:** design your development environments so that no real sensitive data reaches the offshore team. Synthetic data, anonymized datasets, and mock services replace production data access. This is good engineering practice regardless of where the team is located, but it is essential for offshore development in regulated industries.
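One way to keep real data out of development environments is to generate records that have a realistic shape but no real origin. The following is a minimal sketch: the field names, value pools, and placeholder diagnosis codes are all hypothetical, and a real project would match them to its own schema.

```python
import random
import uuid
from datetime import date, timedelta

# Hypothetical value pools. Nothing here is derived from real patients.
FIRST_NAMES = ["Alex", "Jordan", "Sam", "Taylor", "Morgan"]
LAST_NAMES = ["Rivera", "Chen", "Okafor", "Novak", "Haddad"]
DIAGNOSIS_CODES = ["DX-001", "DX-002", "DX-003"]  # placeholders, not real codes


def synthetic_patient(rng: random.Random) -> dict:
    """Build one fake patient record from a seeded random generator."""
    birth = date(1940, 1, 1) + timedelta(days=rng.randrange(0, 365 * 70))
    return {
        # Derive the ID from the seeded generator so datasets are reproducible.
        "patient_id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
        "first_name": rng.choice(FIRST_NAMES),
        "last_name": rng.choice(LAST_NAMES),
        "date_of_birth": birth.isoformat(),
        "diagnosis_code": rng.choice(DIAGNOSIS_CODES),
    }


def synthetic_dataset(count: int, seed: int = 0) -> list[dict]:
    """Generate `count` records; the same seed always yields the same data."""
    rng = random.Random(seed)
    return [synthetic_patient(rng) for _ in range(count)]


for record in synthetic_dataset(3, seed=42):
    print(record)
```

Seeding the generator keeps test fixtures stable across machines, so a failing test in the offshore team's environment can be reproduced exactly onshore without anyone exchanging production data.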

---

## When Offshore Works Well

Offshore development is most effective when:

**Requirements are well-defined and stable.** Offshore teams can execute against clear specifications efficiently. They struggle with ambiguous requirements that require frequent clarification cycles, which the timezone gap makes expensive.

**The work is modular and parallelizable.** Offshore teams add the most value on work that can be scoped clearly, developed independently, and integrated through well-defined interfaces. Deep integration with an existing codebase that requires constant context-sharing is harder to offshore effectively.

**The engagement is long-term.** Short-term offshore engagements have higher setup costs (knowledge transfer, process establishment, access provisioning) relative to the duration. Teams that work together long enough to develop shared context and communication norms are more efficient.

**Your organization is already process-disciplined.** Offshore development amplifies your existing process quality. Teams with good specification practices, clear ticket management, disciplined code review, and documented architectural decisions get more out of offshore development than teams that rely on informal coordination.

**The work does not require deep regulatory expertise.** Frontend development, backend services with well-defined APIs, test automation, and infrastructure work can be offshored more cleanly than HIPAA compliance architecture, financial regulatory controls, or security-sensitive system design — which require expertise that is harder to verify and evaluate in offshore markets.

---

## When Offshore Creates Problems

Offshore development is likely to underperform when:

**Product requirements are evolving rapidly.** In early-stage product development, requirements change frequently as the team learns what works. The feedback cycle in an offshore engagement — write requirements, offshore team builds, review at next overlap window, iterate — is too slow for rapid product iteration.

**The codebase is complex and poorly documented.** Offshore teams inheriting complex, undocumented codebases spend significant time on knowledge acquisition that a co-located team would resolve through informal conversation. The timezone gap makes this worse.

**Your team lacks offshore management experience.** Managing offshore development is a distinct skill. Teams that have never done it before underestimate the specification clarity and management overhead required, and they pay for the learning curve in project outcomes.

**Security and compliance are core to the product.** Security architecture and compliance design require senior expertise and close collaboration with the client's compliance and legal teams. These decisions are expensive to get wrong, difficult to review from a distance, and sensitive enough that offshore access controls become a material concern.

---

## Nearshore as a Middle Ground

Nearshore development — vendors in countries with timezone alignment to the client — captures most of the cost advantage of offshore development while eliminating most of the timezone friction.

For US companies, Latin American nearshore markets (Colombia, Argentina, Mexico, Brazil) offer:

- Comparable or overlapping work hours with US East and Central time zones
- English proficiency comparable to Eastern Europe
- Cost structures that are higher than India but significantly lower than US onshore rates
- Easier travel for in-person collaboration when needed

The nearshore model has grown significantly and the talent pool has deepened. For US companies looking for a practical middle ground, nearshore Latin America is worth evaluating alongside other models.

---

## Hybrid Models

Many organizations that have tried purely offshore or purely onshore models end up at a hybrid approach: onshore architects and tech leads who design the system and own quality, offshore or nearshore execution teams who implement well-specified components.

This model works when:

- The onshore layer provides strong technical direction and detailed specifications
- The offshore/nearshore layer is stable and long-tenured (not rotating contractors)
- Code review is rigorous and happens at the onshore layer
- The offshore team has genuine senior engineers, not just execution-level developers

The failure mode of the hybrid model is treating the offshore team as a code factory that executes without judgment, rather than as engineers who need context, can identify problems early, and improve with feedback. Teams are not interchangeable resources; treating them as such produces commodity-quality output.

---

## Evaluating Offshore and Nearshore Teams

The evaluation process for offshore vendors should be more thorough, not less, than for onshore vendors:

1. **Technical interviews** — Interview the specific engineers who will work on your project, not representative engineers from the vendor's team
2. **Code review** — Examine actual production code from comparable past projects (with appropriate NDA protections)
3. **Reference checks with past clients** — Specifically ask about the communication quality, specification clarity requirements, and how they handled ambiguous requirements
4. **Process evaluation** — How do they handle requirements gaps? How are technical decisions documented? How is code reviewed?
5. **Paid trial engagement** — A time-bounded, well-scoped trial project before a full engagement commitment is standard practice for reducing selection risk

The vendor's sales presentation and portfolio are the least useful inputs for evaluating offshore development quality. The most useful inputs are references from past clients and a trial engagement.

---

## Making the Decision

The offshore vs. onshore decision for most engineering engagements is not a binary choice. The question is: for the specific scope of work, given your regulatory environment, your internal process maturity, and the nature of the product, which development model produces the best outcome at an acceptable total cost?

For regulated industries specifically (healthcare, legal, financial services), the compliance considerations narrow the decision space. Offshore access to sensitive data is avoidable through proper environment design, but offshore engagements still require more careful vendor evaluation and stronger contractual protections. Onshore development in the US is simpler from a compliance documentation perspective and creates a cleaner audit trail.

If you are working through a development model decision for a regulated industry product and want a grounded conversation about what has worked and what has not, [reach out](/contact). Our practice focuses specifically on [compliance-aware software development](/services/compliance-engineering) for organizations where the regulatory environment is a first-class constraint.

---

## Building a Scalable SaaS MVP on AWS: Architecture Decisions That Matter Early
Source: https://tampadynamics.com/blog/scalable-saas-mvp-on-aws

> Architecture guidance for SaaS founders building on AWS — covering multi-tenancy, auth, data isolation, and the decisions that are expensive to change later.

Date: 2026-02-11

The framing of "MVP speed vs. architectural debt" is a false choice more often than not. A few specific architecture decisions made poorly in month one will cost weeks or months of engineering time at month twelve. Other decisions that feel important early turn out not to matter at scale. The skill is knowing which is which.

This guide focuses on the architecture decisions that are genuinely expensive to change after you have customers — not a comprehensive AWS tutorial, but a precise map of where early choices compound.

---

## The Decisions That Are Expensive to Change

Not all architecture is created equal. Some decisions can be reversed or iterated on cheaply as you learn more. Others lock you in, because changing them requires migrating existing data, rewriting core systems, or breaking existing integrations.

The expensive-to-change decisions in a SaaS MVP on AWS:

- **Multi-tenancy model** — How you isolate customer data
- **Identity and authentication architecture** — How users and tenants are represented in your system
- **Database schema for tenant isolation** — Row-level vs. schema-level vs. silo
- **API design for multi-tenancy** — How tenant context propagates through your stack
- **IAM and permissions model** — How internal service-to-service permissions are structured

Get these right early, or at least make deliberate trade-offs you understand. Everything else — serverless vs. containers, which observability tool you use, how you handle caching — can be changed with far less pain.

---

## Multi-Tenancy Models

Multi-tenancy is the defining architectural challenge of SaaS. The question is not whether to share infrastructure — it is how to isolate tenant data within shared infrastructure.

### Row-Level Tenancy (Shared Tables)

All tenants share the same database and the same tables. A `tenant_id` column on every row is the only isolation mechanism. The application enforces tenant scoping in every query.

**Advantages**: Simplest to implement, lowest infrastructure cost, easy to add tenants, no schema migration required per tenant.

**Risks**: Every query that forgets a `WHERE tenant_id = ?` clause is a data leak. Cross-tenant data leakage through application bugs is a real risk, not a theoretical one. In regulated industries — healthcare, legal, finance — this model is difficult to defend in a security audit, because isolation depends entirely on application-layer correctness.

**When it makes sense**: Early-stage B2C or B2SMB SaaS where all tenants have the same data model and the regulatory bar is low. Move toward stronger isolation if you win enterprise customers or enter regulated markets.

### Schema-Level Tenancy

Each tenant gets their own database schema within a shared database instance. Queries are schema-scoped, which provides stronger isolation than row-level without the infrastructure overhead of fully separate databases.

**Advantages**: Stronger isolation than row-level. Schema-level migrations per tenant are possible. Easier to backup, restore, or move individual tenants.

**Risks**: Schema proliferation — managing hundreds of schemas in a single database instance creates operational complexity. Per-tenant schema migrations become a deployment concern. Connection pooling requires careful handling.

**When it makes sense**: Mid-market SaaS with a moderate number of tenants (dozens to low hundreds), where per-tenant customization of the data model is a real requirement.

### Silo (Database-per-Tenant)

Each tenant gets a dedicated database instance. Full infrastructure isolation.

**Advantages**: Maximum isolation. Per-tenant backup, restore, and migration. Clean story for enterprise security audits. Eliminates cross-tenant blast radius from application bugs.

**Risks**: Highest infrastructure cost. Operational complexity of managing many database instances. Adding tenants requires provisioning infrastructure. Harder to run cross-tenant analytics.

**When it makes sense**: Enterprise SaaS with security-sensitive customers, regulated industry requirements, or tenants that demand infrastructure isolation in their contracts. Also appropriate if tenant data volumes are large and per-tenant performance isolation matters.

### The Practical Recommendation for MVPs

Start with row-level tenancy if your market is unregulated and you need speed. Design your queries from day one with a `tenant_id` parameter that cannot be omitted — enforce this at the ORM or query builder level, not through code convention. Use database row-level security (RLS) in PostgreSQL as a defense-in-depth layer.

If you are building for healthcare, legal, or finance from the start, the row-level model will create problems at the first enterprise security review. Schema or silo isolation is worth the additional setup cost.
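For the row-level case, enforcing `tenant_id` at the query layer might look like the sketch below. It is an illustration under assumptions: `TenantScopedDb` is a hypothetical helper, the `invoices` table is invented, and SQLite from the standard library stands in for PostgreSQL so the example runs anywhere. It does not replace PostgreSQL row-level security, which would sit underneath as a second layer.

```python
import sqlite3


class TenantScopedDb:
    """Wraps a connection so that every query is bound to exactly one tenant.

    Hypothetical helper. Table and column names come from trusted application
    code, never from end users.
    """

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._conn = conn
        self._tenant_id = tenant_id

    def select_all(self, table: str) -> list[tuple]:
        # Identifiers cannot be bound as parameters, so validate them instead.
        if not table.isidentifier():
            raise ValueError(f"invalid table name: {table!r}")
        sql = f"SELECT * FROM {table} WHERE tenant_id = ?"
        return self._conn.execute(sql, (self._tenant_id,)).fetchall()

    def insert(self, table: str, values: dict) -> None:
        if not table.isidentifier() or not all(k.isidentifier() for k in values):
            raise ValueError("invalid table or column name")
        # The wrapper sets tenant_id itself, so callers cannot override it.
        row = {**values, "tenant_id": self._tenant_id}
        columns = ", ".join(row)
        placeholders = ", ".join("?" for _ in row)
        self._conn.execute(
            f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
            tuple(row.values()),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (tenant_id TEXT NOT NULL, amount INTEGER)")
TenantScopedDb(conn, "tenant-a").insert("invoices", {"amount": 100})
TenantScopedDb(conn, "tenant-b").insert("invoices", {"amount": 250})
rows_a = TenantScopedDb(conn, "tenant-a").select_all("invoices")
print(rows_a)
```

The design point is that application code never holds a raw connection: there is no method that can run a query without a tenant, so forgetting the filter becomes impossible by construction instead of something code review has to catch.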

---

## Authentication Architecture

Authentication is the second decision that is painful to change after launch. The choice of auth provider and the design of your user and tenant identity model will affect every part of your application.

### Amazon Cognito

Cognito is the natural AWS-native choice. It integrates with other AWS services, supports user pools and identity pools, and has a reasonable free tier.

**Where Cognito works well**: Applications that are deeply integrated with AWS services, where you need IAM role federation for fine-grained AWS resource access. Also reasonable for teams that want to stay within a single vendor.

**Where Cognito creates friction**: The developer experience for customization — custom authentication flows, custom token claims, flexible user attributes — is more complex than alternatives. The admin API is verbose. Multi-tenancy representation in Cognito (one User Pool per tenant vs. groups vs. custom attributes) requires deliberate design.

### Auth0 / Okta

More developer-friendly API, better customization, native multi-tenancy support through Organizations. Higher per-MAU cost at scale, vendor lock-in considerations.

**Where it makes sense**: Teams that need to move quickly on auth, products with complex SSO requirements (enterprise SAML/OIDC), or where the developer experience of the auth layer matters for velocity.

### Building on JWTs with Your Own Identity Layer

Rolling your own auth is almost never the right call for an MVP. The edge cases — token rotation, session invalidation, MFA, account recovery — consume engineering time that should go toward product. Use a managed auth provider and invest that time elsewhere.

### Tenant Representation in Your Identity Model

Regardless of which auth provider you use, you need a clear model for how users and tenants are represented. The common patterns:

- **Tenant in JWT claims** — The tenant ID is embedded as a custom claim in the auth token. The API reads tenant context from the token on every request. Simple, but the token becomes the source of truth for authorization.
- **Tenant in the request context** — The API resolves tenant context from the subdomain, request header, or a separate lookup on the authenticated user ID. More flexible, slightly more complex.

Design this model before you write your first API handler. Retrofitting tenant context into an API that was not designed with it is painful.
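To make the "tenant in JWT claims" pattern concrete, the sketch below verifies an HS256-signed token with the standard library and reads a custom claim. It is for illustration only: the `tenant_id` claim name and both function names are assumptions, expiry and audience checks are omitted, and a production system would rely on the auth provider's SDK or a maintained JWT library instead of hand-written verification.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Create an HS256-signed token. Stands in for the auth provider."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def tenant_from_token(token: str, secret: bytes) -> str:
    """Verify the signature, then return the hypothetical tenant_id claim."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise PermissionError("invalid token signature")
    claims = json.loads(_b64url_decode(payload_b64))
    tenant_id = claims.get("tenant_id")
    if not tenant_id:
        raise PermissionError("token has no tenant_id claim")
    return tenant_id


token = sign_token({"sub": "user-1", "tenant_id": "tenant-a"}, b"demo-secret")
print(tenant_from_token(token, b"demo-secret"))
```

Note that the tenant is read only after the signature check succeeds. Reading claims from an unverified token would let any caller pick their own tenant.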

---

## Data Isolation and Security Implications

The multi-tenancy model you choose has direct security implications beyond data leakage risk.

**Blast radius**: If a security vulnerability allows arbitrary data access, how much data is exposed? Row-level isolation means all tenant data in the database is potentially in scope. Silo isolation limits exposure to the compromised tenant's database.

**Encryption**: AWS RDS supports encryption at rest per instance. With silo isolation, you can use per-tenant KMS keys — a meaningful isolation enhancement for security-conscious tenants. With row-level or schema isolation in a shared database, all tenants share the same encryption key.

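**Backup and restore isolation**: With silo isolation, restoring one tenant's data does not require touching others. With a shared database, point-in-time restore works at the instance level and covers every tenant, so recovering a single tenant means restoring the whole database elsewhere and extracting that tenant's rows.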

**Compliance**: SOC 2, HIPAA, and enterprise security frameworks will ask how you isolate customer data. "Row-level isolation enforced by application logic" is an auditable answer — but it requires demonstrating that the application-layer controls are reliable and tested. Infrastructure-level isolation is a simpler story to tell.

---

## API Design for Multi-Tenant Products

Every API endpoint in a multi-tenant system needs to operate in tenant scope. There are two common patterns for expressing that scope:

**Subdomain-based routing** — `tenant-a.yoursaas.com` and `tenant-b.yoursaas.com` route to the same API, which resolves tenant context from the subdomain. Clean, intuitive for users, maps naturally to custom domain support.

**Header or path-based routing** — `api.yoursaas.com/v1/resources` with tenant context in a header (`X-Tenant-ID`) or path prefix (`/tenants/{tenant_id}/resources`). More explicit, easier to work with in API testing, but less natural for user-facing endpoints.

Whichever pattern you choose, the principle is the same: tenant context is resolved once at the entry point of the request and propagated through the call chain as a first-class parameter. It is not looked up piecemeal in individual service methods.
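A minimal sketch of that principle follows, assuming a hypothetical `resolve_tenant` entry-point function and an immutable `TenantContext` value. The `yoursaas.com` domain and `X-Tenant-ID` header are the placeholders used above, and treating `api` as a reserved subdomain is an assumption for this example.

```python
from dataclasses import dataclass
from typing import Mapping

BASE_DOMAIN = "yoursaas.com"    # placeholder domain from the examples above
RESERVED_SUBDOMAINS = {"api"}   # assumption: not a tenant


@dataclass(frozen=True)
class TenantContext:
    """Immutable tenant scope, created once per request."""
    tenant_id: str


def resolve_tenant(host: str, headers: Mapping[str, str]) -> TenantContext:
    """Resolve tenant context at the request entry point.

    Prefers the subdomain and falls back to the X-Tenant-ID header.
    """
    hostname = host.split(":")[0].lower()  # drop any port
    suffix = "." + BASE_DOMAIN
    if hostname.endswith(suffix):
        subdomain = hostname[: -len(suffix)]
        if subdomain and subdomain not in RESERVED_SUBDOMAINS:
            return TenantContext(subdomain)
    tenant_id = headers.get("X-Tenant-ID")
    if tenant_id:
        return TenantContext(tenant_id)
    raise LookupError("request carries no tenant context")


def list_resources(ctx: TenantContext) -> str:
    """Downstream code receives the context explicitly and never re-derives it."""
    return f"resources for {ctx.tenant_id}"


ctx = resolve_tenant("tenant-a.yoursaas.com", {})
print(list_resources(ctx))
```

A real handler would also confirm that the authenticated user belongs to the resolved tenant before doing any work; a header or subdomain on its own proves nothing about authorization.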

---

## Cost Optimization Early: Serverless vs. Containers

For most SaaS MVPs on AWS, the right compute answer is Lambda or ECS Fargate — not EC2 instances that you manage directly.

**Lambda** is appropriate when: request patterns are bursty and unpredictable, cold start latency is acceptable for the use case, and functions are small and stateless. Cost scales directly with usage — you pay nothing when idle.

**Fargate** is appropriate when: you have long-running workloads, consistent baseline traffic for which keeping Lambda functions permanently warm would be wasteful, or applications that are not naturally decomposed into function-sized units.

The cost optimization that matters most early is not serverless vs. containers; it is avoiding over-provisioning. An ECS service with an auto-scaling minimum of 1 task and a Lambda function with a 128 MB memory allocation will both be inexpensive at MVP scale. The mistake is provisioning for projected scale before you have validated usage patterns.

### Services That Scale Well

- **S3** — Object storage scales infinitely without management overhead
- **CloudFront** — CDN with no fleet to manage
- **SQS / SNS** — Message queuing and pub/sub with zero infrastructure
- **DynamoDB** (on-demand billing) — Scales to zero, pays per request
- **Lambda** — No idle cost, automatic scaling

### Services That Create Problems at Scale

- **RDS** — Does not scale horizontally for write-heavy workloads. Read replicas help for read scaling. Vertical scaling has limits and requires downtime. Plan your database access patterns early.
- **ElastiCache** — Cluster management, failover, and connection pooling add operational overhead. Consider whether a managed cache is necessary at MVP scale.
- **Single-region architecture** — Not a service, but a constraint. Cross-region replication and multi-region active-active are significant architectural commitments. Defer them until you have an actual customer requirement, but design with region-awareness from the start.

---

## CI/CD From Day One

CI/CD is not a luxury for later. The cost of shipping broken code to a production SaaS application with paying customers is high. The cost of setting up a basic CI/CD pipeline on day one is low.

Minimum viable CI/CD for a SaaS MVP:

1. **GitHub Actions or AWS CodePipeline** — Automated build and test on every pull request
2. **Staging environment** — A separate AWS environment that mirrors production, deployed to before production
3. **Infrastructure as code** — CloudFormation, CDK, or Terraform from the start. Clicking through the AWS console to configure production infrastructure is not repeatable and not auditable.
4. **Secrets management** — AWS Secrets Manager or Parameter Store for all credentials. No secrets in environment variables committed to source control.

If you are launching with a team of two, this setup takes a day. If you defer it until you have a team of ten and a production system with customers, retrofitting it costs far more than that.
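For the secrets management item, reading credentials at runtime might look like the sketch below. The `get_secret_value` call and its `SecretString` response field belong to the boto3 Secrets Manager client; everything else, including the function name, the `prod/app/db` secret name, and the JSON keys, is hypothetical. The client is passed in as a parameter so the example runs without AWS access.

```python
import json
from typing import Any


def load_db_credentials(secrets_client: Any, secret_id: str) -> dict:
    """Fetch a JSON secret at runtime instead of storing it in source control.

    `secrets_client` is expected to be a boto3 Secrets Manager client, created
    with boto3.client("secretsmanager"). It is injected so that this function
    can be tested with a stand-in.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """Local stand-in that mimics the one boto3 call used above."""

    def get_secret_value(self, SecretId: str) -> dict:
        # Hypothetical payload; a real secret would hold real credentials.
        return {"SecretString": json.dumps(
            {"username": "app", "password": "not-a-real-password"}
        )}


credentials = load_db_credentials(FakeSecretsClient(), "prod/app/db")
print(credentials["username"])
```

In a deployed service the fake would be replaced by the real client, with the task or function role granted read access to that one secret and nothing broader.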

---

## What NOT to Build Custom in an MVP

The list of things SaaS founders should not build from scratch at MVP stage:

- **Authentication and user management** — Use Cognito, Auth0, or Clerk
- **Email delivery** — Use SES with a transactional email service layer (Postmark, Resend, or similar)
- **Payment processing** — Use Stripe. The alternative is months of work and PCI compliance scope
- **Full-text search** — Use OpenSearch or Algolia rather than implementing search logic against a relational database
- **Feature flags** — Use LaunchDarkly or AWS AppConfig rather than hardcoded flags in your codebase
- **Analytics and event tracking** — Use a managed product analytics tool. Building your own event pipeline is a distraction at MVP stage.

The principle behind this list is not that these things are technically difficult. Some of them are not. It is that they are not your product. Your product is the workflow, the data model, and the user experience that solves your customer's specific problem. Every hour spent building an authentication system is an hour not spent on that.

---

## Starting Strong

The SaaS MVPs that scale well are not the ones that made the most sophisticated technical choices early — they are the ones that made the right trade-offs. Row-level tenancy with disciplined application-layer enforcement is the right call for some products. Silo isolation is the right call for others. The mistake is not picking either of those; it is not thinking about it at all.

The same principle applies to auth, API design, and CI/CD: these decisions do not have to be perfect, but they have to be deliberate.

If you are building a SaaS product and want a structured conversation about the architecture decisions that will matter at your specific stage and in your specific market, [an architecture review](/contact) is a focused engagement to work through exactly these questions. Our [cloud architecture practice](/services/cloud-architecture) and [SaaS development work](/services/saas-development) cover this territory across a range of product types and regulatory contexts.

---

## SOC 2 Compliance Checklist for SaaS Companies
Source: https://tampadynamics.com/blog/soc2-compliance-checklist-saas

> A technical checklist for SaaS founders preparing for SOC 2 Type II. Covers access controls, logging, encryption, change management, and vendor oversight — written for engineering teams.

Date: 2025-12-23

SOC 2 is a trust framework, not a security certification. The distinction matters because many SaaS companies approach an audit as a paperwork exercise and emerge with a report that tells customers very little about the actual security of the system. Done correctly, SOC 2 is an engineering discipline — one that forces rigor in access control, logging, change management, and incident response.

This checklist is written for engineering and technical leadership at SaaS companies preparing for a SOC 2 Type II audit. It covers what auditors actually look at, the controls that most commonly have gaps, and the sequencing decisions that affect how painful the process is.

---

## Type I vs. Type II: What the Difference Actually Means

A SOC 2 Type I report describes the design of your controls at a single point in time. An auditor reviews your documented policies, inspects your configurations, and opines on whether the controls, as designed, appear capable of meeting the applicable Trust Service Criteria.

A Type II report covers operating effectiveness over an observation period — typically six to twelve months. The auditor is not just reading your policies; they are testing whether the controls actually functioned throughout the period. That means log evidence, access review records, change management tickets, incident documentation, and vendor review artifacts — all timestamped and covering the full observation window.

Type I is sometimes used as a stepping stone, but enterprise buyers and their procurement teams increasingly require Type II. If your goal is enterprise sales, plan for Type II from the beginning.

The practical implication for engineering: your observation period starts before you engage an auditor. Controls need to be live and producing evidence before the clock starts. If you implement audit logging in November and your observation period starts October 1, you have a gap.
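The gap described above is simple date arithmetic, and it is worth checking before the window opens. In this sketch the control names, the year, and the `evidence_gaps` function are hypothetical; the October 1 start and November go-live mirror the example in the text.

```python
from datetime import date


def evidence_gaps(observation_start: date,
                  control_live_dates: dict[str, date]) -> dict[str, int]:
    """Return, per control, how many days of the window have no evidence.

    Controls that were live on or before the start date are omitted.
    """
    return {
        name: (live - observation_start).days
        for name, live in control_live_dates.items()
        if live > observation_start
    }


gaps = evidence_gaps(
    observation_start=date(2025, 10, 1),
    control_live_dates={
        "audit_logging": date(2025, 11, 1),    # went live after the window opened
        "mfa_enforcement": date(2025, 9, 15),  # live before the window opened
    },
)
print(gaps)
```

Here audit logging is missing evidence for the 31 days of October. The usual remedies are to move the observation start date later or to accept a documented exception, both of which are easier to negotiate before the audit than during it.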

---

## The Five Trust Service Criteria

SOC 2 is built on the AICPA Trust Service Criteria (TSC). Security is the only required criterion — the others are optional and elected based on what your customers care about and what your product does.

**Security (CC)** — The Common Criteria. Required. Covers logical and physical access controls, risk management, change management, monitoring, and incident response. This is the baseline that all SOC 2 reports include.

**Availability (A)** — The system is available for operation and use as committed. Relevant if your customers have uptime SLAs or your product is operationally critical. Requires documented uptime monitoring, incident response, and disaster recovery.

**Confidentiality (C)** — Information designated as confidential is protected. Relevant if you handle proprietary business data, trade secrets, or non-PHI sensitive information. Requires data classification, encryption, and access controls scoped to confidential data.

**Processing Integrity (PI)** — System processing is complete, valid, accurate, timely, and authorized. Relevant if you process financial transactions, payroll, or other integrity-sensitive workflows. Requires input validation, error handling, and output completeness controls.

**Privacy (P)** — Personal information is collected, used, retained, disclosed, and disposed of in conformity with your privacy notice and applicable criteria. Relevant if you handle personal data for individuals. Overlaps significantly with GDPR and CCPA obligations.

Most SaaS companies elect Security plus one or two others. If you process payments or handle financial data for customers, add Processing Integrity. If you are in a B2B context storing customer business data, Confidentiality is common. Healthcare products almost always add Availability given clinical dependencies.

---

## The Engineering Checklist

What follows is organized by control domain. Each item reflects what auditors will look for evidence of — not just policy documentation, but operational evidence that the control functioned during the observation period.

### Access Controls

- [ ] Unique user accounts for every individual — no shared credentials, no team accounts
- [ ] Multi-factor authentication enforced for all internal systems, cloud consoles, and production access
- [ ] Privileged access management: production access is separated from development access, with documented justification for each user who has production credentials
- [ ] Role-based access control with documented role definitions — what each role can access and why
- [ ] Access provisioning is tied to an onboarding process with documented approvals
- [ ] Quarterly (or more frequent) access reviews: evidence that you are reviewing who has access and removing stale accounts
- [ ] Offboarding procedures result in access revocation within a defined timeframe (24-48 hours is a common commitment)
- [ ] Automated deprovisioning where possible — SSO with SCIM provisioning makes access review evidence far cleaner
- [ ] Emergency access procedures documented and tested
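
As an illustration of what access review evidence can look like, here is a minimal sketch that flags stale accounts and writes a dated review record. The account fields, the 90-day threshold, and the function names are assumptions for the example, not a prescribed format.

```python
import csv
import io
from datetime import date, timedelta

FIELDS = ["review_date", "user", "role", "last_login", "decision"]

def review_access(accounts, today, stale_after_days=90):
    """Build one review row per account; accounts idle past the threshold are marked for revocation."""
    cutoff = today - timedelta(days=stale_after_days)
    rows = []
    for acct in accounts:
        rows.append({
            "review_date": today.isoformat(),
            "user": acct["user"],
            "role": acct["role"],
            "last_login": acct["last_login"].isoformat(),
            "decision": "revoke" if acct["last_login"] < cutoff else "retain",
        })
    return rows

def to_csv(rows):
    """Serialize review rows to CSV text that can be stored as audit evidence."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical account inventory.
accounts = [
    {"user": "alice", "role": "engineer", "last_login": date(2026, 1, 10)},
    {"user": "bob", "role": "admin", "last_login": date(2025, 6, 1)},
]
rows = review_access(accounts, today=date(2026, 1, 15))
print(to_csv(rows))
```

The point is the artifact: a timestamped record of who was reviewed and what was decided, rather than a conversation that leaves no trace.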

### Audit Logging and Monitoring

- [ ] Authentication events logged: successful logins, failed logins, MFA challenges
- [ ] Privileged actions logged: production access events, administrative configuration changes, data exports
- [ ] Application-level events logged for security-relevant operations
- [ ] Logs are centralized — not scattered across individual servers or services
- [ ] Log integrity: logs cannot be modified or deleted by the application or by standard user roles
- [ ] Log retention meets your defined retention policy (common commitment: 12 months)
- [ ] Automated alerting on anomalous events: failed login spikes, unusual access patterns, configuration changes outside business hours
- [ ] Alerts are reviewed and responded to — documented evidence of the review process
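
One way to make log integrity demonstrable is a hash chain, where each entry commits to the one before it. The sketch below is an illustration under that assumption; it makes tampering detectable but does not prevent it, so it complements write-restricted storage rather than replacing it. All names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry

def _digest(prev_hash, event):
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_entry(chain, event):
    """Append an event whose hash covers both the event and the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)})

def verify_chain(chain):
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"actor": "alice", "action": "login_success"})
append_entry(log, {"actor": "bob", "action": "data_export"})
print(verify_chain(log))
```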

### Encryption

- [ ] Data at rest encrypted — document the algorithm and key management approach
- [ ] Data in transit encrypted with TLS 1.2 or higher for all endpoints — TLS 1.3 preferred
- [ ] Encryption key management documented: who holds keys, rotation schedule, what happens when personnel with key access depart
- [ ] Secrets are managed via a secrets manager (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager) — not environment variables in application code or configuration files
- [ ] No plaintext secrets in version control — confirmed via secret scanning in your CI pipeline
- [ ] Backup data encrypted with the same or equivalent controls as primary data
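
For the TLS item, a minimal sketch of enforcing the floor on outbound connections with Python's standard `ssl` module. The helper name is hypothetical; inbound endpoints are usually configured at the load balancer or web server instead.

```python
import ssl

def make_client_context():
    """Client-side TLS context that verifies certificates and refuses anything below TLS 1.2."""
    ctx = ssl.create_default_context()  # certificate and hostname checks are on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

ctx = make_client_context()
print(ctx.minimum_version)
```

TLS 1.3 is still negotiated when both sides support it; the setting only rules out older protocol versions.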

### Change Management

- [ ] All production changes go through a defined process — documented, not just practiced
- [ ] Code review required before merge to main: evidence that changes were reviewed by at least one person other than the author
- [ ] Separation of duties for deployments: the person who writes code cannot be the only person who deploys it (some auditors will accept a compensating control if team size makes this impractical — document it)
- [ ] Automated testing in CI pipeline before production deployment
- [ ] Change approvals documented — a ticketing system or PR approval record suffices
- [ ] Emergency change process documented for out-of-band deployments, with post-incident documentation requirements
- [ ] Infrastructure changes managed as code where possible — reduces drift and creates an audit trail
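
The code review item can be spot-checked from an export of merged changes. This is a sketch over a hypothetical record shape (`id`, `author`, `approved_by`); a real export from your source control system will have different field names.

```python
def unapproved_changes(pull_requests):
    """Return IDs of merged changes with no approval from someone other than the author."""
    findings = []
    for pr in pull_requests:
        independent = {r for r in pr["approved_by"] if r != pr["author"]}
        if not independent:
            findings.append(pr["id"])
    return findings

# Hypothetical export of merged pull requests.
merged = [
    {"id": "PR-101", "author": "alice", "approved_by": ["bob"]},
    {"id": "PR-102", "author": "bob", "approved_by": ["bob"]},   # self-approval only
    {"id": "PR-103", "author": "carol", "approved_by": []},      # no approval at all
]
print(unapproved_changes(merged))
```

Running a check like this periodically surfaces exceptions while they can still be documented, instead of during fieldwork.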

### Vendor Management

- [ ] Vendor inventory maintained: all third-party services that have access to your systems or customer data
- [ ] Security reviews conducted for critical vendors before onboarding
- [ ] Annual (or more frequent) reviews of critical vendor SOC 2 reports or equivalent
- [ ] Contracts with subprocessors that handle customer data include appropriate security and data handling obligations
- [ ] If you are in scope for HIPAA: BAAs in place with every vendor that processes PHI
- [ ] A process exists for monitoring vendor security incidents and assessing impact on your environment
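
A vendor inventory is easier to keep current when overdue reviews are computed rather than remembered. A minimal sketch, assuming each inventory entry records criticality and the date of the last review; the field names and the 365-day interval are illustrative.

```python
from datetime import date, timedelta

def overdue_vendor_reviews(vendors, today, max_age_days=365):
    """Return critical vendors that were never reviewed or whose last review is too old."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        v["name"]
        for v in vendors
        if v["critical"] and (v["last_review"] is None or v["last_review"] < cutoff)
    )

# Hypothetical vendor inventory.
vendors = [
    {"name": "cloud-host", "critical": True, "last_review": date(2025, 11, 1)},
    {"name": "email-api", "critical": True, "last_review": date(2024, 9, 1)},
    {"name": "llm-provider", "critical": True, "last_review": None},
    {"name": "design-tool", "critical": False, "last_review": None},
]
print(overdue_vendor_reviews(vendors, today=date(2026, 1, 15)))
```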

### Incident Response

- [ ] Incident response plan documented — not a template, but a plan tailored to your environment and team
- [ ] Incident classification defined (severity levels with documented response requirements per level)
- [ ] On-call or escalation path defined and tested
- [ ] Post-mortems documented for significant incidents during the observation period
- [ ] Customer notification procedures documented for security incidents that affect customer data
- [ ] Tabletop exercise conducted during the observation period — evidence of the exercise and its outcomes
- [ ] Communication channels tested: can you reach your team and your customers if primary systems are unavailable?

---

## Common Gaps That Cause Audit Findings

**Access reviews with no evidence.** Many teams conduct quarterly access reviews verbally or in a Slack thread. Auditors need a record — a spreadsheet, a ticketing system export, or a documented process output — showing that the review occurred and what actions were taken as a result.

**Logging coverage gaps.** Application logging and infrastructure logging often exist in separate systems. Auditors look for comprehensive coverage. If your CloudTrail is configured but your application-level events are not centralized anywhere, you have a gap.

**Change management exceptions.** Emergency deployments that bypass the normal change process are common. What auditors look for is whether you have a documented exception process and whether those exceptions are tracked and reviewed. Undocumented hotfixes to production are findings.

**Vendor reviews that stopped at the contract stage.** Signing a DPA or a security addendum with a vendor is necessary but not sufficient. Auditors expect ongoing review — annual SOC 2 attestation verification, monitoring vendor breach disclosures, re-assessing critical vendors when they make significant changes.

**Policies that do not match practice.** The most common finding is a policy that says one thing and operational evidence that shows something different. Your password complexity policy says 16 characters minimum; your identity provider is configured to accept 8. The fix is either to update the policy or to update the configuration — but the gap itself is a finding.

**Separation of duties with single-person teams.** Small teams legitimately cannot separate every duty. Auditors understand this. What they require is that you document the compensating controls: enhanced monitoring, periodic manager review, or other mechanisms that reduce the risk. The finding comes when there is neither separation nor a documented compensating control.
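
The policy-versus-practice gap described above lends itself to an automated comparison. The sketch below assumes you can express policy commitments and observed configuration as flat key/value pairs; the keys are hypothetical and mirror the password length example.

```python
def policy_drift(policy, actual):
    """Return every setting where observed configuration differs from the documented policy."""
    return {
        key: {"policy": expected, "actual": actual.get(key)}
        for key, expected in policy.items()
        if actual.get(key) != expected
    }

# Documented policy versus what the identity provider is actually configured to enforce.
policy = {"password_min_length": 16, "mfa_required": True}
observed = {"password_min_length": 8, "mfa_required": True}
print(policy_drift(policy, observed))
```

Each entry in the result is resolved by changing either the policy or the configuration; the check only makes the mismatch visible before an auditor does.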

---

## Observation Period Timing and Auditor Engagement

The observation period is the window during which auditors collect evidence of control operation. Six months is the minimum for a credible Type II report; twelve months is more common and more credible to enterprise buyers.

The practical sequencing:

1. **Implement controls.** Every control on your checklist needs to be live and producing evidence before the observation period starts. Do not start the clock until your logging is centralized, your access reviews are scheduled, and your change management process is documented and followed.

2. **Run the controls for a period.** Many teams run for 30-60 days post-implementation before formally engaging an auditor. This surfaces operational gaps before the audit clock starts.

3. **Engage an auditor for readiness assessment.** Most audit firms offer a pre-audit readiness review. This is worth doing. It surfaces gaps that would otherwise appear as findings in the final report.

4. **Observation period begins.** This is typically agreed upon between you and your auditor. The auditor collects evidence at the end of the period, not throughout it — but the evidence they collect (logs, access review records, change tickets) must span the full period.

5. **Evidence collection and fieldwork.** The auditor requests evidence packages. Having a well-organized evidence repository — organized by control domain, with clear naming and date metadata — significantly reduces the friction here.

6. **Report issuance.** A Type II report contains a description of your system and controls, the auditor's opinion on control design, and testing results for each control over the observation period. Findings (exceptions) are documented with your management's response.
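
For the evidence repository mentioned in step 5, one possible naming convention is sketched below. The directory layout, control ID format, and function name are assumptions; the useful property is that domain, control, and collection date are recoverable from the path alone.

```python
from datetime import date
from pathlib import PurePosixPath

def evidence_path(domain, control_id, collected, description, ext="pdf"):
    """Build a path of the form evidence/<domain>/<control>/<date>_<description>.<ext>."""
    slug = "-".join(description.lower().split())
    return PurePosixPath("evidence") / domain / control_id / f"{collected.isoformat()}_{slug}.{ext}"

path = evidence_path("access-controls", "AC-06", date(2026, 3, 31), "Quarterly access review", "csv")
print(path)
```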

The typical engagement timeline from control implementation to report issuance is 12-18 months for a first-time SOC 2 Type II. Teams that try to compress this timeline often end up with short observation periods that raise questions with sophisticated buyers, or they end up rushing control implementation into the observation window and accumulating findings.

---

## Frequently Asked Questions

### How much does a SOC 2 Type II audit cost?

Audit fees from established CPA firms range from $20,000 to $60,000 for a first-time Type II engagement, depending on the scope of criteria elected and the complexity of your environment. Compliance automation platforms (Vanta, Drata, Secureframe) typically cost $10,000-$25,000 per year and can reduce the audit fee by streamlining evidence collection. Budget for both the audit firm and the tooling.

### Do we need SOC 2 compliance if we are a small startup?

Not immediately. SOC 2 becomes a practical requirement when you are selling to enterprise buyers, to regulated industries (healthcare, financial services, government), or to any customer whose procurement process includes a security questionnaire. Many early-stage SaaS companies implement the controls without pursuing a formal audit, then engage an auditor once enterprise sales require it.

### What is the difference between SOC 2 and ISO 27001?

Both are information security frameworks verified by an independent third party: SOC 2 results in an attestation report from a CPA firm, while ISO 27001 results in a certification from an accredited certification body. SOC 2 is US-centric and more common in US enterprise sales. ISO 27001 is internationally recognized and required by some European and global enterprise buyers. The controls overlap significantly; organizations that pursue both typically use a unified control framework rather than running two separate programs.

### Can we use a compliance automation platform instead of an auditor?

Compliance automation platforms (Vanta, Drata, Secureframe) automate evidence collection, monitor your environment for control drift, and connect to your infrastructure via API. They do not replace the auditor — you still need a licensed CPA firm to conduct the audit and issue the report. What they replace is much of the manual evidence collection work and the spreadsheet-driven control tracking that otherwise consumes significant engineering time.

---

## Build Controls That Actually Work

A SOC 2 report is evidence of the controls you operate, not the controls you intend to operate. The teams that come through audits cleanly are the ones that implemented controls before the observation period, ran them consistently, and maintained operational evidence as a byproduct of normal engineering practice — not as a compliance exercise bolted on at the end.

If your team is preparing for a first SOC 2 engagement and wants a structured review of your current control posture before the observation period begins, that is a useful conversation to have early. [Reach out to discuss your architecture](/contact) and where the gaps are most likely to surface.

---

## Tampa Bay Technology Landscape 2026: A Practical Guide for Growing Companies
Source: https://tampadynamics.com/blog/tampa-bay-tech-landscape-2026

> An honest look at Tampa Bay's tech ecosystem in 2026 — talent, infrastructure, industry verticals, coworking, and what makes the market different from coastal tech hubs.

Date: 2026-02-17

Tampa Bay's technology ecosystem has changed materially in the past five years. The pandemic-driven relocation wave that began in 2020 brought a mix of remote workers, founders, and enterprise teams who chose Tampa for reasons that are now structural advantages: no state income tax, lower cost of living relative to coastal tech hubs, a growing talent base, and enough of a city to actually enjoy living in.

What Tampa Bay's tech scene is not, in 2026, is Silicon Valley. The VC ecosystem is thinner, the density of senior engineering talent is lower than San Francisco or New York, and the startup exit history is shorter. Companies that relocated here expecting a comparable infrastructure were often disappointed.

What it is, is a legitimate and increasingly functional market for growing technology companies — particularly those in financial services, healthcare, defense, and enterprise software. This guide is intended to be practically useful for founders and operators who are evaluating Tampa Bay or who are already here and want to understand the landscape better.

---

## The Growth Trajectory

Tampa Bay experienced significant population growth between 2020 and 2025, driven by Florida's general in-migration trend and by specific relocations of financial services and technology firms. The metro area crossed 3.3 million residents, placing it among the 20 largest metros in the US.

The technology sector grew within that context but not uniformly. The strongest growth has been in:

- **Financial technology and financial services operations** — Teams from major banks and fintech companies relocated or expanded Tampa operations, attracted by the financial services talent base that already existed around the Wells Fargo, Citibank, and JPMorgan Chase operations established in earlier decades
- **Healthcare technology** — Tampa Bay's existing healthcare infrastructure (described below) anchored health tech company growth
- **Defense and government contracting technology** — Proximity to MacDill Air Force Base and CENTCOM created demand for government technology contractors
- **Remote-first companies** that chose Tampa as a physical anchor for distributed teams

The growth that did not materialize at the same pace: consumer tech startups and venture-backed moonshots. Tampa Bay venture funding remains concentrated in later-stage deals and real estate-adjacent technology. If you are raising a seed round from local VCs for a consumer app, Tampa is not your best market.

---

## Financial Services: The Deepest Industry Cluster

Financial services is Tampa Bay's strongest technology employment vertical. The concentration of financial institutions with significant Tampa operations is meaningful:

**Citi** has one of its largest global technology operations in Tampa, employing thousands of technologists across software engineering, data, and infrastructure. The Citi Tampa presence has anchored a community of financial services engineers who have circulated through Citi and then moved to fintech startups, other banks, and technology companies in the area.

**JPMorgan Chase** has a significant technology operations presence in Tampa, focused on retail banking and commercial banking systems. This adds another substantial pool of mid-career financial services technologists to the local market.

**Raymond James** is headquartered in St. Petersburg (across the bay), with a large technology team that has contributed significantly to the wealth management and financial planning technology ecosystem in the region.

**USAA, Charles Schwab, and Franklin Templeton** all have meaningful Tampa Bay operational presences, with technology teams that span engineering, data science, and cybersecurity.

For companies building financial services technology — payments, trading infrastructure, compliance tooling, wealth management software — Tampa Bay offers a concentrated talent pool and a network of potential customers and partners that is genuine, not manufactured.

---

## Healthcare and Health Tech

Tampa Bay is one of the larger healthcare markets in the Southeast US, anchored by several major systems:

**BayCare Health System** is one of the largest not-for-profit health systems in the country, operating more than a dozen hospitals and hundreds of ambulatory facilities across the Tampa Bay region. BayCare has invested in health IT capabilities and is a meaningful reference customer for health technology companies operating in the region.

**AdventHealth** has a significant Florida presence, with campuses across the Tampa Bay area. Their technology and innovation investments have created partnership opportunities for health tech companies, particularly those focused on patient engagement, care management, and clinical decision support.

**Moffitt Cancer Center**, a National Cancer Institute-designated comprehensive cancer center, is located in Tampa. Moffitt's research focus and patient volume make it a meaningful environment for oncology-adjacent technology companies, AI-assisted diagnostics, and clinical research tools.

**Tampa General Hospital** is a major academic medical center affiliated with the University of South Florida Health. USF Health encompasses the USF medical, pharmacy, nursing, and public health schools, creating a clinical research environment that supports health technology development.

For health technology companies — particularly those in clinical workflow, patient engagement, or health data infrastructure — Tampa Bay offers both customers and a talent pool with healthcare operations experience. The presence of major health systems also means that regulatory and compliance expertise (HIPAA, HL7/FHIR, clinical workflows) is findable in the local market.

---

## Defense Technology

MacDill Air Force Base, located on a peninsula in south Tampa, is the headquarters of US Central Command (CENTCOM) and US Special Operations Command (SOCOM). These are among the most operationally significant military commands in the US, and their presence drives a substantial government technology contracting ecosystem.

Defense technology companies — ranging from small specialized contractors to large primes with Tampa offices — cluster around MacDill. For software companies with defense or government technology ambitions, Tampa Bay is a meaningful market. Security clearance holders are relatively more common here than in most comparable metro areas.

The defense tech cluster is somewhat separate from the commercial tech ecosystem — different event circuits, different hiring pools, different customers — but it contributes meaningfully to the overall engineering talent base in the region.

---

## Tech Talent Landscape

### Universities

**University of South Florida** is the anchor institution for Tampa Bay's technology talent pipeline. USF's computer science, data science, and engineering programs produce a significant number of graduates annually. The quality of the engineering pipeline has improved as the university's research profile has grown.

**University of Tampa** produces a smaller engineering and business technology talent pool, with stronger representation in business and finance-adjacent technology roles.

**Florida Polytechnic University** in Lakeland (about 45 minutes east of Tampa) focuses specifically on STEM and has grown its technology programs meaningfully, though it is still a relatively young institution.

**HCC (Hillsborough Community College)** and local technical programs produce entry-level technology talent through bootcamps, associate degrees, and certification programs.

### The Talent Reality

Honest assessment: Tampa Bay has a good supply of early-career engineers and a reasonable supply of mid-career technologists in the verticals where Tampa has industry concentration (financial services, healthcare, defense). Senior engineers with 10+ years of experience who specialize in specific technology stacks are scarcer here, and harder to hire, than in a major coastal tech hub.

Remote work has changed this equation in both directions. Tampa Bay companies can compete for senior engineers who want to live in Tampa for lifestyle reasons, even if those engineers could earn more in San Francisco. Conversely, Tampa Bay companies now compete with distributed-first employers who offer the ability to work from Tampa plus a national salary scale.

Compensation expectations have normalized closer to national benchmarks than they were pre-pandemic. Budget accordingly.

---

## Coworking and Office Space

Tampa Bay's coworking market grew substantially between 2020 and 2025 and has since consolidated around a smaller set of well-operated spaces.

**Armature Works** in the Tampa Heights neighborhood has become one of the more prominent tech-adjacent meeting and coworking destinations. The building's combination of food hall, event space, and office space, combined with its riverfront location, makes it a natural choice for client meetings and small team gatherings.

**Industrious** operates several locations in the Tampa market, including downtown and Westshore, targeting the professional and enterprise segment of the coworking market.

**WeWork** locations in Tampa have fluctuated with the company's broader operational changes; verify current availability before planning around them.

**St. Petersburg** (across the bay, approximately 30-45 minutes from downtown Tampa) has developed a coworking and tech community of its own, centered around the downtown St. Pete area. Companies based in St. Petersburg and Pinellas County have access to a slightly different talent pool and a lower commercial real estate cost basis.

For companies choosing between shared space and dedicated office: downtown Tampa commercial real estate costs are substantially below comparable space in Chicago, New York, or San Francisco. For teams of 10-50 people that want dedicated space, leasing in Tampa is often more cost-effective than paying premium per-seat rates for flexible space.

---

## No State Income Tax: The Actual Impact

Florida has no state income tax. For individuals earning above $150,000-200,000, this is a meaningful take-home pay difference compared to California (13.3% top rate), New York (10.9% top rate), New Jersey (10.75% top rate), or Massachusetts (9%).

For recruiting, this is a real advantage when competing for senior engineers who are weighing a Tampa relocation against offers in high-tax states. It does not differentiate Tampa from other no-income-tax states (Texas, Washington). For founders who have historically lived in high-tax states, the personal financial impact of a Florida relocation is often more significant than expected.

The nuance: Florida funds its government through property taxes, sales taxes, and business-related fees, which are not trivial. The advantage is specifically the income tax elimination, not an overall low-tax environment.

---

## Tampa vs. Miami: A Practical Comparison

Both are Florida tech markets, and the comparison comes up frequently. They are quite different in practice.

**Miami** has attracted more venture capital, more international capital, and a more visible startup community — particularly in crypto, fintech, and real estate technology. The Miami ecosystem is more internationally oriented, with significant Latin American business connections. Cost of living in Miami proper is higher than Tampa, and real estate costs have increased dramatically since 2020.

**Tampa** has a deeper concentration in financial services, healthcare, and defense technology. The startup community is smaller and less venture-oriented, but the enterprise and mid-market customer base is more accessible. Tampa's cost of living advantage over Miami has grown as Miami has become more expensive.

For companies building B2B enterprise software for regulated industries — financial services, healthcare, compliance — Tampa Bay's customer concentration and industry talent pool is often more valuable than Miami's startup ecosystem. For consumer technology and venture-backed early-stage startups seeking funding and media visibility, Miami is currently the more active market.

---

## Areas to Watch

**Tampa Heights and the Riverwalk corridor** — The area north of downtown Tampa, anchored by Armature Works, has seen significant investment and development. Several technology companies have taken space in the area, and the density of restaurants, coworking, and event venues makes it a natural gathering point.

**Ybor City** — Tampa's historic neighborhood, immediately east of downtown, has been through multiple cycles of development. The combination of creative space, historic architecture, and proximity to downtown has attracted some technology and creative companies looking for character and lower rents.

**Westshore** — The established business district between downtown Tampa and the airport remains the largest concentration of office space in the market. Less character than Tampa Heights, but strong infrastructure, parking, and proximity to Tampa International Airport.

**St. Petersburg downtown** — The St. Pete tech community, centered around the waterfront and Beach Drive area, has grown steadily and attracted companies that want a more manageable scale and a slightly different culture than Tampa proper.

---

## Challenges to Know

**Venture capital is thin.** If you need institutional venture capital, Tampa Bay's local VC ecosystem is limited. There are family offices, real estate capital, and some angel networks, but the density of institutional VCs writing Series A checks is not comparable to the coasts. Tampa Bay companies that have raised institutional rounds have generally done so from out-of-market investors. This is a structural limitation that is unlikely to change quickly.

**Senior talent competition is real.** Despite the growth, the number of senior engineers and technical leaders available for in-person roles is lower than in major tech hubs. Distributed-first hiring models are common among Tampa Bay companies for this reason.

**Industry networking requires intention.** The Tampa Bay tech community exists but is more diffuse than in dense coastal markets. Organizations like Tampa Bay Wave, Embarc Collective, and the Tampa Bay Technology Forum provide structured networking. Without actively participating in those networks, you can spend significant time in Tampa without connecting with the broader community.

**Hurricane season.** Inland parts of Tampa and St. Petersburg have less storm surge risk than waterfront areas, but Tampa Bay is one of the most vulnerable major metros in the US for a direct hurricane hit due to the bay geography. Business continuity planning, data center resilience, and disaster recovery deserve more attention here than they typically get in other markets.

---

## Practical Advice for Companies Evaluating Tampa Bay

If you are evaluating a Tampa Bay expansion or relocation, the factors that tend to be decisive:

1. **Do your key customers or partners have significant Tampa operations?** Financial services, healthcare, and defense companies with Tampa operations are potential customers, partners, and sources of referrals.

2. **Does your required talent pool exist here?** For financial services technology, healthcare IT, and enterprise software, the answer is generally yes for mid-career roles. For early-stage product engineering in specialized niches, less so.

3. **Is the lifestyle fit right for your team?** Tampa Bay has good weather, outdoor amenities, a reasonable food and arts scene, and a livable suburban infrastructure. It is not New York or San Francisco for cultural density. For teams relocating from coastal cities, this is a real factor.

4. **Do you need to be in person, or are you distributed?** If distributed, Tampa is a reasonable personal base for founders and key team members. If you require in-person engineering teams, the talent pool and real estate economics both favor Tampa, but with the senior talent caveats noted above.

Tampa Bay is a legitimate and improving market for technology companies operating in regulated industries. We are building here because it is where our customers are and because the conditions — cost structure, talent access, and the specific industry clusters the region has developed — align with what we do. The honest perspective is that it rewards companies who understand the market and engage with it directly, more than companies looking for an ecosystem to carry them.

If you are building technology for regulated industries and are based in or considering Tampa Bay, [we are happy to connect](/contact) — not as a formal engagement, but as practitioners who spend time in this market and are often useful as a sounding board.

---

## Why AI Pilots Stall in Financial Services
Source: https://tampadynamics.com/blog/why-ai-pilots-stall-financial-services

> Most AI pilots in financial services do not fail technically. They stall in the gap between an interesting demo and a production system that risk and compliance can sign off on.

Date: 2026-04-09

In financial services, the AI pilots that ship are not the most technically impressive ones. They are the ones whose teams understood, before they started, what production deployment actually requires.

We have watched a handful of FinServ pilots stall over the past year, mostly with the same pattern: an exciting demo from an internal team or a vendor, followed by months of risk and compliance review, followed by a quiet wind-down. The technology was not the problem. The problem was that the pilot was designed to impress an executive sponsor, not to clear a production review.

Here is what we see go wrong, in roughly the order it shows up.

## Pilot scope mismatched to production scope

Pilots get scoped against ease of demo: a sample data set, a single workflow, a hand-picked set of test users. Production scope is different: real data, integration with the systems of record, every user role, edge cases that did not appear in the sample.

The result: the pilot works. Production breaks on the gap. Risk teams refuse to approve a system whose pilot did not surface the failure modes that matter.

The fix: design the pilot against a production-shaped slice of the workload. Smaller volume, but the same data heterogeneity, the same integration depth, the same user roles. The pilot should be smaller, not different.
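
One way to get a smaller but production-shaped slice is stratified sampling: take a few records from every category that exists in production rather than a convenient subset. A minimal sketch, assuming records can be grouped by some key such as document or transaction type; it uses equal allocation per stratum, which is a simplification.

```python
import random
from collections import defaultdict

def production_shaped_sample(records, key, per_stratum, seed=0):
    """Sample up to `per_stratum` records from every category so rare cases are not dropped."""
    rng = random.Random(seed)  # fixed seed keeps the pilot data set reproducible
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for group in strata.values():
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample

# Hypothetical workload: one common category and two rare ones.
records = (
    [{"type": "wire", "id": i} for i in range(50)]
    + [{"type": "check", "id": i} for i in range(3)]
    + [{"type": "ach", "id": 0}]
)
pilot = production_shaped_sample(records, key=lambda r: r["type"], per_stratum=2)
print(sorted({r["type"] for r in pilot}))
```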

## Audit logging not in scope

The pilot demo focuses on the AI's outputs. The audit log is not built. When risk asks "show me how this would be operated in production — every model interaction, captured, queryable, retained" — there is no answer.

This is the most common reason pilots stall. The team built the AI, not the AI plus its audit infrastructure. The audit infrastructure is more work than the AI in many cases. Surprising risk teams with this discovery midway through review is a project killer.

The fix: build the audit log first. The AI's outputs are observable in the audit log from day one. By the time risk reviews the system, the audit infrastructure is mature, the queries risk wants to run already work, and the conversation is about whether the system meets policy — not whether the system is auditable at all.
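
A sketch of what "audit log first" can mean in code: the model is only reachable through a wrapper that records the interaction. The record fields, the in-memory list, and the stub model are assumptions for illustration; a production log would go to durable, access-controlled storage.

```python
import json
import uuid
from datetime import datetime, timezone

def audited_call(model_fn, prompt, user_id, audit_log):
    """Invoke the model and append a queryable record of the interaction before returning."""
    output = model_fn(prompt)
    audit_log.append(json.dumps({
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "prompt": prompt,
        "output": output,
    }))
    return output

# Stub standing in for a real model client.
def stub_model(prompt):
    return prompt.upper()

audit_log = []
answer = audited_call(stub_model, "summarize account 42", "analyst-7", audit_log)
print(answer, len(audit_log))
```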

## Tenant or counterparty isolation undefined

In financial services, your data has to be separated from your counterparties' data, your customers' data has to be separated from each other, and your investment bank's information barriers have to be enforced in software. The boundaries are not optional.

Pilots often skip this. "It's a pilot, we'll figure it out for production." Production then requires a redesign of the data model, the retrieval index, the access control layer. The redesign is months of work. The pilot stalls.

The fix: design tenancy and information barriers into the architecture before the pilot. They get cheaper, not more expensive, when designed in early.
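
As an illustration of isolation designed in at the retrieval layer, the sketch below partitions an index by tenant and makes the tenant a required argument of every search. `Chunk`, `TenantScopedIndex`, and the keyword-overlap scoring are hypothetical simplifications; a production system would enforce the same boundary in its vector store and access-control layer.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant_id: str
    text: str

class TenantScopedIndex:
    """Chunks are partitioned by tenant; no query can span partitions."""

    def __init__(self, chunks: List[Chunk]):
        self._by_tenant: Dict[str, List[Chunk]] = {}
        for chunk in chunks:
            self._by_tenant.setdefault(chunk.tenant_id, []).append(chunk)

    def search(self, tenant_id: str, query: str, k: int = 3) -> List[Chunk]:
        if not tenant_id:
            raise ValueError("tenant_id is required on every query")
        terms = set(query.lower().split())
        scored = []
        for chunk in self._by_tenant.get(tenant_id, []):
            overlap = len(terms & set(chunk.text.lower().split()))
            if overlap:
                scored.append((overlap, chunk))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]

index = TenantScopedIndex([
    Chunk("a1", "fund-a", "term sheet for the acquisition"),
    Chunk("b1", "fund-b", "term sheet for the divestiture"),
])
hits = index.search("fund-a", "term sheet")
```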

## Outputs are not citable

The AI produces an answer. The answer is wrong sometimes. Risk asks: "When the AI gets it wrong, how does the user know?"

If the system does not cite its sources, the answer is "they don't." The user has no way to verify the AI's output without redoing the work. Risk reasonably concludes that the system creates a new failure mode — confident-sounding wrong answers — and refuses approval.

The fix: every output cites the sources it relied on. The user verifies before acting. The audit log captures both the citation and the user's action. This is a hard requirement for FinServ AI, not a nice-to-have.
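
A sketch of enforcing this at the boundary between the model and the user interface, assuming the generation step returns an answer plus the IDs of the chunks it claims to rely on. The `Answer` type and `validate_citations` function are illustrative names.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class Answer:
    text: str
    citations: List[str] = field(default_factory=list)

def validate_citations(answer: Answer, retrieved_ids: Set[str]) -> Answer:
    """Reject answers that cite nothing, or cite sources never retrieved."""
    if not answer.citations:
        raise ValueError("uncited answer: do not display")
    unknown = [c for c in answer.citations if c not in retrieved_ids]
    if unknown:
        raise ValueError(f"citations not in the retrieved set: {unknown}")
    return answer

retrieved = {"policy-12#3", "policy-12#4"}
ok = validate_citations(Answer("Limit is 5%.", ["policy-12#3"]), retrieved)
```

This check only establishes that a citation exists and points at something the system actually retrieved. Whether the cited passage supports the claim is still the user's verification step.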

## No human-in-the-loop boundary

For any AI output that affects a regulated decision — a trading recommendation, a loan adjudication, a fraud alert classification — a human reviews and approves before the action takes effect. Pilots often blur this boundary. "The user can override the AI" is not the same as "the user must affirmatively approve."

The fix: explicit approval steps for any consequential output. Logged. Tied to the approver's user identity. The AI proposes; the human disposes; the audit log captures both.
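
The difference between "can override" and "must approve" shows up in code as a gate that fails closed. A minimal sketch, with hypothetical `Proposal`, `approve`, and `execute` names and an in-memory list standing in for the audit log:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Proposal:
    proposal_id: str
    action: str
    approved_by: Optional[str] = None

def approve(proposal: Proposal, approver_id: str, audit_log: List[dict]) -> None:
    proposal.approved_by = approver_id
    audit_log.append({"event": "approved",
                      "proposal_id": proposal.proposal_id,
                      "approver_id": approver_id})

def execute(proposal: Proposal, audit_log: List[dict]) -> str:
    # Fail closed: absence of an approval is a refusal, not a default yes.
    if proposal.approved_by is None:
        raise PermissionError("no affirmative approval on record")
    audit_log.append({"event": "executed",
                      "proposal_id": proposal.proposal_id,
                      "action": proposal.action,
                      "approved_by": proposal.approved_by})
    return proposal.action

log: List[dict] = []
alert = Proposal("p-1", "classify alert 8841 as fraud")
```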

## Model selection treated as a one-time decision

The pilot uses the latest, most capable model. The bill at production volume is unsustainable. The team scrambles to replace the model with a cheaper one and discovers that retrieval and prompting were tuned to the original model's behavior. Quality drops. The pilot is now expensive AND worse.

The fix: assume model selection will change. Build the system to be model-agnostic — clean separation between the orchestration layer and the model layer, evaluation harnesses that can run against any model, prompts that do not exploit specific model quirks. When you have to switch models, it is a configuration change, not a redesign.
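
A sketch of that separation, assuming every model is reduced to the same call shape (prompt in, text out) behind a registry. `run_eval`, the registry keys, and the two toy models are hypothetical; real adapters would wrap each provider's SDK.

```python
from typing import Callable, Dict, List, Tuple

ModelFn = Callable[[str], str]

def run_eval(model: ModelFn, cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases whose output contains the expected phrase."""
    if not cases:
        return 0.0
    hits = sum(1 for prompt, expected in cases
               if expected.lower() in model(prompt).lower())
    return hits / len(cases)

# Toy stand-ins for provider adapters.
def large_model(prompt: str) -> str:
    return "The covenant ratio is 3.5x per section 4.2."

def small_model(prompt: str) -> str:
    return "The ratio is 3.5x."

MODELS: Dict[str, ModelFn] = {"large": large_model, "small": small_model}
CASES = [("What is the covenant ratio?", "3.5x"),
         ("Which section defines it?", "section 4.2")]

scores = {name: run_eval(fn, CASES) for name, fn in MODELS.items()}
```

Running the same cases against every candidate before a switch turns a quality drop from a surprise into a measured trade-off.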

## Vendor BAA / sub-processor due diligence happens late

The team picks a vendor for the AI components. The vendor was great in evaluation. Risk asks for the vendor's SOC 2 report, sub-processors list, and security policies, plus the AI model provider's terms, plus the embedding endpoint's terms. Each comes with a different timeline.

By the time the third-party review completes, the pilot is six months in and the executive sponsor has lost patience.

The fix: front-load vendor due diligence. Work with the procurement and risk teams in week one, not month four. The vendor that clears review fast is not always the most exciting; it is often the right choice.

## The conversation gets political

When pilots stall in FinServ, the explanation in the room is often not technical. "Risk is being unreasonable." "Compliance is moving the goalposts." "The vendor isn't doing their part." These framings are tempting and rarely accurate.

Risk and compliance are doing what they are paid to do: refusing to approve systems that cannot be operated safely under the firm's regulatory regime. If the pilot did not produce an auditable, isolatable, citable, human-supervised system, the answer is no, and the right answer is no.

The fix is upstream: design the pilot to clear those bars from day one. The pilots that ship in FinServ are the ones designed against the production review the firm will actually do — not against the demo the executive sponsor wants to see.

## What we do differently

When we work with FinServ teams, we sequence the work so audit logging, tenancy, and output citation are in place before the AI does anything interesting. The first weeks of the engagement are not glamorous. They produce the infrastructure that makes the AI defensible. Once that is in place, building the AI features on top of it is fast.

Teams that try to skip ahead — get to the impressive demo first, then add the compliance scaffolding — almost always end up paying more for less. The audit-first sequence is the cheap path, even though it does not feel that way at the start.

If you are working on a FinServ AI project that is heading into pilot — or stalled in pilot review — [we have done this enough times to be useful](/contact).


# Case Studies

---

## Deposition Analysis RAG System for a Legal Tech Platform
Source: https://tampadynamics.com/resources/case-studies/legal-deposition-rag-system

> How we built a retrieval-augmented generation system over a legacy deposition database — with inconsistency detection, natural-language Q&A, report generation, and a full ETL pipeline from a .NET source.

Date: 2025-09-30

## The situation

A legal technology platform had spent years building a database of attorney depositions in a legacy .NET system. The data was valuable — it contained detailed testimony records useful for case research, inconsistency analysis, and litigation strategy. But it was locked inside a schema that predated modern search, and accessing it meant knowing exactly what to look for.

They wanted to unlock that corpus for their users: let attorneys ask natural-language questions, surface contradictions across depositions, and generate structured reports — without rebuilding the underlying database or abandoning the data that was already there.

## What we built

A retrieval-augmented generation system running on AWS, with a full ETL pipeline feeding it from the legacy source.

The ETL pipeline reads from the .NET database, normalizes and chunks the deposition records, generates embeddings, and loads the output into a retrieval index. The pipeline handles the shape of legacy data — inconsistent formatting, varying record structures — and produces clean, chunked input that the retrieval layer can work with reliably.
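
The normalize-and-chunk step can be sketched as a sliding window over whitespace-normalized text, with each chunk keeping a pointer to its source record so citations remain possible. The window sizes and field names below are assumptions for illustration, not the values used in the engagement.

```python
from typing import Dict, List

def chunk_record(record_id: str, text: str,
                 max_words: int = 200, overlap: int = 40) -> List[Dict]:
    """Split one deposition record into overlapping word windows."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and less than max_words")
    words = text.split()  # also collapses inconsistent whitespace
    step = max_words - overlap
    chunks: List[Dict] = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append({
            "chunk_id": f"{record_id}#{len(chunks)}",
            "record_id": record_id,  # kept so answers can cite the source
            "text": " ".join(window),
        })
        if start + max_words >= len(words):
            break  # the last window already reached the end of the record
    return chunks

sample_text = " ".join(f"w{i}" for i in range(500))
chunks = chunk_record("dep-17", sample_text)
```

The overlap exists so that a statement straddling a chunk boundary still appears whole in at least one chunk.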

On top of that, two distinct AI workflows:

**Q&A over depositions.** Attorneys ask natural-language questions and receive answers grounded in specific deposition excerpts, with citations back to the source record. Claude Sonnet handles the retrieval-grounded generation for this workflow.

**Inconsistency detection.** A separate orchestration workflow identifies contradictions across deposition records and flags them for attorney review. This task requires more complex multi-document reasoning, so it uses Claude Opus — the harder the reasoning requirement, the more capable the model.

Both workflows feed into report generation. The platform also delivers a consumer-facing interface for end-user queries and a separate admin interface for platform management and workflow configuration.

## Decisions that mattered

**Two models, not one.** Sonnet and Opus serve different jobs here. Sonnet is fast and accurate for retrieval-grounded Q&A where the answer is directly supported by a source chunk. Opus is reserved for inconsistency detection, where the system needs to reason across multiple documents and surface contradictions that are not obvious at the sentence level. Using Opus for everything would be slower and more expensive without improving Q&A quality. Using Sonnet for everything would produce weaker inconsistency detection.

**ETL before RAG.** The quality of a retrieval system depends entirely on the quality of its input. Before writing a single line of orchestration code, we designed the ETL pipeline to handle the edge cases in the legacy .NET data — varying record formats, encoding issues, incomplete fields. The retrieval layer only works reliably because the pipeline feeding it is boring and rigorous.

**Retrieval-grounded outputs only.** Every answer the Q&A system produces cites the specific deposition record it drew from. The system does not synthesize across documents without attribution, and it does not fall back to the model's general knowledge when retrieval comes up short. If the answer is not in the corpus, the system says so.
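
The no-fallback rule can be expressed as a guard in front of generation. This sketch assumes the retriever returns scored chunks and uses a hypothetical `min_score` cutoff; the threshold value, the field names, and the `generate` callable are placeholders rather than details of the delivered system.

```python
from typing import Callable, Dict, List

NOT_IN_CORPUS = "No supporting deposition excerpt was found for this question."

def grounded_answer(question: str, retrieved: List[Dict],
                    generate: Callable[[str, List[Dict]], str],
                    min_score: float = 0.5) -> Dict:
    supported = [c for c in retrieved if c["score"] >= min_score]
    if not supported:
        # Retrieval came up short: say so instead of calling the model.
        return {"answer": NOT_IN_CORPUS, "citations": []}
    return {"answer": generate(question, supported),
            "citations": [c["record_id"] for c in supported]}

def fake_generate(question: str, chunks: List[Dict]) -> str:
    return f"Based on {len(chunks)} excerpt(s)."

hit = grounded_answer(
    "When did the witness leave?",
    [{"record_id": "dep-17", "score": 0.84, "text": "..."}],
    fake_generate,
)
miss = grounded_answer(
    "What is the capital of France?",
    [{"record_id": "dep-03", "score": 0.12, "text": "..."}],
    fake_generate,
)
```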

**Separate consumer and admin surfaces.** The consumer UI is optimized for attorneys using the system in the course of casework — clean, query-focused, citation-forward. The admin UI serves the platform's internal team: managing ingestion, monitoring workflows, configuring the inconsistency-detection pipeline. Blending these into a single interface would have made both worse.

## Outcome

The platform was delivered three months after kickoff: ETL pipeline, retrieval index, Q&A workflow, inconsistency-detection workflow, report generation, consumer UI, admin UI.

The deposition corpus that was previously searchable only by those who knew its schema is now accessible through natural language. Attorneys can ask questions and receive answers with source attribution. Inconsistencies that would require manual cross-reference can be surfaced systematically.

The engagement is on a monthly retainer. Model versions change, new Bedrock capabilities become available, and the platform's feature set continues to evolve — the retainer structure means that evolution is planned and budgeted, not reactive.

## Working with us

The engagement started with an architecture review focused on the retrieval design and ETL strategy. Legacy source data is the hardest part of most RAG projects — getting that pipeline right before building the AI layer determines whether the system is usable in production or impressive only in demos.

If you have a legacy data corpus and want to build reliable AI retrieval over it — not a proof of concept, but something that ships and holds up — [let us know](/contact).

---

## Cloud-Native Operating Platform for a Multi-Business Owner-Operator
Source: https://tampadynamics.com/resources/case-studies/multi-business-operating-platform

> How we designed and shipped an integrated operating layer across four businesses — estimating, back-office dashboards, and automated lead-to-invoice workflows — with AI-assisted quoting built on AWS Bedrock.

Date: 2025-06-30

## The situation

A single owner-operator was running four businesses. Two were in field services — HVAC and construction work. One was a table-service restaurant. One was a B2B commercial kitchen ventilation company serving restaurants and food-service operators. Each business was operationally distinct. Estimating, invoicing, customer relationships, and marketing all worked differently depending on which business you were looking at.

The shared problem: there was no operating layer that spanned all four. Estimates were being assembled by hand. Job and customer data lived in different tools with no shared view. Leads came in through one system, invoices went out through another, and reconciling the two required manual work that grew with the business rather than shrinking.

The owner did not need four separate software projects. He needed one coherent platform — something that understood the differences between his businesses while eliminating the duplication that cost him time every week.

## What we built

A cloud-native business operating platform built on AWS, with workflow automation tying together HubSpot, QuickBooks, and the custom applications.

**Estimating platform.** The most operationally significant piece. The estimating tool is built for the field-services and commercial ventilation businesses, where a quote involves job-specific variables, material and labor line items, and customer-facing language that needs to be clear and professional. AWS Bedrock powers AI capabilities inside the tool: draft quote generation from job parameters, line-item suggestions based on job type, and drafting of customer-facing communications. The goal is not to replace the estimator's judgment — it is to remove the blank-page problem and reduce the time from job assessment to quote delivered.

**Back-office dashboard.** A single operator-facing interface that consolidates data across all four businesses. Field services jobs, restaurant operations, and commercial ventilation accounts sit in the same view, with the right data surfaced for each business type. The owner does not need to context-switch between four different tools to understand where things stand.

**Workflow automation.** n8n runs on AWS EC2 and serves as the automation layer connecting HubSpot, QuickBooks, and the custom applications. When a lead comes in through HubSpot, the relevant data flows into the estimating platform. When a quote is accepted, the workflow triggers job creation. When a job closes, the invoice flows into QuickBooks. The sequence from lead to quote to invoice happens without manual re-entry at each handoff.

**Marketing and web operations.** Supporting each business's lead-generation presence — websites and marketing tooling built to feed the front end of the workflows described above.

```
[Estimating platform — Bedrock-assisted]
          ↓
[n8n automation layer on EC2]
     ↙         ↘
[HubSpot CRM]  [QuickBooks]
          ↓
[Back-office dashboard — Amplify]
```

## Decisions that mattered

**One platform, not four.** The natural temptation was to treat each business as a separate project. We pushed back on that. The field-services businesses share estimating logic. The commercial ventilation business shares customer communication patterns with the HVAC side. The restaurant is genuinely different — but it still needs to appear in the same back-office view. Building a shared platform with business-specific configuration is more work upfront and significantly less work over time.

**n8n for orchestration, not custom glue code.** Wiring HubSpot, QuickBooks, and custom apps together with hand-written API integration code would have been brittle and hard to modify. n8n gives the automation layer a visual representation that is practical to operate and extend. When a workflow needs to change — a new HubSpot field, a different invoice trigger condition — the change happens in the automation layer without touching application code.

**Bedrock for AI where it reduces real friction.** The estimating tool uses AI in places where it saves meaningful time: generating draft language from structured job data, suggesting line items based on job type, drafting follow-up communications. It does not use AI as a feature in search of a problem. The field-services businesses produce a high volume of quotes; reducing the per-quote effort is a concrete operational improvement.

**Retainer structure from the start.** An operating platform for four active businesses is not a project with a finish line. Businesses change. Pricing structures change. New service lines get added. The engagement was structured as a build phase followed by an ongoing retainer — not because there would be bugs to fix, but because the platform needs to evolve as the businesses do.

## Outcome

The platform was delivered three months after kickoff: estimating tool, back-office dashboard, n8n automation workflows, HubSpot and QuickBooks integration, and supporting marketing infrastructure.

The four businesses now share an operating layer. Estimates that were assembled manually are drafted with AI assistance and reviewed before sending. Lead and invoice data that previously required manual entry in multiple systems flows automatically. The owner has a single place to see what is happening across the portfolio.

The engagement is on a monthly retainer. As the businesses grow and their operational needs change, the platform changes with them.

## Working with us

This engagement started with a conversation about how the four businesses actually operated day-to-day — where time was being lost, where data was being duplicated, what a good day looked like versus a bad one. The architecture followed from that: AWS because it fits the scale and the tooling, n8n because the automation layer needed to be maintainable by someone other than the original developer, Bedrock because the estimating volume made AI-assisted drafting worth the investment.

If you run multiple businesses and need an operating platform that reflects how they actually work — not a generic SaaS product you have to adapt yourself to — [let us know](/contact).

---

## Business Operations Platform for a Specialty Pharmacy
Source: https://tampadynamics.com/resources/case-studies/specialty-pharmacy-business-platform

> How we designed and shipped an Azure-hosted business operations platform that consolidates administration and marketing workflows through a custom HubSpot integration.

Date: 2025-12-15

## The situation

A specialty pharmacy was running business operations and marketing workflows across disconnected tools. Administration tasks, customer communications, and marketing operations each lived in their own system with no shared data layer and no single place to manage the full picture.

They needed a custom platform — not another off-the-shelf SaaS subscription — because their workflows did not fit standard templates. The application had to integrate tightly with HubSpot, where their CRM and marketing operations already lived, while giving their team a purpose-built admin interface designed around how they actually work.

## What we built

An Azure-hosted application that serves as the operational hub for administration and marketing, connected to HubSpot through a custom integration layer.

The architecture runs on Azure Container Apps, which gives the client a scalable, containerized hosting model without the overhead of managing infrastructure directly. On the HubSpot side, we built a companion app that uses HubSpot's API to expose pharmacy-specific workflows — data surfaced in the right context, not buried in generic CRM views.

```
[Azure Container Apps — ops platform]
          ↓
[Custom HubSpot integration layer]
          ↓
[HubSpot CRM + marketing operations]
```

The two systems stay in sync. Changes made in the ops platform propagate where they need to go. HubSpot remains the source of truth for contact and campaign data; the custom platform handles the domain logic that HubSpot alone cannot.

## Decisions that mattered

**Azure Container Apps over managed PaaS.** The client's other infrastructure is Azure-native. Staying in that environment meant no new cloud accounts, no new IAM model to reason about, and a hosting tier that scales without paying for reserved capacity they do not need.

**A custom HubSpot app, not just API calls.** Off-the-shelf HubSpot integrations are built for generic workflows. Pharmacy administration has specific shapes — specific data fields, specific approval sequences, specific reporting needs. We built a companion app that makes those domain-specific workflows first-class inside HubSpot rather than grafting them on top.

**Scope discipline on the initial build.** The client had a long list of eventual requirements. We agreed on a clear first-delivery scope, shipped it, and structured the retainer to evolve the platform from there. Shipping something real and iterating is more valuable than spending months on a specification that will change anyway.

**Monthly retainer from day one.** Business operations platforms are not fire-and-forget. Workflows change, HubSpot updates its API, new requirements surface. The engagement was structured as build-then-evolve, not build-then-hand-off.

## Outcome

The platform shipped in December 2025 against the agreed timeline. Business administration and marketing operations are now running through a single application, with HubSpot as the underlying data backbone.

The client did not have to compromise on how their workflows work to fit a product someone else designed. The platform reflects their actual process — and when that process changes, the retainer structure means it can be updated without starting over.

## Working with us

This engagement started with a scoping conversation about what the client actually needed versus what they thought they needed. The architecture decisions followed from that — Azure because they were already there, a custom HubSpot app because generic integrations would not do the job.

If you need a purpose-built operations platform and want engineering that is designed to last rather than demo well, [let us know](/contact).


# Architecture Guides

---

## Audit Logging for AI Agents
Source: https://tampadynamics.com/resources/architecture-guides/audit-logging-ai-agents

> A reference architecture for capturing, storing, and querying the audit trail of an AI agent system in regulated environments.

Date: 2026-04-18

If your AI agent operates in a regulated environment — healthcare, legal, financial services — the audit log is not a feature. It is the artifact you produce, after the fact, to answer the questions that an auditor or breach investigator will ask.

This guide covers the schema, storage, and query patterns we use for AI agent audit trails on AWS. The patterns generalize to other clouds.

## What the audit log has to answer

Before designing the schema, write down the questions. Ours, derived from real conversations with Security Officers and external auditors:

1. Who used the AI agent during a given window?
2. For a specific user, what did they ask, what did the agent see, and what did it return?
3. For a specific patient (or case, or account), every AI interaction that touched their record, with full content.
4. For a specific tool the agent has access to, every invocation — when, by whom, with what parameters, with what result.
5. For a specific suspected breach window, every model call with full content and result.
6. For a specific output the user is questioning, the full reasoning chain that produced it.

The schema has to make all six answerable in seconds, not days.

## The schema

We use a two-tier model: a metadata layer that is queryable and frequently accessed, and a body layer that is large, infrequently accessed, and pulled on demand.

### Metadata layer (DynamoDB)

```json
{
  "pk": "CONV#<conversation_id>",
  "sk": "TURN#<timestamp>#<turn_id>",
  "user_id": "<application user id>",
  "tenant_id": "<customer id, matter id, or patient id>",
  "model_id": "anthropic.claude-3-5-sonnet-20241022-v2:0",
  "model_version": "<provider version>",
  "tool_calls": ["retrieve_policy", "lookup_patient_meds"],
  "input_token_count": 1248,
  "output_token_count": 412,
  "latency_ms": 2104,
  "rag_doc_ids": ["doc_a4f", "doc_71b"],
  "body_pointer": "s3://td-audit/conv/<id>/<turn_id>.json.gz",
  "outcome": "success",
  "approved_by": null,
  "ttl": 1893456000
}
```

This row is small — a few hundred bytes — and indexed by conversation, by user, by tenant, and by date. Common queries hit this layer alone.
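
A sketch of how the keys above might be assembled. It relies on one property worth stating: fixed-width ISO 8601 UTC timestamps sort lexicographically in chronological order, so a sort key that starts with the timestamp returns turns in time order. The helper name and argument list are illustrative; the bucket name follows the example row.

```python
from datetime import datetime, timezone

def metadata_item(conversation_id: str, turn_id: str, when: datetime,
                  user_id: str, tenant_id: str, bucket: str = "td-audit") -> dict:
    # Millisecond precision, always UTC, always the same width.
    ts = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return {
        "pk": f"CONV#{conversation_id}",
        "sk": f"TURN#{ts}#{turn_id}",
        "user_id": user_id,
        "tenant_id": tenant_id,
        "body_pointer": f"s3://{bucket}/conv/{conversation_id}/{turn_id}.json.gz",
    }

first = metadata_item(
    "c-42", "t-1",
    datetime(2026, 5, 7, 14, 23, 11, 402000, tzinfo=timezone.utc),
    "u-9", "patient-311",
)
later = metadata_item(
    "c-42", "t-2",
    datetime(2026, 5, 7, 14, 25, 0, tzinfo=timezone.utc),
    "u-9", "patient-311",
)
```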

### Body layer (S3, write-once)

```json
{
  "turn_id": "...",
  "timestamp": "2026-05-07T14:23:11.402Z",
  "user_id": "...",
  "tenant_id": "...",
  "input": {
    "prompt": "<full prompt including system message and history>",
    "user_message": "<verbatim user input>"
  },
  "context": {
    "rag_chunks": [
      { "id": "...", "doc_id": "...", "score": 0.84, "text": "..." }
    ],
    "visitor_context": {...}
  },
  "tool_calls": [
    {
      "name": "retrieve_policy",
      "params": {...},
      "result_summary": "...",
      "result_full": "<full result>"
    }
  ],
  "output": "<full model output>",
  "approval_chain": []
}
```

S3 with object lock in compliance mode gives you write-once semantics — no one, including admins, can modify or delete the object inside the retention window.
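
A sketch of preparing a body object for upload. It builds the keyword arguments for boto3's S3 `put_object` call (`ObjectLockMode` and `ObjectLockRetainUntilDate` are parameters of that call) without making the call, so it runs without AWS credentials. The helper names are hypothetical, and the bucket is assumed to have been created with Object Lock enabled.

```python
import gzip
import json
from datetime import datetime, timezone

def encode_body(body: dict) -> bytes:
    """Serialize the turn body as gzip-compressed UTF-8 JSON."""
    return gzip.compress(json.dumps(body, ensure_ascii=False).encode("utf-8"))

def put_object_kwargs(bucket: str, conversation_id: str, turn_id: str,
                      body: dict, retain_until: datetime) -> dict:
    return {
        "Bucket": bucket,
        "Key": f"conv/{conversation_id}/{turn_id}.json.gz",
        "Body": encode_body(body),
        "ContentType": "application/json",
        "ContentEncoding": "gzip",
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": retain_until,
    }

kwargs = put_object_kwargs(
    "td-audit", "c-42", "t-1",
    {"turn_id": "t-1", "output": "<full model output>"},
    retain_until=datetime(2033, 5, 7, tzinfo=timezone.utc),
)
# With credentials in place: boto3.client("s3").put_object(**kwargs)
```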

## Why two tiers

The metadata layer is queried constantly. "Show me every interaction this user had with this patient's record this week." That query hits DynamoDB — milliseconds, predictable cost.

The body layer is queried rarely, usually only during an investigation. "Show me the full prompt and output for these specific turns." That query reads from S3 — slower, but cheap to store at scale.

Putting everything in one tier produces either expensive DynamoDB rows (if you store full prompts) or slow queries (if you scan S3 for metadata).

## Retention

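HIPAA requires six years of retention for the documentation its rules mandate (policies, procedures, and records of required actions and assessments); retention of the medical record itself is generally governed by state law. Some workloads need longer, for example discovery in a litigation matter, contractual retention with a customer, or regulatory holds.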

We default to:

- DynamoDB: TTL set to 7 years from creation.
- S3: object lock retention period of 7 years, in compliance mode.
- Lifecycle policy moves S3 objects to Glacier Instant Retrieval after 90 days, Glacier Deep Archive after 1 year.

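Review the retention numbers against your specific obligations. For pediatric records, state retention rules commonly run until the patient reaches the age of majority plus a fixed number of years, which can exceed seven. For litigation holds, retention extends until the hold is released.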
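
The defaults above translate into three values: an epoch-seconds TTL for DynamoDB, a retain-until date for S3 Object Lock, and a lifecycle rule. A sketch, assuming the `conv/` key prefix from the example row; the rule dictionary follows the shape boto3's `put_bucket_lifecycle_configuration` expects, and the helper names and rule ID are illustrative.

```python
from datetime import datetime, timezone

def add_years(dt: datetime, years: int) -> datetime:
    try:
        return dt.replace(year=dt.year + years)
    except ValueError:  # Feb 29 landing in a non-leap target year
        return dt.replace(year=dt.year + years, day=28)

def retention_fields(created_at: datetime, years: int = 7) -> dict:
    expires = add_years(created_at, years)
    return {
        "ttl": int(expires.timestamp()),  # DynamoDB TTL is epoch seconds
        "retain_until": expires,          # S3 Object Lock retain-until date
    }

LIFECYCLE_RULE = {
    "ID": "audit-body-tiering",
    "Status": "Enabled",
    "Filter": {"Prefix": "conv/"},
    "Transitions": [
        {"Days": 90, "StorageClass": "GLACIER_IR"},
        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
    ],
}

fields = retention_fields(datetime(2026, 5, 7, 14, 23, 11, tzinfo=timezone.utc))
```

DynamoDB TTL deletion runs as a background process, so an expired row can linger past its TTL. Treat the TTL as the earliest point of deletion, not an exact one.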

## Access control on the audit log

The audit log is itself sensitive. The bodies contain prompts, tool results, retrieved chunks — all of which can be PHI. Treat the log like any other PHI store:

- IAM policies limit who can read the audit data — typically a small Security and Compliance group, plus a break-glass role for incident response.
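- Reads from the audit log are themselves logged. If a Security Officer queries the audit log, that query is captured in CloudTrail, provided data events are enabled for the audit bucket and table (CloudTrail records only management events by default).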
- Writes to the audit log come from a single service role used only by the agent runtime. No other path can write.
- KMS keys are separate from operational data keys. The audit-log key has a tighter access policy.

## What goes in the input

The full prompt — system message, conversation history, retrieved context, and the user's message. All of it. If a prompt is long enough that storing it is expensive, you have a different problem (prompt bloat) and should fix that, not skip the logging.

Common temptation: store only the user's message and reconstruct the prompt later. Do not. Prompt templates change. System messages change. The retrieved context changes by the second. The only way to know what the model actually saw is to log what was sent.

## What goes in the output

The full output, exactly as the model produced it, before any post-processing. If your application strips formatting, redacts PHI from display, or rewrites the output, log both versions.
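
A sketch of logging both versions, assuming a hypothetical display-time redaction step. The regular expression and field names are placeholders; what matters is that the raw model output is captured before anything rewrites it.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # illustrative pattern only

def redact_for_display(text: str) -> str:
    return SSN_PATTERN.sub("[REDACTED]", text)

def output_record(raw_output: str) -> dict:
    displayed = redact_for_display(raw_output)
    return {
        "output_raw": raw_output,        # exactly what the model produced
        "output_displayed": displayed,   # what the user actually saw
        "post_processed": displayed != raw_output,
    }

rec = output_record("Member SSN on file: 123-45-6789.")
```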

## Tool calls

Every tool invocation — name, parameters, result. The result has to include enough detail that "did the agent get the right data" is answerable. For retrieval tools, log the chunk IDs and content. For database lookups, log the query and the row count or specific records returned. For writes, log the data written and the resulting state.

A common failure: logging the tool call but not the result, on the theory that the result is large and inferable. It is not inferable. The tool's downstream system may have changed since the call. Log the result.
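
One way to make result capture hard to forget is to attach it to the tool definition itself. The decorator below is a sketch under that assumption: `audited_tool` is a hypothetical helper, the list is a stand-in for the audit store, and the tool body returns canned data.

```python
import functools
import time
from typing import Callable, List

def audited_tool(sink: List[dict]) -> Callable:
    """Decorator: record name, parameters, and full result of each tool call."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(**params):
            entry = {"name": fn.__name__, "params": params,
                     "result_full": None, "error": None}
            started = time.monotonic()
            try:
                entry["result_full"] = fn(**params)
                return entry["result_full"]
            except Exception as exc:
                entry["error"] = repr(exc)
                raise
            finally:
                entry["latency_ms"] = int((time.monotonic() - started) * 1000)
                sink.append(entry)  # logged on success and on failure
        return wrapper
    return decorator

tool_log: List[dict] = []

@audited_tool(tool_log)
def lookup_patient_meds(patient_id: str) -> list:
    return [{"patient_id": patient_id, "drug": "metformin", "dose_mg": 500}]

meds = lookup_patient_meds(patient_id="p-311")
```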

## Approval chains for human-in-the-loop

When an AI proposal requires human approval before taking effect, the audit log records:

- The proposal (output of the model)
- The approver's identity
- The approver's decision (approve, edit, reject)
- The final action that was taken
- Any edits the approver made

If the approver edited the proposal, both the AI's original output and the human's edited version are in the log. The downstream system always acts on the human's version, and the log shows where the AI ended and the human began.
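
A sketch of an approval-chain entry that keeps the two versions distinct. The function and field names are hypothetical; the shape mirrors the list above.

```python
from typing import Optional

DECISIONS = {"approve", "edit", "reject"}

def approval_entry(ai_output: str, approver_id: str, decision: str,
                   edited_output: Optional[str] = None) -> dict:
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    if (decision == "edit") != (edited_output is not None):
        raise ValueError("edited_output is required for 'edit' and only for 'edit'")
    final = {"approve": ai_output, "edit": edited_output, "reject": None}[decision]
    return {
        "proposal": ai_output,      # the AI's original output, never overwritten
        "approver_id": approver_id,
        "decision": decision,
        "edits": edited_output,
        "final_action": final,      # what the downstream system acts on
    }

entry = approval_entry("Refill 30 tablets.", "rn-204", "edit",
                       edited_output="Refill 15 tablets.")
```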

## Querying the log

For the six questions at the start, here is how each is answered:

1. **Who used the AI agent during a window?** GSI on `user_id` + date.
2. **What did this user ask?** GSI on `user_id`, ordered by timestamp. Pull bodies on demand.
3. **All AI interactions touching this patient?** GSI on `tenant_id`, ordered by timestamp.
4. **All invocations of a specific tool?** GSI on `tool_name` (extracted from `tool_calls`), or a separate tool-invocation table for high-volume tools.
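5. **Specific suspected breach window?** Range query on timestamp through a GSI partitioned by date bucket (a DynamoDB query needs a partition key, so the base table's `CONV#` key cannot serve a time-window lookup on its own), bodies pulled on demand.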
6. **Reasoning chain for a specific output?** Look up turn by ID, pull the body.

In practice, all of these are written as predefined queries in the security team's runbook, not free-form data exploration.
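
Predefined queries can live in code as parameter builders. The sketch below produces the arguments for a DynamoDB `Query` through boto3's low-level client for question 3, narrowed to a time window. The table name, the index name, and the choice of `sk` as the index sort key are assumptions for illustration; the guide does not specify them.

```python
def tenant_window_query(tenant_id: str, start_date: str, end_date: str) -> dict:
    """Arguments for client.query(): one tenant's turns from start to end date."""
    return {
        "TableName": "ai-audit-metadata",   # hypothetical table name
        "IndexName": "tenant-time-index",   # hypothetical GSI: tenant_id + sk
        "KeyConditionExpression": "#tenant = :tenant AND #sk BETWEEN :lo AND :hi",
        "ExpressionAttributeNames": {"#tenant": "tenant_id", "#sk": "sk"},
        "ExpressionAttributeValues": {
            ":tenant": {"S": tenant_id},
            ":lo": {"S": f"TURN#{start_date}"},
            # '~' sorts after the 'T' that follows the date in a full
            # timestamp, so every turn on the end date is included.
            ":hi": {"S": f"TURN#{end_date}~"},
        },
        "ScanIndexForward": True,  # oldest first
    }

params = tenant_window_query("patient-311", "2026-05-01", "2026-05-08")
# With credentials in place: boto3.client("dynamodb").query(**params)
```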

## Common mistakes

**Logging in the application logs.** CloudWatch Logs is not an audit log. Its retention is a mutable setting, it is not write-once, and application log groups rarely carry the access controls a HIPAA audit log requires.

**Storing the audit log in the same database as application data.** Operational queries and audit queries have different access patterns and different access controls. Mixing them creates risk.

**Logging too coarsely.** "Conversation completed at 14:23" is not an audit log. The minimum useful unit is the model invocation, with full input and output.

**Logging too verbosely.** Every keystroke, every component render, every cache miss does not belong in the audit log. The audit log is for events that have to be reconstructable for compliance purposes.

**No retention policy.** Logs that grow forever cost money and create discovery exposure. Define retention up front.

## Where to start

For a new AI agent project, the audit log is the first system to design, before the agent itself. Build the agent against the audit interface, so logging is a property of the system, not an afterthought.

If you are retrofitting audit logging onto an existing AI system, prioritize the metadata layer first — it is what answers the "who did what" questions that most investigations actually need. Body capture can come second.

---

## HIPAA-Compliant Cloud Architecture on AWS
Source: https://tampadynamics.com/resources/architecture-guides/hipaa-compliant-architecture-aws

> A practical guide to designing and deploying healthcare applications on AWS while meeting HIPAA requirements.

**Tampa Dynamics — Free Technical Resource**

Healthcare systems require strict attention to security, auditability, and data governance. AWS provides the infrastructure and tools to build HIPAA-compliant applications—as long as you architect them correctly.

This guide outlines the recommended architecture patterns, AWS services, CLI tools, and links to official resources you can use to deploy a secure, compliant, healthcare-ready platform.

---

# 1. AWS + HIPAA Compliance Basics

AWS offers a **Business Associate Addendum (BAA)**, accepted through AWS Artifact, that covers PHI processed on HIPAA-eligible AWS services. PHI must stay on services that are in scope for the BAA.

## Shared Responsibility Model

**AWS is responsible for:**
- Physical infrastructure
- Hypervisor, host OS, networking
- Managed service security

**You are responsible for:**
- Application security
- IAM design
- Encryption
- Logging/monitoring
- Workforce access controls

### Essential Links
- AWS HIPAA Overview: https://aws.amazon.com/compliance/hipaa-compliance/
- HIPAA Eligible Services: https://aws.amazon.com/compliance/services-in-scope/
- AWS Artifact (BAA Access): https://aws.amazon.com/artifact/

---

# 2. Reference Architecture for HIPAA Workloads

Standard Tampa Dynamics blueprint for healthcare platforms.

### Core Infrastructure
- VPC with private subnets  
- NAT gateway or VPC endpoints  
- ECS Fargate or Lambda  
- CloudFront → WAF → ALB → Private Services  
- S3 encrypted storage  
- DynamoDB for state  
- RDS PostgreSQL  
- CloudTrail, GuardDuty, Security Hub  
- Cognito + IAM Identity Center  

### AWS Documentation
- VPC: https://docs.aws.amazon.com/vpc/
- ECS Fargate: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/AWS_Fargate.html
- RDS PostgreSQL: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/
- S3 Security: https://docs.aws.amazon.com/AmazonS3/latest/userguide/security.html

---

# 3. HIPAA Security Requirements (Practical Breakdown)

## 3.1 Encryption
- At rest: KMS-managed encryption
- In transit: TLS 1.2+
- Key rotation via KMS

Docs: https://docs.aws.amazon.com/kms/

## 3.2 Identity & Access Management
- No long-lived IAM users
- Use IAM Identity Center
- RBAC
- Least privilege

Docs: https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html

## 3.3 Network Isolation
- PHI never in public subnets
- ALB is the only public entry point
- Use VPC endpoints

## 3.4 Monitoring, Logging & Auditability
Enable CloudTrail, VPC Flow Logs, CloudWatch, GuardDuty, Security Hub.

Docs: https://docs.aws.amazon.com/awscloudtrail/

---

# 4. CLI Tools & Packages

## AWS CLI
```bash
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
```
Docs: https://docs.aws.amazon.com/cli/

## AWS CDK
```bash
npm install -g aws-cdk
cdk bootstrap
```
Docs: https://docs.aws.amazon.com/cdk/

## Amplify Gen 2
```bash
npm create amplify@latest
npx ampx sandbox
```
Docs: https://docs.amplify.aws/

## Session Manager (SSH replacement)
```bash
aws ssm start-session --target i-XXXXXXX
```
Docs: https://docs.aws.amazon.com/systems-manager/

---

# 5. Security Tools

## Checkov (IaC scanning)
```bash
pip install checkov
checkov -d .
```
https://www.checkov.io/

## Trivy (Container scanning)
```bash
trivy image <image-name>
```
https://aquasecurity.github.io/trivy/

## Semgrep
```bash
semgrep scan .
```
https://semgrep.dev/

---

# 6. Healthcare Development Tools

## Medplum SDK
```bash
npm install @medplum/core
```
Docs: https://www.medplum.com/docs

## FHIR Specs
https://www.hl7.org/fhir/

## Next.js Patterns
https://nextjs.org/docs

---

# 7. Example Architecture: Next.js + Medplum on AWS

### Frontend
- CloudFront/S3
- Server actions
- Strict CSP

### Backend
- ECS Fargate tasks
- Medplum agents
- AI/RAG workers

### Data
- RDS PostgreSQL
- DynamoDB
- S3

### Identity
- Cognito + MFA
- RBAC

---

# 8. Deployment Workflow

### Build
- SBOM
- Trivy, Semgrep, Checkov
- ECR scanning

### Deploy
- CDK → dev
- Manual → staging
- Compliance → prod

### Operations
- CloudTrail
- GuardDuty
- DR runbooks

---

# 9. Cost Optimization

- Prefer VPC Endpoints
- Fargate Spot
- S3 lifecycle policies
- Reduce CloudWatch retention for non-audit logs (archive audit logs to S3 instead of deleting them)
- OpenSearch UltraWarm/Serverless

---

# 10. Common HIPAA Mistakes

- Logging PHI  
- Public S3 buckets  
- Using non-BAA services  
- Developer access to prod DB  
- Secrets in .env files  

---

# 11. Further Reading

- HIPAA Architecture Whitepaper:  
  https://docs.aws.amazon.com/whitepapers/latest/architecting-hipaa-security-and-compliance/

- AWS Security Pillar:  
  https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/

- AWS Healthcare Solutions:  
  https://aws.amazon.com/health/solutions/

- HL7 FHIR:  
  https://hl7.org/fhir/

---

## HIPAA-Compliant Cloud Architecture on AWS
Source: https://tampadynamics.com/resources/architecture-guides/hipaa-compliant-cloud-architecture

> A practical guide to designing and deploying healthcare applications on AWS while meeting HIPAA requirements.

Date: 2024-09-20

Building healthcare applications requires careful attention to regulatory requirements. This guide walks through the key architectural decisions for deploying HIPAA-compliant workloads on AWS.

## Understanding the Shared Responsibility Model

AWS provides HIPAA-eligible services, but compliance is a shared responsibility. AWS secures the infrastructure; you're responsible for how you configure and use it.

### What AWS Handles

- Physical security of data centers
- Network infrastructure security
- Hypervisor and host OS security
- Service-level encryption options

### What You Handle

- Data encryption in transit and at rest
- Access control and IAM policies
- Audit logging and monitoring
- Application-level security

## Core Architecture Components

### 1. Network Isolation

Start with a VPC designed for healthcare workloads:

```
VPC (10.0.0.0/16)
├── Public Subnets (ALB, NAT Gateway)
├── Private Subnets (Application tier)
└── Isolated Subnets (Database tier)
```

Key considerations:
- No direct internet access for resources handling PHI
- Use VPC endpoints for AWS services
- Enable VPC Flow Logs for network monitoring

### 2. Encryption Everywhere

**At Rest:**
- Use AWS KMS with customer-managed keys
- Enable default encryption on S3, EBS, RDS
- Consider client-side encryption for highly sensitive data

**In Transit:**
- TLS 1.2+ for all connections
- Use ACM for certificate management
- Enable HTTPS-only on CloudFront and ALB

### 3. Access Control

Implement least-privilege access:

- Use IAM roles, not long-lived credentials
- Enable MFA for all human access
- Implement attribute-based access control (ABAC) where possible
- Regular access reviews and credential rotation

### 4. Audit Logging

You must be able to demonstrate who accessed what, when:

- **CloudTrail** — API activity logging
- **Config** — Resource configuration history
- **GuardDuty** — Threat detection
- **Application logs** — Shipped to CloudWatch or a SIEM

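Retain logs for at least six years. HIPAA's six-year rule (45 CFR 164.316(b)(2)(i)) formally covers required documentation; applying the same period to audit logs is the common, defensible practice.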

## BAA Requirements

Before handling PHI on AWS, you must have a Business Associate Agreement in place. This is available through AWS Artifact for qualifying accounts.

Only use services covered under the BAA for PHI workloads. Check the [AWS HIPAA Eligible Services](https://aws.amazon.com/compliance/hipaa-eligible-services-reference/) list.

## Common Pitfalls

1. **Using non-eligible services** — Not all AWS services are HIPAA-eligible
2. **Insufficient logging** — You need comprehensive audit trails
3. **Overly permissive IAM** — Start with zero access and add permissions as needed
4. **Neglecting backups** — Test your disaster recovery regularly

## Next Steps

This guide covers the foundational architecture. For specific implementation guidance, consider:

- [Setting up a HIPAA-compliant CI/CD pipeline](/resources/architecture-guides/hipaa-cicd)
- [Implementing zero-trust networking](/resources/architecture-guides/zero-trust-healthcare)

Need help with your healthcare cloud architecture? [Let's talk](/contact).

---

## Multi-Tenant SaaS on AWS Amplify Gen 2
Source: https://tampadynamics.com/resources/architecture-guides/multi-tenant-saas-amplify-gen2

> Patterns for building a HIPAA-aligned multi-tenant SaaS on AWS Amplify Gen 2 — covering tenancy, auth, data isolation, and operational concerns.

Date: 2026-04-25

AWS Amplify Gen 2 is a sensible default for most B2B SaaS we ship. The framework gives you Cognito-based auth, AppSync GraphQL, DynamoDB-backed data, S3 storage, and Lambda functions in a TypeScript-first project structure. For multi-tenant SaaS — especially in regulated industries — the framework is enough infrastructure to get you started, with the right hooks to extend where you need to.

This guide covers the patterns we use for tenancy, isolation, and operations on Amplify Gen 2. It assumes you have done the basic setup and are now thinking about how to make the system work for multiple customers without leakage.

## What "multi-tenant" actually means

Three architectures get called "multi-tenant SaaS." They are different.

1. **Pool model.** All tenants share infrastructure. A `tenant_id` column on every row. Isolation is enforced in software.
2. **Bridge model.** Some resources are shared (the Cognito user pool, the application code), others are per-tenant (databases, S3 buckets, KMS keys).
3. **Silo model.** Each tenant gets its own stack — its own Cognito pool, its own DynamoDB tables, its own everything. Isolation is enforced at the AWS account or region level.

Pool is cheapest. Silo is most defensible. Bridge is the compromise most regulated SaaS converges on. Amplify Gen 2 supports all three; the design choice is yours, not the framework's.

For HIPAA workloads with sensitive data, we usually ship a bridge model: shared user pool, shared application code, but per-tenant data partitions, S3 bucket prefixes, KMS keys, and audit logs. The decision depends on customer expectations more than technical fit.

## Auth and tenancy

Amplify Gen 2 uses Cognito User Pools. For multi-tenant SaaS, there are two options:

- One user pool, with `tenant_id` as a custom attribute. Single sign-on per identity, simpler to operate.
- Multiple user pools, one per tenant. Stronger isolation, harder to operate at scale, often preferred by enterprise buyers.

We default to one pool with custom attributes. When a tenant's procurement requires a dedicated pool — not uncommon in healthcare — we move that tenant to a separate pool while keeping the rest in the shared one. This is not symmetric; it is a deliberate trade.

The tenant attribute has to be unforgeable at the application layer. In Amplify Gen 2, that means the tenant ID is set during user creation by the admin flow, not editable by the user, and read in every API call from the JWT — never from the request body.

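```ts
// schema.ts
import { a } from "@aws-amplify/backend"

const schema = a.schema({
  Patient: a
    .model({
      mrn: a.string().required(),
      name: a.string().required(),
      tenantId: a.string().required(),
    })
    .authorization((allow) => [
      // Match the row's tenantId against the tenant claim, not the default owner claim.
      allow
        .ownerDefinedIn("tenantId")
        .identityClaim("custom:tenantId")
        .to(["read", "create", "update"]),
    ]),
})
```

With `identityClaim("custom:tenantId")`, the `ownerDefinedIn("tenantId")` rule compares the row's `tenantId` to the tenant claim in the caller's JWT; without it, Amplify compares the field to its default owner identity claim. Cross-tenant reads are rejected at the AppSync layer by the generated resolver, not by application code. This assumes the claim is present in the token AppSync receives: Cognito includes custom attributes in ID tokens by default, so verify which token your client sends, or add the claim with a pre token generation trigger.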

## Data isolation

For most workloads, AppSync's authorization rules are sufficient. The query engine enforces the tenant filter. There is no application-level path that can return another tenant's row.

For PHI specifically, we add belt-and-suspenders:

1. **Per-tenant DynamoDB partition keys.** The PK is `TENANT#<id>#PATIENT#<mrn>`. Even if authorization were misconfigured, a query has to know the tenant's ID to find the row.
2. **S3 prefix isolation.** Files are stored at `s3://td-app/<tenant_id>/<rest>`. IAM policies restrict each Lambda role to its tenant's prefix where possible. Cross-tenant Lambdas (admin, audit) have explicit, logged exceptions.
3. **KMS keys per tenant.** For high-value tenants, the KMS key is per-tenant. Encryption at rest is enforced by the key, not just the storage configuration.

The cost: more KMS keys, more S3 prefixes, slightly more complex IAM. The benefit: a misconfigured authorization rule does not produce a tenant breach. The cryptographic boundary stops it first.
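
The key and prefix layout above can be sketched as a pair of builder functions. This is a minimal sketch, assuming tenant IDs are restricted to a safe character set; the function names and the allowed-character pattern are illustrative assumptions, not part of Amplify or the AWS SDK.

```ts
// Hypothetical helpers: names and the tenant ID format are assumptions.
const TENANT_ID_PATTERN = /^[a-z0-9-]{3,64}$/

// Reject IDs that could break out of a key or prefix (for example "#", "/", "..").
function assertTenantId(tenantId: string): string {
  if (!TENANT_ID_PATTERN.test(tenantId)) {
    throw new Error("invalid tenant id")
  }
  return tenantId
}

// DynamoDB partition key in the form TENANT#<id>#PATIENT#<mrn>.
function patientPk(tenantId: string, mrn: string): string {
  if (mrn.length === 0 || mrn.includes("#")) throw new Error("invalid mrn")
  return `TENANT#${assertTenantId(tenantId)}#PATIENT#${mrn}`
}

// S3 object key under the tenant's prefix: <tenant_id>/<rest>.
function tenantObjectKey(tenantId: string, rest: string): string {
  const segments = rest.split("/")
  // Empty, "." and ".." segments are refused so a path cannot climb into another prefix.
  if (segments.some((s) => s === "" || s === "." || s === "..")) {
    throw new Error("invalid object path")
  }
  return `${assertTenantId(tenantId)}/${rest}`
}

console.log(patientPk("clinic-a", "12345"))
console.log(tenantObjectKey("clinic-a", "scans/2026/report.pdf"))
```

Building every key through one validated helper keeps the tenant ID out of string concatenation scattered across handlers, which is where prefix mistakes tend to appear.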

## Custom Lambda paths

Amplify Gen 2 makes Lambda functions a first-class object in the schema. For business logic that goes beyond CRUD, you write a function and bind it to a query or mutation.

For multi-tenant work, the patterns:

- Read the tenant ID from the JWT at the top of the function. Never trust a tenant ID in the request body.
- Pass the tenant ID through to every downstream call — DynamoDB queries, S3 operations, Bedrock invocations, third-party APIs.
- Log the tenant ID with every log line. CloudWatch Logs Insights queries that filter by tenant become trivial.

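```ts
// Import path is an assumption; adjust it to your project layout.
import type { Schema } from "../../data/resource"

export const handler: Schema["createPriorAuthRequest"]["functionHandler"] = async (
  event
) => {
  // Only Cognito and OIDC identities carry claims; IAM, API key, and anonymous callers do not.
  const identity = event.identity
  const tenantId =
    identity && "claims" in identity ? identity.claims["custom:tenantId"] : undefined
  if (typeof tenantId !== "string" || !tenantId) throw new Error("missing tenant claim")

  // getPatient is an application helper (not shown) that scopes the lookup to the tenant.
  const patient = await getPatient(tenantId, event.arguments.mrn)
  if (!patient) throw new Error("not found")

  // ...
}
```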
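
For the logging pattern in the list above, a tenant-scoped logger keeps the tenant ID on every line without relying on each call site to remember it. This is a minimal sketch; `tenantLogger` and its field names are hypothetical, not an Amplify or CloudWatch API.

```ts
// Hypothetical logger helper: the name and field names are assumptions.
type LogFields = Record<string, string | number | boolean>

function tenantLogger(tenantId: string, requestId: string) {
  // Caller fields are spread first so they cannot overwrite tenantId or requestId.
  const format = (level: string, message: string, fields: LogFields = {}): string =>
    JSON.stringify({ ...fields, level, message, tenantId, requestId })

  return {
    format,
    info: (message: string, fields?: LogFields) =>
      console.log(format("info", message, fields)),
    error: (message: string, fields?: LogFields) =>
      console.error(format("error", message, fields)),
  }
}

// Usage inside a handler, after the tenant ID has been read from the JWT.
const tenantLog = tenantLogger("clinic-a", "req-123")
tenantLog.info("prior auth request created", { durationMs: 42 })
```

Because each line is JSON, a Logs Insights query such as `filter tenantId = "clinic-a"` can isolate one tenant's activity. Keep PHI such as names or MRNs out of the logged fields.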

## RAG and AI workloads

When the SaaS includes an AI feature — RAG over a tenant's documents, an agent that operates on tenant data — tenancy has to extend to the AI layer.

The pattern we use on Amplify Gen 2:

1. **Vector index per tenant.** Each tenant gets its own Bedrock Knowledge Base, OpenSearch index, or pgvector table. Cross-tenant retrieval is impossible.
2. **Tenant ID attached to every Bedrock call.** Pass it as request metadata where the API supports it, and make sure it lands in the invocation logs you retain alongside the CloudTrail record of the call.
3. **Audit log scoped to tenant.** Every AI interaction in the audit log carries the tenant ID. Tenant-scoped queries to the audit log are first-class.

Amplify's `defineFunction` and `defineStorage` make per-tenant resources straightforward to provision. The catch is the operations: more resources to monitor, more cost lines, more lifecycle to manage. We invest in automation here from day one — no manual provisioning of per-tenant resources, ever.

## Audit and compliance

For HIPAA SaaS on Amplify Gen 2:

- Enable CloudTrail at the management-event level, with data events for the S3 buckets and DynamoDB tables that hold PHI.
- Enable Config rules for the standard set: encryption at rest, public access blocked, MFA required.
- Use Security Hub or a third-party CSPM for ongoing posture monitoring.
- Send application logs to CloudWatch with structured logging (JSON, with tenant ID, user ID, request ID).
- Set up an audit log pipeline as described in [Audit Logging for AI Agents](/resources/architecture-guides/audit-logging-ai-agents) for any AI interactions.

The services Amplify Gen 2 provisions can be used under the AWS BAA, but eligibility attaches to each underlying service (AppSync, Lambda, DynamoDB, S3, Cognito), not to the framework branding. Verify that every service in your data path is on the HIPAA-eligible list before PHI touches it.

## Operations and rollout

Per-tenant resources require automation. The patterns we use:

- **Tenant onboarding as a step function.** Provisions per-tenant DynamoDB partition, S3 prefix, KMS key, vector index, audit log path. Idempotent. Logged.
- **Tenant offboarding as a step function.** Inverse of onboarding, with a hold period before destructive operations to allow contract dispute resolution.
- **Per-environment promotion.** Separate dev / staging / prod stacks, with the same tenancy model. Tenant data does not move between environments.
- **Per-tenant deploys are not a thing.** Application code is the same for all tenants; configuration is per-tenant.

## When Amplify Gen 2 stops being the right tool

For a small number of high-value tenants whose security posture demands hard isolation — separate AWS accounts, separate VPCs, separate everything — Amplify Gen 2 is not the natural fit. AWS Organizations + a CDK stack per tenant is more work but produces a stronger isolation story.

The line we draw: bridge model on Amplify Gen 2 for the long tail of tenants. Silo model on a CDK stack for the handful of tenants whose contracts require it. The same application code runs in both.

## Where to start

If you are at the beginning of a multi-tenant SaaS project, the design choices to make first:

1. **Pool, bridge, or silo?** Tied to your customer expectations and your compliance posture, not to your technology stack.
2. **One Cognito pool or many?** Affects user management, SSO, and admin tooling.
3. **What is your audit log strategy?** Has to be solid before you have customers.
4. **What is your tenant onboarding automation?** Has to be solid before you have customers.

The technology — Amplify Gen 2 vs. raw CDK vs. something else — is downstream of these decisions. We have shipped both Amplify Gen 2 and CDK-native architectures for similar problems. The framework matters less than the discipline.

---

## RAG Cost Models in Production
Source: https://tampadynamics.com/resources/architecture-guides/rag-cost-models-production

> How to think about — and budget for — the cost of a retrieval-augmented generation system in production. Covers embedding cost, retrieval cost, model invocation cost, and the operational tail.

Date: 2026-04-30

RAG systems have a reputation for being cheap to prototype and surprisingly expensive in production. The reputation is half-right. Production RAG is not expensive because of any single line item; it is expensive because the cost components compound across volume in ways that are not obvious until you look at the bill.

This guide breaks down where the money goes and what to do about it. Numbers are illustrative — the actual figures move with provider pricing — but the structure is stable.

## The four cost components

RAG cost in production breaks into four buckets:

1. **Embedding generation.** Each document chunk is embedded once at indexing time, plus every query is embedded at retrieval time.
2. **Vector storage and retrieval.** The vector database costs scale with the number of vectors stored and the queries served.
3. **LLM invocations.** The model call that synthesizes the answer from the retrieved context.
4. **Operations.** Logging, monitoring, retraining or re-indexing on changes, and the engineering time to maintain the system.

Most cost surprises come from underestimating one of these — usually #4.

## Embedding costs

For a standard embedding model (`text-embedding-3-large`, Cohere `embed-english-v3.0`, Bedrock Titan Embeddings, etc.), embedding cost is on the order of $0.10–$0.13 per million tokens.

For a document corpus of 100,000 documents averaging 5 pages each (about 2,500 tokens), that is 250M tokens — roughly $25–$33 to embed the whole corpus once. Trivial.

Where embedding cost gets interesting:

- **Re-indexing.** Every time you change embedding model, re-chunk, or update a significant fraction of the corpus, you re-embed. If you re-embed monthly, your annual embedding cost is 12× the one-time cost.
- **Query embedding.** Each query is embedded. At 1M queries per month with 200 token average queries, that is 200M tokens per month — about $20–$26.

The total embedding spend on most production RAG systems is small. It is rarely the line item to worry about.
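
The arithmetic in this section can be reproduced in a few lines. This sketch uses the illustrative prices and volumes from the text; the function name is an assumption, and real prices should come from your provider's current rate card.

```ts
// Illustrative arithmetic only: prices and volumes are the example figures from the text.
function embeddingCostUsd(tokens: number, usdPerMillionTokens: number): number {
  return (tokens / 1_000_000) * usdPerMillionTokens
}

const corpusTokens = 100_000 * 2_500 // 100,000 documents at about 2,500 tokens each
const monthlyQueryTokens = 1_000_000 * 200 // 1M queries at about 200 tokens each

for (const price of [0.1, 0.13]) {
  const oneTimeIndex = embeddingCostUsd(corpusTokens, price)
  const monthlyQueries = embeddingCostUsd(monthlyQueryTokens, price)
  // Annual spend if the whole corpus is re-embedded every month.
  const annualWithMonthlyReindex = 12 * (oneTimeIndex + monthlyQueries)
  console.log({ price, oneTimeIndex, monthlyQueries, annualWithMonthlyReindex })
}
```

Even with a full monthly re-index, the annual total stays in the hundreds of dollars at these volumes, which is why embedding is rarely the line to optimize first.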

## Vector storage and retrieval

Vector database cost depends heavily on the choice:

- **Bedrock Knowledge Bases backed by OpenSearch Serverless.** Pricing is OCU-based; a small workload is on the order of hundreds of dollars per month, scaling with index size and query rate.
- **Aurora PostgreSQL with pgvector.** The cost is the Aurora cluster — db.r6g.large is around $250/month plus storage. Scales with workload, not specifically with vectors.
- **Self-hosted Weaviate / Qdrant on EC2.** Compute and EBS costs. For a few million vectors, single-digit hundreds per month. For larger indexes, replication and HA add up.
- **Managed vendor (Pinecone, Weaviate Cloud).** Per-pod or per-RU pricing. Can be the cheapest at small scale, can be expensive at high query volumes.

For most regulated SaaS workloads in the < 10M vector range, vector storage is a few hundred dollars per month, climbing with index size and query volume. The cost is meaningful but rarely the largest line.

## LLM invocations — the line that dominates

This is where the bill lives. For a Claude 3.5 Sonnet call (a good default for many RAG workloads):

- Input: $3 per million tokens
- Output: $15 per million tokens

A typical RAG call has ~3,000 input tokens (system prompt + retrieved chunks + conversation history + user query) and ~500 output tokens. That is $0.009 input + $0.0075 output = ~$0.017 per call.

At 1M calls per month, that is $17,000.

This is the right ballpark to budget. The variations:

- **Smaller / cheaper model.** Claude Haiku, GPT-4o-mini, Llama 3.1 70B Instruct. 5–10× cheaper. Often good enough for retrieval-grounded answers, frequently insufficient for complex reasoning.
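- **Larger / more capable model.** Claude Opus, for example, lists at roughly 5× the price of Sonnet; other frontier models such as GPT-4o are priced differently, so check current rates. Accuracy gains are marginal for most RAG tasks.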
- **Longer context.** Larger retrieved chunks, more conversation history, larger system prompts. Each turn becomes more expensive.

For most production RAG systems, LLM invocations are 60–85% of the AI-related bill.
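
The per-call figure above follows directly from token counts and prices. A minimal sketch, using the Sonnet rates quoted in this section as example inputs (the type and function names are assumptions):

```ts
// Illustrative arithmetic: rates are the example figures quoted in this section.
interface TokenPricing {
  inputUsdPerMillion: number
  outputUsdPerMillion: number
}

function llmCallCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: TokenPricing
): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputUsdPerMillion +
    (outputTokens / 1_000_000) * pricing.outputUsdPerMillion
  )
}

const sonnetRates: TokenPricing = { inputUsdPerMillion: 3, outputUsdPerMillion: 15 }

// Typical call: 3,000 input tokens and 500 output tokens.
const typicalCall = llmCallCostUsd(3_000, 500, sonnetRates)
// Same call with twice the context (more chunks, longer history).
const longContextCall = llmCallCostUsd(6_000, 500, sonnetRates)

console.log({
  typicalCall,
  monthlyAtOneMillionCalls: typicalCall * 1_000_000,
  longContextCall,
})
```

The unrounded result is $0.0165 per call, or $16,500 per month at 1M calls, which the text rounds to about $0.017 and $17,000. Doubling the input context raises the per-call cost to about $0.0255, which is why context length is worth watching.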

## Operations — the line you forget

The cost line that surprises teams most is operations:

- **Audit logging.** S3 storage with object lock, lifecycle to Glacier, CloudWatch Logs ingestion. Scales with invocation volume. For 1M invocations per month with full prompt/output capture, this is $200–$800 per month.
- **Re-indexing on corpus updates.** If your corpus changes daily, you are running an indexing pipeline daily. Embedding cost (small), Lambda or Fargate cost (modest), bandwidth.
- **Evaluation harness.** A held-out eval set runs against every model or prompt change. Each eval case costs as much as a normal invocation, so a full run costs the case count times your per-call cost. Run it on every PR and you are adding several percent to your invocation costs.
- **Engineering time.** RAG quality is an ongoing tuning problem. Budget for 0.25–1 FTE of engineering attention per significant RAG system, indefinitely.

Operations is where production RAG costs more than prototype RAG. The prototype skips logging, evaluation, and re-indexing automation. Production cannot.

## Where the savings actually come from

Tactics we use to control RAG cost in production, in order of impact:

### 1. Smaller models where possible

The biggest lever is moving non-critical paths to a smaller model. Use cases:

- **Intent classification and routing.** Haiku or GPT-4o-mini does this fine. No reason to spend Sonnet tokens on it.
- **Simple summarization.** Haiku is good enough for short summaries.
- **Generation when retrieval is high-confidence.** If retrieval scored well above threshold, the synthesis task is mechanical — a smaller model handles it.

A routing layer that sends 70% of traffic to Haiku and 30% to Sonnet cuts the LLM bill substantially.
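
To put a number on "substantially", the blended cost of a routed workload can be estimated as below. This is a sketch under a simplifying assumption: every call costs the same on a given model, and the cheaper model is a fixed multiple cheaper per call. The function name is hypothetical.

```ts
// Fraction of the original LLM bill that remains after routing `cheapShare`
// of traffic to a model that is `cheaperBy` times cheaper per call.
// Assumes uniform per-call cost within each model (a simplification).
function routedCostFraction(cheapShare: number, cheaperBy: number): number {
  if (cheapShare < 0 || cheapShare > 1 || cheaperBy <= 0) {
    throw new Error("invalid input")
  }
  return 1 - cheapShare + cheapShare / cheaperBy
}

// 70% of traffic to a model 5x or 10x cheaper, 30% stays on the larger model.
for (const cheaperBy of [5, 10]) {
  const remaining = routedCostFraction(0.7, cheaperBy)
  console.log({ cheaperBy, remaining, savings: 1 - remaining })
}
```

With a 70/30 split, the remaining bill is 44% of the original when the small model is 5× cheaper and 37% when it is 10× cheaper, so the savings land in the 56–63% range under these assumptions.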

### 2. Caching

Identical or near-identical queries hit a cache instead of the model. The hit rate depends on workload — a customer support assistant might see 40% cache hits; a document analysis tool might see 5%.

The infrastructure: a key derived from the query, retrieved chunks, and model version. Redis or ElastiCache in front of the LLM call. Cache TTL based on data freshness requirements.
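
A cache key along these lines might be derived as follows. This is a sketch, not a prescribed scheme: the normalization rules, the `rag:` prefix, and the choice to sort chunk IDs (treating the same chunks in a different order as the same request) are all assumptions to adapt to your workload.

```ts
import { createHash } from "node:crypto"

interface CacheKeyInput {
  query: string
  chunkIds: string[]
  modelVersion: string
}

// Collapse case and whitespace so trivially different queries share a key.
function normalizeQuery(query: string): string {
  return query.trim().toLowerCase().replace(/\s+/g, " ")
}

function responseCacheKey(input: CacheKeyInput): string {
  const canonical = JSON.stringify({
    q: normalizeQuery(input.query),
    // Assumption: chunk order does not change the answer enough to matter.
    chunks: [...input.chunkIds].sort(),
    model: input.modelVersion,
  })
  // Hashing keeps the key short and keeps raw query text out of cache key names.
  return "rag:" + createHash("sha256").update(canonical).digest("hex")
}

console.log(
  responseCacheKey({
    query: "What is the refund policy?",
    chunkIds: ["doc-7#2", "doc-3#1"],
    modelVersion: "model-v1",
  })
)
```

Including the model version in the key means a model upgrade naturally invalidates old entries instead of serving answers generated by the previous model.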

### 3. Prompt compaction

The retrieved chunks dominate the input token count. Tactics:

- Re-rank and keep only top-K chunks instead of all retrieved. Top-3 instead of top-10 cuts retrieved-chunk tokens by 70%, assuming similar chunk sizes.
- Summarize older conversation history rather than carrying the full transcript.
- Externalize stable system instructions to model-side caching where the provider supports it (Anthropic prompt caching cuts the cost of repeated system prompts dramatically).
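
The first tactic above can be sketched as a selection step that runs after re-ranking. The `RankedChunk` shape and the token budget parameter are assumptions for illustration; token counts would come from whatever tokenizer your model uses.

```ts
interface RankedChunk {
  id: string
  score: number // re-ranker score, higher is better
  tokens: number // token count of the chunk text
}

// Keep at most topK chunks, best first, without exceeding maxTokens in total.
function selectChunks(
  chunks: RankedChunk[],
  topK: number,
  maxTokens: number
): RankedChunk[] {
  const ranked = [...chunks].sort((a, b) => b.score - a.score)
  const selected: RankedChunk[] = []
  let usedTokens = 0
  for (const chunk of ranked) {
    if (selected.length >= topK) break
    // Skip a chunk that would blow the budget; a smaller one may still fit.
    if (usedTokens + chunk.tokens > maxTokens) continue
    selected.push(chunk)
    usedTokens += chunk.tokens
  }
  return selected
}

// Ten retrieved chunks of 250 tokens each; keep the best three.
const retrieved: RankedChunk[] = Array.from({ length: 10 }, (_, i) => ({
  id: `chunk-${i}`,
  score: i / 10,
  tokens: 250,
}))
console.log(selectChunks(retrieved, 3, 1_000).map((c) => c.id))
```

Combining a count limit with a token budget guards against the case where a few unusually long chunks consume the savings that top-K was supposed to deliver.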

### 4. Output budgeting

Set `max_tokens` to a realistic ceiling. The model produces what it needs and stops. Without a ceiling, the model sometimes generates much longer responses than the use case requires.

### 5. Embedding model selection

For most workloads, the cheapest BAA-covered embedding model is fine. Spending more on embedding rarely translates to retrieval quality gains worth the operational complexity.

## Budget framework

For a planning conversation with a finance partner, here is the rough shape:

- **Pilot (10K–100K invocations/month, < 1M vectors).** $500–$2,500 per month all-in. Embedding negligible, vector DB modest, LLM roughly $170–$1,700 at the Sonnet rates above, ops modest.
- **Production (1M invocations/month, 1–10M vectors).** $15K–$30K per month. LLM dominates, ops growing, vector DB and embedding are background noise.
- **Enterprise scale (10M+ invocations/month).** Six figures per month. Routing, caching, and prompt compaction become essential to keep the LLM line manageable.

These numbers are illustrative, not pricing commitments. Real numbers depend on the model mix, the cache hit rate, and the complexity of each invocation.

## What to track

If you are operating a production RAG system, the cost metrics to put on a dashboard:

1. **Cost per invocation.** Total cost / total invocations per month. Tracks model selection and prompt compaction.
2. **Cache hit rate.** Cached invocations / total invocations. Higher is cheaper.
3. **Cost per active user.** Total cost / monthly active users. Connects cost to value.
4. **Tokens per invocation.** Input and output, separately. Trends here reveal prompt bloat early.
5. **Operations overhead.** Logging, eval, re-indexing as a percentage of total cost.

A dashboard that shows these trends weekly catches cost regressions before they become surprises.
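
The five metrics can be computed from per-invocation records plus a monthly operations total. This is a sketch under assumed data shapes: `InvocationRecord` and `costDashboard` are hypothetical names, and in practice the records would come from your invocation logs or billing exports.

```ts
interface InvocationRecord {
  userId: string
  costUsd: number // LLM cost of this invocation (0 for a cache hit)
  inputTokens: number
  outputTokens: number
  cacheHit: boolean
}

function costDashboard(records: InvocationRecord[], opsCostUsd: number) {
  if (records.length === 0) throw new Error("no invocations in period")
  const n = records.length
  const sum = (pick: (r: InvocationRecord) => number) =>
    records.reduce((acc, r) => acc + pick(r), 0)

  const totalCostUsd = sum((r) => r.costUsd) + opsCostUsd
  const activeUsers = new Set(records.map((r) => r.userId)).size

  return {
    costPerInvocation: totalCostUsd / n,
    cacheHitRate: records.filter((r) => r.cacheHit).length / n,
    costPerActiveUser: totalCostUsd / activeUsers,
    avgInputTokens: sum((r) => r.inputTokens) / n,
    avgOutputTokens: sum((r) => r.outputTokens) / n,
    // Logging, eval, and re-indexing as a share of total cost.
    opsOverheadShare: opsCostUsd / totalCostUsd,
  }
}

const sampleRecords: InvocationRecord[] = [
  { userId: "u1", costUsd: 0.02, inputTokens: 3000, outputTokens: 500, cacheHit: false },
  { userId: "u1", costUsd: 0.02, inputTokens: 3000, outputTokens: 500, cacheHit: false },
  { userId: "u2", costUsd: 0, inputTokens: 0, outputTokens: 0, cacheHit: true },
  { userId: "u3", costUsd: 0.02, inputTokens: 3000, outputTokens: 500, cacheHit: false },
]
console.log(costDashboard(sampleRecords, 0.04))
```

Computing these from the same records every week makes trends comparable; a rising average input token count is usually the earliest sign of prompt bloat.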

## Where most teams overspend

The pattern we see most often: a team builds a working RAG system, deploys it, and never re-evaluates the model selection. They use the most capable model for everything because it was the right choice during prototyping. A year later, 80% of the workload would run fine on a model 5× cheaper, and the bill is nearly 3× what it could be (routing that 80% would leave 0.2 + 0.8 × 0.2 = 0.36 of current spend).

The fix is not exotic. Profile actual workload by complexity. Route accordingly. Re-profile every quarter. The savings show up immediately, the engineering work is bounded, and the system gets cheaper without getting worse.

---

## Vector Database Selection for HIPAA Workloads
Source: https://tampadynamics.com/resources/architecture-guides/vector-db-selection-hipaa

> A practical comparison of vector database options for healthcare AI workloads — covering BAA coverage, tenant isolation, encryption, and operational fit.

Date: 2026-04-15

When you build a RAG system that touches PHI, the vector database is in the data path. That means it has to be on your BAA, encrypted at rest with controls you can demonstrate, isolated by tenant, and capable of producing audit logs that are useful to a Security Officer.

This guide covers the options we have shipped to production and the ones we have considered and ruled out, with the trade-offs that matter in HIPAA-aligned work.

## What a HIPAA-aligned vector store needs

Before comparing products, the requirements:

1. **BAA coverage.** Either the vector database is BAA-covered itself, or it runs inside a covered service (AWS, Azure, GCP under their respective BAAs) on infrastructure you control.
2. **Encryption at rest with customer-managed keys.** AWS KMS, Azure Key Vault, or GCP Cloud KMS, with the ability to rotate keys and demonstrate which key encrypted which records.
3. **Encryption in transit.** TLS 1.2+, with no exceptions for internal traffic.
4. **Tenant isolation.** Either separate indexes, namespaces, or hard metadata filters that you can prove cannot leak across boundaries.
5. **Audit logging.** Every query, write, and admin action with a user identity, timestamp, and result. CloudTrail or equivalent.
6. **Access control.** IAM-style permissions, not just a shared API key.
7. **Right-to-deletion.** Specific records can be deleted on demand, and the deletion is verifiable.

Anything that does not satisfy all seven gets ruled out for PHI workloads.

## Options we ship to production

### Amazon Bedrock Knowledge Bases

For AWS-native HIPAA workloads, this is the default. Knowledge Bases is a managed RAG layer that ingests from S3, generates embeddings, stores them in OpenSearch Serverless or Aurora PostgreSQL with pgvector, and serves retrieval-augmented queries — all under the AWS BAA.

The trade-off: less control over chunking, embedding model selection, and retrieval logic than rolling your own. For workloads where the standard configuration is good enough, the operational simplicity is a real win.

### OpenSearch (managed) with kNN

OpenSearch Service on AWS is HIPAA-eligible. The k-NN plugin handles vector search natively, and OpenSearch's text search is mature, so hybrid retrieval (vector + BM25) works well.

Use this when you need full control over indexing, custom retrieval logic, and you are already operating other OpenSearch workloads.

### Aurora PostgreSQL with pgvector

For teams already running Postgres, pgvector is the lowest-friction path. Aurora is HIPAA-eligible. Encryption, IAM, audit logging, and access control come from the database layer you already operate.

The trade-off: pgvector is fast enough for indexes up to a few million vectors, but tuning it for larger workloads requires care. Index choice (HNSW vs IVFFlat), embedding dimensionality, and query patterns all matter.

### Azure AI Search

For Microsoft-centric workloads, Azure AI Search is the equivalent of OpenSearch with vector support, integrated with Azure OpenAI for embeddings. HIPAA-covered under Microsoft's BAA.

## Options we have considered and ruled out for PHI

### Pinecone

Pinecone is fast, well-engineered, and the team behind it knows what they are doing. For workloads that do not involve PHI, it is often the right answer.

For HIPAA, the question is whether your Pinecone deployment is BAA-covered. Pinecone offers HIPAA compliance through specific AWS-hosted deployments under a signed BAA, but the deployment options and pricing are different from the standard product. The lift to verify BAA scope, keep deployments inside the covered tier, and document the data path adds operational overhead. For most clients we work with, the BAA-covered AWS-native options are simpler.

If your team has specific needs Pinecone serves better — multi-region replication, very large indexes, specific performance characteristics — and you are willing to do the BAA verification work, it remains a viable option.

### Weaviate, Qdrant, Milvus (self-hosted)

These are excellent open-source vector databases. The catch for HIPAA is that you operate them yourself. Encryption at rest with customer-managed keys, audit logging, access control, backup and recovery, patching, and incident response are all your responsibility.

We use these for non-PHI workloads where the operational burden is justified by performance or cost. For PHI workloads, the engineering and ongoing operations cost rarely justifies the savings over a managed BAA-covered option.

### Chroma, FAISS

These are libraries, not databases. They are the right tool for prototyping or for embedded use cases where the index is small and lives in your application. They are not a fit for production PHI workloads — there is no audit log, no access control, no operational story.

## Decision framework

1. **AWS-native and you want minimal ops?** Bedrock Knowledge Bases.
2. **AWS-native and you need full retrieval control?** OpenSearch with kNN.
3. **Postgres-centric team?** Aurora PostgreSQL with pgvector.
4. **Microsoft-centric?** Azure AI Search.
5. **Have specific Pinecone-only needs and willing to do BAA verification?** Pinecone on the covered tier.
6. **Self-hosting open-source for PHI?** Probably not. The total cost of ownership is higher than it looks.

## What "tenant isolation" really means

Whichever option you pick, tenant isolation has to be verifiable. The two patterns we use:

- **Index-per-tenant.** Each customer (or each matter, or each clinic) gets its own index. Cross-tenant queries are physically impossible.
- **Single index with hard metadata filters.** All vectors share an index, but every query includes a tenant filter. The filter is enforced at the application layer before the vector search runs.

Index-per-tenant is more expensive but easier to defend. Single index with filters is cheaper but requires meticulous code review — a single missed filter is a tenant breach. We default to index-per-tenant for healthcare and legal work.
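
For the single-index pattern, one way to reduce the "missed filter" risk is to make the tenant filter impossible to omit: all queries go through a builder that injects it. This is a generic sketch; the `VectorQuery` shape is an assumption and the real filter syntax depends on the vector store you use.

```ts
type MetadataFilter = Record<string, string | number | boolean>

// Generic query shape (an assumption): adapt to your vector store's API.
interface VectorQuery {
  embedding: number[]
  topK: number
  filter: MetadataFilter
}

// The only sanctioned way to build a query: the tenant filter is always present
// and callers cannot supply or override it through extra filters.
function tenantScopedQuery(
  tenantId: string,
  embedding: number[],
  topK: number,
  extraFilter: MetadataFilter = {}
): VectorQuery {
  if (!tenantId) throw new Error("missing tenant id")
  if ("tenantId" in extraFilter) {
    throw new Error("callers may not set tenantId in filters")
  }
  return { embedding, topK, filter: { ...extraFilter, tenantId } }
}

const scopedQuery = tenantScopedQuery("clinic-a", [0.1, 0.2, 0.3], 5, {
  docType: "discharge-summary",
})
console.log(scopedQuery.filter)
```

Code review then has one invariant to check: nothing calls the vector store except through this builder. That is still weaker than index-per-tenant, which is why the default above remains separate indexes for healthcare and legal work.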

## Audit logging requirements

For each retrieval, log:

- The requesting user identity (carried from the application layer)
- The query (or its embedding hash, if the query itself is PHI)
- The retrieved chunk IDs
- The matter / case / patient context
- The timestamp and latency

These logs are PHI. Treat them like any other PHI store: encrypted, access-controlled, retained for at least six years to match HIPAA's documentation-retention period, and never exported to systems lacking BAA coverage.
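
The fields listed above map onto a small record type. This is a sketch with assumed names (`RetrievalAuditRecord`, `buildRetrievalAuditRecord`); it hashes the query text with SHA-256 rather than hashing the embedding, which is a simplification of the option mentioned in the list.

```ts
import { createHash } from "node:crypto"

interface RetrievalAuditRecord {
  userId: string
  queryHash: string // SHA-256 of the query text, so the raw query is not stored
  chunkIds: string[]
  contextId: string // matter, case, or patient context
  timestamp: string // ISO 8601, UTC
  latencyMs: number
}

function buildRetrievalAuditRecord(input: {
  userId: string
  query: string
  chunkIds: string[]
  contextId: string
  startedAt: Date
  finishedAt: Date
}): RetrievalAuditRecord {
  return {
    userId: input.userId,
    queryHash: createHash("sha256").update(input.query, "utf8").digest("hex"),
    chunkIds: [...input.chunkIds],
    contextId: input.contextId,
    timestamp: input.startedAt.toISOString(),
    latencyMs: input.finishedAt.getTime() - input.startedAt.getTime(),
  }
}

const auditRecord = buildRetrievalAuditRecord({
  userId: "user-42",
  query: "recent lab results",
  chunkIds: ["doc-3#1", "doc-7#2"],
  contextId: "case-1001",
  startedAt: new Date("2026-04-15T12:00:00.000Z"),
  finishedAt: new Date("2026-04-15T12:00:00.120Z"),
})
console.log(JSON.stringify(auditRecord))
```

A hash of a short query is not anonymization, since common queries can be guessed and re-hashed, so the record still belongs in the PHI store described above.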

## Where to start

If you are in early architecture for a HIPAA RAG workload, the cheapest move is to default to Bedrock Knowledge Bases or Aurora pgvector and only deviate when you have specific evidence the default does not work. The temptation to over-engineer the vector layer is strong; resist it. The retrieval quality problems you will hit are almost always about chunking, embeddings, and re-ranking — not the vector database itself.


# Comparisons

---

## AWS Bedrock vs. Azure OpenAI for Regulated Workloads
Source: https://tampadynamics.com/compare/aws-bedrock-vs-azure-openai-regulated

> A practical comparison of the two BAA-eligible managed LLM platforms for healthcare, legal, and SOC 2 workloads — from a team that has shipped production systems on both.

Date: 2026-05-02

## The honest framing

Both platforms work for regulated workloads. Both offer BAAs. Both run inside your covered cloud tenancy. Both have audit logging that meets HIPAA's technical safeguards.

The decision is rarely about the platform's intrinsic merits. It is about where your data, your identity provider, and your existing compliance documentation already live. Moving an LLM endpoint to a different cloud than your data lives on creates a cross-cloud data path that has to be designed, monitored, and explained to auditors. That is work, and it produces nothing the customer cares about.

Pick the cloud where your sensitive data already sits. The model is a small part of the system; the data path is the whole game.

## What Bedrock does well

Amazon Bedrock is a native fit for AWS-centric architectures:

- **Model selection.** Bedrock offers Anthropic Claude, Meta Llama, Mistral, Cohere, AI21, Amazon Titan, and others through a single API. You can switch models by changing a parameter, not by re-architecting.
- **Knowledge Bases.** A managed RAG layer that ingests documents from S3, generates embeddings, and serves retrieval-augmented queries — all under the same BAA as Bedrock itself.
- **Bedrock Guardrails.** Configurable content filters, PII redaction, and prompt-injection detection that operate inside the BAA.
- **IAM integration.** The same IAM model that controls every other AWS service. Permissions, roles, and audit trails are familiar.
- **VPC endpoints.** Bedrock invocations can be confined to your VPC. No traffic over the public internet.

Bedrock also has CloudWatch and CloudTrail integration that makes audit logging straightforward — every model invocation is a CloudTrail event you can route, retain, and query.

## What Azure OpenAI does well

Azure OpenAI is the natural choice when the rest of your stack is Microsoft:

- **GPT-class models with enterprise terms.** Azure OpenAI was the first BAA-covered home for GPT-4-class models and remains the place most enterprises run them.
- **Entra integration.** Authentication, conditional access, and identity-based controls flow through Microsoft Entra natively.
- **Microsoft 365 integration.** If your knowledge base is SharePoint, OneDrive, and Teams, Azure OpenAI plus Azure AI Search is the lowest-friction stack.
- **Private networking.** Azure Private Link, VNet integration, and customer-managed keys are all first-class.
- **Compliance documentation.** Microsoft maintains some of the most thorough compliance documentation in the industry, including specific HIPAA, HITRUST, and SOC 2 attestations for Azure OpenAI.

For organizations whose IT and security teams already operate in Azure, Azure OpenAI is a known quantity in a known environment.

## Where they are equivalent

For most healthcare and legal workloads, the model quality difference between Claude on Bedrock and GPT-4-class on Azure OpenAI is not the deciding factor. Both are good enough. Both will surprise you in similar ways. Both will benefit from the same prompt engineering investment.

What differs is the surrounding system — retrieval, identity, networking, logging, deployment automation. That is where the cloud-fit decision pays off.

## A decision framework

1. **Where does your sensitive data already live?** AWS → Bedrock. Azure → Azure OpenAI. Both → keep them separate, do not bridge.
2. **Where does your identity live?** AWS IAM / SSO via IAM Identity Center → Bedrock fits. Microsoft Entra → Azure OpenAI fits.
3. **What is your compliance documentation already aligned to?** Re-aligning compliance docs across clouds is a non-trivial cost.
4. **Do you have a strong preference for a specific model family?** Claude on Bedrock, GPT on Azure. Otherwise, treat the model as fungible.
5. **Are you an AWS or Microsoft Partner?** Partner programs, support, and credits often tilt the calculation.

## What we usually ship

Most of our healthcare and SOC 2 work ships on AWS Bedrock, because most of those clients already had AWS BAAs and existing AWS infrastructure. The lift to add Bedrock to an AWS-native architecture is hours, not weeks.

For Microsoft-centric clients, Azure OpenAI is the right call. We have shipped both. The systems are similar at the architecture level; the integration details differ.

## Cross-cloud is the trap

The case we counsel against, almost always: data in AWS, model in Azure (or vice versa). The cross-cloud data path requires its own networking, its own egress costs, its own BAA mapping, its own audit story. Unless you have a specific reason — a model you cannot get on the other cloud, a regulatory requirement that forces it — pick one cloud and stay there for the AI workload.

---

## Boutique Consultancy vs. Big 4 Firms
Source: https://tampadynamics.com/compare/boutique-vs-big-consultancies

> When specialized expertise beats enterprise overhead for regulated industry software projects.

Date: 2025-01-15

## The Real Difference

The choice between a boutique consultancy and a Big 4 firm often comes down to what you actually need versus what looks good in a vendor selection committee.

Big 4 firms excel at large-scale transformation programs where brand recognition matters for stakeholder alignment, where you need boots on the ground across multiple regions, and where the project is more about organizational change than technical delivery.

Boutique consultancies like Tampa Dynamics excel when the work is technical, when you need senior engineers who will actually write code (not just manage people who write code), and when deep expertise in a specific domain matters more than breadth across every industry.

## What You Get With a Boutique Firm

**Senior engineers from day one.** When you hire us, you work directly with the people who design and build your system. There's no bait-and-switch where partners sell the engagement and junior consultants deliver it.

**Domain expertise that matters.** We focus on healthcare, legal, and compliance-driven industries. We understand HIPAA, attorney-client privilege, and SOC 2 requirements because we work with them daily, not because we read about them in a pre-engagement briefing.

**Accountability without bureaucracy.** When something needs to change, we change it. There's no waiting for approval from multiple layers of management or navigating complex internal processes.

## When to Choose Big 4

Big 4 firms make sense when:

- You need a brand name to get internal buy-in from risk-averse stakeholders
- The project scope is truly massive and requires hundreds of consultants
- You need presence in multiple countries with local regulatory expertise
- The engagement is primarily about organizational change, not technical delivery
- Your procurement process strongly favors established enterprise vendors

## When to Choose Boutique

A boutique consultancy makes sense when:

- You need senior technical expertise, not junior resources with oversight
- The project requires deep domain knowledge in a specific industry
- You want to move fast without enterprise overhead
- You're looking for a long-term partner, not a vendor you'll replace next year
- Value for money matters more than brand recognition

## The Bottom Line

Both models work. The question is which one fits your actual needs, not which one looks better in a procurement justification document.

If your primary challenge is technical and requires deep expertise in regulated industries, you'll likely get better results from a specialized firm that does this work every day than from a generalist that can do everything adequately.

---

## Build vs. Buy: Document Automation for Law Firms
Source: https://tampadynamics.com/compare/build-vs-buy-document-automation-legal

> A practical framework for deciding between off-the-shelf legal document automation platforms and a custom system built around your firm's actual workflow.

Date: 2026-04-26

## The honest framing

Most legal document automation projects we see start with the same question: "Should we buy HotDocs / Contract Express / Documate / Gavel, or build something?" The answer depends on what you are actually trying to automate.

For straightforward document assembly — clause libraries, conditional logic, intake forms feeding template merges — buying is almost always the right answer. The vendors have been at it for two decades. Their products work.

The question gets harder when AI-assisted drafting is in scope, when privilege is non-negotiable, or when the document workflow needs to integrate deeply with the rest of the firm's systems. Those are the cases worth this analysis.

## What "buy" gets you

The major SaaS document automation platforms give you:

- **A template engine.** Mature, well-documented, with conditional logic, computed fields, and clause libraries.
- **Intake forms.** Web-based questionnaires that feed the template engine.
- **Clause and template libraries.** Some vendor-curated, some community-contributed.
- **Integrations.** Connectors to common DMS, e-signature, and matter management platforms.
- **A maintained product.** Updates, support, and feature roadmap.

The trade-offs:

- Privileged content lives in the vendor's tenancy. Your BAA equivalent for legal — your engagement letter and your bar's confidentiality rules — has to extend to them.
- The workflow is the vendor's workflow, with configuration. If your firm's review process does not fit, you bend.
- AI features in vendor products are improving fast, but they are designed for the average customer, not for your firm's specific risk posture.

## What "build" gets you

A custom system gives you:

- **Workflow that fits.** Review steps, escalations, conflict checks, and approval routing match your firm exactly.
- **Tenancy control.** Documents and AI invocations stay inside infrastructure you own, with audit logs you control.
- **Deep integration.** Direct API connections to your matter management, billing, conflict, and DMS systems. No vendor middleware.
- **AI-assisted drafting on your terms.** Custom prompts grounded in your firm's clause library, with citation requirements and privilege boundaries enforced in code.

The trade-offs:

- You own the template engine. Even if you reuse open-source libraries, you are responsible for the conditional logic and the rendering pipeline.
- You own the maintenance. Bugs, browser compatibility, and accessibility are yours.
- Your timeline is months, not days. A meaningful build is a multi-phase engagement, not a weekend project.

## Where AI changes the calculation

AI-assisted drafting — where the model proposes language based on a clause library, opposing counsel's positions, or matter-specific facts — is harder to do credibly through an off-the-shelf product than it sounds.

The reasons are about confidentiality and audit. A drafting AI needs to see the matter's facts. Those facts are often privileged. The draft itself is privileged. The vendor's pipeline that touches them is operating on privileged content. If something goes wrong in the vendor's infrastructure, the firm has a problem.

Custom builds let you keep the full pipeline inside your tenancy. The model lives behind your BAA-equivalent (Bedrock, Azure OpenAI). The retrieval index of your clause library is in your account. The audit log is yours. When a partner asks "show me what the AI saw and what it produced for this matter," you have a clean answer.

## A decision framework

Ask:

1. **What document types?** Standard transactional documents are well served by vendor templates. Specialized litigation documents, motion practice, and complex agreements lean toward build.
2. **AI in scope?** If yes, the privilege boundary becomes load-bearing. Build is the safer default.
3. **Integration depth?** If you need the system to read the matter, write to billing, check conflicts, and update the DMS in one flow, build is usually faster than wiring six vendor integrations.
4. **Volume and value?** A small firm with a handful of matter types is usually well-served by buy. A firm with high volume, specialized practice, and meaningful AI ambitions usually does better with build over a 12–24-month horizon.

## Hybrid often wins

In practice, the right answer is rarely 100% build or 100% buy. A common pattern: keep the off-the-shelf platform for the routine documents your associates churn through; build custom infrastructure for the matters where AI assist and privilege matter. The two coexist in the firm's tech stack.

If you are working through this decision, an architecture review is the cheapest way to get clarity on which side of the line each of your document types should fall.

---

## Custom AI vs. ChatGPT Enterprise for Healthcare
Source: https://tampadynamics.com/compare/custom-ai-vs-chatgpt-enterprise-healthcare

> When a managed AI subscription is the right answer for a healthcare team — and when it stops being one.

Date: 2026-04-22

## The actual decision

The framing of "build vs. buy" does not capture this choice well. ChatGPT Enterprise and Microsoft Copilot are managed assistants — productivity tools for individual users. A custom AI system is an application: scoped to a specific workflow, integrated with your data, deployed inside your tenancy. They are different categories of thing.

The question is not which one is better. The question is which category your problem belongs in.

## When ChatGPT Enterprise / Copilot is the right answer

A managed assistant is the right tool when:

- The use case is general knowledge work — drafting, summarizing, brainstorming — not a clinical or operational workflow.
- PHI is not in scope. Either the work does not touch PHI, or you have a process to keep PHI out of the assistant entirely.
- The data the assistant needs to see is general internal documents, not regulated patient records.
- A per-seat subscription cost model fits your usage pattern.
- You want minimal IT overhead and immediate availability for your users.

Both ChatGPT Enterprise and Microsoft Copilot offer BAAs in their enterprise tiers, but a BAA does not automatically make every workflow you build on top of them appropriate. The BAA is a necessary condition; whether to handle PHI through a managed assistant is still a design decision.

## When custom AI is the right answer

A custom AI system is the right tool when:

- The workflow involves PHI, and you need full control over the data path, retrieval, and audit log.
- The workflow has specific tools — retrieving from your policy library, writing to your case management system, calling your eligibility API — that a managed assistant cannot expose.
- You need an audit log that captures every model invocation tied to a specific user, case, and clinical context, in a format your compliance team can query.
- The AI is part of a product you ship — not a tool used internally — and the product needs to be defensible end-to-end.
- The use case has volume and value that justifies the engineering investment.

## What custom does not mean

Custom does not mean training your own model. The model is almost always a foundation model from Amazon Bedrock, Azure OpenAI, or Anthropic — selected because the provider has a BAA and runs the model inside your covered cloud.

Custom means: the application that wraps the model, the retrieval that grounds it, the tools it can call, the guardrails on its outputs, and the audit log of everything it does. Those are yours.

## A decision framework

Ask the following:

1. **Does the workflow touch PHI?** If yes, custom is the default. Managed assistants can be made to work, but the burden of proof is on the deployer.
2. **Does the workflow need to call your systems?** Managed assistants offer plugins and connectors, but they are limited compared to a system designed against your APIs.
3. **What does your audit log look like?** If a Security Officer cannot get the answer to "show me every AI interaction this user had with this patient's data" from a managed assistant, you may need custom.
4. **Will this AI be embedded in a product you sell?** If yes, custom — your customers' compliance posture depends on yours.
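
To make the audit-log question in item 3 concrete, here is a minimal sketch of the query a Security Officer would need to be able to run. The table name, columns, and sample rows are all hypothetical, and an in-memory SQLite database stands in for whatever store actually holds the log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ai_audit_log (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL,
        patient_id TEXT NOT NULL,
        action TEXT NOT NULL,
        occurred_at TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO ai_audit_log (user_id, patient_id, action, occurred_at) "
    "VALUES (?, ?, ?, ?)",
    [
        ("u-100", "p-1", "summarize_chart", "2026-01-05T14:00:00Z"),
        ("u-100", "p-2", "draft_letter", "2026-01-05T15:00:00Z"),
        ("u-200", "p-1", "summarize_chart", "2026-01-06T09:00:00Z"),
        ("u-100", "p-1", "eligibility_check", "2026-01-07T10:30:00Z"),
    ],
)


def interactions_for(conn, user_id, patient_id):
    """Every logged AI interaction one user had with one patient's data."""
    rows = conn.execute(
        "SELECT action, occurred_at FROM ai_audit_log "
        "WHERE user_id = ? AND patient_id = ? ORDER BY occurred_at",
        (user_id, patient_id),
    )
    return rows.fetchall()


history = interactions_for(conn, "u-100", "p-1")
```

If the equivalent question cannot be answered this directly from a managed assistant's logs, that is a signal the workflow may belong on the custom side.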

## Hybrid is fine

The two are not mutually exclusive. Many teams use a managed assistant for general productivity and a custom system for the workflow that has compliance teeth. The mistake is using a managed assistant for the workflow because it was easy, then trying to retrofit it for compliance later.

## Engagement starting point

If you are unsure which side of the line your workflow falls on, an architecture review is the right first step. We will look at the data, the regulatory exposure, and the cost model and tell you what we think — not what we hope you will want to buy.

---

## RAG vs. Fine-Tuning for Compliance Use Cases
Source: https://tampadynamics.com/compare/rag-vs-fine-tuning

> When to retrieve and when to retrain — a practical comparison for AI workloads in HIPAA, SOC 2, and other audit-bound environments.

Date: 2026-04-29

## The default answer

For almost any compliance-bound AI workload, the answer is RAG. We say this not because RAG is fashionable, but because the data lifecycle obligations in regulated industries push you toward keeping data out of the model.

Fine-tuning encodes the training data into the weights. Whatever you train on — patient notes, contract clauses, internal investigation memos — becomes a property of the resulting model. A right-to-deletion request, a contract termination requiring data return, or a discovery of training data that should not have been used means retraining the model. That is expensive, slow, and often impossible to verify.

RAG keeps the data in an index you control. Deletion is a delete query. Updates are reindexing. Audit logs of which chunks were retrieved for which query are first-class artifacts. None of that exists in a fine-tuned model.
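
A toy illustration of why deletion is tractable in RAG: every chunk carries the ID of the record it came from, so removing a record is a lookup and a delete. `ChunkIndex` and its methods are hypothetical, and a keyword match stands in for vector search.

```python
class ChunkIndex:
    """In-memory stand-in for a retrieval index (hypothetical)."""

    def __init__(self):
        self._chunks = {}  # chunk_id -> (source_record_id, text)

    def upsert(self, chunk_id, source_record_id, text):
        self._chunks[chunk_id] = (source_record_id, text)

    def delete_record(self, source_record_id):
        """Remove every chunk derived from one source record."""
        doomed = [
            cid for cid, (rid, _) in self._chunks.items()
            if rid == source_record_id
        ]
        for cid in doomed:
            del self._chunks[cid]
        return len(doomed)

    def search(self, term):
        return sorted(
            cid for cid, (_, text) in self._chunks.items()
            if term.lower() in text.lower()
        )


index = ChunkIndex()
index.upsert("c1", "note-1", "Patient reports knee pain")
index.upsert("c2", "note-1", "Knee MRI ordered")
index.upsert("c3", "note-2", "Knee brace fitted")

before = index.search("knee")
removed = index.delete_record("note-1")  # e.g. a deletion request for note-1
after = index.search("knee")
```

After the delete, nothing derived from `note-1` can be retrieved, and that fact is checkable by querying the index. There is no comparable check for model weights.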

## Where fine-tuning wins

Fine-tuning is the right tool when:

- **The task is stylistic.** You need outputs in a very specific format — a particular section structure, vocabulary, or tone — that prompting cannot reliably enforce.
- **The base model is consistently failing on a narrow pattern.** You have evidence from evaluation that the model gets a specific class of inputs wrong, and prompt engineering has not closed the gap.
- **The data is non-sensitive and stable.** Public-domain text, your own marketing voice, deterministic patterns that do not change quarterly.
- **Latency or token budget is binding.** Fine-tuned models can be smaller, faster, and cheaper at inference if the use case is narrow enough to justify the upfront cost.

Notice that none of these are about the model "not knowing things." If you find yourself wanting to fine-tune so the model "knows" something, the answer is RAG.

## Where RAG wins

RAG is the right tool when:

- The model needs to reason over a corpus you control, where the corpus changes.
- Audit and source attribution are required. Every answer must cite the source documents.
- The data lifecycle is sensitive — you need to add, remove, or update specific records without retraining.
- Tenant isolation matters. Different customers, matters, or patient populations need separate retrieval scopes.
- You want to change models later. RAG is model-agnostic; the index does not care which LLM you query against.

## The hybrid case

Some workloads use both: a fine-tuned model for output structure and tone, with RAG for the factual grounding. This is rare, expensive, and usually unnecessary. We have done it once where the firm's house style was specific enough that prompting could not enforce it consistently. Both pieces required ongoing maintenance.

If you are reaching for hybrid, make sure RAG alone has been genuinely tuned first. Most retrieval-quality problems can be solved with better chunking, better embedding models, and better re-ranking — not by adding fine-tuning into the mix.

## What "tuned RAG" looks like

When we say RAG should be exhausted before fine-tuning, here is what tuning RAG means in practice:

1. **Chunk boundaries.** Sentence-aware, paragraph-aware, or semantic chunking. Fixed-size chunks at character boundaries are a starting point, not an endpoint.
2. **Embedding model selection.** Domain-specific embeddings (clinical, legal) often outperform general-purpose ones. The decision is also tied to BAA / privacy posture.
3. **Hybrid retrieval.** Vector search plus BM25, with reciprocal rank fusion. Pure vector search misses exact-phrase matches.
4. **Re-ranking.** A cross-encoder re-ranker on the top 50–100 candidates usually improves precision, because it scores each query-chunk pair jointly instead of comparing precomputed embeddings.
5. **Filtering.** Metadata filters (tenant, document type, recency) before retrieval, not after.
6. **Citation enforcement.** The prompt requires citations; the orchestrator rejects outputs without them.
7. **Evaluation harness.** Held-out queries with known correct sources. You measure retrieval quality (recall@k) separately from generation quality.

If you have not done these and you are reaching for fine-tuning, you are probably solving the wrong problem.
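
Two of the items above, reciprocal rank fusion (item 3) and recall@k (item 7), are small enough to sketch directly. This is an illustrative implementation, not a description of any particular library. The document IDs are made up, and `k=60` is a commonly used default for the fusion constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by doc ID for determinism.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


def recall_at_k(retrieved, relevant, k):
    """Fraction of the known-relevant documents found in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


vector_hits = ["d3", "d1", "d7"]  # hypothetical vector-search ranking
bm25_hits = ["d1", "d9", "d3"]    # hypothetical BM25 ranking
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])

# Held-out query whose correct sources are known to be d1 and d9.
relevant = {"d1", "d9"}
recall_top2 = recall_at_k(fused, relevant, k=2)
recall_top3 = recall_at_k(fused, relevant, k=3)
```

Documents that appear in both lists (`d1`, `d3`) rise to the top of the fused ranking, which is the behavior that makes hybrid retrieval catch exact-phrase matches that pure vector search misses.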

## A practical decision tree

```
Does the workload involve sensitive or changing data?
├─ Yes → RAG. Stop here unless you have specific evidence it is insufficient.
└─ No → Does the model produce wrong content, or wrong-format content?
   ├─ Wrong content → RAG. The model needs grounding, not adaptation.
   └─ Wrong format → Has prompting failed?
      ├─ Yes → Consider fine-tuning.
      └─ No → Better prompts first.
```
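
The same tree can be written as a small function, which is sometimes useful for intake checklists. This is a sketch: the function name, parameter names, and the `"content"` / `"format"` labels are our own shorthand for the branches above.

```python
from typing import Literal


def recommend_approach(
    sensitive_or_changing_data: bool,
    failure_mode: Literal["content", "format"],
    prompting_failed: bool,
) -> str:
    """Walk the decision tree above and return its recommendation."""
    if sensitive_or_changing_data:
        return "RAG"
    if failure_mode == "content":
        # Wrong content means the model needs grounding, not adaptation.
        return "RAG"
    if prompting_failed:
        return "Consider fine-tuning"
    return "Better prompts first"


choice = recommend_approach(
    sensitive_or_changing_data=False,
    failure_mode="format",
    prompting_failed=True,
)
```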

## Engagement starting point

If you are weighing fine-tuning for a compliance use case, the first conversation is usually about whether RAG has actually been tuned, or whether you are looking at untuned, baseline RAG and concluding it does not work. We have those conversations regularly. They take an hour and save weeks of misdirected work.