# Tampa Dynamics — Full Content

Founder-led, engineering-first consultancy that designs and builds secure, cloud-native platforms and AI workflows for regulated industries (healthcare, legal, financial services).

Site: https://tampadynamics.com

Generated: 2026-05-21T15:10:59.095Z

This file concatenates every published article, guide, case study, and comparison on the site. Each entry is preceded by its canonical URL.

# Blog

---

## AI Document Analysis for Regulated Industries: A Production Architecture Guide

Source: https://tampadynamics.com/blog/ai-document-analysis

> How to design AI document analysis pipelines that hold up under HIPAA, SOC 2, and legal review. Extraction, RAG, accuracy thresholds, hallucination mitigation, and the architectural decisions that determine whether your system passes audit.

Date: 2026-02-14

Document analysis is one of the highest-value applications of AI in regulated industries — and one of the most frequently misunderstood. Teams often come in expecting a solution that "reads documents and answers questions." What they get, if the system is not designed carefully, is a pipeline that appears to work in demos and fails in production on the edge cases that matter most.

This guide covers what document analysis actually involves at the system design level, where the real complexity lives, and how to make architecture decisions that hold up under compliance scrutiny.

---

## What Document Analysis Actually Means

"Document analysis" is not a single capability. It is a family of distinct tasks, and conflating them is the source of most project failures.

**Extraction** is pulling structured data from unstructured text — dates, names, dollar amounts, clause identifiers, diagnosis codes. The input is a document; the output is a structured record.

**Classification** is assigning a document or section to a category — contract type, claim status, document priority. The input is text or a document; the output is a label.


**Understanding** — the capability most teams actually want — is answering questions about document content, summarizing complex documents, identifying inconsistencies, or reasoning across multiple documents. This is where large language models are most useful and where hallucination risk is highest.

Each of these tasks has different accuracy characteristics, different failure modes, and different implementation requirements. A system that needs to extract structured fields from a known document template has very different architecture requirements than one that needs to answer open-ended questions about a collection of contracts.

Before designing anything, define which of these tasks your system needs to do, in what combination, with what accuracy threshold, and under what regulatory constraints.

---

## Rule-Based vs. ML-Based Approaches

The default assumption in 2026 is that ML — specifically LLMs — is the right tool for all document analysis tasks. That assumption is worth interrogating.

**Rule-based extraction** using regular expressions, template matching, or structured parsers is still the right choice when:

- Document structure is consistent and predictable (e.g., specific form types, standard templates)
- The extracted fields are well-defined and have predictable formats
- Auditability requires deterministic, inspectable logic
- The document volume justifies the upfront engineering investment

A prior authorization form that always places the diagnosis code in the same position does not need a language model. A deterministic parser is faster, cheaper, more accurate on that specific template, and easier to audit.

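
As a rough illustration of the deterministic approach, the sketch below extracts two fields from a fixed-layout form using regular expressions. The field labels (`Diagnosis Code:`, `Request Date:`), the `PriorAuthFields` type, and the simplified ICD-10-style pattern are all assumptions made for this example; a real template would define its own labels and validation rules.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for an illustrative fixed-layout form; a real template
# would define its own labels and positions.
# Simplified ICD-10-style shape (e.g. "E11.9"); not a full code-set validation.
DIAGNOSIS_RE = re.compile(
    r"^Diagnosis Code:[ \t]*([A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?)[ \t]*$",
    re.MULTILINE,
)
REQUEST_DATE_RE = re.compile(
    r"^Request Date:[ \t]*(\d{4}-\d{2}-\d{2})[ \t]*$",
    re.MULTILINE,
)


@dataclass(frozen=True)
class PriorAuthFields:
    diagnosis_code: Optional[str]
    request_date: Optional[str]


def extract_prior_auth_fields(text: str) -> PriorAuthFields:
    """Extract fields by pattern; an unmatched field stays None, never guessed."""
    diagnosis = DIAGNOSIS_RE.search(text)
    request_date = REQUEST_DATE_RE.search(text)
    return PriorAuthFields(
        diagnosis_code=diagnosis.group(1) if diagnosis else None,
        request_date=request_date.group(1) if request_date else None,
    )


sample = "Patient: Jane Doe\nDiagnosis Code: E11.9\nRequest Date: 2026-01-15\n"
fields = extract_prior_auth_fields(sample)
```

Because every rule is an inspectable pattern, a reviewer can see exactly why a value was or was not extracted. A field that comes back as `None` is a natural candidate for routing to human review.
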

**ML-based approaches** — including LLMs and fine-tuned models — are appropriate when:

- Document structure varies significantly across instances
- The task requires semantic understanding, not just pattern matching
- Documents contain natural language reasoning that must be interpreted, not just extracted

**The practical recommendation** is a layered architecture: rule-based extraction for structured, predictable fields; ML models for classification and semantic tasks; LLMs for understanding and generation tasks that cannot be reduced to extraction or classification. Reserve the most expensive, least deterministic components for the tasks where they are genuinely necessary.

---

## OCR and Document Preprocessing

Language models do not read PDFs. They read text. Before any ML-based document analysis can occur, your documents need to be converted to clean text, and that conversion step is where many production systems degrade.

### OCR Quality Is a Limiting Factor

For scanned documents — common in healthcare (faxed records, scanned intake forms) and legal (historical contracts, court filings) — OCR quality directly determines downstream accuracy. A language model cannot reason correctly about text that has been garbled by a poor OCR pass.

Key OCR considerations:

- **Engine selection** — AWS Textract, Google Document AI, and Azure Document Intelligence each have different accuracy profiles across document types. Evaluate on your actual document corpus, not benchmarks.
- **Document quality preprocessing** — Deskewing, denoising, contrast normalization, and resolution normalization upstream of OCR materially improve output quality.
- **Table and form detection** — General OCR reads text linearly. Documents with tables, checkboxes, and multi-column layouts require layout-aware extraction to preserve the semantic relationships between fields.
- **Confidence scoring** — Production OCR pipelines should expose per-field confidence scores and route low-confidence extractions to human review rather than passing them silently to downstream components.

### Text Normalization and Chunking

After OCR, raw text typically requires normalization — handling line breaks, hyphenation artifacts, header/footer stripping, and encoding issues — before it is useful for ML processing.

For RAG systems specifically, chunking strategy is a significant architectural decision. Document chunks that are too small lose context; chunks that are too large dilute relevance scores and exceed context windows. The right strategy depends on document structure: paragraph-based chunking for narrative documents, section-based chunking for structured reports, hierarchical chunking for documents with clear heading hierarchies.

---

## RAG Architecture for Document Q&A

Retrieval-Augmented Generation (RAG) is the standard architecture for document question-answering in production systems. Rather than loading entire documents into a model's context window — which has cost, latency, and context length limitations — RAG retrieves the specific passages most relevant to a query and passes only those to the model.

### The Core Pipeline

A RAG document analysis pipeline consists of:

1. **Ingestion** — Documents are preprocessed, OCR'd if necessary, chunked, and converted to embeddings using an embedding model (text-embedding-3-large, Cohere embed-v3, or similar). Embeddings are stored in a vector database (Pinecone, pgvector, OpenSearch, Weaviate).
2. **Retrieval** — At query time, the user query is embedded using the same model, and the vector store returns the k most semantically similar chunks.
3. **Augmentation** — Retrieved chunks are assembled into a prompt context and passed to a language model along with the query and any system instructions.
4. **Generation** — The language model produces an answer grounded in the retrieved context.

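
The first three steps can be sketched in a few dozen lines. This is a toy under stated assumptions, not a production design: `embed` is a bag-of-words stand-in for a real embedding model, `InMemoryIndex` stands in for a vector database, and the `Chunk` type, document IDs, and prompt wording are hypothetical. The generation step is left out because it depends on whichever model provider you use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    position: int
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Stand-in for a vector store: keeps (chunk, vector) pairs in a list."""

    def __init__(self) -> None:
        self._items: List[Tuple[Chunk, Counter]] = []

    def ingest(self, chunks: List[Chunk]) -> None:
        # Step 1: embed each chunk once at indexing time.
        for chunk in chunks:
            self._items.append((chunk, embed(chunk.text)))

    def retrieve(self, query: str, k: int = 2) -> List[Chunk]:
        # Step 2: embed the query the same way and return the top-k chunks.
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, chunks: List[Chunk]) -> str:
    # Step 3: assemble retrieved chunks, labeled with their source, into context.
    context = "\n".join(f"[{c.doc_id}#{c.position}] {c.text}" for c in chunks)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


index = InMemoryIndex()
index.ingest([
    Chunk("msa-001", 0, "Termination requires 60 days written notice."),
    Chunk("msa-001", 1, "Liability is limited to fees paid in the prior 12 months."),
    Chunk("nda-007", 0, "Confidential information excludes public information."),
])
query = "termination notice period"
top = index.retrieve(query, k=1)
prompt = build_prompt(query, top)
```

Labeling each chunk with its document ID and position in the prompt is a deliberate choice: it gives the model something concrete to cite and keeps the retrieved set easy to log.
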
### Hybrid Search

Pure vector similarity search has known failure modes: it can miss exact matches, struggle with proper nouns and identifiers, and rank tangentially related content highly based on surface-level semantic similarity. Production systems typically combine dense vector search with sparse keyword search (BM25) in a hybrid retrieval step. This captures both semantic relevance and keyword precision.

### Re-ranking

After initial retrieval, a cross-encoder re-ranker evaluates each retrieved chunk against the query with more precision than the initial embedding similarity. Re-ranking improves precision at the cost of latency. For regulated workflows where accuracy is more important than speed, the trade-off is usually worth it.

### Attribution

Every answer generated by a RAG system should be traceable to its source chunks. This means:

- Returning source document identifiers and chunk positions alongside generated answers
- Displaying citations in the UI so users can verify claims against source documents
- Logging which chunks were retrieved and which contributed to the final answer — this is your audit trail

Attribution is not optional in regulated industries. An AI that produces correct-looking answers without provenance is not useful for legal review, clinical decision support, or financial due diligence.

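
One way to make attribution concrete is to treat the answer and its sources as a single object, so an answer cannot be returned without them. The sketch below is illustrative only: the `Citation` and `AttributedAnswer` types, the `used_in_answer` flag (which the caller is assumed to set, for example from a post-generation check), and the log record layout are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class Citation:
    doc_id: str
    chunk_position: int
    used_in_answer: bool  # assumed to be set by the caller, not inferred here


@dataclass(frozen=True)
class AttributedAnswer:
    query: str
    answer: str
    citations: Tuple[Citation, ...]

    def cited_sources(self) -> List[str]:
        """References to show next to the answer so users can verify it."""
        return [
            f"{c.doc_id}#{c.chunk_position}"
            for c in self.citations
            if c.used_in_answer
        ]


def audit_record(result: AttributedAnswer, model_version: str) -> str:
    """Serialize one generation event as a JSON line for an append-only log."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model_version": model_version,
            "query": result.query,
            "answer": result.answer,
            # Every retrieved chunk is logged, including ones not cited.
            "retrieved": [asdict(c) for c in result.citations],
        },
        sort_keys=True,
    )


result = AttributedAnswer(
    query="What is the termination notice period?",
    answer="60 days written notice.",
    citations=(
        Citation("msa-001", 0, used_in_answer=True),
        Citation("nda-007", 2, used_in_answer=False),
    ),
)
record = audit_record(result, model_version="example-model-v1")
```

The UI shows only the chunks that contributed, while the log keeps the full retrieved set, which matches the distinction drawn in the list above.
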

---

## Use Cases by Vertical

### Legal: Contract Review and Due Diligence

Legal document analysis typically involves:

- **Clause extraction and classification** — Identifying indemnification clauses, limitation of liability language, auto-renewal provisions, and non-standard terms across large contract sets
- **Obligation and deadline extraction** — Pulling dates, notice periods, and party-specific obligations into structured summaries
- **Inconsistency detection** — Flagging conflicts between document sections or between a contract and a template standard
- **Due diligence Q&A** — Answering questions across a data room of hundreds of documents during M&A or financing processes

The accuracy requirement in legal is extremely high. A system that misses a jurisdiction-specific limitation clause in a commercial contract creates real liability. Human review of AI-flagged issues is not optional — the AI's role is to triage and surface, not to conclude.

Attorney-client privilege considerations also shape system architecture. Legal documents in a RAG system must not be retrievable across client matter boundaries. Strict tenant isolation at the vector store and data layer is required.

### Healthcare: Prior Authorization and Clinical Documentation

Healthcare document analysis use cases include:

- **Prior authorization support** — Extracting relevant clinical criteria from patient records and matching them against payer requirements to support authorization requests
- **Clinical documentation assistance** — Extracting structured information from unstructured clinical notes to populate fields in downstream systems
- **Referral and discharge summary processing** — Parsing incoming referral documents to route and triage efficiently

HIPAA applies to the entire pipeline.
The PHI in clinical documents must be handled with the same controls as any other PHI: access-controlled storage, audit logging of every retrieval, BAA with all vendors whose infrastructure processes the documents, and de-identification before data reaches any vendor that cannot provide a BAA.

### Finance: Due Diligence and Regulatory Filing Analysis

Financial services document analysis includes:

- **SEC filing analysis** — Extracting financial figures, risk factors, and forward-looking statements from 10-Ks and 10-Qs
- **Loan document review** — Identifying covenant terms, trigger conditions, and non-standard provisions across credit agreements
- **Regulatory correspondence** — Classifying and routing regulatory notices and examination findings

Financial document analysis has its own auditability requirements: investment decisions supported by AI analysis may need to demonstrate that the supporting information was accurate and appropriately sourced.

---

## Accuracy vs. Cost Trade-offs

Every document analysis system involves trade-offs between accuracy, latency, and cost. These trade-offs need to be explicit, not implicit.

**Embedding model quality** varies significantly. Higher-quality embedding models improve retrieval precision but increase per-document indexing cost and per-query latency. Evaluate on your document corpus before committing to a model.

**Generation model selection** is the largest cost variable. GPT-4o, Claude 3.5 Sonnet, and their peers produce higher-quality answers on complex documents than smaller models, but at significantly higher per-query cost. For high-volume, lower-complexity extractions, a smaller model or a fine-tuned model may provide adequate accuracy at a fraction of the cost.

**Chunk count and context length** — retrieving more chunks per query improves recall but increases prompt size, cost, and the risk of the model being confused by tangential content.


The right architecture is not the one that maximizes accuracy on all tasks — it is the one that applies the right level of capability to each task, with human review at the points where errors have the most consequence.

---

## Hallucination Risks and Mitigation

Hallucination — the model generating plausible-sounding but incorrect content — is the central reliability problem in LLM-based document analysis. In regulated industries, a hallucinated clause interpretation or fabricated clinical detail can cause direct harm.

Mitigation strategies, in order of effectiveness:

**Constrain the generation task.** Extraction tasks with explicit output schemas (JSON with defined fields) hallucinate far less than open-ended summarization tasks. Where possible, decompose complex Q&A into a series of constrained extraction sub-tasks.

**Ground answers in retrieved text.** Instruct the model to answer only based on provided context and to explicitly state when the context does not contain sufficient information to answer. Evaluate whether models follow this instruction reliably on your task.

**Verify claims against source text.** Post-generation verification — checking that specific claims in the output can be found verbatim or near-verbatim in the source chunks — catches fabrications that the model produced despite constrained prompting.

**Human review at high-stakes decision points.** No mitigation strategy eliminates hallucination. For decisions with significant consequences — a contract interpretation that will be executed, a clinical documentation entry that will affect care — human review is not a fallback. It is a required step in the workflow design.

---

## Compliance Considerations

### Data Retention and Storage

Documents ingested into a document analysis system need retention policies.
In regulated industries, this means:

- Defining retention periods based on document type and regulatory requirements
- Implementing deletion capabilities that cover both raw documents and their derived embeddings
- Ensuring deletion of a document removes it from the vector store as well (a frequently missed step — deleting the source document does not automatically delete its embeddings)

### Access Controls

Document-level access controls in a RAG system are more complex than in a traditional document management system. You need access controls that operate at the retrieval layer — not just at the document storage layer — so that a query from User A cannot surface documents that User A does not have rights to see.

This typically means:

- Tagging chunks at indexing time with access control metadata (document owner, matter, tenant, sensitivity classification)
- Filtering retrieval results by the requesting user's access rights before chunks are passed to the model
- Auditing which documents were retrieved for each query

### Audit Logging

Every document retrieval and every AI generation event is an auditable action in regulated workflows. Your audit log should record the query, the retrieved document identifiers, the model and version used, and the generated output. This log is your evidence that the system operated correctly if the output is ever challenged.

---

## Human-in-the-Loop Design Patterns

The framing that AI replaces human review is the wrong model for regulated industries. The right frame is that AI changes the nature of human review — reducing the time spent on mechanical scanning and increasing the time spent on judgment.

Effective human-in-the-loop patterns for document analysis:

**Triage and prioritization** — AI classifies documents by urgency, complexity, or risk level. Humans review in AI-determined priority order rather than processing documents sequentially.

**Flagging, not concluding** — AI identifies sections or provisions that warrant attention.
Humans evaluate the flagged items. The AI does not render a final judgment; it guides human attention.

**Confidence-gated automation** — High-confidence extractions (e.g., standard date fields from a consistent form) proceed automatically. Low-confidence extractions route to a human review queue. Thresholds are calibrated based on the cost of errors.

**Active review interfaces** — Rather than presenting AI output as a finished product, present it as an annotated draft. Reviewers can accept, reject, or modify each AI-generated annotation. This surfaces model errors, creates training data for improvement, and ensures the human genuinely engages with the output.

The design of the review interface is as important as the design of the underlying AI pipeline. A system that makes it easy for reviewers to rubber-stamp AI output is not a safe human-in-the-loop system.

---

## Building Document Analysis Systems That Hold Up

Document analysis in regulated industries is an engineering problem more than it is an AI problem. The AI components — embedding models, language models, vector stores — are available and capable. The harder work is designing pipelines with appropriate accuracy controls, building attribution into the output from the start, enforcing document-level access controls at the retrieval layer, and designing review interfaces that make human oversight practical rather than performative.

If your team is evaluating or building a document analysis system for legal, healthcare, or financial workflows, [an architecture review](/contact) is a structured way to identify the decisions that will be expensive to change later. We also cover the overlap between document analysis and broader AI system design in our [healthcare AI consulting](/healthcare-ai-consulting) and [legal AI consulting](/legal-ai-consulting) practices.


---

## AI for Small Business: Practical Use Cases That Don't Require a Data Science Team

Source: https://tampadynamics.com/blog/ai-for-small-business

> Practical AI use cases for small businesses — from document processing to customer support automation. No machine learning expertise required.

Date: 2026-01-28

Most small business owners have absorbed two years of AI headlines and are left with a version of the same question: what, specifically, is this supposed to do for my business?

The honest answer is narrower than the headlines suggest. AI is not going to transform your operations overnight, and the use cases that actually work in practice are more specific and more modest than the ones described in product marketing. But there are genuine time savings and quality improvements available to businesses with fewer than 50 employees, with no data science background required, using tools that cost less than a full-time hire.

This post describes the use cases that are actually viable for small businesses right now, what each one requires to implement, and where the pitfalls are.

---

## The Gap Between AI Hype and Small Business Reality

Enterprise AI implementations make the news. A hospital system deploying AI-assisted radiology, a law firm using AI for contract review at scale, a logistics company optimizing routes across millions of data points — these are real use cases, but they involve specialized models, significant integration work, and teams with engineering capacity to build and maintain them.

Small businesses operate in a different context. You have limited engineering resources, limited budget for tooling, existing software that was not built to integrate with anything, and workflows that live in a combination of email threads, spreadsheets, and institutional knowledge.

The use cases that work for small businesses share a few characteristics. They address a task that is repetitive and time-consuming but does not require deep domain judgment.
They produce outputs that a human reviews before they matter. And they use general-purpose AI capabilities — language understanding, document parsing, summarization — that do not require custom model training.

---

## Use Cases That Actually Work

### Document Processing and Data Extraction

If your business handles paper or PDF-based documents — invoices, contracts, intake forms, insurance documents, applications — there is almost certainly an AI tool that can reduce the manual data entry burden.

Document AI tools (Google Document AI, AWS Textract, and the document processing capabilities built into platforms like Zapier or Make) can extract structured data from unstructured documents with reasonable accuracy. An insurance agency that was manually keying data from carrier documents into a management system can often automate 70-80% of that extraction.

What you actually need to implement this:

- A consistent document format, or a set of templates that account for format variation
- A review step where a human spot-checks the extracted data — especially for high-stakes fields like dollar amounts, dates, and names
- A destination system that can receive the extracted data (most practice management, CRM, and accounting platforms have APIs or native integrations)

What you should not expect: perfect accuracy without review. Document extraction tools are accurate enough to eliminate most manual keying, but not accurate enough to operate without a human quality check on anything that matters.

### Email Draft Generation

Email is the highest-volume writing task for most small businesses, and it is also one of the clearest applications of current AI capabilities. Tools like Gmail's Help Me Write, Outlook Copilot, or standalone tools built on GPT-4 or Claude can draft responses to customer inquiries, follow-up sequences, proposal emails, and client communications.


The workflow that works: provide the AI with context (the incoming email, the key points you need to address, your preferred tone), let it draft, review and edit, send. For experienced users, this shifts email from a writing task to an editing task, which is faster.

The failure mode: treating AI drafts as final output without review. AI email drafts are consistently fluent and often completely wrong about specific details — pricing, availability, commitments you may or may not have made. The review step is not optional.

A secondary use case: summarizing email threads. If you have been cc'd on a 30-message thread and need to understand where things stand, asking an AI to summarize the thread is faster than reading it top-to-bottom, and the summary is usually accurate enough to be useful.

### Customer FAQ Automation

If your business receives the same 10-20 questions repeatedly — business hours, pricing, process, requirements, turnaround time — an AI-powered FAQ tool can handle the first response layer for a significant portion of inbound inquiries.

The implementation options range from simple (a website chatbot trained on your FAQ content, using a tool like Intercom, Tidio, or Freshdesk) to more involved (a custom RAG system that queries your knowledge base and routes complex questions to humans).

For most small businesses, the simple implementation is the right starting point. The chatbot handles the routine questions that were previously answered by whoever checked email. Complex or novel questions escalate to a human. The chatbot is honest about its limitations and does not try to answer what it does not know.

The critical design principle: the chatbot should not be trying to do everything. Define the scope tightly. What are the 15 questions you receive most often? Train the system on those, and make the escalation path to a human clear and easy. A chatbot that tries to answer everything and answers many things wrong is worse than no chatbot.

### Meeting Notes and Summaries

If your business involves regular client meetings, team standups, or sales calls, transcription and summarization tools have become genuinely useful. Otter.ai, Fireflies.ai, and the built-in transcription capabilities in Zoom and Teams can produce accurate transcripts, summarize action items, and generate structured meeting notes automatically.

The time savings compound. A 60-minute client meeting that previously required 20-30 minutes of note-writing can produce a summarized document automatically, with action items identified and attributed to specific people.

The implementation is straightforward — most of these tools integrate directly with your video conferencing platform. The primary consideration is disclosure: many states and most professional contexts require that participants be informed when a meeting is being recorded and transcribed. Make this part of your meeting opening.

### Invoice Processing and Accounts Payable

For businesses that receive a significant number of vendor invoices — construction, retail, restaurants, professional services with multiple vendors — AI-assisted invoice processing can reduce the hours spent on manual entry.

Tools like Dext (formerly Receipt Bank), Hubdoc, or the built-in AI capabilities in QuickBooks and Xero can extract line items, amounts, dates, and vendor information from invoices with reasonable accuracy, and route them to the appropriate GL code based on learned patterns.

This is not fully automated accounts payable — someone still needs to approve invoices and catch anomalies. But shifting from manual entry to review-and-approve significantly reduces the time cost for businesses processing 20 or more invoices per month.

---

## Common Pitfalls

**Data quality problems surface immediately.** AI tools are sensitive to inconsistency in the data they process.
If your customer data has inconsistent naming conventions, if your document formats vary widely, or if your knowledge base has contradictory information, AI tools will reflect those inconsistencies in their outputs. Before deploying any AI automation, clean the data it will work with.

**Integration complexity is usually underestimated.** The AI tool itself is often straightforward. Connecting it to your existing systems — your CRM, your accounting software, your industry-specific platform — is where projects stall. Many small business platforms expose limited APIs or require middleware. Budget time and potentially budget for an integration resource.

**Cost is not always as low as advertised.** Most AI tools have free or low-cost entry tiers, but production usage scales with volume. A chatbot handling 500 conversations per month costs more than a chatbot handling 50. Document processing tools often price per page. Model API costs for custom implementations can be significant at volume. Run the numbers at your actual expected volume, not the minimum-tier pricing.

**Accuracy thresholds vary by task.** For email drafting, 80% accuracy is fine — you are reviewing every email anyway. For invoice processing, an error rate that results in miscoded expenses compounds into significant accounting problems over time. Match your accuracy expectations to the stakes of the task, and design review workflows accordingly.

---

## How to Evaluate AI Vendors

The AI tool market is crowded and the marketing is often indistinguishable across vendors. Evaluating vendors effectively means looking past the demo.

**Ask for a pilot on your actual data.** Any vendor worth considering will let you test with a sample of the specific document types, emails, or use cases you intend to automate. If the tool cannot handle your real-world inputs, the demo is irrelevant.

**Understand the data handling model.** Where does your data go? Is it used to train the vendor's models? Who has access to it?
For businesses handling client data, proprietary information, or anything sensitive, these questions matter before you integrate a tool.

**Evaluate the integration path before committing.** Do they have a native integration with your existing systems? If not, is there a documented API? What does the actual integration work look like, and who will do it?

**Check what happens when it is wrong.** Every AI tool will produce incorrect outputs sometimes. The question is whether the tool makes errors visible, whether there is a review step in the workflow, and what the process is for correcting errors and feeding that correction back into the system.

---

## Build vs. Buy vs. Off-the-Shelf

Small businesses have three options for AI implementation, and they are not equally appropriate for every use case.

**Off-the-shelf tools** (Otter.ai, Intercom, Dext, Gmail's AI features) are the right starting point for use cases where general-purpose tools cover your needs. Low cost, low implementation effort, limited customization.

**Configured platforms** (Zapier AI, Make.com, HubSpot AI features) sit in the middle. They require more setup and some technical knowledge, but they allow you to connect multiple systems and customize workflows in ways that off-the-shelf tools do not support.

**Custom builds** (bespoke AI integrations, custom RAG systems, purpose-built document processing pipelines) make sense when your use case is specific enough that no off-the-shelf tool covers it, or when the volume or accuracy requirements exceed what general tools can deliver. Custom builds require technical resources and ongoing maintenance — they are not a small business starting point, but they are sometimes the right answer for a specific high-value workflow.

The decision framework: start with off-the-shelf and validate that the use case actually saves time and produces acceptable quality.
Move to custom builds only when you have evidence that the use case has meaningful value and that general tools cannot deliver the quality or integration you need.

---

## When to Get Outside Help

For businesses that have identified a high-value use case but hit a wall on implementation, the question of when to bring in outside help is practical.

A few indicators:

- The use case requires integrating with a system that has a non-obvious API or no native integration with AI tools
- Data quality issues are significant enough that they need remediation before automation can work
- The workflow involves sensitive data (client information, financial records, medical information) where vendor selection and security configuration need careful attention
- You have tried an off-the-shelf tool and it does not produce sufficient accuracy for your specific document types or use case

Bringing in a development partner for AI implementation does not mean a large engagement. A focused scoping conversation — describing what you are trying to automate, what your existing systems are, and what your data looks like — is enough to determine whether there is a tractable path and what it realistically costs.

---

## Frequently Asked Questions

### Do I need technical expertise on my team to use AI tools?

For off-the-shelf tools — no. Email AI, meeting transcription, basic chatbots, and accounting AI features are designed for non-technical users. For configured platform integrations, some technical comfort helps. For custom builds, you need development resources either internally or through a partner.

### Will AI replace my employees?

Not the ones doing complex, judgment-intensive work. AI is most useful for automating the repetitive, time-consuming portions of knowledge work — the email drafting, the data entry, the note-taking — which frees people to do the parts of their job that require judgment and relationships.
The businesses that use AI well tend to redeploy time, not reduce headcount.

### How do I know if a use case will actually save time?

Track the time cost of the task you are trying to automate before you implement anything. If someone is spending 8 hours per week on manual invoice entry, an automation that reduces that to 2 hours of review is worth real money. If a task consumes 30 minutes per week, automation may not be worth the implementation effort.

---

If you are evaluating AI for a specific workflow and are not sure whether the use case is tractable or which approach makes sense, [an architecture conversation](/contact) with our team can give you a practical answer quickly — no sales cycle, no vague roadmap.

---

## AI Procurement Checklist for Healthcare CIOs

Source: https://tampadynamics.com/blog/ai-procurement-checklist-healthcare

> A practical checklist for evaluating AI vendors and AI projects in healthcare — the questions to ask before money moves and the red flags to watch for in vendor responses.

Date: 2026-05-04

AI procurement in healthcare is harder than software procurement was a decade ago. The data path is more complex, the regulatory exposure is broader, and the vendor ecosystem includes everything from established health IT firms to two-person startups with a Cohere wrapper.

This is the checklist we wish more CIOs had when we walk into a procurement conversation. It is organized by the questions auditors and Security Officers actually ask, not by the marketing categories vendors use.

## Data path

1. **Where does PHI go?** Trace every system the data touches: ingestion, embedding, vector storage, model invocation, post-processing, logging, analytics. Each is on the BAA or it is not.
2. **What model is the vendor using, and does it have a BAA?** "We use AI" is not an answer. The specific model and the specific provider matter.
AWS Bedrock under the AWS BAA, Azure OpenAI under Microsoft's BAA, and Google Cloud's Vertex under Google's BAA are different vendors with different documentation. Get specifics. 3. **Where does the model run?** A model "deployed in your VPC" is different from a model "accessed via the vendor's API endpoint." Both can be appropriate. Mixing them up is not. 4. **Does the vendor train on customer data?** Default policy at major LLM providers is that enterprise tier customers' data is not used for training, but verify in the contract, not the marketing page. ## Tenant isolation 5. **How is your data separated from other customers' data?** Index per customer, namespace per customer, or shared resources with metadata filters. Each has different security implications. Get specifics. 6. **Can you produce evidence of isolation?** Vendor diagrams are not evidence. Test data, configuration, and logs are. 7. **What happens if another customer is breached?** A vendor whose architecture has a single shared resource is one breach away from being yours. A vendor with hard isolation is not. ## Audit logging 8. **What does the vendor log?** Every model invocation, with prompts, retrieved context, and outputs, is the right answer. "Aggregate metrics" is not. 9. **Can you access the audit log?** Some vendors keep the AI audit log internal. For HIPAA work, you need to be able to query "every interaction touching this patient" without a support ticket. 10. **How long is the audit log retained?** Six years is the HIPAA minimum. Some vendors retain shorter; some retain forever (which has its own discovery exposure). 11. **What is in the audit log for tool calls?** If the AI calls back into your systems, every tool invocation needs a log entry with parameters and results. ## Right-to-deletion 12. **If a patient asks for deletion, what is the path?** PHI in source systems, PHI in the vector index, PHI in audit logs. Each needs a path. 
"We can delete records on request" without specifics is not enough. 13. **If you terminate the contract, what happens to your data?** Data return, data destruction, attestation. Spell out the timeline. ## Access control 14. **How are users authenticated to the AI system?** Federation with your IdP (Entra, Okta, Cognito) is the default expectation. Vendor-managed user accounts are a flag. 15. **How does role-based access work?** A clinician seeing patient AI summaries should see only their patients. A pharmacist seeing prior-auth recommendations should see only the cases assigned to them. Test the access model with realistic scenarios. ## Outputs 16. **Is every output cited?** Free-form answers without source citations are not auditable. Compliance teams should reject them. 17. **What happens when retrieval comes back empty?** A safe system tells the user it does not know. A dangerous system makes something up. 18. **What is the human-in-the-loop boundary?** For any decision with clinical or regulatory consequence, there should be a human review step. Confirm where it sits and how it is logged. ## Operations 19. **Who can change the prompts and retrieval logic?** Vendor-side changes that alter system behavior should produce audit entries. Silent prompt updates that change AI behavior are a regulatory liability. 20. **What is the model upgrade path?** When the underlying model changes — and it will — does behavior change? Is there an evaluation harness? Will you be notified before deployment? 21. **What is the SLA for incident response?** AI systems break in ways traditional software does not. Hallucinations, retrieval failures, prompt injection. The vendor's response capability matters. ## Contract terms 22. **BAA signed, with all sub-processors named.** The LLM provider, the vector database, the embedding endpoint, the logging vendor — each is a sub-processor. 23. 
**Indemnification for breach caused by vendor's AI.** Carve-outs for "AI-generated content" that shift liability back to you should be flagged. 24. **Audit rights.** You should be able to audit the vendor's controls without their permission, on reasonable notice. ## Red flags A few responses that should slow procurement down: - "We use the latest models" without specifying which. - "Your data is secure" without explaining how. - "We have HIPAA compliance" without naming services and BAAs. - Vague answers to "what is in the audit log." - An inability to demonstrate tenant isolation. - Refusal to answer detailed technical questions in front of a Security Officer. These are not always disqualifying. They often indicate a vendor whose security posture is less mature than they realize. A good procurement process surfaces them before contract. ## What this checklist does not cover Outcomes. The checklist above ensures the vendor can be deployed responsibly. Whether the vendor solves the problem you bought them to solve is a separate question. We have seen vendors clear every item on this list and still produce a system that does not work for the intended workflow. The checklist is necessary, not sufficient. The right pairing: this checklist for the security and compliance review, plus a pilot deployment with measurable success criteria for the operational evaluation. Both questions need an answer before procurement closes. ## Where we fit If you are working through this checklist for a vendor evaluation, or if you are deciding between buying and building, an architecture review is the lowest-risk way to get a second opinion. We do this for healthcare CIOs regularly. [Get in touch](/contact) if you have a procurement decision in front of you. 
--- ## Five Questions Every AI System Should Be Able to Answer Source: https://tampadynamics.com/blog/audit-ready-ai-five-questions > If your AI system cannot answer these five questions in seconds, it is not audit-ready — and that gap will surface at the worst possible moment. Date: 2026-03-21 Every AI system in production will eventually produce an output that someone wants to investigate. A patient will ask why a recommendation was made. A regulator will ask what data was used. An auditor will ask whether the system has been operated within policy. A breach investigator will ask what data the model saw during a specific window. When that happens, you will need to answer five questions. The teams whose AI systems can answer them in seconds keep operating. The teams whose systems cannot, do not. ## 1. Who used the AI, when, and what did they do? The most basic audit question. "Show me every AI interaction by user X between dates A and B." The answer requires the audit log to capture the requesting user identity — not the AI service account, but the human user the request originated from — alongside every model invocation. The log has to be queryable by user, by date, by tenant. Most AI systems we see in early stages capture the AI's outputs but not the calling user. The user identity has to be threaded through the application layer to the AI invocation layer and persisted with each turn. Adding it later is a rebuild. ## 2. What did the AI see? For any given output, what was in the context window when the model produced it? This is the question that exposes RAG systems built without retrieval logging. The model produced output; the output references a fact; the fact came from somewhere. That somewhere has to be in the audit log, with chunk-level detail. A complete answer captures: the system prompt, the conversation history, the retrieved chunks (by document and chunk ID), the user message, and the full input as sent to the model. Not a summary. The full input. 
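One way to make "the full input" concrete is to build the audit record from the same function that assembles the prompt, so the logged input and the sent input cannot drift apart. The sketch below is a minimal illustration under stated assumptions: the interfaces, field names, and the `assembleInput` and `buildAuditRecord` helpers are hypothetical, not a standard schema or any vendor's API.

```typescript
// Hypothetical audit record for one model invocation. All names here are
// illustrative assumptions, not a standard schema.
interface RetrievedChunk {
  documentId: string;
  chunkId: string;
  text: string;
}

interface ChatTurn {
  role: "user" | "assistant";
  content: string;
}

interface InvocationContext {
  invocationId: string;
  userId: string; // the human caller, not the AI service account
  tenantId: string;
  modelId: string;
  systemPrompt: string;
  conversationHistory: ChatTurn[];
  retrievedChunks: RetrievedChunk[];
  userMessage: string;
}

interface ModelInvocationAudit extends InvocationContext {
  timestamp: string; // ISO 8601
  fullInput: string; // the exact text sent to the model
}

// Assemble the prompt in one place; the same string is sent and logged.
function assembleInput(ctx: InvocationContext): string {
  const history = ctx.conversationHistory
    .map((turn) => `${turn.role}: ${turn.content}`)
    .join("\n");
  const context = ctx.retrievedChunks
    .map((c) => `[${c.documentId}#${c.chunkId}] ${c.text}`)
    .join("\n");
  return [ctx.systemPrompt, history, context, `user: ${ctx.userMessage}`].join("\n\n");
}

// Keep the structured fields (for querying) alongside the full input (for replay).
function buildAuditRecord(ctx: InvocationContext, now: Date): ModelInvocationAudit {
  return { ...ctx, timestamp: now.toISOString(), fullInput: assembleInput(ctx) };
}

const exampleRecord = buildAuditRecord(
  {
    invocationId: "inv-001",
    userId: "clinician-42",
    tenantId: "patient-100",
    modelId: "example-model",
    systemPrompt: "Answer only from the provided context.",
    conversationHistory: [],
    retrievedChunks: [
      { documentId: "policy-7", chunkId: "c12", text: "Prior authorization is required for MRI." },
    ],
    userMessage: "Is prior authorization required?",
  },
  new Date("2026-03-21T00:00:00Z")
);
```

Storing the chunk IDs as structured fields and the assembled text as a single string serves two different queries: "which documents informed this answer" and "what exactly did the model read."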
## 3. What did the AI produce, and what happened next? The model output, plus the action that was taken. For a recommendation that was approved by a human, the audit log shows the AI's draft, the human's approval, any edits made, and the final action. For a recommendation that was rejected, the same chain. If the AI's output triggered a downstream system call — wrote to the case record, sent an email, updated a billing line — that call is in the audit log too. The chain has to be reconstructable end-to-end. ## 4. For this specific subject (patient, customer, case), every interaction. Tenant-scoped queries. "Show me every AI interaction that touched this patient's record, ever." This requires the tenant identifier to be in every audit row, with appropriate indexes for tenant-scoped queries. For healthcare, the patient identifier; for legal, the matter identifier; for financial services, the account or customer identifier. The query has to return everything: model invocations, retrieval calls, tool calls, approval decisions. If a piece of the data path bypassed the audit log, that is the gap an investigation will find. ## 5. Was anything outside policy? The hardest question. "Were there any AI interactions during this window that should not have happened?" Answering it requires policy to be expressed in something more than prose. The audit log captures the inputs, outputs, and actions; an automated evaluation runs against the log to flag interactions that violated guardrails (PHI sent to an unapproved endpoint, retrieval crossing tenant boundaries, outputs without citations, model versions outside the approved set, tool calls outside the allowed scope). Most organizations treat policy review as manual log inspection. That works at small scale and breaks at production scale. The systems that scale have policy expressed as code that runs against the audit log, with alerts and dashboards on policy violations. 
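As a rough illustration of policy expressed as code, each guardrail can be written as a small function over an audit row, with one evaluator that runs every rule against every row. This is a sketch under assumptions: the `AuditRow` fields, the rule names, and the example model, endpoint, and tool identifiers are all hypothetical, and a real deployment would read rows from the audit store rather than from an in-memory array.

```typescript
// Hypothetical audit row and policy shapes; names are illustrative assumptions.
interface AuditRow {
  invocationId: string;
  tenantId: string;
  modelId: string;
  endpoint: string;
  retrievedChunkTenantIds: string[];
  citationCount: number;
  toolCalls: string[];
}

interface PolicyConfig {
  approvedModels: string[];
  approvedEndpoints: string[];
  allowedTools: string[];
}

interface Violation {
  invocationId: string;
  rule: string;
  detail: string;
}

interface Rule {
  name: string;
  // Returns a description of the violation, or null when the row is within policy.
  check: (row: AuditRow, policy: PolicyConfig) => string | null;
}

const rules: Rule[] = [
  {
    name: "unapproved-model",
    check: (row, policy) =>
      policy.approvedModels.indexOf(row.modelId) >= 0 ? null : `model ${row.modelId} is not approved`,
  },
  {
    name: "unapproved-endpoint",
    check: (row, policy) =>
      policy.approvedEndpoints.indexOf(row.endpoint) >= 0 ? null : `endpoint ${row.endpoint} is not approved`,
  },
  {
    name: "cross-tenant-retrieval",
    check: (row) => {
      const foreign = row.retrievedChunkTenantIds.filter((t) => t !== row.tenantId);
      return foreign.length === 0 ? null : `retrieved chunks from: ${foreign.join(", ")}`;
    },
  },
  {
    name: "missing-citations",
    check: (row) => (row.citationCount > 0 ? null : "output has no citations"),
  },
  {
    name: "tool-out-of-scope",
    check: (row, policy) => {
      const bad = row.toolCalls.filter((t) => policy.allowedTools.indexOf(t) < 0);
      return bad.length === 0 ? null : `tools outside allowed scope: ${bad.join(", ")}`;
    },
  },
];

// Run every rule against every row and collect the violations.
function evaluatePolicy(rows: AuditRow[], policy: PolicyConfig): Violation[] {
  const violations: Violation[] = [];
  for (const row of rows) {
    for (const rule of rules) {
      const detail = rule.check(row, policy);
      if (detail !== null) {
        violations.push({ invocationId: row.invocationId, rule: rule.name, detail });
      }
    }
  }
  return violations;
}

const examplePolicy: PolicyConfig = {
  approvedModels: ["model-a"],
  approvedEndpoints: ["bedrock-us-east-1"],
  allowedTools: ["lookup_case"],
};

const exampleViolations = evaluatePolicy(
  [
    {
      invocationId: "inv-1", tenantId: "patient-100", modelId: "model-a",
      endpoint: "bedrock-us-east-1", retrievedChunkTenantIds: ["patient-100"],
      citationCount: 2, toolCalls: ["lookup_case"],
    },
    {
      invocationId: "inv-2", tenantId: "patient-100", modelId: "model-b",
      endpoint: "bedrock-us-east-1", retrievedChunkTenantIds: ["patient-100", "patient-200"],
      citationCount: 0, toolCalls: ["send_email"],
    },
  ],
  examplePolicy
);
```

Keeping rules as data (a name plus a check) means a new guardrail is one more array entry, and the violation output can feed alerts or a dashboard without changing the evaluator.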
## What "audit-ready" actually requires To answer all five questions, a production AI system needs: - **Audit log capture at the model invocation layer.** Every model call, with full input and output, in a structured format. - **User identity threaded through every layer.** The application user is captured at the model layer, not invented or replaced with a service account. - **Tenant identifier on every row.** With appropriate indexes for tenant-scoped queries. - **Tool calls captured.** Every tool invocation with parameters and results. - **Approval chains captured.** For human-in-the-loop decisions, the human's identity, decision, and any edits. - **Retention sized to obligations.** Six years for HIPAA, longer for some workloads, with object lock or equivalent immutability. - **Policy expressed as code.** Automated evaluation of the log against the policy set. This is not light infrastructure. For an AI system handling regulated workloads, the audit infrastructure is often comparable in size and effort to the AI itself. The teams that ship the AI without this infrastructure ship faster initially and slower in the long run, because the audit work has to happen eventually. ## When the question gets asked The question that prompts this post: "Can your AI system answer these five questions today?" If the answer is yes, the system is audit-ready. The compliance review will be a confirmation, not a discovery process. The breach investigation, if one happens, will be hours of work, not weeks. If the answer is no — not all five, or not in seconds, or only with engineering effort — the gap is a project. The cheapest time to close it is before the audit. The most expensive time is during. We have walked into both situations. The retrofitted audit log is always more painful than the day-one one. We say this not as theory but as engineers who have done both. 
## Where we fit If you are operating an AI system in a regulated environment and any of the five questions feel uncertain, that is the project. We do audit-readiness reviews of existing AI systems regularly. They produce a written gap analysis, a prioritized remediation plan, and an estimate. [Reach out](/contact) if your AI is in production and the audit conversation is starting to come up. --- ## AWS Amplify Gen 2 in Production: Architecture Decisions That Matter Source: https://tampadynamics.com/blog/aws-amplify-gen2-architecture > A practical guide to AWS Amplify Gen 2 for production applications — authentication, data modeling, custom resolvers, and the limitations to know before you build. Date: 2026-01-07 AWS Amplify Gen 2 is a meaningful rearchitecting of the platform. The TypeScript-first, code-first model eliminates most of the friction that made Gen 1 difficult to work with in complex production environments — the YAML configuration fragmentation, the opaque CLI-generated resources, the difficulty of customizing beyond what the CLI anticipated. But Amplify Gen 2 is still Amplify, which means it makes strong assumptions about your architecture, and those assumptions are not always the right ones for every application. Before you commit to Amplify for a production deployment, you need to understand both what it does well and where it runs out of road. This is a practical guide written for teams evaluating Amplify Gen 2 for a Next.js application that needs to survive production. --- ## What Changed from Gen 1 to Gen 2 The most significant change is the move from a CLI-driven, YAML-based configuration model to a TypeScript-first, code-first model. In Gen 1, you ran `amplify add auth` and got a generated configuration file that was difficult to read, harder to modify, and nearly impossible to version meaningfully. Drift between environments was common and painful. 
In Gen 2, your entire backend is defined in TypeScript files in your repository:

```typescript
// amplify/auth/resource.ts
import { defineAuth } from "@aws-amplify/backend";

export const auth = defineAuth({
  loginWith: {
    email: true,
  },
  multiFactor: {
    mode: "OPTIONAL",
    totp: true,
  },
});
```

This is a genuine improvement. Your backend infrastructure is now a first-class part of your codebase: version controlled, code-reviewed, and deployable through the same pipeline as your application code. The configuration is readable by engineers who were not present when it was created.

Gen 2 also ships with a unified data modeling layer built on AppSync and DynamoDB, a cleaner authentication integration with Cognito, and per-developer sandbox environments that provision real, isolated AWS resources for local development.

What did not fundamentally change: Amplify is still an opinionated platform built on AppSync + DynamoDB for data, Cognito for auth, and CloudFront + S3 (or Lambda) for hosting. If your requirements live comfortably in that stack, Gen 2 is a significant improvement. If your requirements push outside it, you will hit the same walls.

---

## Authentication Setup

Amplify Gen 2 authentication is built on Cognito, and the Gen 2 configuration model gives you reasonable control over the Cognito setup through the TypeScript definition layer.
### Standard Configuration

The baseline auth setup handles email/password authentication, MFA, and the standard social providers (Google, Facebook, Apple):

```typescript
import { defineAuth, secret } from "@aws-amplify/backend";

export const auth = defineAuth({
  loginWith: {
    email: {
      verificationEmailStyle: "CODE",
      verificationEmailSubject: "Verify your email",
    },
    externalProviders: {
      google: {
        // Secret values are resolved at deploy time, not stored in code
        clientId: secret("GOOGLE_CLIENT_ID"),
        clientSecret: secret("GOOGLE_CLIENT_SECRET"),
      },
      callbackUrls: ["https://yourapp.com/auth/callback"],
      logoutUrls: ["https://yourapp.com"],
    },
  },
  multiFactor: {
    mode: "REQUIRED",
    totp: true,
    sms: true,
  },
  userAttributes: {
    givenName: { required: true, mutable: true },
    familyName: { required: true, mutable: true },
  },
});
```

### Custom Auth Flows

Cognito supports custom authentication challenges (magic links, device-based authentication, custom verification logic) through Lambda triggers. Gen 2 exposes these through a `triggers` configuration:

```typescript
import { defineAuth, defineFunction } from "@aws-amplify/backend";

export const auth = defineAuth({
  loginWith: { email: true },
  triggers: {
    createAuthChallenge: defineFunction({
      entry: "./create-auth-challenge.ts",
    }),
    defineAuthChallenge: defineFunction({
      entry: "./define-auth-challenge.ts",
    }),
    verifyAuthChallengeResponse: defineFunction({
      entry: "./verify-auth-challenge.ts",
    }),
  },
});
```

This is how you implement magic link authentication or hardware key flows. The trigger functions are standard Lambda functions with Cognito event shapes. The integration is cleaner in Gen 2 than Gen 1, but the Cognito trigger model itself is unchanged: the same constraints around session management and challenge sequencing apply.

### Production Auth Considerations

A few things that matter in production and are easy to miss:

**Token expiry configuration.** Amplify defaults are not always appropriate for your application. The ID token, access token, and refresh token have separate expiry windows. Define these explicitly based on your security requirements.
For applications handling sensitive data, shorter-lived tokens with well-implemented refresh flows are preferable to long-lived sessions.

**Advanced Security Features.** Cognito's advanced security features (compromised credential detection, adaptive authentication) are not enabled by default and are not configurable through the standard Amplify Gen 2 auth definition at the time of writing. Enabling them requires CDK-level customization via `defineBackend`. This is not a blocking issue, but it is something to account for if you are building a security-sensitive application.

**User pool limits.** Cognito scales well but has default limits on API calls per second that can affect authentication under high concurrency. If you are building for significant scale, review the Cognito service quotas early.

---

## Data Modeling with DynamoDB

Amplify Gen 2 data modeling is built on AppSync with DynamoDB as the underlying store. The TypeScript model definition is the most significant improvement over Gen 1's GraphQL SDL approach:

```typescript
// amplify/data/resource.ts
import { defineData, a } from "@aws-amplify/backend";

const schema = a.schema({
  Organization: a
    .model({
      name: a.string().required(),
      plan: a.enum(["starter", "professional", "enterprise"]),
      members: a.hasMany("User", "organizationId"),
      createdAt: a.datetime(),
    })
    .authorization((allow) => [
      allow.owner(),
      allow.group("admin"),
    ]),

  User: a
    .model({
      email: a.string().required(),
      organizationId: a.id().required(),
      organization: a.belongsTo("Organization", "organizationId"),
      role: a.enum(["member", "admin", "viewer"]),
    })
    .authorization((allow) => [
      allow.owner(),
      allow.authenticated().to(["read"]),
    ]),
});

export const data = defineData({
  schema,
  authorizationModes: {
    defaultAuthorizationMode: "userPool",
  },
});
```

The authorization model is one of the stronger aspects of Amplify Gen 2.
Field-level and model-level authorization rules compile down to DynamoDB condition expressions and VTL resolvers, which means the enforcement is at the data layer, not just in your application logic.

### What DynamoDB Access Patterns Require

Amplify abstracts DynamoDB but does not eliminate its fundamental access pattern constraints. DynamoDB performs well for queries by partition key, poorly for ad-hoc queries across arbitrary dimensions.

The Amplify data layer adds a set of secondary indexes automatically, but complex relational queries require one of the following:

1. Designing your schema to match the access patterns you need (the correct DynamoDB approach)
2. Layering a custom query with a Lambda resolver for complex filtering
3. Accepting that some queries will result in a full table scan through AppSync filters (which works at small scale and fails badly at large scale)

If your application requires complex relational queries (reporting, multi-dimensional filtering, analytics), DynamoDB through Amplify is the wrong storage layer for those use cases. A common production pattern is DynamoDB for transactional data plus a relational database such as Aurora Serverless or RDS (typically behind RDS Proxy) for reporting workloads, connected via custom resolvers.

---

## Custom Business Logic: Lambda Resolvers and Custom Queries

Amplify Gen 2 supports two paths for custom business logic: Lambda functions as custom resolvers in the AppSync layer, and HTTP endpoints via the Amplify function definition.

### Custom Resolvers

Custom resolvers replace the auto-generated AppSync resolvers for specific operations. Use them when the default CRUD behavior is insufficient: when you need to enforce business rules, trigger side effects, or integrate with external services:

```typescript
// amplify/data/resource.ts
const schema = a.schema({
  // ...
  createOrderWithInventoryCheck: a
    .mutation()
    .arguments({
      productId: a.id().required(),
      quantity: a.integer().required(),
    })
    .returns(a.ref("Order"))
    .handler(
      a.handler.function(
        defineFunction({ entry: "./create-order-handler.ts" })
      )
    )
    .authorization((allow) => [allow.authenticated()]),
});
```

The handler function receives the AppSync event shape and has access to the full AWS SDK. This is where you put logic that the auto-generated resolvers cannot express: inventory checks, external payment API calls, conditional workflows.

### Function Definitions

For operations that live outside the AppSync layer (scheduled jobs, event-driven processing, webhook handlers), Amplify Gen 2 provides function definitions that deploy as Lambda functions and can be connected to EventBridge, SQS, or HTTP endpoints:

```typescript
// amplify/functions/process-webhook/resource.ts
import { defineFunction, secret } from "@aws-amplify/backend";

export const processWebhook = defineFunction({
  name: "process-webhook",
  entry: "./handler.ts",
  timeoutSeconds: 30,
  environment: {
    // Resolved from the Amplify secret store, not committed to the repo
    STRIPE_WEBHOOK_SECRET: secret("STRIPE_WEBHOOK_SECRET"),
  },
});
```

Once the function is wired to API Gateway in the backend definition, you have a deployable webhook endpoint with the Amplify environment variables and IAM context already configured.

---

## Deployment Pipelines

Amplify Gen 2 integrates with Amplify Hosting for CI/CD. The deployment model provisions sandbox environments (development) from developer machines and deploys production through connected branches in Amplify Hosting.

### Branch-Based Environments

Each Git branch can map to an isolated Amplify environment with its own provisioned resources. This gives you true environment parity between staging and production: the same AppSync API, the same Cognito user pool configuration, the same DynamoDB tables, just with separate provisioned instances.
The configuration in `amplify.yml`:

```yaml
version: 1
backend:
  phases:
    build:
      commands:
        - npm ci
        - npx ampx pipeline-deploy --branch $AWS_BRANCH --app-id $AWS_APP_ID
frontend:
  phases:
    build:
      commands:
        - npm run build
  artifacts:
    baseDirectory: .next
    files:
      - "**/*"
  cache:
    paths:
      - node_modules/**/*
```

### Custom Domains and Edge Configuration

Custom domain configuration in Amplify Hosting is functional but limited compared to a CloudFront distribution you control directly. Advanced edge behaviors (custom cache policies, origin request policies, Lambda@Edge functions) require either going through the Amplify console (which exposes a subset of CloudFront options) or managing the CloudFront distribution outside Amplify.

For Next.js applications that require sophisticated edge configuration (geo-based routing, A/B testing at the edge, advanced cache invalidation), this is a meaningful constraint.

---

## Limitations and When Not to Use Amplify

Amplify Gen 2 is the right tool for applications that map cleanly onto its assumptions. It is the wrong tool when those assumptions conflict with your requirements.

**Complex relational data models.** If your application is fundamentally relational (complex joins, ad-hoc reporting, transactions across multiple entities), DynamoDB is not the right storage layer. Amplify's standard data layer provisions DynamoDB. It can connect to an existing PostgreSQL or MySQL database, but it does not provision or manage one for you as a primary data store.

**Strict infrastructure control requirements.** Some organizations require specific VPC configurations, custom KMS key management, fine-grained IAM policies that differ from what Amplify provisions, or integration with existing AWS infrastructure that predates the Amplify deployment. Amplify supports CDK customization via `defineBackend`, but the further you go down that path, the more you are managing raw CDK rather than Amplify.

**Multi-region deployments.** Amplify Hosting is primarily a single-region deployment model.
Global multi-region active-active architectures are not a natural fit for Amplify's deployment model.

**High-throughput APIs with complex rate limiting requirements.** AppSync is a capable GraphQL endpoint, but organizations with sophisticated API rate limiting requirements, complex quota management, or high per-second throughput needs may find that a custom API Gateway + Lambda setup gives them more control.

**Teams that need to own the infrastructure.** Amplify abstracts significant infrastructure complexity. That is its value proposition. For teams that need deep operational visibility into their infrastructure, that abstraction can be an obstacle rather than an asset.

---

## Production Readiness Considerations

### Logging and Observability

Amplify deploys CloudWatch logging for AppSync and Lambda functions by default. For production applications, this is necessary but not sufficient.

A useful production observability stack on Amplify:

- **CloudWatch Logs Insights** for structured log querying across Lambda and AppSync
- **CloudWatch Alarms** on error rates, latency percentiles, and function throttles
- **X-Ray** for distributed tracing across AppSync resolvers and Lambda functions; enable this at the AppSync level and instrument Lambda handlers
- **RUM** (Real User Monitoring via CloudWatch) for frontend performance data

The Amplify console exposes a subset of these metrics. For serious production monitoring, go directly to CloudWatch and build the dashboards and alarms you need there.

### Custom Domain Setup

Amplify Hosting manages ACM certificate provisioning for custom domains. The setup is straightforward for domains managed in Route 53. For domains managed externally, you will need to add CNAME validation records manually, and the console workflow for this is functional but not fast.

One practical consideration: Amplify uses CloudFront under the hood, but the distribution is managed by Amplify.
If you have existing CloudFront behavior configurations you want to apply, check what Amplify exposes through the console before assuming it is configurable.

### Environment Variable Management

Amplify Gen 2 introduced the `secret()` function for referencing sensitive values. Secrets are stored in AWS Systems Manager Parameter Store and injected at build time and runtime; they are not in your TypeScript code or your environment variable files.

For non-secret configuration that varies by environment (feature flags, API endpoint URLs, tier-specific configuration), use Amplify environment variables configured per branch in the Amplify console or in `amplify.yml`. Do not put environment-specific configuration in your TypeScript backend definition files; doing so defeats the purpose of having branch-based environment parity.

---

## Frequently Asked Questions

### Should I use Amplify Gen 2 or CDK directly?

If you want the managed CI/CD, the integrated auth and data layers, and the Gen 2 TypeScript configuration model, Amplify Gen 2 is reasonable. If you have complex infrastructure requirements or need precise control over every AWS resource, CDK directly gives you more flexibility at the cost of more responsibility. Teams that start with Amplify often migrate specific concerns to CDK via `defineBackend` as the application grows, which is a supported pattern.

### Can I use Amplify Gen 2 with an existing Next.js application?

Yes. The Amplify backend is separate from your Next.js application. You can add Amplify to an existing Next.js project, connect it to Amplify Hosting, and adopt the auth and data layers incrementally. The most common starting point for an existing application is adding Amplify Hosting for CI/CD, then optionally adopting the auth and data layers.

### How does Amplify Gen 2 handle database migrations?

It does not, in the traditional sense. DynamoDB is schemaless, so there is no migration runner.
Schema changes in Amplify Gen 2 that affect DynamoDB (new fields, new indexes) are applied by deploying the updated schema definition. Removing a field from the schema does not delete data from DynamoDB — it stops the auto-generated resolvers from reading it. Managing backward compatibility during schema evolution is your responsibility. --- Amplify Gen 2 is worth evaluating seriously for Next.js applications that align with its model. The TypeScript-first approach is a genuine improvement, and the integrated auth and data layers save real time compared to assembling those components manually. The constraints are real but predictable — knowing them in advance lets you design around them or make an informed decision to use a different stack. If you are evaluating Amplify Gen 2 for a production application and want a second opinion on whether the stack fits your requirements, [start with an architecture conversation](/contact). The answer is usually clear within a focused discussion. --- ## BAA-Ready AI: What to Ask Vendors Source: https://tampadynamics.com/blog/baa-ready-ai-vendor-questions > The specific questions that separate AI vendors who can support a HIPAA workload from vendors who say they can. A practical guide for healthcare buyers in early evaluation. Date: 2026-04-02 "We support HIPAA" is a sentence vendors say. What it means in practice ranges from "we have a BAA template ready to send" to "we have never seen a healthcare customer and we hope it works out." This is the list of questions we use during vendor evaluation for AI products that will touch PHI. The answers you get separate vendors who have done this work from vendors who think they can. ## On the BAA itself **Will you sign your BAA, or ours?** Most established health-tech vendors have a BAA they will sign. Some require theirs. A vendor who balks at signing any BAA is not a healthcare vendor. **What is the BAA's scope?** Does it cover all of the vendor's services or just specific ones? 
Does it cover sub-processors? Does it carve out any data uses (analytics, model improvement) that a covered entity should refuse? **What sub-processors are in scope?** Every third party in the data path needs BAA coverage. The LLM provider, the embedding endpoint, the vector database, the logging service. Get the list. **What is the breach notification timeline?** HIPAA requires timely notification. Vendor BAAs commonly specify 24, 48, or 72 hours. Anything longer than that should be flagged. ## On the data path **Where does PHI go from the moment we send it?** The vendor should be able to draw a diagram of every system the data touches, with each labeled as "BAA-covered" or "not." If they cannot, they have not thought through the data path. **Where is data stored?** AWS region, Azure region, GCP region. Multi-region is fine; "we don't know" is not. Some healthcare contracts require US-based storage; some require specific regions. **Where do models run?** A model accessed via the vendor's API endpoint is not the same as a model deployed in your VPC. Both can be appropriate; the vendor should be clear which they offer. **What happens to PHI at rest?** Encryption with which key? Customer-managed keys? Vendor-managed keys with attestation? The detail matters. ## On training data **Is our data used to train models?** Default at major LLM providers (Anthropic, OpenAI, AWS Bedrock) for enterprise tier is no. Confirm the contract reflects this. Verify the vendor's downstream use of your data is also no. **Does the vendor fine-tune models on customer data?** If yes, what is the data lifecycle? If a customer leaves, can the contributions to fine-tuned models be removed? Often the answer is "not really," which is a deal-breaker for sensitive workloads. **Does the vendor use your data for anything other than serving you?** "Quality improvement," "feature development," "product analytics" are all answers that should slow procurement down. Get specifics. 
## On audit and logging **What is logged for every model invocation?** The right answer includes: requesting user identity, prompt, retrieved context, model output, tool calls, timestamps. "Aggregate analytics" is not the right answer for HIPAA work. **Can we access the audit log directly?** Some vendors keep the AI audit log internal. For HIPAA, you need to be able to query it without a support ticket. API access, with rate limits and quotas, is the expectation. **How long are audit logs retained?** Six years minimum for HIPAA. Some vendors keep less; some keep indefinitely. Both have implications. **Are audit logs encrypted at rest? With customer-managed keys?** The audit log is PHI. Treat it that way. ## On tenant isolation **How is our data isolated from other customers'?** Index-per-tenant, namespace-per-tenant, or shared resources with metadata filters. Each has different security implications. Vendors who answer "logically separated" without specifics are flagging. **Can you demonstrate isolation?** The vendor should be able to show the architecture, the access control layer, and the controls that enforce isolation. Marketing diagrams are not demonstration. **What happens if another customer experiences a breach?** Worst case, what is your exposure? A vendor whose architecture has a single shared resource with weak isolation is one breach away from being yours. ## On model behavior **Which model is being used?** Specifics. "Claude Sonnet" is more useful than "an LLM." "Claude 3.5 Sonnet on AWS Bedrock" is the level of specificity that matters. **When does the model change?** Vendors update underlying models. Will you be notified? Is there an evaluation step? Will behavior change without your knowledge? **Are the prompts proprietary?** If the vendor will not show you the prompts, they will not be able to explain the model's behavior, and neither will you. For HIPAA work where every output may need to be defended, this matters. 
**What guardrails are in place?** Content filters, PII detection, prompt-injection protection. The vendor should have answers. ## On outputs **Are outputs cited?** Every claim the model makes should reference the source documents that informed it. Free-form outputs without citations are not auditable. **What happens when retrieval comes back empty?** The system tells the user it does not know. Or the system makes something up. The first is acceptable for healthcare; the second is not. **Is there a human-in-the-loop boundary?** For any clinical decision, a clinician should review and approve. The vendor's product should support this. If the workflow is "AI takes action automatically," that is a flag for HIPAA workloads. ## On customer support **Has the vendor passed a real procurement at a covered entity?** Reference customers. Names. The vendor should be willing to put you in touch with at least one customer in healthcare. **What is the SLA for security incidents?** AI systems break in novel ways. Hallucinations, prompt injection, retrieval failures. The vendor's response capability matters as much as their preventative posture. **Who is your security contact?** A specific person. A specific email. A specific escalation path. Not "support@vendor.com." ## Red flags A few responses that should slow procurement down materially: - "We use the latest models" without specifying. - "Your data is secure" without explanation. - "We're working on HIPAA" — not the same as HIPAA today. - "We can sign a BAA but it covers fewer services than we offer" — the carve-outs are where the risk lives. - "Our model provider handles that" — they may, but the contract is between you and the vendor in front of you. - Vague answers to detailed technical questions, especially in front of a Security Officer. Some red flags are disqualifying. Some indicate a vendor whose security posture is less mature than they realize, where the right move is a longer evaluation and a stricter SOW. 
## The deeper test Beyond the checklist, the question that often separates ready vendors from unready ones: how the vendor talks about regulatory work. Vendors who have done HIPAA work talk about it as a set of practical engineering and operational decisions. Encryption keys, audit log schemas, retention policies, sub-processor lists. The work is detailed but tractable. Vendors who have not done HIPAA work tend to talk about it abstractly. "We're compliant." "We support HIPAA." "Our infrastructure is secure." The lack of specifics is the signal. If you are working through this checklist and the vendor's answers are consistently abstract, that is the answer. Move on. ## Where we fit We do BAA-readiness reviews for healthcare clients evaluating AI vendors regularly. The conversation often produces a shorter shortlist than the original RFP, but the vendors that survive are the ones that will clear the rest of the procurement process. [Get in touch](/contact) if you have a vendor evaluation in front of you. --- ## Building Compliant AI Workflows for Regulated Industries Source: https://tampadynamics.com/blog/building-compliant-ai-workflows > How to integrate AI into healthcare, legal, and compliance-focused systems while maintaining security, auditability, and regulatory compliance. Date: 2024-11-30 Integrating AI into regulated industries requires more than just connecting an LLM to your application. It demands careful attention to data handling, audit trails, and human oversight. 
## The Challenge Organizations in healthcare, legal, and financial services face unique constraints when adopting AI: - **Data residency and privacy** — PHI, PII, and privileged information can't flow through arbitrary third-party services - **Auditability** — Every AI-assisted decision needs a clear trail for compliance reviews - **Human oversight** — AI should augment human judgment, not replace it without review ## Our Approach At Tampa Dynamics, we architect AI workflows with these principles from day one: ### 1. Data Never Leaves Your Control We design systems where sensitive data stays within your infrastructure. AI models can be self-hosted, or we use privacy-preserving patterns that anonymize data before it reaches external APIs. ### 2. Every Decision is Logged Our systems capture: - What data was sent to the AI - What response was received - Who reviewed the output - What action was taken ### 3. Guardrails by Default We implement validation layers that catch potential issues before they reach end users—whether that's checking for hallucinated information or ensuring outputs meet compliance standards. ## Getting Started If you're exploring AI adoption in a regulated environment, we'd recommend starting with a focused pilot: 1. Identify a specific workflow that's manual and time-consuming 2. Define clear success metrics and compliance requirements 3. Build with audit logging and human review from the start 4. Iterate based on real-world feedback Ready to discuss your AI strategy? [Request an architecture review](/contact) to explore what's possible. --- ## Custom Software for Healthcare Providers: When It Makes Sense and How to Do It Right Source: https://tampadynamics.com/blog/custom-software-for-healthcare > A practical guide to custom healthcare software development — covering use cases, HIPAA requirements, integration complexity, and what distinguishes successful projects. 
Date: 2026-02-13 Most healthcare technology decisions are not binary choices between custom software and nothing. They are choices between custom software and a commercial product that may or may not fit the specific clinical workflow, data environment, or integration requirement at hand. The question "should we build custom software?" almost always deserves a more specific framing: "Is the workflow we need to support well-served by available commercial products, or does our specific combination of clinical context, data model, and integration requirements make custom development the better investment?" This guide is for healthcare operators, clinical IT leaders, and engineering teams working through that evaluation. --- ## When Custom Software Beats Commercial Commercial healthcare software is good at solving the problems that most organizations share: scheduling, billing, basic EHR data capture, standard reporting. The further your requirements deviate from the median, the worse commercial software performs. Custom software is likely the better investment when: **Your workflow is genuinely non-standard.** Specialty practices, research environments, and care delivery models that differ from the ambulatory or inpatient norm often find that commercial tools approximate their workflows without quite fitting. Workarounds in clinical workflows have patient safety implications. Custom software that fits the actual workflow is safer than a commercial product that requires the workflow to adapt to it. **Your data sensitivity requires architectural control.** Some organizations handle data that demands tighter controls than commercial platforms provide — not because commercial platforms are insecure, but because the organization's compliance posture, legal exposure, or regulatory environment requires specific architectural patterns that multi-tenant commercial products cannot accommodate. 
Research data, behavioral health records, and substance use disorder treatment records (42 CFR Part 2) are examples where standard commercial platforms create compliance complexity. **Integration requirements are unusual.** If you need to integrate with proprietary systems, legacy infrastructure, or non-standard EHR configurations that commercial vendors do not prioritize, custom software may be the only practical path. Commercial platforms optimize their integrations for the most common EHRs and the most standard interface patterns. Edge cases in their integration support are often expensive or impossible to address without vendor involvement. **You are building a product, not just solving an internal problem.** Healthcare providers that are building clinical workflow tools to offer to affiliated organizations, health systems deploying internally developed tools across a network, or care delivery companies with proprietary clinical intelligence are building products. Products require control over the roadmap, the data model, and the integration architecture that commercial software does not provide. --- ## Common Use Cases for Custom Healthcare Software ### Clinical Workflow Tools Specialty-specific workflow tools that extend or sit alongside an EHR — rather than replacing it. 
Examples include: - Pre-procedure checklists and documentation workflows that the EHR supports generically but not specifically enough for the clinical protocol - Care coordination tools that aggregate data from multiple source systems into a single interface for a specific team or role - Clinical decision support tools that apply organization-specific protocols to patient data - Population health management tools that operate on data extracted from the EHR and normalized for the organization's specific patient cohort These tools typically integrate with the EHR rather than replacing it, and they require careful attention to data synchronization, conflict resolution, and the user experience of moving between systems. ### Patient Communication and Engagement Patient-facing applications for appointment scheduling, secure messaging, care instructions, and remote patient monitoring are an area where commercial products are abundant but often generic. Custom development makes sense when: - The care model is distinct enough that generic patient portal features do not serve it - The organization needs integration between patient-facing functionality and proprietary backend systems - Branding and user experience are strategic differentiators - The engagement model involves workflows (remote monitoring, chronic disease management, post-procedure follow-up) that commercial platforms support poorly Patient-facing healthcare applications have a different UX bar than clinician-facing tools. Patients are not trained on software; they use it infrequently and often under stress. The design investment in patient-facing custom software is higher, not lower, than for internal tools. ### Analytics and Clinical Reporting Healthcare providers often have significant data assets — years of clinical records, operational data, billing data — that commercial analytics platforms do not integrate or model correctly. 
Custom analytics platforms are appropriate when: - The analysis requires joining clinical, operational, and financial data in ways that commercial BI tools do not support - The clinical metrics being tracked are organization-specific and not handled by standard quality reporting platforms - Machine learning on proprietary clinical data is part of the strategy Analytics platforms that process PHI have the same HIPAA obligations as any other system in the environment. De-identification for analytics is a significant architectural consideration — see the HIPAA compliant app development guide for technical patterns. ### Administrative Automation Revenue cycle, prior authorization, referral management, and credentialing are workflow areas with significant administrative burden. Custom automation is appropriate when: - The existing process is handled manually in ways that create errors, delays, or staff burden - Commercial products for the specific workflow are not available or are a poor fit - Integration with the organization's specific systems creates more complexity than commercial platforms can handle --- ## HIPAA Technical Requirements That Shape Development Every custom healthcare application that creates, receives, maintains, or transmits PHI is a HIPAA-covered system. The technical requirements are not optional add-ons — they are architectural constraints that need to be designed in from the start, not retrofitted. The technical safeguards most likely to affect system design: **Access control architecture** — HIPAA's minimum necessary standard requires that access to PHI be scoped to what is actually needed for the specific purpose. This is not a RBAC checkbox; it is an access control design problem. Role-based access is a minimum. Healthcare applications typically require attribute-based access control that considers facility assignments, care team membership, and the patient's relationship to the accessing clinician. 
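As a minimal sketch of why role-based access alone falls short, the check below grants chart access only when the clinician's role, facility assignment, and care team membership all line up with the patient. Every name here (`Clinician`, `Patient`, `can_view_chart`, the facility and team labels) and the rule itself are hypothetical; a real policy would come from the organization's own access model and would also write an audit record for each decision.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Clinician:
    user_id: str
    role: str
    facilities: frozenset = field(default_factory=frozenset)
    care_teams: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Patient:
    patient_id: str
    facility: str
    care_team: str


# Hypothetical set of roles allowed to open a chart at all.
CLINICAL_ROLES = {"physician", "nurse"}


def can_view_chart(clinician: Clinician, patient: Patient) -> bool:
    """Role is necessary but not sufficient: the attributes must match too."""
    if clinician.role not in CLINICAL_ROLES:
        return False
    if patient.facility not in clinician.facilities:
        return False
    return patient.care_team in clinician.care_teams


nurse = Clinician("u1", "nurse", frozenset({"facility-a"}), frozenset({"cardiology"}))
in_scope = Patient("p1", "facility-a", "cardiology")
other_team = Patient("p2", "facility-a", "oncology")
other_site = Patient("p3", "facility-b", "cardiology")
```

The same nurse role yields different answers for the three patients, which is the behavior a pure role check cannot express.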
**Audit logging** — Every access, modification, and export of PHI must be logged with sufficient detail to reconstruct the complete history of any record. Audit logs must be retained for six years, stored separately from application logs, and protected against modification.

**Encryption** — PHI at rest must be encrypted with documented key management. PHI in transit must use TLS 1.2 or higher. Encryption keys must be managed separately from encrypted data.

**Automatic session termination** — Active clinical sessions must terminate automatically after a configurable period of inactivity. Automatic logoff is an addressable specification under 45 CFR §164.312(a)(2)(iii), which in practice means implementing it or documenting an equivalent alternative.

**BAA requirements for the full vendor stack** — Every vendor whose infrastructure processes PHI needs a BAA. This includes cloud providers, database hosting, authentication services, monitoring and observability tools, and any AI or LLM services integrated into the application. Mapping your vendor stack to BAA coverage is an early architectural requirement, not a legal afterthought.

The HIPAA Security Rule uses required and addressable specifications. Neither category is optional: an addressable specification must be implemented as written, met through a documented equivalent alternative, or documented as not reasonable and appropriate for the environment. If your team is unfamiliar with this distinction, reviewing the full guidance before design begins is worth the time.

---

## EHR Integration Complexity

EHR integration is consistently the most underestimated source of complexity in healthcare software projects. Understanding what you are getting into before committing to an integration approach is essential.

### FHIR as the Standard Path

HL7 FHIR (Fast Healthcare Interoperability Resources) is the current standard for healthcare data exchange, and the 21st Century Cures Act mandated FHIR-based APIs for certified EHR systems. In theory, this means a standardized path to EHR data.
In practice: - FHIR implementations vary significantly across EHR vendors — the same FHIR resource may be populated differently, contain different optional fields, or have different behavior in edge cases - FHIR APIs typically expose a subset of EHR data, not all of it. Custom clinical data, proprietary fields, and legacy data structures may not be available via FHIR - FHIR access often requires going through the EHR vendor's developer program, which involves its own approval process and timeline FHIR is the right starting point for EHR integration. It is not a guarantee of a smooth integration. ### Proprietary APIs and HL7 v2 Many EHR systems have proprietary APIs that expose more data than FHIR or support write operations that FHIR does not. Older interfaces, particularly in larger health system environments, may still use HL7 v2 — a message-based format that predates FHIR by decades and has its own significant implementation variability. Integrating with HL7 v2 interfaces requires an interface engine (Mirth Connect, Rhapsody, or similar) to translate between v2 messages and your application's data model. This adds infrastructure, operational overhead, and expertise requirements. ### Integration Scoping Before committing to EHR integration, define: - Which specific EHR platform(s) you are integrating with (not "major EHRs" — specific systems and versions) - Which data flows you need (read vs. write vs. bidirectional, which specific resources) - Which interface type is available and supported (FHIR R4, SMART on FHIR, HL7 v2, proprietary API) - What the EHR vendor's developer program requirements and timelines look like - Whether a third-party integration platform (Health Gorilla, 1upHealth, Redox) simplifies the integration at acceptable cost Discovering integration constraints after development has started is expensive. Discovery is part of the project, not a pre-project exercise to skip. 
--- ## PHI Data Handling Architecture Custom healthcare software requires a documented, deliberate PHI data architecture. The core pattern: **PHI lives in a designated, access-controlled data store.** This is not your general-purpose application database — it is a separate store with tighter access controls, encryption at rest with managed keys, and audit logging on every access. **Application logic fetches PHI only when a specific clinical purpose requires it.** The application does not load full patient records by default; it fetches the minimum PHI necessary for the current operation. **PHI identifiers and PHI content are separated.** Operational systems work with patient IDs and record identifiers. PHI content — the actual clinical data — is retrieved only at the point of rendering, under the access controls and audit logging of the PHI store. **Non-production environments never use real PHI.** Development, staging, and QA environments use synthetic data or de-identified data. Exposing real PHI in development environments, including through production database snapshots, expands your HIPAA scope to systems and people who should not be in scope. --- ## Patient-Facing vs. Clinician-Facing Design Considerations Healthcare software has two distinct user populations with fundamentally different design requirements. **Clinician-facing tools** are used by trained professionals in high-stakes, time-pressured environments. The design priorities are efficiency, information density, and error prevention. Clinicians learn software through training and repeated use; they tolerate complexity if it supports efficiency. The cost of a usability failure is clinical workflow disruption and, in some contexts, patient safety risk. **Patient-facing tools** are used by people with no specialized training, often in stressful situations, sometimes on mobile devices, sometimes in low-bandwidth environments. The design priorities are clarity, accessibility, and trust. 
The cost of a usability failure is patient disengagement or incorrect understanding of clinical information. These are different product design problems. Teams that try to serve both populations with the same design language typically under-serve one of them. If your application serves both, design the two experiences separately and integrate at the data layer. --- ## Build vs. Buy Decision Framework The build vs. buy decision in healthcare is not a simple cost comparison. The relevant factors: **Workflow fit** — How well does the commercial product fit your specific clinical workflow, without modification? If the answer is "well enough with workarounds," model what those workarounds cost in user time, error rate, and staff frustration over three years. **Integration compatibility** — Can the commercial product integrate with your specific EHR, at the data flows you need, on a timeline that matches your project? Integration promises from vendors deserve skepticism until confirmed in technical detail. **Data control** — Does the commercial product allow you to control your data in the way your compliance posture requires? Data portability, deletion capability, and the vendor's data use policies are all relevant. **Roadmap dependency** — Commercial products build features on their roadmap, not yours. If your requirements are evolving and you need control over the feature set, a commercial product may become a constraint faster than you expect. **Total cost of ownership** — Commercial licensing costs are visible. The cost of living with a poor fit — workarounds, training, data cleanup, integration maintenance — is less visible but often larger. Custom software has higher upfront costs and requires ongoing development investment. Commercial software has lower upfront costs and defers that investment into licensing, configuration, and the cost of workarounds. 
Neither is universally better; the right answer depends on how well the commercial product fits your specific requirements. --- ## Common Project Failure Patterns Healthcare software projects fail in predictable ways: **Underscoped EHR integration.** Integration complexity is discovered during development rather than during scoping. Timeline extends, scope contracts, and the application ships with integration gaps that limit its utility. **Compliance retrofitted rather than designed in.** HIPAA requirements are added to an existing design rather than shaping the design from the start. The result is expensive rework of access control, audit logging, and data handling architecture. **Clinician input deferred.** The application is designed by IT and product teams, validated with clinicians late in the process, and fails to reflect actual clinical workflow. Adoption suffers. **Scope expansion without timeline adjustment.** Clinical requirements grow during development as the organization discovers that the original scope does not fully solve the problem. Timelines do not adjust proportionally. **Vendor accountability gaps.** Development vendor delivers code that passes acceptance testing but does not reflect production-grade security controls. PHI handling, access control, and audit logging are not tested at the level required for a HIPAA-compliant system. 
--- ## What Good Healthcare Software Vendors Actually Deliver A vendor that has done this work before will: - Start with a thorough discovery phase that covers clinical workflow, EHR integration specifics, HIPAA obligations, and BAA requirements for the full vendor stack - Design the access control and audit logging architecture before writing application code - Use de-identified or synthetic data in all non-production environments - Provide a clear data flow diagram that maps all PHI through the system - Treat security review and compliance documentation as part of the deliverable, not an afterthought - Structure the engagement so that clinical workflow validation happens early, not at the end If a vendor is proposing to build a HIPAA-covered application without asking detailed questions about your PHI data flows, access control requirements, and BAA coverage for the full stack, that is a signal worth paying attention to. --- ## The Right Conversation to Start With Custom healthcare software projects that succeed share a common pattern: the technical requirements — HIPAA architecture, EHR integration constraints, access control model, audit logging design — were treated as engineering problems to be solved, not compliance documents to be produced. If you are evaluating custom software development for a clinical workflow, patient engagement, or analytics use case, [an architecture review](/contact) is a structured engagement to work through the technical requirements before committing to a development plan. Our [healthcare software development practice](/services/healthcare-software) and [healthcare AI consulting](/healthcare-ai-consulting) work covers the full range of custom clinical system design. The conversation starts with your specific clinical context, not with a sales pitch about technology capabilities. 

---

## DynamoDB Access Patterns for High-Performance Applications

Source: https://tampadynamics.com/blog/dynamodb-patterns

> A practical guide to DynamoDB data modeling — covering single-table design, access pattern planning, GSIs, sparse indexes, and the patterns that prevent expensive rework.

Date: 2026-01-14

DynamoDB is one of the most performant and scalable databases available on AWS. It is also one of the most expensive to retrofit when the data model is wrong.

The reason is the same in both cases: DynamoDB is built around access patterns. You define the access patterns first, model the data to support them efficiently, and then query exactly as the model expects. Do this correctly and you get single-digit-millisecond latency at any scale. Design it in reverse — start with the data model and figure out access patterns later — and you eventually hit a wall that requires either rebuilding the data model or replacing DynamoDB with a relational database.

This guide covers the patterns that matter, starting with the cardinal rule.

---

## The Cardinal Rule: Access Patterns Before Data Model

In a relational database, you design a normalized schema and then write queries. The query layer is flexible — you can join tables, filter on any column, sort by arbitrary fields, and add indexes later if queries are slow.

In DynamoDB, the query layer is not flexible. You can query efficiently only by partition key, optionally narrowed by a condition on the sort key. Secondary indexes (GSIs and LSIs) expand what you can query, but LSIs must be defined at table creation time, and every index adds storage and write cost that must be maintained. GSIs can be added to an existing table, although backfilling one on a large table takes time and consumes capacity. Ad-hoc queries that were not anticipated in the data model either require expensive scans or are impossible.

This means the design process is inverted. Before modeling any data, document every access pattern your application will use:

```
1. Get user by user_id
2. Get all orders for a user, sorted by date (descending)
3. Get all orders with status=PENDING across all users (admin use)
4. Get order by order_id
5. Get all items in an order
6. Get all orders containing a specific product_id
```

Write these down. All of them. Then design the data model to support every pattern with a primary key query or a GSI query. If a pattern cannot be supported this way, it either needs to be rethought or moved to a different data store.

---

## Primary Key Design

A DynamoDB primary key is either a simple key (partition key only) or a composite key (partition key + sort key). Most production tables use composite keys.

**Partition key (PK)** — Determines which physical partition holds the item. DynamoDB distributes items across partitions based on the partition key hash. All queries must specify the partition key.

**Sort key (SK)** — Enables range queries within a partition. Items with the same partition key and different sort keys are stored together, sorted lexicographically. Sort key queries support begins_with, between, and comparison operators.

The most important property of a good partition key: high cardinality with even distribution. A partition key with low cardinality (e.g., a status field with three possible values) concentrates traffic on a small number of partitions, creating hot partitions that hit throughput limits. A partition key with high cardinality (e.g., user_id or order_id) distributes traffic evenly.

---

## Single-Table Design

Single-table design is the dominant pattern in production DynamoDB systems. The idea: store all entity types in a single table, using the primary key structure to differentiate entities and support multiple access patterns.

This is counterintuitive to engineers with a relational background, where entities live in their own tables. The reason single-table works in DynamoDB is that DynamoDB queries are partition-scoped — items with the same partition key are stored and retrieved together efficiently. Storing related data under a single partition key, using the sort key to differentiate it, enables fetching an entity and its related data in a single query.
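To make the partition-scoped query model concrete, here is a small in-memory sketch in plain Python with no AWS calls. It groups items by partition key, returns them in sort key order, and answers a `begins_with` condition. `ToyTable`, `put`, and `query` are illustrative stand-ins rather than DynamoDB APIs, and the key values are placeholders.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a single table (illustration only)."""

    def __init__(self):
        # partition key -> {sort key: attributes}
        self._partitions = defaultdict(dict)

    def put(self, pk, sk, **attrs):
        # Writing the same (pk, sk) twice overwrites the earlier item.
        self._partitions[pk][sk] = attrs

    def query(self, pk, sk_begins_with=""):
        """Return (sort key, attributes) pairs from ONE partition, in sort key order."""
        partition = self._partitions.get(pk, {})
        return [
            (sk, partition[sk])
            for sk in sorted(partition)
            if sk.startswith(sk_begins_with)
        ]


table = ToyTable()
table.put("USER#user_123", "#METADATA#user_123", name="Ada")
table.put("USER#user_123", "ORDER#2026-02-17#order_456", total=40)
table.put("USER#user_123", "ORDER#2026-01-05#order_321", total=15)
table.put("USER#user_999", "ORDER#2026-02-01#order_777", total=99)

# One query returns the user's orders, already ordered by date.
orders = table.query("USER#user_123", sk_begins_with="ORDER#")

# One query with no sort key condition returns the profile and the orders together.
everything = table.query("USER#user_123")
```

Note what the sketch cannot do: there is no way to ask for "all orders over 30" without visiting every partition. That missing capability is the toy equivalent of a scan, and it is why the access patterns have to be known before the keys are chosen.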
### A Concrete Example

Consider an order management system with Users, Orders, and Order Items.

```
Table: OrdersTable

# User entity
PK: USER#user_123
SK: #METADATA#user_123
Attributes: name, email, created_at

# Order entity (under the user)
PK: USER#user_123
SK: ORDER#2026-02-17#order_456
Attributes: total, status, shipping_address

# Order item entity (under the order)
PK: ORDER#order_456
SK: ITEM#item_789
Attributes: product_id, quantity, unit_price

# Order entity (accessible by order_id directly)
PK: ORDER#order_456
SK: #METADATA#order_456
Attributes: user_id, total, status, created_at
```

This structure supports:

- **Get user**: `PK = USER#user_123, SK = #METADATA#user_123`
- **Get all orders for user**: `PK = USER#user_123, SK begins_with ORDER#`
- **Get orders for user in date range**: `PK = USER#user_123, SK between ORDER#2026-01-01 and ORDER#2026-03-01`
- **Get order by ID**: `PK = ORDER#order_456, SK = #METADATA#order_456`
- **Get all items in order**: `PK = ORDER#order_456, SK begins_with ITEM#`

The date range uses the day after the last day as its upper bound. Every sort key carries an `#order_id` suffix, so `ORDER#2026-02-28#order_999` sorts after the bare string `ORDER#2026-02-28` and an upper bound of `ORDER#2026-02-28` would silently drop the orders placed on February 28.

This is the core value of single-table design: multiple access patterns served by a single table with no joins.

### Item Collections

Items that share a partition key form an item collection. In a single-table design, a user's item collection might contain their profile, their orders, and their addresses — all stored under `PK = USER#user_id`.

Item collection size is limited to 10GB if a local secondary index exists on the table. For most applications, this limit is not reached, but if individual item collections can grow large (e.g., a user with millions of orders), design with this limit in mind.

---

## Global Secondary Indexes

A Global Secondary Index (GSI) is a separate index with its own partition key and sort key, built from a subset of the table's attributes. GSIs enable access patterns that the base table's primary key does not support.
GSI reads are always eventually consistent: you can request strongly consistent reads from the base table, but not from a GSI. GSIs also add storage cost and write throughput cost — every write to the base table that affects a GSI attribute triggers a write to the GSI.

### GSI Overloading

A single GSI can support multiple access patterns if you use the same overloading pattern as the base table. This is GSI overloading.

Example: You need to support two additional access patterns:

1. Get all PENDING orders (across all users) — admin view
2. Look up a user by email address

Rather than creating two GSIs, create one with `GSI_PK` and `GSI_SK` attributes:

```
# For the order entity, populate GSI attributes for status-based lookup
GSI_PK: STATUS#PENDING
GSI_SK: ORDER#2026-02-17#order_456

# For the user entity, populate GSI attributes for email lookup
GSI_PK: EMAIL#user@example.com
GSI_SK: #METADATA#user_123
```

Now a single GSI supports both patterns. The overloaded GSI pattern keeps the number of indexes minimal while supporting a wide range of access patterns.

---

## Sparse Indexes

A sparse index is a GSI that only indexes a subset of items — specifically, only items that have the GSI's partition key attribute defined.

If the `GSI_PK` attribute is only populated on items with a specific status — say, PENDING orders — then the GSI only contains those items. Queries against the sparse index automatically filter to that subset without needing a filter expression.

```
# Only PENDING orders have this attribute set
PENDING_ORDER_GSI_PK: "PENDING"   # Only set on orders with status=PENDING

# COMPLETED orders do not have this attribute at all
# → They are not in the GSI
```

Querying the GSI for `PENDING_ORDER_GSI_PK = PENDING` returns only pending orders, efficiently, without scanning completed orders. When an order is fulfilled and its status changes to COMPLETED, the attribute is removed, and the item is automatically removed from the GSI.
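The sparse-index behavior described above can be mimicked in a few lines of plain Python. This is a toy model, not DynamoDB code: `build_sparse_index` is a hypothetical helper that includes an item only if it carries the index key attribute, which is the property that makes the index sparse.

```python
def build_sparse_index(items, index_key_attr):
    """Group items by index key value; items lacking the attribute are skipped."""
    index = {}
    for item in items:
        if index_key_attr in item:
            index.setdefault(item[index_key_attr], []).append(item)
    return index


orders = [
    {"PK": "ORDER#order_456", "status": "PENDING", "PENDING_ORDER_GSI_PK": "PENDING"},
    {"PK": "ORDER#order_457", "status": "COMPLETED"},
    {"PK": "ORDER#order_458", "status": "PENDING", "PENDING_ORDER_GSI_PK": "PENDING"},
]

index = build_sparse_index(orders, "PENDING_ORDER_GSI_PK")
pending_before = [o["PK"] for o in index.get("PENDING", [])]

# Fulfil order_456: change the status and drop the index attribute together.
orders[0]["status"] = "COMPLETED"
del orders[0]["PENDING_ORDER_GSI_PK"]

index = build_sparse_index(orders, "PENDING_ORDER_GSI_PK")
pending_after = [o["PK"] for o in index.get("PENDING", [])]
```

In DynamoDB itself the index is maintained for you; the application's only job is to remove the attribute, typically with a `REMOVE` clause in the same `UpdateExpression` that sets the new status, so the two changes cannot drift apart.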
Sparse indexes are useful for any pattern that involves "get all X where Y is true" where Y is a state that applies to a minority of items.

---

## Relationship Patterns

### 1:1 Relationships

Store as separate items sharing a partition key, or as attributes on a single item if the data is always accessed together and the total item size remains under 400KB.

### 1:N Relationships

Use the parent entity's ID as the partition key and the child entity's ID (or a sortable attribute) as the sort key. This supports fetching all children of a parent in a single query.

```
PK: ACCOUNT#account_123
SK: TRANSACTION#2026-02-17T14:23:11Z#txn_456
```

For large collections, where the parent may have millions of children, consider whether you actually need to fetch all children or only recent children. The sort key's range query capability is particularly useful here — `SK begins_with TRANSACTION#2026-02` fetches only February transactions, for example.

### M:N Relationships

Many-to-many relationships require explicit join items. If users can belong to many teams, and teams have many users:

```
# User → Teams lookup
PK: USER#user_123
SK: TEAM#team_456
Attributes: role, joined_at

# Team → Users lookup (duplicate item with swapped PK/SK for the GSI, or separate item)
PK: TEAM#team_456
SK: USER#user_123
Attributes: role, joined_at
```

This pattern duplicates the relationship data, which is the DynamoDB approach to supporting queries in both directions without joins.

---

## Transactional Writes

DynamoDB supports ACID transactions across up to 100 items in a single TransactWriteItems call.
This enables: - Creating multiple related items atomically (e.g., creating an order and decrementing inventory in a single transaction) - Conditional writes that fail if a precondition is not met (e.g., only create an item if an item with that key does not already exist) - Consistent multi-item updates that should not be partially applied Transactions in DynamoDB cost twice the write capacity of non-transactional writes (the overhead of the coordination mechanism). For operations that genuinely require atomicity, this is the right tool. For operations that do not, pay the lower cost of standard writes. ### Optimistic Locking with Version Numbers For concurrent update scenarios, DynamoDB's conditional write expressions enable optimistic locking without a separate locking mechanism: ``` # Write condition: only update if version_number matches expected value ConditionExpression: "version_number = :expected_version" UpdateExpression: "SET version_number = :new_version, ..." ``` If two processes try to update the same item concurrently, the second write fails the condition check. The second process then retries with the current state. This is the standard pattern for preventing lost updates in DynamoDB. --- ## DynamoDB Streams for Event-Driven Patterns DynamoDB Streams captures a time-ordered sequence of item-level changes (inserts, updates, deletes) and makes them available for downstream processing. This enables event-driven architectures without polling. Common patterns built on DynamoDB Streams: **Derived data maintenance.** When an order item is updated, a stream processor recalculates the order total and updates the parent order item. The parent is always consistent with its children without requiring the write path to do both updates. **Cross-region replication.** Stream processors read changes from a primary region and replicate them to secondary regions. (AWS Global Tables is a managed version of this pattern.) 
**Audit logging.** Every item change is captured in the stream and written to an audit log store. This is a clean separation between the application write path and the audit trail — the application writes to DynamoDB, the stream processor writes to the audit log, and neither path knows about the other. **Search index synchronization.** Item changes in DynamoDB trigger an OpenSearch index update via a stream processor. The operational database and the search index stay in sync without coupling the write path to the search index write. Stream records are available for 24 hours. Consumers must process them within that window or miss them. For audit and compliance use cases, ensure your stream consumer has adequate error handling and retry logic. --- ## TTL for Automatic Expiration DynamoDB's Time to Live (TTL) feature automatically deletes items when a specified timestamp attribute passes. TTL deletions are background operations — they do not consume write capacity and occur within approximately 48 hours of the TTL attribute's expiry time (not exactly at the specified time). TTL is appropriate for: - **Session data** — Session records that should expire after a fixed inactivity period - **Temporary state** — Pending verifications, one-time tokens, in-progress operations with timeouts - **Caching** — Items used as a DynamoDB cache layer, where stale data should be removed automatically TTL deletions appear in DynamoDB Streams, which means TTL can be used to trigger downstream cleanup operations — deleting related items in other tables or updating derived data when the primary item expires. For compliance use cases where data must be retained for a defined period and then deleted, TTL combined with a stream processor provides a clean deletion mechanism with a downstream audit log of the deletion event. --- ## Capacity Planning: On-Demand vs. Provisioned DynamoDB offers two billing modes: **On-demand** — You pay per request. 
DynamoDB automatically scales to any traffic level with no configuration. No capacity planning required. Higher per-request cost than provisioned at sustained load. **Provisioned** — You specify the read and write capacity units (RCUs and WCUs) the table should maintain. Lower per-unit cost than on-demand at predictable load. Requires capacity planning and auto-scaling configuration to handle traffic spikes. For most applications, on-demand mode is the right default: - No risk of throttling from under-provisioning - No wasted spend from over-provisioning - Zero capacity planning required - Appropriate for variable or unpredictable traffic Switch to provisioned mode when: you have a well-characterized, stable traffic pattern and the per-request cost difference justifies the operational overhead of capacity management. For high-throughput applications with millions of requests per day, the cost difference is significant. --- ## Cost Optimization Patterns **Project attributes to reduce item size.** DynamoDB bills on the size of the items read and written. Large items cost more per operation. Storing large blobs in S3 and storing the S3 reference in DynamoDB reduces item size and read/write cost. **Use batch operations.** BatchGetItem and BatchWriteItem reduce per-request overhead for bulk operations. TransactWriteItems is more expensive than batch writes for non-transactional operations — use it only when atomicity is actually required. **Prefer eventual consistency where possible.** Eventual consistency reads cost half the RCUs of strongly consistent reads. For read patterns that do not require seeing the most recent write (e.g., displaying a list that updates periodically), eventual consistency is appropriate and less expensive. **Archive infrequently accessed data.** Items that are rarely accessed but must be retained (e.g., historical order records, archived documents) can be moved to S3 and accessed via Athena, reducing DynamoDB storage costs. 
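
To make the cost levers above concrete, here is a rough capacity-unit calculator. It is a sketch under stated assumptions: one read unit covers up to 4 KB and one write unit up to 1 KB (treated here as 4096 and 1024 bytes), eventually consistent reads cost half, and transactional operations cost double. The function names are illustrative, and actual billing should be checked against current AWS documentation.

```python
import math

def read_capacity_units(item_size_bytes: int, consistency: str = "eventual") -> float:
    """Estimate read units consumed by reading one item.

    Sizes round up to the next 4 KB block (assumed to be 4096 bytes).
    """
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    multiplier = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}[consistency]
    return blocks * multiplier

def write_capacity_units(item_size_bytes: int, transactional: bool = False) -> int:
    """Estimate write units consumed by writing one item.

    Sizes round up to the next 1 KB block (assumed to be 1024 bytes).
    """
    blocks = max(1, math.ceil(item_size_bytes / 1024))
    return blocks * (2 if transactional else 1)

# A 6,000-byte item spans two 4 KB read blocks and six 1 KB write blocks.
print(read_capacity_units(6000, "strong"))    # 2.0
print(read_capacity_units(6000, "eventual"))  # 1.0
print(write_capacity_units(2500))             # 3
print(write_capacity_units(2500, True))       # 6
```

The arithmetic shows why trimming item size and defaulting to eventually consistent reads both reduce cost: each 4 KB boundary crossed adds a read unit, and strong consistency doubles the total.
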
--- ## When NOT to Use DynamoDB DynamoDB is the wrong choice when: **Your access patterns are not known upfront.** If you need ad-hoc queries across arbitrary fields — analytical queries, exploratory data access, complex filtering — DynamoDB will frustrate you. Use a relational database (Aurora PostgreSQL) or a purpose-built analytics store (Redshift, Athena over S3). **You need complex transactions across many items.** DynamoDB's 100-item transaction limit and lack of multi-table join capability make it unsuitable for systems with complex relational constraints — financial ledgers with multi-table consistency requirements, inventory systems with cascading updates across many entities. **Your data model is highly relational and frequently changing.** Single-table DynamoDB models for complex domains with many entity types and many access patterns are difficult to design correctly and difficult to evolve. If the access patterns are genuinely unpredictable, the flexibility of a relational database is worth the scaling trade-off. **You need full-text search or fuzzy matching.** DynamoDB supports exact key lookups and range queries. Full-text search, stemming, and fuzzy matching require OpenSearch or a purpose-built search service. DynamoDB is excellent for: user and session data, time-series event data, leaderboards and rankings, operational data stores with well-defined access patterns, and high-throughput write workloads. It is not a general-purpose replacement for relational databases. --- ## Building the Right Model Before You Build the System The upfront investment in DynamoDB access pattern analysis pays back multifold. An hour spent writing out every access pattern before touching the data model prevents weeks of rework when the application is in production and a new access pattern requires restructuring the table. 
If you are building an application on AWS and working through the data architecture — whether that is DynamoDB, Aurora, or a hybrid — [an architecture review](/contact) covers this territory. Our [cloud architecture practice](/services/cloud-architecture) and [AI development work](/services/ai-development) both involve DynamoDB design as a regular part of system design engagements. The goal is always the same: model the data correctly the first time, so the system does not need to be rebuilt when it scales.

---

## FHIR vs HL7: A Practical Comparison for Healthcare Software Teams

Source: https://tampadynamics.com/blog/fhir-vs-hl7

> A technical comparison of FHIR and HL7 v2 for engineering teams building healthcare integrations. Covers data models, interoperability use cases, EHR compatibility, and implementation considerations.

Date: 2026-02-04

If you are building software that integrates with hospitals, clinics, health systems, or any EHR platform, you will encounter both HL7 v2 and FHIR. Understanding the difference between them is not just an academic exercise — choosing the wrong integration approach for a given environment can cost months of development time and produce an integration that no one can maintain.

This guide is written for engineering teams who need to make concrete decisions about healthcare integrations, not for readers who want a survey of standards bodies and working group history.

---

## HL7 v2: What It Is and Why It Still Dominates

HL7 v2 is a messaging standard that has been in production use since the late 1980s. It is pipe-delimited, segment-based, and looks like this:

```
MSH|^~\&|SendingApp|SendingFacility|ReceivingApp|ReceivingFacility|20260217142311||ADT^A01|MSG000001|P|2.5
EVN|A01|20260217142311
PID|1||12345^^^Hospital^MR||Smith^John^A||19800315|M|||123 Main St^^Tampa^FL^33601^US|||||||12345-6789
PV1|1|I|2NORTH^201^01^Hospital||||1234^Attending^Doctor|||||||||||V|2026001234
```

Each line is a segment, identified by its three-letter prefix (MSH, EVN, PID, PV1). Within a segment, fields are separated by pipes, and components within a field are separated by carets.
The message type (ADT^A01, in this case an admit notification) tells the receiving system what to do with the message. HL7 v2 has message types for every significant clinical event: admissions (ADT), laboratory results (ORU), orders (ORM), pharmacy (RDE), scheduling (SIU), and more. HL7 v2 is not a modern standard. It predates JSON, REST, and modern API design patterns by decades. But it is the integration backbone of essentially every hospital built before 2015, and many built after. Epic, Cerner (Oracle Health), MEDITECH, and McKesson — the EHRs running most US hospitals — all speak HL7 v2 natively through their integration engines. The reason HL7 v2 persists is not technical merit. It persists because hospitals have integration engines (Mirth Connect, Rhapsody, Infor Cloverleaf, Iguana) that process millions of HL7 v2 messages per day across connections that have been stable for years. Replacing that infrastructure is expensive and risky, and there is no compelling reason to do it when the existing system works. ### What "Works" Actually Means HL7 v2 interoperability is often described as "minimal interoperability" in the standards community, and this is fair. The standard permits significant field-level variation. The same concept may be encoded differently across hospitals, or even across departments in the same hospital. Z-segments — custom extensions — are common and non-portable. What this means in practice: building an HL7 v2 integration involves not just implementing the standard but negotiating with the specific hospital's implementation of the standard, which will have its own field population patterns, message volume characteristics, and historical quirks. You do not implement HL7 v2 once and connect to every hospital. You implement HL7 v2 and then tune for each site. This site-by-site variation is the primary cost driver for HL7 v2 integrations. Budget for discovery and customization at each new site. 
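
As a rough illustration of the message structure covered in this section, the sketch below splits the sample ADT^A01 message into segments, fields, and components. It is deliberately naive: it ignores escape sequences, field repetitions (`~`), subcomponents (`&`), and repeated segments, all of which a real library such as HAPI or `python-hl7` handles. The helper names are made up for this example.

```python
# Segments are separated by carriage returns on the wire.
SAMPLE = "\r".join([
    r"MSH|^~\&|SendingApp|SendingFacility|ReceivingApp|ReceivingFacility|20260217142311||ADT^A01|MSG000001|P|2.5",
    "EVN|A01|20260217142311",
    "PID|1||12345^^^Hospital^MR||Smith^John^A||19800315|M|||123 Main St^^Tampa^FL^33601^US|||||||12345-6789",
    "PV1|1|I|2NORTH^201^01^Hospital||||1234^Attending^Doctor|||||||||||V|2026001234",
])

def parse_segments(message: str) -> dict[str, list[str]]:
    """Split a message into segments keyed by segment ID (last one wins if repeated)."""
    segments: dict[str, list[str]] = {}
    for line in message.replace("\n", "\r").split("\r"):
        if line.strip():
            fields = line.split("|")
            segments[fields[0]] = fields
    return segments

def get_field(segments: dict[str, list[str]], seg_id: str, number: int) -> str:
    """Return field `number` using HL7 numbering (for example PID-5).

    MSH is offset by one because MSH-1 is the field separator itself.
    """
    fields = segments[seg_id]
    index = number - 1 if seg_id == "MSH" else number
    return fields[index] if index < len(fields) else ""

segs = parse_segments(SAMPLE)
msg_type, trigger = get_field(segs, "MSH", 9).split("^")[:2]
family, given = get_field(segs, "PID", 5).split("^")[:2]
print(msg_type, trigger, family, given)  # ADT A01 Smith John
```

Site-specific quirks (unexpected field population, Z-segments) usually show up as exceptions to lookups like these, which is one reason per-site tuning dominates the integration cost.
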
--- ## FHIR: What It Actually Is FHIR (Fast Healthcare Interoperability Resources, pronounced "fire") is a standard developed by HL7 International starting around 2011, with R4 reaching normative status in 2019. It is fundamentally different from HL7 v2 in both design and intent. FHIR is REST-based and resource-oriented. Clinical data is modeled as typed resources (Patient, Observation, Encounter, Medication, DiagnosticReport, and hundreds of others), exchanged as JSON or XML, and accessed through a RESTful API: ``` GET /Patient/12345 GET /Observation?patient=12345&code=http://loinc.org|2339-0 POST /DocumentReference ``` A FHIR Patient resource looks like this (abbreviated): ```json { "resourceType": "Patient", "id": "12345", "identifier": [ { "system": "http://hospital.org/mrn", "value": "MRN-789456" } ], "name": [ { "family": "Smith", "given": ["John", "A"] } ], "birthDate": "1980-03-15", "gender": "male", "address": [ { "line": ["123 Main St"], "city": "Tampa", "state": "FL", "postalCode": "33601" } ] } ``` FHIR resources reference each other by ID, support standard CRUD operations, and are designed to be queried using a standardized search API. The model is far more developer-friendly than HL7 v2 if you are building a new integration and your target system supports it. ### SMART on FHIR SMART on FHIR (Substitutable Medical Applications and Reusable Technologies) is a framework layered on top of FHIR that adds OAuth 2.0-based authorization and a standardized launch context. It enables applications to be launched from within EHRs with the appropriate patient and user context, and to request scoped permissions to read or write FHIR resources. SMART on FHIR is the mechanism by which third-party applications integrate with modern EHR APIs. If you are building an application that will be accessed from within Epic, Cerner, or another SMART-enabled EHR, SMART on FHIR is not optional — it is the integration model. 
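
For orientation, this is roughly what the first step of a SMART on FHIR launch looks like: the app redirects the user to the EHR's authorization endpoint with a set of standard query parameters. The sketch below is illustrative only. The endpoint, client ID, and URLs are placeholders, the function name is made up, and a real app would discover the authorization endpoint from the server's `.well-known/smart-configuration` document. It also omits the PKCE parameters (`code_challenge`, `code_challenge_method`) that SMART App Launch 2.0 expects.

```python
import secrets
from urllib.parse import urlencode

def build_authorize_url(authorize_endpoint, client_id, redirect_uri,
                        fhir_base_url, scopes, launch=None, state=None):
    """Build the authorization redirect URL for a SMART on FHIR launch.

    `launch` is the opaque token the EHR passes to an EHR-launched app;
    leave it as None for a standalone launch.
    """
    state = state or secrets.token_urlsafe(16)  # CSRF protection, checked on callback
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
        "aud": fhir_base_url,  # the FHIR server the resulting token is intended for
    }
    if launch is not None:
        params["launch"] = launch
    return f"{authorize_endpoint}?{urlencode(params)}", state

# Placeholder values for illustration only.
url, state = build_authorize_url(
    authorize_endpoint="https://ehr.example.org/oauth2/authorize",
    client_id="my-app-client-id",
    redirect_uri="https://app.example.org/callback",
    fhir_base_url="https://ehr.example.org/fhir/R4",
    scopes=["launch", "openid", "fhirUser", "patient/Observation.read"],
    launch="abc123",
)
```

After the user approves, the EHR redirects back to `redirect_uri` with a `code` and the same `state`; the app then exchanges the code for an access token at the token endpoint.
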
---

## Key Differences: A Side-by-Side View

| Dimension | HL7 v2 | FHIR R4/R5 |
|---|---|---|
| Protocol | TCP/MLLP (or HTTPS) | HTTPS REST |
| Format | Pipe-delimited segments | JSON or XML |
| API style | Message-based (push) | Resource-based (request/response) |
| Query support | Limited (predefined message types) | Rich search API with many parameters |
| Versioning | v2.1 through v2.8.2 | DSTU2, STU3, R4, R5 |
| EHR compatibility | Universal (all legacy EHRs) | Modern EHRs (Epic, Cerner, etc.) via APIs |
| Standardization | Significant site variation | More consistent, still some variation |
| Developer experience | Requires specialized knowledge | REST-standard, more accessible |
| Real-time events | Push model via MLLP | Subscriptions (R4B+), webhooks |

---

## FHIR R4 vs R5

R4 is the current normative standard and is what you will encounter in production EHR APIs. Epic, Cerner, and most payers that offer FHIR APIs support R4.

R5 was released in 2023 and introduces meaningful improvements: a more robust subscription model, improved versioning support, enhanced search capabilities, and refinements to several resource types. However, as of early 2026, R5 is not widely implemented in production EHR systems. Building a new integration against R5 will limit your compatible systems.

The practical guidance: target R4 for any integration you are building now. Monitor R5 adoption, particularly in the subscription and event notification use cases where R5 improvements are most significant.

---

## The 21st Century Cures Act and FHIR Mandates

The 21st Century Cures Act (2016) and the HHS Interoperability and Information Blocking rules (finalized 2020, enforcement began 2022) created a regulatory mandate for FHIR-based data access. The key requirements:

**Certified EHR Technology** — EHRs that seek ONC certification (required for Meaningful Use incentives) must support FHIR R4 APIs for patient data access, specifically through SMART on FHIR-enabled apps.

**US Core profiles** — The regulations mandate support for the US Core Implementation Guide, which defines the minimum data elements and FHIR profiles that must be supported. US Core R4 profiles specify required fields and search parameters for Patient, Condition, AllergyIntolerance, Immunization, MedicationRequest, and other core resources. **Information blocking prohibitions** — Health systems, EHR vendors, and health IT developers are prohibited from practices that unreasonably restrict the access, exchange, or use of electronic health information. This creates a legal environment where refusing to provide FHIR API access is increasingly difficult to justify. The practical implication for software teams: large health systems now have a regulatory obligation to expose FHIR APIs, and Epic and Cerner have both implemented FHIR R4 APIs across their customer base. Accessing data from a modern EHR is often cleaner through FHIR than through legacy HL7 v2 interfaces, and the regulatory trend continues to push in that direction. --- ## When You Will Encounter Each Standard **HL7 v2 environments** — Community hospitals and critical access hospitals running older EHR versions, radiology and laboratory information systems, older pharmacy systems, long-term care facilities, legacy data migration projects, and any integration with a hospital's existing interface engine. If the hospital is running any version of Epic before 2018, Cerner before approximately 2020, or MEDITECH for most use cases, expect HL7 v2 to be the primary interface mechanism. **FHIR environments** — Epic MyChart APIs, Epic third-party developer program (App Orchard), Cerner HealtheIntent, payer data exchange (the Payer-to-Payer FHIR rule requires payers to support FHIR R4), government programs (CMS Blue Button 2.0 for Medicare data), patient-facing applications accessing data from modern EHR patient portals, and most new health tech integrations being designed after 2021. 
**Both simultaneously** — Large health systems with a mix of modern and legacy infrastructure, organizations aggregating data from multiple facilities, and applications that need both real-time event processing (where HL7 v2 ADT feeds are common) and on-demand data access (where FHIR APIs are preferred). --- ## Implementation Complexity ### HL7 v2 Integration Complexity Building an HL7 v2 integration without an integration engine is painful. The MLLP transport protocol (a TCP-based framing protocol) is not HTTP, which means your existing HTTP client libraries do not work out of the box. Message parsing requires either a specialized library or custom parsing logic. Libraries worth evaluating: - **Java**: HAPI HL7 (the de facto standard, actively maintained) - **Python**: `hl7apy`, `python-hl7` - **Node.js**: `node-hl7-client` - **.NET**: NHapi The more significant challenge is integration engine interoperability. Most hospitals route HL7 v2 messages through an integration engine (Mirth Connect is open source and common; Rhapsody and Infor Cloverleaf are common in large health systems). You will often need to work with the hospital's IT team or their integration engine vendor to establish the connection, configure message filtering, and handle acknowledgment patterns. Factor in time for the hospital IT approval process. Even straightforward HL7 v2 integrations routinely take three to six months from initial conversation to live data flow, because health system IT approval processes are cautious by design. ### FHIR Implementation Complexity FHIR is technically more accessible than HL7 v2 — it is REST over HTTPS, using JSON, with a well-documented API. The technical implementation is straightforward compared to MLLP and HL7 v2 parsing. The complexity in FHIR integrations comes from three places: **Profile compliance** — FHIR resources have a base definition and then profiles (US Core, QI-Core, Da Vinci, etc.) that add constraints and required elements. 
A Patient resource from Epic may differ from a Patient resource from Cerner in which extensions are populated, which identifiers are present, and which search parameters are supported. The FHIR specification permits significant optional variation; actual interoperability requires working within the profiles that both systems support. **OAuth 2.0 and SMART flows** — SMART on FHIR adds authorization complexity, particularly around the launch context flow (EHR-launched vs. standalone launch), scope management, and token refresh. For patient-facing applications, the authorization flow involves the patient authenticating with the health system's patient portal, which varies across EHRs. **Epic and Cerner API-specific behaviors** — In practice, you are often integrating not with abstract FHIR but with Epic's FHIR API or Cerner's FHIR API. Each has quirks: rate limits, non-standard extensions, specific search parameter support, and their own developer registration and access approval processes. Epic's sandbox environment is accessible through their open.epic.com developer portal; Cerner's through the code.cerner.com portal. --- ## Security Considerations for Both ### HL7 v2 Security HL7 v2 over MLLP has minimal native security. The historical assumption was that HL7 v2 traffic ran over private hospital networks, not the public internet. Modern deployments should use: - **MLLP over TLS** (MLLP+) for encrypted transport when messages traverse any network segment you do not fully control - **VPN tunnels** for connections between facilities or between a vendor system and a hospital's internal network - **IP allowlisting** at the firewall level to restrict which systems can send and receive HL7 v2 messages Even over private networks, HL7 v2 messages contain PHI in plaintext. Network access controls and encryption in transit are necessary even in internal environments. ### FHIR Security FHIR over HTTPS provides transport encryption. 
But several additional security considerations apply: **Token scope management** — SMART on FHIR scopes define what the application can access. Request only the scopes you need. Broad scopes (`patient/*.*`) are appropriate in some development scenarios; they are not appropriate in production. **Token storage** — Access tokens and refresh tokens must be stored securely. In a web application context, this means server-side storage or secure httpOnly cookies — not localStorage or sessionStorage, which are accessible to JavaScript and vulnerable to XSS. **Audit logging** — Access to FHIR APIs that return PHI needs to be logged at the application level, not just at the transport level. Your application needs an audit trail of which patient data was accessed, by whom, and for what purpose — independent of what the EHR logs on its end. **PKCE** — The SMART on FHIR 2.0 specification requires PKCE (Proof Key for Code Exchange) for public clients. If you are implementing SMART on FHIR authorization, use PKCE regardless of whether it is strictly required for your client type. 
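
As a concrete reference for the PKCE point, the sketch below generates a code verifier and its S256 code challenge as defined in RFC 7636. The function names are illustrative. The app sends the challenge (with `code_challenge_method=S256`) in the authorization request and the original verifier in the token request.

```python
import base64
import hashlib
import secrets

def make_code_verifier(num_bytes: int = 32) -> str:
    """Return a random URL-safe verifier (32 random bytes gives 43 characters)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def make_code_challenge(verifier: str) -> str:
    """Return the S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

The verifier should be kept server-side (or in memory for a native app) for the duration of the flow and never logged, since anyone holding it can redeem an intercepted authorization code.
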
--- ## Practical Guidance: Which to Use When **Use HL7 v2 when:** - Your integration target is a community hospital or facility running a legacy EHR - You need real-time ADT feed processing (admit/discharge/transfer events) - The hospital IT team has an existing integration engine and prefers to route messages through it - You are building a device integration (many medical devices and monitoring systems still output HL7 v2) - The data you need is only available through the legacy interface **Use FHIR when:** - Your integration target is a modern EHR (Epic post-2018, Cerner, athenahealth, modern eClinicalWorks) - You are building a patient-facing application that accesses data the patient has a right to access - You need on-demand data access (query by patient, query by date range, search by condition code) - You are building for payer data exchange under the CMS Interoperability Rule - You are starting a new integration and have the option to choose **Use both when:** - You are aggregating data from a mixed environment (modern and legacy EHRs) - You need real-time event notifications (HL7 v2 ADT) plus on-demand data access (FHIR queries) - You are building a platform that needs to connect to any hospital, regardless of their technology generation --- ## Frequently Asked Questions ### Can we translate HL7 v2 messages into FHIR resources? Yes, and there are published mappings for the most common conversions (ADT to FHIR Encounter/Patient, ORU to FHIR DiagnosticReport/Observation). The HL7 community maintains a ConceptMap and StructureMap library for common v2-to-FHIR translations. Integration engines like Mirth Connect have modules for FHIR transformation. The translations are not perfect — HL7 v2 and FHIR do not have 1:1 field correspondence for all data — but they are workable for most common use cases. ### Is FHIR replacing HL7 v2? Gradually, and not uniformly. FHIR is the direction the industry is moving, accelerated by regulatory mandates. 
But HL7 v2 will remain in production healthcare systems for many years because replacing integration infrastructure at hospitals is slow and expensive. Any team building healthcare integrations should be comfortable with both. ### What is CDA and how does it relate to FHIR? CDA (Clinical Document Architecture) is an HL7 standard for structured clinical documents — discharge summaries, referral letters, care plans. It uses XML and was the basis for Meaningful Use Stage 2 requirements (via C-CDA, Consolidated CDA). FHIR has largely superseded CDA for new implementations, but C-CDA documents are still widely used for transitions of care (hospital discharge to primary care, for example). Many FHIR integrations include the ability to generate or consume C-CDA documents for document-based workflows even when data APIs are FHIR-based. --- If your team is building a healthcare integration and needs clarity on which approach fits your specific environment, or if you are designing a system that needs to handle both HL7 v2 and FHIR at scale, [start with an architecture conversation](/contact). We work in this space regularly and can give you a direct answer based on your actual target environment. --- ## Fintech Software Development: Compliance, Security, and Scale Source: https://tampadynamics.com/blog/fintech-development-guide > A technical guide to fintech software development — covering regulatory frameworks, security architecture, payment processing, and the engineering patterns that matter in financial services. Date: 2026-01-21 Fintech development is not regular software development with a few extra security requirements. The regulatory environment, the sensitivity of financial data, the liability exposure of processing errors, and the fraud surface area create a distinct set of engineering constraints that shape architecture decisions from the beginning. 
This guide is for engineering leaders, CTOs, and founders building financial services software — covering the regulatory landscape, core security architecture, and the engineering patterns that distinguish fintech systems from general-purpose software. --- ## The Regulatory Landscape Fintech operates in a fragmented regulatory environment. The specific regulations that apply depend on the product type, the customers served, and the states and countries where the product operates. Getting this wrong is not just a compliance problem — it is a legal and operational risk. ### PCI DSS The Payment Card Industry Data Security Standard applies to any system that stores, processes, or transmits cardholder data. PCI DSS is not a government regulation — it is a contractual requirement imposed by the card networks (Visa, Mastercard, etc.) through payment processor agreements. The most important decision in PCI DSS compliance is scope reduction: minimize the systems that touch cardholder data. Using a payment processor like Stripe and never allowing raw card data to reach your servers reduces your PCI scope dramatically. Stripe handles card data; your system handles tokens. The compliance burden of a tokenized integration is far lower than processing card data directly. If your product requires storing cardholder data — recurring billing without a vault, legacy hardware integrations — PCI DSS compliance is a significant engineering and audit commitment. Level 1 compliance requires an annual assessment by a Qualified Security Assessor and quarterly network scans. ### SOC 2 SOC 2 Type II is the de facto standard for fintech companies selling to enterprise customers or other businesses. It is an audit of your security controls over a defined period (typically six to twelve months), covering five Trust Services Criteria: security, availability, processing integrity, confidentiality, and privacy. 
SOC 2 is not a specific technical standard — it is a framework that requires you to define your controls and demonstrate that you operate them consistently. Engineering teams are responsible for implementing the controls; compliance teams manage the audit process. Common technical controls that SOC 2 assesses: - Logical access controls and multi-factor authentication - Encryption at rest and in transit - Audit logging and monitoring - Change management and deployment controls - Incident response procedures SOC 2 readiness is not a last-minute project. The "Type II" designation means the audit covers a period of time, not a point-in-time assessment. You need to operate your controls consistently for the audit period before you can achieve the certification. Starting SOC 2 readiness work six months before you need the certification is late. ### GLBA The Gramm-Leach-Bliley Act applies to financial institutions and requires safeguards for customer financial information. The FTC Safeguards Rule, updated in 2023, specifies more concrete technical requirements: encryption, access controls, multi-factor authentication, audit logging, and a written information security program. If your product is a financial institution — or if you handle customer financial data on behalf of a financial institution — GLBA applies and its technical requirements need to be reflected in your architecture. ### State Licensing and Money Transmission Payment products, lending products, and money transmission products require state-specific licenses in most US states. The licensing requirements vary significantly by state and by product type. This is a legal and compliance problem, not an engineering problem — but the engineering team needs to understand which states the product is licensed to operate in, because that determines where customers can be onboarded. 
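
One way the licensing constraint tends to surface in code is as a gate in the onboarding flow. The sketch below is purely illustrative: the product names and state lists are invented, and in practice the table would be owned by legal and compliance and loaded from configuration instead of being hard-coded.

```python
# Hypothetical licensing table; real values come from legal/compliance.
LICENSED_STATES = {
    "money_transmission": {"FL", "TX", "CO"},
    "consumer_lending": {"FL"},
}

def can_onboard(product: str, state_code: str) -> bool:
    """Return True only if the product is licensed in the customer's state.

    Unknown products fail closed: no license entry means no onboarding.
    """
    return state_code.strip().upper() in LICENSED_STATES.get(product, set())

print(can_onboard("money_transmission", "tx"))  # True
print(can_onboard("consumer_lending", "TX"))    # False
```

Failing closed for unknown products or states keeps a configuration gap from turning into unlicensed activity.
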
### Open Banking and CFPB 1033

The CFPB's Section 1033 rule, finalized in 2024, establishes consumer rights to access their financial data and requirements for financial institutions to provide data access to authorized third parties. If your product aggregates financial data or relies on consumer-permissioned data access, Section 1033 creates a more standardized regulatory framework for how that access must be provided and how the data must be handled.

---

## Payment Processing Architecture

The payment processing architecture decision is one of the most consequential in fintech development.

### Direct Integration with Stripe or Similar

For most fintech products, a direct integration with Stripe (or its equivalents — Adyen, Braintree, Square) is the right starting point. These platforms provide:

- Card tokenization — raw card data never touches your servers
- ACH and bank transfer support
- Subscription billing management
- Fraud detection and chargeback handling
- Regulatory compliance for the card processing layer

The engineering cost is low relative to alternatives. The trade-offs: per-transaction fees, dependency on a third-party platform, and limited control over the payments experience.

### Banking as a Service (BaaS)

Products that need to hold customer funds, issue cards, or operate as banking-like products (neobanks, earned wage access, embedded finance) use Banking as a Service providers such as Column, Treasury Prime, Unit, or Stripe Treasury. (Synapse, previously a well-known provider in this category, filed for bankruptcy in 2024.) BaaS platforms sit between your application and a licensed bank partner, providing the financial infrastructure under a banking license your company does not hold.

BaaS architecture adds complexity: you are now integrating with an intermediary that has its own API contracts, reliability characteristics, and regulatory requirements.
The BaaS provider's compliance requirements flow through to your application — KYC/AML requirements, transaction monitoring, and reporting obligations are all part of the integration. ### Plaid and Open Banking Data Products that aggregate financial accounts — personal finance management, underwriting, lending — use Plaid or similar aggregators to access consumer bank account data via OAuth-based connections. The architecture consideration: Plaid's data is delayed (not real-time), coverage varies by institution, and the data model requires normalization before use. Plan the data access, refresh, and normalization pipeline before designing features that depend on financial data freshness. --- ## Security Architecture for Financial Data ### Encryption and Key Management Financial data — account numbers, routing numbers, transaction history, identity documents — is high-value for attackers and carries regulatory consequences if exposed. The encryption baseline: - AES-256 for data at rest - TLS 1.3 for data in transit - Field-level encryption for the most sensitive fields (account numbers, SSNs) beyond full-database encryption Key management is where financial services engineering teams most commonly fall short. Encrypting data with keys that are stored adjacent to the encrypted data — in the same database, the same secrets manager, the same environment — provides weaker protection than it appears. AWS KMS with customer-managed keys provides separation between key access and data access. Envelope encryption adds another layer. Define key rotation policies, document who holds each key, and test the key rotation procedure before an incident requires it. ### Secrets Management Financial applications integrate with a large number of external services: payment processors, data aggregators, identity verification providers, banking partners. Each integration has API keys and credentials. 
These must live in a secrets manager (AWS Secrets Manager or Secrets Manager-compatible alternatives), never in environment files, application code, or version control. Rotation of third-party credentials needs to be automated or at minimum procedurally enforced. Credentials that cannot be rotated quickly are a liability when a team member leaves or a key is inadvertently exposed. --- ## Fraud Detection and Anomaly Detection Fraud is an operational reality in financial services, not an edge case to handle later. The architecture decisions that support fraud detection: **Event streaming for real-time analysis.** Fraud signals — unusual transaction velocities, geographic anomalies, device fingerprint changes — are time-sensitive. A batch analytics architecture that processes transactions hours later cannot support real-time fraud intervention. Event streaming (Kinesis, Kafka) enables real-time signal processing. **Behavioral baseline modeling.** Fraud detection at the account level requires a behavioral baseline — what is normal for this user? Establishing baselines for transaction amounts, frequencies, geographic patterns, and session behavior enables anomaly scoring relative to the individual baseline, not just population-level thresholds. **Risk scoring at decision points.** Integrate risk scores into the transaction processing flow at the point where intervention is still possible: before a transaction is authorized, before a withdrawal is initiated, at account creation for identity verification. Post-hoc fraud detection that occurs after funds have moved has far less operational value. **Human review queues for high-risk events.** Fully automated fraud decisions are appropriate for low-risk events (flagging for review) and high-confidence fraud signals (blocking). A human review queue for medium-risk events — where the cost of a false positive is high and automated decision confidence is lower — is a standard production pattern. 
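The split between automated decisions and human review can be sketched as a threshold function evaluated before authorization. This is an illustration under stated assumptions: the 0.0 to 1.0 score range, the two threshold values, and the choice to let low scores proceed without intervention are all hypothetical, and `route_transaction` is not taken from any particular fraud platform.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REVIEW = "review"  # send to the human review queue
    BLOCK = "block"


# Hypothetical thresholds on a 0.0-1.0 risk score. Real values would be tuned
# against observed fraud losses and false-positive rates.
REVIEW_THRESHOLD = 0.40
BLOCK_THRESHOLD = 0.90


def route_transaction(risk_score: float) -> Action:
    """Decide how to handle a transaction before it is authorized."""
    if not 0.0 <= risk_score <= 1.0:
        # Also rejects NaN, since every comparison with NaN is False.
        raise ValueError(f"risk_score out of range: {risk_score}")
    if risk_score >= BLOCK_THRESHOLD:
        return Action.BLOCK   # high-confidence fraud signal: automated block
    if risk_score >= REVIEW_THRESHOLD:
        return Action.REVIEW  # medium risk: a person makes the call
    return Action.ALLOW       # low risk: proceed automatically
```

Keeping the thresholds as named constants makes them easy to audit and adjust without touching the routing logic.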
--- ## KYC/AML Requirements and Implementation Know Your Customer (KYC) and Anti-Money Laundering (AML) requirements apply to financial institutions, money services businesses, and many fintech products. The regulatory obligation is real; the implementation is an engineering problem. ### Identity Verification KYC requires verifying that users are who they claim to be. The standard implementation uses a managed identity verification provider — Persona, Alloy, Jumio, or Stripe Identity — to collect and verify government-issued identity documents and match them against the presented identity. Using a managed provider limits the sensitivity of the documents your system directly handles and offloads the compliance burden of keeping document verification capabilities current. What your system needs to store: the verification result (pass/fail/review), the provider's reference ID, the timestamp, and the identity information required for your downstream compliance purposes. Raw identity documents should generally not be stored by your application — let the IDV provider hold them under their compliance controls. ### Transaction Monitoring AML requires monitoring transactions for suspicious activity and filing Suspicious Activity Reports (SARs) when warranted. The technical implementation: - Define transaction monitoring rules appropriate for your product and customer risk profile - Log all transaction events with sufficient detail to reconstruct the full activity picture - Implement automated flagging for rule-based triggers (cash structuring patterns, transactions with sanctioned countries, velocity thresholds) - Build a case management workflow for investigating flagged transactions OFAC screening — checking transactions and counterparties against the Office of Foreign Assets Control sanctions list — is a required step in payment processing. 
This is typically handled through your payment processor or a dedicated compliance API rather than building your own screening logic against the OFAC SDN list. --- ## Audit Logging for Financial Transactions Financial services audit logging has more prescriptive requirements than general software audit logging. Every financial transaction must be recorded with sufficient detail to reconstruct its complete history: who initiated it, what was the state at initiation, what approvals or verification steps occurred, what was the outcome, and any subsequent modifications or reversals. Audit logs in financial systems must be: - **Immutable** — Write-once, with no application-level ability to delete or modify records - **Complete** — Every state transition of a financial record must be captured, not just the final state - **Attributable** — Every action linked to an authenticated identity - **Retained** — Financial records typically require multi-year retention; confirm requirements for your specific product and jurisdiction The practical architecture: a dedicated audit log store (separate from the operational database) that the application can write to but cannot modify or delete from. A write-only IAM role for the audit log writer, a separate read-only role for audit review, and an entirely separate admin path for the rare cases where audit log correction is legally required. --- ## Multi-Currency and International Considerations If your product operates internationally or handles multiple currencies, the complexity of the data model increases significantly. **Currency representation.** Store monetary amounts as integers in the smallest currency unit (cents for USD, pence for GBP), not as floating-point numbers. Floating-point arithmetic with financial values produces precision errors that accumulate across transactions. The currency is stored as a separate field alongside the amount. 
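The precision problem is easy to reproduce. The sketch below contrasts floating-point addition with integer minor units; `Money` is a hypothetical type introduced only for illustration, and it assumes the currency is carried as an ISO 4217 code.

```python
from dataclasses import dataclass

# Floating point: ten cents plus twenty cents is not exactly thirty cents.
float_sum_is_exact = (0.10 + 0.20 == 0.30)  # False


@dataclass(frozen=True)
class Money:
    """An amount in the smallest currency unit, with the currency alongside."""
    amount_minor: int  # e.g. cents for USD, pence for GBP
    currency: str      # assumed to be an ISO 4217 code such as "USD"

    def __add__(self, other: "Money") -> "Money":
        if self.currency != other.currency:
            # Mixing currencies requires an explicit, recorded conversion.
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount_minor + other.amount_minor, self.currency)


# Integer minor units: the same sum is exact.
int_sum_is_exact = (Money(10, "USD") + Money(20, "USD") == Money(30, "USD"))  # True
```

Refusing to add across currencies forces any conversion to go through a separate step where the applied rate can be recorded.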
**Exchange rate handling.** If your product converts between currencies, the exchange rate used at the time of a transaction must be stored with the transaction record. Exchange rates change; knowing the rate that was applied is required for reconciliation and dispute resolution. **Regulatory variation.** Different countries have different transaction reporting requirements, different KYC standards, and different data residency requirements. International expansion is not just a product question — it is a compliance and engineering question that requires country-specific architectural considerations. --- ## Open Banking and API-First Architecture Open banking — the practice of financial institutions exposing customer financial data through standardized APIs — is both a regulatory trend and a product opportunity. For fintech products that depend on financial data access, the API-first design principle matters: design your product assuming that data comes from APIs, not from proprietary scraping or batch file transfers. This positions the product for the regulatory direction of travel and provides a cleaner, more maintainable data access architecture. For fintech products that are financial institutions (or adjacent to them), exposing a well-designed API is increasingly a regulatory requirement and a commercial differentiator. Design the API surface with the same engineering rigor as your core product — versioning, consistent authentication, rate limiting, comprehensive documentation. --- ## Build vs. Partner Decisions Fintech companies face a recurring decision: build or integrate a partner for capabilities that are adjacent to the core product. **Build:** Core IP, proprietary workflows, and capabilities that differentiate the product in the market. The risk scoring model, the underwriting logic, the investment portfolio optimization — whatever makes your product distinct. 
**Partner:** Regulated, commoditized infrastructure — card processing, ACH rails, banking ledgers, KYC identity verification, OFAC screening. Building these from scratch means acquiring the licenses, compliance expertise, and infrastructure management burden that managed providers have already absorbed. The pattern that works in fintech: narrow the custom-built surface area to the product's actual differentiation, and use managed partners for regulated infrastructure. The failure pattern is building regulated infrastructure from scratch — card processing, identity verification, AML transaction monitoring — without the compliance expertise to operate it correctly. --- ## Engineering for Financial Services The distinguishing characteristic of well-engineered fintech systems is not the sophistication of the technology stack — it is the rigor applied to correctness, auditability, and failure handling. Money movement errors have real consequences: customer harm, regulatory scrutiny, reputational damage. Systems that process financial transactions need to be designed for idempotency (duplicate transaction prevention), reconciliation (state recovery after failures), and complete audit trails — not as features added after launch, but as first-class engineering concerns from the start. If you are building a fintech product and want a structured review of your architecture — payment processing, fraud controls, data security, or compliance infrastructure — [an architecture review](/contact) is where that conversation starts. Our [custom software development practice](/services/custom-software) and [compliance engineering work](/services/compliance-engineering) covers regulated financial system design. --- ## HIPAA Compliant App Development: A Technical Guide for Engineering Teams Source: https://tampadynamics.com/blog/hipaa-compliant-app-development > A practical, architecture-level guide to HIPAA compliant app development. 
Covers technical safeguards, PHI data flows, audit logging, encryption, BAA obligations, and common mistakes that cause compliance failures.

Date: 2026-02-16

Building a HIPAA compliant application is not primarily a legal exercise. It is an engineering discipline. The regulation defines outcomes — confidentiality, integrity, availability of protected health information — but leaves implementation to you. That flexibility is also where most teams get into trouble. Without a clear architectural framework, "HIPAA compliance" becomes a checklist of surface-level controls that looks good in a vendor assessment and fails badly in an audit or breach investigation.

This guide is written for CTOs, engineering leads, and product owners who are building or rebuilding healthcare software and need to understand what HIPAA technical safeguards actually require — not at the legal summary level, but at the level of system design.

---

## What HIPAA Technical Safeguards Actually Require

The HIPAA Security Rule organizes requirements into three categories: administrative safeguards, physical safeguards, and technical safeguards. Engineering teams own the technical safeguards, and that category is more specific than most developers realize.

The Security Rule (45 CFR §164.312) defines five technical safeguard standards:

1. **Access control** — Unique user identification, emergency access procedures, automatic logoff, and encryption/decryption
2. **Audit controls** — Hardware, software, and procedural mechanisms to record and examine activity in systems that contain PHI
3. **Integrity** — Mechanisms to authenticate that PHI has not been improperly altered or destroyed
4. **Transmission security** — Guard against unauthorized access to PHI transmitted over electronic networks
5. **Person or entity authentication** — Verify that a person or entity seeking access to PHI is who they claim to be

Each of these has required specifications (mandatory) and addressable specifications (implement if reasonable and appropriate, or document why an equivalent alternative was used). The common mistake is treating addressable as optional. It is not. You must either implement it or document a compliant alternative — and that documentation will be examined if you are audited.

### What "Encryption" Actually Means Under HIPAA

HIPAA does not mandate a specific encryption algorithm, but it does reference NIST guidance. In practice, this means:

- AES-256 for data at rest
- TLS 1.2 or higher for data in transit (TLS 1.3 strongly preferred)
- Key management must be documented — who holds the keys, how rotation works, what happens during personnel changes

Encrypting your database and using HTTPS is necessary but not sufficient. If your encryption keys are stored in the same environment as the encrypted data, the protection is weaker than it appears. Key management is where many implementations fall short.

---

## Architecture Patterns for HIPAA Compliant Software

### Separate PHI Storage from Operational Data

The most durable pattern in HIPAA software architecture is to treat PHI as a distinct data tier with its own access controls, encryption, and audit logging — separate from your general application database. This means:

- PHI lives in a dedicated data store with row-level or field-level encryption
- Application logic fetches PHI only when explicitly required for a specific operation
- PHI identifiers (patient IDs, record IDs) are separate from PHI content

A common implementation uses a dedicated encrypted database — RDS with encryption at rest, for example — while operational data (scheduling, billing metadata, workflow state) lives in a separate store. The application joins these only at the point of rendering, and the join itself is logged.
This pattern reduces the blast radius of a breach. If your operational database is compromised, it contains identifiers but not PHI content. It also makes access control simpler: you can apply strict IAM policies to the PHI store without restricting access to general operational data.

### Access Control: Role-Based Is Not Enough

Role-based access control (RBAC) is the baseline, but healthcare applications typically require attribute-based access control (ABAC) or a hybrid. The difference matters:

- **RBAC**: A user with the "clinician" role can access patient records
- **ABAC**: A user with the "clinician" role can access patient records for patients currently assigned to their care team, within the facilities where they hold active credentials

Pure RBAC grants overly broad access. A clinician at a large hospital system should not be able to query PHI for patients in facilities they have no relationship to. HIPAA's minimum necessary standard requires that access be scoped to what is actually needed for the current purpose.

Designing for ABAC upfront is significantly easier than retrofitting it. The key components:

```
AccessDecision = f(
  subject.role,
  subject.department,
  subject.facility_assignments[],
  resource.patient_id,
  resource.facility_id,
  resource.sensitivity_flags[],
  action.purpose_of_use
)
```

This decision function lives in your authorization layer — not scattered across individual API handlers. Every PHI access request passes through it, and the decision (including the parameters that drove it) is logged.

### Audit Logging: What to Log and How

Audit logging is one of the most commonly under-implemented HIPAA controls. The regulation requires that you record and examine activity in systems containing PHI.
In practice, this means your audit log needs to capture:

- **Who** — Authenticated user identity (not just user ID, but enough to uniquely identify a person)
- **What** — The specific PHI record accessed, modified, or deleted
- **When** — Timestamp with sufficient precision (millisecond-level for most systems)
- **How** — The operation type (read, write, export, print, share)
- **From where** — Source IP, device identifier, and application context
- **Why** — Purpose of use where the system can determine it

A minimal audit log record looks like this:

```json
{
  "event_id": "evt_01HX...",
  "timestamp": "2026-02-16T14:23:11.847Z",
  "actor": {
    "user_id": "usr_abc123",
    "email": "provider@clinic.org",
    "role": "physician",
    "session_id": "sess_xyz789"
  },
  "resource": {
    "type": "patient_record",
    "record_id": "pt_def456",
    "facility_id": "fac_ghi789"
  },
  "action": "read",
  "purpose_of_use": "treatment",
  "source_ip": "10.0.1.45",
  "user_agent": "Mozilla/5.0 ...",
  "result": "success"
}
```

Critical implementation requirements:

- **Audit logs are append-only.** The application user cannot delete or modify audit records. Use a separate write-only connection or a dedicated audit service.
- **Audit logs are separate from application logs.** Mixing them makes audits difficult and creates risk that log rotation or deletion touches audit records.
- **Audit logs are retained for six years.** This is a specific HIPAA documentation requirement. Architect your storage with this retention window in mind.
- **Audit logs are themselves PHI-adjacent.** They may contain information that reveals PHI existence. Protect them accordingly.

---

## PHI Data Flows: Designing Systems That Minimize Exposure

Every system that handles PHI should have a documented data flow diagram. Not as a compliance artifact — as an engineering tool. Knowing exactly where PHI enters, where it is stored, how it moves, and where it exits is the foundation of a defensible architecture.
### The Principle of PHI Minimization Before designing a feature, ask: does this component actually need PHI, or does it only need a pseudonymous identifier? A scheduling system does not need a patient's full medical history to book an appointment. It needs a patient identifier, a provider, and a time slot. The clinical record system can be queried separately, under stricter controls, only when a clinician is actively rendering care. PHI minimization in practice: - **Tokenization** — Replace PHI fields with non-sensitive tokens in operational systems. The token mapping lives in a separate, access-controlled store. - **De-identification** — For analytics, reporting, and ML training data, de-identify records to the Safe Harbor or Expert Determination standard before they leave the PHI boundary. - **Data masking** — In non-production environments (development, staging, QA), PHI should be masked or replaced with synthetic data. Developers should never need real PHI to do their work. ### Where PHI Leaves Your System Outbound PHI flows are where many organizations have the least visibility. Common unintended PHI exits: - **Third-party analytics and error tracking** — If your error monitoring SDK captures request bodies or user context, it may be capturing PHI. This requires either a BAA with your monitoring vendor or ensuring PHI is scrubbed before it reaches those systems. - **Log aggregation** — Application logs that include request parameters or response payloads may contain PHI. Structured logging with explicit field exclusions is safer than unstructured log strings. - **Client-side data** — React Query caches, localStorage, browser session storage — all of these can hold PHI. Design your frontend state management to hold PHI only as long as needed and to clear it on logout or session expiry. - **PDF generation and file exports** — Export pipelines are often an afterthought. 
Every generated document containing PHI needs to be accounted for, stored securely, and its access logged. --- ## BAA Obligations and What They Mean for Your Stack A Business Associate Agreement is a contractual requirement, not a technical control — but your vendor choices determine which BAAs you can obtain, and a BAA cannot be obtained from every vendor. ### Who Needs a BAA Any vendor, service provider, or contractor that creates, receives, maintains, or transmits PHI on your behalf is a Business Associate and requires a BAA. In a modern SaaS application, this list is longer than most teams initially expect: - Cloud infrastructure provider (AWS, Azure, GCP — all offer BAAs) - Database hosting (e.g., RDS, managed PostgreSQL services) - Authentication provider (Auth0, Okta, Cognito — check BAA availability per tier) - Error monitoring and observability (Datadog, Sentry — BAAs are available but often require enterprise tiers) - Email delivery (if PHI is included in transactional email) - AI and LLM providers (this is where most health tech teams have the largest gap) ### The AI/LLM BAA Problem If you are integrating AI into a healthcare application, you need to understand which AI providers offer BAAs and under what conditions. AWS Bedrock offers a BAA under its standard AWS HIPAA compliance program. Azure OpenAI Service offers a BAA through the Microsoft Products and Services Agreement. OpenAI's consumer API does not offer a BAA and should not be used with PHI. Anthropic's API does not currently offer a BAA for the standard tier. This is not a comprehensive or permanent list — BAA availability changes as providers update their commercial terms. But the principle is stable: **if PHI will be sent to or processed by a service, that service needs a BAA before you write the first line of integration code.** The architectural implication is that your AI integration layer needs to distinguish between what is and is not PHI. 
If your RAG system retrieves clinical documents to answer a query, those documents may be PHI. If you are sending them to an LLM, that LLM provider needs a BAA. If no BAA is available, you need an architecture that de-identifies or synthesizes the context before it leaves your HIPAA boundary. --- ## Common Mistakes That Cause HIPAA Failures in Software These are the patterns we see most often when reviewing healthcare application architectures: **Broad database access credentials.** The application service account has read/write access to the entire database, including all PHI tables. When that credential is compromised — through a misconfigured environment, a leaked secret, or an SSRF vulnerability — the entire PHI store is exposed. Instead: least-privilege database credentials, scoped to the minimum required tables and operations for each service. **PHI in URLs and query parameters.** Patient IDs, record identifiers, or any PHI appearing in URL paths or query strings will end up in web server access logs, browser history, and HTTP referer headers. Use POST bodies for PHI, or use opaque identifiers that cannot be reverse-mapped without authenticated database access. **Shared sessions without proper isolation.** Multi-tenant systems where session state or cache entries are not fully isolated by tenant. This is a standard software engineering problem, but the consequence in healthcare is PHI cross-contamination between organizations. **Logging PHI in application logs.** Structured logging is good. Logging request bodies, response payloads, or user objects that contain PHI is not. Every logging call that touches user-supplied data needs to go through a sanitization function that strips PHI fields before writing. **Missing automatic session termination.** HIPAA requires automatic logoff. Workstations left logged in with an active clinical session are a physical and technical risk. 
The implementation is straightforward — an inactivity timer that terminates the session after a configurable interval — but it is frequently omitted from initial builds. **Treating development environments as outside scope.** Development and staging environments that use real PHI are HIPAA in scope. Using production database snapshots for local development, without de-identifying the data first, exposes PHI on developer workstations that are rarely subject to the same controls as production infrastructure. **Insufficient encryption key management.** Encrypting the database but storing the encryption key in the same AWS account, in an environment variable, or in the same secrets manager instance as the application credentials — this is encryption theater. Key management needs to be a distinct architectural concern, with access to keys separated from access to the encrypted data. --- ## What a Compliant Architecture Review Looks Like When we work with engineering teams on HIPAA compliant app development, the starting point is a structured review of the existing or proposed system design — not a compliance checklist, but an architecture conversation. 
A useful review covers: - **PHI inventory** — What data qualifies as PHI, where it is created in the system, where it is stored, and every path by which it moves or exits - **Access control model** — How users are authenticated, how authorization decisions are made, and whether the model supports minimum necessary access in practice - **Audit log completeness** — What is currently logged, whether it is sufficient to reconstruct the history of any PHI record, and whether the log store is adequately protected - **Encryption posture** — Encryption at rest and in transit, key management, and whether encryption is applied at the right layer for your threat model - **Third-party data flows** — Every vendor that touches PHI, whether BAAs are in place, and whether the data minimization principle is applied before PHI reaches external services - **AI integration risk** — If AI is part of the system, how PHI is handled within the AI pipeline, which providers are in scope, and what guardrails are in place The output is a clear picture of where the architecture is solid and where there are gaps — with specific, prioritized recommendations for addressing them. It is not a certification and it is not a legal opinion. It is an engineering assessment. If your team is building a healthcare application and you want a structured review of your architecture before you go further, that is the conversation we are set up to have. --- ## Frequently Asked Questions ### Does HIPAA require a specific encryption standard? HIPAA does not mandate a specific algorithm, but it references NIST guidance, and in practice AES-256 for data at rest and TLS 1.2 or higher for data in transit are the accepted standards. The more important question is often not which algorithm you use but how you manage the keys — who controls them, how they are rotated, and what happens when they are compromised. ### What is the difference between required and addressable HIPAA specifications? 
Required specifications are mandatory — you must implement them. Addressable specifications must be implemented if reasonable and appropriate given your organization's risk assessment, or you must document why an equivalent alternative was used instead. Addressable does not mean optional. During an audit or breach investigation, you will be expected to demonstrate that you considered each addressable specification and made a documented, defensible decision. ### Can we use a HIPAA compliant cloud provider and consider ourselves covered? No. A cloud provider offering a BAA and HIPAA-eligible services means the shared responsibility model applies — the provider is responsible for the physical infrastructure and some platform controls, but you are responsible for everything you build on top of it. Your application access controls, audit logging, encryption key management, and secure coding practices are entirely your responsibility regardless of what your cloud provider does. ### Do we need a BAA with our AI or LLM provider? Yes, if PHI will be sent to or processed by that provider. This includes PHI used as context in prompts, PHI retrieved from your knowledge base and passed to a model, or PHI included in documents that are analyzed by the model. Review each AI provider's BAA availability before integrating them into a system that handles PHI. If a BAA is not available, you need an architecture that prevents PHI from reaching that provider. ### How long do audit logs need to be retained? HIPAA requires that documentation of policies, procedures, and actions be retained for six years from the date of creation or the date when it was last in effect — whichever is later. For audit logs specifically, six years is the required retention window. Architect your log storage with this in mind: cold storage for older logs is acceptable, but retrieval needs to be practical if you are ever audited or investigating an incident. 
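The retention rule above reduces to a small date calculation that a log lifecycle job could use. This is a minimal sketch: `retention_end` is a hypothetical helper, and rolling a February 29 start date forward to March 1 is an assumption made here to keep the result on or after the six-year mark.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 6


def retention_end(created: date, last_in_effect: Optional[date] = None) -> date:
    """Earliest purge date: six years after the later of the two dates."""
    start = max(created, last_in_effect) if last_in_effect else created
    try:
        return start.replace(year=start.year + RETENTION_YEARS)
    except ValueError:
        # Feb 29 start with a non-leap target year: roll forward to Mar 1
        # so the record is never purged early.
        return date(start.year + RETENTION_YEARS, 3, 1)
```

A lifecycle policy that moves logs to cold storage can run well before this date; only deletion needs to wait for it.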
--- ## Build Healthcare Software That Is Defensible, Not Just Documented The difference between healthcare applications that survive audits and those that do not is not usually the legal documentation — it is the engineering. Systems that log the right things, scope access correctly, keep PHI out of places it should not be, and handle third-party integrations with appropriate controls are genuinely more defensible than systems that rely on compliance documents to paper over architectural gaps. If you are building a HIPAA compliant application and want an architecture review from engineers who work in this space, [start with a conversation](/contact). If you are specifically evaluating AI capabilities within a HIPAA-compliant framework, our [healthcare AI consulting](/healthcare-ai-consulting) practice works through exactly these design decisions. The engagement is structured and time-bounded — a focused review to understand your systems, constraints, and goals, with clear deliverables. No pitch deck, no vague roadmap. If we are a fit, you will know exactly what the next steps look like. --- ## How to Choose a Software Development Partner: A Practical Evaluation Guide Source: https://tampadynamics.com/blog/how-to-choose-software-development-partner > A practical guide to evaluating and selecting a software development partner — covering technical due diligence, contract structure, engagement models, and red flags to watch for. Date: 2025-12-09 Most companies that select a software development partner badly do so for one of two reasons: they evaluate vendors on the wrong criteria, or they do not know what they actually need until they are already deep into an engagement that is not working. This guide is written for technical leaders, product owners, and founders who are evaluating software development partners — for a new build, a platform migration, or ongoing engineering capacity. 
It covers how to structure the evaluation, what to look for and what to avoid, and how to negotiate contracts that protect your interests without creating adversarial dynamics. --- ## Define What You Actually Need Before You Evaluate Vendors The most consequential decision in vendor selection happens before you talk to any vendor: defining the engagement type you actually need. Most buyers conflate three distinct models, and confusing them results in mismatched partnerships from the start. ### Staff Augmentation Staff augmentation means embedding individual engineers into your existing team. The vendor provides people; your team provides direction, architecture, and management. The augmented engineers work within your processes, your tools, and under your technical leadership. **When it works:** You have a well-functioning engineering team with strong technical leadership and need to increase capacity. The backlog is defined. The architecture is established. You need execution, not design. **When it fails:** You bring in augmented engineers because you lack technical leadership and expect them to provide strategic direction. Individual contributors cannot substitute for technical leadership, regardless of their seniority level. ### Project-Based Engagement A project engagement scopes a defined deliverable — a feature set, a platform migration, an application build — with a beginning and an end. The vendor takes ownership of delivery within an agreed scope. You define the outcomes; the vendor determines how to get there. **When it works:** You can define the scope clearly enough to contract around it. The deliverable has clear acceptance criteria. You have the bandwidth to review and approve work during the engagement. **When it fails:** The scope is underspecified, requirements change substantially mid-engagement, or the client expects project-priced delivery with an open-ended scope. This is the source of most fixed-price software project disputes. 

### Product Partnership

A product partnership is a longer-term relationship where the development partner functions as an extension of your product and engineering organization, contributing to architecture, roadmap, and strategic decisions, not just execution.

**When it works:** You are building a technology-intensive product and lack the internal capacity to own the technical strategy and execution. You want a partner who understands the business context, not just the ticket.

**When it fails:** You are not ready to give the partner enough context and authority to actually make good decisions. Treating a product partnership like a staff augmentation arrangement (managing to individual tickets without sharing strategic context) produces good task completion and poor architecture.

Being honest with yourself about which of these you need, before you issue an RFP or schedule discovery calls, will filter your vendor options significantly and save considerable time.

---

## Technical Evaluation Criteria

Most vendor evaluations focus on portfolio work, pricing, and team bios. These are necessary but insufficient. The technical evaluation criteria that actually surface differences:

### Architecture Interviews

Ask the technical lead who will own your engagement to walk through how they would approach a specific technical problem from your domain. Not a whiteboard exercise with a contrived problem, but a real design question from your actual system.

What you are evaluating: do they ask the right clarifying questions before proposing an approach? Do they identify the tradeoffs in different approaches rather than presenting one solution as obviously correct? Do they demonstrate familiarity with the specific constraints of your domain (regulatory, operational, scale)?

A vendor who jumps immediately to a specific technology answer without understanding your constraints is either not thinking carefully or is selling you the solution they already know how to build.
Neither is what you want.

### Code Sample Review

Ask for a code sample from a previous engagement in the relevant stack. Review it with someone who can evaluate it technically: look at structure, naming conventions, error handling, test coverage, and documentation. Code that is hard to read, poorly tested, and undocumented will look exactly the same in your codebase six months after the engagement ends.

If the vendor refuses to show code samples citing NDA constraints, that is understandable; ask whether they have open source contributions or can share a sanitized example. If they cannot produce anything reviewable, you cannot evaluate their technical craft.

### Documentation Quality

Ask to see a technical specification or architecture document from a previous engagement. The quality of their documentation tells you a great deal about how they think and communicate, and it predicts what you will receive at handoff.

Documentation that is vague, diagram-heavy with little explanatory text, or organized around the solution rather than the problem is a preview of what you will get at the end of your engagement. Clear, substantive technical writing that explains decisions and their rationale is a strong signal.

---

## Reference Checks That Surface Real Information

Standard reference checks are almost entirely useless. Vendors share only references who will give positive reviews. Asking a reference "How was it working with them?" produces an answer that was scripted before the call.

References are useful when you ask specific questions that require specifics in response:

**"Describe the most difficult moment in the engagement and how the vendor handled it."** Every real engagement has a difficult moment: a technical dead end, a scope dispute, a missed milestone. A reference who cannot describe one either is not remembering the engagement honestly or had a relationship too surface-level to be useful.

**"What would you do differently if you were starting the engagement again?"** This question surfaces the structural things the reference wishes had been true at the start: clearer scope, different contract structure, different team composition. The answer tells you what you should negotiate for before signing.

**"Did the code they delivered live up to what you were shown in the sales process? What were the gaps?"** This directly addresses the demo-to-delivery gap that is the most common disappointment in software development engagements.

**"Would you use them again, and for what type of work specifically?"** Some vendors are excellent for certain types of work and not others. The specificity of "yes, for X but not for Y" is far more useful than a generic recommendation.

Try to find references beyond the list the vendor provides. LinkedIn, industry networks, and common customers are all valid paths to additional references. Corroborating references are easy to find for vendors with many satisfied customers; a vendor who surfaces only references they control is a yellow flag.

---

## Contract Structures

The contract structure determines the alignment of incentives between you and the vendor. There is no universally correct structure; the right one depends on how well you can define scope and how much risk each party can absorb.

### Time and Materials

You pay for time actually worked at an agreed rate. Scope can change; cost tracks actual effort. The risk of overruns sits with you; the vendor has less incentive to be efficient with time.

Time and materials is appropriate when scope is genuinely uncertain: early-stage exploration, research-heavy work, iterative product development where the requirements will evolve. It requires that you trust the vendor's time reporting and have enough visibility into the work to evaluate whether the effort is appropriate.

Protect yourself in T&M contracts with: weekly or bi-weekly timesheet review, a defined escalation process for cost overruns, and milestone-based check-in points where both parties can reassess whether the engagement is on the right track.

### Fixed Price

You pay a contracted amount for a defined deliverable. The vendor bears the risk of underestimating; you bear the risk of scope creep and the friction of formal change orders for anything outside the original scope.

Fixed price is only appropriate when scope can be defined precisely enough to actually hold the contract to it. Attempting fixed price on a project with poorly understood requirements produces disputes, quality shortcuts, and adversarial dynamics as the vendor tries to stay profitable within scope and you try to get everything you expected.

The contract language that matters in a fixed-price engagement: the definition of "done," the change order process (how scope changes are estimated and approved), and the acceptance testing criteria. Vague definitions of completion in a fixed-price contract will be exploited by either party when the engagement is under pressure.

### Milestone-Based

A hybrid: fixed-price milestones with clearly defined deliverables at each stage. The advantage is that you pay for tangible progress rather than time, while limiting fixed-price exposure to a bounded scope at each milestone rather than the full project.

Milestone-based contracts require that each milestone be defined precisely enough to determine whether it has been achieved. "Working authentication system" is not a milestone definition. "Users can register with email/password, receive a verification email, complete verification, and log in to the application, verified against the acceptance tests defined in Attachment A" is.

### IP Ownership and Source Code

Regardless of the contract structure, clarify IP ownership explicitly.
The work should be unambiguously yours, including all source code, documentation, databases, and configuration.

Ensure the contract specifies:

- Work product IP transfers to you upon payment (work-for-hire)
- The vendor may not reuse your code in other engagements
- Open source components are identified, and their licenses are compatible with your use
- Source code is delivered in a repository you control, not the vendor's

Some vendors retain ownership of "pre-existing IP" they bring into the engagement: generic frameworks, internal libraries, common utilities. This is reasonable to allow; make sure you have a perpetual, irrevocable license to use any pre-existing IP incorporated into your work product.

**Source code escrow** is worth considering for critical systems where you are dependent on a vendor relationship. An escrow agreement deposits source code, documentation, and build instructions with a third-party escrow agent; the code is released to you under defined conditions (vendor insolvency, end of engagement, failure to maintain). The overhead is modest; the protection is real for systems where continuity is essential.

---

## Red Flags

These are patterns that reliably correlate with poor engagement outcomes:

**Guaranteed timelines presented before discovery.** Any vendor that quotes a completion date before understanding your requirements in depth is either working from a template that does not match your situation or telling you what you want to hear. Credible estimates require scope understanding. A rough order of magnitude before discovery is fine; a confident commitment is not.

**No post-launch support plan.** Software development does not end at launch. Production systems require ongoing maintenance, bug fixes, dependency updates, security patches, and monitoring. A vendor who does not address post-launch support in their proposal is either expecting you to handle it (fine if planned for) or has not thought about it (not fine).

**Vague deliverables.** Proposals that describe deliverables in terms of effort ("200 hours of development") rather than outcomes ("a working authentication system with documented test coverage") create misaligned expectations. Effort describes input; deliverables describe output. Hold vendors to outcome-defined deliverables.

**The bait-and-switch team.** A vendor presents senior engineers in the sales process and then staffs the engagement with junior engineers under minimal senior oversight. This is common enough that it deserves explicit protection in the contract: name the key personnel on the engagement and require your approval for any substitution.

**No questions about your users.** A vendor who does not ask about the actual users of the system, their workflows, and how they will interact with what is being built will often produce a system that is technically functional and practically unusable. Product thinking and user-centricity are not bonus features; they are part of building software that succeeds.

**Offshore delivery presented as equivalent to onshore.** It may be equivalent for your situation, or it may not be. Offshore and nearshore models introduce coordination overhead, time zone friction, and sometimes quality variability that is not captured in the hourly rate comparison. We discuss this further below.

---

## Offshore vs. Nearshore vs. Domestic: An Honest Comparison

The cost differential between offshore, nearshore, and domestic development is real. So are the tradeoffs.

### Offshore (India, Eastern Europe, Southeast Asia)

**Advantages:** Lowest hourly rates, large talent pools, established firms with mature delivery processes for certain types of work.

**Real tradeoffs:** A working-hours overlap of only 0-4 hours with US Eastern time creates genuine coordination overhead. Asynchronous collaboration requires discipline from both sides.
Code quality variance is wide: the best offshore shops produce excellent work, but the market is crowded with firms that compete on price and deliver accordingly. Senior technical leadership is often thinner than represented, and the team you work with may change more frequently than in a domestic arrangement.

**Best fit:** Well-defined, execution-heavy work with stable requirements and strong technical oversight from your side. Not a fit for early-stage product exploration, domain-complex regulatory systems, or engagements where you need your development team to make architectural decisions independently.

### Nearshore (Latin America, Eastern Time Zone overlap)

**Advantages:** Significant time zone overlap with the US (usually 1-3 hours off the East Coast), lower cost than domestic, and a growing pool of strong engineering talent in Colombia, Brazil, Mexico, and Argentina.

**Real tradeoffs:** The nearshore market is less mature than offshore; quality variance is also significant. The best nearshore shops are excellent; the market is not uniformly developed. English proficiency varies.

**Best fit:** Teams that value real-time collaboration during the US business day but have cost constraints that make domestic rates difficult. The time zone alignment advantage over offshore is genuine and undervalued.

### Domestic (US-based)

**Advantages:** Full time zone alignment, easier reference checking and relationship validation, stronger accountability mechanisms, often stronger product and communication skills, easier to integrate with your internal team.

**Real tradeoffs:** Highest hourly rates. The US market for experienced engineers is expensive.

**Best fit:** Complex, domain-intensive work in regulated industries. Systems where domain knowledge (healthcare, financial services, legal) is as important as raw technical skill. Engagements where real-time collaboration and rapid iteration are core to the process.
Cases where the cost of poor architecture is higher than the cost of premium engineering rates.

The decision is rarely as simple as a direct hourly rate comparison. Model the total cost including coordination overhead, quality risk, and the cost of rework before concluding that the lower hourly rate produces lower total cost.

---

## Questions to Ask in the Evaluation Process

These are useful as a structured starting point for vendor discussions:

- Who will be working on this engagement day-to-day? Can I meet the actual team, not the sales team?
- How do you handle scope changes? Walk me through a specific example from a past engagement.
- What does your testing practice look like? What test coverage do you target, and how do you handle regression?
- What does handoff look like? What documentation will I have at the end of the engagement?
- What is your on-call and incident response process for production issues during the engagement?
- Have you worked in our regulatory environment before? What specific requirements did that create?
- What is the one thing that is most likely to cause this engagement to be harder than expected, and how would we get ahead of it?

The last question is the most revealing. A vendor who cannot answer it has not thought carefully about your engagement. A vendor who gives a thoughtful answer about specific risks and their mitigation demonstrates the kind of clear-eyed thinking you want building your system.

---

## Frequently Asked Questions

### How long should a vendor evaluation take?

For a significant engagement (a new product build, a major platform migration), four to eight weeks from first contact to signed contract is reasonable if you move with intention. Shorter timelines often produce hasty decisions; longer ones often reflect unclear internal decision-making more than vendor complexity.

### Should we issue an RFP?

RFPs are useful for commoditized work where you can specify requirements precisely and evaluate vendors on a comparable basis. For complex software development, RFPs often produce proposals that optimize for winning the evaluation rather than solving your problem. A structured conversation and a paid discovery engagement produce more useful information than an RFP response.

### What is a paid discovery engagement?

A bounded, paid engagement (typically 2-4 weeks) where the vendor develops a detailed technical proposal, architecture recommendation, or product specification based on deep collaboration with your team. It costs money but produces something real: a concrete plan that you can evaluate, compare, and potentially take to a different vendor. It also reveals whether the working relationship functions before you commit to a larger engagement.

### How do we evaluate a vendor for a domain we don't know well?

Bring in a technical advisor or consultant to assist with the evaluation: someone who can review code samples, conduct technical interviews, and assess architecture proposals with domain expertise. This is a much smaller investment than the cost of a poorly selected primary vendor.

---

Selecting a software development partner is a significant decision, and the evaluation process deserves proportionate effort. The frameworks here are a starting point, not a complete system; every engagement has specifics that matter.

If you are evaluating partners for a system in a regulated industry and want a direct conversation about whether we are a fit for your situation, [start here](/contact). We will tell you honestly if we are not.

---

## Next.js 16 Best Practices for Production Apps

Source: https://tampadynamics.com/blog/nextjs-16-best-practices

> Modern patterns and practices for building fast, maintainable Next.js 16 applications with React 19, Server Components, and the App Router.


Date: 2025-11-28

Next.js 16 represents a maturation of the App Router paradigm introduced in Next.js 13. Combined with React 19's stable Server Components, it's now the default choice for production React applications. Here's what we've learned building regulated, production-grade apps with this stack.

## Server Components by Default

The mental model shift is complete: components are Server Components unless you explicitly opt into client-side rendering.

### When to Use Server Components

- **Data fetching**: Fetch directly in your components, with no useEffect or client-side loading states
- **Static content**: Marketing pages, documentation, blog posts
- **Sensitive operations**: API keys and database queries stay on the server

```tsx
// This runs on the server - no "use client" needed
async function RecentPosts() {
  const posts = await db.posts.findMany({ take: 5 })
  return (