
AI Dispatch Transcription — Hidden PHI in the Output

Steven Carlson·June 11, 2026

A vendor pitches automatic transcription for your dispatch audio. The demo looks clean, the ASR catches every word, and the LLM summarizes the call in seconds. Your chief wants it signed by next month.

Here is what the demo does not show you. The pipeline converts audio to text, sends that text to a cloud model for summarization, and stores the output somewhere you probably did not audit. The audio itself may live in a secure silo, but the text output often does not. And that text contains every piece of PHI the caller spoke aloud.

I have looked at three of these contracts in the last year. The gap between what the sales deck promises and what the technical architecture actually delivers is wide enough to drive a rig through.

PHI Exposure in AI-Generated Dispatch Summaries

The pipeline looks like this. A 911 call comes in. The audio goes through automatic speech recognition that converts speech to text. That text gets passed to an LLM for summarization and tagging. The summary lands in a CAD note or a case management system.

The real problem sits at the LLM stage of the pipeline. The audio recording might be encrypted at rest in a compliant storage system. But the text output is generated by a model running in a cloud environment the agency does not control. If that model caches the prompt or logs the interaction, the PHI in that transcript is no longer under the agency's authority. The same is true if the vendor uses the data for model improvement.
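One way to make that exposure concrete is to write the pipeline down as data and ask which stages see transcript text, run outside agency control, and retain what they process. The sketch below is illustrative only: the stage names and the true/false values describe a hypothetical deployment, not any specific vendor's architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    handles_text: bool       # does this stage see transcript text?
    agency_controlled: bool  # does it run on infrastructure the agency controls?
    retains_data: bool       # does it cache, log, or store what it processes?


# Hypothetical deployment, for illustration only.
PIPELINE = [
    Stage("call_recording", handles_text=False, agency_controlled=True, retains_data=True),
    Stage("asr", handles_text=True, agency_controlled=True, retains_data=False),
    Stage("llm_summarization", handles_text=True, agency_controlled=False, retains_data=True),
    Stage("cad_note", handles_text=True, agency_controlled=True, retains_data=True),
]


def exposure_points(pipeline):
    """Return the stages where PHI text is retained outside agency control."""
    return [
        s.name
        for s in pipeline
        if s.handles_text and not s.agency_controlled and s.retains_data
    ]


print(exposure_points(PIPELINE))
```

Filling in this table for a real vendor means getting answers about caching, logging, and hosting for every stage, which is most of the audit.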

Names, addresses, medical history, medication lists, and psychiatric conditions all come through in high fidelity because modern ASR is good at its job. A caller describing a chest pain episode gives you their age, their medications, and their home address in the first thirty seconds. Their surgical history comes next. That is PHI, every word of it.

How to Prevent Dispatch Audio from Training AI Models

The most common mistake I see is assuming a signed BAA covers this. A Business Associate Agreement is a legal contract. It does not automatically disable the technical process of model training. Many vendors train their models on production traffic by default. The BAA may not explicitly prohibit it.

You need three things in the contract.

A zero-retention clause that requires the vendor to delete both the input audio and the output text after processing. No caching, no logging, and no storing transcripts for quality assurance unless the agency explicitly opts in.

An explicit opt-out of model training. The contract must state that no agency data, audio or text, will be used for training or fine-tuning the vendor's models. Do not accept language that says the vendor "may" use anonymized data. Anonymization in this context is a promise you cannot verify.

A data residency requirement that specifies where the LLM processing happens. If your agency operates under state-level data sovereignty rules, the model needs to run in a region that matches your jurisdiction. Some vendors offer on-premise deployment. That is the strongest option if you can get it.
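The three clauses above can be tracked as a simple review checklist. This is a minimal sketch, assuming you record your own reading of the contract as true/false values. The key names are hypothetical and do not come from any standard or vendor template.

```python
# Hypothetical keys; map each one to the actual clause in the contract under review.
REQUIRED_TERMS = {
    "zero_retention": "Vendor deletes input audio and output text after processing",
    "no_training": "No agency data is used for training or fine-tuning",
    "data_residency": "LLM processing location is specified and matches jurisdiction",
}


def contract_gaps(terms):
    """List the required clauses that are missing or weakened in a contract."""
    gaps = [desc for key, desc in REQUIRED_TERMS.items() if not terms.get(key, False)]
    # A carve-out for "anonymized" data undermines the no-training clause.
    if terms.get("anonymized_data_carveout", False):
        gaps.append("Contract lets the vendor use anonymized data")
    return gaps


# Example review of a hypothetical vendor contract.
vendor_terms = {
    "zero_retention": True,
    "no_training": True,
    "data_residency": False,
    "anonymized_data_carveout": True,
}
for gap in contract_gaps(vendor_terms):
    print("GAP:", gap)
```

The checklist only records what the contract says. It does not verify what the vendor's systems actually do.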

I covered the BAA gap in more detail in my article on EMS Telemedicine Integration. The same principle applies here. A BAA is a starting point, not a finish line.

HIPAA BAA for AI Transcription Services

The BAA for an AI transcription service needs to address specific things a standard BAA template may not cover.

The first is subprocessor disclosure. Many AI transcription vendors do not run their own models. They call an API from a provider such as OpenAI or Anthropic. In HIPAA terms, that provider is a subcontractor of the business associate. The BAA must name every subprocessor in the chain and require the same protections downstream.

The second is encryption key management. The vendor should support customer-managed keys (CMK) for the transcription output. If the vendor holds the keys, they hold the data. For dispatch audio that may intersect with law enforcement, CJIS requirements may also apply. I wrote about that intersection in CJIS Compliance for Fire and EMS.

The third is breach notification scope. A standard BAA defines a breach in terms of unauthorized access to or disclosure of PHI. But with AI transcription, the breach vector may be a model inversion attack that recovers data the model was trained on, or a prompt injection that extracts data from the model's context window. The contract should define these scenarios explicitly.

Securing AI Transcription for Emergency Services

If you are evaluating an AI transcription vendor right now, here is the short list of technical questions to ask.

Start with the ASR. Does it run on-premise or in the cloud? If cloud, which region, and what is the retention policy for the raw audio?

Where does the LLM run? The same questions apply. If the vendor uses a third-party model API, ask for the subprocessor name and the data handling terms.

What happens to the prompt history? Some platforms log every prompt for debugging or improvement. That log contains the full transcript. Ask for prompt logging to be disabled by default.

Can the agency delete its data on demand? The contract should include a process for requesting deletion of all agency data from the vendor's systems. That includes backups and cached outputs.

Is there an audit log? The platform should log every API call that processes agency data. Each entry needs a timestamp, the model version, and the output destination.
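To show what a usable audit entry might look like, here is a minimal sketch that writes one JSON record per processing call. The timestamp, model version, and output destination come from the list above. The `call_id` field and every example value are assumptions added for illustration. The entry deliberately leaves out the transcript text so the audit log does not become another PHI store.

```python
import json
from datetime import datetime, timezone


def audit_entry(call_id, model_version, output_destination, now=None):
    """Build one JSON audit record for a single processing call.

    call_id is a hypothetical opaque reference to the dispatch call.
    No transcript text is written to the record.
    """
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {
        "timestamp": timestamp,
        "call_id": call_id,
        "model_version": model_version,
        "output_destination": output_destination,
    }
    return json.dumps(record, sort_keys=True)


# Example with made-up values.
print(audit_entry("call-0001", "summarizer-v1", "cad_note"))
```

When a vendor says they have audit logging, ask for a sample entry and compare it against these fields.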

The Redaction Problem Nobody Is Solving

Traditional audio redaction is crude. You mute a segment and hope you caught everything. Text is worse because it is searchable and the PHI is explicit. If a dispatch summary containing unredacted PHI gets posted to a CAD note visible to dispatchers and field crews, that is a disclosure to people who may not have a need to know. The fact that supervisors can see it too does not make it better.

The agency also creates a new discoverable record. In litigation, those AI-generated summaries become part of the record. If they contain PHI that was never redacted, the agency has expanded its liability surface.
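To show why text redaction is hard, here is a minimal pattern-based sketch of the kind of filter that could run before a summary is written to a CAD note. It is an assumption-laden illustration, not a compliant redaction tool. The patterns are my own and only catch identifiers with a predictable shape.

```python
import re

# Illustrative patterns only. Order matters: the SSN pattern runs before the phone pattern.
PATTERNS = [
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    (
        "ADDRESS",
        re.compile(
            r"\b\d{1,5}\s+(?:[A-Z][a-z]+\s+){1,3}"
            r"(?:Street|St|Avenue|Ave|Road|Rd|Drive|Dr|Lane|Ln|Boulevard|Blvd)\b"
        ),
    ),
]


def redact(text):
    """Replace pattern-shaped identifiers with bracketed placeholders."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


summary = "Caller at 412 Oak Street, callback 555-867-5309, DOB 03/14/1958, takes metoprolol."
print(redact(summary))
# prints: Caller at [ADDRESS], callback [PHONE], DOB [DATE], takes metoprolol.
```

The medication name survives. So would a caller's name, a diagnosis, or a surgical history, because none of them have a fixed shape. That is the gap: pattern matching handles the easy identifiers and leaves the clinical detail in place.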

Frequently Asked Questions

Is a BAA enough to ensure my dispatch audio is not used to train an AI model?

A BAA provides a legal framework for HIPAA compliance but does not automatically disable model training. You need an explicit no-training clause in the contract that prohibits the vendor from using your data for model improvement or fine-tuning. The clause should cover any purpose beyond the specific transcription task.

Where does PHI usually leak in the AI transcription process?

The leak happens when audio converts to text and that text goes to a cloud LLM for summarization. If the vendor caches prompts or uses production data for model training, the PHI in those transcripts is stored in an environment the agency does not control. Logged interactions create the same exposure. The audio file may be secure, but the text output often is not.

What should an agency CIO look for in an AI transcription contract?

Look for a zero-retention clause, an explicit opt-out of model training, and a data residency requirement. Also verify subprocessor disclosure and customer-managed encryption keys. The breach notification scope should cover AI-specific attack vectors like prompt injection and model inversion.

Can an agency run AI transcription on-premise?

Some vendors offer on-premise deployment. This is the strongest option for data control because the audio and text never leave the agency's network. The tradeoff is higher cost and maintenance overhead. For agencies with strict data sovereignty requirements, it may be the only option that works.

Does CJIS apply to AI dispatch transcription?

If your dispatch center handles law enforcement calls, CJIS requirements apply to the storage and transmission of criminal justice information. The AI transcription pipeline must meet CJIS encryption standards and data residency rules. The intersection of HIPAA and CJIS in a shared CAD environment is a compliance area many agencies have not fully mapped.

---

The transcription vendors are coming. The efficiency gains are real. But the data pipeline between your dispatch audio and a cloud LLM exposes more than most agencies realize. Audit the contract before you sign it. Ask the technical questions. And make sure the text output gets the same protection you give the audio.

-- Steven
