Two Visions of Clinical AI: Dictation or Reasoning Partner?

This article reflects my observations about AI documentation tools and clinical practice workflows. Tool capabilities and compliance standards evolve rapidly. Verify HIPAA compliance, BAA availability, and feature sets with vendors directly before adopting any AI tool for clinical use.

I've been experimenting with AI documentation tools over the past few weeks, and I've noticed something interesting.

Most of these tools are designed around a single question: "How do I get this documented faster?"

That's valuable, and for many clinicians it's exactly the problem they need solved.

But I've started using AI to answer a different question: "Does my clinical reasoning hold up? Is this narrative defensible? Does this code match the complexity?"

Those are fundamentally different problems, and they require fundamentally different tools.

I don't know how many clinicians are thinking about AI this way yet. But the distinction feels important enough to name, especially because I think many clinicians are trusting dictation tools to do reasoning work those tools weren't designed for.

The Split No One's Talking About

There are really two different visions of what AI should do in clinical practice, and they're solving fundamentally different problems.

Here's what's important to understand: the first category is everywhere. The second is just beginning to emerge.

AI as Dictation Device

This model treats documentation as a clerical bottleneck:

  • Convert speech to structured text

  • Auto-populate EHR fields

  • Reduce typing time

  • Suggest billing codes based on templates or session duration

This is the dominant model right now. Dozens of tools, major funding, heavy marketing. Clinicians using them report significant time savings, and they should; the efficiency gains are real.

But here's what these tools don't do: they don't verify your logic.

They suggest codes based on pattern matching (checkboxes, keywords, session length), not reasoning through whether your documentation actually supports the code.

They automate output. They don't stress-test judgment.
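
To make that concrete, here's a deliberately oversimplified sketch in Python of how a template-driven suggester behaves. The thresholds and checkbox names are invented for illustration, not real coding rules, and no actual product works exactly this way; the point is the shape of the logic.

```python
# Toy sketch of a template-style code suggester (invented rules, not coding guidance).
# It maps inputs to a likely code without ever reading the note itself.

def suggest_em_code(session_minutes: int, checked_boxes: set) -> str:
    """Suggest an E/M code from session length and checkboxes alone."""
    # Pattern matching: longer visit plus certain boxes checked means a higher code.
    if session_minutes >= 40 or {"med_change", "multiple_dx"} <= checked_boxes:
        return "99215"
    if session_minutes >= 30 or "med_change" in checked_boxes:
        return "99214"
    return "99213"

# Notice what never happens: the function has no access to the documentation,
# so it cannot ask whether the note's medical decision-making supports the code.
print(suggest_em_code(45, {"med_change"}))  # -> 99215
```

The inputs drive the output; the note itself never enters the picture.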

AI as Reasoning Partner

This model treats clinical thinking as the bottleneck:

  • Stress-test diagnostic logic

  • Refine complex narratives

  • Verify billing justification: not just suggest codes, but ask "does your documentation support this?"

  • Support defensibility in ambiguous cases

This category barely exists yet for solo clinicians. Most reasoning-capable AI (the consumer versions of ChatGPT or Claude, for example) isn't HIPAA-compliant out of the box. Enterprise options require organizational contracts. Tools designed specifically as clinical reasoning partners, rather than dictation accelerators, are just starting to emerge.

These tools generate notes just as fast, often with higher narrative quality and more precise clinical language. But they also help you think through the hardest 10% of documentation: high-risk patients, mixed presentations, psychotherapy narratives, cases that will be scrutinized later.

The tradeoff isn't speed. It's workflow: you're typing or pasting rather than speaking, and you're copy-pasting into your EHR rather than relying on direct integration.

But what you gain is flexibility, reasoning depth, and the ability to stress-test your clinical logic before it goes into the permanent record.

It doesn't replace judgment. It amplifies it.

The Problem: Misplaced Trust

Here's what concerns me:

I keep hearing clinicians say things like: "My AI tool helps me with coding. It suggests the right codes."

That sounds good. But I think many clinicians don't realize what's actually happening under the hood.

Dictation AI suggests codes based on inputs, not logic.

It's doing pattern matching: "You checked these boxes, so here's the most common code."

It's not asking: "Does your documentation actually support this code in an audit?"

That looks like reasoning, but it's not.

And because it feels helpful (it speeds up workflow and the codes seem right), clinicians trust it.

But they're trusting it to do something it wasn't designed to do.

Why This Matters in Audits

Here's the scenario I'm worried about:

A clinician uses dictation AI that auto-suggests 99215 based on session length and a few checkbox inputs.

The note gets written. The code gets billed. Everything feels fine.

Six months later, an audit happens.

The auditor looks at the documentation and says: "Where's the MDM complexity that justifies 99215?"

The clinician says: "The AI suggested it."

The auditor says: "The AI suggested a code based on your inputs. But your documentation doesn't support it."

The tool did exactly what it was designed to do. It just wasn't designed to verify defensibility.

Reasoning AI would have caught that. It would have asked: "Is there actually enough complexity here? Can you point to the medical decision-making that justifies this level?"

That's not pattern matching. That's logic verification.

And in an audit, that distinction is everything.
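
If the sketch earlier showed the pattern-matching direction (inputs in, code out), here's the same kind of toy sketch for the verification direction. The checklist questions are my own paraphrase of the audit questions above, not any specific product's logic; what matters is that the check starts from the documentation and works back to the code.

```python
# Toy sketch of a verification-style check (my own phrasing, not a real product's logic).
# The question runs from the documentation to the code, not from inputs to a code.

MDM_CHECKS = {
    "99215": [
        "Which documented problems show high complexity or a severe exacerbation?",
        "Where is the data review or risk discussion that supports high-level MDM?",
        "Could an auditor find the medical decision-making in the note itself?",
    ],
}

def verification_prompt(proposed_code: str, draft_note: str) -> str:
    """Frame the challenge a reasoning partner would pose before the code is billed."""
    questions = "\n".join(f"- {q}" for q in MDM_CHECKS.get(proposed_code, []))
    return (
        f"Proposed code: {proposed_code}\n\n"
        f"Draft note:\n{draft_note}\n\n"
        f"Answer from the note alone, as an auditor would:\n{questions}"
    )

print(verification_prompt("99215", "Follow-up visit. Doing well. Continue current meds."))
```

Whether a clinician, a colleague, or a reasoning-capable model does the answering, the structure is the same: the documentation has to carry the justification.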

The Concern About "Outsourcing Judgment"

I know what some of you are thinking:

"If we rely on AI to help us reason, won't clinicians stop thinking for themselves?"

It's a legitimate concern, and it should be taken seriously.

This is the same debate happening in writing, law, education, and every other knowledge-work field. And in every field, there are people who use AI responsibly and people who don't.

Here's the reality: reasoning AI can be used well or used badly.

Used well, it functions like a colleague consult:

You walk down the hall and say: "I'm seeing this patient with mixed anxiety and depression, comorbid ADHD, tried three SSRIs, now I'm thinking about augmentation. Does 99215 feel right here?"

Your colleague doesn't make the decision. But the conversation sharpens your reasoning.

That's the responsible use case: you already know what you're doing, and you're using AI to stress-test your thinking.

Used badly, it becomes a shortcut:

A clinician asks: "What would you diagnose here?" and then blindly follows whatever the AI suggests without applying clinical judgment.

That's not reasoning support. That's outsourcing the decision itself.

And yes, some clinicians will do that.

Just like some clinicians copy-paste templates without reading them. Just like some clinicians trust dictation AI's code suggestions without verifying documentation. Just like some people use any tool as a crutch rather than a support.

The question isn't whether reasoning AI can be misused; it absolutely can.

The question is: does that mean we shouldn't use it at all, or does it mean we need to be clear about what responsible use looks like?

I think it's the latter.

Reasoning AI doesn't replace clinical judgment… but only if you don't let it.

If you're using it to verify thinking you've already done, it's a cognitive partner.
If you're using it to do your thinking for you, it's a clinical risk.

The tool itself doesn't enforce that distinction. You do.

And here's what often gets missed in this debate: when used well, AI doesn't just support judgment; it actively develops it.

When you ask AI a clinical question you don't know the answer to, challenge its reasoning, or test whether its logic holds up, you're not replacing judgment. You're engaging in the same kind of iterative learning you'd do with a supervisor, a case consultation, or a colleague with more experience in an area.

The difference between "outsourcing" and "developing" judgment isn't about whether you learn from AI. It's about whether you engage critically or accept blindly.

The risk isn't just that clinicians will rely on reasoning AI too much. The risk is twofold:

  1. Some clinicians will misuse reasoning AI by outsourcing decisions instead of verifying them

  2. Other clinicians will trust dictation AI to do reasoning work it was never designed for

Both risks are real. Both need to be named.

What Gets Lost When We Stop at Dictation

If AI in healthcare becomes primarily a transcription tool (converting speech to text, auto-populating fields, suggesting codes based on templates), we've automated the clerical parts of practice but left the cognitive parts untouched.

That's useful. But it's also a missed opportunity.

And more concerning: it creates a false sense of security.

When dictation AI suggests a billing code, generates a narrative, or completes a note quickly, it feels like the hard work is done. The note looks professional. The code seems right. Everything appears defensible.

But speed and polish aren't the same as accuracy or defensibility.

The AI didn't verify whether your MDM justifies the code. It didn't catch diagnostic inconsistencies. It didn't ask whether your psychotherapy narrative actually supports the add-on you billed.

It just made everything look good, and that can be more dangerous than having no AI at all, because it removes the friction that would have made you double-check your own reasoning.

The harder challenges in clinical practice aren't about typing speed. They're about:

  • Reasoning through diagnostic uncertainty

  • Documenting nuanced psychotherapy defensibly

  • Justifying complex billing decisions under audit scrutiny

  • Preserving clinical judgment when working alone

Solo practitioners especially carry this cognitive load without the informal hallway consults that happen in group settings.

Dictation tools don't address that. And worse, they can mask the fact that the cognitive work still needs to be done.

Reasoning tools, by contrast, make the cognitive work visible. They force you to engage with the logic, not just the output.

The Real Question

This isn't about which AI tool is "better."

It's about what you're trusting the tool to do.

If you're using AI to finish notes faster and reduce typing time, dictation tools are great. They do exactly what they're designed to do.

But if you're trusting AI to verify your billing logic, stress-test your diagnostic reasoning, or ensure your documentation is defensible in an audit, make sure you're using a tool designed to do that work.

Pattern matching looks like reasoning, but it's not.

And in high-stakes documentation, the difference matters.

What Comes Next

I don't think this is an either/or.

Some clinicians will always prefer dictation-first tools, especially in high-volume, highly standardized workflows where EHR integration and speed matter most.

Others will find that once they experience reasoning support, especially for complex cases where defensibility matters more than convenience, they won't want to go back.

Both approaches are valid. But the distinction is real.

And I think we're going to see this split more clearly over the next few years:

AI that makes clinicians faster.
AI that makes clinicians better.

Those will increasingly feel like different products, serving different needs, built on different philosophies of what clinical work actually is.

A Final Thought

The best tools, in any field, don't just automate tasks. They extend capability.

Dictation tools automate tasks.
Reasoning tools extend capability.

That's not a value judgment. It's a description of different design goals.

The question isn't which is better. The question is: what are you trusting the tool to do?

For most clinicians right now, dictation tools are solving the problem they know they have: too much typing.

But I think there's a second problem (cognitive load, judgment support, defensibility under scrutiny) that most clinicians haven't realized AI could help with yet.

And more importantly, I think some clinicians are trusting dictation tools to solve that second problem when those tools were never designed to do it.

That's the space worth paying attention to.

If this article made you think, "I wish I had someone to sanity-check this with," that's exactly what the Think Beyond Practice forum is for.

Members bring real cases, draft notes, and judgment calls into a space where other experienced clinicians help refine them, without hype or fear-based compliance.

Start your 7-day free trial
