AI Scribe Coding Concerns 2026: The Upcoding Risk Healthcare Providers Need to Understand

Written by SOAPNoteAI Editorial Team · Updated May 2026

Ambient AI scribes have delivered real benefits: reduced burnout, faster documentation, more time with patients. But a significant and underreported risk has emerged: AI-generated notes are systematically driving higher billing codes — and the gap between documented complexity and actual clinical work is creating serious compliance exposure.

This guide explains what the research shows about AI scribe upcoding, the federal legal risk, and practical steps every practice should take to protect itself.

The Problem: AI Documentation Can Outrun Your Clinical Work

AI scribes are designed to capture everything said during a patient encounter and generate comprehensive, complete-sounding clinical notes. This is their strength. It is also the source of the coding problem.

What the Research Shows

A 2025 policy brief published in npj Digital Medicine by researchers at Johns Hopkins University and Harvard Medical School analyzed coding patterns across multiple health systems before and after ambient AI scribe deployment. The findings were striking:

Consistent upward shift in E/M coding across all organizations studied
High-complexity new patient visits (99205) increased by 12–20 percentage points at most organizations
At one health system, coding for the highest-complexity new patient E/M reached 80% of all new patient visits
Similar upward trends in established patient visits (99215)

A follow-up analysis by TechTarget's Revenue Cycle Management division found that average claim amounts grew 12–18% after ambient AI scribe adoption, with payers already identifying AI documentation patterns as a fraud indicator.

Why AI Documentation Inflates Complexity

The mechanism is straightforward:

AI captures more than you would type. Providers dictating notes manually tend to summarize; AI listening in real time captures every statement the patient makes, every differential diagnosis mentioned aloud, every management option discussed.
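More documentation = higher MDM. Under the 2021 E/M guidelines, Medical Decision Making complexity is determined by three elements: problems addressed, data reviewed, and risk level. The overall level is the one that at least two of the three elements meet or exceed, so extra documented detail in a single element can be enough to tip a visit into a higher level. A comprehensive AI note often appears to document higher-complexity MDM than a manually typed note of the same encounter.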
Cognitive default to acceptance. When a note appears complete and well-documented, clinicians reviewing it for signature tend to accept the coded complexity rather than independently assess whether it reflects the actual visit. The note looks like a high-complexity encounter.
Commercial incentives. AI vendors are often evaluated partly on documentation completeness and revenue impact. Some tools are explicitly marketed based on revenue lift — creating an incentive structure misaligned with accurate coding.
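To make the MDM mechanism concrete, here is a minimal Python sketch of the two-of-three rule from the 2021 E/M guidelines. The function name, the 0-3 scoring scale, and the example scores are assumptions made for illustration. Scoring each element remains a clinical judgment that this code does not attempt.

```python
# Overall MDM levels, lowest to highest. Element scores use the same 0-3 scale.
MDM_LEVELS = ["straightforward", "low", "moderate", "high"]


def mdm_level(problems: int, data: int, risk: int) -> str:
    """Return the MDM level supported by three element scores (each 0-3).

    Scoring each element is the clinician's judgment; this function only
    applies the two-of-three rule to scores that were already assigned.
    """
    scores = (problems, data, risk)
    if any(s not in range(len(MDM_LEVELS)) for s in scores):
        raise ValueError("each element score must be 0, 1, 2, or 3")
    # A level is supported when at least two elements meet or exceed it,
    # so the supported level is the middle value of the three scores.
    return MDM_LEVELS[sorted(scores)[1]]


# Hypothetical scores for one visit, before and after extra documented detail.
print(mdm_level(problems=1, data=0, risk=2))  # low
print(mdm_level(problems=2, data=0, risk=2))  # moderate
```

In this hypothetical, raising only the problems score from 1 to 2 moves the supported level from low to moderate. That is the kind of shift a more exhaustive note can produce even when the visit itself did not change.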

The Legal Risk: False Claims Act Exposure

Upcoding (billing for a more complex or resource-intensive service than was actually provided) can create liability under the federal False Claims Act (31 U.S.C. §§ 3729–3733).

What "Knowingly" Means

The False Claims Act applies when a provider "knowingly" submits a false claim. The statute defines "knowingly" broadly, and no proof of specific intent to defraud is required. It includes:

Actual knowledge that the claim is false
Deliberate ignorance of whether the claim is false (e.g., not reviewing AI-generated notes before signing)
Reckless disregard for the truth (e.g., knowing your AI tool tends to inflate complexity but not auditing)

This means that signing AI-generated notes without adequate review does not shield a provider from liability; it may itself amount to deliberate ignorance or reckless disregard. The OIG has stated that providers cannot delegate their billing responsibility to AI tools.

Penalties

False Claims Act violations can result in:

Civil penalties of $13,946–$27,894 per false claim (2026 figures, adjusted annually)
Treble damages (three times the government's losses on the false claims)
Exclusion from Medicare, Medicaid, and other federal programs
Criminal charges in cases of intentional fraud

For high-volume AI documentation practices, even a modest percentage of upcoded claims can represent substantial liability.
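A rough back-of-the-envelope calculation shows why. The sketch below uses the per-claim penalty range quoted above and treats the overpayment on each claim as the damages figure; that is a simplifying assumption. The claim counts and dollar amounts are hypothetical, and the result is an illustration of scale, not a prediction of any actual outcome.

```python
# Per-claim civil penalty range quoted in this article (adjusted annually).
PENALTY_MIN = 13_946
PENALTY_MAX = 27_894


def fca_exposure(false_claims: int, overpayment_per_claim: int) -> tuple[int, int]:
    """Return (low, high) estimated exposure in dollars.

    Assumes damages equal the overpayment on each claim, trebled, plus a
    per-claim penalty at the bottom or top of the quoted range.
    """
    treble_damages = 3 * false_claims * overpayment_per_claim
    low = treble_damages + false_claims * PENALTY_MIN
    high = treble_damages + false_claims * PENALTY_MAX
    return low, high


# Hypothetical practice: 200 upcoded claims, each overpaid by $50.
low, high = fca_exposure(false_claims=200, overpayment_per_claim=50)
print(f"${low:,} to ${high:,}")  # $2,819,200 to $5,608,800
```

Two hundred claims would be 2% of a 10,000-claim year. In this example the per-claim penalties, not the $30,000 in trebled overpayments, account for almost all of the total.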

OIG and CMS Guidance

The Office of Inspector General (OIG) and CMS have both emphasized that providers remain responsible for the accuracy of claims regardless of how documentation was created. CMS's 2026 guidance on AI documentation notes that:

"The use of AI tools in clinical documentation does not alter the provider's responsibility to ensure that claims accurately reflect the services rendered and are supported by the medical record."

How to Protect Your Practice

1. Audit Your Coding Distribution Before and After AI Deployment

The single most important protective step is to establish your baseline coding distribution before deploying AI documentation, then audit your distribution 30 and 60 days after deployment.

If your distribution shifts significantly upward — particularly if your 99205/99215 percentages increase substantially — investigate whether your AI tool's documentation is driving inflated complexity.

AI Documentation Coding Audit Checklist

BASELINE (complete before AI deployment):
[ ] Pull E/M coding distribution for last 90 days
[ ] Record % of 99202-99205 (new patient) at each level
[ ] Record % of 99211-99215 (established patient) at each level
[ ] Note average RVU per visit
[ ] Document your specialty's benchmark distributions (available from AMA)

POST-DEPLOYMENT (review at 30 and 60 days):
[ ] Compare current distribution to pre-AI baseline
[ ] Flag any shift >5 percentage points toward higher complexity codes
[ ] Pull 10-20 high-complexity notes for manual MDM review
[ ] Verify documented MDM actually reflects visit complexity
[ ] Compare your distribution to published specialty benchmarks

ONGOING:
[ ] Quarterly E/M distribution review
[ ] Annual external coding audit
[ ] Review any AI vendor updates that affect documentation completeness
[ ] Track RAC or commercial audit results
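The baseline-versus-current comparison can be automated with a few lines of Python. This is a minimal sketch, assuming you can export one billed E/M code per visit as a plain list for each period. The function names, the watched codes, and the sample data are illustrative; the 5-point threshold comes from the checklist.

```python
from collections import Counter


def distribution(codes: list[str]) -> dict[str, float]:
    """Return each code's share of visits, in percent."""
    if not codes:
        raise ValueError("need at least one billed code")
    counts = Counter(codes)
    return {code: 100 * n / len(codes) for code, n in counts.items()}


def flag_shifts(baseline, current, watch=("99205", "99215"), threshold=5.0):
    """Return watched codes whose share rose by more than `threshold` points."""
    base, cur = distribution(baseline), distribution(current)
    flagged = {}
    for code in watch:
        delta = cur.get(code, 0.0) - base.get(code, 0.0)
        if delta > threshold:
            flagged[code] = round(delta, 1)
    return flagged


# Hypothetical established-patient visits before and after deployment.
before = ["99213"] * 60 + ["99214"] * 35 + ["99215"] * 5
after = ["99213"] * 40 + ["99214"] * 42 + ["99215"] * 18
print(flag_shifts(before, after, watch=("99214", "99215")))
# {'99214': 7.0, '99215': 13.0}
```

A flagged code is a prompt to pull notes for manual MDM review, not proof of upcoding. A legitimate change in case mix can move the distribution too.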

2. Implement Mandatory Note Review Protocols

Every AI-generated note should receive a focused review of the Assessment and Plan sections before signing, specifically evaluating:

Number of problems addressed: Does the AI note accurately reflect the conditions actually managed during the visit?
Data reviewed: Does the AI note accurately capture the external records, tests, or independent interpretation performed?
Risk level: Is the documented risk level (low/moderate/high) accurate for this encounter?
Overall MDM level: Does the documented MDM complexity match what you actually did?

3. Train Providers on E/M Documentation Under AI

Most providers have not received training on how AI documentation interacts with E/M coding guidelines. Key principles to reinforce:

The note supports your coding; your coding reflects your clinical work — not the note's length
AI-generated comprehensive documentation does not automatically justify a higher-complexity code
You are responsible for the accuracy of MDM documented in AI-generated notes
When in doubt, code to what you actually did and adjust the note accordingly

4. Evaluate Your AI Vendor's Documentation Approach

Ask your AI vendor directly:

Is your tool designed to maximize documentation completeness? If yes, understand how this affects MDM documentation.
Does your tool suggest coding levels? If so, does it provide conservative suggestions or is it optimized for revenue capture?
What is your tool's approach to E/M guidelines? Does it apply the 2021 CMS E/M guidelines appropriately?
Do you provide coding distribution analytics? Good vendors help you monitor for coding drift.

5. Consider Periodic External Audits

For practices with high AI documentation volume or those in higher-risk specialties (primary care, internal medicine, hospitalist medicine), annual external coding audits are prudent. Establish a relationship with a healthcare compliance consultant who understands AI documentation, not just traditional coding patterns.

What Appropriate AI Documentation Looks Like

The goal is not to suppress AI documentation quality — it is to ensure the documentation accurately reflects the clinical encounter. Here is the difference:

Over-Documentation (Red Flag)

A 10-minute established patient visit for hypertension follow-up generates a note with:

Extensive past medical history review of every problem in the chart
Review of 12 systems with positive and negative findings for each
Comprehensive physical exam documentation
Extensive assessment discussing multiple diagnostic possibilities
Documentation suggesting high-complexity MDM

If the actual visit was a simple follow-up to renew antihypertensive medication, this note does not accurately represent the encounter — even if everything in it is technically accurate.

Appropriate Documentation

The same visit generates a note with:

Chief complaint and current medication response
Relevant ROS for hypertension management
Pertinent physical exam (vitals, relevant findings)
Assessment: HTN, well-controlled on current regimen
Plan: Continue medication, next visit in 3 months
MDM: Straightforward or low complexity

This note accurately reflects a low-complexity visit and supports appropriate coding.

The Path Forward

AI documentation tools offer genuine benefits for provider well-being and care quality. The coding concerns associated with AI scribes are real but manageable with appropriate oversight.

The practices that will navigate this well are those that:

Maintain human review of AI-generated notes before signing
Audit their coding distributions proactively
Train providers on the relationship between documentation and coding
Choose AI tools that prioritize accuracy over revenue optimization
Build compliance programs that specifically address AI documentation risks

For documentation tools focused on accuracy rather than billing optimization, see SOAPNoteAI.com — built to help clinicians document accurately and efficiently.

Frequently Asked Questions

AI scribe upcoding refers to the tendency of AI-generated clinical notes to support higher-complexity billing codes than the actual clinical encounter warrants. Research published in npj Digital Medicine in 2025 found that after deploying ambient AI scribes, health systems consistently shifted toward higher-acuity E/M code distributions — with high-complexity new patient visits increasing by 12–20 percentage points at multiple organizations, and as high as 80% at one system. This matters because upcoding — billing for a more complex service than was provided — is a federal offense under the False Claims Act and can result in repayment demands, penalties, and exclusion from Medicare and Medicaid.

The clinician, not the AI, is responsible for coding decisions. If you sign a note knowing that the documented complexity does not reflect the actual encounter, that is upcoding regardless of whether a human or AI wrote the note. The AI's tendency to generate comprehensive, complex-appearing documentation is not a legal defense. CMS, OIG, and the False Claims Act all apply equally to AI-assisted documentation. The practical risk: AI tools optimized for 'completeness' may generate notes that appear to support higher MDM complexity than the visit warranted. Providers must independently verify that the coded complexity reflects the actual clinical decision-making.

Five steps to protect yourself: (1) Never sign an AI-generated note without reviewing the assessment and coding complexity section; (2) Ensure the Medical Decision Making (MDM) documented by the AI actually reflects what you did — the number and complexity of problems addressed, amount of data reviewed, and risk of complications; (3) Run a coding audit 30–60 days after deploying any new AI documentation tool and compare your E/M distribution to your pre-AI baseline; (4) Ask your AI vendor if their tool is designed to maximize coding completeness, and understand how this affects your notes; (5) Consider periodic external audits if you use high-volume AI documentation.

The 'coding arms race' is a term used by researchers at Johns Hopkins University and Harvard Medical School (published in npj Digital Medicine) to describe an emergent dynamic where AI scribes drive more complete documentation, which supports higher complexity codes, which produces higher reimbursements — incentivizing health systems to prioritize AI tools that maximize coding rather than tools that prioritize documentation accuracy. Meanwhile, payers are responding by tightening audit criteria and scrutinizing AI-generated notes more closely. The concern is that this creates a systematic inflation of healthcare costs without corresponding improvement in care quality.

As of 2026, major commercial payers and CMS are developing enhanced audit protocols specifically for AI-generated documentation. Key audit triggers include: consistent patterns of high-complexity E/M coding across a provider's entire panel; documentation that appears templated or identical across multiple patients; comprehensive note content that does not match the appointment type (e.g., a 10-minute follow-up generating a high-complexity note); and rapid shifts in coding distribution shortly after a new documentation tool is deployed. Medicare Advantage plans are particularly active in auditing AI documentation quality.
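One of these triggers, notes that look templated or identical across patients, can be screened for internally before an auditor finds it. The sketch below uses `difflib.SequenceMatcher` from the Python standard library to compare note texts pairwise. The function name, the 0.9 similarity threshold, and the note IDs are assumptions chosen for illustration. They are not a payer's actual criteria.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(notes: dict[str, str], threshold: float = 0.9) -> list[tuple[str, str]]:
    """Return pairs of note IDs whose text similarity meets the threshold.

    `notes` maps a note ID to its text. Similarity is SequenceMatcher's
    ratio, which ranges from 0.0 (nothing shared) to 1.0 (identical).
    """
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(notes.items(), 2):
        if SequenceMatcher(None, text_a, text_b).ratio() >= threshold:
            pairs.append((id_a, id_b))
    return pairs
```

Run it over the Assessment and Plan text of notes for different patients. Any pair it returns deserves a look at whether the language is specific to each encounter. Pairwise comparison grows quadratically with the number of notes, so this approach suits a sampled audit set better than a full archive.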

Recovery Audit Contractors (RACs) perform post-payment reviews and can demand repayment if documentation does not support the billed service level. AI-generated notes that appear comprehensive but contain generic or templated language — rather than specific, individualized clinical details — are increasingly being flagged in RAC audits as not supporting the billed complexity. The fact that an AI generated the note does not protect against a RAC demand; the responsibility remains with the billing provider. Practices with high volumes of AI documentation should ensure their compliance programs specifically address AI-related audit risks.

AI documentation tools can also improve coding accuracy, but only if properly configured and used correctly. Some AI documentation tools include built-in coding guidance that helps providers select the appropriate E/M level based on documented MDM elements. When used responsibly, these features can improve coding accuracy in both directions: they help providers who undercode (common in primary care) capture appropriate complexity, and they flag notes where the AI's documentation may outrun the clinical encounter. The key is provider education: understanding the relationship between your clinical work, the AI's documentation, and the resulting billing codes.

Medical Disclaimer: This content is for educational purposes only and should not replace professional medical judgment. Always consult current clinical guidelines and your institution's policies.