MEDI PL

2 Squirrels AI

Technology Stack

OCR Provider Azure Doc Intelligence
OCR Provider Google Doc AI
Vision Model DeepSeek-OCR
Domain Medicare Forms
Architecture Multi-Provider Pipeline

Architecture Note: Medicare form OCR pipeline using multi-provider fusion for maximum accuracy. Combines Azure, Google, and DeepSeek for robust document extraction.

OCR Pipeline Components

📄
Form Ingestion
Medicare Document Input
☁️
Azure Doc Intelligence
Structured Field Extraction
🔍
Google Doc AI
Text + Layout Analysis
🧠
DeepSeek VL
Vision-Based OCR
🔀
Result Fusion
Best Result Selection
Validation Engine
Rule-Based Verification

Azure Doc Intelligence

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#C17852', 'primaryTextColor': '#F0F6FC', 'primaryBorderColor': '#4A5E32', 'lineColor': '#E6C98F', 'secondaryColor': '#161B22', 'tertiaryColor': '#0D1117', 'background': '#0D1117', 'mainBkg': '#161B22', 'nodeBorder': '#4A5E32', 'clusterBkg': '#161B22', 'clusterBorder': '#4A5E32', 'titleColor': '#E6C98F', 'edgeLabelBackground': '#161B22'}}}%%
flowchart LR
    subgraph Input["📥 Document Input"]
        MF[("📄 Medicare Form
PDF/Image")] end subgraph Azure["☁️ Azure Doc Intelligence"] UP["📤 Upload to Azure"] AN["🔍 Analyze Document"] EX["📊 Extract Fields"] end subgraph Fields["📋 Structured Output"] PN["👤 Patient Name"] DOB["📅 Date of Birth"] MID["🔢 Medicare ID"] SVC["🏥 Service Codes"] DT["📆 Service Dates"] end subgraph Output["📤 Result"] JSON[("📝 Structured JSON")] end MF --> UP UP --> AN AN --> EX EX --> PN EX --> DOB EX --> MID EX --> SVC EX --> DT PN --> JSON DOB --> JSON MID --> JSON SVC --> JSON DT --> JSON

Google Doc AI Extraction

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#C17852', 'primaryTextColor': '#F0F6FC', 'primaryBorderColor': '#4A5E32', 'lineColor': '#E6C98F', 'secondaryColor': '#161B22', 'tertiaryColor': '#0D1117', 'background': '#0D1117', 'mainBkg': '#161B22', 'nodeBorder': '#4A5E32', 'clusterBkg': '#161B22', 'clusterBorder': '#4A5E32', 'titleColor': '#E6C98F', 'edgeLabelBackground': '#161B22'}}}%%
flowchart TB
    subgraph Input["📥 Document"]
        DOC["📄 Medicare Document"]
    end

    subgraph Google["🔍 Google Doc AI"]
        GCS["☁️ Upload to GCS"]
        PROC["⚙️ Document Processor"]
        OCR["📝 OCR Engine"]
    end

    subgraph Analysis["🔬 Layout Analysis"]
        TEXT["📄 Raw Text"]
        BBOX["📐 Bounding Boxes"]
        TAB["📊 Table Detection"]
        FORM["📋 Form Fields"]
    end

    subgraph Entity["🏷️ Entity Extraction"]
        ENT["🔍 Entity Recognition"]
        KV["🔑 Key-Value Pairs"]
    end

    subgraph Output["📤 Output"]
        STRUCT["📝 Structured Data"]
        CONF["📊 Confidence Scores"]
    end

    DOC --> GCS
    GCS --> PROC
    PROC --> OCR
    OCR --> TEXT
    OCR --> BBOX
    OCR --> TAB
    OCR --> FORM
    TEXT --> ENT
    FORM --> KV
    ENT --> STRUCT
    KV --> STRUCT
    STRUCT --> CONF
                

DeepSeek Vision OCR

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#C17852', 'primaryTextColor': '#F0F6FC', 'primaryBorderColor': '#4A5E32', 'lineColor': '#E6C98F', 'secondaryColor': '#161B22', 'tertiaryColor': '#0D1117', 'background': '#0D1117', 'mainBkg': '#161B22', 'nodeBorder': '#4A5E32', 'clusterBkg': '#161B22', 'clusterBorder': '#4A5E32', 'titleColor': '#E6C98F', 'edgeLabelBackground': '#161B22'}}}%%
flowchart LR
    subgraph Input["📥 Complex Forms"]
        HW["✍️ Handwritten Forms"]
        LOW["📸 Low-Quality Scans"]
        MIX["📋 Mixed Content"]
    end

    subgraph PreProcess["⚙️ Pre-Processing"]
        IMG["🖼️ Image Enhancement"]
        NORM["📐 Normalization"]
    end

    subgraph DeepSeek["🧠 DeepSeek VL Model"]
        VIS["👁️ Vision Encoder"]
        LLM["🤖 Language Model"]
        FUSE["🔗 Vision-Language Fusion"]
    end

    subgraph Extract["📤 Extraction"]
        TXT["📄 Text Content"]
        LAY["📐 Layout Understanding"]
        SEM["💡 Semantic Analysis"]
    end

    subgraph Output["✅ Output"]
        RESULT["📝 Extracted Data"]
    end

    HW --> IMG
    LOW --> IMG
    MIX --> IMG
    IMG --> NORM
    NORM --> VIS
    VIS --> FUSE
    LLM --> FUSE
    FUSE --> TXT
    FUSE --> LAY
    FUSE --> SEM
    TXT --> RESULT
    LAY --> RESULT
    SEM --> RESULT
                

Multi-Provider Fusion

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#C17852', 'primaryTextColor': '#F0F6FC', 'primaryBorderColor': '#4A5E32', 'lineColor': '#E6C98F', 'secondaryColor': '#161B22', 'tertiaryColor': '#0D1117', 'background': '#0D1117', 'mainBkg': '#161B22', 'nodeBorder': '#4A5E32', 'clusterBkg': '#161B22', 'clusterBorder': '#4A5E32', 'titleColor': '#E6C98F', 'edgeLabelBackground': '#161B22'}}}%%
flowchart TB
    subgraph Input["📥 Medicare Form"]
        DOC["📄 Input Document"]
    end

    subgraph Parallel["⚡ Parallel OCR Processing"]
        direction TB
        AZ["☁️ Azure Doc Intelligence"]
        GO["🔍 Google Doc AI"]
        DS["🧠 DeepSeek VL"]
    end

    subgraph Results["📊 OCR Results"]
        R1["📝 Azure Result
+ Confidence"] R2["📝 Google Result
+ Confidence"] R3["📝 DeepSeek Result
+ Confidence"] end subgraph Fusion["🔀 Result Fusion"] ALIGN["🔗 Field Alignment"] VOTE["🗳️ Confidence Voting"] SELECT["🎯 Best Result Selection"] end subgraph Output["✅ Final Output"] BEST["📝 Optimal Extraction"] META["📊 Source Attribution"] end DOC --> AZ DOC --> GO DOC --> DS AZ --> R1 GO --> R2 DS --> R3 R1 --> ALIGN R2 --> ALIGN R3 --> ALIGN ALIGN --> VOTE VOTE --> SELECT SELECT --> BEST SELECT --> META

Form Validation Pipeline

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#C17852', 'primaryTextColor': '#F0F6FC', 'primaryBorderColor': '#4A5E32', 'lineColor': '#E6C98F', 'secondaryColor': '#161B22', 'tertiaryColor': '#0D1117', 'background': '#0D1117', 'mainBkg': '#161B22', 'nodeBorder': '#4A5E32', 'clusterBkg': '#161B22', 'clusterBorder': '#4A5E32', 'titleColor': '#E6C98F', 'edgeLabelBackground': '#161B22'}}}%%
flowchart TB
    subgraph Input["📥 Extracted Data"]
        EXT["📝 OCR Output"]
    end

    subgraph Rules["📋 Rule Engine"]
        FMT["📏 Format Validation
Medicare ID Pattern"] REQ["✔️ Required Fields
Completeness Check"] RANGE["📊 Range Validation
Date Bounds"] CROSS["🔗 Cross-Field Rules
Logical Consistency"] end subgraph Decision{"🎯 Validation Result"} PASS["✅ PASS"] FAIL["❌ FAIL"] end subgraph Output["📤 Final Output"] VALID["📝 Validated Data"] ERR["⚠️ Error Report"] FLAG["🚩 Manual Review"] end EXT --> FMT EXT --> REQ EXT --> RANGE EXT --> CROSS FMT --> PASS FMT --> FAIL REQ --> PASS REQ --> FAIL RANGE --> PASS RANGE --> FAIL CROSS --> PASS CROSS --> FAIL PASS --> VALID FAIL --> ERR FAIL --> FLAG

Full Medicare OCR Pipeline

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#C17852', 'primaryTextColor': '#F0F6FC', 'primaryBorderColor': '#4A5E32', 'lineColor': '#E6C98F', 'secondaryColor': '#161B22', 'tertiaryColor': '#0D1117', 'background': '#0D1117', 'mainBkg': '#161B22', 'nodeBorder': '#4A5E32', 'clusterBkg': '#161B22', 'clusterBorder': '#4A5E32', 'titleColor': '#E6C98F', 'edgeLabelBackground': '#161B22'}}}%%
flowchart TB
    subgraph Ingest["📥 Document Ingestion"]
        UP["📤 Upload Portal"]
        CLASS["🏷️ Form Classifier"]
    end

    subgraph PreProc["⚙️ Pre-Processing"]
        QUAL["📊 Quality Assessment"]
        ENH["🖼️ Image Enhancement"]
        SPLIT["✂️ Page Splitting"]
    end

    subgraph OCR["🔍 Multi-Model OCR Layer"]
        AZ["☁️ Azure Doc Intelligence"]
        GO["🔍 Google Doc AI"]
        DS["🧠 DeepSeek VL"]
    end

    subgraph Fusion["🔀 Fusion Layer"]
        MERGE["🔗 Result Merging"]
        CONF["📊 Confidence Scoring"]
        SELECT["🎯 Best Selection"]
    end

    subgraph Validate["✅ Validation Layer"]
        RULES["📋 Rule Engine"]
        VERIFY["✔️ Field Verification"]
    end

    subgraph Output["📤 Output"]
        API["🔌 API Response"]
        DB[("💾 Data Store")]
        REVIEW["👁️ Manual Review Queue"]
    end

    UP --> CLASS
    CLASS --> QUAL
    QUAL --> ENH
    ENH --> SPLIT
    SPLIT --> AZ
    SPLIT --> GO
    SPLIT --> DS
    AZ --> MERGE
    GO --> MERGE
    DS --> MERGE
    MERGE --> CONF
    CONF --> SELECT
    SELECT --> RULES
    RULES --> VERIFY
    VERIFY --> API
    VERIFY --> DB
    VERIFY --> REVIEW