Transcribe Integration Examples

This guide provides complete integration examples for the /transcribe endpoint, showing how to convert recorded audio into structured clinical notes from Python, Node.js, and a browser-based web application.

Basic Workflow

The basic workflow for using the transcribe endpoint is:

Capture or load an audio file containing a medical conversation
Send the audio file to the /transcribe endpoint
Process the structured clinical note response
Display or store the formatted note
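
Judging by the fields that the examples in this guide read (`note.title` and, for each entry in `note.sections`, `title` and `content`), step 3 can be sketched as a small defensive parser. This is a minimal sketch rather than an official schema: `extract_sections` is a hypothetical helper, the sample data is hand-written, and the real response may contain additional fields.

```python
from typing import Any, Dict, List, Tuple


def extract_sections(response_body: Dict[str, Any]) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (note title, [(section title, section content), ...]).

    Assumes the shape used by the examples in this guide:
    {"note": {"title": ..., "sections": [{"title": ..., "content": ...}]}}
    """
    note = response_body.get("note")
    if not isinstance(note, dict):
        raise ValueError("Response does not contain a 'note' object")

    # Missing keys become empty strings instead of raising KeyError
    sections = [
        (str(section.get("title", "")), str(section.get("content", "")))
        for section in note.get("sections", [])
    ]
    return str(note.get("title", "")), sections


# Hand-written sample data for illustration, not real API output
sample = {
    "note": {
        "title": "Progress Note",
        "sections": [{"title": "Subjective", "content": "Patient reports mild headache."}],
    }
}
title, sections = extract_sections(sample)
print(title)
for section_title, content in sections:
    print(f"{section_title}: {content}")
```

Checking for the `note` object up front turns an unexpected response into a clear error message instead of a `KeyError` deep inside the formatting code.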

Complete Examples

Python Example

This example shows a complete Python application that:

Takes an audio file path as input
Sends it to the transcribe endpoint
Formats and displays the resulting clinical note

import requests
import argparse
import os
from typing import Dict, Any

def transcribe_audio(api_key: str, file_path: str, note_type: str = "PROGRESS_NOTE", language: str = "en") -> Dict[str, Any]:
    """
    Transcribe an audio file to a structured clinical note.
    
    Args:
        api_key: Your Knidian AI API key
        file_path: Path to the audio file
        note_type: Type of clinical note to generate
        language: Language of the audio
        
    Returns:
        The structured clinical note response
    """
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"Audio file not found: {file_path}")
    
    # Get file name and extension
    file_name = os.path.basename(file_path)
    
    # Determine content type based on file extension
    content_type = "audio/mpeg"  # Default to MP3
    if file_path.lower().endswith(".wav"):
        content_type = "audio/wav"
    elif file_path.lower().endswith(".m4a"):
        content_type = "audio/mp4"
    elif file_path.lower().endswith(".ogg"):
        content_type = "audio/ogg"
    elif file_path.lower().endswith(".flac"):
        content_type = "audio/flac"
    
    url = "https://api.knidian.ai/transcribe"
    headers = {
        "x-api-key": api_key
    }
    
    data = {
        "note_type": note_type,
        "language": language
    }
    
    print(f"Transcribing {file_name}...")
    # Open the file in a context manager so the handle is closed after the upload
    with open(file_path, "rb") as audio_file:
        files = {
            "file": (file_name, audio_file, content_type)
        }
        response = requests.post(url, headers=headers, files=files, data=data)
    
    # Raise once here; the caller decides how to report the failure
    if response.status_code != 200:
        raise RuntimeError(f"API request failed with status code {response.status_code}: {response.text}")
    
    return response.json()

def format_note(note_data: Dict[str, Any]) -> str:
    """Format the note data into a readable string."""
    note = note_data["note"]
    sections = note["sections"]
    
    formatted_note = f"# {note['title']}\n\n"
    
    for section in sections:
        formatted_note += f"## {section['title']}\n"
        formatted_note += f"{section['content']}\n\n"
    
    return formatted_note

def main():
    parser = argparse.ArgumentParser(description="Transcribe medical audio to structured clinical notes")
    parser.add_argument("file_path", help="Path to the audio file")
    parser.add_argument("--note-type", default="PROGRESS_NOTE", help="Type of clinical note to generate")
    parser.add_argument("--language", default="en", help="Language of the audio")
    parser.add_argument("--output", help="Output file path (optional)")
    args = parser.parse_args()
    
    # Get API key from environment variable
    api_key = os.environ.get("KNIDIAN_API_KEY")
    if not api_key:
        raise ValueError("KNIDIAN_API_KEY environment variable not set")
    
    try:
        result = transcribe_audio(api_key, args.file_path, args.note_type, args.language)
        
        # Format the note
        formatted_note = format_note(result)
        
        # Output the formatted note
        if args.output:
            with open(args.output, "w", encoding="utf-8") as f:
                f.write(formatted_note)
            print(f"Note saved to {args.output}")
        else:
            print("\n" + formatted_note)
            
    except Exception as e:
        print(f"Error: {e}")
        # Exit with a non-zero status so calling scripts can detect the failure
        raise SystemExit(1)

if __name__ == "__main__":
    main()

To use this script:

Save it as transcribe.py
Set your API key as an environment variable:
```
export KNIDIAN_API_KEY=your_api_key_here
```

Run the script with an audio file:

```
python transcribe.py patient_recording.mp3 --note-type=ED_NOTE --language=en
```

Node.js Example

This example shows a complete Node.js application that transcribes audio files to clinical notes:

const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');
const path = require('path');
const yargs = require('yargs/yargs');
const { hideBin } = require('yargs/helpers');

// Parse command line arguments
const argv = yargs(hideBin(process.argv))
  .option('file', {
    alias: 'f',
    description: 'Path to the audio file',
    type: 'string',
    demandOption: true
  })
  .option('noteType', {
    alias: 't',
    description: 'Type of clinical note to generate',
    type: 'string',
    default: 'PROGRESS_NOTE'
  })
  .option('language', {
    alias: 'l',
    description: 'Language of the audio',
    type: 'string',
    default: 'en'
  })
  .option('output', {
    alias: 'o',
    description: 'Output file path (optional)',
    type: 'string'
  })
  .help()
  .alias('help', 'h')
  .argv;

// Get content type based on file extension
function getContentType(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  const contentTypes = {
    '.mp3': 'audio/mpeg',
    '.wav': 'audio/wav',
    '.m4a': 'audio/mp4',
    '.ogg': 'audio/ogg',
    '.flac': 'audio/flac',
    '.aac': 'audio/aac'
  };
  
  return contentTypes[ext] || 'audio/mpeg';
}

// Format the note data into a readable string
function formatNote(noteData) {
  const note = noteData.note;
  const sections = note.sections;
  
  let formattedNote = `# ${note.title}\n\n`;
  
  for (const section of sections) {
    formattedNote += `## ${section.title}\n`;
    formattedNote += `${section.content}\n\n`;
  }
  
  return formattedNote;
}

async function transcribeAudio() {
  // Check if API key is set
  const apiKey = process.env.KNIDIAN_API_KEY;
  if (!apiKey) {
    console.error('Error: KNIDIAN_API_KEY environment variable not set');
    process.exit(1);
  }
  
  // Check if file exists
  if (!fs.existsSync(argv.file)) {
    console.error(`Error: Audio file not found: ${argv.file}`);
    process.exit(1);
  }
  
  const form = new FormData();
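  // Pass the file name and content type explicitly so getContentType() is actually used
  form.append('file', fs.createReadStream(argv.file), {
    filename: path.basename(argv.file),
    contentType: getContentType(argv.file)
  });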
  form.append('note_type', argv.noteType);
  form.append('language', argv.language);
  
  console.log(`Transcribing ${path.basename(argv.file)}...`);
  
  try {
    const response = await axios.post('https://api.knidian.ai/transcribe', form, {
      headers: {
        ...form.getHeaders(),
        'x-api-key': apiKey
      }
    });
    
    // Format the note
    const formattedNote = formatNote(response.data);
    
    // Output the formatted note
    if (argv.output) {
      fs.writeFileSync(argv.output, formattedNote);
      console.log(`Note saved to ${argv.output}`);
    } else {
      console.log('\n' + formattedNote);
    }
    
  } catch (error) {
    console.error('Error:', error.response ? error.response.data : error.message);
    process.exit(1);
  }
}

transcribeAudio();

To use this script:

Create a new Node.js project:

```
mkdir transcribe-example
cd transcribe-example
npm init -y
npm install axios form-data yargs
```

Save the script as transcribe.js
Set your API key as an environment variable:
```
export KNIDIAN_API_KEY=your_api_key_here
```

Run the script with an audio file:

```
node transcribe.js --file=patient_recording.mp3 --noteType=CONSULT_NOTE
```

Web Application Example

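Here's a simple HTML/JavaScript example that lets users upload an audio file and view the transcribed clinical note. Because the page sends the API key directly from the browser, treat it as a local demo only; as noted under Best Practices, production web applications should route API calls through a backend proxy.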

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Medical Audio Transcription</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
        }
        .container {
            display: flex;
            flex-direction: column;
            gap: 20px;
        }
        .form-group {
            margin-bottom: 15px;
        }
        label {
            display: block;
            margin-bottom: 5px;
            font-weight: bold;
        }
        select, input[type="file"] {
            width: 100%;
            padding: 8px;
            border: 1px solid #ddd;
            border-radius: 4px;
        }
        button {
            background-color: #4CAF50;
            color: white;
            padding: 10px 15px;
            border: none;
            border-radius: 4px;
            cursor: pointer;
        }
        button:hover {
            background-color: #45a049;
        }
        button:disabled {
            background-color: #cccccc;
            cursor: not-allowed;
        }
        .result {
            border: 1px solid #ddd;
            padding: 20px;
            border-radius: 4px;
            background-color: #f9f9f9;
        }
        .section {
            margin-bottom: 20px;
        }
        .section h3 {
            margin-bottom: 5px;
            color: #333;
        }
        .loading {
            display: none;
            text-align: center;
            padding: 20px;
        }
        .spinner {
            border: 4px solid rgba(0, 0, 0, 0.1);
            width: 36px;
            height: 36px;
            border-radius: 50%;
            border-left-color: #09f;
            animation: spin 1s linear infinite;
            margin: 0 auto;
        }
        @keyframes spin {
            0% { transform: rotate(0deg); }
            100% { transform: rotate(360deg); }
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Medical Audio Transcription</h1>
        
        <div class="form-container">
            <div class="form-group">
                <label for="apiKey">API Key:</label>
                <input type="password" id="apiKey" placeholder="Enter your Knidian API key">
            </div>
            
            <div class="form-group">
                <label for="audioFile">Audio File:</label>
                <input type="file" id="audioFile" accept="audio/*">
            </div>
            
            <div class="form-group">
                <label for="noteType">Note Type:</label>
                <select id="noteType">
                    <option value="PROGRESS_NOTE">Progress Note</option>
                    <option value="ADMISSION_NOTE">Admission Note</option>
                    <option value="ED_NOTE">Emergency Department Note</option>
                    <option value="DISCHARGE_SUMMARY">Discharge Summary</option>
                    <option value="CONSULT_NOTE">Consultation Note</option>
                    <option value="NURSING_NOTE">Nursing Note</option>
                    <option value="BEHAVIORAL_HEALTH_NOTE">Behavioral Health Note</option>
                    <option value="DIETITIAN_NOTE">Dietitian Note</option>
                </select>
            </div>
            
            <div class="form-group">
                <label for="language">Language:</label>
                <select id="language">
                    <option value="en">English</option>
                    <option value="es">Spanish</option>
                    <option value="pt">Portuguese</option>
                    <option value="fr">French</option>
                    <option value="de">German</option>
                    <option value="it">Italian</option>
                    <option value="ja">Japanese</option>
                    <option value="ko">Korean</option>
                    <option value="zh">Chinese</option>
                </select>
            </div>
            
            <button id="transcribeBtn">Transcribe Audio</button>
        </div>
        
        <div id="loading" class="loading">
            <div class="spinner"></div>
            <p>Transcribing audio... This may take a few minutes.</p>
        </div>
        
        <div id="result" class="result" style="display: none;">
            <h2 id="noteTitle"></h2>
            <div id="noteSections"></div>
        </div>
    </div>
    
    <script>
        document.getElementById('transcribeBtn').addEventListener('click', async function() {
            const apiKey = document.getElementById('apiKey').value;
            const audioFile = document.getElementById('audioFile').files[0];
            const noteType = document.getElementById('noteType').value;
            const language = document.getElementById('language').value;
            
            if (!apiKey) {
                alert('Please enter your API key');
                return;
            }
            
            if (!audioFile) {
                alert('Please select an audio file');
                return;
            }
            
            // Show loading indicator
            document.getElementById('loading').style.display = 'block';
            document.getElementById('result').style.display = 'none';
            document.getElementById('transcribeBtn').disabled = true;
            
            const formData = new FormData();
            formData.append('file', audioFile);
            formData.append('note_type', noteType);
            formData.append('language', language);
            
            try {
                const response = await fetch('https://api.knidian.ai/transcribe', {
                    method: 'POST',
                    headers: {
                        'x-api-key': apiKey
                    },
                    body: formData
                });
                
                if (!response.ok) {
                    // The error body may not be JSON, so fall back to the status code
                    const errorData = await response.json().catch(() => ({}));
                    throw new Error(errorData.message || `Error: ${response.status}`);
                }
                
                const data = await response.json();
                
                // Display the result
                document.getElementById('noteTitle').textContent = data.note.title;
                
                const sectionsContainer = document.getElementById('noteSections');
                sectionsContainer.innerHTML = '';
                
                data.note.sections.forEach(section => {
                    const sectionDiv = document.createElement('div');
                    sectionDiv.className = 'section';
                    
                    const title = document.createElement('h3');
                    title.textContent = section.title;
                    
                    const content = document.createElement('p');
                    content.textContent = section.content;
                    
                    sectionDiv.appendChild(title);
                    sectionDiv.appendChild(content);
                    sectionsContainer.appendChild(sectionDiv);
                });
                
                document.getElementById('result').style.display = 'block';
                
            } catch (error) {
                alert(`Transcription failed: ${error.message}`);
                console.error('Error:', error);
            } finally {
                document.getElementById('loading').style.display = 'none';
                document.getElementById('transcribeBtn').disabled = false;
            }
        });
    </script>
</body>
</html>

To use this web application:

Save the HTML code to a file named index.html
Open the file in a web browser
Enter your API key, select an audio file, choose the note type and language, and click "Transcribe Audio"

Best Practices

When integrating the transcribe endpoint into your application, consider these best practices:

Handle large files efficiently: For files approaching the 100MB limit, consider implementing a progress indicator during upload.
Implement error handling: Always handle potential errors from the API, including authentication issues, file format problems, and server errors.
Validate input files: Check file types and sizes before sending to avoid unnecessary API calls.
Secure API keys: Never expose your API key in client-side code. For web applications, use a backend proxy to make API calls.
Consider user experience: Transcription may take time for longer audio files, so implement appropriate loading indicators and feedback.
Process the response appropriately: Format the structured note data in a way that's useful for your specific application.
Implement retry logic: For production applications, implement retry logic with exponential backoff for transient errors.
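
Two of the practices above, validating input files and retrying with exponential backoff, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not part of the Knidian API: `validate_audio_file`, `call_with_retries`, and `TransientError` are hypothetical names, the extension list is taken from the examples in this guide, and the 100 MB limit is interpreted here as 100 * 1024 * 1024 bytes.

```python
import os
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumptions: extensions come from the examples in this guide, and the size
# limit is the 100 MB figure mentioned above. Confirm both against the API reference.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".aac"}
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024


def validate_audio_file(file_path: str) -> None:
    """Raise ValueError if the file looks unsuitable for upload."""
    extension = os.path.splitext(file_path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported audio format: {extension or '(none)'}")
    size = os.path.getsize(file_path)
    if size == 0:
        raise ValueError("Audio file is empty")
    if size > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"Audio file is too large: {size} bytes")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (for example, a 5xx response)."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on TransientError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller
            # Delay doubles on each attempt (base, 2*base, 4*base, ...) plus jitter
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

To combine this with the Python example earlier in this guide, `transcribe_audio` would need to raise `TransientError` for the failures you consider retryable, after which it could be called as `call_with_retries(lambda: transcribe_audio(api_key, file_path))`. Which status codes count as transient is not specified here, so that choice is left to the integrator.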
