API Reference

This section provides comprehensive documentation for the WTF Transcript Converter API.

Core Models

Core WTF data models.

This module contains Pydantic models for the World Transcription Format (WTF).

class wtf_transcript_converter.core.models.WTFTranscript(**data)

Bases: BaseModel

Core transcript information following the WTF specification.

Parameters: data (Any)

text: str

language: str

duration: float

confidence: float

classmethod validate_language_code(v)

Validate BCP-47 language code format.

Parameters: v (str)
Return type: str

classmethod validate_text(v)

Validate and clean transcript text.

Parameters: v (str)
Return type: str

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class wtf_transcript_converter.core.models.WTFSegment(**data)

Bases: BaseModel

Individual transcript segment with timing information.

Parameters: data (Any)

id: int

start: float

end: float

text: str

confidence: float

speaker: int | str | None

words: List[int] | None

validate_timing()

Validate that end time is after start time.

Return type: WTFSegment

classmethod validate_text(v)

Validate and clean segment text.

Parameters: v (str)
Return type: str

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
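The timing and text rules described for a segment can be sketched without Pydantic. The example below is a minimal stand-in built on a standard-library dataclass; the class name SegmentSketch is hypothetical, and reading "validate and clean" as whitespace stripping is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class SegmentSketch:
    """Hypothetical stand-in for WTFSegment with a subset of its fields."""

    id: int
    start: float
    end: float
    text: str
    confidence: float

    def __post_init__(self) -> None:
        # Documented rule: the end time must be after the start time.
        if self.end <= self.start:
            raise ValueError(
                f"segment {self.id}: end ({self.end}) must be after start ({self.start})"
            )
        # Assumption: "cleaning" the text means stripping surrounding whitespace.
        self.text = self.text.strip()
```

Raising at construction time mirrors how a model-level validator rejects an inconsistent object before it can be used.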

class wtf_transcript_converter.core.models.WTFWord(**data)

Bases: BaseModel

Word-level transcription data.

Parameters: data (Any)

id: int

start: float

end: float

text: str

confidence: float

speaker: int | str | None

is_punctuation: bool | None

validate_timing()

Validate that end time is after start time.

Return type: WTFWord

classmethod validate_text(v)

Validate and clean word text.

Parameters: v (str)
Return type: str

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class wtf_transcript_converter.core.models.WTFSpeaker(**data)

Bases: BaseModel

Speaker information for diarization.

Parameters: data (Any)

id: int | str

label: str

segments: List[int]

total_time: float

confidence: float

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class wtf_transcript_converter.core.models.WTFAudio(**data)

Bases: BaseModel

Audio metadata information.

Parameters: data (Any)

duration: float

sample_rate: int | None

channels: int | None

format: str | None

bitrate: int | None

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class wtf_transcript_converter.core.models.WTFMetadata(**data)

Bases: BaseModel

Processing metadata information.

Parameters: data (Any)

created_at: str

processed_at: str

provider: str

model: str

processing_time: float | None

audio: WTFAudio

options: Dict[str, Any]

classmethod validate_timestamp(v)

Validate ISO 8601 timestamp format.

Parameters: v (str)
Return type: str

classmethod validate_provider(v)

Validate and normalize provider name.

Parameters: v (str)
Return type: str

classmethod validate_model(v)

Validate model identifier.

Parameters: v (str)
Return type: str

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class wtf_transcript_converter.core.models.WTFQuality(**data)

Bases: BaseModel

Quality metrics for the transcription.

Parameters: data (Any)

audio_quality: str | None

background_noise: float | None

multiple_speakers: bool | None

overlapping_speech: bool | None

silence_ratio: float | None

average_confidence: float | None

low_confidence_words: int | None

processing_warnings: List[str]

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class wtf_transcript_converter.core.models.WTFExtensions(**data)

Bases: BaseModel

Provider-specific extensions.

Parameters: data (Any)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class wtf_transcript_converter.core.models.WTFDocument(**data)

Bases: BaseModel

Complete WTF document structure.

Parameters: data (Any)

transcript: WTFTranscript

segments: List[WTFSegment]

metadata: WTFMetadata

words: List[WTFWord] | None

speakers: Dict[str, WTFSpeaker] | None

alternatives: List[Dict[str, Any]] | None

enrichments: Dict[str, Any] | None

extensions: Dict[str, Any] | None

quality: WTFQuality | None

streaming: Dict[str, Any] | None

validate_document_consistency()

Validate document-level consistency.

Return type: WTFDocument

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
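To make the shape of a document concrete, the sketch below writes out a minimal payload as a plain dictionary using only the three non-optional sections (transcript, segments, metadata) and the field names listed in the models above. All values are invented for illustration, and whether fields such as options have defaults in the real models is not stated here, so they are included explicitly.

```python
import json
from typing import Any, Dict

# Illustrative payload only; every value below is made up.
minimal_doc: Dict[str, Any] = {
    "transcript": {
        "text": "Hello world.",
        "language": "en-US",
        "duration": 1.5,
        "confidence": 0.94,
    },
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": "Hello world.", "confidence": 0.94}
    ],
    "metadata": {
        "created_at": "2024-01-01T12:00:00+00:00",
        "processed_at": "2024-01-01T12:00:02+00:00",
        "provider": "whisper",
        "model": "example-model",
        "audio": {"duration": 1.5},
        "options": {},
    },
}

# The payload is plain JSON-compatible data, so it survives a round trip.
serialized = json.dumps(minimal_doc, indent=2)
restored = json.loads(serialized)
```

A dictionary of this shape is presumably what would be passed as keyword data when constructing the model, but that is an inference from the (**data) signature rather than something the reference states.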

class wtf_transcript_converter.core.models.VConWTFAttachment(**data)

Bases: BaseModel

Deprecated placeholder. WTF results go in analysis[], not attachments[].

Parameters: data (Any)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

Core Validator

WTF document validation functions.

This module provides validation functions for WTF documents and their components.

wtf_transcript_converter.core.validator.validate_wtf_document(doc)

Validate a WTF document for compliance with the specification.

Parameters: doc (WTFDocument) – WTF document to validate
Return type: Tuple[bool, List[str]]
Returns: Tuple of (is_valid, list_of_errors)
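The (is_valid, list_of_errors) return shape suggests a validator that collects every problem instead of stopping at the first. The sketch below shows that pattern on a plain dictionary; validate_doc_sketch and the specific checks it runs are assumptions for illustration, not the rules the library applies.

```python
from typing import Any, Dict, List, Tuple


def validate_doc_sketch(doc: Dict[str, Any]) -> Tuple[bool, List[str]]:
    """Hypothetical validator that accumulates error messages."""
    errors: List[str] = []

    # Required top-level sections, taken from the non-optional WTFDocument fields.
    for key in ("transcript", "segments", "metadata"):
        if key not in doc:
            errors.append(f"missing required section: {key}")

    # Per-segment checks based on the documented timing and confidence rules.
    for index, seg in enumerate(doc.get("segments", [])):
        start, end = seg.get("start", 0.0), seg.get("end", 0.0)
        if end <= start:
            errors.append(f"segments[{index}]: end ({end}) is not after start ({start})")
        confidence = seg.get("confidence", 0.0)
        if not 0.0 <= confidence <= 1.0:
            errors.append(f"segments[{index}]: confidence {confidence} outside [0.0, 1.0]")

    return (not errors, errors)
```

Returning the full list lets a caller report all issues in one pass, which is convenient when checking documents in batch.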

wtf_transcript_converter.core.validator.validate_confidence_score(confidence, context='')

Validate that a confidence score is in the valid range [0.0, 1.0].

Parameters:

confidence (float) – Confidence score to validate
context (str) – Optional context for error messages

Return type:

bool

Returns:

True if valid, False otherwise
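A range check of this kind can be sketched in a few lines. The function name confidence_in_range is hypothetical, and logging a warning that includes the context string is an assumption about how the optional context might be used.

```python
import logging
import math

logger = logging.getLogger(__name__)


def confidence_in_range(confidence: float, context: str = "") -> bool:
    """Return True when confidence is a real number in [0.0, 1.0]."""
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    valid = is_number and not math.isnan(confidence) and 0.0 <= confidence <= 1.0
    if not valid:
        # Assumption: the context only decorates the diagnostic message.
        suffix = f" ({context})" if context else ""
        logger.warning("invalid confidence score %r%s", confidence, suffix)
    return valid
```

Rejecting NaN explicitly makes the intent clear, since comparisons against NaN are always false.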

wtf_transcript_converter.core.validator.validate_timestamp(timestamp)

Validate ISO 8601 timestamp format.

Parameters: timestamp (str) – Timestamp string to validate
Return type: bool
Returns: True if valid, False otherwise
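One way to approximate an ISO 8601 check with the standard library is datetime.fromisoformat. This is a sketch under stated assumptions: looks_like_iso8601 is a hypothetical name, fromisoformat accepts only a subset of ISO 8601, and the library may use a different strategy.

```python
from datetime import datetime


def looks_like_iso8601(timestamp: str) -> bool:
    """Return True if the string parses with datetime.fromisoformat."""
    if not isinstance(timestamp, str):
        return False
    # Python versions before 3.11 reject a trailing "Z", so map it to an offset.
    candidate = timestamp[:-1] + "+00:00" if timestamp.endswith("Z") else timestamp
    try:
        datetime.fromisoformat(candidate)
    except ValueError:
        return False
    return True
```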

wtf_transcript_converter.core.validator.validate_language_code(language_code)

Validate BCP-47 language code format.

Parameters: language_code (str) – Language code to validate
Return type: bool
Returns: True if valid, False otherwise
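A simplified shape check for BCP-47 tags might look like the sketch below. It covers only the language, script, and region subtags and does not consult the IANA subtag registry; the pattern and the name is_simple_bcp47 are illustrative assumptions, not the library's implementation.

```python
import re

# Simplified pattern: language[-Script][-REGION]; variants and extensions are ignored.
_BCP47_SIMPLE = re.compile(
    r"[A-Za-z]{2,3}"                # primary language subtag, e.g. "en"
    r"(?:-[A-Za-z]{4})?"            # optional script subtag, e.g. "Hant"
    r"(?:-(?:[A-Za-z]{2}|\d{3}))?"  # optional region subtag, e.g. "US" or "419"
)


def is_simple_bcp47(language_code: str) -> bool:
    """Return True if the code matches the simplified tag shape above."""
    return isinstance(language_code, str) and _BCP47_SIMPLE.fullmatch(language_code) is not None
```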

WTFValidator

Core Converter

Base converter framework for WTF transcript conversion.

This module provides abstract base classes for converting between different transcript formats and WTF.

class wtf_transcript_converter.core.converter.BaseConverter

Bases: ABC

Abstract base class for all converters.

abstractmethod convert(data)

Convert data from one format to another.

Parameters: data (Any)
Return type: Any

class wtf_transcript_converter.core.converter.ToWTFConverter

Bases: BaseConverter

Abstract base class for converters that convert TO WTF format.

abstractmethod convert(data)

Convert provider-specific data to WTF format.

Parameters: data (Dict[str, Any])
Return type: WTFDocument

class wtf_transcript_converter.core.converter.FromWTFConverter

Bases: BaseConverter

Abstract base class for converters that convert FROM WTF format.

abstractmethod convert(wtf_doc)

Convert WTF document to provider-specific format.

Parameters: wtf_doc (WTFDocument)
Return type: Dict[str, Any]
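The abstract-base pattern these classes describe can be illustrated with a small stand-alone sketch. BaseConverterSketch and TextOnlyConverterSketch are hypothetical names, and the output dictionary is a toy shape, not a real WTF document.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseConverterSketch(ABC):
    """Stand-in for an abstract converter with a single convert() contract."""

    @abstractmethod
    def convert(self, data: Any) -> Any:
        """Convert data from one format to another."""


class TextOnlyConverterSketch(BaseConverterSketch):
    """Toy subclass that narrows the contract to dict in, dict out."""

    def convert(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Only the transcript text is carried over in this illustration.
        return {"transcript": {"text": str(data.get("text", "")).strip()}}
```

Because convert() is abstract, the base class cannot be instantiated directly; only subclasses that implement it can.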

BaseProviderConverter

Provider Converters

Whisper Converter

Whisper provider converter.

This module provides conversion between Whisper JSON format and WTF format.

class wtf_transcript_converter.providers.whisper.WhisperConverter

Bases: ToWTFConverter, FromWTFConverter

Converter for Whisper JSON format to/from WTF format.

__init__()

convert_to_wtf(whisper_data)

Convert Whisper JSON data to WTF format.

Parameters: whisper_data (Dict[str, Any]) – Whisper JSON data structure
Return type: WTFDocument
Returns: WTF document

convert_from_wtf(wtf_doc)

Convert WTF document to Whisper JSON format.

Parameters: wtf_doc (WTFDocument) – WTF document
Return type: Dict[str, Any]
Returns: Whisper JSON data structure

convert(data)

Generic convert method - determines direction based on data type.

Parameters: data (Any)
Return type: Any
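The generic convert method is described as choosing a direction from the type of its input. A minimal sketch of that dispatch is shown below; DocSketch and TwoWayConverterSketch are hypothetical stand-ins, and the field mapping (text, language, and segments with id, start, end, text) is an assumption about a Whisper-style payload rather than the converter's actual logic.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class DocSketch:
    """Stand-in for WTFDocument holding only a few fields."""

    text: str
    language: str
    segments: List[Dict[str, Any]] = field(default_factory=list)


class TwoWayConverterSketch:
    """Illustrates type-based dispatch; not the library's WhisperConverter."""

    def convert_to_wtf(self, provider_data: Dict[str, Any]) -> DocSketch:
        return DocSketch(
            text=provider_data.get("text", "").strip(),
            language=provider_data.get("language", "en"),
            segments=[
                {"id": s["id"], "start": s["start"], "end": s["end"], "text": s["text"].strip()}
                for s in provider_data.get("segments", [])
            ],
        )

    def convert_from_wtf(self, doc: DocSketch) -> Dict[str, Any]:
        return {"text": doc.text, "language": doc.language, "segments": list(doc.segments)}

    def convert(self, data: Union[Dict[str, Any], DocSketch]) -> Union[DocSketch, Dict[str, Any]]:
        # A document goes out to provider format; a raw dict comes in.
        if isinstance(data, DocSketch):
            return self.convert_from_wtf(data)
        if isinstance(data, dict):
            return self.convert_to_wtf(data)
        raise TypeError(f"unsupported input type: {type(data).__name__}")
```

Calling the explicit convert_to_wtf or convert_from_wtf methods keeps return types predictable; the generic convert is mainly a convenience.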

Deepgram Converter

Deepgram provider converter.

This module provides conversion between Deepgram JSON format and WTF format.

class wtf_transcript_converter.providers.deepgram.DeepgramConverter

Bases: ToWTFConverter, FromWTFConverter

Converter for Deepgram JSON format to/from WTF format.

__init__()

convert_to_wtf(deepgram_data)

Convert Deepgram JSON data to WTF format.

Parameters: deepgram_data (Dict[str, Any]) – Deepgram JSON data structure
Return type: WTFDocument
Returns: WTF document

convert_from_wtf(wtf_doc)

Convert WTF document to Deepgram JSON format.

Parameters: wtf_doc (WTFDocument) – WTF document
Return type: Dict[str, Any]
Returns: Deepgram JSON data structure

convert(data)

Generic convert method - determines direction based on data type.

Parameters: data (Any)
Return type: Any

AssemblyAI Converter

AssemblyAI provider converter.

This module provides conversion between AssemblyAI JSON format and WTF format.

class wtf_transcript_converter.providers.assemblyai.AssemblyAIConverter

Bases: ToWTFConverter, FromWTFConverter

Converter for AssemblyAI JSON format to/from WTF format.

__init__()

convert_to_wtf(assemblyai_data)

Convert AssemblyAI JSON data to WTF format.

Parameters: assemblyai_data (Dict[str, Any]) – AssemblyAI JSON data structure
Return type: WTFDocument
Returns: WTF document

convert_from_wtf(wtf_doc)

Convert WTF document to AssemblyAI JSON format.

Parameters: wtf_doc (WTFDocument) – WTF document
Return type: Dict[str, Any]
Returns: AssemblyAI JSON data structure

convert(data)

Generic convert method - determines direction based on data type.

Parameters: data (Any)
Return type: Any

Rev.ai Converter

Rev.ai provider converter for WTF transcript format.

This module provides conversion between Rev.ai transcription format and WTF format.

class wtf_transcript_converter.providers.rev_ai.RevAIConverter

Bases: BaseProviderConverter

Converter for Rev.ai JSON format to/from WTF format.

__init__()

provider_name: str = 'rev_ai'

description: str = 'Rev.ai transcription service'

status: str = 'Implemented'

convert_to_wtf(rev_ai_data)

Convert Rev.ai JSON data to WTF format.

Parameters: rev_ai_data (Dict[str, Any]) – Rev.ai JSON data structure
Return type: WTFDocument
Returns: WTF document

convert_from_wtf(wtf_doc)

Convert WTF document to Rev.ai JSON format.

Parameters: wtf_doc (WTFDocument) – WTF document
Return type: Dict[str, Any]
Returns: Rev.ai JSON data structure

Canary Converter

Canary provider converter implementation.

This module provides conversion between Canary (NVIDIA NeMo) transcription format and WTF.

class wtf_transcript_converter.providers.canary.CanaryConverter(provider_name='canary', model_name='nvidia/canary-1b-v2')

Bases: BaseProviderConverter

Converter for Canary (NVIDIA NeMo) transcription format to/from WTF.

Parameters:

provider_name (str)
model_name (str)

provider_name: str = 'canary'

description: str = 'NVIDIA Canary speech recognition via Hugging Face'

status: str = 'Implemented'

__init__(provider_name='canary', model_name='nvidia/canary-1b-v2')

Parameters:

provider_name (str)
model_name (str)

transcribe_audio(audio_path, language='en')

Transcribe audio file using Canary model.

Parameters:

audio_path (str) – Path to audio file
language (str) – Language code (e.g., ‘en’, ‘es’, ‘fr’)

Return type:

Dict[str, Any]

Returns:

Canary transcription result

convert_to_wtf(canary_data)

Convert Canary JSON data to WTF format.

Parameters: canary_data (Dict[str, Any]) – Canary JSON data structure
Return type: WTFDocument
Returns: WTF document

convert_from_wtf(wtf_doc)

Convert WTF document to Canary JSON format.

Parameters: wtf_doc (WTFDocument) – WTF document
Return type: Dict[str, Any]
Returns: Canary JSON data structure

Parakeet Converter

Parakeet provider converter implementation.

This module provides conversion between Parakeet (NVIDIA NeMo) transcription format and WTF.

class wtf_transcript_converter.providers.parakeet.ParakeetConverter(provider_name='parakeet', model_name='nvidia/parakeet-tdt-0.6b-v3')

Bases: BaseProviderConverter

Converter for Parakeet (NVIDIA NeMo) transcription format to/from WTF.

Parameters:

provider_name (str)
model_name (str)

provider_name: str = 'parakeet'

description: str = 'NVIDIA Parakeet speech recognition via Hugging Face'

status: str = 'Implemented'

__init__(provider_name='parakeet', model_name='nvidia/parakeet-tdt-0.6b-v3')

Parameters:

provider_name (str)
model_name (str)

transcribe_audio(audio_path, language='en')

Transcribe audio file using Parakeet model.

Parameters:

audio_path (str) – Path to audio file
language (str) – Language code (e.g., ‘en’, ‘es’, ‘fr’)

Return type:

Dict[str, Any]

Returns:

Parakeet transcription result

convert_to_wtf(parakeet_data)

Convert Parakeet JSON data to WTF format.

Parameters: parakeet_data (Dict[str, Any]) – Parakeet JSON data structure
Return type: WTFDocument
Returns: WTF document

convert_from_wtf(wtf_doc)

Convert WTF document to Parakeet JSON format.

Parameters: wtf_doc (WTFDocument) – WTF document
Return type: Dict[str, Any]
Returns: Parakeet JSON data structure

Cross-Provider Testing

Consistency Testing

CrossProviderConsistencyTester

class wtf_transcript_converter.cross_provider.consistency.CrossProviderConsistencyTester

Test consistency across multiple transcription providers.

__init__()

test_consistency_with_sample_data(sample_data)

Test consistency across providers using sample JSON data.

Parameters: sample_data (Dict[str, Any]) – Sample transcription data in provider format
Return type: List[ConsistencyResult]
Returns: List of consistency results for each provider

analyze_consistency(results)

Analyze consistency across provider results.

Parameters: results (List[ConsistencyResult]) – List of consistency results
Return type: Dict[str, Any]
Returns: Analysis report

generate_consistency_report(results)

Generate a human-readable consistency report.

Parameters: results (List[ConsistencyResult])
Return type: str

Performance Benchmarking

PerformanceBenchmark

Quality Comparison

QualityComparator

Utilities

Confidence Utils

Confidence score utility functions for WTF transcript converter.

This module provides utilities for confidence score normalization and quality metrics.

wtf_transcript_converter.utils.confidence_utils.normalize_confidence(confidence, provider)

Normalize confidence scores to [0.0, 1.0] range based on provider.

Parameters:

confidence (float) – Raw confidence score
provider (str) – Provider name

Return type:

float

Returns:

Normalized confidence score [0.0, 1.0]
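Provider-aware normalization can be sketched as a lookup of per-provider rules followed by clamping. The provider keys below ("example_percent", "example_logprob") are invented placeholders, and nothing here reflects the scales the library actually assumes for any real provider.

```python
import math
from typing import Callable, Dict

# Hypothetical rules keyed by made-up provider names.
_RULES: Dict[str, Callable[[float], float]] = {
    "example_percent": lambda c: c / 100.0,    # scores reported as 0-100
    "example_logprob": lambda c: math.exp(c),  # scores reported as log-probabilities
}


def normalize_confidence_sketch(confidence: float, provider: str) -> float:
    """Map a raw score into [0.0, 1.0] using a provider-specific rule."""
    rule = _RULES.get(provider.lower(), lambda c: c)
    # Clamp so that out-of-range inputs never escape the target interval.
    return min(1.0, max(0.0, rule(confidence)))
```

Keeping the rules in a table makes it straightforward to add a provider without touching the clamping logic.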

wtf_transcript_converter.utils.confidence_utils.calculate_quality_metrics(confidences)

Calculate quality metrics from confidence scores.

Parameters: confidences (List[float]) – List of confidence scores
Return type: Dict[str, Any]
Returns: Dictionary of quality metrics
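As an illustration of deriving metrics from a list of scores, the sketch below computes two values whose names match fields on WTFQuality (average_confidence and low_confidence_words). The function name, the 0.5 threshold, and the choice of returned keys are assumptions; the actual function may report different metrics.

```python
from statistics import mean
from typing import Any, Dict, List


def quality_metrics_sketch(confidences: List[float], low_threshold: float = 0.5) -> Dict[str, Any]:
    """Summarize confidence scores; the threshold is a hypothetical default."""
    if not confidences:
        # No scores means no meaningful average.
        return {"average_confidence": None, "low_confidence_words": 0}
    return {
        "average_confidence": mean(confidences),
        "low_confidence_words": sum(1 for c in confidences if c < low_threshold),
    }
```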

Language Utils

Language utility functions for WTF transcript converter.

This module provides utilities for language code validation and normalization.

wtf_transcript_converter.utils.language_utils.is_valid_bcp47(language_code)

Validate BCP-47 language code format.

Parameters: language_code (str) – Language code to validate
Return type: bool
Returns: True if valid BCP-47 format

wtf_transcript_converter.utils.language_utils.normalize_language_code(language_code)

Normalize language code to standard format.

Parameters: language_code (str) – Language code to normalize
Return type: str
Returns: Normalized language code
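BCP-47 tags are conventionally written with a lowercase language, a title-case script, and an uppercase region. The sketch below applies that casing convention and also accepts underscores as separators; the function name and the underscore handling are assumptions, and the library's notion of "standard format" may differ.

```python
def normalize_language_code_sketch(language_code: str) -> str:
    """Apply conventional BCP-47 casing to a tag such as 'EN_us'."""
    parts = language_code.strip().replace("_", "-").split("-")
    normalized = [parts[0].lower()]  # primary language subtag
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            normalized.append(part.title())  # script, e.g. "Hant"
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            normalized.append(part.upper())  # region, e.g. "US" or "419"
        else:
            normalized.append(part.lower())  # anything else is lowercased
    return "-".join(normalized)
```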

Time Utils

Time utility functions for WTF transcript converter.

This module provides utilities for timestamp conversion and validation.

wtf_transcript_converter.utils.time_utils.convert_timestamp(timestamp)

Convert various timestamp formats to floating-point seconds.

Parameters: timestamp (Union[float, int, str]) – Timestamp in various formats
Return type: float
Returns: Timestamp as floating-point seconds
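The accepted input formats are not listed, so the following is only a sketch of one plausible interpretation: numbers are treated as seconds, and strings are parsed either as a plain number or as colon-separated clock time such as "HH:MM:SS.mmm". The name to_seconds_sketch and these format choices are assumptions.

```python
from typing import Union


def to_seconds_sketch(timestamp: Union[float, int, str]) -> float:
    """Convert a number or clock-style string to seconds."""
    if isinstance(timestamp, bool):
        raise TypeError("boolean is not a timestamp")
    if isinstance(timestamp, (int, float)):
        # Assumption: numeric input is already expressed in seconds.
        return float(timestamp)
    text = timestamp.strip()
    if ":" in text:
        # Fold "H:M:S" or "M:S" left to right, multiplying by 60 at each step.
        seconds = 0.0
        for part in text.split(":"):
            seconds = seconds * 60 + float(part)
        return seconds
    return float(text)
```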

wtf_transcript_converter.utils.time_utils.validate_timing(start, end)

Validate that timing is consistent (end > start).

Parameters:

start (float) – Start time in seconds
end (float) – End time in seconds

Return type:

bool

Returns:

True if timing is valid

wtf_transcript_converter.utils.time_utils.get_current_iso_timestamp()

Get current UTC timestamp in ISO 8601 format.

Return type: str
Returns: ISO 8601 timestamp string
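With the standard library, a timezone-aware UTC timestamp in ISO 8601 form can be produced as sketched below. The function name is hypothetical, and the exact output format of the library function (for example "Z" versus "+00:00") is not specified here.

```python
from datetime import datetime, timezone


def current_iso_timestamp_sketch() -> str:
    """Return the current UTC time, e.g. '2024-01-01T12:00:00.123456+00:00'."""
    return datetime.now(timezone.utc).isoformat()
```

Using an aware datetime keeps the UTC offset in the string, so the value can be parsed back without guessing the timezone.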

Command Line Interface

Main CLI

Command-line interface for vCon WTF.

This module provides the main CLI entry point for converting transcript formats and managing WTF documents.

Cross-Provider CLI

Cross-provider testing CLI commands.

This module provides CLI commands for testing consistency, performance, and quality across multiple transcription providers.

Exceptions

Custom exceptions for the WTF Transcript Converter library.

exception wtf_transcript_converter.exceptions.ConversionError(message, provider=None, original_error=None, context=None)

Bases: Exception

Raised when a conversion operation fails.

message: Error message describing what went wrong

provider: Name of the provider that caused the error

original_error: The original exception that caused this error

context: Additional context information about the error

Parameters:

message (str)
provider (Optional[str])
original_error (Optional[Exception])
context (Optional[Dict[str, Any]])

__init__(message, provider=None, original_error=None, context=None)

Parameters:

message (str)
provider (Optional[str])
original_error (Optional[Exception])
context (Optional[Dict[str, Any]])

__str__()

Return string representation of the error.

Return type: str
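The pattern of wrapping a low-level failure in a richer exception can be sketched as follows. ConversionErrorSketch mirrors the documented constructor arguments, but its __str__ format, the helper parse_duration, and the provider name "example" are all hypothetical.

```python
from typing import Any, Dict, Optional


class ConversionErrorSketch(Exception):
    """Stand-in exception carrying provider and cause information."""

    def __init__(
        self,
        message: str,
        provider: Optional[str] = None,
        original_error: Optional[Exception] = None,
        context: Optional[Dict[str, Any]] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.provider = provider
        self.original_error = original_error
        self.context = context or {}

    def __str__(self) -> str:
        # Assumed format: message, then optional provider and cause.
        parts = [self.message]
        if self.provider:
            parts.append(f"provider={self.provider}")
        if self.original_error is not None:
            parts.append(f"caused by {type(self.original_error).__name__}")
        return " | ".join(parts)


def parse_duration(raw: Dict[str, Any]) -> float:
    """Hypothetical helper that wraps parsing failures in the sketch exception."""
    try:
        return float(raw["duration"])
    except (KeyError, TypeError, ValueError) as exc:
        raise ConversionErrorSketch(
            "could not read duration",
            provider="example",
            original_error=exc,
            context={"raw": raw},
        ) from exc
```

Chaining with "from exc" preserves the original traceback, while the attributes give callers structured access to the same information.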

exception wtf_transcript_converter.exceptions.ValidationError(message, field=None, value=None, errors=None)

Bases: Exception

Raised when validation of WTF data fails.

message: Error message describing the validation failure

field: The field that failed validation

value: The value that failed validation

errors: List of specific validation errors

Parameters:

message (str)
field (Optional[str])
value (Optional[Any])
errors (Optional[list[str]])

__init__(message, field=None, value=None, errors=None)

Parameters:

message (str)
field (Optional[str])
value (Optional[Any])
errors (Optional[list[str]])

__str__()

Return string representation of the validation error.

Return type: str

exception wtf_transcript_converter.exceptions.ProviderError(message, provider, operation=None, status_code=None, response_data=None)

Bases: Exception

Raised when a provider-specific operation fails.

message: Error message describing the provider error

provider: Name of the provider that caused the error

operation: The operation that failed

status_code: HTTP status code if applicable

response_data: Response data from the provider if applicable

Parameters:

message (str)
provider (str)
operation (Optional[str])
status_code (Optional[int])
response_data (Optional[Dict[str, Any]])

__init__(message, provider, operation=None, status_code=None, response_data=None)

Parameters:

message (str)
provider (str)
operation (Optional[str])
status_code (Optional[int])
response_data (Optional[Dict[str, Any]])

__str__()

Return string representation of the provider error.

Return type: str

exception wtf_transcript_converter.exceptions.ConfigurationError(message, setting=None, value=None)

Bases: Exception

Raised when there’s a configuration issue.

message: Error message describing the configuration issue

setting: The configuration setting that caused the error

value: The invalid value

Parameters:

message (str)
setting (Optional[str])
value (Optional[Any])

__init__(message, setting=None, value=None)

Parameters:

message (str)
setting (Optional[str])
value (Optional[Any])

__str__()

Return string representation of the configuration error.

Return type: str

exception wtf_transcript_converter.exceptions.AudioProcessingError(message, file_path=None, format=None, original_error=None)

Bases: Exception

Raised when audio processing fails.

message: Error message describing the audio processing failure

file_path: Path to the audio file that caused the error

format: Audio format that caused the error

original_error: The original exception that caused this error

Parameters:

message (str)
file_path (Optional[str])
format (Optional[str])
original_error (Optional[Exception])

__init__(message, file_path=None, format=None, original_error=None)

Parameters:

message (str)
file_path (Optional[str])
format (Optional[str])
original_error (Optional[Exception])

__str__()

Return string representation of the audio processing error.

Return type: str
