SpiderAI Speech to Text

Overview

The OpenAI Speech to Text app in SpiderAI allows users to upload audio files and produce accurate transcriptions using OpenAI’s Whisper model. The tool supports multiple audio formats, language selection, optional prompts, formatted output, and downloadable text files. This article explains how to install the Speech to Text app, how to use it, and how to refine transcription results.

Details / Instructions

 

Installing the Speech to Text App (First-Time Setup)

Before using the Speech to Text tool, you must add it to your SpiderAI workspace.

  1. From the top navigation bar, select My Apps.

  2. Click the plus (+) icon to browse the SpiderAI App Store.

  3. Locate OpenAI Speech to Text in the store.

  4. Select it and click Install

 

After installation, the app will appear in your My Apps list


Supported Audio Formats

You may upload any of the following formats:

  • mp3, mp4, mpeg, mpga, m4a, wav, webm

The app displays the file name and size after upload.


Transcribing an Audio File

Upload Audio

  • Click Upload an audio file.

  • Select a supported file from your computer.

 

Adjust Optional Sidebar Settings

  • Model

    • Currently fixed to Whisper-1, the latest OpenAI speech-to-text model.

  • Language Settings

    • Expand Language Settings to optionally select the language spoken in the audio.

    • Leave blank to enable auto-detection.

    • A wide selection of languages is supported.

  • Prompt (Optional)

    • Provide contextual information to improve accuracy, such as:

      • “This audio is an interview between two speakers.”

      • “Background noise is present; prioritize clarity.”

  • Reset Button

    • Clears previously uploaded files and transcripts.


Step 3: Transcribe

  • Click Transcribe Audio.

  • The app will process the file and display a success message when complete.

If errors occur, the system logs them and displays an error reference ID.


Viewing and Saving the Transcript

View Transcript

The transcription appears in a text area under Transcription Result.

Download Transcript

Use Download Transcription to save a .txt file of the raw output.


Formatting the Transcript (Optional)

To improve readability:

  1. Click Format this transcription.

  2. The app will process the transcription text and stream a polished version with:

    • Proper punctuation and capitalization

    • Paragraph breaks

    • Corrections to obvious transcription errors

  3. Download the improved version as a .txt file.


Tips for Best Results

  • Select the correct language for multilingual audio.

  • Provide context in the optional prompt to reduce ambiguity.

  • Ensure audio is clear to maximize Whisper accuracy.

  • Use the formatting feature for professional-quality output.

 

See Also