FAQs, Recording Tips & Tricks

Pricing & Plans

FREE Plan

Perfect for users who want to try Auribus with essential features.

  • Available voices: 6
    Get access to six high-quality AI-generated voices, allowing you to explore different vocal tones for your projects.
  • Upload time: 3 minutes
    You can upload up to a total of three minutes of audio. This is ideal for short clips, testing voice swaps, or experimenting with the platform.
  • Normal conversion speed
    Standard processing time is applied, meaning your files may take a bit longer to convert compared to paid tiers.
  • Storage (1 month)
    Your converted files will be stored for up to one month before being automatically deleted.
  • Rollover (no)
    Unused upload time does not carry over to the next month.
  • Plugin version for your DAW
    Access the Auribus plugin for your Digital Audio Workstation (DAW), allowing you to experiment with voice swapping directly inside your music production software.
STANDARD Plan

Designed for creators who need more flexibility and voice options.

  • Available voices: 12
    Unlock access to a broader selection of voices, including more unique vocal styles and tones.
  • Upload time: 30 minutes
    Upload and process up to 30 minutes of audio per month, providing greater flexibility for longer recordings and projects.
  • Rollover minutes (capped at three months)
    Any unused upload time can roll over for up to three months, ensuring you don’t lose minutes if you haven’t used your full allowance.
  • Fast conversion speed
    Your audio will be processed more quickly than in the Free tier, reducing wait times.
  • Storage (3 months)
    Converted files will be stored for up to three months, giving you more time to access, download, or continue working with your recordings.
  • Custom voices
    Gain access to custom voice models that allow you to personalize or fine-tune voice characteristics for your needs. Submit your unprocessed audio and get your results back for personal use. You can request up to 3 custom voices with this plan.
  • Plugin version for your DAW
    Seamlessly integrate Auribus with your DAW for real-time voice conversion without needing to upload/export manually.
PREMIUM Plan

For professionals who need advanced voice capabilities and extended storage.

  • Available voices: 19
    Get access to the full library of voices, offering the widest variety of tones, genders, and vocal styles.
  • Upload time: 100 minutes
    Process up to 100 minutes of audio per month, ideal for artists, producers, and creators working on full-length projects.
  • Rollover minutes (capped at three months)
    Unused upload time can roll over for up to three months, allowing for flexibility in case of varying workloads.
  • Fast conversion speed
    Enjoy significantly faster processing, allowing you to generate voice-swapped audio with minimal waiting time.
  • Storage (6 months)
    Your converted files will be stored for up to six months, giving you ample time to revisit and manage your audio.
  • Custom voices (singing and spoken word)
    Unlock the ability to create custom voices for both spoken and singing applications, offering a more tailored and professional experience. Submit your unprocessed audio and get your results back for personal use. You can request up to 6 custom voices with this plan.
  • Plugin version for your DAW
    Use the Auribus plugin for effortless voice swapping and previewing directly inside your DAW.
ENTERPRISE Plan

Tailored for businesses, studios, and large-scale production teams.

  • Available voices: 19
    Gain access to the full range of voices, including exclusive high-end vocal models.
  • Upload time: unlimited
    No restrictions on upload time — process as much audio as needed without monthly limitations.
  • Rollover minutes (capped at three months)
    Even with unlimited uploads, rollover minutes apply for specific usage cases to optimize workflow.
  • Fast conversion speed
    The fastest processing speeds available, ensuring immediate results for high-volume projects.
  • Storage (duration of contract)
    Files will be stored securely for the entire duration of the contract, giving enterprise clients long-term access to their audio.
  • Custom voices (singing and spoken word)
    Work with exclusive custom voice models for spoken word and singing, ensuring premium quality for professional applications. Submit your unprocessed audio and get your results back for personal use.
  • Plugin version for your DAW
    Get full access to the DAW plugin for seamless voice integration and workflow efficiency.
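
To compare the tiers at a glance, the figures above can be put into a small lookup. This is only an illustrative sketch: the `Plan` class and the `smallest_plan` helper are hypothetical names, not part of any Auribus tool, and the numbers are copied from the plan descriptions on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    voices: int
    upload_minutes: Optional[float]  # None means unlimited
    custom_voices: bool


# Figures taken from the plan descriptions above, smallest tier first.
# Note: the FREE allowance is three minutes in total, not per month.
PLANS = [
    Plan("FREE", 6, 3, False),
    Plan("STANDARD", 12, 30, True),
    Plan("PREMIUM", 19, 100, True),
    Plan("ENTERPRISE", 19, None, True),
]


def smallest_plan(minutes_needed: float,
                  voices_needed: int = 1,
                  needs_custom_voice: bool = False) -> Plan:
    """Return the first (smallest) plan that covers the stated needs."""
    for plan in PLANS:
        enough_minutes = (plan.upload_minutes is None
                          or minutes_needed <= plan.upload_minutes)
        enough_voices = voices_needed <= plan.voices
        custom_ok = plan.custom_voices or not needs_custom_voice
        if enough_minutes and enough_voices and custom_ok:
            return plan
    raise ValueError("no plan covers these requirements")


print(smallest_plan(45).name)
```

For example, 45 minutes of audio exceeds the STANDARD allowance, so the sketch selects PREMIUM. Rollover minutes are deliberately left out, since the exact rollover rules are not spelled out here.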

How To Get The Best Sound

Recording a "dry" vocal track (no reverb, delay, or other effects) that is also "clean" (free of background noise) is essential when swapping your voice with an AI voice, for several important reasons:
Training Data Consistency

AI models are often trained on datasets that include clean and unaltered vocal recordings. Providing a dry recording ensures consistency with the type of data the AI has been exposed to during training, improving the model's ability to generate realistic voices.

Adaptable to Various Styles

A dry recording allows the AI to adapt to different styles and contexts. Without pre-existing effects, the synthesized voice can be more versatile, suitable for a range of applications from casual conversations to professional presentations.

Avoiding Unwanted Artifacts

Applying effects during recording, such as reverb or echo, may introduce artifacts that the AI model might not handle well. A clean recording minimizes the risk of unintended distortions or anomalies in the AI-generated voice.

Customizable Post-Processing

Starting with a dry recording gives you the freedom to apply and customize effects during the post-processing stage. You can experiment with various effects, adjust pitch, or fine-tune the voice to achieve the desired result without being locked into the choices made during recording.

Consistent Quality

Clean recordings generally result in higher audio quality. This ensures that the AI-generated voice maintains a consistent level of clarity and fidelity, contributing to a more natural and engaging outcome.

Noise Control

Recording in a dry environment helps control unwanted background noise. AI models may struggle to distinguish between the intended voice and background noise in recordings with excessive reverberation or environmental sounds.

Compatibility Across Platforms

A clean recording is more likely to be compatible with a wide range of platforms, applications, and devices. It ensures that the synthesized voice seamlessly integrates into different contexts without sounding out of place.

Better Artistic Control

A dry recording provides greater artistic control during the voice-swapping process. You can shape the final result based on your creative vision, applying effects strategically and refining the voice to suit the specific requirements of your project.

Pick The Right Artist

If you're keen on harmonizing your voice with AI vocals, understanding your vocal range is a fundamental step. Here's a more in-depth guide:
Exploration of Low and High Notes

Begin by systematically exploring your vocal range, singing from your lowest comfortable note to your highest. Take note of where your voice feels most at ease and where you encounter challenges.

Identification of Key Characteristics

Pay attention to the key characteristics of your voice. Are you effortlessly reaching low bass notes, or do you find a natural resonance in the higher octaves? These qualities will play a significant role in defining your vocal range.

Categorization of Vocal Range

Broadly, vocal ranges are categorized into Soprano (female), Alto (female), Tenor (male), and Bass (male). Additionally, there are intermediate categories like Mezzo-Soprano and Baritone. Reflect on where your voice aligns within these classifications.

Experimentation with Genres

Engage in singing across various genres and styles. Determine if your voice excels in soulful, low-toned renditions or effortlessly reaches into the higher registers. This exploration will provide insights into the strengths of your vocal abilities.

Use Online Tools

Leverage online vocal range tools or applications that guide you through exercises designed to help pinpoint your vocal range more precisely. These tools often offer valuable insights into the nuances of your voice.

Recognition of Vocal Breaks

Identify any vocal breaks or transitions in your voice. Understanding where your voice naturally shifts in pitch contributes to a more nuanced comprehension of your vocal range.

Practice Across Different Ranges

Delve into singing exercises that traverse various ranges. This practice not only refines your vocal technique but also aids in discovering where you feel most comfortable and confident as a vocalist.

Record In A Studio

Microphone Selection

Choose a high-quality microphone suitable for voice recording. Condenser microphones are commonly used for studio vocals.

Room Acoustics

Select a quiet room with minimal background noise. Use soft furnishings (curtains, carpets) to reduce echoes and improve sound absorption.

Microphone Placement

Position the microphone at mouth level, a few inches away. Experiment with angles to find the best sound capture.

Pop Filter and Windscreen

Attach a pop filter to reduce plosive sounds ("p" and "b" sounds). Consider using a windscreen to minimize breath noise.

Audio Interface

Connect the microphone to a dedicated audio interface for better sound quality.

Headphones

Use closed-back headphones to monitor your recording without bleed into the microphone.

Recording Software

Use a digital audio workstation (DAW) for recording and editing. Audacity, GarageBand, or professional software like Logic Pro and Pro Tools are good choices.

Levels and Gain

Set appropriate input levels to avoid clipping. Adjust gain on the audio interface.
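
One way to check levels after a test take is to measure the peak in dBFS, where 0 dBFS is the loudest value the format can hold. The sketch below assumes the samples have already been loaded as floats between -1.0 and 1.0; `peak_dbfs` and `is_clipping` are hypothetical helper names used only for illustration.

```python
import math


def peak_dbfs(samples):
    """Peak level in dBFS for float samples in the range -1.0..1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")  # pure silence has no finite level
    return 20.0 * math.log10(peak)


def is_clipping(samples, ceiling=1.0):
    """True if any sample reaches or exceeds full scale."""
    return any(abs(s) >= ceiling for s in samples)


take = [0.02, -0.31, 0.5, -0.12]
print(round(peak_dbfs(take), 2), is_clipping(take))
```

If `is_clipping` reports True, lower the gain on the interface and record again; a clipped take cannot be repaired afterwards.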

Test Recording

Record a short test to check levels, clarity, and any background noise.

Save in High Quality

Save recordings in a high-quality format like WAV or FLAC.

Record With A Laptop Or Phone

Laptop's Built-In Microphone
Quiet Environment

Find a quiet space with minimal background noise. Close windows and doors to reduce external sounds.

Optimal Distance

Position yourself about 6-12 inches (15-30 cm) away from the laptop's microphone. Experiment with the distance to find a balance between clarity and avoiding distortion.

Recording Software

Use built-in recording software on your laptop or free applications like Audacity. These tools are user-friendly and provide basic recording functionality.

Pop Filter (Optional)

If possible, use a DIY pop filter made from materials like a sock or pantyhose stretched over a frame. This helps reduce plosive sounds (like "p" and "b").

Stable Surface

Place your laptop on a stable surface to minimize handling noise. Avoid tapping on the laptop while recording.

Test Recording

Perform a short test recording to check levels and ensure the microphone is capturing your voice clearly.

No Effects

Record without adding effects like reverb or EQ during the recording process. Effects can be applied later if needed.

Save In A Common Format

Save your recordings in a common format such as MP3 or WAV for compatibility with various applications.

Phone's Built-In Microphone
Quiet Environment

Choose a quiet space free from background noise. Turn off notifications on your phone to avoid interruptions.

Phone Placement

Hold the phone at a consistent distance from your mouth (around 6-12 inches or 15-30 cm). Experiment to find the optimal distance.

Recording App

Use the built-in voice recorder app on your phone or download a free recording app. Many smartphones have quality built-in microphones.

Stable Grip

Hold the phone steadily or use a makeshift stand to avoid handling noise. Steady recordings improve overall quality.

Test Recording

Perform a brief test recording to ensure the microphone is picking up your voice clearly. Adjust the distance if needed.

Avoid Environmental Noise

Be mindful of potential environmental noise, and try to record during quieter times if possible.

No Effects

Record without applying effects during the recording process. Post-production can be used for any necessary adjustments.

Save And Transfer

Save your recordings on your phone and transfer them to your computer if you plan to do further editing.

What Do You Get From Us

Synthesized Voice

The primary result is a synthetic or computer-generated voice that mimics the characteristics of the input voice. The goal is to create a voice that sounds natural and is consistent with the qualities of the original voice.

Pitch and Tone Adjustment

AI voice conversion often involves adjustments to pitch and tone. You might get a voice that is pitched higher or lower than the original, depending on the desired effect or application.

Prosody Modification

Prosody includes elements like rhythm, intonation, and stress in speech. AI can modify these aspects to create a more natural and expressive synthetic voice.

Language and Accent Adaptation

Some AI models can adapt the voice to different languages or accents. This is useful for applications such as multilingual voice assistants.

High-Quality Audio Output

Advanced AI models strive to produce high-quality audio output that closely resembles natural human speech. This includes minimizing artifacts or distortions in the synthesized voice.

Natural Flow and Fluency

Ideally, the converted voice should exhibit a natural flow and fluency in speech, ensuring that it sounds human-like and is easy to understand.

Sample Rate 48 kHz

48 kHz is a standard sample rate for digital audio, commonly used in film, television, video production, and other multimedia work.

48 kHz is preferred in professional audio production, particularly in contexts where synchronization with video is crucial. It provides a balance between high-quality audio capture and compatibility with various audio and video production systems.
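
If you want to confirm the sample rate of a WAV file yourself, Python's standard `wave` module can read it from the file header. The sketch below first writes a one-second test tone so there is something to inspect; the helper names `write_test_tone` and `wav_info` and the file name `tone.wav` are made up for this example.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, sample_rate=48000, seconds=1.0, freq_hz=440.0):
    """Write a mono 16-bit sine tone at half of full scale."""
    n_frames = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(0.5 * 32767 * math.sin(
            2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


def wav_info(path):
    """Return (sample rate in Hz, duration in seconds) of a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        n_frames = wav.getnframes()
    return rate, n_frames / rate


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tone.wav")
    write_test_tone(path)
    rate, duration = wav_info(path)

print(rate, duration)
```

Running `wav_info` on a downloaded result is a quick way to see whether your project needs to be set to 48 kHz before importing the file.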

Transpositions t0, t6, t8, t12, and -t12

t0 (No Transposition)

Explanation: t0 means no transposition. The original pitch remains unchanged.
Usefulness: Keeping the original pitch is useful when you want to preserve the natural key of a piece or maintain the original tonal quality of a sound.

t6 (Transposition Up by Six Semitones)

Explanation: t6 means transposing up by six semitones, which is a tritone (half an octave). A perfect fifth would be seven semitones.
Usefulness: Transposing up by six semitones often creates a brighter, more energetic sound. It lifts the part into a higher register while staying within an octave of the original.

t8 (Transposition Up by Eight Semitones)

Explanation: t8 means transposing up by eight semitones, which is a minor sixth. A full octave would be twelve semitones.
Usefulness: Transposing up by eight semitones raises the frequency by a factor of about 1.59, creating a clearly higher-pitched version of the original. It's useful for adding variety, creating harmonies, or achieving a different vocal or instrumental range.

t12 (Transposition Up by Twelve Semitones)

Explanation: t12 means transposing up by twelve semitones, which is one octave.
Usefulness: Transposing up by an octave doubles the frequency, resulting in a sound that is significantly higher than the original. This is used for creative effects, sound design, or when you want to explore the upper range of an instrument or voice.

-t12 (Transposition Down by Twelve Semitones)

Explanation: -t12 means transposing down by twelve semitones, which is one octave.
Usefulness: Transposing down by an octave halves the frequency, resulting in a much lower pitch. This is useful for deepening the tone of a sound or exploring the lower register of an instrument or voice.
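
The relationship between semitones and frequency can be checked with a few lines of Python. This sketch assumes standard twelve-tone equal temperament, where each semitone multiplies the frequency by the twelfth root of two, and it assumes the number in each setting name is a semitone count; the function names are illustrative only.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def transpose_hz(freq_hz: float, semitones: float) -> float:
    """Frequency after transposing freq_hz by the given semitones."""
    return freq_hz * semitone_ratio(semitones)


# Show what each setting does to A4 (440 Hz).
for label, shift in [("t0", 0), ("t6", 6), ("t8", 8), ("t12", 12), ("-t12", -12)]:
    ratio = semitone_ratio(shift)
    print(f"{label:>4}: x{ratio:.3f}  440.0 Hz -> {transpose_hz(440.0, shift):.1f} Hz")
```

Twelve semitones gives a ratio of exactly 2 (one octave up, 440 Hz to 880 Hz), and minus twelve gives exactly 0.5 (one octave down, 440 Hz to 220 Hz).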

General Recording Tips

The goal is to have fun and experiment with your recordings. The AI will do its best to swap your voice, and the more diverse and clear your recordings are, the better the results will be. Consider exploring different tones, octaves, and styles to provide the AI with a diverse range of inputs.
Record in a Quiet Place

Find a quiet room to record your voice. Less background noise makes the swapping process work better.

No Special Effects

Record your voice without adding effects like echo or reverb. Keep it simple and natural.

Speak Clearly

Speak clearly and at a normal pace. This helps the AI understand and replicate your voice accurately.

Record Different Pitches

Try recording your voice at different pitches—higher and lower. This helps the AI mimic a wider range of tones.

Record Different Styles

Record in different styles—casual, formal, happy, or serious. This gives the AI more variety to work with.

Use Quality Recording Tools

If possible, use a decent microphone. It doesn't have to be fancy, just clear enough for the AI to understand.

Keep Recordings Short

Short recordings are easier for the AI to process. Aim for a few sentences or phrases at a time.

No Need for Musical Perfection

If you're not a singer, don't worry about hitting perfect notes. The AI can still work with your natural voice.

Save Recordings in Good Quality

Save your recordings in a high-quality format, preferably WAV. If you use MP3, choose a high bitrate, since MP3 compression discards some audio detail. This ensures the AI gets the best input.

Have Fun and Experiment

Don't be afraid to play around! Record different things, try different voices, and see what the AI can do.

Natural Tone

Record your voice in its natural tone, the way you would speak in everyday conversations. This forms the baseline for the AI to understand your usual voice characteristics.

Higher Pitch

Experiment with a slightly higher pitch. This can be useful for creating variations in the swapped voice, especially if you want a more energetic or animated result.

Lower Pitch

Record in a lower pitch to explore a deeper, more resonant sound. This can be effective for a serious or authoritative tone.

Casual and Relaxed

Capture your voice in a casual and relaxed style. This can be great for creating friendly and approachable AI-generated voices.

Formal or Professional

Record with a more formal or professional tone. This might be suitable for applications where a polished and refined voice is desired.

Excited or Happy

Express excitement or happiness in your recordings. This can add a positive and lively quality to the AI-generated voice.

Serious or Calm

Try a serious or calm tone. This can be useful for scenarios where a composed and steady voice is preferred.

Narrative Style

Record in a narrative style, as if telling a story. This helps the AI capture the nuances of storytelling and adds a storytelling flair to the swapped voice.

Humorous or Playful

Experiment with a humorous or playful tone. This can be enjoyable for applications that involve humor or playfulness.

Different Octaves

Explore different octaves within your vocal range. This allows the AI to adapt to various pitch levels and create a more versatile swapped voice.

Expressive Variations

Incorporate variations in expression—try recordings with emphasis, pauses, or different inflections. This provides the AI with a better understanding of your expressive range.

Whispering or Soft Spoken

Record in a whispering or soft-spoken style. This can be useful for creating more intimate or ASMR-like AI-generated voices.

How Does It Work

AI-based voice swapping involves using advanced algorithms and machine learning to offer a voice conversion solution. The service typically requires users to provide voice samples, allowing the AI model to learn and adapt to their unique vocal characteristics. Once trained, the service can transform voices, enabling users to swap their voice with another while retaining the individuality and nuances of their original speech patterns. It's a technology-driven process designed to provide users with a customizable and personalized voice experience.

Vocal Ranges Cheat Sheet

Soprano

High-pitched and often associated with female singers. Think of those soaring high notes in opera or pop diva performances.

Frequency Range: Approximately 261 Hz to 1,047 Hz (C4 to C6).

Alto

A lower female voice, warm and rich. Common in choral music and soulful genres.

Frequency Range: Approximately 174 Hz to 698 Hz (F3 to F5).

Tenor

High male voice, the heartthrob range. Often takes the lead in vocal arrangements.

Frequency Range: Approximately 131 Hz to 523 Hz (C3 to C5).

Baritone

The middle ground for male voices. Not too high, not too low. Versatile and common in various music styles.

Frequency Range: Approximately 98 Hz to 392 Hz (G2 to G4).

Bass

The lowest male voice, deep and resonant. Provides the foundation in many vocal groups.

Frequency Range: Approximately 65 Hz to 261 Hz (C2 to C4).

Contralto

Rare and low-pitched female voice. Known for its depth and richness.

Frequency Range: Approximately 87 Hz to 349 Hz (F2 to F4).

Countertenor

A male singing in the alto or soprano range using falsetto or a special vocal technique.

Frequency Range: Similar to alto or soprano, depending on the individual.

Mezzo-Soprano

A middle-range female voice, versatile and found in a variety of genres.

Frequency Range: Approximately 196 Hz to 784 Hz (G3 to G5).
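
The frequencies in this cheat sheet follow from the note names. The sketch below converts a note name to Hz, assuming equal temperament with A4 tuned to 440 Hz, and lists which of the ranges above fully contain a singer's lowest and highest comfortable notes. The helper names and the example notes are hypothetical; the range limits are copied from the cheat sheet.

```python
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(note: str, a4_hz: float = 440.0) -> float:
    """Convert a name like 'C4', 'F#3' or 'Bb2' to a frequency in Hz."""
    letter, rest = note[0].upper(), note[1:]
    semitone = NOTE_OFFSETS[letter]
    if rest.startswith("#"):
        semitone, rest = semitone + 1, rest[1:]
    elif rest.startswith("b"):
        semitone, rest = semitone - 1, rest[1:]
    midi = 12 * (int(rest) + 1) + semitone  # MIDI numbering: C4 = 60, A4 = 69
    return a4_hz * 2.0 ** ((midi - 69) / 12.0)


# Range limits as listed in the cheat sheet above.
RANGES = {
    "Soprano": ("C4", "C6"),
    "Mezzo-Soprano": ("G3", "G5"),
    "Alto": ("F3", "F5"),
    "Contralto": ("F2", "F4"),
    "Tenor": ("C3", "C5"),
    "Baritone": ("G2", "G4"),
    "Bass": ("C2", "C4"),
}


def matching_ranges(lowest: str, highest: str):
    """Names of the ranges that contain both the lowest and highest note."""
    lo, hi = note_to_hz(lowest), note_to_hz(highest)
    return [name for name, (a, b) in RANGES.items()
            if note_to_hz(a) <= lo and hi <= note_to_hz(b)]


print(round(note_to_hz("C4"), 2))
print(matching_ranges("A2", "E4"))
```

A singer comfortable from A2 to E4 fits inside more than one listed range, so treat the result as a starting point for choosing a voice rather than a strict classification.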