FAQs, Recording Tips & Tricks
Pricing & Plans
Perfect for users who want to try Auribus with essential features.
- Available voices: 6
Get access to six high-quality AI-generated voices, allowing you to explore different vocal tones for your projects.
- Upload time: 3 minutes
You can upload up to a total of three minutes of audio. This is ideal for short clips, testing voice swaps, or experimenting with the platform.
- Normal conversion speed
Standard processing time is applied, meaning your files may take a bit longer to convert compared to paid tiers.
- Storage (1 month)
Your converted files will be stored for up to one month before being automatically deleted.
- Rollover (no)
Unused upload time does not carry over to the next month.
- Plugin version for your DAW
Access the Auribus plugin for your Digital Audio Workstation (DAW), allowing you to experiment with voice swapping directly inside your music production software.
Designed for creators who need more flexibility and voice options.
- Available voices: 12
Unlock access to a broader selection of voices, including more unique vocal styles and tones.
- Upload time: 30 minutes
Upload and process up to 30 minutes of audio per month, providing greater flexibility for longer recordings and projects.
- Rollover minutes (capped at three months)
Any unused upload time can roll over for up to three months, ensuring you don’t lose minutes if you haven’t used your full allowance.
- Fast conversion speed
Your audio will be processed more quickly than in the Free tier, reducing wait times.
- Storage (3 months)
Converted files will be stored for up to three months, giving you more time to access, download, or continue working with your recordings.
- Custom voices
Gain access to custom voice models that allow you to personalize or fine-tune voice characteristics for your needs. Submit your unprocessed audio and get your results back for personal use. You can request up to 3 with this plan.
- Plugin version for your DAW
Seamlessly integrate Auribus with your DAW for real-time voice conversion without needing to upload/export manually.
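The plan descriptions say unused upload time rolls over for up to three months but do not spell out the accounting. As a rough illustration only, the sketch below assumes that a month's unused minutes stay usable for three further months and that the oldest minutes are spent first. The function name and both rules are assumptions, not Auribus's documented behavior.

```python
def simulate_rollover(monthly_allowance, usage_by_month, max_rollover_months=3):
    """Return the minutes available at the start of each month.

    Assumptions (not confirmed by the plan descriptions): unused minutes
    from a month stay usable for `max_rollover_months` further months,
    and the oldest minutes are spent first.
    """
    buckets = []  # [age_in_months, minutes_left], oldest first
    available_by_month = []
    for used in usage_by_month:
        # Age last month's leftovers and drop anything empty or expired.
        buckets = [
            [age + 1, minutes]
            for age, minutes in buckets
            if minutes > 0 and age + 1 <= max_rollover_months
        ]
        buckets.append([0, monthly_allowance])  # this month's fresh allowance
        available = sum(minutes for _, minutes in buckets)
        available_by_month.append(available)
        # Spend the requested minutes, oldest bucket first.
        to_spend = min(used, available)
        for bucket in buckets:
            spent = min(bucket[1], to_spend)
            bucket[1] -= spent
            to_spend -= spent
    return available_by_month
```

Under these assumptions, a 30-minute allowance with no usage at all would grow to 120 available minutes by the fourth month and then stay there, since the oldest month's minutes expire as each new allowance arrives.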
For professionals who need advanced voice capabilities and extended storage.
- Available voices: 19
Get access to the full library of voices, offering the widest variety of tones, genders, and vocal styles.
- Upload time: 100 minutes
Process up to 100 minutes of audio per month, ideal for artists, producers, and creators working on full-length projects.
- Rollover minutes (capped at three months)
Unused upload time can roll over for up to three months, allowing for flexibility in case of varying workloads.
- Fast conversion speed
Enjoy significantly faster processing, allowing you to generate voice-swapped audio with minimal waiting time.
- Storage (6 months)
Your converted files will be stored for up to six months, giving you ample time to revisit and manage your audio.
- Custom voices (singing and spoken word)
Unlock the ability to create custom voices for both spoken and singing applications, offering a more tailored and professional experience. Submit your unprocessed audio and get your results back for personal use. You can request up to 6 with this plan.
- Plugin version for your DAW
Use the Auribus plugin for effortless voice swapping and previewing directly inside your DAW.
Tailored for businesses, studios, and large-scale production teams.
- Available voices: 19
Gain access to the full range of voices, including exclusive high-end vocal models.
- Upload time: unlimited
No restrictions on upload time: process as much audio as needed without monthly limitations.
- Rollover minutes (capped at three months)
Even with unlimited uploads, rollover minutes apply for specific usage cases to optimize workflow.
- Fast conversion speed
The fastest processing speeds available, ensuring immediate results for high-volume projects.
- Storage (duration of contract)
Files will be stored securely for the entire duration of the contract, giving enterprise clients long-term access to their audio.
- Custom voices (singing and spoken word)
Work with exclusive custom voice models for spoken word and singing, ensuring premium quality for professional applications. Submit your unprocessed audio and get your results back for personal use.
- Plugin version for your DAW
Get full access to the DAW plugin for seamless voice integration and workflow efficiency.
How to Get the best sound
Recording a "dry" and "clean" vocal track is essential when swapping your voice with an AI voice, for several important reasons:
AI models are often trained on datasets that include clean and unaltered vocal recordings. Providing a dry recording ensures consistency with the type of data the AI has been exposed to during training, improving the model's ability to generate realistic voices.
A dry recording allows the AI to adapt to different styles and contexts. Without pre-existing effects, the synthesized voice can be more versatile, suitable for a range of applications from casual conversations to professional presentations.
Applying effects during recording, such as reverb or echo, may introduce artifacts that the AI model might not handle well. A clean recording minimizes the risk of unintended distortions or anomalies in the AI-generated voice.
Starting with a dry recording gives you the freedom to apply and customize effects during the post-processing stage. You can experiment with various effects, adjust pitch, or fine-tune the voice to achieve the desired result without being locked into the choices made during recording.
Clean recordings generally result in higher audio quality. This ensures that the AI-generated voice maintains a consistent level of clarity and fidelity, contributing to a more natural and engaging outcome.
Recording in a dry environment helps control unwanted background noise. AI models may struggle to distinguish between the intended voice and background noise in recordings with excessive reverberation or environmental sounds.
A clean recording is more likely to be compatible with a wide range of platforms, applications, and devices. It ensures that the synthesized voice seamlessly integrates into different contexts without sounding out of place.
A dry recording provides greater artistic control during the voice-swapping process. You can shape the final result based on your creative vision, applying effects strategically and refining the voice to suit the specific requirements of your project.
Pick The Right Artist
If you're keen on harmonizing your voice with AI vocals, understanding your vocal range is a fundamental step. Here's a more in-depth guide.
Begin by systematically exploring your vocal range, singing from your lowest comfortable note to your highest. Take note of where your voice feels most at ease and where you encounter challenges.
Pay attention to the key characteristics of your voice. Are you effortlessly reaching low bass notes, or do you find a natural resonance in the higher octaves? These qualities will play a significant role in defining your vocal range.
Broadly, vocal ranges are categorized into Soprano (female), Alto (female), Tenor (male), and Bass (male). Additionally, there are intermediate categories like Mezzo-Soprano and Baritone. Reflect on where your voice aligns within these classifications.
Engage in singing across various genres and styles. Determine if your voice excels in soulful, low-toned renditions or effortlessly reaches into the higher registers. This exploration will provide insights into the strengths of your vocal abilities.
Leverage online vocal range tools or applications that guide you through exercises designed to help pinpoint your vocal range more precisely. These tools often offer valuable insights into the nuances of your voice.
Identify any vocal breaks or transitions in your voice. Understanding where your voice naturally shifts in pitch contributes to a more nuanced comprehension of your vocal range.
Delve into singing exercises that traverse various ranges. This practice not only refines your vocal technique but also aids in discovering where you feel most comfortable and confident as a vocalist.
Record In A Studio
Choose a high-quality microphone suitable for voice recording. Condenser microphones are commonly used for studio vocals.
Select a quiet room with minimal background noise. Use soft furnishings (curtains, carpets) to reduce echoes and improve sound absorption.
Position the microphone at mouth level, a few inches away. Experiment with angles to find the best sound capture.
Attach a pop filter to reduce plosive sounds ("p" and "b" sounds). Consider using a windscreen to minimize breath noise.
Connect the microphone to a dedicated audio interface for better sound quality.
Use closed-back headphones to monitor your recording without bleed into the microphone.
Use a digital audio workstation (DAW) for recording and editing. Audacity, GarageBand, or professional software like Logic Pro and Pro Tools are good choices.
Set appropriate input levels to avoid clipping. Adjust gain on the audio interface.
Record a short test to check levels, clarity, and any background noise.
Save recordings in a high-quality format like WAV or FLAC.
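The advice above about setting input levels and making a test recording can be checked numerically. The following is a minimal sketch, assuming a 16-bit PCM WAV test file; the function names are illustrative and not part of any Auribus tool. It reports the peak level in dBFS (0 dBFS is full scale) and flags samples that hit the 16-bit limits, which usually indicates clipping.

```python
import math
import struct
import wave

FULL_SCALE = 32768  # magnitude of the most negative 16-bit sample


def read_int16_samples(source):
    """Read all samples from a 16-bit PCM WAV file (path or file object).

    Channels are left interleaved, which is fine for a peak check.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2
    return list(struct.unpack("<%dh" % count, frames))


def peak_dbfs(samples):
    """Return the peak level in dBFS, or -inf for silence."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / FULL_SCALE)


def is_clipping(samples):
    """Return True if any sample sits at the 16-bit limits."""
    return any(s >= 32767 or s <= -32768 for s in samples)
```

A typical use would be `samples = read_int16_samples("test_take.wav")` (a hypothetical file name), followed by `peak_dbfs(samples)` and `is_clipping(samples)`. If the test take clips, lower the gain on the audio interface and record again.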
Record With A Laptop Or Phone
Find a quiet space with minimal background noise. Close windows and doors to reduce external sounds.
Position yourself about 6-12 inches (15-30 cm) away from the laptop's microphone. Experiment with the distance to find a balance between clarity and avoiding distortion.
Use built-in recording software on your laptop or free applications like Audacity. These tools are user-friendly and provide basic recording functionality.
If possible, use a DIY pop filter made from materials like a sock or pantyhose stretched over a frame. This helps reduce plosive sounds (like 'P' and 'B').
Place your laptop on a stable surface to minimize handling noise. Avoid tapping on the laptop while recording.
Perform a short test recording to check levels and ensure the microphone is capturing your voice clearly.
Record without adding effects like reverb or EQ during the recording process. Effects can be applied later if needed.
Save your recordings in a common format such as MP3 or WAV for compatibility with various applications.
Choose a quiet space free from background noise. Turn off notifications on your phone to avoid interruptions.
Hold the phone at a consistent distance from your mouth (around 6-12 inches or 15-30 cm). Experiment to find the optimal distance.
Use the built-in voice recorder app on your phone or download a free recording app. Many smartphones have quality built-in microphones.
Hold the phone steadily or use a makeshift stand to avoid handling noise. Steady recordings improve overall quality.
Perform a brief test recording to ensure the microphone is picking up your voice clearly. Adjust the distance if needed.
Be mindful of potential environmental noise, and try to record during quieter times if possible.
Record without applying effects during the recording process. Post-production can be used for any necessary adjustments.
Save your recordings on your phone and transfer them to your computer if you plan to do further editing.
What Do You Get From Us
The primary result is a synthetic or computer-generated voice that mimics the characteristics of the input voice. The goal is to create a voice that sounds natural and is consistent with the qualities of the original voice.
AI voice conversion often involves adjustments to pitch and tone. You might get a voice that is pitched higher or lower than the original, depending on the desired effect or application.
Prosody includes elements like rhythm, intonation, and stress in speech. AI can modify these aspects to create a more natural and expressive synthetic voice.
Some AI models can adapt the voice to different languages or accents. This is useful for applications such as multilingual voice assistants.
Advanced AI models strive to produce high-quality audio output that closely resembles natural human speech. This includes minimizing artifacts or distortions in the synthesized voice.
Ideally, the converted voice should exhibit a natural flow and fluency in speech, ensuring that it sounds human-like and is easy to understand.
48 kHz: This is a standard sample rate for digital audio, widely used in film, television, video, and multimedia production.
48 kHz is preferred in professional audio production, particularly in contexts where synchronization with video is crucial. It provides a balance between high-quality audio capture and compatibility with various audio and video production systems.
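To see whether a given WAV file is at 48 kHz, its header can be read with Python's standard `wave` module. This is a small sketch; `describe_wav` and `make_silent_wav` are hypothetical helper names, and the second exists only so the example can be run without a real recording.

```python
import io
import wave

TARGET_RATE_HZ = 48_000  # the sample rate discussed above


def describe_wav(source):
    """Return (sample_rate_hz, channels, duration_seconds) for a WAV file.

    `source` may be a file path or an open binary file object.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnchannels(), wav.getnframes() / rate


def make_silent_wav(rate_hz, seconds=1):
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * rate_hz * seconds)
    buffer.seek(0)  # rewind so the buffer can be read back
    return buffer


rate, channels, duration = describe_wav(make_silent_wav(TARGET_RATE_HZ, seconds=2))
print(rate == TARGET_RATE_HZ, channels, duration)
```

With a real recording, pass its path instead, for example `describe_wav("my_take.wav")` (a hypothetical file name).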
t0 (No Transposition)
Explanation: t0 means no transposition. The original pitch remains unchanged.
Usefulness: Keeping the original pitch is useful when you want to preserve the natural key of a piece or maintain the original tonal quality of a sound.
t6 (Transposition Up by Six Semitones)
Explanation: t6 means transposing up by six semitones, which is a tritone (half an octave).
Usefulness: Transposing up by six semitones often creates a brighter, more uplifting sound. It can add energy or move a performance into a noticeably higher register without going a full octave up.
t8 (Transposition Up by Eight Semitones)
Explanation: t8 means transposing up by eight semitones, which is a minor sixth (two thirds of an octave).
Usefulness: Transposing up by eight semitones creates a clearly higher-pitched version of the original. It's useful for adding variety, creating harmonies, or achieving a different vocal or instrumental range.
t12 (Transposition Up by Twelve Semitones)
Explanation: t12 means transposing up by twelve semitones, which is one octave.
Usefulness: Transposing up by an octave doubles the frequency, resulting in a sound that is significantly higher than the original. This large transposition is used for creative effects, sound design, or when you want to explore the upper range of an instrument or voice.
-t12 (Transposition Down by Twelve Semitones)
Explanation: -t12 means transposing down by twelve semitones, which is one octave.
Usefulness: Transposing down by an octave halves the frequency, resulting in a much lower pitch. This is useful for deepening the tone of a sound or exploring the lower register of an instrument or voice.
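The effect of a transposition setting on pitch can be worked out with a short calculation. In equal temperament (assumed here), each semitone multiplies the frequency by the twelfth root of two, so twelve semitones is exactly a doubling. The function names below are illustrative only.

```python
def transpose_ratio(semitones):
    """Frequency ratio for a shift of `semitones` in equal temperament."""
    return 2 ** (semitones / 12)


def transpose_hz(frequency_hz, semitones):
    """Frequency after shifting `frequency_hz` by `semitones`."""
    return frequency_hz * transpose_ratio(semitones)


# Ratios for the settings listed above.
for setting in (0, 6, 8, 12, -12):
    print(f"{setting:+d} semitones -> x{transpose_ratio(setting):.3f}")
```

For example, a note at 440 Hz moved up six semitones lands at roughly 622 Hz, while twelve semitones up gives 880 Hz and twelve down gives 220 Hz.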
General Recording Tips
The goal is to have fun and experiment with your recordings. The AI will do its best to swap your voice, and the more diverse and clear your recordings are, the better the results will be. Consider exploring different tones, octaves, and styles to provide the AI with a diverse range of inputs.
Find a quiet room to record your voice. Less background noise makes the swapping process work better.
Record your voice without adding effects like echo or reverb. Keep it simple and natural.
Speak clearly and at a normal pace. This helps the AI understand and replicate your voice accurately.
Try recording your voice at different pitches—higher and lower. This helps the AI mimic a wider range of tones.
Record in different styles—casual, formal, happy, or serious. This gives the AI more variety to work with.
If possible, use a decent microphone. It doesn't have to be fancy, just clear enough for the AI to understand.
Short recordings are easier for the AI to process. Aim for a few sentences or phrases at a time.
If you're not a singer, don't worry about hitting perfect notes. The AI can still work with your natural voice.
Save your recordings in a high-quality format such as WAV, or a high-bitrate MP3 if file size matters. This gives the AI the cleanest input.
Don't be afraid to play around! Record different things, try different voices, and see what the AI can do.
Record your voice in its natural tone, the way you would speak in everyday conversations. This forms the baseline for the AI to understand your usual voice characteristics.
Experiment with a slightly higher pitch. This can be useful for creating variations in the swapped voice, especially if you want a more energetic or animated result.
Record in a lower pitch to explore a deeper, more resonant sound. This can be effective for a serious or authoritative tone.
Capture your voice in a casual and relaxed style. This can be great for creating friendly and approachable AI-generated voices.
Record with a more formal or professional tone. This might be suitable for applications where a polished and refined voice is desired.
Express excitement or happiness in your recordings. This can add a positive and lively quality to the AI-generated voice.
Try a serious or calm tone. This can be useful for scenarios where a composed and steady voice is preferred.
Record in a narrative style, as if telling a story. This helps the AI capture the nuances of storytelling and adds a storytelling flair to the swapped voice.
Experiment with a humorous or playful tone. This can be enjoyable for applications that involve humor or playfulness.
Explore different octaves within your vocal range. This allows the AI to adapt to various pitch levels and create a more versatile swapped voice.
Incorporate variations in expression—try recordings with emphasis, pauses, or different inflections. This provides the AI with a better understanding of your expressive range.
Record in a whispering or soft-spoken style. This can be useful for creating more intimate or ASMR-like AI-generated voices.
How Does It Work
AI-based voice swapping involves using advanced algorithms and machine learning to offer a voice conversion solution. The service typically requires users to provide voice samples, allowing the AI model to learn and adapt to their unique vocal characteristics.
Once trained, the service can transform voices, enabling users to swap their voice with another, while retaining the individuality and nuances of their original speech patterns. It's a technology-driven process designed to provide users with a customizable and personalized voice experience.
Vocal Ranges Cheat Sheet
Soprano
High-pitched and often associated with female singers. Think of those soaring high notes in opera or pop diva performances.
Frequency Range: Approximately 261 Hz to 1,047 Hz (C4 to C6).
Alto
A lower female voice, warm and rich. Common in choral music and soulful genres.
Frequency Range: Approximately 174 Hz to 698 Hz (F3 to F5).
Tenor
High male voice, the heartthrob range. Often takes the lead in vocal arrangements.
Frequency Range: Approximately 131 Hz to 523 Hz (C3 to C5).
Baritone
The middle ground for male voices. Not too high, not too low. Versatile and common in various music styles.
Frequency Range: Approximately 98 Hz to 392 Hz (G2 to G4).
Bass
The lowest male voice, deep and resonant. Provides the foundation in many vocal groups.
Frequency Range: Approximately 65 Hz to 261 Hz (C2 to C4).
Contralto
Rare and low-pitched female voice. Known for its depth and richness.
Frequency Range: Approximately 87 Hz to 349 Hz (F2 to F4).
Countertenor
A male singing in the alto or soprano range using falsetto or a special vocal technique.
Frequency Range: Similar to alto or soprano, depending on the individual.
Mezzo-Soprano
A middle-range female voice, versatile and found in a variety of genres.
Frequency Range: Approximately 196 Hz to 784 Hz (G3 to G5).
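The note names and frequencies in this cheat sheet are related by a simple formula, sketched below under the assumption of equal temperament with A4 tuned to 440 Hz. The voice-type labels in `RANGES` are inferred from the descriptions above, and the helper names are illustrative. `matching_ranges` lists the voice types whose range fully contains your lowest and highest comfortable notes.

```python
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Ranges as listed in the cheat sheet; labels inferred from the descriptions.
RANGES = {
    "soprano": ("C4", "C6"),
    "mezzo-soprano": ("G3", "G5"),
    "alto": ("F3", "F5"),
    "contralto": ("F2", "F4"),
    "tenor": ("C3", "C5"),
    "baritone": ("G2", "G4"),
    "bass": ("C2", "C4"),
}


def note_to_hz(note, a4_hz=440.0):
    """Convert a note name such as 'C4' or 'F#3' to a frequency in Hz."""
    letter, rest = note[0].upper(), note[1:]
    semitone = NOTE_OFFSETS[letter]
    if rest.startswith("#"):
        semitone, rest = semitone + 1, rest[1:]
    elif rest.startswith("b"):
        semitone, rest = semitone - 1, rest[1:]
    midi_number = 12 * (int(rest) + 1) + semitone  # C4 is 60, A4 is 69
    return a4_hz * 2 ** ((midi_number - 69) / 12)


def matching_ranges(lowest_note, highest_note):
    """Return the voice types whose range covers both notes."""
    low_hz, high_hz = note_to_hz(lowest_note), note_to_hz(highest_note)
    return [
        name
        for name, (range_low, range_high) in RANGES.items()
        if note_to_hz(range_low) <= low_hz and high_hz <= note_to_hz(range_high)
    ]
```

For instance, `matching_ranges("A2", "E4")` returns the contralto and baritone entries, since both of those ranges contain A2 through E4. Real voices often overlap several categories, so treat the result as a starting point when picking a voice rather than a classification.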