If listeners find themselves using the volume up and down buttons a lot, level differences within your podcast or audio file are too big.
In this article, we are discussing why audio dynamic range processing (or leveling) is more important than loudness normalization, why it depends on factors like the listening environment and the individual character of the content, and why the loudness range descriptor (LRA) is only reliable for speech programs.

Photo by Alexey Ruban.

Why loudness normalization is not enough

Everybody who has lived in an apartment building knows the problem: you want to enjoy a ...

In the classic loudness war, music and radio producers have been trying to create their recordings as loud as possible and loudness normalization was introduced to stop that. Now one can see the start of a new loudness target war, where podcasters set their loudness targets higher and higher, mainly triggered by high target recommendations of platforms like Spotify or Amazon Alexa.
In this article, we will show how to resist the loudness target war and still be compliant with major platforms.

Resist the loudness target war! (Photo by Nayani Teixeira)

What's the problem?

“Two or three ...

Last weekend, at the Subscribe10 conference, we released Advanced Audio Algorithm Parameters for Multitrack Productions:

We launched our advanced audio algorithm parameters for Singletrack Productions last year. Now these settings (and more) are available for Multitrack Algorithms as well, which gives you detailed control for each track of your production.

The following new parameters are available:

Until recently, Amazon Transcribe supported speech recognition in English and Spanish only.
Now they included French, Italian and Portuguese as well - and a few other languages (including German) are in private beta.

Update March 2019:
Now Amazon Transcribe supports German and Korean as well.

https://auphonic.com/static/screenshots/inspector-mt-closed.png The Auphonic Audio Inspector on the status page of a finished Multitrack Production including speech recognition.
Please click on the screenshot to see it in full resolution!


Amazon Transcribe is integrated as speech recognition engine within Auphonic and offers accurate transcriptions (compared to other services) at low costs, including keywords / custom ...

In late August, we launched the private beta program of our advanced audio algorithm parameters. After feedback by our users and many new experiments, we are proud to release a complete rework of the Adaptive Leveler parameters:

In the previous version, we based our Adaptive Leveler parameters on the Loudness Range descriptor (LRA), which is included in the EBU R128 specification.
Although it worked, it turned out that it is very difficult to set a loudness range target for diverse audio content, which does include speech, background sounds, music parts, etc. The results were not predictable and ...

Large file uploads in a web browser are problematic, even in 2018. If working with a poor network connection, uploads can fail and have to be retried from the start.

At Auphonic, our users have to upload large audio and video files, or multiple media files when creating a multitrack production. To minimize any potential issues, we integrated various external services which are specialized for large file transfers, like FTP, SFTP, Dropbox, Google Drive, S3, etc.

To further minimize issues, as of today we have also released resumable and chunked direct file uploads in the web browser to ...

Lots of users have asked us about more customization and control over the sound of our audio algorithms in the past, so today, we have introduced some advanced algorithm parameters for our singletrack version in a private beta program!

The following new parameters are available:

UPDATE Nov. 2018:
We released a complete rework of the Adaptive Leveler parameters and the description here is not valid anymore!
Please see Auphonic ...

We are pleased to announce a new Audio Inserts feature in the Auphonic API: audio inserts are separate audio files (like intros/outros), which will be inserted into your production at a defined offset.
This blog post shows how one can use this feature for Dynamic Ad Insertion and discusses other Audio Manipulation Methods of the Auphonic API.

API-only Feature

For the general podcasting hobbyist, or even for someone producing a regular podcast, the features that are accessible via our web interface are more than sufficient.

However, some of our users, like podcasting companies who integrate our services ...

Back in late 2016, we introduced Speech Recognition at Auphonic. This allows our users to create transcripts of their recordings, and more usefully, this means podcasts become searchable.
Now we integrated two more speech recognition engines: Amazon Transcribe and Speechmatics. Whilst integrating these services, we also took the opportunity to develop a complete new Transcription Editor:

Screenshot of our Transcript Editor with word confidence highlighting and the edit bar.
Try out the Transcript Editor Examples yourself!


The new Auphonic Transcript Editor is included directly in our HTML transcript output file, displays word confidence values to instantly ...

In a previous blogpost we talked about the Opus codec, which offers very low bitrates. Another codec seeking to achieve even lower bitrates is Codec 2.

Codec 2 is designed for use with speech only, and although the bitrates are impressive the results aren’t as clear as Opus, as you can hear in the following audio examples. However, there is some interesting work being done with Codec 2 in combination with neural network (WaveNets) that is yielding great results.

Layers of a WaveNet neural network.

Background

Codec 2 is an open source codec designed for ...