VMS Support For High Audio Sample Rates?

I have an application that requires high quality, high bandwidth (e.g. 20-20,000 Hz preferred) audio recording in order to maximize speech intelligibility from our specialized microphone installations. The "open" VMS systems I have looked at so far, such as Milestone, appear to support only 8KHz sample rate G.711 and G.726 VOIP codecs. Some cameras support AAC but they appear to cap out at 16 KHz sampling.

If I build or source an audio capture device that streams either AAC or uncompressed audio over RTP at above 16 KHz sampling, is there a VMS that can record that?

FYI our existing solution is a custom DVR and we are looking to transition to IP.

Thanks for any help,


IndigoVision IP cameras and encoders support up to 160kbps, as does their VMS. Audio is monophonic on their two-channel encoders and stereo (2-channel) on their IP cameras and single-channel encoders. Default is 64kbps.

Wow this is the most low-level audio question we've ever received!

I doubt that most manufacturers publish which sample rates they support and some may even downsample it when recording, so unless someone has specific knowledge of the VMS, or you ask the manufacturer, it's going to be hard info to come by.

Also, for those unfamiliar, sample rate is separate from bitrate. Sample rate is the number of samples taken per second; a higher sample rate generally requires a higher bitrate, but the two are distinct. For reference, 8 kHz is about what old analog phones sound like, 16 kHz is about the clearest voice call quality you'll hear, and 44.1 kHz is CD quality.
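To make the sample rate vs. bitrate distinction concrete, here is a rough sketch in Python. It assumes plain uncompressed PCM (16-bit unless stated otherwise) and uses the Nyquist rule that a given sample rate can only represent audio up to half that rate. The function names are just illustrative.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample=16, channels=1):
    """Bitrate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def nyquist_limit_hz(sample_rate_hz):
    """Highest audio frequency a given sample rate can represent."""
    return sample_rate_hz / 2


for rate in (8000, 16000, 44100, 48000):
    print(f"{rate} Hz sampling: audio up to {nyquist_limit_hz(rate):.0f} Hz, "
          f"{pcm_bitrate_kbps(rate):.1f} kbps as 16-bit mono PCM")
```

So 8 kHz sampling tops out at 4 kHz of audio bandwidth, while 16-bit mono PCM at 48 kHz works out to 768 kbps before any compression.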


While sample rate is related to fidelity, so is bitrate. MP3 quality is typically measured by bitrate, while the sample rate is assumed to be 44.1kHz or 48kHz. While IndigoVision doesn't spec out their sample rate, a lot can be inferred by their bitrate.

With MP3, bitrates in the range of 64k or less are only really suitable for voice recordings. 128k is the lowest bitrate typically usable for music, although it typically only delivers analog AM Radio quality. 160kbps is better still, but not really "HiFi" - more like FM radio. For "good" HiFi, typical bitrates should be set at 256kbps at minimum and 320kbps to obtain "near-CD" quality.

160kbps will roll off high frequencies somewhat as the chart below shows, but anything over 15kHz high frequency capability is considered acceptable unless you want to record high quality music.

Bitrate     HF Rolloff Point    Compression Ratio

1411kbps    >20kHz              1:1
320kbps     19.5kHz             1:4.4
192kbps     18kHz               1:7.3
160kbps     17kHz               1:8.8
128kbps     16kHz               1:11
96kbps      15kHz               1:14.7
64kbps      11kHz               1:22
32kbps      5kHz                1:44
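The compression ratio column in the chart appears to be each bitrate measured against uncompressed CD audio. A quick sketch, assuming the 1411kbps row means 16-bit stereo PCM at 44.1 kHz (my assumption, the chart doesn't say so explicitly):

```python
# Assumed reference: 16-bit stereo PCM at 44.1 kHz, about 1411 kbps.
CD_BITRATE_KBPS = 44100 * 16 * 2 / 1000


def compression_ratio(bitrate_kbps):
    """How many times smaller the stream is than the CD-rate reference."""
    return CD_BITRATE_KBPS / bitrate_kbps


for kbps in (320, 192, 160, 128, 96, 64, 32):
    print(f"{kbps} kbps -> roughly 1:{compression_ratio(kbps):.1f}")
```

The rolloff frequencies, on the other hand, depend on the encoder and can't be derived this way.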

For the human voice, whose range is typically 300-3kbps (to around 10kHz including the vast majority of harmonics), 11kHz (64kbps) should be plenty sufficient. Incidentally, 300Hz - 10kHz corresponds to the region of highest sensitivity of human hearing.

Sorry, that was 300Hz to 3kHz, Typo...

Carl, that's a really useful table of AAC bitrates and frequency cutoffs that you posted; I'd like to share it. May I ask where you found it? Alternatively I'll have to attribute it to you....


Sorry, I can't take credit for it. The chart is all over the internet but I can't find the original source. I obtained it from The Session

Thanks for the replies; I will check out IndigoVision, although an "open" VMS with support for multiple IP camera manufacturers, the ability to integrate our own cameras, and customizable/rebrandable end user UI is preferred.

Our customers require maximum speech intelligibility for discerning the exact words said, including names and other words unfamiliar to the listener, down to a whisper, with minimal distortion. Currently we record uncompressed at 48 KHz for direct lossless export to DVD, using microphones that roll off above 16 KHz, which helps preserve the distinguishing characteristics of some fricative consonants and makes our customers confident that they aren't losing anything.

As we scale from our existing analog cabling solution to IP for greater scalability, we would prefer to go with at least 160kbps AAC so customers don't notice a decrease in quality.

Stephen, congrats on a novel combination of technologies; I'm sure you've had to jump numerous hurdles just getting this far...

If I build or source an audio capture device that streams either AAC or uncompressed audio over RTP at above 16 KHz sampling, is there a VMS that can record that?

So are you doing the A/D yourself and encoding the AAC? Have you been able to successfully stream 8 kHz audio via RTP to a VMS/DVR? Which ones? Is there a video stream also? Is synchronization a concern?

We have our own digital video camera platform that we use for niche applications. We built a high quality microphone into it, among other things that differentiate it from typical IP cameras. We can stream H.264 video plus audio over IP to a proprietary client/recorder as well as to other "standard" clients like VLC Media Player. We can transmit the audio as uncompressed 16 bit 48KHz PCM or using a variety of codecs and bitrates. For larger scale multi-camera recording applications, we are interested in integrating with an off-the-shelf VMS solution rather than rolling our own, but we do build our own recording solutions when needed. The nice thing about using an off the shelf VMS with support for many cameras is that it would be easier to swap out different cameras for different customer requirements when our custom camera wasn't needed or preferred.


I would call IndigoVision "semi-open". Although their NVR software doesn't directly support other manufacturers' primary streams, they do support ONVIF streams. They also have software that can be installed on a separate server that supports direct streams, called "Camera Gateway" and although their VMS cannot redistribute a single Unicast stream like some VMS', they support "Proxy Servers" for that task.

Yes, that is not the same as other companies' direct support for third party cameras and obviously they would prefer you buy a complete IndigoVision end-to-end solution but they are making strides toward openness. They even recently announced availability of a firmware for their HD IP cameras that will allow them to be used with other VMS' via ONVIF streams - a first for them.

I will have to see what audio options they support via ONVIF if we want to use our own camera (assuming we implement ONVIF support) versus what we can get using IndigoVision cameras, which may be fine for many of our customers. Our vertical applications call for some customization of the user interface, however, which will also be a factor in our selection.

We are currently using our own DVR software with third party COTS analog cameras and microphones for the application that requires good audio with AV sync, but scaling up requires us to consider IP.


I don't know about other VMS providers... but our VMS Digifort is able to record and play back audio at "any" frequency you throw at it. In fact, I expect that any VMS that supports PCM or AAC should be the same...

While G.711 and G.726 are limited to an 8 kHz sample rate and to 64kbps (G.711) and 16, 24, 32 or 40kbps (G.726), other codecs such as AAC and pure audio (PCM) will support different frequencies.
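Those fixed bitrates follow directly from the 8 kHz sample rate: G.711 spends 8 bits on each sample and G.726 spends 2, 3, 4 or 5. A small sketch of the arithmetic (the helper name is just for illustration):

```python
SAMPLE_RATE_HZ = 8000  # both codecs sample at 8 kHz


def codec_bitrate_kbps(bits_per_sample, sample_rate_hz=SAMPLE_RATE_HZ):
    """Bitrate of a fixed bits-per-sample codec in kbit/s."""
    return bits_per_sample * sample_rate_hz // 1000


print("G.711:", codec_bitrate_kbps(8), "kbps")
for bits in (2, 3, 4, 5):
    print(f"G.726 at {bits} bits/sample:", codec_bitrate_kbps(bits), "kbps")
```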

When transmitting AAC or pure PCM audio over an RTSP connection, the server (your camera, DVR...) is supposed to declare the sample rate of the audio it is serving in the SDP description (in the reply to the RTSP DESCRIBE command). That way the VMS can initialize its audio decoder to handle the data properly...
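As a rough illustration of where that information lives, the sample rate shows up as the clock rate in the SDP "a=rtpmap" lines. The sketch below parses a made-up DESCRIBE reply body; the address, payload type numbers and the parse_rtpmap helper are hypothetical, and a real camera's SDP will carry more attributes (for example fmtp lines with the AAC configuration).

```python
import re

# Hypothetical SDP body; address and payload types are invented for the example.
SAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 192.0.2.10
s=Example stream
t=0 0
m=video 0 RTP/AVP 96
a=rtpmap:96 H264/90000
m=audio 0 RTP/AVP 97
a=rtpmap:97 MPEG4-GENERIC/48000/2
"""

# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
RTPMAP = re.compile(r"^a=rtpmap:(\d+)\s+([^/\s]+)/(\d+)(?:/(\d+))?\s*$")


def parse_rtpmap(sdp):
    """Return one dict per rtpmap line, tagged with its media section."""
    streams = []
    media = None
    for line in sdp.splitlines():
        if line.startswith("m="):
            media = line[2:].split()[0]  # "video" or "audio"
            continue
        match = RTPMAP.match(line)
        if match:
            payload_type, encoding, clock_rate, channels = match.groups()
            streams.append({
                "media": media,
                "payload_type": int(payload_type),
                "encoding": encoding,
                "clock_rate": int(clock_rate),
                "channels": int(channels) if channels else None,
            })
    return streams


for stream in parse_rtpmap(SAMPLE_SDP):
    print(stream)
```

In this example the audio stream advertises a 48000 Hz clock and 2 channels, which is what a recorder would use to set up its decoder.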

For comparison, when transmitting video over RTSP, the RTP clock rate is always 90 kHz... but the audio clock rate varies with the sample rate.
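Since AV sync came up earlier in the thread, here is a hedged sketch of how the two clocks translate into time. It assumes the audio RTP clock equals the audio sample rate and that each AAC frame carries 1024 samples (true for AAC-LC, but an assumption about the encoder in use); the 30 fps figure is only an example.

```python
VIDEO_CLOCK_HZ = 90000    # RTP clock rate used for video
AAC_FRAME_SAMPLES = 1024  # assumption: AAC-LC frame size


def ticks_per_video_frame(fps, clock_hz=VIDEO_CLOCK_HZ):
    """RTP timestamp increment between consecutive video frames."""
    return clock_hz / fps


def aac_frame_duration_ms(sample_rate_hz, frame_samples=AAC_FRAME_SAMPLES):
    """Playback duration of one AAC frame in milliseconds."""
    return 1000 * frame_samples / sample_rate_hz


def rtp_ticks_to_seconds(ticks, clock_hz):
    """Convert an RTP timestamp difference into seconds."""
    return ticks / clock_hz


print(ticks_per_video_frame(30))            # ticks between frames at 30 fps
print(aac_frame_duration_ms(48000))         # ms of audio per AAC frame
print(rtp_ticks_to_seconds(45000, 90000))   # half a second of video clock
```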

We use Axis P8221 for high quality audio recording in interview rooms using Exacq (AAC 128) and Louroe/Crown microphones. These have balanced phantom power mic inputs. They do require an additional device license.

[IPVM Note: The Poster is from Genetec]

Genetec supports bidirectional audio from IP cameras and encoders. The level of integration and exact capabilities differ per manufacturer. When it comes to audio, there are many parameters that an IP camera/encoder may support. They include, but are not limited to, Line Level, Mic Level, Gain, Audio Codec, Sampling Rate, Bit Rate, and Volume.

The parameters used for a particular unit depend on what the unit supports, as well as what support is included in the driver. When it comes to codecs, there are 3 main codecs that are currently used by different unit manufacturers and supported by Genetec: G.711, G.726, and AAC. When it comes to sampling rates, Genetec has integrated with units that support from 8 kHz up to 44.1 kHz. The codecs and sampling rates supported depend not only on the manufacturer of the unit but also on the model. As an example, for Axis units, today, we support the AAC codec and sampling rates up to 32 kHz.