Audio Surveillance Guide

Published Apr 30, 2014 04:00 AM

Audio for surveillance is a complex topic with a lot of confusion and poorly implemented systems out there. We regularly get questions from users, such as:

  • What kind of mic should I use?
  • Where should it be mounted?
  • How do I configure my system for best possible audio quality?

Finding answers to these and other questions can be nearly impossible for those without an audio background.

So, in this guide, we aim to clear up the following topics:

  • Microphone types: Internal vs. external, mic vs. line level, mounting location, and directional vs. omnidirectional.
  • Sensitivity and range: Testing 3 cameras with internal microphones vs. a ceiling mount external unit we bought from Louroe to see which performs best at distances up to 20'.
  • Audio configuration: Setting up the camera, client, and VMS for best performance.
  • Basic use: How to use your camera and VMS with audio inputs and outputs.
  • VMS considerations: Differences in operation, configuration, and export of audio from VMS systems.
  • Door intercom: How to configure a camera and VMS for use as a door intercom, one of the most common audio applications.

For this guide, we bought and tested two popular Louroe Electronics products: a ceiling mic with audio interface and a door intercom station.

Selecting Microphones

These are the key considerations in microphone selection:

Mic vs. line level: When buying microphones, line level output is typically desired for two reasons:

  1. Most cameras support only line level inputs. Some devices, such as the Axis P8221 audio and I/O module, support mic level devices, but this is rare.
  2. Line level signals are stronger outputs than mic level, and require less gain. This reduced gain keeps noise lower, resulting in better audio.

In brief, mic vs. line level are different voltage outputs from the microphone. A microphone creates a very small amount of voltage when someone speaks into it, typically in the range of ten-thousandths of a volt. A line level signal, on the other hand, is approximately one volt. Plugging a mic into a line level input will result in extremely faint audio, if any, and plugging a line level source into a mic input will result in loud, distorted audio.
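
To put the difference in perspective, the gap between the two levels can be expressed in decibels. The sketch below is illustrative only: it assumes a mic output of one ten-thousandth of a volt and a line level of exactly one volt, taken from the approximate figures above, and the function name is our own.

```python
import math


def gain_db(v_in: float, v_out: float) -> float:
    """Voltage gain in decibels needed to raise v_in to v_out."""
    return 20 * math.log10(v_out / v_in)


# Assumed example values based on the rough figures in the text.
MIC_LEVEL_V = 0.0001   # one ten-thousandth of a volt
LINE_LEVEL_V = 1.0     # approximately one volt

print(f"{gain_db(MIC_LEVEL_V, LINE_LEVEL_V):.0f} dB")  # prints "80 dB"
```

Under these assumptions, a mic level signal needs roughly 80 dB of amplification to reach line level, which is why a mic plugged directly into a line input is barely audible.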

Pickup pattern: Microphones may be either omnidirectional (picking up sound in all directions, most common in surveillance) or directional (picking up in a specific directed pattern) depending on where the mic is mounted and its purpose.

For example, a mic mounted on the ceiling of a cell block or interview room should be omnidirectional, as sound may arrive from multiple directions. However, a microphone used near an entry door could be directional, as a visitor entering the door generally faces only one direction.

Mounting location: In surveillance, microphones are most often ceiling mounted, centered in the room they are intended to cover. However, wall mount models are also available, for use in door entry applications or where ceiling mounting is not possible. Finally, in some interview systems, microphones may be mounted to the table for better pickup of speech between parties.

Vandal resistance: If the microphone will be mounted in reach of subjects, a vandal resistant model should be used. Foreign objects such as paperclips or pens may easily damage microphone elements. Vandal resistant models typically include multiple layers of screening and smaller ports to prevent this damage. However, these protective measures dull microphone sensitivity moderately, so gain may need to be increased to compensate.

Built-In vs. External Microphones

Many cameras include built-in microphones, the simplest way to add audio to the surveillance system, since no external equipment is required. However, these microphones are typically lower quality than external components, with less sensitivity and more noise. This makes them less usable in areas where higher quality audio is required, such as interview systems or cell block areas, where forensic use and court admissibility are factors.

For occasional use, such as door intercom or home look-in, built-in mics may work fine. Microphones do not need to be as sensitive to pick up voices in these smaller areas, and noise rejection is less of an issue at these closer ranges.

Built-In vs. External Testing

To demonstrate the difference between built-in and add-on microphones, we tested three cameras' internal mics against the Louroe audio surveillance kit (~$215 online). The Louroe microphone offered by far the best combination of sensitivity and noise rejection. The built-in mics were either too sensitive, nearing distortion when subjects were at close range and picking up low frequency rumble and ambient HVAC noise, or not sensitive enough, losing intelligibility at 5-10'.

The screencast below reviews these results. A ZIP file containing clips from each camera (63.3 MB) is also available.

Physical Components

This video reviews the Louroe ASK-4 #300 audio surveillance kit used in our tests, which includes a Verifact A ceiling mount microphone as well as an IF-1 audio interface. Users should note:

  • Easy connection between microphone and interface, requiring only a single shielded twisted pair (22/2 in our tests).
  • IF-1 provides power to the mic and converts its signal to a line level output, connected to the camera via a single RCA or 3.5mm plug.
  • Gain should be increased on the IF-1 first, before increasing in the camera for cleanest signal.

Audio Configuration

Audio configuration varies between cameras. Some are as simple as on/off and audio level, while others allow multiple CODECs, fine-tuned control of bitrate, simplex and duplex settings, and more. In general, these are our recommendations:

  • CODEC: AAC (Advanced Audio Coding) should be used if the camera supports it. This CODEC was designed as the successor to MP3 and offers better quality at lower bitrates than G.711 or G.726, which were developed for telephony and are common in VoIP.
  • Gain: Most cameras allow control of input gain via the web interface, allowing microphone volume to be raised or lowered to prevent noise and distortion.
  • Bitrate: Typical audio streams range from 32-64 Kb/s, a minimal increase considering 1-2 Mb/s is typical for many IP cameras. We do not recommend lowering stream size below 32 Kb/s, as bandwidth becomes too low to support a normal voice conversation, and stuttering audio will result.
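
The bitrate figures above can be checked with some quick arithmetic. This is a rough sketch, assuming constant bitrates and decimal units (1 MB = 8,000 kilobits); the function names are hypothetical and not part of any camera or VMS API.

```python
def audio_overhead_percent(audio_kbps: float, video_kbps: float) -> float:
    """Audio bitrate as a percentage of the video bitrate."""
    return 100.0 * audio_kbps / video_kbps


def storage_mb_per_day(kbps: float) -> float:
    """Storage used by a constant-bitrate stream recorded for 24 hours.

    Converts kilobits per second to megabytes per day (decimal units).
    """
    seconds_per_day = 86_400
    return kbps * seconds_per_day / 8 / 1000


# 64 Kb/s audio alongside a 2 Mb/s (2000 Kb/s) video stream.
print(audio_overhead_percent(64, 2000))  # about 3.2 (percent)
print(storage_mb_per_day(64))            # about 691.2 (MB per day)
```

Even at the top of the typical range, continuously recorded audio adds only a few percent to a 1-2 Mb/s video stream.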

In this video we review basic configuration of audio via the camera web interface.

Audio Operation

At its simplest, audio may be monitored, and two-way audio talkback used, via the web interface of the camera. This is typically accomplished via buttons on the live view, allowing the user to mute or unmute the microphone and push to talk using a PC mic.

VMS operation functions much the same, with some possible additional complexity:

VMS clients typically allow only one audio stream to be monitored at one time, though multiple streams may be recorded. This is true in both live view and playback. Though multiple cameras may be viewed via split screen views, only one stream will be heard.

The same is also true of 2-way audio. The VMS client operator must choose from a list of available speakers, and push a specific button to activate one. Some allow for a global push to talk to be used, allowing the operator to speak through all connected speakers at one time, but this is not common.

This video illustrates these key points:

Door Intercom

One of the most common uses of audio in surveillance systems is as a door intercom. In the following examples we demonstrate how to operate, connect, and configure the Louroe AOPSP-PB with an IP camera and VMS for this purpose.

Use Example

This clip shows a typical scenario: a visitor entering the building, calling inside staff via the door intercom, and entering after being admitted.

Note: This clip was combined from two different Exacq exports. Exacq only allows one audio track at a time to be exported and played, so audio in and audio out were mixed together over the camera's video. No other editing was performed. Users may download the original clip including only audio in (Exacq standalone player).
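
For readers curious what mixing two exported tracks involves, here is a minimal sketch. It is not the tool or process we used for the clip above; it assumes both tracks are lists of signed 16-bit PCM samples at the same sample rate and already aligned in time, and the function name is hypothetical.

```python
def mix_tracks(audio_in: list[int], audio_out: list[int]) -> list[int]:
    """Sum two 16-bit PCM tracks sample by sample, clipping to the valid range.

    The shorter track is treated as silence once it runs out.
    """
    length = max(len(audio_in), len(audio_out))
    mixed = []
    for i in range(length):
        a = audio_in[i] if i < len(audio_in) else 0
        b = audio_out[i] if i < len(audio_out) else 0
        # Clamp to the signed 16-bit range to avoid wrap-around distortion.
        mixed.append(max(-32768, min(32767, a + b)))
    return mixed


visitor = [1000, 30000]       # audio in (made-up sample values)
staff = [2000, 30000, -5]     # audio out (made-up sample values)
print(mix_tracks(visitor, staff))  # [3000, 32767, -5]
```

In practice an audio or video editor handles this step, including resampling and alignment, which this sketch leaves out.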

Physical Connections

In this video we review how the AOPSP-PB connects to an IP camera, including audio in/out and digital I/O for the intercom pushbutton:

Note, for those curious, we used 3.5mm screw terminal plugs for simpler installation requiring no soldering.

VMS Event Configuration

In this video we demonstrate how a VMS may be configured to work with the door intercom and associated camera to trigger a tone, switch video, and record event audio and video. Key points to note about this configuration:

  • The VMS (in this example, Exacq) may be configured to record video and audio prior to an event (10 seconds in our example). This provides some history of visitor actions before entering the building, so operators can check for suspicious activity or for another subject hidden out of view.
  • If audio and video are not continuously recording, event recording should be set long enough that the entire conversation is recorded, typically 30-60 seconds, so no part of the event is missed.
  • While bidirectional audio may be recorded, only one audio track may be played or exported at a time from both Exacq and Milestone. This limitation means that only visitor audio or staff response may be reviewed with video, not both.
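
Pre-event recording is generally possible because recent media is held in a short rolling buffer until an event occurs. The sketch below illustrates that general idea only; it is not how Exacq or any other VMS implements it, and the class name, method names, and chunk granularity are all assumptions made for illustration.

```python
from collections import deque


class PreEventBuffer:
    """Keep the last few seconds of media so an event clip includes history."""

    def __init__(self, pre_seconds: int, chunks_per_second: int) -> None:
        # The deque silently discards the oldest chunk once it is full.
        self._ring = deque(maxlen=pre_seconds * chunks_per_second)
        self._recording = None  # becomes a list while an event is active

    def push(self, chunk) -> None:
        """Feed one chunk of audio/video as it arrives from the camera."""
        if self._recording is not None:
            self._recording.append(chunk)
        else:
            self._ring.append(chunk)

    def trigger(self) -> None:
        """Start an event clip, seeded with the buffered pre-event chunks."""
        if self._recording is None:
            self._recording = list(self._ring)
            self._ring.clear()

    def stop(self) -> list:
        """End the event and return the clip (pre-event plus event chunks)."""
        clip, self._recording = self._recording or [], None
        return clip


buf = PreEventBuffer(pre_seconds=2, chunks_per_second=1)
for chunk in ["a", "b", "c"]:
    buf.push(chunk)      # "a" is dropped once the 2-chunk buffer is full
buf.trigger()            # e.g. the intercom button is pressed
buf.push("d")
print(buf.stop())        # ['b', 'c', 'd']
```

With a 10 second pre-event setting as in our example, the buffer would simply be sized to hold 10 seconds of chunks.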
