Judging Video Quality Of Old Analog Cameras?

I have a consulting engagement that is very important to me. The challenge is to evaluate an existing analog video system spread over 8 separate sites involving about 80 cameras total. Since this is a bid for consulting services only, no integrators will be permitted to bid this work unless they do not want a shot at bidding the actual work (which could be substantial). Having conducted a brief tour of all the sites involved, I don’t think I have ever seen a system in worse condition than this. Every type of camera, old DVRs, twist-on cable connectors, NVRs covered with dust, devices stuck on the wall with double-backed tape, you name it.

The owner wants my firm to conduct an evaluation of the system and make recommendations that, at least to the extent possible, leverage existing equipment. There is one present site, where the recommendations will be implemented first and those specs will serve as the standard for the remaining sites. At the end of the day, the Owner wants a single VMS with multiple workstations (guard stations, site administrator and central security operations). All of the cameras are fixed, with cameras at each entrance, elevator lobby and day room in each facility. All but two of the cameras are interior.

I am now setting up a test protocol which must include a schedule of cameras, DVRs and other devices. Also on this schedule is an evaluation of "image quality" to make a go/no-go decision on replacing each camera. The first stop is the security director, to discuss his objectives, how he has used the images in the past and what problems he has run into with regard to video quality. The result will be a "clear" (yeah, right!) definition of the objectives of each camera, which I have organized into three view levels as noted in John's article last Feb. But as pointed out in that article, the method is easy to understand (the security guy is an ex-cop, but is very non-technical) and not so easy to judge in the field (I know because I have used this method for a number of years). That being said, I still have to come up with a protocol for testing. I will probably eventually use my own judgment based on my experience, but I need to put forth (and price) a reasonable test protocol. Given these conditions, how would you conduct this “image quality” test? Also, what would you anticipate the useful life of cameras and DVRs to be?


I work in the research and development field, so keep in mind that I'm a geek whose products are seldom required to fully function in the real world :)

My thinking is, you are looking for a software tool that can be reasonably automated to serve your needs.

These cannot provide a complete answer to your question, but may be supporting elements of your solution.

--------------------------------------------------------------

Leverage the properties of the spatially invariant Fourier transform (SIFT below) for image analysis. I haven't investigated, but I expect you can find online implementations of this function capable of operating on video of varying frame rates and resolutions. It would require two inputs: an image or a video stream, and a reference image you want to match within that image or stream.

Taking a leaf from IPVM's playbook, build a large poster representative of issues you wish to test. The details are not terribly important: traditional eye charts with increasingly small letters, standard video test charts with radial lines and such, imagery of objects of interest, or other test charts could all be suitable.

Try to place your poster within each camera's field of view, on the boresight of the camera. Off boresight can limit SIFT relevance because it may require a mathematical camera model to compensate for non-linear distortions, which would unnecessarily complicate your solution.

You should try to place the poster at such a range that, for each camera, it takes up the same percentage of the camera field of view. That normalizes the result for effective range (e.g., even "bad" cameras can indicate a high score if the reference image nearly fills the field of view, compared to a "good" camera where the reference image fills only 5% of the field of view).
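The range that satisfies this can be worked out before the site visit. Here is a minimal sketch, assuming an idealized rectilinear lens, the poster centered on boresight, and "percentage of field of view" taken as a fraction of the horizontal width. The function name and the example numbers are hypothetical, not from any particular camera.

```python
import math


def poster_distance(poster_width_m, hfov_deg, fill_fraction):
    """Distance (m) at which the poster spans `fill_fraction` of the image width.

    Assumes a simple pinhole model: the scene width seen at distance d
    is 2 * d * tan(HFOV / 2).
    """
    if not 0 < fill_fraction <= 1:
        raise ValueError("fill_fraction must be in (0, 1]")
    if not 0 < hfov_deg < 180:
        raise ValueError("hfov_deg must be between 0 and 180 degrees")
    scene_width_per_metre = 2 * math.tan(math.radians(hfov_deg) / 2)
    return poster_width_m / (fill_fraction * scene_width_per_metre)


# Hypothetical example: a 1 m wide poster that should fill half the frame
# of a 90-degree lens sits roughly 1 m from the camera.
wide_lens_range = poster_distance(1.0, 90.0, 0.5)
# The same poster and fill fraction on a 30-degree lens sits further away.
narrow_lens_range = poster_distance(1.0, 30.0, 0.5)
```

Lens distortion on very wide lenses will make the real figure differ, so treat the result as a starting point and confirm on the monitor.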

The SIFT provides an indication of the position of the image within the camera field of view, as well as an indication of the "goodness" of the correlation. If the video imagery is degraded, the correlation will be poorer than if the video imagery is crisp and clear.
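As a rough illustration of "position plus goodness of correlation", here is a sketch that uses plain normalized cross-correlation computed through a Fourier transform. This is a stand-in technique, not necessarily the transform described above. It assumes the frame and reference are the same size, grayscale, and differ only by a circular shift. The DFT is hand-written to keep the example dependency-free (an FFT library would normally replace it), and all function names are hypothetical.

```python
import cmath
import math


def dft_1d(values, inverse=False):
    """Naive discrete Fourier transform (only suitable for tiny demo images)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def dft_2d(image, inverse=False):
    """2D transform: transform the rows, then the columns."""
    rows = [dft_1d(row, inverse) for row in image]
    cols = [dft_1d(col, inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def normalize(image):
    """Remove the mean and scale to unit energy so scores fall in [-1, 1]."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    energy = math.sqrt(sum((p - mean) ** 2 for p in pixels))
    if energy == 0:
        raise ValueError("a flat image has no detail to correlate")
    return [[(p - mean) / energy for p in row] for row in image]


def locate(frame, reference):
    """Return ((dy, dx), score): the circular shift of `reference` that best
    matches `frame`, and the normalized correlation there (1.0 = identical)."""
    f_hat = dft_2d(normalize(frame))
    g_hat = dft_2d(normalize(reference))
    # Multiplying by the conjugate in the frequency domain is
    # cross-correlation in the image domain.
    cross = [[f * g.conjugate() for f, g in zip(f_row, g_row)]
             for f_row, g_row in zip(f_hat, g_hat)]
    surface = dft_2d(cross, inverse=True)
    score, dy, dx = max((value.real, y, x)
                        for y, row in enumerate(surface)
                        for x, value in enumerate(row))
    return (dy, dx), score


def circular_shift(image, dy, dx):
    """Test helper: move the image content down by dy and right by dx, wrapping."""
    h, w = len(image), len(image[0])
    return [[image[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]
```

A clean, shifted copy of the reference should score close to 1.0 at the correct offset, and a smeared or noisy copy should score lower. That drop is the "goodness" figure you would log per camera.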

Cons:

1) you have to figure out what a SIFT is, how to use it, where to get a suitable algorithm, etc.

2) the approach only works during those times that your poster is in the camera field of view.

Pros:

1) it's a pretty reliable quality measure

2) relevance: your poster is representative of your camera imagery needs

Variations:

Once you have investigated, developed, and placed this arrow in your quiver, you can implement it in a number of ways. A continuous quality measure can be available in those fields of view that have sufficiently detailed permanent feature sets. For example, a field of view with resolvable text or some fairly reliable constant detail can be continuously monitored for image quality.

You might use a high-quality camera with a zoom that allows you to take an image from the same point of view as each camera of interest, with similar parameters. Snap the image of that space without any changing objects such as people or vehicles in the scene: this becomes the reference image. Run the SIFT on the video stream against the reference image. Note the quality and its degradation over time.

--------------------------------------------------------------

Alternatively, there are a number of off-the-shelf functions you can use to provide a first-pass approximation of image and video quality. One example is information entropy. A normalized entropy function returns a result between 0 and 1 that represents the amount of information in the sample (the raw Shannon entropy of 8-bit pixels runs from 0 to 8 bits, so it is divided by that maximum). You can see how this might apply. A blurry image will have less information content in each frame than will a crisp image. A camera with "stuff" on the lens will provide less information content over time than will an un-occluded image. An image with impaired dynamic range will have less information content than a wide dynamic range image.
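For a concrete picture, here is a minimal sketch of the histogram (first-order) form of entropy, normalized to the 0 to 1 range. It assumes 8-bit grayscale pixels supplied as a flat list; the function name is hypothetical. Note that this form looks only at the distribution of pixel values, so it responds to dynamic range more directly than to sharpness.

```python
import math
from collections import Counter


def normalized_entropy(pixels, levels=256):
    """Shannon entropy of the pixel-value histogram, scaled to [0, 1].

    0.0 means every pixel has the same value; 1.0 means all `levels`
    values are used equally often.
    """
    total = len(pixels)
    if total == 0:
        raise ValueError("no pixels supplied")
    counts = Counter(pixels)
    bits = sum((c / total) * math.log2(total / c) for c in counts.values())
    return bits / math.log2(levels)


# Illustrative inputs, not real camera data:
flat_frame = [128] * 1024            # a washed-out, featureless image
two_tone_frame = [0, 255] * 512      # only two gray levels in use
full_range_frame = list(range(256)) * 4   # every gray level used equally
```

Here `normalized_entropy(flat_frame)` is 0.0 and `normalized_entropy(full_range_frame)` is 1.0, with the two-tone frame falling in between.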

Just as a quick example of how an information entropy function can be useful in video, we use it for auto-focus. Calculate the entropy of an image, auto-change focus a few steps, calculate entropy of the resultant image. Entropy (information content) either improved (+) or degraded (-). Auto-change focus an amount proportional to that change in entropy. Repeat. Result: continual small focus adjustments about perfect focus.
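The loop described above might look something like the sketch below. It is an illustration under assumptions, not our production code. `measure` is a hypothetical callback that would move the lens to a focus position, grab a frame, and return its entropy. `synthetic_entropy` is a made-up stand-in curve, and the `probe` and `gain` values are arbitrary tuning numbers.

```python
def autofocus(measure, position, probe=1.0, gain=25.0, iterations=40):
    """Hill-climb toward the focus position that maximizes `measure`."""
    for _ in range(iterations):
        before = measure(position)
        after = measure(position + probe)     # nudge focus a few steps
        # Positive change: keep going that way. Negative change: back off.
        # The move is proportional to the change in the score.
        position += gain * (after - before)
    return position


def synthetic_entropy(position, best=30.0):
    """Made-up focus curve peaking at `best`; stands in for a real camera."""
    return 1.0 - (position - best) ** 2 / 100.0


# Starting far from focus, the loop settles within one probe step of the peak.
settled = autofocus(synthetic_entropy, position=0.0)
```

Because the probe only looks in one direction, the loop settles about half a probe step short of the true peak. A real entropy-versus-focus curve is not a tidy parabola, so the gain would need tuning per lens.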

Also, we run entropy across two domains. First, we use it on each video frame. This is used for auto-focus and general quality assessment of blurry, foggy (e.g., dusty lens), or low dynamic range video. Second, we use it over time on the video stream. This selects segments of interest across video streams, because a static scene has low information content (once you've seen one frame, you can mostly predict all subsequent frames).
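One simple way to sketch the "over time" case is to take the entropy of the pixel-wise change between consecutive frames. This is an assumed formulation for illustration, with hypothetical function names and toy frames given as flat lists of 8-bit values.

```python
import math
from collections import Counter


def normalized_entropy(values, levels=256):
    """Shannon entropy of the value histogram, scaled to [0, 1]."""
    total = len(values)
    counts = Counter(values)
    bits = sum((c / total) * math.log2(total / c) for c in counts.values())
    return bits / math.log2(levels)


def temporal_activity(frames, levels=256):
    """One score per frame transition: entropy of the absolute pixel change.

    A static scene yields 0.0 for every transition, because every
    difference is zero and the next frame is fully predictable.
    """
    scores = []
    for previous, current in zip(frames, frames[1:]):
        change = [abs(a - b) for a, b in zip(previous, current)]
        scores.append(normalized_entropy(change, levels))
    return scores


# Toy data: three identical frames, then a pair where something moves.
static_clip = [[10, 20, 30, 40]] * 3
moving_clip = [[0, 0, 0, 0], [0, 50, 0, 90]]
```

Transitions that score above some site-specific threshold would be the "segments of interest". Sensor noise on old analog cameras will lift the floor above zero, so the threshold has to be set per camera.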

--------------------------------------------------------------

I can see where this wouldn't be a high priority, but wouldn't it be interesting if VMS manufacturers embedded such a capability for any arbitrary video input, either from a generic quality metric such as entropy, or programmed against a manufacturer-specific test chart (which they can sell for a ridiculous sum so as to make even more money)?

I tried googling "spatially invarient fourier transform", and then I just tried googling "inverse Fourier transform", and then I just googled "Fourier transform" and THEN I googled "Fourier analysis" and now I just feel stupid.

Any chance you can give an overview of the topic for laymen?

It has been a while since I took discrete mathematics and/or algorithms, but I believe what he is trying to say is that you can reconstruct an image using the inverse Fourier transform and the weights represented in the image.

http://en.wikipedia.org/wiki/Fourier_series

This is one of the great values in IPVM. Some yahoo comes on with babblespeak, and then the guys who do this every day tell you how to REALLY approach it.

Thanks to Elliot Stoll for the translation.

I did a quick review of Imatest's web page. It looks as if it's already been done much better than my imaginings and that there's no sense in reinventing the wheel. A rough analog of my suggestion is here, apparently fully implemented and accessible, if you're still interested in relatively automated characterization of video properties.

Here's my simple solution for showing image quality to an ex-cop: Get an HD camera, set it up beside each of the existing cameras and grab shots of both simultaneously with a subject in the FoV.

This is actually fairly easy to do. Buy a fixed HD camera with integrated zoom (we've been using an Avigilon H3 box for this recently), mount it on a tripod, connect it to a Pointsource and a laptop. Walk from camera spot to spot, adjusting the FoV to match (using the integrated zoom). With two people you can do dozens of cameras in a day.

Show the results side by side of the existing camera and the current gen HD one. Then the user can determine how significant an improvement a modern camera will make.

I was going to say something similar...

Analog/D1 resolution is pretty much dead these days for anyone who wants any kind of "performance" out of their CCTV system. Putting together an exhaustive test matrix seems like a total waste of effort at this point; better to spend that time and money on overhauling the system to something "current".

It seems like by this point, they've either gotten their ROI out of the previous system, or they're never going to. Trying to put it on life-support for another 3 years doesn't seem practical.

John, you are correct that this will easily provide images that can be used for subjective, side by side comparisons.

I'd also recommend taking those side-by-side shots with your subject holding a color test chart as well. I'd then put those side-by-side comparisons through image quality testing software like Imatest. This will provide objective, concrete values that can be used for comparison.

The very first thing you should do, I think, is try to define the purpose of each camera. "Which", as in "which camera", "which NVR", "which bit of technology am I going to use to solve this problem" is not the important question. "Which" can change any day, because the best camera of today is going to seem ridiculous and overpriced compared to the average camera of 2018. The first question you should be asking is "why"? Why do you have a camera there? "What" is another good one, as in "what is the camera supposed to be doing for me"?

So. First things first: define the mission of each camera. How many areas would we like to be able to see? From how far away would we like to be able to see a face? From how far away would we like to be able to see movement, say, distinguishing one person's clothing from another, or identifying a vehicle's color and general shape (not enough to tell an Impala from a Malibu, maybe, but allowing you to see the difference between, say, a Durango and a Grand Caravan)? This question will tell you more about required resolution and lenses than a four-hour PowerPoint on technical blahblahblah ever will. And it'll get the ex-cop thinking along with you, not in terms of what's possible (he simply doesn't know, although putting an HD camera on a tripod, as John suggests, will get him thinking along the right lines). But years of actually using the old system in place will tell your customer exactly what works and what doesn't, where you need a bit of extra width and where you need the extra resolution and where the sun comes out for two weeks in May and blinds the camera for twenty minutes a day.

Talk to the ex-cop. Talk to the security staff, the people who watch the cameras all day. Ask them about The Perp Who Got Away and The Hit And Run We Never Solved and The Time We Were All Sure A Janitor Was Stealing Toilet Paper But Couldn't Prove It. Ask to look at old footage from around the year and in different weather conditions.

And then figure out the "which".