Do Any Surveillance Cameras Offer RAW Output?

Do any surveillance cameras offer RAW format as opposed to MJPEG, H.264, etc? I know this wouldn't be possible for high frame rate video streams, but what about for time lapse/stills, etc? RAW has been around for quite a while with DSLR's and recently inroads for cinema/video cameras. I would love to have a single RAW image every second or so...and curious if anyone is dabbling with this for security applications?


The short answer is 'no'; the long answer is 'it depends what you mean by RAW'.

No surveillance cameras, AFAIK, offer a RAW option for video, at least as the term is used with DSLRs, i.e., raw sensor data that has neither been processed into a colorspace nor been compressed*.

RAW image data is therefore always uncompressed*. But uncompressed image data is not always RAW. Sometimes people refer to uncompressed video as RAW video, but this usage should be avoided, since RAW in the DSLR sense always means data that has not been processed into a colorspace.

In any event, since I think you actually mean 'uncompressed video', the answer is slightly more positive. I see three options, all with some drawbacks, but all possibly workable.

1) HD-SDI - Since you didn't specify it needed to be an IP camera, option 1 is the beleaguered HD-SDI based security camera, which, although not likely to be around in a couple of years, is easily obtainable today at reasonably cheap prices. These cameras output to the SMPTE HD-SDI broadcast standard, so the video is uncompressed but color processed. They normally work with HD-SDI DVRs, but to avoid the automatic compression the DVRs apply on input, you would want an SDI capture card in a PC to grab the uncompressed frames. There may be a way to save uncompressed frames on the DVR directly, but I didn't see one.

2) GigE - Another option would be GigE machine vision cameras, which can output uncompressed video onto a standard TCP/IP network. On the other end, though, you need a driver that can take the GigE stream and turn it into bitmaps. So again a PC, but no capture card this time. On the other hand, the cameras usually run in the thousands of dollars, so...

3) Native Camera SDK - An interesting option would be to use something like Axis' Embedded Device SDK. Uncompressed frames can be acquired and then saved to memory cards or FTP'd. What frame rate could be supported, I don't know - probably very low. Maybe someone has already made a simple uncompressed capture program. Vendor specific, though...

*compression used in the sense of lossy compression.

Note: I intentionally omitted any analog HD technologies, although one could argue they are not compressed, at least in the normal sense.

To expand on #2 in the excellent post above, any GigE Vision compliant camera will output uncompressed (and almost always unaltered color space) "video", which is really just a sequence of images. Indeed, many of these cameras are in the thousands of dollars, but a growing number are somewhat competitive (price-wise) with surveillance cameras - look at Point Grey or Basler. GigE Vision cameras won't plug into any ONVIF software/hardware, but multi-platform SDKs usually ship with each camera.

Disclosure: I work in this space, but not related to the companies I mentioned above.

We have had some success with using our Direct Show driver with various VMS systems, so this could be an option. We also have an ARM/Linux SDK, so you could create a very simple embedded solution.

Our Blackfly cameras are fairly affordable, costing below $500 for most models:

http://www.ptgrey.com/blackfly-gige-poe-cameras

Disclosure: I work for Point Grey.

Regards,

Vlad

Thanks for the comments. This should get me working in the right direction. My goal with looking for RAW (uncompressed) is to have an optimum image for further work in Adobe Photoshop/Camera RAW (if possible). In the still digital camera world, I shoot everything in RAW (as opposed to JPEG) and can do a lot more with it later in post processing.

Possibly building a Raspberry Pi camera may fit into this as well? It's something I had not yet looked at, but an area someone had mentioned considering.

Nick, not sure how much this will help, but a few years ago I discovered industrial machine vision cameras sold in the US by The Imaging Source; the nice thing for me as an end user is being able to buy the product directly. The lineup covers USB, GigE, and FireWire. I selected a USB model with a monochrome Sony CCD chip, 1280x960 with 4.6 micron square pixels. It is only 8-bit dynamic range, but my application is similar to what I think you're looking for: a way to get maximum image quality in post capture. I use the supplied Windows software package to control the camera in the capture phase, which IMO is an easy UI to learn and dial the camera in for each specific scene exposure. Files get rather large when I run at the max of 15 FPS on my model, but most of my subject matter is low light, often at longer focal lengths, so I am running much slower frame rates to get the data I am looking for. I use a stacking software application to ingest the original files and increase S/N in low light scenes, eventually finishing up in Photoshop. The resolution in the end images is nothing short of amazing when I have everything dialed in.

Kinda getting outside the scope of this forum, but... why not just pick up a cheap used DSLR and use something else to trigger the stills on a time lapse? The software that came with my Canon cameras could do this when connected via USB cable... or you could use your Raspberry Pi to either trigger via the remote shutter input, or scrape up a driver to connect it via USB. Doesn't even have to be a DSLR; a lot of point-and-shoot cameras support RAW output as well.

What are you lacking in your current images that you feel you would get out of a RAW image (realistically)?

Who/what would view the RAWs? They're not web or mobile device friendly, so they'd need to be converted to JPG/PNG/GIF (it's pronounced JIFF!), etc.

Great information thus far! But it's critical to note that video/image compression is vastly different from what the ISP (Image Signal Processing) engine does to convert RAW image data into the "uncompressed" video data that is fed into the H.264 or MJPEG engines.

RAW sensor data comes in what's called a Bayer pattern -> RGGB - Red, Green, Green, Blue - which no off-the-shelf security camera (to my knowledge) can output.

The reason for that is simple… That data is both MASSIVE and wicked fast. To give you an idea of how fast, a typical sensor is set up for a rate of 60Hz for 1080p video. Getting all that image data dumped from the imager to the processor in time works out to 2.376 gigabits per second. That's fast.
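That figure can be sanity-checked with some quick arithmetic. The assumptions here are mine, since the post doesn't spell them out: a full SMPTE 1080p raster of 2200 x 1125 clocks (including blanking intervals) refreshed at 60 Hz, with 16 bits per pixel on the sensor interface:

```python
# Back-of-envelope check of the 2.376 Gb/s figure.
# Assumed (not stated in the post): SMPTE 1080p timing with blanking,
# 60 Hz refresh, 16 bits per pixel on the wire.
clocks_per_line = 2200   # 1920 active pixels + horizontal blanking
lines_per_frame = 1125   # 1080 active lines + vertical blanking
frame_rate_hz = 60
bits_per_pixel = 16

bits_per_second = clocks_per_line * lines_per_frame * frame_rate_hz * bits_per_pixel
print(bits_per_second / 1e9)  # 2.376 (Gb/s)
```

Drop the blanking or the bit depth and the number changes, but it lands in the same "multiple gigabits per second" ballpark either way.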

The only way that modern architectures can keep up with the video flow is to have a dedicated hardware pipeline that converts the mosaic image from the sensor into RGB and ultimately YUV image formats.

https://en.wikipedia.org/wiki/Demosaicing
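To make the demosaic step concrete, here is a minimal bilinear demosaic of an RGGB mosaic in Python/NumPy. This is a sketch of the idea only - real ISPs use far more sophisticated edge-aware interpolation, and they run it in hardware over line buffers rather than on whole frames in software:

```python
import numpy as np

def conv3x3(img, k):
    """Same-size 3x3 convolution with zero padding (no SciPy needed)."""
    p = np.pad(img, 1)
    out = np.zeros_like(img, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def demosaic_rggb(raw):
    """Bilinear demosaic of an RGGB Bayer mosaic into an HxWx3 RGB image."""
    h, w = raw.shape
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1  # R on even rows/cols
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1  # B on odd rows/cols
    g_mask = 1 - r_mask - b_mask                       # G everywhere else

    # Bilinear kernels: fill each pixel from its nearest same-color sites.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

    r = conv3x3(raw * r_mask, k_rb)
    g = conv3x3(raw * g_mask, k_g)
    b = conv3x3(raw * b_mask, k_rb)
    return np.dstack([r, g, b])
```

Border pixels are slightly darkened by the zero padding; a hardware pipeline would mirror or replicate edge rows instead.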

That hardware is set up for speed and is usually 'untouchable' by the CPU and/or software. I say usually because if you're doing the processing with FPGAs or other specialized video engines, there can be a tap to get RGGB data from single snapshots via a special pipe.

RGGB data also makes for an awfully big image to keep in memory, so it's usually processed using "line buffers" in memory, which only contain stripes of the RGGB data mid-flight while it is being processed by the hardware engine.

Frame buffers on the other hand, store the entire frame, which is handy for 3D noise reduction where the ISP has to know the status of the pixel from the frame before to see if it’s noise or real.

Regardless, the CPU doesn’t usually get to touch the video data midflight, even if it’s part of DRAM memory that it has ‘access’ to. Even just snooping around there can interrupt the timing of the video feed which can mess everything up and cause jittery video. All of those bits and bytes are flowing around using hardware DMA engines and have top priority for memory access, even over the CPU.

All that said, if one had a magic wand, there would be some pretty awesome use cases for getting full RGGB/RAW frames out of the pipeline.

Many assumptions - color balance - white balance - exposure - etc, are made and hardwired into the conversion from RGGB -> RGB so that's why most pros/prosumers use RAW mode when they are shooting with DSLR's. Once that conversion is done, it's hard to 'undo' and hard to fix if something was wrong.

For RAW image data, if the exposure is 'close', Adobe Lightroom, PhaseONE, etc. can fix it easily... if it's RGB or YUV... not so much...
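A tiny illustration of why that is (the numbers here are made up for the example): a white balance gain on RAW data can simply be chosen differently later, but once the ISP has applied the gain and clipped the result into range, the blown highlights can't be brought back:

```python
import numpy as np

raw_red = np.array([0.30, 0.60, 0.90])  # linear sensor values (red channel)
bad_gain = 1.6                          # an overdone red white-balance gain

# What a baked-in pipeline outputs: gain applied, then clipped to range.
processed = np.clip(raw_red * bad_gain, 0.0, 1.0)

recovered_from_rgb = processed / bad_gain   # 0.90 comes back as 0.625 - gone
recovered_from_raw = raw_red                # RAW never applied the gain at all
```

The middle value survives the round trip because it never hit the clip point; the brightest one doesn't. That is the 'hard to undo' problem in miniature.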

The other challenge with doing those settings in a hardware pipeline, is that (generally speaking) all the settings are global to the entire image.

The only caveat to that is the new dynamic range schemes coming out now that allow multiple exposures to be multiplexed together to form a single, properly exposed, high-dynamic-range image.

White balance is global, as is the overall exposure, etc.

Higher end image processing tools (like Lightroom, PhaseONE, etc.) can change and re-process various parts of the image using different settings. In the case of mixed lighting, where you have sunlight streaming into a fluorescent-lit room, it's almost impossible to white balance properly with a single setting. The camera picks the dominant light, and the other lighting will have a strange hue to it.

Security cameras don’t really care about this so long as it’s not too ugly, but for photographic means, that is very important.

The ISP also throws away information as part of the color space conversion from RGB to YUV.

https://en.wikipedia.org/wiki/RGB_color_model

https://en.wikipedia.org/wiki/YUV

YUV is really handy for processing and storing image data: it keeps the 'intensity' Y values (how bright or dark a pixel is) separate from what color the pixel is ('U' & 'V'). Scientists long ago figured out that our eyes are very sensitive to how bright a pixel is compared to the pixel adjacent to it, but really bad at seeing color changes between those two pixels. So by throwing out the color data we can't see, you can shed a lot of pounds but still keep it looking pretty good.
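A minimal sketch of that trade-off, using the standard BT.601 conversion coefficients: the luma plane is kept at full resolution, while each chroma plane is averaged down to quarter size (4:2:0), halving the total number of samples before any codec even runs:

```python
import numpy as np

def rgb_to_yuv420(rgb):
    """BT.601 RGB -> Y'CbCr with 4:2:0 chroma subsampling.
    rgb: HxWx3 float array in [0, 1]; H and W are assumed even."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b        # full-resolution luma
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b  # blue-difference chroma
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b   # red-difference chroma
    # 4:2:0: average each 2x2 block, keeping one chroma sample per block.
    h, w = y.shape
    cb = cb.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    cr = cr.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, cb, cr
```

For an HxW frame this stores H*W + 2*(H/2)*(W/2) = 1.5*H*W samples instead of the 3*H*W of full RGB - half the data discarded, mostly invisibly.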

That’s one of the reasons why there are two green pixels versus only one blue and one red pixel respectively. Green usually conveys intensity for our eyes so it has the most information and the red and blue pixels tell the engine what color that green pixel really was.

Programs like Photoshop, PhaseONE, Aperture, etc. love to have all that extra image data. Having every bit of data makes resizing, color adjustments, fine exposure changes, etc., as accurate as possible. For security cameras, that's just a lot of empty calories, and throwing them away is the right call.

IPVM should add a "Best Of" voting category for posts like this.

Really great info Ian!

Question, is the colorspace processing done on the line buffers or on the frame buffers?

Is there even an instant in time when there exists an entire 'frame' of RAW data, to be captured?

Does the sensor's architecture, vis-à-vis rolling vs. global shutter, directly impact how the pipeline is processed?

One bright spot is that if a full RAW frame does exist, the OP is specifically referring to time lapse stills, so there could be a second or more between every frame with which to write the data.

Thanks! Great questions too!

The answer is HIGHLY dependent on the processor used, but the short answer is: Both.

Given that most 'better' video processors support "3D" noise reduction, the only easy way to do this is with a frame buffer. The processor has to look back in time to get a sense of how frequently each pixel is changing, so it has to have at least one frame behind it.

But that frame buffer could be used on RGB or YUV data, which negates the original author’s request of having all those delicious unmolested bits to play with. The noise reduction is easier on less information, so doing it only on the Y data makes sense. Noise is usually changes in intensity, especially in low light, so again, Y only fits the bill for a frame buffer.

If they’re doing that then the incoming RGGB data is fed into line buffers, processed into intermediary stages and then dumped to a frame buffer.
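The recursive flavor of that scheme fits in a few lines. This is my simplification - a plain exponential running average on the luma plane with one frame of state and no motion gating, whereas a real ISP gates the blend per pixel so that moving objects don't ghost:

```python
import numpy as np

def make_temporal_denoiser(strength=0.8):
    """3D (temporal) noise reduction sketch: blend each new luma frame
    with an accumulated history frame. One frame of state is kept - the
    'frame buffer' described above. strength: 0 = off, near 1 = heavy."""
    history = None

    def denoise(y):
        nonlocal history
        y = np.asarray(y, dtype=float)
        if history is None:
            history = y                     # first frame seeds the buffer
        else:
            # Static noise averages out over time; stable detail survives.
            history = strength * history + (1.0 - strength) * y
        return history

    return denoise
```

Feeding it a static scene with random sensor noise, the output settles to a noticeably cleaner frame than any single input; feed it motion without gating and you get exactly the trailing/ghosting artifact the per-pixel motion check is there to prevent.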

That said, memory is also getting insanely cheap and blazingly fast now, so having big DRAM buffers to support multiple FULL bayer frames isn't too bad (at least less bad than it was 3 years ago).

Even if it were RGGB data, all the voodoo the processing engines use to get frame rates and resolutions ultra-high might mean it's stored in non-contiguous hunks, so putting it all back together is at best 'tricky'.

These are very closely held secrets, and I don't have any inside knowledge aside from having worked in this space for a very long time and knowing the limitations pretty well, which gives insight into how it is architected.

The major limitation is that the CPU can't get in there fast enough to write out a full buffer's worth of data before the next frame comes in and clobbers the one you were trying to read out. Like before, it might not all be in contiguous chunks, and it's possible that there's never a whole frame that could be used.

All that to say, “I don’t know” but I’m guessing both :)

We have really wanted to get RAW / RGGB frames as well, and when I’ve plied the vendor’s hardware engineers with lots (and lots) of beer, they’ve allowed us to have “chunks” of RGGB data, but compiled from a bunch of frames that could then be combined to make up a complete image.

For time lapses of stuff that doesn't move very quickly, that technique would work, but it's very hairy, and those semi 'back doors' are prone to being lost in the next revision of the firmware. So even though this would likely satisfy the OP's dreams, we gave up on having this as a feature in our cameras, since it was too hard to maintain and rarely used.

Great question on Rolling Shutter versus Global Shutters… For the rest of the group reading along, there are two basic types of electronic shutters available with modern CMOS sensors.

A rolling shutter starts scanning at the top and then drops line by line, ‘rolling’ its way down the imager. Once it gets to the bottom, it starts at the top and keeps going, ad infinitum… This can lead to some very weird artifacts with objects that move through the scene while the capture is being taken.

For example, if a truck drives through the scene, the top of the truck will be 'seen' first. As the truck swipes through the image, each later row is 'seen' slightly later in time, by which point the truck has driven a few inches, so for the next row it 'looks' like the truck shifted over a bit. By the time the scan reaches the bottom, the entire truck will be leaned 'back', with the wheels significantly further forward than the top. Weird.
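Putting rough numbers on the truck example (my assumptions, not from the post: a 1080-row sensor read out over one 60 Hz frame time, and a truck doing about 30 mph, i.e. roughly 13.4 m/s):

```python
rows = 1080
frame_rate_hz = 60
truck_speed_mps = 13.4                       # ~30 mph

readout_time_s = 1.0 / frame_rate_hz         # ~16.7 ms from top to bottom
row_time_s = readout_time_s / rows           # ~15.4 us between adjacent rows
skew_m = truck_speed_mps * readout_time_s    # how far the truck moves mid-scan
```

That works out to roughly 22 cm of apparent lean between the top and bottom of the truck - exactly the 'leaned back' look described above. Faster sensor readout shrinks the skew proportionally.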

Global shutters capture the entire scene at one time, so the top of the truck and the bottom of the truck are captured exactly at the same moment, so there is no distortion.

Rolling shutters are much simpler to implement than global shutters, so they have reigned supreme in the video and cellphone markets.

The read out speeds are typically VERY fast as well, meaning that the time it takes from the top to the bottom has decreased dramatically as sensors have gotten faster. The faster you go from top to bottom, the less something moves in between, making it almost look like a global shutter was used. That’s why in my example above for the data rate, we put the sensor in 60 FPS (Hz) mode when at all possible, even if we’re only supplying 1080P30 to the H.264 engine. The faster frame rate makes things look better if they are moving.

Ok… all that aside, to finally answer your question, how does that change things buffer wise… the answer is not much.

The video frame is read in so quickly, even with a rolling shutter, that the processor has to gobble it in as fast as it can.

On the other side, the sensors with global shutters still have to feed that data out row by row as well, so they’re pretty much the same when it’s all said and done.

Adimec offers a line of machine-vision-turned-security cameras aimed at global security / situational awareness. Better still, they claim to be able to output Bayer data, so this is the closest match to the OP's request that I have come across so far.

@Ian, from the spec sheet, does this sound like pre-demosaiced data?


RED makes very high resolution digital cinema cameras (used by Jackson for Hobbit 1-3 and Cameron for Avatar 2-4) with 17 stops of dynamic range, and all output is REDCODE RAW. Typically for a Hollywood feature you'll record to onboard SSD mags at 6K (6144 pixels horizontal) and 120 fps. If you wanted to integrate this into a surveillance system, you would want to tether via GigE instead of using SSD mags. GigE obviously has less throughput than onboard SSD, but you could still get 4K REDCODE RAW at 60 fps, or 6K at around 50 fps. We typically find GigE delivers around 80MB/s, so you can experiment with the frame size/frame rate/REDCODE compression rate to see all possible options at http://www.red.com/tools/recording-time

I haven't used the AXIS SDK so I'm not sure how it compares to surveillance standards, but all aspects of the camera are controllable via our open REDLINK SDK, and if you wanted to write software to access this data inside of a VMS, you could do that via our R3D SDK. Most of the documentation is publicly accessible at https://www.red.com/developers

Of course, digital cinema is our primary market so at this time I can only see this working for very high end projects.

Have you checked out CHKD for Canon cameras? The relevance depends upon your intended application.

Assuming you mean CHDK...