Here's a whitepaper from the researchers on this. Most interestingly, the software itself has been open sourced (see the code and documentation on GitHub).
There are 2 key parts to this:
(1) detecting RTSP video feeds and then replacing them with manipulated outputs, excerpt:
"The camera sends its network stream over RTSP (Real Time Streaming Protocol), which uses TCP to set up the video stream, RTCP (RTP Control Protocol) over UDP for synchronization, and RTP (Real Time Protocol) over UDP to transport the H.264-encoded video. Each layer in lens parses apart incoming packets, applies any modifications, and reconstructs outgoing packets."
"In practice, video encoded with H.264 can be looped without re-encoding by concatenating the looped segment indefinitely. However, by re-encoding the stream with mpeg, an attacker can apply standard mpeg transformations to the video. For example, the attacker could loop most of the video, without modifying the corner of the frame that contains the timestamp."
(2) Tapping into a network cable, described below:
The software part is available but I am trying to figure out what the status is of the tap / hardware.
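To make the layering in the excerpt under (1) a bit more concrete, here is a minimal sketch of what "parse apart, modify, reconstruct" could look like at the RTP layer alone. It is not taken from the researchers' code. The names `RtpHeader`, `parse_rtp_header` and `replace_payload` are made up for illustration, and it only handles the 12-byte fixed RTP header plus any CSRC entries, not header extensions, padding, RTCP or the H.264 payload format.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    """Fields of the 12-byte fixed RTP header (hypothetical helper type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> Tuple[RtpHeader, bytes]:
    """Split a UDP payload into its RTP header fields and the bytes after it."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    header = RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
    # Skip the optional CSRC list (4 bytes per entry). An extension header,
    # if present, is NOT skipped in this sketch and stays in the returned bytes.
    offset = 12 + 4 * header.csrc_count
    if len(packet) < offset:
        raise ValueError("packet truncated inside the CSRC list")
    return header, packet[offset:]


def replace_payload(packet: bytes, new_payload: bytes) -> bytes:
    """Keep the original header bytes untouched and swap in a different payload."""
    header, old_payload = parse_rtp_header(packet)
    header_len = len(packet) - len(old_payload)
    return packet[:header_len] + new_payload


# Example: a hand-built packet with version 2, payload type 96, sequence 7.
sample = struct.pack("!BBHII", 0x80, 96, 7, 1000, 0xDEADBEEF) + b"abc"
parsed, body = parse_rtp_header(sample)
swapped = replace_payload(sample, b"looped-frame")
```

The point is only that nothing in the header ties the payload to the camera: sequence number, timestamp and SSRC are all plain fields that a device in the middle can read and keep consistent while substituting the video.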
This is the most advanced / general attempt I have seen, especially the open sourcing of the software (compare to the 2009 Defcon attempt).
Anyone have thoughts on this?
This is theoretically more doable on a basic system: unmanaged switches, default camera passwords, unencrypted video.
On many mid to higher-end managed switches the switch will renegotiate the entire link if there is a connectivity drop of more than a few milliseconds. You run a big risk here of upsetting the switch and/or upsetting the VMS, causing it to detect the camera as down and then up, resulting in a full reconnect. If your device isn't able to handle the full auth part, then it will be obvious there is a problem.
A more practical approach might be to insert the device well in advance of carrying out the switchover. That way, if a camera dropout becomes visible to the operator, they are less likely to notice anything "off" with the feed and might just attribute it to network congestion.
People don't use 802.1X. It only authenticates the first hop; it doesn't secure the data.
Yes, you can do an RTSP loop. No, that's not sophisticated. The team that did the hacking demo at PSA TEC this year in Colorado (I did color commentary on the side of their preso) was doing this, or was prepared to do this, with audio and video.
People don't look at link up/down. If you caught that and treated it like an alarm, all this stuff could be mitigated. This requires managed switches and a log server that's listening. Some integrators and many large enterprises can handle this; it's not impossible.
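As a rough sketch of what "treat link up/down like an alarm" could mean on the log server side, the snippet below scans switch syslog lines for a port that goes down and then comes back up. The log wording it matches ("Interface X, changed state to up/down") is an assumption modeled on a common switch message style and will differ by vendor, and the names `link_events` and `flapped_ports` are made up for this example.

```python
import re
from typing import Iterable, List, Tuple

# ASSUMPTION: the switch logs link changes in this wording. Adjust per vendor.
LINK_EVENT = re.compile(
    r"Interface (?P<port>[^,\s]+), changed state to (?P<state>up|down)"
)


def link_events(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (port, state) for every line that reports a link state change."""
    events = []
    for line in lines:
        match = LINK_EVENT.search(line)
        if match:
            events.append((match.group("port"), match.group("state")))
    return events


def flapped_ports(lines: Iterable[str]) -> List[str]:
    """Ports that went down and later came back up, in the order first seen."""
    went_down = set()
    flapped: List[str] = []
    for port, state in link_events(lines):
        if state == "down":
            went_down.add(port)
        elif port in went_down and port not in flapped:
            flapped.append(port)
    return flapped


# Made-up sample log lines for illustration only.
sample_log = [
    "Mar  1 02:14:07 sw1 %LINK-3-UPDOWN: Interface GigabitEthernet0/1, changed state to down",
    "Mar  1 02:14:09 sw1 %LINK-3-UPDOWN: Interface GigabitEthernet0/1, changed state to up",
    "Mar  1 02:20:00 sw1 %LINK-3-UPDOWN: Interface GigabitEthernet0/2, changed state to up",
    "Mar  1 02:21:00 sw1 unrelated message",
]
suspect = flapped_ports(sample_log)
```

On a camera port that normally never drops, any entry in `suspect` would be worth an alarm, since an inline tap usually has to break the link at least briefly to get inserted.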
Yes, you should be using TLS. When IT security people say "use TLS" they mean FOR ALL THE TRAFFIC, not this command-over-TLS/critical-video-in-the-clear sloppy engineering prevalent in the VMS world.
In my experience many VMS/camera vendors make protocol-illiterate claims like "tcp is slower" and "you can't do crypto in a camera" and "you can't do TLS in a server" and "you can't do that in a camera". They're all at risk when they do this. What they really mean is they buy crap low-end OEM cameras with weak processors and they're too lazy to re-architect their VMS software away from its late-90s code base.
No, not sophisticated. About average for real hackers.
This article and the blog comments might lead one to think IP video in ANY scenario is not secure and maybe shouldn't be used without high-level network security applied, which most small to medium video customers can't/won't pay for. I think it's important to discuss what the actual risk of this kind of hack would be given the type of customer and the application.
John you mentioned: "Two key parts to this"
Part 2 looks like you'd have to have physical access to the network cable attached to each camera? So the first penetration would need to be physical. If that's the case, I can see an Ocean's 11 operation hacker taking the time to penetrate the physical security (access control, barriers etc.) and doing this, but I'd think 98+% of the security systems in place would not be a target for this kind of attack.
That said, if Part 1 and Part 2 are completely separate and Part 1 can be accomplished without ever touching the physical network cable, that would be the one most likely to affect the 98+%, and that would be a reason to lose sleep (or go back to standalone, off-network, analog tech).
Am I misunderstanding something? I guess I'm just thinking it's a very small risk in the majority of cases and I wouldn't lose sleep over it, but it should definitely be addressed for higher security apps.