Windows Based VMS Users: How Often Do You Reboot?

Windows machines have a reputation for needing to be periodically rebooted. Almost all VMS software applications run on Windows. This raises the question, posed here as a poll:


(Meanwhile, all the Linux operators are busy grabbing screenshots of "$ uptime" results to post.)

I selected "Weekly" just because Vigil comes pre-configured to reboot at midnight every Sunday/Monday, and we never bother to change it (unless specific circumstances dictate).

Just checked one system where we had disabled the auto reboot, I forget why...

C:\Documents and Settings\Administrator>uptime
08:59:03 uptime 103 days, 22:55:08

Still running smoothly....

I would suggest separating servers from workstations. In my experience: servers rarely, but workstations every 2-3 weeks.

Alexander, good point. Too late to change the poll but members feel free to mention this in the comments.

About every 6 months for servers. Memory leaks in the last three versions of the VMS software force this. Older versions never required rebooting.

For clients, whenever they need it.

Monthly, due to updates. Haven't yet had to reboot it for running below standard.

Never unless there's a need, usually either patching the O/S or the VMS software.

In my personal experience, only Windows 98 had some memory leaks during file system operations.

Since then I have never had any problems with Windows reliability.

Windows, no. Other software, ???

Usually every 90 days for servers. Not that it's needed, but that's the time frame between the preventive maintenance visits we perform for our customers, so we take that opportunity to also clean the hardware. Dust always finds a way in.

Client machines are rebooted when needed.

Btw, I found it interesting that, by default, Mobotix cameras reboot every night. Over the years, I have found that a handful of products do that, but it makes me nervous: if something important happens in that two- to three-minute period and the user did not know about the reboot, it will be a crisis.

John, it's usually just shy of 1 minute from no picture to picture during the reboot. But I agree, it is a little dangerous.

Why would cameras require nightly reboots? It seems to me that could be construed as a lack of trust in their own software.

Please don't ask me to explain Mobotix's approach :) but yes I agree seeing a default daily reboot is a yellow flag. On the other hand, Mobotix does have that's their secret :)

Our cameras also come out of the factory with a default daily reboot. This can be disabled/changed, and the downtime is about 30 seconds (20s for camera OS startup and ~10s for the VMS to restore the stream).
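For a rough sense of scale, the arithmetic behind a 30-second daily reboot can be sketched in Python. The 30-second figure comes from the comment above; the function name and the 365-day year are assumptions made only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_reboot_downtime(outage_seconds: float, days: int = 365):
    """Return (hours offline over `days`, fraction of each day offline)
    for a device that reboots once per day."""
    total_seconds = outage_seconds * days
    return total_seconds / 3600, outage_seconds / SECONDS_PER_DAY


hours, fraction = daily_reboot_downtime(30)
print(f"{hours:.2f} hours offline per year, {fraction:.3%} of each day")
```

Under those assumptions the camera is blind for roughly three hours a year, always inside the same nightly window, and that does not count any extra time the VMS may need to reconnect.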

So from your comment above, John, should I assume that this is likely not the "world's best approach" to automatic maintenance? If so, I will speak to engineering to have that changed. Your opinions are highly appreciated as we are always looking for ways to improve.

Bohan, I am not a fan of undisclosed default reboots :)

If you have not seen our test results, see our IP Camera Bootup Shootout. Beyond the time it takes to reboot, there is always a risk that the VMS has an issue reconnecting or that there is a delay (especially if you are using it with 3rd party systems).

Ultimately, the decision needs to factor in the downside of running for long periods without reboots (maybe this causes performance issues, etc.). That said, every day seems to be overkill.

Mirroring the question above: why would a camera need a nightly reboot?

If the firmware or hardware build is that unstable, should I even be using that camera?

Btw, I got feedback from Axis. There is no default nor any recommendation for rebooting Axis cameras. If one wanted to, they could reboot the camera by scheduling an event.

I couldn't agree more; an automatic reboot of anything is a band-aid for a bigger issue.

It depends on the quality of the software. If the software has no memory leaks, then there is no reason to reboot, ever.

How many software applications never have any memory leaks? :)

Currently, programmers have many tools to debug such things.

But maybe with embedded programming it is not so easy :)

The problem isn't how easy or hard something is to debug... the problem is getting the guys with the actual source code to do it in a reasonable time frame!

Well, this is not our problem at all. Our usual problem is finding where the leak is and how to reproduce it.

We never recommend rebooting the system, because in that case we'll never know that the problem exists.

Matt, your comment is close to my heart. Every year I have to tell a few consultants and integrator service technicians that a reboot is a last resort, not a first action. Certainly for an industrial-grade product it should not be a daily action.

It is true that cable companies always instruct you to reboot your set-top box/DVR as the first action, for a number of reasons, some of which are technically valid and some of which are financial. If rebooting fixes your problem, and it only happens once or twice a year, they'd rather not spend money on troubleshooting it. In a few years the box is likely to be replaced, given current technology trends. They are already working on the next generation.

My Cisco-built cable box/DVR periodically gets automatic updates and reboots, usually in the middle of the night. Which is why I didn't even know about it for over a year after I got the box. I like sleeping.

But back to our topic, if you reboot a Windows machine you lose the machine (hardware/OS/application) state, and may lose log file data from applications that don't get it written to disk before the shutdown. I'm amazed at the number of technicians that don't check the Windows system logs for warnings and error messages, and I have found several VMS systems where there were hundreds of such messages in the troublesome server, never inspected.

The non-diagnostic fix-by-reboot approach for Windows workstations and consumer machines is completely inappropriate for a critical server, especially one that is intended to be functioning 24/7.

But back to our topic, if you reboot a Windows machine you lose the machine (hardware/OS/application) state, and may lose log file data from applications that don't get it written to disk before the shutdown.

Potentially true if you're doing a hard reset. However, if you do a proper shutdown (or if software initiates it), Windows should close out its logs cleanly, and any properly-written software should also exit smoothly, closing out its logs along the way. Nothing should be lost if the system uses proper procedures.

I'm amazed at the number of technicians that don't check the Windows system logs for warnings and error messages, and I have found several VMS systems where there were hundreds of such messages in the troublesome server, never inspected.

I'm with you on that one. There's a wealth of information in there that can point the way to a wide variety of problems, if you just look.

Just the fact that this question needs to be asked, begs another.

Why isn't Linux more popular amongst VMS manufacturers? Would it turn off that many Integrators?

Even if we concede that Windows is now a lot more stable (compared to its own abysmal past) and disregard stability as a criterion for evaluation, there are lots of other pluses for Linux:

  • Better as a "headless" device. VMS Servers shouldn't have monitors attached anyways (IMHO), and Linux is basically designed to be run from the command line via utilities like SSH.
  • Easier remote management (especially over low bandwidth). Again - SSH and command line management.
  • More secure.
  • Everything is free. The initial install, future upgrades, more management tools than you can count, all free.

There is some learning curve with Linux, but distros like Ubuntu have made it pretty darn easy for the novice to get up and running.

Why isn't Linux more popular amongst VMS manufacturers?

Disclosure: I personally have used Ubuntu, Fedora, and Debian.

Wish I was driving around a chick-magnet like that!

A serious answer to "why not Linux?":

1. Linux isn't very approachable to non-techies. (Whether this is true or not is beside the point, it has the perception of being complex.) The guys championing Linux are the guys already using Linux.

2. The 'open source' nature can scare away 'for-profits' who worry about their ability to monetize pieces of code. In earlier years, this argument was much stronger, but it still persists.

3. Linux-based talent is relatively difficult to find, compared to the myriad of Windows OS certification programs at the local community college or vo-tech.

4. Many entities have long standardized on Windows and have no compelling business or technology case to move to Linux. Even as evidenced by the poll results, a traditional 'weakness' of Windows (regular reboots) isn't all that common.

Most of the major VMSes were developed in another era when Windows was dominant. VMSes developed more recently (like NLSS and Network Optix) are not Windows based. Porting old ones to Linux would come at a huge cost.

It is more accurate to say that new ones are OS independent. And there is a demand for this. Computing today is not only Windows+Intel.

Matt, the statement "any properly-written software should also exit smoothly" does not always apply, such as when the software itself is in a loop that may or may not complete soon, and the user employs the "end task" function or the shutdown process simply terminates the non-responsive process. I have seen software applications that were actually running, processing data and writing to disk, but had a temporarily (5 to 10 minutes?) non-responsive UI, get terminated during the shutdown process, which to my understanding happens only if some parts of the software are not properly written.

My impression is that software in the physical security industry is overall getting better, but there are still some pretty poor software products out there in terms of bugs and bad design.

I think we're generally in agreement, with slightly different experience tracks. You are right that if a properly-written application initiates a shutdown, all other properly-written software should respond correctly and all logs would be preserved.

As Yogi Berra is credited with saying, "In theory, theory and practice are the same. In practice, they’re not." Especially true in the security industry.

Matt, the statement "any properly-written software should also exit smoothly" does not always apply, such as when the software itself is in a loop that may or may not complete soon, and the user employs the "end task" function or the shutdown process simply terminates the non-responsive process. I have seen software applications that were actually running, processing data and writing to disk, but had a temporarily (5 to 10 minutes?) non-responsive UI, get terminated during the shutdown process, which to my understanding happens only if some parts of the software are not properly written.

Ray, thanks for making my point for me :-) Really, I don't see that this disagrees with anything I said above. Of course, there is no accounting for the human factor. That includes other user-installed software that may interfere with the proper shutdown of other well-written software.

You are right that if a properly-written application initiates a shutdown, all other properly-written software should respond correctly and all logs would be preserved.

That's the key. I've seen DVRs loaded up with all kinds of crap by the users that cause all manner of problems. I've seen DVRs riddled with viruses and spyware because someone decided to use it for their general internet surfing. I've had a customer call to complain their system wasn't recording, only to find out their kid figured out the DVR computer was more powerful than the family's desktop machine, and was using it to download movies and burn DVDs, then leaving the DVR software offline when he was done. Again... the human factor.

As Yogi Berra is credited with saying, "In theory, theory and practice are the same. In practice, they’re not." Especially true in the security industry.

I've worked a lot of different jobs in a variety of disparate industries in my life - I can't think of one where this doesn't apply :)

If you are running a Windows application that requires monthly or even yearly reboots, then you need to find a better application. Reboots are unheard of on enterprise-class applications running a server-based OS.

Christopher,

You state "Reboots are unheard of on enterprise class applications running a server based OS." Perhaps you should talk to our IT department, who regularly send out emails discussing the need to reboot certain systems and warning users to log off or lose data.

The problem with Windows is that it has to be "all things to all men" - in other words it is designed for the broadest possible range of applications. As the vast majority of the target market for Windows is not real-time, Windows is not ideal for real-time applications. Apart from reboots and updates, typically you require third-party anti-virus and beefed-up firewall software.

However, for me one of the key weaknesses of Windows is how poorly it handles time synchronisation. Even Microsoft's own knowledge base on Windows virtually says that it isn't recommended for time-critical applications. Thus Windows handles NTP quite poorly, updating once per week typically. It also can't act as an NTP master server very well at all. Again, third party software is required to make that work and to force the PC to update the time-sync much more frequently. It always astonishes me how little attention is paid to time-synchronisation in security applications, as a key use of security systems (especially video surveillance) is evidence gathering. Accurate and synchronised time across all the servers and embedded systems is therefore essential.

Linux, on the other hand, can be configured by skilled manufacturers to do just the job required extremely well and extremely reliably. I'm not suggesting that Linux is immune to viruses of course, but the simpler the OS, the easier it is to secure. It is also very flexible with regard to NTP time synchronisation.

I completely agree with Brian that engineers with extensive Linux skill-sets are harder to come by and I'm very thankful that we have a fantastic team in that regard.

"Thus Windows handles NTP quite poorly, updating once per week typically."

You know, that can be changed very easily, and it doesn't take a skilled engineer - just a single registry tweak, which can also be made via a variety of tweak utilities.

One thing about being "all things to all men", it also means a lot of tweaks, changes and enhancements can be performed "by all men" without needing a BSc in computer science.

How to do it.
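The comment does not say which registry value is meant. One value commonly associated with the Windows Time service's polling period is SpecialPollInterval (in seconds) under the W32Time NtpClient key; treating that as the intended tweak is an assumption here, not something confirmed in this thread. The Python sketch below only builds the `reg add` command text so it can be reviewed before anyone runs it on a test machine. The helper name is hypothetical.

```python
# Assumed location of the Windows Time client settings (not named in the thread).
NTP_CLIENT_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient"
)


def poll_interval_command(seconds: int) -> str:
    """Build (but do not run) a `reg add` command that sets the poll interval."""
    if seconds <= 0:
        raise ValueError("poll interval must be a positive number of seconds")
    return (
        f'reg add "{NTP_CLIENT_KEY}" /v SpecialPollInterval '
        f"/t REG_DWORD /d {seconds} /f"
    )


# One week is 604800 seconds; 3600 would mean hourly.
print(poll_interval_command(3600))
```

The time service generally has to be restarted, or told to reload its configuration, before a change like this takes effect, and whether the value is honoured can depend on how the time source is configured.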

Assuming you are talking about time drift, some Linux-based IP cameras suffer the same problem. Also, the server time could be set to 1776 and it wouldn't matter because the video file transmitted from the IP camera includes date/time information in the header.

We have some servers (non-VMS) which have been up for years. As said, it depends on the apps and how well they are written, and how healthy the OS build is. Generally, the simpler the OS build (fewer apps, fewer changes), the less there is to go wrong. As you move into a multi-server environment, simple automatic reboots may not work, as you may need to sequence your server start-ups to get them up clean (as some rely on others being up first).
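The start-up sequencing problem is essentially a dependency ordering, which can be sketched with a topological sort. The server names and dependencies below are made up for illustration; only the standard-library `graphlib` module (Python 3.9+) is used.

```python
from graphlib import TopologicalSorter

# Hypothetical servers; each maps to the set of servers that must be up first.
depends_on = {
    "database": set(),
    "recorder": {"database"},
    "management": {"database"},
    "client-gateway": {"recorder", "management"},
}


def startup_order(deps):
    """Return one valid boot order in which every server starts after
    everything it depends on. Raises graphlib.CycleError on a circular
    dependency."""
    return list(TopologicalSorter(deps).static_order())


order = startup_order(depends_on)
print(" -> ".join(order))
```

A scheduled reboot script could walk that list and wait for each server to report healthy before starting the next, rather than restarting everything at the same moment.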

Speaking of uptime... years ago I worked in IT for a local crown corporation. They had a "Special Investigations" office (fraud investigations, mainly) that had a half-dozen OS/2 workstations on their own private LAN with an OS/2 server, completely detached from any other networks. When the time came for them to move into a new facility, being the resident OS/2 guru, I was tasked with shutting it all down and prepping the systems for the move.

Only problem was... nobody knew where the server lived. Tracing the cables was getting frustrating because of all the other furniture and fixtures hiding things. I told them to call me back when the filing cabinets and such items had been removed...

...and there I found it, in a back corner: an old IBM PS/2 Model 90 monster tower, what had been a top-of-the-line 486 when it was installed. Found the keyboard and mouse tucked in the corner with it... hooked up a monitor, went in to shut it down... checked the uptime, just out of curiosity... I don't recall the exact number, but it was well over 1500 days(!!!).

I downed the machine, pulled it out... saw it looked a little dusty, so I opened it up... it was PACKED FULL with dust bunnies. Top to bottom, front to back. Thing had been running in a corner for over four years straight, totally choked with dust, and still working flawlessly. Fortunately it was an EARLY 486 that didn't have a CPU fan - just the PSU fan and one case fan.

Of course, it also helped that this thing ran ONLY their specific software, no extra crap or games or BS that is typically the root of system instability.

Good comment this morning!

To Matt's point about being filled with dust bunnies, I ran across an even more extreme example recently. Check out this one filled with roaches.

John, back in those days, I remember another story about a Model 90 server that techs couldn't find until they followed the network cable... and found it sealed INSIDE a wall - drywallers had simply boarded over it several months previous, and there it sat, chugging along. This may sound like an urban myth (how could they not notice the computer in there??), but given some of the $#!t I've seen drywallers do, it seems completely plausible to me.

Brian, we did a shootout of IP camera bootup times. In our test, slightly over 1 minute was average. This excludes the time / any delay in the VMS reconnecting, which could add tens of seconds or maybe a few minutes depending on how often the VMS reconnects and any issues it might have reconnecting.

The timing issue also depends on how recorders handle timing. Many recorders simply time stamp the video when they receive it, regardless of what the input time is. This eliminates any concerns about the camera's time being off but exposes the risk of the time stamp being off (at least slightly) if there are network delays. See our post on Resolving IP Camera / VMS Time Synch Problems.
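The trade-off between the two stamping approaches can be sketched with a toy example. Every value here (the capture time, the five-minute camera clock error, the 200 ms network delay) is hypothetical, and the recorder's own clock is assumed to be accurate.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Frame:
    camera_time: datetime    # timestamp the camera put in the stream
    received_time: datetime  # recorder's clock when the frame arrived


def stamp_on_receipt(frame: Frame) -> datetime:
    """Recorder ignores the camera's clock and uses its own arrival time."""
    return frame.received_time


true_capture = datetime(2013, 1, 1, 12, 0, 0)
camera_clock_error = timedelta(minutes=5)    # hypothetical camera drift
network_delay = timedelta(milliseconds=200)  # hypothetical transit time

frame = Frame(
    camera_time=true_capture + camera_clock_error,
    received_time=true_capture + network_delay,
)

receipt_error = stamp_on_receipt(frame) - true_capture
camera_error = frame.camera_time - true_capture
print(f"stamp on receipt is off by {receipt_error}")
print(f"camera stamp is off by {camera_error}")
```

Stamping on receipt bounds the error by the network delay, while trusting the camera carries over whatever drift its clock has accumulated.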

I'll usually just turn off the time OSD on cameras and let the DVR handle the timekeeping - besides the fact my cameras are usually on a private LAN with no access to an outside time server, they can too easily drift differently and become out of sync with the DVR and each other, creating even more confusion. At least if the DVR handles all the timestamping, the playback of individual channels will all show the same time.

There's also the issue with server-side motion detection, that a ticking clock on the camera's display will keep the MD triggering constantly, sometimes requiring a fair-sized area to be masked out. Eliminating the OSD eliminates that problem entirely.

Of course, this is pretty much the only course if you're dealing with a hybrid recorder, where analog cameras need to be sync'd as well.

We've had issues with lock-ups where the system looked like everything was normal. When we rebooted, we found out the system had been stuck in an endless loop; the displays never changed, and only the hard drives were affected.

Had an issue with an Intel chipset that came faulty and Intel had to reboot; it cost me 2 days of time, with no recompense for the headache.

Had an issue where the outputs to the monitors would look normal while the program was recycling information, and we barely found this problem.

For the most part, if you are reviewing and logging in on a regular basis, you will find problems before they become an issue. It is when the owner does not want to pay for tech services that things slip by. I would like to hear everyone's take on Linux vs. Windows, or Windows 8 vs. 7.

We reboot when virus updates are applied. This is a requirement of a govt customer. It actually causes more problems for us than it solves.

This happens about once a month.

Your survey is missing an option for "When I apply OS patches". Our Windows servers and VMS are pretty stable. We install OS patches every quarter. Specific reboots for VMS issues are rare, maybe 1-2 a year.

My limited knowledge of Linux makes me believe that OS patches generally don't require reboots. Microsoft Windows patches normally need a server reboot. This seems to be one of the biggest factors over how often a server needs rebooting.

My limited knowledge of Linux makes me believe that OS patches generally don't require reboots. Microsoft Windows patches normally need a server reboot. This seems to be one of the biggest factors over how often a server needs rebooting.

Linux services and programs run in something called 'user-space', which is basically protected memory and execution outside of the kernel.

Because most software and system services reside in these areas, Linux and Unix machines can be updated and patched by just restarting affected services.

Linux kernel changes/upgrades DO require a reboot. It's really the only situation that requires it. Even then, it's only for the kernel itself, and not modules, which are hot-pluggable, and independently restartable depending on the nature of the module.
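On Debian and Ubuntu systems, package updates that need a restart (typically kernel updates) conventionally leave a flag file at /var/run/reboot-required. Other distributions use different mechanisms, so the path below should be treated as an assumption about the target system. A minimal Python check might look like this:

```python
from pathlib import Path

# Debian/Ubuntu convention; an assumption for any other distribution.
DEFAULT_FLAG = Path("/var/run/reboot-required")


def reboot_required(flag: Path = DEFAULT_FLAG) -> bool:
    """Return True if the package manager has flagged a pending reboot."""
    return flag.exists()


if reboot_required():
    print("A reboot is pending, likely after a kernel update")
else:
    print("No reboot flagged; restarting the affected services may be enough")
```

A maintenance script could run a check like this after patching and schedule a reboot only when the flag is present, instead of rebooting on a fixed timer.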

Windows, on the other hand, has only recently moved video and printer drivers outside of the kernel-space. These were the primary reason behind restarts.

The other reason Windows requires restarts for patches is open files. Instead of handling these files intelligently, Windows just assumes that if a file is open and is being replaced by a patch, it should queue the replacement and handle it on reboot.