It depends on the software. Milestone, and I suspect almost any popular VMS, is not very CPU hungry on the recording side.
The two biggest CPU consumers are motion detection and transcoding. But we all have strategies to minimize CPU use for motion detection: analyzing keyframes only, ignoring a percentage of the pixels, or, in some products, pulling a low-resolution substream just for motion detection.
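To make the first two tricks concrete, here is a rough sketch of keyframe-only analysis combined with pixel subsampling. This is not how Milestone or any particular VMS implements it; the function names, the frame layout (a keyframe flag plus a grayscale 2D list) and the threshold values are all assumptions made up for illustration.

```python
# Illustrative sketch only: names, frame format and thresholds are assumptions,
# not taken from any real VMS.

def motion_score(prev, curr, step=4, threshold=25):
    """Return the fraction of sampled pixels whose brightness changed.

    Only every `step`-th pixel in each direction is examined, so step=4
    touches roughly 1/16 of the pixels in the frame.
    """
    changed = 0
    sampled = 0
    for y in range(0, len(curr), step):
        row_prev, row_curr = prev[y], curr[y]
        for x in range(0, len(row_curr), step):
            sampled += 1
            if abs(row_curr[x] - row_prev[x]) > threshold:
                changed += 1
    return changed / sampled if sampled else 0.0


def detect_motion(frames, step=4, threshold=25, min_fraction=0.02):
    """Return the indexes of keyframes that differ from the previous keyframe.

    `frames` is an iterable of (is_keyframe, pixels) pairs, where `pixels`
    is a 2D list of grayscale values. Non-keyframes are skipped entirely,
    so they cost no analysis time.
    """
    prev = None
    events = []
    for index, (is_keyframe, pixels) in enumerate(frames):
        if not is_keyframe:
            continue
        if prev is not None:
            if motion_score(prev, pixels, step, threshold) >= min_fraction:
                events.append(index)
        prev = pixels
    return events
```

The cost scales with the number of keyframes and the number of sampled pixels rather than with the full frame rate and resolution, which is why the CPU load stays low. The trade-off is that small or very brief movement can be missed.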
The result is that for small systems like this, an i3 is adequate, so long as you are not running a client on the same machine as the server.
Personally, I wouldn't go i3, because the cost difference between an i3 and an i5 is negligible. Why not get more bang for your buck and a little margin for error or growth?
Side note: The XProtect Advanced VMS codebase utilizes hardware acceleration on the recording server now, so motion detection is done using Intel Quick Sync Video when available. It would probably sing pretty well on an i3 with the OP's project spec.