The problem with this is that it is a file, in MP4 format, that is being streamed. Its decoding time stamps (DTS) and presentation time stamps (PTS) (see tutorial) may not reflect the real-time timing disparities that can arise in a camera's live feed of H.264-encoded bytes, as opposed to a streamed file. While helpful, I'd still like to have a real-time camera to test against.
For example, when running ffmpeg against my Reolink camera, I occasionally get:
[h264 @ 0x556acd1b4c80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[null @ 0x556acd1922c0] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 1770 >= 1770
[h264 @ 0x556acd260c80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd27d500] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[null @ 0x556acd1922c0] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 1773 >= 1773
[h264 @ 0x556acd299cc0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd2b6480] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd2d2c40] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[null @ 0x556acd1922c0] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 1776 >= 1776
[h264 @ 0x556acd1acf80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd1e4340] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd1be400] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[null @ 0x556acd1922c0] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 1779 >= 1779
[h264 @ 0x556acd1b4c80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
frame= 1518 fps= 26 q=-0.0 size=N/A time=00:00:59.64 bitrate=N/A speed=1.04x
[h264 @ 0x556acd260c80] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd27d500] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd299cc0] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[null @ 0x556acd1922c0] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 1782 >= 1782
[h264 @ 0x556acd2b6480] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
[h264 @ 0x556acd2d2c40] nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 1
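To compare runs against different sources, it may help to count these warnings rather than eyeball them. Here is a minimal Python sketch that scans captured ffmpeg log text for the muxer warning shown in the log above. The function names (find_dts_warnings, summarize) and the sample string are my own for illustration and are not part of ffmpeg; the sketch assumes the warning text matches the wording in my log exactly.

```python
import re

# Matches the muxer warning as it appears in the log above.
# Groups: stream index, DTS on the left of ">=", DTS on the right.
DTS_WARNING = re.compile(
    r"non monotonically increasing dts to muxer in stream (\d+): (\d+) >= (\d+)"
)


def find_dts_warnings(log_text):
    """Return a (stream, left_dts, right_dts) tuple for every warning found."""
    return [
        tuple(int(group) for group in match.groups())
        for match in DTS_WARNING.finditer(log_text)
    ]


def summarize(warnings):
    """Count warnings and list the gaps between consecutive offending DTS values."""
    dts_values = [right for _, _, right in warnings]
    gaps = [later - earlier for earlier, later in zip(dts_values, dts_values[1:])]
    return {"count": len(warnings), "gaps": gaps}


# Hypothetical sample: two warning lines copied from the log above.
sample = (
    "[null @ 0x556acd1922c0] Application provided invalid, non monotonically "
    "increasing dts to muxer in stream 0: 1770 >= 1770\n"
    "[h264 @ 0x556acd260c80] nal_unit_type: 1(Coded slice of a non-IDR picture), "
    "nal_ref_idc: 1\n"
    "[null @ 0x556acd1922c0] Application provided invalid, non monotonically "
    "increasing dts to muxer in stream 0: 1773 >= 1773\n"
)

if __name__ == "__main__":
    print(summarize(find_dts_warnings(sample)))
```

Running the same tally on the stderr of a run against the camera and a run against the file would give a quick side-by-side of how often each source triggers the warning.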
I tried running against the Bunny URL above and there were no such "monotonic" errors. Having a real live camera providing an RTSP stream will help isolate where the problem may be arising.
While one might jump to the conclusion that the problem arises from Reolink's server (which I believe uses LIVE555 Streaming Media v2013.04.08), I have it on the authority of Scott Lamb, a Google engineer who has demonstrated expertise in streaming video, that the problem actually lies in FFmpeg's code. See "poor behavior when camera has audio enabled", Issue #36 in scottlamb/moonfire-nvr on GitHub.