RTSP — Real Time Streaming Protocol


RTSP controls multimedia streaming sessions — it is the remote control for streams, not the stream itself. Security cameras, IP video systems, and media servers use RTSP to negotiate, start, pause, and stop streams, while the actual audio and video data flows separately over RTP.


Overview

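RTSP (Real Time Streaming Protocol) is a network control protocol for media streaming sessions. RTSP 1.0 is defined in RFC 2326 (1998). RFC 7826 (2016) defines RTSP 2.0, which obsoletes RFC 2326 but is not backward compatible with it. The examples on this page use RTSP 1.0 syntax. A common description: RTSP is the remote control; RTP is the television signal.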

RTSP itself carries no media data. It handles session control — establishing, pausing, seeking, and tearing down streams. The actual audio and video data flows via RTP (Real-time Transport Protocol) on separate ports, with stream parameters negotiated using SDP (Session Description Protocol) embedded in RTSP messages.

Primary use cases: security cameras, IP video systems, and media servers.

Default port: TCP 554 for RTSP control. RTP uses dynamically negotiated UDP ports (typically in the range 1024–65535).


RTSP Methods

RTSP syntax is deliberately similar to HTTP — a request line, headers, and optional body. The methods map to media control operations:

Method      Purpose
OPTIONS     Query which methods the server supports
DESCRIBE    Get stream description (returns SDP)
SETUP       Establish the transport channel for a stream
PLAY        Start or resume stream delivery
PAUSE       Pause stream delivery
TEARDOWN    End the session
ANNOUNCE    Send stream description to server (for recording)
RECORD      Start recording
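
Because the wire format mirrors HTTP, a request can be assembled with plain string handling. The following is a minimal sketch, not a complete client: the helper names `build_request` and `parse_status_line` are hypothetical, and the `CSeq` header (a per-request sequence number that RTSP 1.0 requires) is added here even though the table above does not mention it.

```python
from typing import Dict, Optional, Tuple


def build_request(method: str, url: str, cseq: int,
                  headers: Optional[Dict[str, str]] = None) -> bytes:
    """Serialize an RTSP/1.0 request: request line, headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Like HTTP, each line ends with CRLF and an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split a status line such as 'RTSP/1.0 200 OK' into version, code, reason."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason
```

For example, `build_request("OPTIONS", "rtsp://camera.local/stream", 1)` yields the bytes a client would write to the TCP control connection for the first request of a session.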

RTSP Session Flow

Participants: Client (VLC / NVR) and RTSP Server (IP Camera)

Client → Server   OPTIONS rtsp://camera.local/stream RTSP/1.0
                  What do you support?
Server → Client   200 OK; Public: OPTIONS, DESCRIBE, SETUP, PLAY, TEARDOWN

Client → Server   DESCRIBE rtsp://camera.local/stream RTSP/1.0
                  Give me the stream parameters
Server → Client   200 OK; Content-Type: application/sdp [SDP body: codec, resolution, bitrate]
                  Stream description (H.264, 1920x1080, etc.)

Client → Server   SETUP rtsp://camera.local/stream/track1 RTSP/1.0; Transport: RTP/AVP;unicast;client_port=12340-12341
                  Set up video track: I'll listen on UDP 12340/12341
Server → Client   200 OK; Session: 1234567; Transport: RTP/AVP;unicast;server_port=5004-5005
                  Session established: I'll send from UDP 5004/5005

Client → Server   PLAY rtsp://camera.local/stream RTSP/1.0; Session: 1234567
                  Start the stream
Server → Client   200 OK; RTP-Info: url=...
                  Stream starting

Server → Client   RTP packets (UDP 5004)
                  Video data flowing continuously

Client → Server   TEARDOWN rtsp://camera.local/stream RTSP/1.0; Session: 1234567
                  Stop the stream
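
The SETUP exchange hinges on the Transport header, which is a semicolon-separated list of a profile, bare flags, and key=value parameters. A rough sketch of how a client might pick it apart follows. The function names `parse_transport` and `port_pair` are made up for illustration, and the code assumes ports are always given as a `low-high` pair, as in the flow above.

```python
from typing import Dict, Tuple


def parse_transport(value: str) -> Dict[str, object]:
    """Split a Transport header value into profile, flags, and parameters."""
    parts = value.split(";")
    result = {"profile": parts[0].strip(), "flags": [], "params": {}}
    for part in parts[1:]:
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            key, val = part.split("=", 1)
            result["params"][key] = val      # e.g. client_port -> "12340-12341"
        else:
            result["flags"].append(part)     # e.g. "unicast"
    return result


def port_pair(value: str) -> Tuple[int, int]:
    """Convert '12340-12341' into (12340, 12341). Assumes the pair form."""
    low, high = value.split("-")
    return int(low), int(high)
```

By convention the first port of each pair carries RTP and the second carries RTCP, so a client reading the server's reply would expect video from the first value of `server_port`.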

SDP — Session Description Protocol

The DESCRIBE response body contains an SDP payload describing the stream. SDP is a text format (RFC 8866) with one type=value field per line:

v=0
o=- 1234567890 1234567890 IN IP4 192.168.1.100
s=IP Camera Stream
c=IN IP4 0.0.0.0
t=0 0
a=control:*
m=video 0 RTP/AVP 96
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1; sprop-parameter-sets=Z0IAH...;
a=control:track1
m=audio 0 RTP/AVP 97
a=rtpmap:97 MPEG4-GENERIC/16000/1
a=control:track2

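The m= lines declare media streams (video and audio). a=rtpmap maps each RTP payload type number to a codec and clock rate, and a=control names the per-track URL used in the SETUP request (track1 for video here). The client uses this information to set up appropriate decoders before the RTP data starts flowing.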
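
To make the structure concrete, here is a small sketch that extracts the media sections from an SDP body. It is deliberately incomplete: `parse_sdp` is a hypothetical helper that reads only the m=, a=rtpmap, and a=control lines and ignores everything else, including session-level attributes that appear before the first m= line.

```python
from typing import Dict, List


def parse_sdp(text: str) -> List[Dict[str, object]]:
    """Return one dict per m= section with its rtpmap entries and control URL."""
    media = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # not a type=value line
        kind, value = line[0], line[2:]
        if kind == "m":
            # m=<media> <port> <proto> <payload types...>
            name, port, proto, *formats = value.split()
            current = {"media": name, "port": int(port), "proto": proto,
                       "formats": formats, "rtpmap": {}, "control": None}
            media.append(current)
        elif kind == "a" and current is not None:
            attr, _, attr_value = value.partition(":")
            if attr == "rtpmap":
                payload_type, encoding = attr_value.split(" ", 1)
                current["rtpmap"][payload_type] = encoding
            elif attr == "control":
                current["control"] = attr_value
    return media
```

Run against the example above, this would report a video section with payload type 96 mapped to H264/90000 and control track1, followed by an audio section with control track2.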


Transport Modes

Unicast (most common): Client and server negotiate client-side RTP ports in the SETUP request. The server sends RTP directly to the client’s IP and port. Standard for point-to-point camera viewing.

Multicast: The server sends a single RTP stream to a multicast group address. Multiple clients join the multicast group to receive the stream. Used in IPTV and large-scale video distribution where the same stream goes to many viewers.

RTP over TCP (interleaved): Instead of using separate UDP ports for RTP, the media data is multiplexed into the RTSP TCP connection as interleaved binary frames. This mode is needed when the client sits behind a NAT or firewall that blocks inbound UDP. It is requested with Transport: RTP/AVP/TCP;interleaved=0-1 in SETUP, where 0 and 1 are the channel numbers for RTP and RTCP.
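
In RTSP 1.0 (RFC 2326, section 10.12) each interleaved frame starts with an ASCII `$` byte, then a one-byte channel number, then a two-byte big-endian length, followed by that many payload bytes. The sketch below splits a receive buffer into such frames. The name `split_interleaved` is hypothetical, and the code simply stops at the first byte that is not `$`, whereas a real client would also have to handle RTSP text responses arriving on the same connection.

```python
import struct
from typing import List, Tuple


def split_interleaved(buffer: bytes) -> Tuple[List[Tuple[int, bytes]], bytes]:
    """Return ([(channel, payload), ...], leftover bytes not yet parseable)."""
    frames = []
    offset = 0
    # Each frame: '$' (0x24), channel (1 byte), length (2 bytes, big-endian), payload.
    while len(buffer) - offset >= 4 and buffer[offset] == 0x24:
        channel, length = struct.unpack_from("!BH", buffer, offset + 1)
        end = offset + 4 + length
        if end > len(buffer):
            break  # payload incomplete; wait for more data from the socket
        frames.append((channel, buffer[offset + 4:end]))
        offset = end
    return frames, buffer[offset:]
```

The leftover bytes are returned so the caller can prepend them to the next chunk read from the socket, since TCP gives no guarantee that reads align with frame boundaries.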


RTSP URLs

RTSP resources are identified by URLs with the rtsp:// scheme:

rtsp://camera.local/stream
rtsp://admin:<password>@<ip>:554/live/main
rtsp://nvr.internal/cameras/cam01/substream
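
Python's standard `urllib.parse.urlsplit` handles the rtsp:// scheme like any other URL with an authority part, so splitting out the pieces takes little code. This is an illustrative sketch: `parse_rtsp_url` and `DEFAULT_PORTS` are names invented here, and the default ports are the usual 554 for rtsp and 322 for rtsps.

```python
from typing import Dict, Optional, Union
from urllib.parse import urlsplit

# Default ports applied when the URL does not specify one.
DEFAULT_PORTS = {"rtsp": 554, "rtsps": 322}


def parse_rtsp_url(url: str) -> Dict[str, Optional[Union[str, int]]]:
    """Break an RTSP URL into scheme, host, port, credentials, and path."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not an RTSP URL: {url!r}")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "username": parts.username,   # None when no credentials are embedded
        "password": parts.password,
        "path": parts.path or "/",
    }
```

A client would typically strip the credentials from the URL before putting it on the request line and send them through an Authorization header instead.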

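Most IP cameras require authentication, supplied either as credentials embedded in the URL or through RTSP-level Authorization headers. Digest authentication is common. Basic authentication sends credentials base64-encoded, which is an encoding rather than encryption, so use it only over a trusted or encrypted connection.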
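
RTSP borrows Digest authentication from HTTP. The sketch below computes the response value in the original RFC 2617 form without the qop extension, which is an assumption about what a given camera expects: some servers require qop or a different hash algorithm. The function names are hypothetical, and realm and nonce would come from the server's WWW-Authenticate header in a 401 reply.

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, password: str, realm: str,
                    nonce: str, method: str, uri: str) -> str:
    """Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # identity and secret
    ha2 = _md5_hex(f"{method}:{uri}")                 # the request being signed
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


def authorization_header(username: str, password: str, realm: str,
                         nonce: str, method: str, uri: str) -> str:
    """Format the Authorization header value for an RTSP request."""
    response = digest_response(username, password, realm, nonce, method, uri)
    return (f'Digest username="{username}", realm="{realm}", nonce="{nonce}", '
            f'uri="{uri}", response="{response}"')
```

Because the method and URI are part of the hash, the header has to be recomputed for each request (DESCRIBE, SETUP, PLAY, and so on) rather than reused.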

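RTSPS (rtsps://, port 322) wraps the RTSP control connection in TLS and is used when transporting streams over untrusted networks. TLS on the control connection does not cover RTP sent over separate UDP ports, so the media itself can be encrypted with SRTP when security is required.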


ONVIF and IP Cameras

ONVIF (Open Network Video Interface Forum) is an industry standard for IP camera interoperability. ONVIF Profile S requires cameras to support RTSP for live video streaming. Cameras are located on the network with WS-Discovery (UDP 3702), and the RTSP URL for a given stream is then typically obtained by querying the camera's SOAP-based media service (the GetStreamUri operation) rather than guessed.

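Common RTSP URL patterns for popular camera brands (paths vary by model and firmware, so check the vendor documentation):

rtsp://<ip>:554/Streaming/Channels/101                  ← Hikvision main stream
rtsp://<ip>:554/cam/realmonitor?channel=1&subtype=0     ← Dahua
rtsp://<ip>/axis-media/media.amp                        ← Axis
rtsp://<ip>:554/h264Preview_01_main                     ← Reolink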

Key Concepts

RTSP controls; RTP delivers

RTSP is stateful and bidirectional — the server remembers the session state. RTP is unidirectional and largely fire-and-forget. They work together: RTSP establishes the parameters, RTP carries the payload. This separation means the control channel (TCP 554) stays small while the data channel scales with bitrate.
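
The fire-and-forget nature of RTP shows in its packet layout: every packet carries a fixed 12-byte header (RFC 3550) with a sequence number and timestamp, so the receiver can reorder packets and detect loss without any acknowledgement. A minimal parsing sketch follows. `parse_rtp_header` is a hypothetical name, and the code reads only the fixed header, ignoring CSRC lists and header extensions.

```python
import struct
from typing import Dict, Union


def parse_rtp_header(packet: bytes) -> Dict[str, Union[int, bool]]:
    """Decode the fixed 12-byte RTP header at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack_from("!BBHII", packet, 0)
    return {
        "version": byte0 >> 6,             # 2 for current RTP
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),      # for video, commonly the last packet of a frame
        "payload_type": byte1 & 0x7F,      # matches the number in the SDP a=rtpmap line
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

The payload type is the link back to the control channel: it is the same number the SDP from DESCRIBE assigned to a codec, which is how the receiver knows which decoder to feed.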

NAT traversal is the hard part

IP cameras behind NAT present a challenge for remote RTSP access, because the camera's private IP is not reachable from outside. Solutions include a VPN to the local network, an RTSP proxy or relay, or RTP over TCP through an SSH tunnel. Exposing raw RTSP port 554 directly to the internet is inadvisable, since many IP cameras ship with weak authentication and run unpatched firmware.

