RTP is the fundamental media transport protocol in WebRTC (which is not a protocol, but rather a suite of protocols working together in a defined way). Basically all web videoconferencing uses RTP already.
Okay, it's funny because I had vaguely remembered something like that, but a "C-f rtp" on the Wikipedia page for WebRTC didn't yield any hits, even though SIP (over WebSockets) was prominently mentioned, so I just figured the similarity between RTC and RTP had caused me to misremember.
I assume WebRTC includes STUN/TURN/ICE (negotiated over SIP?) then for traversing NATs? The last time I was really into networking was 2001-ish, so that stuff was still around the corner, but I kept up with my reading for a few years after that. I also had some of these acronyms refreshed when setting up Jingle, which uses XMPP instead of SIP, but establishes an RTP connection much like traditional VoIP would use.
WebRTC doesn't prescribe a signalling mechanism at all. Sometimes SIP is used, sometimes Jingle (XMPP), and sometimes it's just a custom protocol exchanging SDPs (or equivalent structures) over WebSockets or a REST API.
WebRTC itself is RTP (DTLS-SRTP), ICE (incl. STUN/TURN), codecs & related parameters, capture mechanisms, all bundled up into a Web API.
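To make the "custom protocol exchanging SDPs" case concrete, here is a rough sketch of what such a signalling envelope might look like. Everything in it (`SignalMessage`, `encodeSignal`, `decodeSignal`, the `kind` field) is made up for illustration; WebRTC defines none of it, which is the point.

```typescript
// Hypothetical signalling envelope. WebRTC does not define this format;
// each application is free to invent its own.
type SignalMessage =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string; sdpMid: string | null };

// Serialize a message for whatever transport is in use (WebSocket, REST, ...).
function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

// Parse and validate a message received from the remote peer.
function decodeSignal(raw: string): SignalMessage {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("signal must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  const kind = obj["kind"];
  if (kind === "offer" || kind === "answer") {
    const sdp = obj["sdp"];
    if (typeof sdp !== "string") {
      throw new Error("missing sdp");
    }
    return { kind, sdp };
  }
  if (kind === "candidate") {
    const candidate = obj["candidate"];
    if (typeof candidate !== "string") {
      throw new Error("missing candidate");
    }
    const mid = obj["sdpMid"];
    return { kind, candidate, sdpMid: typeof mid === "string" ? mid : null };
  }
  throw new Error("unknown signal kind");
}
```

In a browser the `sdp` string would typically come from the description returned by `RTCPeerConnection.createOffer()`, and the encoded text would go out through something like `WebSocket.send()`. The receiving side decodes it and hands the SDP to its own peer connection; ICE candidates travel over the same channel.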