I think of full-featured pattern matching as a fairly recent addition even to Lisp. Lisp has had some pattern-matching constructs forever, of course (cond, destructuring-bind, Norvig's sexp matcher from PAIP [1], etc.). But it's only with the more recent emergence of optima [2] as a de facto standard that it now has really good pattern matching. It was probably the #1 thing I missed in Lisp, after having used ML a bit, until optima came along.
AFAIK pattern matching is not a historical Lisp feature; it has usually been limited to simple destructuring.
In the modern sense of tree patterns (as opposed to text patterns, aka regular expressions), I guess it comes from ML (and possibly Prolog, but Prolog's unification goes even further?): it doesn't look like ISWIM had tree patterns, and I can't find older references.
In Lisp, pattern-based programming was first implemented in 1962 by D. Bobrow (METEOR). Since then there have been many implementations of pattern matching in Lisp-based software, from Planner, to rule-based systems, LISP70 (Tesler, ...)...
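To make the "goes even further" part concrete: in ML-style matching only the pattern has variables, while in unification both sides can. Here's a rough sketch in Python, not taken from any particular Prolog or Lisp implementation. The names `is_var`, `walk`, and `unify` are made up for illustration, terms are nested tuples, variables are strings starting with `?`, and the occurs check is left out to keep it short.

```python
def is_var(x):
    """Assumption for this sketch: a variable is a string starting with '?'."""
    return isinstance(x, str) and x.startswith("?")

def walk(term, subst):
    """Follow bindings until we hit a non-variable or an unbound variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst=None):
    """Return a substitution (dict) making a and b equal, or None on failure.

    Unlike one-way matching, variables may appear on either side.
    No occurs check is performed in this sketch.
    """
    if subst is None:
        subst = {}
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None
```

With this, `unify(("f", "?x", "b"), ("f", "a", "?y"))` binds `?x` to `"a"` and `?y` to `"b"`, which a one-way matcher cannot do because the second term carries a variable of its own.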
This makes me sad. As somebody who started a CS degree back in the early 80s (finishing in the late 80s), and who had a few glimpses of languages which didn’t suck, the switch down to “everything is an 8086 running C code, if not x86 assembler” was an incredibly destructive event in this industry.
I was going to specialize in AI in my major, but I guess it was about 25 years too early. But that’s another tangent.
> AFAIK pattern-matching is not a historical Lisp feature
Aside from full packages like the ones lispm mentioned, implementing pattern matching (and later, full Prolog-style unification) in Lisp is a very common exercise in beginner Lisp textbooks, going back decades.
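For anyone who hasn't seen that exercise, it usually looks something like the sketch below. This is written in Python rather than Lisp and isn't copied from any textbook. The names `is_var` and `match` are hypothetical, nested tuples stand in for s-expressions, and pattern variables are assumed to be strings starting with `?`.

```python
def is_var(x):
    """Assumption for this sketch: a pattern variable is a string like '?x'."""
    return isinstance(x, str) and x.startswith("?")

def match(pattern, datum, bindings=None):
    """Return a dict of variable bindings if pattern matches datum, else None.

    Only the pattern may contain variables (one-way matching).
    """
    if bindings is None:
        bindings = {}
    if is_var(pattern):
        if pattern in bindings:
            # A repeated variable must match the same value every time.
            return bindings if bindings[pattern] == datum else None
        return {**bindings, pattern: datum}
    if isinstance(pattern, tuple) and isinstance(datum, tuple):
        if len(pattern) != len(datum):
            return None
        for p, d in zip(pattern, datum):
            bindings = match(p, d, bindings)
            if bindings is None:
                return None
        return bindings
    # Atoms match only if they are equal.
    return bindings if pattern == datum else None
```

For example, `match(("+", "?x", ("*", "?y", "?x")), ("+", 1, ("*", 2, 1)))` gives `{"?x": 1, "?y": 2}`, while `match(("+", "?x", "?x"), ("+", 1, 2))` gives `None` because `?x` can't be both 1 and 2.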