there are two ways to do this: you can use set theory (which is what chomsky did, and which has precursors in some of the more absurd philosophers that chomsky would go on to criticize, like badiou), or you can use functions, which is more modern and minimizes the use of the cumbersome and maybe problematic (see axiom of choice) components of set theory. so i didn't exactly study chomsky directly in computer science class; what i studied was the conversion of his theory of grammar into a modern framework, using mappings and functions. mathematicians nowadays would generally prefer the language of functions over the theory of sets.
i did, however, take a math course in automata theory at the graduate level that was built almost directly on chomsky's work, which was different from the computer science courses that used chomsky's hierarchy as a foundational part of the theory of computation.