AFAIK no one ever asked Clapper (or any other NSA representative) to give a precise operational definition of the term "metadata". I don't want to hear what someone else thinks "metadata" is; I want to know how NSA operationally defines "metadata" internally, and precisely when and where each usage applies.
Voice:
I am on the telephone with you. My actual voice is the data. Is the digitization of my voice not metadata? After all, natural voice =/= digitized voice. So it seems that digitized voice could be, to the NSA, metadata and therefore subject to capture.
I know the NSA has software/hardware to translate digitized voice to text. Suppose all phone conversations are converted to text. Then is that text therefore not data but metadata? Once again, it seems that the text of all phone conversations could be, to the NSA, metadata and therefore subject to capture.
Fax:
I send a fax to you. What's on the sheet of paper is data. But is the digitized and compressed sequence of bits sent to you not metadata? It certainly is not the same as the original data. So it seems that the digitized fax could be, to the NSA, metadata and therefore subject to capture.
Still, were that not so, the NSA has OCR that can read handwritten or typed text from a fax and convert it to text. Is the converted text not metadata? It is certainly different from the original data. So it seems that the converted text from the fax could be, to the NSA, metadata and therefore subject to capture.
E-mail:
I e-mail you. The text I see before me is data. Once I press the send key, it is compressed through various software/hardware for rapid transmission as it passes over the TCP/IP/phone network. Is not that compressed text also metadata? Therefore it seems that e-mail content could be, to the NSA, metadata and therefore subject to capture.
IOW once data is processed through any transformation whatsoever then it could be defined as "metadata". Again I don't know NSA's exact definition (used internally) of what metadata is. There may be many different definitions. But since NSA's representatives raised the use of the term w/o defining it (and then used the metaphor of a library to draw questioners off track) I believe that they should be questioned again about precisely what "metadata" means when and where. Furthermore the question of what precisely the NSA captures should be driven to ground thoroughly.
Personal belief:
NSA considers the product of any transformation of data to be metadata. The NSA captures all voice, fax, e-mail, chat, SMS, etc. and maintains it in digitized form. Regardless of origin, all communications end up in text format (with pointers back to the original digitized form), which is then subjected to semantic/content analysis (scanning for naughty words) and, optionally (under control of an analyst), to social network analysis and AI software that analyzes events, objects, actors, their intentions, and possible variations in interpretation of the text.
Your request to have them explicitly define what they are designating as metadata is wise. They may have a completely distorted concept of metadata that represents a drastic departure from any sane definition.
But the definitions you've put forth are completely off target with respect to the layman's ordinary, rational concept of metadata. Conceptually, metadata forms a map of relationships between actual examples of data: the wires between the lightbulbs are the metadata, while the lightbulbs are the data. Your hypothesis is that someone might propose that only the light emitted from the bulb is the data, and that all other phenomena beyond that are the metadata. On that view, if you take a picture of the lightbulb while it's switched on and mark the time, the timestamped photo is metadata, only because it recorded measurements of the intensity of the photons emanating from the bulb and did not capture and retain the actual photons themselves (everything else, aside from the photons, being fair game). No one in their right mind would ever build such an absurd mental model.
The reality is that anyone proposing concepts like the ones you mention is simply lying through their teeth. So why would you want anyone like that to speak a single word?
If that's their version of the truth, and they seriously believe that's a representation of honesty, it's not worth listening to them.
If they know it's a lie and try to sell the lie anyway, it's not worth listening to them.
If they know what the reality is, but lie and provide the rational definition of metadata, regardless of how poorly it aligns with the truth, it's not worth listening to them.
The only thing you'd gain from hearing them speak to their belief of how metadata is defined would be the chance to compare what they say to the actual evidence that proves the reality, and to assess how warped they are and how much they lied.
I don't believe the layman has an "ordinary, rational concept of metadata".
The NSA's library metaphor was well-chosen: loose enough to possibly explain but complex enough to mislead. It derailed the conversation.
But I see no utility in the light bulb metaphor you present, except possibly to mislead as most metaphors can do. I would never use it in this context.
The point is to eliminate the metaphors. "Just the facts, ma'am," as Dragnet's Sgt. Friday is always (if not quite accurately) quoted as saying.
"why would you want anyone like that to speak a single word?"
To find the truth. If not, to reveal those who lie under oath. To eliminate the metaphors and replace them with facts.