It sounds like his initial invoice was quite clear about the work completed, and was then updated at the client's request. So while you can argue on moral grounds that he shouldn't have done this work, I don't think there's any illegality, i.e. conspiracy.
I mean, if you are a professor and knowledgeable about how the startup uses the data, it’s hardly credible to say “oh crap, I didn’t know they were using it for illegal purposes”.
This is addressed [in the full complaint][1]. The data scientist was told Frank really did have 4 million users, and that the scientist only needed to generate this "synthetic data" as a way to "anonymize" their "real" data. I.e. the scientist was duped:
JAVICE told Scientist-1 [...] that she had a database of approximately 4 million
people and wanted to create a database of anonymized data that mirrored the
statistical properties of the original database (the “Synthetic Data Set”).
[After JAVICE sends Scientist-1 the data], Scientist-1 understood that the data
available via the Access Link Email -
**a data set of approximately 142,000 people** (emphasis added) -
was a random sample of a larger database which contained data for approximately
4 million people. In fact, that data represented every Frank user who had at
least started a FAFSA.