It's great that you are engaging and writing about this, many thanks.
While your blog is interesting, it doesn't change the impact of the Terms of Service as currently written. They seem to give you the freedom to train your current and future AI/ML capabilities using any Customer Content (10.4), and your terms apparently have your users warrant that doing so will not infringe any rights (10.6).
Perhaps your terms of use should reflect your current practices rather than leaving scope for those practices to vary without users realising? Will you be changing them following all this feedback?
Following up on this point, we’ve updated our terms of service (in section 10.4) to further confirm that we will not use audio, video, or chat customer content to train our artificial intelligence models without your consent.
This addresses concerns about Zoom Video Communications, Inc. itself using e.g. recordings for purposes of training their own AI models. It does not address the potentially much greater risk of the company selling access to the collection of Zoom recordings to other companies so that those companies can train their own AI models. Here’s a somewhat-in-depth analysis: https://zoomai.info/
Thanks for following up, Michael, it is much appreciated. It does leave me (and, judging by the adjacent comments, others too) with questions, including:
* That wording seems very specific - is there a reason you did not just say "we will not use Customer Input or Customer Content to train our AI" given you have defined those terms? Are you leaving scope for something else (such as uploaded files or presentation content) to still be used?
* Can you also clarify exactly which (and whose) "consent" is applicable here? In meetings between multiple equal parties there may not be any one party with standing to consent for everyone involved. Your blog post seems to assume there can be, but the ToS don't appear to define "consent".
Thanks for commenting. The issue is not with using AI features though - it is with the Terms granting you unrestricted and eternal use of our conversations to train your AI and potentially disclose our work to your other customers.
Well said. Zoom seems to think we are talking about the terms of service as they pertain to a particular feature, rather than about their entire rights moving forward.
With this election, AlmaLinux becomes the only major CentOS Linux replacement distribution to be solely owned and operated by its community of developers and users.
According to the OSI, while the US government needs to clarify exactly what it was intending to do with respect to open source freedoms, in the Tornado Cash case the onus is on the cryptocurrency community to show they can abide by anti-money-laundering law rather than appear to intentionally evade it.
Most of the harm GDPR did to small businesses actually came from armies of "GDPR consultants" who smelled money in the market and, in my experience, often heavily skewed perceptions of what compliance requires in order to increase their own profits.
The majority of small businesses never have to deal with GDPR unless they gather personal information - and for most of them, the limited cases in which a small business might gather information were already covered by previous laws, and in most cases can be handled with a simple manual process.
Data privacy laws applied whether you used computers or not, and GDPR doesn't change that.
Ultimately, the simplest way of not having to deal with GDPR is not to store private data, or to store the minimal possible set. The archetypical smallest possible business already outsources a lot of such operations to aggregate eShop vendors, which are a good place to put controls. Those who run B2C sales directly in ways that require private information can spend a bit of time on getting compliant, generally with a set of cookie-cutter processes.
The definitions of personal data are highly ambiguous. Perfectly normal processes that people would not usually think of as collecting personal information (like maintaining an ordinary HTTP access log) do count, because things like IP addresses are treated as personal information.
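For what it's worth, here is a minimal sketch of the kind of masking people use to keep such logs out of scope - assuming IPv4-only, combined-format lines; the regex and the zero-the-last-octet rule are purely illustrative, not a compliance recipe:

```python
import re

# Matches the leading IPv4 address of a combined-format access log line.
# (Illustrative only: real logs also contain IPv6 addresses, proxies, etc.)
IPV4_PREFIX = re.compile(r"^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.\d{1,3}")

def mask_client_ip(log_line: str) -> str:
    """Zero the last octet so the stored line no longer identifies a single host."""
    return IPV4_PREFIX.sub(r"\1.\2.\3.0", log_line, count=1)

line = '203.0.113.42 - - [10/Aug/2023:12:00:00 +0000] "GET / HTTP/1.1" 200 512'
print(mask_client_ip(line))
# -> 203.0.113.0 - - [10/Aug/2023:12:00:00 +0000] "GET / HTTP/1.1" 200 512
```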
Hell, everything about the law is highly ambiguous. Let's say somebody runs a web forum where people can post messages - we are using one right now. Let's say no personally identifiable data is deliberately collected: no use of email addresses as usernames, or even to offer a forgotten-password flow; the user names are arbitrary; there is no IP address logging of any form, even to combat abuse. All posts are publicly available, so users don't need to sign up unless they want to post, and posting is entirely voluntary, so consent is not an issue.
Even under this scheme, which seeks to minimize all the compliance burdens of the GDPR while still offering a forum, the site still needs to provide users with data dumps on demand, since their screen name is quite plausibly linkable to their real-life identity.
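Even that minimal obligation implies some code. A rough sketch of what such a dump might look like, assuming a hypothetical `posts(author, created_at, body)` table - a real forum would have to cover far more than this:

```python
import json
import sqlite3

def export_user_posts(db_path: str, screen_name: str) -> str:
    """Collect everything stored against a screen name into a JSON dump.

    Assumes a hypothetical posts(author, created_at, body) table; a real
    forum would also need to cover private messages, profile fields, etc.
    """
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT created_at, body FROM posts WHERE author = ?", (screen_name,)
    ).fetchall()
    conn.close()
    return json.dumps(
        {
            "screen_name": screen_name,
            "posts": [{"created_at": ts, "body": body} for ts, body in rows],
        },
        indent=2,
    )
```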
And how do you implement the right to be forgotten? You can implement it as a mass delete of that user's posts, but that is probably not good enough. The problem is that many forums either support or have a user convention of quoting parts of previous messages (including the screen name) so people know exactly what you are replying to.
If commonly used, that defeats mass deletion as a proper implementation of the right to be forgotten, since large chunks of the user's posts (associated with the user's screen name) will be left in replies. Filtering that out may range from easy to difficult, or even to being a completely manual process (which wouldn't work well for certain super-prolific posters).
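To make the difficulty concrete, here is a best-effort sketch assuming a hypothetical `posts(id, author, body)` table and a "`<screen name> wrote:`" quoting convention; anything quoted in any other style slips straight through and is back to manual review:

```python
import sqlite3

def forget_user(conn: sqlite3.Connection, screen_name: str) -> None:
    """Best-effort erasure: delete the user's own posts, then redact the
    screen name where other people's posts quote them.

    Assumes hypothetical posts(id, author, body) rows and a
    "<screen name> wrote:" quoting convention; paraphrased or partial
    quotes are not caught and would need manual review.
    """
    conn.execute("DELETE FROM posts WHERE author = ?", (screen_name,))
    quoted = f"{screen_name} wrote:"
    for post_id, body in conn.execute(
        "SELECT id, body FROM posts WHERE body LIKE ?", (f"%{quoted}%",)
    ).fetchall():
        conn.execute(
            "UPDATE posts SET body = ? WHERE id = ?",
            (body.replace(quoted, "[deleted user] wrote:"), post_id),
        )
    conn.commit()
```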
And this is before considering the fact that the law does not let you off the hook if you fail to find some personally identifiable data for that data dump because it was posted by somebody else as unstructured text in the body of a post. It could be just a phone number and address pairing, with no reference to the screen name or anything. There is no way you are going to reliably find that, but you are still technically liable if a user discovers you had that data and missed it when they requested the data dump. The regulators may decide not to enforce such edge cases, but they could - they might even enforce it against you if the regulator decides they want to send a message and crack down hard on violations.
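Even a best-effort scan for that kind of data is little more than pattern matching. A deliberately naive, illustrative example (one phone-number style only) shows why it can never be relied on:

```python
import re

# Deliberately naive pattern for one style of phone number; it will miss
# numbers written differently and flag digits that aren't phone numbers
# at all - which is exactly the reliability problem described above.
PHONE = re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")

def flag_possible_pii(body: str) -> list[str]:
    """Return candidate phone-number strings found in a post body."""
    return PHONE.findall(body)

print(flag_possible_pii("Call Pat at 555-867-5309 about the order."))
# -> ['555-867-5309']
```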
Consider instead a video provider, where there is personally identifiable information about a different user mentioned in some uploaded video. Are regulators going to expect you to be able to find that video in response to a subject access request? Surely not. But what if you are YouTube? The answer then could very well be quite different, since Google could easily search their autogenerated closed captions to potentially find this video. They certainly could be doing that already to augment their profile of you (I doubt they are, but it is definitely not impossible).
Very, very few businesses never collect any personally identifiable information. Even if they try, they will likely collect some accidentally - for example, if a consumer calls with a question that cannot be answered right away, a note of some form will be taken that will probably include a name or phone number, so that once an answer is found it can be provided back to the caller. A B2B business cannot avoid this either, since it still deals with individuals at the other businesses. I'm sure the majority of businesses just ignore this, since the GDPR is really targeting mass data collection rather than little incidental bits here and there, but the terms don't actually make any such distinction.
The GDPR depends a lot on regulators exercising discretion and not going after such tiny violations. That is not really a problem per se, but it does mean greater legal uncertainty in cases that involve slightly gray areas - at least in contrast to the often far more black-and-white, hyper-specific legislation found in civil law countries, or even the eventually binding precedent over ambiguous laws set in common law countries. The main avenue for getting such certainty with the GDPR is its enforcement harmonization system, which will likely take several more decades before many common gray areas have been definitively settled.
>Your experience is not shared by small businesses
As I own a small business, I know for a fact that this is not so, as I just stated. Maybe some do and some don't, but you cannot claim it is universally so. Which small businesses are you talking about? Your own, or something you read online?
To put it concisely, “why is having a patent license until you commence litigation worse than not having one at all?” The answer is subtle.
Many corporate lawyers operate on the assumption that all open source licenses that do not mention patents (BSD, MIT etc) implicitly grant a patent license. Clarifying this ambiguity is seen by them as harmful — that’s why approval of CC0 at OSI was abandoned[1], for example. Including an explicit patent grant removes the possibility this could be argued in court and is seen as an escalation of the patent conflict by Facebook.
Given many voices at Apache are being quietly guided by corporate counsel, this seems the most likely underlying explanation for the antipathy that's been rationalised out into the open.
As far as I can see, the files that were originally under MIT still have their original copyright statements intact, as comments above illustrate. Which files are you alleging have had their credits removed?
I wrote the article you're complaining about, which is actually mainly about making hybrid PDFs. I did not see your article, let alone copy it; I would have linked to it if I had. I have been working on that piece for a few months off-and-on and pitched it as a "lightning talk" at FLOSS UK Spring 2012 in Edinburgh beforehand.
More than that, I was a manager of the team that created the hybrid PDF feature in OpenOffice at Sun, and have been advocating avoiding editable attachments for years - the earliest I can find on my blog is http://www.webmink.net/2003/07/feature-creep.htm but I am pretty sure I was advocating it before.
The web is a big place where there are often people working on the same ideas as you (which is why software patents are a travesty), and I recommend avoiding accusing people of incompetence without a little more research.
Going further than that, given Google hired a number of staff that had worked on the Java implementation at Sun, and given that the head of Google was present at Sun as the Java patent-and-copyright trap was being constructed, it seems inconceivable to me that Dalvik would have been permitted to violate Sun's patents.
In a rational world and under a reasonable patent system, I might agree. But you know as well as I do, Simon, probably better, that under the current process it's difficult to guarantee that you're not violating anyone's patents. Particularly when you're reimplementing an existing system.
Is it possible that Oracle's patents don't read on Dalvik? Certainly. Did Google take care to minimize the risk of such? I'm sure they did.
But however careful the execution, the system at present would be actively working against them.