Launch HN: Credal.ai (YC W23) – Data Safety for Enterprise AI
114 points by r_thambapillai on June 14, 2023 | hide | past | favorite | 24 comments
Hi Hacker News! We’re Ravin and Jack, the founders of Credal.ai (https://www.credal.ai/). We provide a Chat UI and APIs that enforce PII redaction, audit logging, and data access controls for companies that want to use LLMs with their corporate data from Google Docs, Slack, or Confluence. There’s a demo video here: https://www.loom.com/share/2b5409fd64464dc9b5b6277f2be4e90f?....

One big thing enterprises and businesses are worried about with LLMs is “what’s happening to my data?” The way we see it, there are three big security and privacy barriers companies need to solve:

1. Controlling what data goes to whom: the basic stuff is putting controls in place around customer and employee PII, but it gets trickier when you also want controls around business secrets, so companies can ensure the Coca Cola recipe doesn’t accidentally leave the company.

2. Visibility: Enterprise IT wants to know exactly what data was shared, by whom, when, and what the model responded with (not to mention how much the request cost!). Each provider gives you a piece of the puzzle in their dashboard, but getting all this visibility per request from either of the main providers currently requires writing code yourself.

3. Access Controls: Enterprises have lots of documents that for whatever reason cannot be shared internally to everyone. So how do I make sure employees can use AI with this stuff, without compromising the sensitivity of the data?

Typically this pain is felt most acutely by Enterprise IT, but also of course by the developers and business people who get told not to build the great stuff they can envision. We think it’s critical to solve these issues: the more visibility and control we can give Enterprise IT over how data is used, the more we can actually build on top of these APIs and start applying the awesome capabilities of the foundation models to every business problem.

You can easily grab data from sources like Google Docs via their APIs, but for production use cases, you have to respect the permissions on each Google Doc, Confluence Page, Slack channel etc. This gets tricky when these systems combine some permissions defined totally inside their product, with permissions that are inherited from the company’s SSO provider (often Okta or Azure AD). Respecting all these permissions becomes both hard and vital as the number of employees and tools accessing the data grows.
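The permission check described above can be sketched roughly like this (all names here are hypothetical, not Credal's actual implementation): a document is readable only if the user appears in the tool's own sharing settings or belongs to an SSO group inherited from the identity provider.

```python
# Toy model of combined per-document permissions: each document carries both
# the emails shared inside the tool itself and the groups inherited from the
# company's SSO provider (e.g. Okta or Azure AD).
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    shared_with: set = field(default_factory=set)  # emails set in the tool itself
    sso_groups: set = field(default_factory=set)   # groups inherited from the SSO provider


def can_read(user_email: str, user_groups: set, doc: Document) -> bool:
    """User may read a doc if shared directly OR via any SSO group."""
    return user_email in doc.shared_with or bool(user_groups & doc.sso_groups)


def visible_docs(user_email: str, user_groups: set, docs: list) -> list:
    """Filter a corpus down to what this user is actually allowed to see."""
    return [d for d in docs if can_read(user_email, user_groups, d)]
```

In a real deployment the hard part is keeping `shared_with` and `sso_groups` in sync with the source systems as they change, which is exactly why this gets harder as employees and tools multiply.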

The current state of the art is to use a vector database like Pinecone, Milvus, or Chroma, integrate your internal data with those systems, and then when a user asks a question, dynamically figure out which bits are relevant to the user’s question and send those to the AI as part of the prompt. We handle all this automatically for you (using Milvus for now, which we host ourselves), including the point and click connectors for your data (Google Docs/Sheets, Slack, Confluence with many more coming soon). You can use that data through our UI already and we’re in the process of adding this search functionality to the API as well.
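The retrieval flow above can be sketched in a few lines (names are illustrative, not Credal's API). A production system would use a vector database like Milvus and a learned embedding model; this toy version fakes embeddings with bag-of-words vectors just to show the shape of the pipeline: embed the question, rank documents by similarity, and splice the top hits into the prompt.

```python
# Minimal sketch of retrieval-augmented generation: rank docs by cosine
# similarity to the question, then build a prompt from the best matches.
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, docs: list, k: int = 2) -> list:
    """Return the k docs most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


docs = [
    "Expense policy: meals under $50 are reimbursed automatically.",
    "Onboarding: new hires get laptops on day one.",
    "Travel policy: book flights through the internal portal.",
]
question = "how do I get reimbursed for meals"
context = retrieve(question, docs)
prompt = "Answer using only this context:\n" + "\n".join(context) + "\n\nQ: " + question
```

The permission problem from the previous paragraph bites right here: the `docs` list fed into `retrieve` must already be filtered to what the asking user may see.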

There’s other schlep work that devs would rather not worry about: building out request-level audit logs, staying on top of the rapidly changing API formats from these providers, implementing failover for when these heavily overburdened APIs go down, etc. We think individual devs should not have to do these themselves, but the foundation model providers are unlikely to provide consistent, customer-centric approaches for them. The PII detection piece is in some ways the easiest - there are a lot of good open source models for doing this, and companies using Azure OpenAI and AWS Bedrock seem less concerned with it anyway. We expect that the emphasis companies place on the redactions we provide may actually go down over time, while the emphasis on unified, consistent audit logging and data access controls will increase.
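To make the PII redaction idea concrete, here is a deliberately minimal sketch. Real systems use trained recognizers (e.g. the open source models mentioned below, such as Microsoft's Presidio) rather than regexes alone, but the interface is the same: text in, redacted text out.

```python
# Toy PII redaction pass: replace emails, SSNs, and US-style phone numbers
# with typed placeholders before the text is sent to an LLM provider.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text: str) -> str:
    """Substitute each detected entity with a placeholder like <EMAIL>."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


print(redact("Call 555-123-4567 or mail bob@example.com, SSN 123-45-6789"))
```

Regexes only cover the structured entities; catching free-form secrets (the "Coca Cola recipe" case) needs something semantic, which is where trained models earn their keep.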

Right now we have three plans: a free tier (which is admittedly very limited but intended to give you a feel for the product), the business plan which starts at $500pm which gets you access to the data integration as well as the most powerful models like GPT 4 32k, Anthropic 100k etc, and an enterprise plan which starts at $5000pm, which is a scaled up version of the business tier and lets you go on-prem (more details on each plan are on the website). You can try the free tier self-serve, but we haven’t yet built out fully self service onboarding for the paid plans so for now it is a “book a meeting” button, apologies! (But it only takes 5 minutes and if you want it, we can fully onboard you in the meeting itself).

When Jack and I started Credal, we actually set out to solve a different problem: an ‘AI Chief of Staff’ that could read your documents and task trackers, and guide your strategic decision making. We knew that data security was going to be a critical problem for enterprises. Jack and I were both deep in the Enterprise Data Security + AI space before Credal, so we naturally took a security first approach to building out our AI Chief of Staff. But in reality, when we started showing the product to customers, we learned pretty fast that the ‘Chief of Staff’ features were at best nice to have, and the security features were what they were actually excited by. So we stripped the product back to basics, and built out the thing our customers actually needed. Since then we’ve signed a bunch of customers and thousands of users, which has been really exciting.

Now that our product is concretely helping a bunch of people at work, is SOC 2 T1 Compliant, and is ready for anyone to just walk up and use, we’re super excited to share it with the Hacker News community, which Jack and I have been avid readers of for a decade now. It’s still a very early product (the private beta opened in March), but we can’t wait to get your feedback and see how we can make it even better!




Congratulations on the launch! (I'm a beta user.) Audit trails, access controls and security certifications are big headaches when developing in regulated industries. Having these already set up has made it easier for us to experiment with and build on LLM APIs.


Many thanks! There is a lot of opportunity for LLMs lurking in regulated industries right now - glad to have given you a boost!


Go Ravin and Jack! We're not at sufficient scale to really get use of this product, but would love to try it down the road. Are you using Foundry for data integration and ACLs?


Thank you! We haven't gone down the Foundry route yet. We do have some smaller scale apps and companies using Credal either as their AI API or chat platform respectively - would be interested to hear a bit about your use case and see if it's a match?


Word- we're in the thick of it but I'll reach out once we're ready to start thinking through bringing in chat.


Stay tuned!! Right now no, but Foundry is definitely under consideration


This seems like a very promising product: an enterprise version of the ChatGPT interface is a large gap in the market.

However, this part of your advertising sounds very dubious: "Credal can be deployed fully on-premise for Large Enterprises, including the large language models themselves"

What do you mean, the LLMs themselves? Open source I can understand, but how are you going to move GPT-4 on-prem? OpenAI is not giving you the weights.


Thanks for your encouragement and that is a totally fair criticism - when we say that, we mean two things:

1. We support using Credal with your own, open source LLM, which can of course be fully on prem in every sense

2. We also support using Credal with your own Azure OpenAI instance. As you say, OpenAI aren't giving us the weights, but many of our customers have procured Azure OpenAI from Microsoft and then we point GPT 4 usage at their Azure instance, meaning that the data never goes to Open AI at all.

One of the things that's going to be really interesting to see moving forward is whether the open source models are going to be able to compete with the blistering pace and funding that the closed source ones - Bard, Claude, and GPT-X (and maybe Mistral?) - are going to be able to attract. For the sake of the industry, I really hope that the OS models catch up, but given the amount of funding (and now, in OpenAI's case, revenue) the closed source models are generating, it's hard to see how that happens


Congrats, I think this will be really successful and you've got a very early foot in the door.

Do you consider self hosted LLMs a competitor of sorts? I suppose your premise is if a company uses Google Docs they will also likely never host internal LLMs, right?


Thanks so much! So about half of our enterprise customers use Credal in conjunction with either a self hosted LLM or Azure OpenAI (which you can debate, but most companies we've spoken to seem to treat their Azure OpenAI instance as equivalent to self hosted). In practice, you still need to:

1. Manage permissions, making sure the self hosted LLM is only reading from the documents, Slack channels, etc. that the end user should actually have access to

2. Generate an audit log of exactly who did what, when

So we actually see self hosted LLMs being a big part of how Credal is used! In the long term, we think Credal will actually become a tool for AI app developers to safely request access to data & embeddings from the enterprise on the fly, and make sure the data they get is appropriately controlled and the audit logs exist in a single place for the enterprise to see what data went to whom/when/why etc
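The "who did what, when" audit log from point 2 might look something like this per request (field names are illustrative, not Credal's actual schema): who asked, what was asked, which documents were injected into the prompt, which model answered, and what it cost.

```python
# Hypothetical shape of a per-request audit record for an LLM call.
import json
from datetime import datetime, timezone


def audit_record(user, question, retrieved_docs, model, response, cost_usd):
    """Assemble one append-only audit entry for a single LLM request."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "question": question,
        "retrieved_docs": retrieved_docs,  # sources injected into the prompt
        "model": model,
        "response": response,
        "cost_usd": cost_usd,
    }


record = audit_record(
    user="alice@example.com",
    question="What is our parental leave policy?",
    retrieved_docs=["HR Handbook (Google Doc)"],
    model="gpt-4-32k",
    response="Parental leave is 16 weeks...",
    cost_usd=0.12,
)
print(json.dumps(record, indent=2))
```

Capturing the retrieved documents alongside the response is the part the provider dashboards can't do for you, since only the middle layer knows which internal sources went into the prompt.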


Going through the SOC2 process myself [0].

As I expected, we're hearing from customers they won't use a product that passes the contents of their database tables into an AI model (although some AI products are doing this). So the problem Credal is solving makes sense. Have you considered building an open source Python package for solving just this bit of the problem?

Any tips on the SOC2? Did you use something like Drata / Vanta?

0 - https://www.definite.app/


Thanks!! There are some fairly good OS models for the core stuff (PII, SSNs, etc.) out there already (Presidio, spaCy), so folks that need an OS option have one to start with. Detecting the more complex stuff can sometimes need a little iteration, but I could definitely imagine a world where we publish that in the future

On SOC 2, we used Drata, and spoke to Vanta, Laika and a few others. The price Vanta initially quoted us was waaaay higher than the other two, and between Laika and Drata we went with Drata mostly because there seemed to be more automation in Drata. In the end, the Drata live support was incredible, and it's hard to imagine how we would have gotten the certification so fast without it. We started our infra on DO, so the most painful part of SOC 2 for us was the migration we did to AWS to take advantage of AWS' many security features. My main advice would be to make full use of the Drata live support (I'd guess Vanta have something similar), but maybe on a deeper level - when you're doing SOC 2, don't focus on the certification: focus on the policies and technology that actually make your company secure. In the end, that's what enterprises really care about, especially the ones that have given us 300-question-long questionnaires!


Nice! How long did it take end-to-end to get the SOC2 Type 1?


Our AWS migration wound up taking about 4 weeks, getting all the policies in place took about 8 weeks (which overlapped with about 2 weeks of the migration), and then the audit itself was a couple weeks as well


Apologies for the shameless plug. I generally don't do this, but I just thought our product might be relevant for the use case you mentioned. We do not compete with Credal, but at Adaptive [1], we have been building a platform that helps with infrastructure access management and allows users to automatically generate and collect evidence, especially for CC5 and CC6 (logical access). Vendor security questionnaires become easy to answer when we, as an organisation, use our product.

We have seen that reproducibility and access auditability in organisations adopting products that access schema and metadata from databases, compute infrastructure, etc. comfort customers. Your customers care about security incidents like unauthorised access, privilege abuse, accidental operations, insider threats, etc. on the vendor's side, which in my opinion are real threats.

[1] http://adaptive.live


Looks awesome and will make many enterprises feel more comfortable using AI.

I suspect your intuition about moving emphasis from redaction to unified access control and audit logging over time is right.

The "AI Chief of Staff" sounds interesting though -- can you share a bit more about what you showed to companies and received lukewarm response to?


The AI Chief of Staff had a few layers. The first was data integration of both productivity data (Slack, Notion, etc.) and "big data" lakes/warehouses. The former tells you what is getting done at a human level and the latter has the potential to tell you whether and how it is working. The second layer was modeling of your business strategy, including dependencies between concepts like projects and teams, which allows us to back out things like stakeholders and early warning recipients for any given progress or problem. The third was a presentation layer allowing humans to get a bird's-eye view of what's happening, including generating artifacts like meeting decks.

Ultimately this 1) wasn't successfully solving an urgent enough problem for most businesses and 2) was too difficult to adopt.

LLMs do break open opportunities in this space so I expect to see some more versions of this, perhaps on top of the Credal API!


This is so sorely needed. I used the app after the PH launch and loved how easy the self-serve was!

Do you have plans to let users define "types" of data that can be redacted (like monetary terms in a contract, code embedded in documents etc)? Also, any plans on making this an API that other developers could build on top of?


Great questions and thanks for trying the product!!

Yup, a few thoughts here - we're exploring using embeddings to let you describe what you want to hide, which will then immediately show you which of your already-synced data (or which previous requests) would be caught by that.

On the API side: yes ABSOLUTELY! The API is already live and used intensely by some of our startup customers like Sourceful. The API docs for using the OpenAI models are here: https://credalai.notion.site/OpenAI-Drop-in-API-0ef7cfd18a7c...

and the Anthropic models here: https://credalai.notion.site/Anthropic-Drop-In-API-ad298f6f7...


That looks like what I need for data privacy for my chat-with-PDF tool Documind. https://documind.chat


Nice! Which AI model are you using for it? If you're using ChatGPT, you can actually use our ChatGPT API and get the PII redaction for free, with hopefully hardly any code changes


Very cool. The demo was really impressive, this feels like it could be a very successful product. All the best!


This looks awesome. Congrats on the launch!


Thanks! :) It feels so surreal to be launching on Hacker News! When I was first discovering tech, the people launching YC funded startups on HN seemed like wizened old Gods to me. Now I laugh about it because obviously I'm still learning so much, even the basics, every day. I hope we get to inspire someone else the way the early YC cos inspired me



