Show HN: Psychic - An open-source integration platform for unstructured data (github.com/psychic-api)
122 points by jasonwcfan on May 22, 2023 | 25 comments
My cofounder and I used to work at Robinhood where we shipped the company’s first OAuth integrations, so we know a lot about how data moves between companies.

For example, we know that the pain of building new API integrations scales with the level of fragmentation and number of competing "standards". In the current meta, we see this pain with a lot of AI startups who invariably need to connect to their customers’ data, but have to support 50+ integrations before they even scale to 50+ customers.

This is the process for an AI startup to add a new integration for a customer:

- Pore over the API docs for each source application and write a connector for each

- Play email tag to find the right stakeholders and get them to share sensitive API keys, or give them an OAuth app. It can take 6+ weeks for some platforms to review new OAuth apps

- Normalize data that arrives in different formats from each source (HTML, XML, text dumps, 3 different flavors of markdown, JSON, etc.)

- Figure out what data should be vectorized, what should be stored as SQL, and what should be discarded

- Detect when data has been updated and synchronize it

- Monitor when pipelines break so data doesn’t go stale

This is a LOT of work for something that doesn’t move the needle on product quality.

That’s why we built Psychic.dev to be the fastest and most secure way for startups to connect to their customers’ data. You integrate once with our universal APIs and get N integrations with CRMs, knowledge bases, ticketing systems and more with no incremental engineering effort.

We abstract away the quirks of each data source into Document and Conversation data models, and try to find a good balance to allow for deep integrations while maintaining broad utility. Since it’s open source, we encourage founders to fork and extend our data models to fit their needs as they evolve, even if it means migrating off our paid version.
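
As a rough illustration of what normalizing into shared data models can look like, here is a minimal sketch. The field names, the `normalize` helper, and the two source payload shapes (`html_kb`, `json_tickets`) are assumptions made up for this example, not Psychic's actual schema or API.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Document:
    """Hypothetical normalized document; field names are illustrative."""
    id: str
    source: str      # e.g. a knowledge base or ticketing system
    title: str
    content: str     # plain text after normalization
    metadata: dict = field(default_factory=dict)


@dataclass
class Message:
    """Hypothetical single message inside a conversation."""
    sender: str
    text: str


@dataclass
class Conversation:
    """Hypothetical normalized conversation (ticket thread, chat, etc.)."""
    id: str
    source: str
    messages: list = field(default_factory=list)


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the tags around them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def normalize(source: str, raw: dict) -> Document:
    """Map a source-specific payload (assumed shapes) onto Document."""
    if source == "html_kb":
        return Document(id=raw["id"], source=source, title=raw["title"],
                        content=html_to_text(raw["body_html"]))
    if source == "json_tickets":
        return Document(id=str(raw["ticket_id"]), source=source,
                        title=raw["subject"], content=raw["description"])
    raise ValueError(f"no connector for source {source!r}")
```

Each new source then only needs one more branch (or a registered connector function) that produces the same `Document` shape, which is what lets downstream code stay source-agnostic.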

To see an example in action, check out our demo repo here: https://github.com/psychic-api/psychic-langchain-tutorial/

We are also open source and open to contributions, learn more at docs.psychic.dev or by emailing us at founders@psychic.dev!




This looks like a promising idea, and potentially solves a problem I’ve faced recently.

It’s been a challenge getting my SaaS app connected to fragmented APIs belonging to many of my customers, each with their own use cases.

One of the biggest hurdles I faced was Asana’s API. A customer wanted us to hook into an Asana webhook: when a task was added to their project, they needed to push the data to their account on our platform (and vice-versa).

But because Asana is so “flexible” (ha!), all the field names in their API responses were UUIDs. It was a total nightmare to figure out which key/values were the ones we wanted. I’m not sure if/how Psychic can figure this out.
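
One common way to cope with opaque field IDs, sketched here under assumptions: if the source also exposes a lookup of field definitions (ID to display name), the keys can be relabeled before the data is used. The payload shape, the IDs, and the `relabel_fields` helper below are hypothetical and do not reflect Asana's real response format.

```python
def relabel_fields(record: dict, field_names: dict) -> dict:
    """Replace opaque field IDs with readable names.

    IDs missing from the lookup are kept unchanged so no data is lost.
    """
    return {field_names.get(key, key): value for key, value in record.items()}


# Hypothetical lookup fetched once from the source's field definitions.
field_names = {"fld_a1": "Priority", "fld_b2": "Due date"}

# Hypothetical task payload keyed by opaque IDs.
task = {"fld_a1": "High", "fld_b2": "2023-06-01", "fld_zz": 3}

readable = relabel_fields(task, field_names)
```

If two IDs map to the same display name, the later one overwrites the earlier one, so a real connector would need to detect and disambiguate collisions.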

Secondly, maybe it’s just how your landing page is phrased — but this feels like “IFTTT for AI tooling”, rather than “IFTTT powered by AI”.

I see a lot more commercial value in the latter direction. To most prospective customers, your headline “Easy to set up” doesn’t mean a React hook and Python SDK. Just give us a REST API! :)


IFTTT for AI tooling is definitely more accurate! It's not powered by AI... yet. Zapier came out with that recently: https://twitter.com/zapier/status/1658457320849018882

Definitely worth exploring but as you've experienced there are enough problems with extracting and normalizing data across the long tail of SaaS apps for us to get to reasonable scale.

re: the Asana API issue, that's both hilarious and sad. We do plan to build a transformation layer so that all data is reshaped to a consistent schema before sending it off to customers (hence the "Universal" aspect of the API). These quirks of each data source are exactly the kinds of things we want to solve for so our users don't need to worry about it.


Thanks for clarifying. Perhaps I misinterpreted your offering a bit.

> “Psychic lets you meet your customers where they are. Never let "do you integrate with that?" be a blocker.”

I’d assumed from your marketing that most of your customers’ customers are in places like Asana/Notion/Slack, and so on.

If I want to meet them there, I don’t need a vector database or Python SDK. These APIs provide normalised, structured data over HTTP.

The problem you claim to solve (disconnect between multiple platforms) can be solved by a service like “IFTTT powered by AI”. But if you offer an “IFTTT for AI tooling”, that’s all good! It just doesn’t seem to align with how you present yourselves.

Maybe it’s just product/market fit, or choice of phrasing on your landing pages. Just saying I’d spend a fortune with whoever creates a true “universal API” (in the sense of IFTTT/Zapier), designed for general-purpose use cases over HTTP. Hope that makes sense!


Gotcha! Thanks for the feedback, we'll make the copy on our site more clear. We are solving the "disconnect between multiple platforms" problem, but for a specific use case.

TBH I'm not sure it's possible to create a true universal API with the current state of LLMs. Robust integrations tend to be very use case specific. Perhaps something we explore in the future!


I can’t help but remember the ad for IBM’s UBA.

https://youtu.be/AIOqOxI0K_I

There is no magical adapter, but in general platforms that connect a lot of things do well. Love the project.


Hey I've been thinking a lot about your comment. Would you be open to connecting non-anonymously? Would love to pick your brain on API integrations. If so you can email me at jason@psychic.dev so you don't have to doxx yourself.


I have just built a Notion integration that pulls pages into our statically built API documentation website, and it was, frankly, horrible. While the end result works (the team can write docs in the tool they know, the site is built and released from the structure there automatically), it was a lot of pain to even discern children from their parent pages or parse attributes, let alone get databases right.

Considering I’ll need to get other data in there soon, probably, I’m in the market for Psychic. The question I have, though, is: can you really reconcile the schemas of several apps into one, without settling for the lowest common denominator? What do you do about platforms like Notion, that don’t even provide webhooks? We settled on polling, but obviously that won’t scale.


Reconciling schemas -> this will be hard. We're starting with just two data models (Documents and Conversations) that are relatively universal, but there's no way to avoid a lossy transformation from Notion because things like tables and embeds aren't neatly captured as a Document without making our data models just as complex. I suspect LLMs can help find a good balance between generalization/depth since they're very good at automating work that typically requires a lot of customization.

Data syncs -> If the source doesn't offer webhooks, we just poll daily, do a diff on our side, and send the updated data to our customer. I'm not aware of any way to avoid polling when webhooks aren't available, but we plan to do the polling ourselves so we can provide a webhook-like experience for customers.
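
The poll-then-diff approach described above can be sketched roughly as follows. This is a minimal illustration, assuming each poll returns the full set of documents keyed by ID; the function names and event labels are made up for the example and are not Psychic's implementation.

```python
import hashlib
import json


def fingerprint(doc: dict) -> str:
    """Stable hash of a document's content (key order does not matter)."""
    payload = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_snapshots(previous: dict, current_docs: dict):
    """Compare fingerprints stored from the last poll with a fresh poll.

    previous:     {doc_id: fingerprint} saved after the last poll
    current_docs: {doc_id: document dict} returned by this poll
    Returns (events, new_fingerprints); events are (kind, doc_id) tuples.
    """
    new_fingerprints = {
        doc_id: fingerprint(doc) for doc_id, doc in current_docs.items()
    }
    events = []
    for doc_id, digest in new_fingerprints.items():
        if doc_id not in previous:
            events.append(("created", doc_id))
        elif previous[doc_id] != digest:
            events.append(("updated", doc_id))
    for doc_id in previous:
        if doc_id not in new_fingerprints:
            events.append(("deleted", doc_id))
    return events, new_fingerprints
```

The caller stores `new_fingerprints` between polls and forwards each event to the customer's endpoint, which is what makes a polled source look like a webhook from the outside. Storing only hashes keeps the state small, at the cost of not being able to say which fields changed.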


do you think you'll have issues with rate limiting and security blocks more generally if you are doing lots of polling from your side on behalf of many integrations? One time i wrote a script to poll for changes to my frequent flyer mile account and they banned me and froze my miles. just from a polling script.


Even with oauth applications you will experience rate limiting 'at scale' when many users have authorized your single "application" to perform actions. In my experience, some cases make it easier to have the customer create and provide the oauth application details to help mitigate rate limits, review requirements, and access restrictions.


Yeah we do offer that actually! For customers with significant volume it'll help lighten the load


Yes - we will run into this problem eventually with a single OAuth application. Having our customers use their own credentials is a workaround but won't work for some platforms that have a really onerous approval policy for new apps. If you or anyone else knows of a solution to this problem let me know, because we haven't crossed that bridge yet.
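
For illustration only: one way to at least stay under a known per-application limit is to throttle on the client side, with one token bucket per OAuth application. This does not raise the platform's limit or help with app approval; it only avoids tripping the limit. The class, the rates, and the app names below are assumptions for the sketch.

```python
class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # start full
        self.updated = now       # timestamp of the last refill

    def try_acquire(self, now: float) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per OAuth application (hypothetical names and limits), so a
# customer who brings their own app gets an independent request budget.
buckets = {
    "shared-app": TokenBucket(rate=1.0, capacity=2.0),
    "customer-owned-app": TokenBucket(rate=1.0, capacity=2.0),
}
```

In real use the caller would pass `time.monotonic()` as `now` and delay or requeue a poll whenever `try_acquire` returns `False`.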


I worked at [undisclosed Fintech] and their solution was pretty funny. They had a rack of mobile phones on 3G/4G running Android to poll [undisclosed throttle-happy bank]. Was an incredible hackathon project that lasted years.


The reason to use the Pro hosted plan is for support and the convenience of not needing to self-host? Or is there actual functionality you don't get by self-hosting?


Correct, the benefits of the paid version are support and convenience. 100% of our code is in the open source repo.


Looks interesting! I tried to sign up to the cloud service with GitHub and got an error message that the integration wasn't enabled.


Thanks for the heads up! We'll have that fixed ASAP but in the meantime signing up with Google or email/password should work


Congrats on the launch! I am curious how you see apps evolving to provide natural language interfaces on top of existing APIs. Also, do you plan on strictly remaining the data layer (between a startup and its API integrations) or do you plan on dogfooding your platform for a particular killer use case?


Personally, I think it's going to happen but not for nearly as many applications as you might think. Point and click is still king for any use case that requires precision (e.g. booking an Uber).

We plan to focus strictly on the data layer, helping companies connect to their data sources through a universal API. We already are dogfooding our platform for some customers! By far the most popular use cases are customer support automation and search through workplace apps.

It's fascinating that the build/buy decision has flipped for a lot of companies. As long as they have an engineering team, a lot of companies are trying to build out their own AI capabilities in house, I'm guessing because no one wants to miss the boat.


Why the decision to license as GPL?


We specifically chose AGPL-3 because we wanted it to be permissive, but we didn't want others to fork our project, take it closed source, and charge for it without adding back anything of value.

We also don't expect companies to customize the functionality, just to self-host it or use the cloud version, or use it for personal projects.


what is your concern with gpl? you can still commercialize apps that use it as long as you use the normal interfaces it exposes.


why would you call it psychic? stupid name, uninformative, difficult to google.


I think the name fits perfectly: since the intent is for companies to use it with LLMs, the quality of the results is the same as you would get from asking a psychic.


Exactly our intent :) some people just like to hate



