- Hire an intern / "Customer Service Representative" / "Technical Account Specialist" to manually copy data from one service into another
- Dump some file in a directory and hope something is treating that directory like a queue
- Read/write from the same database (/ same table)
Or the classic Unix trajectory of increasingly bad service communication:
- Read/write from the same local socket and (hopefully) same raw memory layouts (i.e. C structs) (because you've just taken your existing serialized process and begun fork()ing workers)
- that, but with some mmap'd region (because the next team of developers doesn't know how to select())
- that, but with a local file (because the next team of developers doesn't know how to mmap())
- that, but with some NFS file (for scaling!)
- that, but with some hadoop fs file (for big data!)
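The first step of that trajectory can be sketched in a few lines. This is a hypothetical illustration (the record layout and field names are made up), but it shows the core fragility: both sides must agree on the exact byte layout out-of-band, with no schema, versioning, or framing beyond fixed-size records.

```python
import socket
import struct

# Hypothetical record layout, standing in for a shared C struct.
# Both sides must agree on this format string exactly: "<" forces
# little-endian with no padding; the native "@" format would silently
# change the layout across compilers and architectures.
RECORD = struct.Struct("<i8sd")  # int32 id, 8-byte name, float64 score

parent, child = socket.socketpair()

# "Writer" process (here the same process, for illustration).
parent.sendall(RECORD.pack(42, b"widget\x00\x00", 3.14))

# "Reader" must know the size and layout through some side channel --
# there is nothing in the bytes themselves to check against.
job_id, name, score = RECORD.unpack(child.recv(RECORD.size))
print(job_id, name.rstrip(b"\x00"), score)  # 42 b'widget' 3.14
```

Every later step in the list (mmap, file, NFS, HDFS) keeps this same implicit-layout contract; only the transport gets slower and more shared.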
Obviously all of these are at some level an 'application programming interface'. But then, technically so is rowhammering the data you want into the next job.
"Think of the acronym CSV. Don't look up the definition of the format, just meditate on the idea of the format for a bit. Then write your data in the format you have just imagined is CSV, making whatever choices you feel personally best or most elegant regarding character escapes. Pass this file on to your downstream readers, assuring them it is a CSV file, without elaborating on how you have redefined that."
"Comma separated values? But my data has commas in it! Ah, I know, I'll use tabs instead, I've never seen a user put a tab so that'll work perfectly forever and definitely won't cause a huge fucking mess for the poor bastard who has to try and decipher this steaming pile."
The quotation marks are stylistic rather than for attribution.
My personal experience comes from ingesting product feeds from online stores. Misapplication of \ from other encodings was the most common sin, but I'm pretty sure I saw about three dozen others, from double-comma to null-terminated strings to re-encoding offending characters as hex escapes. (And, of course, TSV files called CSV files, with the same suite of problems.)
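The maddening part is that every one of those bespoke escaping schemes is solved by just using a real CSV library, which applies RFC 4180-style quoting. A minimal sketch with Python's stdlib, round-tripping the exact characters people invent workarounds for:

```python
import csv
import io

# Rows containing every character that spawns a homemade escaping scheme.
rows = [
    ["id", "description"],
    ["1", 'has, a comma and "quotes"'],
    ["2", "has\ta tab"],
    ["3", "has a\nnewline"],
]

buf = io.StringIO()
csv.writer(buf).writerows(rows)  # quotes and doubles as needed, per RFC 4180

round_tripped = list(csv.reader(io.StringIO(buf.getvalue())))
print(round_tripped == rows)  # True -- no double-commas, no hex escapes
```

Commas and newlines get quoted, embedded quotes get doubled, tabs pass through untouched, and any conforming reader in any language recovers the original data.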
ACH is not the lifeblood of the American economy. If you removed ACH there would be unimaginable disruption, but I can't see that economic activity would completely cease. I think the lifeblood of the American economy is our population.
This gets abused even within one service. If I could get my coworkers to FFS stop using rows in a database as a degenerate kind of communications channel between components (with "recipients" slow-polling for rows that indicate something for them to do), I'd be a lot happier.
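For anyone lucky enough not to have seen it, the pattern looks roughly like this (table and column names are invented for illustration; this is the anti-pattern, not a recommendation):

```python
import sqlite3

# A table doubling as a message queue: the "sender" inserts rows, the
# "recipient" slow-polls for unclaimed ones. Names are hypothetical.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE tasks "
    "(id INTEGER PRIMARY KEY, payload TEXT, claimed INTEGER DEFAULT 0)"
)

# "Sender" component: instead of calling an API, insert a row.
db.execute("INSERT INTO tasks (payload) VALUES ('resize image 123')")

def poll_once(db):
    # Note the race between the SELECT and the UPDATE: two pollers can
    # grab the same row. Fixing that properly is half of what a real
    # queue or API call would have given you for free.
    row = db.execute(
        "SELECT id, payload FROM tasks WHERE claimed = 0 LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE tasks SET claimed = 1 WHERE id = ?", (row[0],))
    return row[1]

print(poll_once(db))  # 'resize image 123'
print(poll_once(db))  # None -- "queue" drained
```

Add retries, dead rows, backoff, and cleanup jobs, and you've reimplemented a message broker badly, one incident at a time.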
> rowhammering the data you want into the next job.
I hope the aforementioned coworkers don't read HN.
Oh damn you just reminded me. They tried to bring in "Robotics" (e.g. Blue Prism) here. For in-house apps.
Apparently it was so hard to deal with the developers that instead of exposing an API they would automate clicking around on Internet Explorer browser windows.
It's funny to think about but the reality is that it's better than a lot of other options.
- The page has automatic history & merge conflict resolution
- There's built-in role-based security to control both visibility and actions (read-only vs. read-write)
- You can work on a draft and then "deploy" by publishing your changes
- You can respond to hooks/notifications when the page is updated
Considering that the alternative is either editing a text file on disk or teaching business users to use git, a Confluence page is not so bad.
As crazy as that may seem on the face of it, it’s actually kind of genius for merging the roles of those who need to configure / don’t get git, and those who need to develop against the configuration.
Yeah, SFTP + CSV file is still the standard for enterprise software.
The problem is that these kinds of things have to be built to the lowest common denominator, which is usually the customer anyway. The customer in enterprise software is usually not a tech company; they typically have outdated IT policies and less skilled developers than a pure tech company would have. Even if the developers are capable of doing something like interacting with a queue, they also need to be supported by a technology organization which can deal with that type of interaction.
Sometimes you get lucky and someone in the past has pushed for that kind of modernization. Or your project really won't work without a more advanced interaction model and you have someone in the organization willing to go to bat for a tech enhancement.
But otherwise the default is "Control-M job to consume/produce a CSV file from/onto an SFTP"
My experience is that the usual reason for RPC-over-SFTP is that it is the only thing that corporate IT security does not control and thus cannot make inflexible. Adding another SOAP/JSON/whatever endpoint tends to be a multiyear project, while creating a new directory on a shared SFTP server is a way to implement the functionality in a few hours.
Shared database is, in fact, a classic enterprise integration pattern, and much of classic relational database engineering assumes that multiple applications will share the same database; in the classic ideal, each would use it through a set of (read/write, as needed) application-specific views, so as to avoid exposing implementation details of the base tables and to permit each application, and the base database, to change with minimal disruption to any of the others.
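That classic ideal is easy to sketch. Here's a minimal illustration with SQLite (table, view, and column names are invented): the base table is owned by the database, and each application reads through its own view, which both hides columns it shouldn't see and insulates it from base-table changes.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Base table: applications never query this directly.
db.execute(
    "CREATE TABLE customers "
    "(id INTEGER PRIMARY KEY, name TEXT, email TEXT, ssn TEXT)"
)
db.execute(
    "INSERT INTO customers VALUES (1, 'Ada', 'ada@example.com', '000-00-0000')"
)

# Application-specific view: the mailing app never sees SSNs, and the
# base table can gain or rename columns without breaking it, as long as
# the view is kept stable.
db.execute(
    "CREATE VIEW mailing_app_customers AS "
    "SELECT id, name, email FROM customers"
)

print(db.execute("SELECT * FROM mailing_app_customers").fetchall())
# [(1, 'Ada', 'ada@example.com')]
```

Updatable views (or INSTEAD OF triggers) extend the same idea to writes in databases that support them.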
Reading data from another application's database is pretty common (although even this can cause chaos if done without some care), but writing to application databases is often a very bad idea and often explicitly forbidden by CRM/ERP vendors.
Calling parts of your application "services" means that you're already thinking in terms of the "services/API" metaphor. If you're not using services, you might be, for example:
- Building one large application (monolith). Parts of the application communicate with each other via function calls. Everything runs in one large process. You can go quite a long way with this approach, especially for parts of the application that are stateless. (You can also build components of the application using a service/client metaphor within the process.)
- Multiple separate applications might communicate with the same database, file system, or some other data store. Before we had distinct distributed systems components taking on the role of queues, event buses, and things like that, it was common to represent queues using folders or database tables. These approaches are still seen today, though they're uncommon in new applications.
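The folders-as-queues approach can be sketched in a few lines. This is a hedged illustration (names and the single-message naming scheme are made up), but it shows the one trick that makes the pattern survivable: atomic rename, so readers never see half-written files, and only one reader wins a message.

```python
import os
import tempfile
from pathlib import Path

# A directory used as a queue. Writes go to a temp name first, then
# os.rename() moves the file into place -- rename is atomic on POSIX
# within a filesystem, so readers never observe partial messages.
queue = Path(tempfile.mkdtemp())

def enqueue(body):
    tmp = queue / ".tmp-msg"
    tmp.write_text(body)
    # Fixed name for illustration; unique naming (and thus ordering) is
    # one of the many details this kind of "queue" gets wrong in practice.
    os.rename(tmp, queue / "msg-0001")

def dequeue():
    for entry in sorted(queue.glob("msg-*")):
        claimed = entry.parent / ("claimed-" + entry.name)
        try:
            os.rename(entry, claimed)  # only one reader wins this rename
        except FileNotFoundError:
            continue  # another reader claimed it first
        return claimed.read_text()
    return None

enqueue("hello")
print(dequeue())  # 'hello'
print(dequeue())  # None -- nothing left to claim
```

Acknowledgements, retries, and cleanup of claimed files are all left to the reader, which is roughly why these grew into real brokers and event buses.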
It's a middleware appliance that allows any API to talk to any other API and handles just about any data format. You can script it with javascript or XSLT. It can handle ad-hoc things like ftp polling.
It has the added benefit that you can add security for outside facing clients.
Disclaimer: I helped develop this appliance (but I no longer work for IBM)
IBM's business model is totally antiquated and exhausting for modern processes. We had a nightmare trying to get IBM's MDM team to finally admit their solution was not actually cloud-ready, after they said repeatedly that it was. No TLS support for DB2 out of the box for Kubernetes deployments, and the documentation sucks. But "contact us for pricing." IBM sucks.
This is the Enterprise Service Bus concept, right? I remember a pretty good conference talk about how we realized in the mid 2000s that these things are problematic and you probably want services communicating directly over dumb pipes.
It showed an evolution from a monolithic spaghetti codebase, to an SOA, to realizing there are now spaghetti connections between services in the SOA, to a very clean-looking ESB architecture, to showing how all the chaos is still there, just inside the ESB, where it's even harder to reason about or change.
You integrate a "service" by creating and linking a library that implements the entire service, including data access. Now try tracking down everything accessing the "service" database, or rolling out an upgrade to the "service".
Call out to a shell? Pass structs between C-based programs using sockets? Use runtime marshalling? Make your "API" just arbitrary JSON objects mapped to maps? Hide the entire API behind a lazy cache with undocumented side effects on cache misses? I wouldn't consider any of these "APIs", although they are interfaces.