Kina Knowledge, using Common Lisp extensively in their document processing stack (lisp-journey.gitlab.io)
192 points by p_l on Oct 22, 2021 | 60 comments



This part is really interesting:

> It is fast - our spatial classifier takes only milliseconds to come to a conclusion about a page (there is additional time prior to this step due to the OpenCV processing - but not too much) and identify it and doesn’t require expensive hardware. Most of our instances run on ARM-64, which at least at AWS, is 30% or so cheaper than x86-64. The s-expression structures align to document structures nicely and allow a nice representation that doesn’t lose fidelity to the original layouts and hierarchies.

The article also mentioned they have 3 programmers and 100k lines of code - that sounds impressive.
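To make the quoted s-expression point a bit more concrete, here is a purely hypothetical sketch (my own guess at a shape, not Kina's actual format) of how a classified page might be represented:

    ;; Hypothetical sketch only -- not Kina's actual data format.
    (page (:type :invoice :confidence 0.97)
      (region (:bbox (40 60 560 120))
        (line "ACME Corp Invoice #1042"))
      (table (:bbox (40 200 560 520))
        (row (cell "Item") (cell "Qty") (cell "Price"))
        (row (cell "Widget") (cell "3") (cell "12.50"))))

Nested lists like this keep the page hierarchy and layout intact while staying trivial to pattern-match and transform from Lisp.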


> The article also mentioned they have 3 programmers

One of the problems we had with promoting commercial use of Scheme and then Racket was that -- although some companies were using it to great success -- there weren't any job postings for it.

It was the norm for a single programmer to be doing the work that would normally be a team (sometimes multiple teams).

And the knowledge of that success wouldn't be well-known. (Because they preferred to focus on the work, or because the larger team of business etc. people around them was also small, or, in at least one case, because the business person thought "we use Lisp" would kill business deals even though the code wasn't customer-visible.)

So there would be no success stories, no job postings mentioning it as something people should learn, etc.

Which, I suppose, was good for keeping open communities self-limited to people who were genuine enthusiasts not motivated by money, with no need to posture as influencers or do SEO, but... not so great for bringing in a large developer base, getting lots of startups using it, etc.


> It was the norm for a single programmer to be doing the work that would normally be a team (sometimes multiple teams).

One thing that surprised me when I entered the professional world was just how much this is seen as a downside. At almost every level, companies will choose technologies and techniques that allow them to hire more headcount, even if that results in a worse end product.

Many startups value the appearance of having a large actively-hiring engineering team, and many BigCo middle managers want lots of direct reports. They don't usually care how much work gets done per employee, and sometimes they don't even care about the total amount of work that gets done across the organization. It's all about getting warm bodies into seats, either to impress investors or to gain status within the organization.


Bus factor. What happens if the knowledgeable lead developer gets into an accident? What happens if they retire? More realistically, what happens when they burn out because it's so hard to hire other team members and so they can never really take a vacation and have to take their pager everywhere with them? That's why businesses optimize for headcount.


I have yet to encounter positive solutions to those questions; in practice, increasing headcount does not relieve pressure on team members - it's used to enable quicker replacement by yet another junior to be burned at the altar, who will leave after gaining experience and burning out.

The classic solution to the "we use a niche language" problem is to enable on-the-job learning of it, rather than just doing simple fact-checking on whether someone honestly learned Popular Language X as claimed on their CV.


Incompetent managers certainly believe they're hedging for bus factor by increasing headcount.

In practice, it doesn't work. For any given product there's still always a small/solo core team.

Don't optimise your business for mediocrity.


There's probably an analogy to monoliths and distributed systems here. You gain way more resilience when you get more people on your team, but what was once a thought firing in one person's brain is now two people talking, which is an order of magnitude slower. On the flip side, you can probably handle more things in general this way.


The interview sort of answers this:

> Because we operate a lot in Latin America, I trained non-lisper engineers who speak Spanish on how to program Lisp

So they grew their Lisp heads from raw humans. That's not an overhead most companies would readily accept.


I think here he is talking about "integrators" - engineers who write rules in DLisp to process documents.


Hiring thousands of developers is a success metric now.


Management is still one of the fastest ways to a high paying job. Managers get rewarded (implicitly or explicitly) for being in charge of more people, either with better pay in the same position or by appearing to be more competitive when going for promotions. This creates a perverse incentive where someone can make decisions which promote a higher headcount at the cost of actual capability or efficiency.

It's often not even deliberate. No one sits down and says, "This language or tool kit requires a higher headcount, so I'll select it for my project". But they aren't motivated to find a more efficient approach that results in them being put in charge of a smaller team (because this would cost them and not reward them).

EDIT: Some grammar things.


It's also a way to hedge risks. If you've got a single programmer working on something, they're a single point of failure and they also have huge leverage (which can be good or bad depending on the person and circumstances).

Not that middle management bloat isn't a thing...it definitely is. But I don't think it's as black-and-white as it's made out here.


> It was the norm for a single programmer to be doing the work that would normally be a team (sometimes multiple teams).

I've felt this very often. There are economic paradoxes and sociological issues at play. People want to be impressed: a big corp makes the work look hard, while three Lispers don't have the same glow, and even at lower prices you may fail to sell good work.

A lot of the world works this way; if everybody worked smart and efficiently, there would be a lot fewer jobs.


> It was the norm for a single programmer to be doing the work that would normally be a team (sometimes multiple teams).

Be careful with what you conclude from this observation. Is it that A) Lispers are generally 10x Programmers, or that B) Lisps are generally poor languages for team environments?


I can’t imagine the choice of language being strongly correlated with programmer skill (it would be elitist to think so, and everybody feels best in the language they spend the most time in).

I think C) Lisp is well suited to certain applications: from what I can see, web development or niche areas where exploratory programming is required. The downside of Lisp is the lack of a good GUI toolkit (CAPI is the best on offer, but it is not as good as what other languages provide, for various reasons) and not having the backing of Apple, Microsoft, Google or Facebook, and thus lacking in APIs.

But given its dynamic development (it's a real joy), it's very well suited to exploratory programming.

Web development is an undiscovered gem for Lisp. It doesn't face the GUI / lack-of-APIs issue, since you can work directly in HTML / JS / CSS for the front end, and it's a serious solution that covers servers, databases and all aspects of the stack very well. It's simply amazing (this is coming from a web developer), but I shouldn't advertise this too much - it's a secret weapon!
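For anyone curious what the server side can look like, here is a minimal sketch using Hunchentoot (my choice of library for illustration - the parent didn't name a stack):

    ;; Minimal sketch, assuming Quicklisp and Hunchentoot are available.
    (ql:quickload :hunchentoot)

    (hunchentoot:define-easy-handler (hello :uri "/hello") (name)
      (setf (hunchentoot:content-type*) "text/html")
      (format nil "<h1>Hello, ~a!</h1>" (or name "world")))

    ;; Start the server, then visit http://localhost:8080/hello?name=Lisp
    (hunchentoot:start (make-instance 'hunchentoot:easy-acceptor :port 8080))

You can redefine the handler from the REPL while the server keeps running, which is a big part of the exploratory appeal.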


The choice of language is anecdotally strongly correlated with programmer skill, but the best evidence I've seen was not of the form "programmers using language X are better than those using language Y" - it was about the availability of programmers.

Counterintuitively, it was the "elitist" lower numbers involved in some languages (not just Lisp) that some companies found advantageous in hiring. Essentially, it selected out people with lower experience and those not willing to learn a new language. An example I was given was that choosing Common Lisp and combining it with globally remote hiring greatly increased the quality of incoming applicants - and that had they gone with the popular option (Python in their case), they'd have had to deal with a deluge of fresh bootcamp grads, many of whom had an unwarrantedly high opinion of themselves.

At both Kina Knowledge and ITA Software (and, from what I heard but can't cite now, other companies as well), teaching people Common Lisp on the job apparently works pretty well, and willingness to master a new language correlates much better with good skills. It does have the disadvantage that you can't throw tickets directly at a new hire, but the timeframe I was given was essentially "it takes around a month for a new hire to go from zero to productive Lisp programmer".


Exactly. The average JavaScript programmer is much worse than the average Lisp programmer not because of any inherent features of either language, but because JavaScript is the first thing beginners learn in bootcamps. It's the same problem PHP used to have.


Common Lisp certainly gives you the tools to make an unmaintainable rat's nest out of your application, but it also gives you the tools not to. I expect there's survivorship bias here.

More generally what you're referring to as elitism is really just the result of a simple theorem. As the size of the set of Xlang programmers approaches the size of the set of all programmers, the average competence of an Xlang programmer approaches the average competence of all programmers. The asymmetry is that, as far as I'm aware, there aren't really any significant languages that attract small communities of abnormally incompetent programmers. On the other hand, there are languages, like Haskell, the MLs, and some Lisps, that do attract relatively small communities of abnormally competent programmers.

None of this is to say that someone that only writes JavaScript can't be an exceptionally competent programmer. We're talking about group averages here.


Well, that's one way of looking at it. The one I've been thinking of is that the distribution of competence is similar in all languages, but with niche ones it becomes easier to filter, because good applicants do not disappear under a deluge of bad ones.

There's also survivorship bias in sticking with programming long enough to actually look for jobs in niche languages, or to accept that you will be taught on the job - something I've only seen with either experienced or more open-minded/educated people.


I think that's because with "elitist" programming languages you can usually play with ideas much more freely than with other languages. I remember reading in "The Passionate Programmer" that they increased their signal-to-noise ratio a lot when trying to recruit Java developers by looking for Smalltalk developers.


I think the productivity advantages of old-fashioned Lisp and Smalltalk environments are real enough. I've experienced them enough times in practice to be fairly well persuaded.

I'm a Lisper by preference, with almost thirty-five years of experience in the market. Because Lisp is a niche language, I've also had to be competent working with other languages, so I have at least some basis for comparison. My impression is that Lisp and Smalltalk do offer real productivity advantages. Don't misinterpret me: I'm not saying there are no disadvantages. I am saying that the advantages are real, though.

A lot of the productivity advantages are development-environment features, rather than language features, but the language features make the development-environment features easier to build and nicer in practice. As a simple example, it's both easier and faster to look up definitions if they're already easily accessible in memory than if you need to write a separate tool to index the source files.
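As a small illustration, a running SBCL image can answer that kind of question directly (sb-introspect is SBCL-specific; this is just an example of the idea, not a claim about any particular environment):

    ;; Ask the live image where a function was defined --
    ;; no separate source-indexing tool required.
    (require :sb-introspect)
    (sb-introspect:find-definition-sources-by-name 'mapcar :function)
    ;; => a list of DEFINITION-SOURCE objects pointing at the source location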

Are Lispers generally 10x programmers? I doubt it, but the language and the best development environments for it do offer some real leverage, and programmers accustomed to them can get a lot of productivity out of them. I've actually been one of those 10x programmers a few times now, as measured by vcs statistics, or feature-delivery times, or client expectations. In each case that it happened it wasn't because I was so much smarter than everyone else; it was because I was permitted to use tools that I knew would give me a lot of leverage, and I already knew how to use them.

Is Lisp bad for team environments? Not that I've seen. The smallest team I've experienced is just me. The biggest was about 50 or 60 people. I've seen teams of Lispers at various sizes that had good discipline, structure, and productivity.

I have noticed remarks from some programmers that they've seen big, messy Lisp codebases, and I don't doubt it. Common Lisp, at least, is pretty unopinionated. You can pretty much do whatever you want with it. It doesn't prevent you from setting up good project structure, and if you want tools to help with that, Common Lisp makes it pretty easy to create them, but it doesn't guide you much out of the box, and if you proceed with no plan or structure, it doesn't prevent you from making a mess. If you want good structure, you have to deliberately create it. I guess I'd say it's not a good language for a bunch of people with no experience in it and no good guidance.

I don't really agree with your point C; Common Lisp is a general-purpose language. I've used it productively for quite a large variety of purposes, from system programming to document processing to compilers to Desktop GUI apps to network-security apps to webapps to Good Old-Fashioned AI, and I haven't found it dramatically stronger or weaker in any specific areas.

There's some truth to what you're saying, but I think it has more to do with the availability of supporting code than with the nature of the language. Some things will be easier to do than others because it'll be easier to find working code of good quality.

That's not a function of the language so much as it's a function of where open-source contributors have chosen to spend their efforts. That's where the relatively small size of the global Lisp community works against Lispers: there just aren't enough open-source Lisp contributors to make all the supporting code that we might like to have available.

A secret weapon? Yeah, sometimes it is. I've surprised folks a couple of times with how much I could get working quickly, and that's generally been to my advantage.

It can also work against you, most especially when you encounter a decision maker who really dislikes Lisp for whatever reason. I've encountered them a few times. For example, I knocked out a working solution to a problem for a bid before anyone else, and got an immediate enthusiastic response. The enthusiasm lasted until the client found out I used Lisp, at which point they ghosted me.

On the other hand, not long ago I was involved in a proof-of-concept that got funded, and one of the evaluators who supported us said in the review that the only reason he thought we could make it work was that we were doing it in Common Lisp. So people's preferences don't always work against you.


One of the things I like about Racket is that it has a cross-platform GUI toolkit built on GTK in the standard library [1]. It also has a GUI builder app (though I've never used it, so I can't say how good it is) [2].

1. https://docs.racket-lang.org/gui/

2. https://github.com/Metaxal/MrEd-Designer


I’ve heard that. How do you find racket for speed and also interactive development?

I don’t mind a bit of scheme / racket, so would be curious to jump in.


My most recent example is a prime sieve I wrote in Racket, which was at least 10% faster than the Python implementation I modeled it after (can't remember the exact number right now), and then another 10% faster than that when I converted it to Typed Racket [1].

For interactive development, it's true Racket isn't as heavy into top-level development as other Lisps, but I often run my code and then play around with modifications to it in the REPL afterwards using racket-mode in Emacs.

[1] https://github.com/diego-crespo/Primes/blob/drag-race/PrimeR...



One of the main developers of a Lisp compiler is employed by Google to work on exactly that.


Exactly… a new GUI lib, is this what you mean? (There is also a group working on Qt5 bindings; they say they have a working prototype and have been ironing things out for some time.)


> but I shouldn't advertise this too much - it's a secret weapon!

But do you use this weapon?


Yes, I am thinking of starting a part-time web company based on it. But it's very early days, so I shouldn't be quoted or assumed to be anywhere near successful, as my preceding sentence may imply.


Lisp allows you to build compilers for your own languages. Compilers are a force-multiplier, but then you're the only one that knows your language.
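As a toy sketch of what that can look like in practice, here is a tiny rule "language" compiled into ordinary functions by a macro (entirely made up, nothing to do with Kina's DLisp):

    ;; Each rule becomes a plain function over a plist-style document.
    (defmacro defrule (name (&rest fields) test result)
      `(defun ,name (doc)
         (let ,(loop for f in fields
                     collect `(,f (getf doc ,(intern (symbol-name f) :keyword))))
           (when ,test ,result))))

    (defrule invoice-p (doc-type total)
      (and (string= doc-type "invoice") (> total 0))
      :invoice)

    ;; (invoice-p '(:doc-type "invoice" :total 100)) => :INVOICE

It takes an afternoon to build something like this and it reads well - but it also means the next hire has to learn your rule language on top of Lisp itself.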


> but then you're the only one that knows your language

This only applies if the language implementation is complex, the language syntax is unlike anything in existence, and the language creator does not pick good names for functions and variables, does not write any documentation, manuals, or papers to explain things, and is not available for any questions.


I don't think that's true. What you describe is a language that's hard to learn. But plenty of people already have a moderately hard time when switching languages, or when using languages inside languages. One example would be SQL. SQL is a great DSL for relational databases, but many people never really embrace it or learn it "properly".


C) Lisp is a "10x" language?

I can't be similarly productive in a language that doesn't have a simple syntax (easy to write) and doesn't allow you to do meaningful live programming.

By the time you've got a major headcount, organizational bottlenecks will render the individual velocity of programming tools irrelevant. Lisp will scale poorly because every language does. It's hard to justify unusual technical choices if there's no benefit. It makes perfect sense that a large organization would eventually drop Lisp.

Every startup is in the game of leverage and thus should keep the team as small as possible. Lisp helps you do that. Once you're over a certain size you WILL pay a huge organizational overhead and there's nothing you can do about it. The more logically monolithic your tech/system is the bigger the price. A lot of work just cannot be parallelized. Most of the things startups are working on tend to be that way.


> Lisp will scale poorly because every language does

I guarantee you that a highly opinionated language like Go scales horizontally better* than a totally unopinionated language like Common Lisp. It's literally why Google spent a bunch of resources developing it.

It's also why Lispers tend to despise Golang, or deride it as a simple language for average developers.

*Requires fewer resources to get a new hire up and running on prod code


Ehhh, not all of us despise Go.

Unlike its creators, who were quite blunt in saying it's a simple language for middling devs.

(Other than the part where it's essentially a new version of C from an alternative world.)


Really interested to play with DLisp if/when it is open-sourced!


Have they said whether it's a complete CL implementation or just a subset? I have the impression that it gets asymptotically more difficult as you approach the finish line.


Assuming you aren't just talking about how any project becomes harder to fix bugs in once you've fixed the easy bugs, it only gets more difficult as you approach the finish line if you don't plan on making a CL implementation from the start.

It's really easy to make something that kind-of looks like CL by use of clever smoke and mirrors, and many "Lisp in X" projects do so. See e.g. parenscript's LOOP implementation.

Other notes:

- Implementing CLOS is a massive undertaking, but there are two reasonably good portable implementations of it (Closette and PCL).

- The numeric tower is sufficiently loosely specified that you can typically get away with using any existing bignum library.

- LOOP and FORMAT are annoying to implement just because they are big DSLs (see the small LOOP example after this list for a taste). I'm also not aware of any portable implementations of them (though SICL might have one).

- Most of the rest of the effort will be spent on implementing things that interact with the system. Streams/files/pathnames are all annoying.
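To give a sense of why LOOP alone is a real chunk of work, even a modest form (my own example) involves clause parsing, destructuring, named accumulators, and an epilogue:

    ;; An implementor has to parse and compile every one of these clauses.
    (loop for (name . score) in '(("a" . 90) ("b" . 70) ("c" . 85))
          when (> score 80)
            collect name into winners
          count t into total
          finally (return (values winners total)))
    ;; => ("a" "c"), 3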


LOOP can be cheated a bit, as it originated as a separate semi-portable package AFAIK, just like for CLOS you can use PCL for the heavy lifting. Specifically, there was the MIT LOOP package, which wasn't exactly ANSI compliant but got you a significant portion of the way there.

Also, if you're willing, it's possible to just lift a bunch of code from SBCL/CMUCL due to permissive license.

MIT LOOP https://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/...

A few other versions, including the Symbolics update to MIT LOOP, which is supposedly closer to ANSI: https://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/...


When you say "Implementing CLOS is a massive undertaking" do you mean implementing CLOS in some other language than Lisp?

Shouldn't it be possible to implement most of CLOS in CLOS itself and thus reuse already existing CLOS-code?


I meant implementing CLOS in any language is a large undertaking, but then pointed out two portable implementations in CL one could use, to save that work.


Looks like it's CL-inspired, so kind of a subset.


Yes! This sounds really promising.


> I liked the idea of distributing binary applications as well, which we needed to do in some instances, and building a binary runtime of the software was a great draw, too.

FWIW this is possible to do with Clojure using GraalVM. Only mentioning this because of the Clojure comparison in the earlier paragraph. While I do lament the JVM, I haven't found a straightforward way to build statically compiled binaries with SBCL. I would love to be proven wrong though! :D


Doesn't `(save-lisp-and-die :executable t)` [1] create static binaries?
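For context, this is roughly what a build looks like (minimal sketch with a hypothetical `main`, e.g. in a build.lisp loaded via `sbcl --load build.lisp`):

    ;; Minimal sketch -- hypothetical entry point, not from the parent's setup.
    (defun main ()
      (format t "hello from a standalone binary~%"))

    (sb-ext:save-lisp-and-die "hello"
                              :executable t
                              :toplevel #'main)

That dumps a `hello` executable that is self-contained in the sense that it bundles the whole Lisp image, though as shown below it still links dynamically against a couple of system libraries.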

I had a look at one of my SBCL-made binaries on macOS; it shows this:

    ▶ otool -L lisp-enc
    lisp-enc:
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1292.100.5)
        /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)

EDIT: GraalVM takes several minutes to generate binaries, and the binary may not behave the same as the JVM-based runtime, so I wouldn't recommend using that.

[1] http://www.sbcl.org/manual/#Function-sb_002dext-save_002dlis...


Yeah, you've got a dynamically linked binary.

There is work in progress on SBCL to make it possible to create really static binaries:

https://www.timmons.dev/posts/static-executables-with-sbcl-v...

I tried this fork and it worked for me!


Well, the example was on OSX, and it's generally not a good idea to do static linking there ;) (the same goes for many other systems: you minimally need a certain amount of dynamically linked stuff).

Blaze/Bazel rules for Lisp support building fully static SBCL binaries (except possibly for grabbing dynamic stuff that is forced by glibc). ECL also supports static linking.


Give SBCL another try for building compiled applications. If you compile SBCL from source, enable compression of standalone compiled apps.

I own a license for LispWorks Pro, which does a very good job building applications, but SBCL is also good for delivering apps.



> Lisp allows us to scale dramatically and manage a large code base.

Wow, really? How big is your company?

> Right now, in our core company we have three people, two here in Virginia and one in Mexico City.


Sounds like scaling developer power all right.

Reminds me of quotes from Google SRE materials about scaling "SRE per managed server count".


Seems like you're trying for sarcasm with the rhetorical question, but it's unclear. Scale what? Team size?


You know Google's flight scheduling software runs on Borg, right?

After they acquired ITA, there have been few openings in between, and yet they are, apparently, one of the biggest consumers of quotas.


What's a quota?


I think he means they are one of the biggest consumers of cluster resources. Each group within Google probably has a quota for how many resources they are allowed to use. It's unclear here whether their consumption is so huge because they are successful or inefficient, though...


Does anybody know what they actually do? I see them mentioned all the time but I don't get it. Is it just flight information, or do they do pricing analysis and other stuff?


QPX (ITA's, and now Google's) system is responsible for calculating and searching airfares, i.e. the complete "solution" for "I want to get from A to B with these constraints". It's what responds when you ask Google Flights for a route, essentially. This is quite computationally complex due to the many dimensions involved in airfare calculations. The system is essentially implemented on top of SBCL with some small bits of C++ (responsible, IIRC, for handling memory-mapped airfare data files).


Thanks!!


oh no no, scale only means hiring tons of programmers after winning funding rounds in these parts


Apparently, scale means LOC of Lisps (or LOL, if you will).



