Sourcegraph is no longer open source (github.com/sourcegraph)
444 points by CAP_NET_ADMIN on July 4, 2023 | 253 comments



Sourcegraph CEO here. Sourcegraph is now 2 separate products: code search and Cody (our code AI). Cody remains open source (Apache 2) in the client/cody* directories in the repository, and we're extracting that to a separate 100% OSS repository soon.

Our licensing principle remains to charge companies while making tools for individual devs open source. Very few individual devs (or companies) used the limited-feature open-source variant of code search, so we decided to remove it. Usage of Sourcegraph code search was even more skewed toward our official non-OSS build than in other similar situations like Google Chrome vs. Chromium or VS Code vs. VSCodium. Maintaining 2 variants was a burden on our engineering team that had very little benefit for anyone.

You can see more explanation at https://github.com/sourcegraph/sourcegraph/issues/53528#issu.... The change was announced in the changelog and in a PR (all of our development occurs in public), and we will have a blog post this week after we separate our big monorepo into 2 repos as planned: the 100% OSS repo for Cody and the non-OSS repo for code search.

You can still use Sourcegraph code search for free on public code at https://sourcegraph.com and on our self-hosted free tier on private code (which means individual devs can still run Sourcegraph code search 100% for free). Customers are not affected at all.


Sourcegraph only provided non-OSS images, and the build process was difficult and broken for a long time. The application itself was frequently broken in the OSS version as well; searching the issues for a few minutes brings up quite a few results. [1] [2] [3] [4]

It's no wonder that usage of the OSS version was pretty low when few were able to build it, and even if they managed that, the resulting application was broken every few releases.

Both VS Code and Chromium are easy to build and, due to their nature and popularity, are available prebuilt from many sources. I would install an "unofficial" Chromium build from my distribution's repository; I wouldn't keep my code in an unofficial Sourcegraph build from some random person on GitHub. Comparing them is rather unfair, but there's another issue that stopped OSS adoption.

For a long time, the official Sourcegraph Docker image came with a 10-seat free license, which suited many people, so they weren't looking for alternatives like the OSS build.

I would argue that announcing a license change and the closing of your product as a small block in a changelog file, or when someone mentions the problem in GitHub Issues, is not adequate for such a change.

Not using open-first principles, restricting the product with enterprise-only plugins (which others mentioned under this post), not providing open source builds, and changing the license without a preceding announcement, while previously using open source terminology for some feel-good free marketing, leaves a bitter taste. Especially with so many companies doing this right now due to interest rates.

[1]: https://github.com/sourcegraph/sourcegraph/issues/43231
[2]: https://github.com/sourcegraph/sourcegraph/issues/43203
[3]: https://github.com/sourcegraph/sourcegraph/issues/6790
[4]: https://github.com/sourcegraph/sourcegraph/issues/6783


Exactly my thoughts. I am using the Homebrew version of Sourcegraph, which I presume to be quite dead [1]. I do this because there is no packaged version of the Sourcegraph OSS. I would happily use the OSS version instead otherwise.

[1]: https://github.com/sourcegraph/sourcegraph/discussions/54589


Right. A license change like this being done in such a silent manner would lead me to drop usage of this product if I wasn't already avoiding it due to their dubious non-foss principles.


"We made the OSS version intentionally bad and broken, and we're now discontinuing it because people didn't use it".


This is exactly my experience with SourceGraph.


For what it's worth, I'd advocated to adopt SourceGraph at work for a long time but the open-source version being impossible to deploy essentially blocked us from ever considering it.

I don't doubt that your perception of the OSS version's lack of success is accurate, and it is definitely easier to close off the source, but at the same time the outcome of this is that one funnel into it is closing with the calculation that the effort spent to keep that funnel open wasn't worth the people coming in through it.

The other possibility, and one that I subscribe to (not that it isn't self-serving) is that the funnel was never open enough to see success in the first place.


> charge companies while making tools for individual devs open source

Stop using the term Open Source. It's not open source if you apply restrictions like this, it's pretty easy to see that you're being disingenuous. These licenses are not OSI approved.


I am referring to Apache 2 here. We license a subset of stuff (a smaller subset after this change, but still all of client/cody*) as Apache 2.


Apache 2 doesn't require that companies are charged. It's proper open source (meets all the criteria of the Open Source Definition).


Yeah, and we aren’t charging companies for our Apache 2 code. We have some open source (Apache 2) code, and some non-open-source code.


Apache 2 also doesn't forbid companies being charged.


I’m a fan of OSI, and their definition of “open source” is widely recognized, but still, language policing usually turns out to be incorrect. OSI didn’t invent the term open source, and like it or not, other definitions do exist that don’t meet OSI’s standards. The term doesn’t belong to any one organization. I don’t understand the confrontational stance either, with code being offered to individuals. That far exceeds what most companies do, in terms of serving the open source community, no?


The open source concept existed before OSI; it originated from the Debian Free Software Guidelines. OSI only standardised it. To claim something else is disingenuous, and at this point it's proven to be a robust definition. FSF has their own, older and not too dissimilar.

> other definitions do exist that don’t meet OSI’s standards. The term doesn’t belong to any one organization.

And all attempts to redefine what open source means fail, because both FOSS camps, OSI and FSF/GNU, as well as everyone in between see it for what it is: attempts to muddy the water in order to deceive users and customers. Your software is either FOSS or it's not; there is no scale here, it's a social contract. Just because there are a bunch of ambiguous licenses that no one knows what to make of doesn't mean there are two or more concepts. There's only one concept for both "open source" and "free software" and the difference is philosophical.


Nobody tried to redefine “open source” here in this thread. @xenago jumped to an incorrect conclusion.

> There’s only one concept for both “open source” and “free software”

Well neither OSI nor FSF agree with this statement. You’re on your own there.


The term doesn't belong to the OSI. But the basic tenets of the OSI definition are very important to devs. And source being available but not able to be legally reused makes it useless for the vast majority of FOSS projects.


I agree. I’m not arguing against the value or the stature of OSI’s definition at all, I’m only reacting to the demand to never say the words “open source” unless you mean OSI’s definition. The Robustness Principle applies to language; be conservative in what you say, and liberal in what you hear. It’s fine to point out when a license or particular software isn’t OSI-approved open source. It’s fine to ask if people mean OSI when they say “open source” without a qualifier. It’s fine to add a qualifier too.


There is a difference between strictly following the OSI definition and the general idea of "open source." For instance, while "open source" and "free software" are effectively interchangeable category definitions, there are some minor technical reasons why the FSF definition allows a few licenses the OSI definition doesn't and vice versa. If we were talking about "oh, this is accepted by the FSF as free software, but not by the OSI as open source," then OK, sure. And you could go on with Debian and Fedora approvals and so on.

But we're not talking about technicalities here. We're talking about the very basic idea of what it means to be "open source." I'm not telling anyone that they can't use words however they want. They can. But the way they're using the term "open source" is just fundamentally incompatible with how the vast majority of people in the field use it. So it's at the very least going to cause some confusion to use the term "open source" in the way they are.

Besides, I think people have a bigger problem with the licensing change itself than any wordsmithery.


> I’m not telling anyone that they can’t use words however they want. They can.

It sounds like we are in full agreement, and you’re with me that @xenago’s demand to not use the phrase might be overstepping a little bit, no? Isn’t this confusion easier to clear up with a single short question than with assumptions or demands?

There is a slight problem with claiming using “open source” is confusing to the people who know about OSI. To the lay person who’s not a software developer, “open source” does mean ‘source is visible’, and “free software” does mean ‘software that costs no money’ (and these definitions are included in dictionaries and Wikipedia, next to the OSI and FSF versions). The OSI and FSF definitions are terms of art that these orgs are trying to establish and control, and they deviate from what the literal words alone imply, both in meaning and level of specificity, therefore they will always be confusing to people who are neither developers nor lawyers. Wouldn’t it be better if FSF and OSI relied not on co-opting everyday words, but on having phrases that are more obviously terms of art and more obviously attached to the orgs? Even something as simple as “OSI Open Source” or “FSF Free Software” would go a long way. OSI does on its site use “OSD - Open Source Definition” quite a bit.


> To the layperson who’s not a software developer, “open source” does mean ‘source is visible’, and “free software” does mean ‘software that costs no money’

Source available is the correct term. Laymen who don't know any better often call shareware open source as well; they don't know how the software is made and don't care. That is not a good reason to use incorrect words.


Serious question: what, exactly, is “correct”, and who says? BTW I’ve never heard “source available” used outside of discussions about OSI and FSF.


At the end of the day, what matters is how it stands in court.


No, we don't agree at all.

The whole point is that they are trying to use the words "open source" to appeal specifically to people, like those here in this thread, who work on free and open source software in the FSF/OSI sense.

This is a message for the exact group of people who use the term of art because they're practitioners of that art.


> No, we don’t agree at all.

Oh okay. I was trying to tell you I agreed with what you said, but if you insist: Okay, fine then I disagree with you. Naw, I still agree with what you said. I don’t know what you’re disagreeing with at all, you haven’t made that clear. Consider the possibility that we might be, as the phrase goes, agreeing violently.

> they are trying to use the words “open source” to appeal specifically to people

Who is “they”? The top comment was referring to Apache 2.0, which is an OSI approved license. So, what, exactly, are you thrashing against here?

> This is a message for the exact group of people who use the term of art because they’re practitioners of that art.

Exactly! Sorry for saying this again, but I agree with that sentence. My point, which does not disagree with what you just said, is that because they are the practitioners, the people who know and use the term of art, they know better and should be the least likely to be confused when someone doesn’t use their term of art, and the most likely to be able to handle the disambiguation gracefully like adults without getting upset or whinging that their term of art wasn’t used. They should be the people who best understand that their term of art has a special and overloaded meaning next to the literal words and some common non-term-of-art usage of those words.

This is all academic, the top comment was using the term of art in all of its overloaded glory. All I’m saying is that @xenago’s response is a bit inappropriate no matter what, regardless of whether it was the term-of-art usage or the lay-person usage, even if it was intentionally misleading (which it wasn’t). What you said appeared to agree with that, because you said “I’m not telling anyone that they can’t use words however they want”. So it does in fact continue to sound like you and I are agreeing on this point, among others.


> I’m a fan of OSI, and their definition of “open source” is widely recognized, but still, language policing usually turns out to be incorrect.

I'm no fan of the OSI, but language policing usually turns out to be correct. There are other definitions of open source - the free software definition, the DFSG - but in practice they're all close enough that the differences don't matter.

> OSI didn’t invent the term open source, and like it or not, other definitions do exist that don’t meet OSI’s standards.

Very few. I don't think there's any nontrivial software that doesn't meet the OSI's standards but is recognised as open-source by anyone who isn't a) a paid shill or b) a self-important provocateur who wrote it.

> I don’t understand the confrontational stance either, with code being offered to individuals. That far exceeds what most companies do, in terms of serving the open source community, no?

Companies are trying to salami-slice away the rights users expect from open-source software. The response to that has to be a firm line in the sand.


Nobody died and made OSI king of open source.


open source simply means you can read the source. It doesn't have to be free.


The usual term for that is 'source available.'

'Open source' generally gets read to mean 'available under an OSI or OSI-like license', and 'free software' as 'available under the GPL or a GPL-like license' IME - if the license is commercial only but the code's still freely -readable- albeit not freely -runnable- then 'source available' is clearer.

(note that sqs clarifies elsewhere in the thread that 'open source' was being used to refer to the stuff that's (still) Apache 2 licensed, not the source available parts)


Again.

Free software is largely coterminous with open source. Even Stallman uses the term 'copylefted software' to refer to software under a so-called copyleft license -- one which, like the GPL, forbids license changes upon redistribution. Stallman will tell you that he'd rather your free software be copylefted, but it doesn't have to be.

You're right about source-available though. It's what we used to call 'shared source', after a failed attempt by Microsoft to promote that as an alternative to the viral, cancer-like open source.


You seem to be confused about the common meaning of open source. Open Source only means the source code is available. It doesn't mean its license or TOS is also open. That's why the acronyms FOSS and FLOSS exist as well.


Open Source has a very clear meaning in the context of IT, understood by developers and lawyers, judges and policy makers alike: it means that the code is distributed with a license approved by the Open Source Initiative.


Open Source Initiative is not an authority here. We had this debate many many times. You may not agree with this point of view, but let's not make it as if yours is the universal one.


> Open Source Initiative is not an authority here. We had this debate many many times. You may not agree with this point of view, but let's not make it as if yours is the universal one.

It practically is, and they've done a good job of gathering the relevant citations to make that point.

https://opensource.org/authority/

If you're passionately against this, feel free to make the relevant edit here as well: https://en.wikipedia.org/wiki/Open-source_software#Definitio... (but you may have to have a litany of citations to justify why the OSI definition is not a de facto standard if you want it to stick).


I’ve heard this line before and it confuses me. What is the authority?

It seems to me that it’s OSI (around for years, reputable, etc) vs some for profit companies misusing common terminology in the, I think false, sense that people think non-open things they call open are good. Not sure if they are deluded or just wrong.

Happy to talk about some new authority for open source licenses, but it seems like the “OSI isn’t an authority, nobody is” is an argument by 8th graders who just read the Wikipedia article on communism.

OSI formed to help open source developers and users better distinguish “proper” licenses from bullshit.


But "Open Source" both as a term and idea pre-dates the OSI's formation. The general definition of "Open Source" shouldn't be universally defined by a single body.

The OSI has done a great job at introducing a legally ratified and globally recognized license format to help reduce uncertainty, but it is not the global authority on the definition of Open Source, and has on several occasions been denied that status[0]. They have a trademark and are the authority for "Open Source Initiative Approved License" (i.e. "OSI License") specifically.

[0]: https://opensource.org/pressreleases/certified-open-source.p...


Of course it predates the formation of OSI. OSI didn’t invent the term, it’s just a group of people who formed to formalize and help adoption.

It’s not like there’s some competing definition. OSI has been around for 20+ years and only recently did a few companies decide they want a different definition so they can make more money.

But the issue isn’t that there’s some word police. The issue is that open source has a definition in use and when people try to overload it, it gets confusing. I wish people wouldn’t do that, but it’s a free country (free as in speech, not free as in beer).

No one cares if source is “open” in that people can view it. In that case Windows is “open.” The important part of open is the ability to change, reuse, and participate.

Why would anyone care if source is visible but not usable? I’ve been able to decompile forever. I can see the source if I need to. The community and reuse aspect is important.

Finally, OSI doesn’t define the term. They just certify licenses that adhere to open source principles and ideas. The community defines the term. Everyone is free to make up new licenses. OSI just helps the community filter out noise by reviewing licenses that actually are open source.


> Finally, OSI doesn’t define the term. They just certify licenses that adhere to open source principles and ideas. The community defines the term. Everyone is free to make up new licenses. OSI just helps the community filter out noise by reviewing licenses that actually are open source.

This is exactly my point, the community defines the term. The OSI definition does a good job of making the legal aspect of (their vision for) "open source" explicit, but it also adds additional definitions beyond what the average layperson might consider "Open Source".

Take sections 5 and 6 of the "Open Source Definition"[0]. They state you can't discriminate against "persons, groups, fields, or endeavors". So if I wrote some software, put a MIT license on it, with a single additional clause that says the CIA can not use this software. Magically, it is no longer "Open Source" according to the OSI, even though 99.9999% of people can freely use it under the MIT.

[0]: https://opensource.org/osd

They have a definition, they even have a pretty good definition, but the OSI shouldn't be the definition. All OSI licenses are open source licenses, but not all open source licenses are OSI licenses. (All thumbs are fingers, not all fingers are thumbs).


> So if I wrote some software, put a MIT license on it, with a single additional clause that says the CIA can not use this software.

Right, because it’s not open source and not MIT. Open source isn’t about 99.999% of people being able to use it, it’s about being free and open.

This is the commonly accepted definition of open source and there’s very few who would consider your custom license open source.

Practically speaking, it means I can’t use it even though I’m not in the CIA because I want my project to be compatible and reusable down the line by anyone. So I use a true OSI license like MIT and want all the software I link to and use compatible so users have a clear expectation.

You can make your new license, but I don’t want to use it as I only want to use open source licenses.

I don’t want to hire an attorney to review your license and see if it works or not. I want to just filter by known licenses and make sure they are compatible with my other licenses.

There are open source licenses that aren’t OSI, but they are pretty few and OSI works diligently to review new licenses and add them.

Your example license isn’t open source though, so it doesn’t fit this small group.


"This is the commonly accepted definition of open source and there’s very few who would consider your custom license open source."

You might think so but I disagree.


Yes, it’s a free country. That’s why we have consensus.

1000 people misusing the term means the term changes. 1 person misusing the term is just a jerk.

I’m always suspicious when people overloading a term are all doing it for personal benefit (ie, promoting their product).

I think consensus is around OSI now, but it could change. I might not be right in the future, but I think I’m right now.


"Source" also was a term and idea predating the first ever computer source code. Definitions can change.


This is a somewhat pedantic and useless comment. Of course definitions can change, but the notion of "open source" hasn't changed in any significant and notable fashion since it was first coined in the late 90s. Feel free to disprove me.


Exactly right!


It’s not about the OSI. They just happen to have a convenient list that everyone agrees on.


Open source doesn’t mean active authority. Just because you have the source code doesn’t mean you have some level of control over the repo where the source code is hosted; having the Go source code doesn’t mean you have the right to just commit into the Golang repo without checks and decisions.


> just commit into the Golang repo without checks and decisions

Oh, but you can, just do it in your fork. You can't claim their trademark but that's a completely different thing.


I am pretty sure that for most developers open source = source code available, and nothing else.

Anecdotally, I have been a professional dev for 10 years and this is the first time I've heard your definition of open source.


That's probably because you've only been a professional developer for 10 years. Here's a quick history lesson:

If we go back, say, 25 years, when the term Open Source entered common usage, it was a way of describing the things that had thus far been labeled "free software", but as a way of deemphasizing the notion of the Free Software movement that saw non-free-software as immoral. It was a term to describe things that met the Free Software Definition, but without harping on morality.

It was very much a counter-culture (it was, after all, the Free Software movement and the Open Source movement), and very much not a generic term for having access to the source. That was already super common in enterprise agreements, and nobody considered that to be open source.

Then around the early 2000s, Linux became hot shit, and some large companies wanted to avail themselves of the rising tide and began labeling their watered down versions of "source available" things as open source in an attempt to jump on the bandwagon. But that was an intentional attempt to water down the definition everyone already understood for marketing reasons.

You not knowing this history means that to some extent the marketing worked. But just realize that in arguing here, you're participating in the astroturfing. Also, get off my lawn!


Love it.

But I’m also in the camp that didn’t know this history and had a softer definition of “open source”. That’s just how language changes, shrug.


I'm also often an advocate of what you're saying there, "language changes..."

But this I think is one of those cases where there is a difference, because it's also descriptive of a community, and it matters how that community sees itself. With whatever definition of open source you have, the most high traction stuff that we all rely on (I originally wrote, "most", but I think e.g. the Linux kernel matters more than a random abandoned repo on GitHub) is produced by people that use that older definition of open source, and mostly by people who identify with that social movement. (I for a long time was one of those people.) In this case I do believe that, since all of us now rely on open source software, redefining it in opposition to the group of people who produce it is less than respectful.


Anecdotally, I've also been a professional dev for over 10 years, and have been involved with open source projects longer than that. And in my experience, "open source" almost always means you are free to modify and redistribute from the source (possibly with a requirement that you also release the code for your changes, in the case of the GPL). The exceptions are mostly companies that want to claim they are open source for marketing, without actually following the spirit of open source.


> for most developers open source = source code available

I'm pretty sure it's not, unless by developer you mean land developers and not software devs


What's an example of "open source" software you've used that matches that description?


You're referring to source available licenses. These range from the nearly open like Mongo or ElasticSearch to the almost totally closed like Windows or Solaris.

They're not inherently bad licenses but they aren't open source.


“Open Source” is a proper noun defined by OSI, not to be confused with the general phrase “open source” which predates OSI. You use both interchangeably in your reply, which is invalid (apples/oranges). Please be more specific: do you mean the OSI definition or the pre-OSI colloquialism?


I've heard the term "source available" for that. Nowadays they also have "open core" which means you can't do shit with the source you do get. But that's another thing entirely.


> Open Source only means the source code is available

No, that's "source available." If there is no open source license, then it is not open source.


The readers may not like it, but you're not actually wrong. Go ask 100 developers if BSL or ELv2 are open source and the majority of them will say that they are -- because the source code is available, and for the majority of users, these licenses are less restrictive than AGPL. Not Open Source (tm*), but open source -- FOSS vs OSS.

(I do understand that Sourcegraph is now using a proprietary enterprise license, so this comment is directed at the OP mentioning "OSI approved" licenses, not at Sourcegraph's new license.)

* it ain't trademarked.


That's why there's a separate term for what you are describing, https://en.wikipedia.org/wiki/Source-available_software


I'm on mobile and with family today so I can't respond in-depth (happy independence day!), but have you ever wondered why the term "source-available" has changed meaning, yet the term "open-source" is not 'allowed' to? (And I'd argue it already has, much to the OSI's dismay.)

The term source-available has been shoehorned to mean everything-not-OSI-approved, instead of what it used to mean: a proprietary license for a project that has its source available (e.g. Sourcegraph's license).

In reality, "open source is a broad software license that makes source code available to the general public with relaxed or non-existent restrictions on the use and modification of the code." Which is the definition the majority of developers would say is open source.

The ELv2 and most uses of BSL fall under the "relaxed restrictions" on use and modification, similar to GPL. I'd argue they are open source.


Why doesn't Elasticsearch B.V. refer to their software as "open source", then? They refer to it as "free and open" in their marketing, so obviously they think the idea is appealing to their customers. They refer to Logstash as "open source" and specifically to the fully Apache 2.0-licensed version as the "-oss" build.

Users of the BSL refer to it as a "source-available" license that, after a period of time, converts into an "open source" license:

https://www.couchbase.com/blog/couchbase-adopts-bsl-license/


It's literally just word-play, dancing around what they actually mean. I'd assume because of pressure from OSI and friends (i.e. bad publicity), not from the real world. The same reason my company is "open, source-available."


No, it's not just word-play. There are real restrictions that come with source-available code if you want to use it in your business operations. One of the reasons companies express so much interest in open source is to have no obligations to the vendor (support contracts that cannot be terminated early even if the software is being removed, licensing that makes it hard to migrate from a few large machines to many smaller containers/VMs, other forms of lock-in) and to take on the risk of open-source software going unmaintained (which they usually plan to mitigate by hiring an outsourcing firm to fix the abandoned OSS projects).

Now, that is one of the main reasons why non-OSS licenses are being adopted: many companies prepare contingencies in case an OSS project dies, instead of making some arrangements to help a bit to ensure the project doesn't die. However right the vendors are, the resulting license significantly (materially, non-word-playfully) restricts the users, which is exactly why those users are compelled to start paying for the product.

P.S. None of what I wrote means I oppose those licenses, as they may be needed to ensure a healthy ecosystem. But I oppose calling them OSS.


If the common understanding of "open source" differed so much from the OSI standard, wouldn't these companies just say "open source" and dismiss the OSI definition as archaic, too narrow, etc.? Instead, it seems like there's been a lot of work put into maintaining a formal distinction, including working with Bruce Perens to revise the BSL:

https://itsfoss.com/making-the-business-source-license-open-...

Note that I'm not coming from a position of hostility re: the BSL and similar licenses here. I think companies using that approach can be friendly neighbors with the open source community. I just think it's important for those neighbors to share a well-maintained fence to keep malicious actors from exploiting ambiguity.


> In reality, "open source is a broad software license that makes source code available to the general public with relaxed or non-existent restrictions on the use and modification of the code."

Not to the general public, only to the users. It doesn't have to be generally available to the public; see the recent RHEL PR debacle. It's not about the restrictions, it's about what you are allowed to do with the code.

> ... Which is the definition the majority of developers would say is open source.

Can you prove your claim? Has anyone done demographics on this?


All my evidence is anecdotal, based on talking to my users. But that'd be a great topic for a developer survey.

Also, I'm quoting Perens' definition of open source.


Before this moment, I had referred to it as “transparent source” — you can look, but cannot touch.


But that's not at all what the ELv2, SSPL, or BSL licenses say. In many respects, they're more permissive than the GPL.


I am one of the few people who used the open source version and really liked it, and I'm disappointed by the changes.

The challenge I had with Sourcegraph is that it's out of reach of developers working on personal projects. There isn't a hosted plan, and for my projects I can't easily open source them due to my employer.

I was really excited when the Sourcegraph App was released, since it allowed me to give Sourcegraph a try on my project without going through the complex self-hosted setup. I went as far as getting scip-clang working with my Bazel-based project, and then tried out the docker-compose setup on my home lab.

Now that code search was removed from the app, and this change, I'm concerned that I won't be able to use Sourcegraph for my personal projects in the future.

This is a missed opportunity. I think individual developers using products for personal projects are powerful advocates, since those developers may convince their employer to purchase the product. If I could I'd gladly pay, but I'm just one person and can't justify $5k/year.


Hi there, Sourcegraph CTO here. Code search remains free for individual devs, and I hope you'll continue using us for your projects! https://about.sourcegraph.com/code-search/pricing

We have lots to reflect on given the feedback here on HN. We were honestly a bit blindsided by the number of people who appear to be using open source Sourcegraph, or who really wanted to use it but found the process too difficult. Part of this is because we had a zero telemetry policy for the open source distribution. Perhaps that was a mistake in hindsight, but introducing telemetry there would've been another can of worms!

Now that the usage is more visible, it's actually kicked off a lively internal discussion. We're going to take some time to gauge the size of the user community and figure out how we can best support it. Aside from individual use being free (still the case) and making deployment more straightforward (through something like code search in App), are there other things we can do to make it easier to adopt? Sorry about the confusion here; we should have handled this better. But the silver lining is we realized there were a lot of users of Sourcegraph that we didn't know about, and we're now discussing how best to engage and support you all. I do hope you'll take the chance to pop into our Discord and say hello and continue with feedback that can help us make the best decisions for our users.


Just wanted to add that some simpler option of buying the license would be sweet, it tends to be much easier to get signed up for something that doesn't require contract and we can just use company credit card. Maybe it could be available while offering something akin to previous Free Enterprise (without new, cutting edge features) license but with higher seat limits? I don't know, just spitballing.

I like paying for things that I use and bring me value, but at the moment Sourcegraph is a hard sell due to high cost compared to small company size and being based outside of Western Europe and US.


Hey there, I'm an individual dev who uses it only for personal projects. I use the docker container for on-premise, private usage only in a VM with no internet connection that I run at home. I use it to index all my personal projects and open source repos... 100 or so repos.

I really wish the enterprise stuff was available to the single user case like mine. Everything past v4.3.1 is hot garbage, as you guys really nerfed it for non-enterprise users ever since you removed "showEnterpriseHomePanels" among a few other things. I regrettably can't upgrade to anything newer because of that.

I would LOVE to have the ability to pay you guys say, $100 a year or something reasonable for just a private onsite enterprise license for one person -- myself. I continue to advocate for your products in all the enterprise contracts I work as that's what I'm most familiar with and tinker with when I work on my own personal stuff on my own free time.

I think it would be great to just have all those features for one user, and for a reasonable yearly fee for a single developer to be able to afford.

If you guys are concerned about being ripped off -- For example, some company buying this "single user enterprise license", I don't think they would if you limit it to one admin user. No company is going to risk sharing creds for a superuser account / keys to the kingdom, and if they do, they are some fly-by-night company with no sense of security that was never going to give you guys money anyway.

That's my two cents.

tl;dr: I couldn't care less if it's open source or not. I just care that enterprise is affordable to the home user and that I can still run it on premise and in a semi-airgapped network environment.


When you say Sourcegraph has a zero telemetry policy for the open source version, can you tell me where in the pipeline the events are dropped?

From my prior code inspection, I recall seeing OSS code making regular “pings” back to Sourcegraph even when telemetry options are disabled by users. Survey toasts also appear to send information to HubSpot when answered.


Also, I appreciate all the comments here and find them fair and thoughtful, including the critical stuff. You can join our Discord at https://discord.gg/rDPqBejz93 to chat more after this goes off the HN frontpage. And if anyone wants to chat with me directly to share feedback or complaints, let me know (and we can share the recording publicly if you're OK with it).


Announcing a major change like this in a PR screams disregard for your users.

Are you sure this isn’t just a way for you to crack down on license abuse?


[flagged]


It looks like langserver.org, which they maintain, is indeed serving the right content, and the repo accepts PRs. What did Sourcegraph do wrong here?


They have no relationship with LSP to begin with.


How do you know?

Quick search reveals

https://github.com/sourcegraph?q=Lsp&type=all&language=&sort...

One of their employees maintains a Zig LSP https://youtu.be/h_7o3yroYy4


There is no squatting happening. I personally merge PRs as soon as they come in.


I asked for an enterprise trial once because my company was seriously considering a purchase, despite not being able to properly evaluate the OSS version.

The response was basically "sorry, we're too busy".

Your business model is fascinating.


Great, straightforward, non-apologetic answer. Props. Refreshing compared to what we often get here nowadays, and reminiscent of how CEOs used to reply here.

(I don't like the news itself, obviously, but the delivery was good.)


Do you see any potential trademark conflicts you may run into against Google due to Codey[0]? I don't know which was announced/branded first, but I imagine a big co like Google is tough to win against even when you're in the right.

[0]: https://cloud.google.com/blog/products/ai-machine-learning/g...


We were first. No, we don’t anticipate any issues.


In case someone is wondering how to search local repositories with sourcegraph, see https://docs.sourcegraph.com/admin/external_service/src_serv... and https://docs.sourcegraph.com/admin/deploy/docker-single-cont...

  docker run --add-host=host.docker.internal:host-gateway --publish 7080:7080 --publish 127.0.0.1:3370:3370 --rm --volume ~/.sourcegraph/config:/etc/sourcegraph --volume ~/.sourcegraph/data:/var/opt/sourcegraph sourcegraph/server:5.1.2


  docker run --rm=true --publish 3434:3434 --volume $PWD:/data/repos:ro sourcegraph/src-cli:latest serve-git /data/repos


Are you planning on continuing all development in public, or will you make the code private?


If anyone's looking for an open-source search tool for grepping across repos (or even one large repo) at insane speed, I highly recommend livegrep:

https://github.com/livegrep/livegrep

Demo at https://livegrep.com/search/linux

We used it at Stripe and it was quite popular; often, searching even a single repo was faster on livegrep than with ripgrep locally.

A post reviewing it: https://www.alexdebrie.com/posts/faster-code-search-livegrep...

A post by its creator, nelhage, on its impact: https://blog.nelhage.com/post/reflections-on-performance/ and another on its architecture: https://blog.nelhage.com/2015/02/regular-expression-search-w...


https://oracle.github.io/opengrok/ is open source and very good with huge source bases, e.g. the whole of Android and the Linux kernel together. It's fast and useful.


Any guides on deploying it, preferably with ready-made Dockerfile and docker-compose.yml files? I looked into it a while ago and all I found was quite outdated.


It took me about an hour to install it in the past (no Docker, though). It's not really that difficult, and I feel it's really worthwhile once it's running.


too bad it is associated with Oracle


I don’t think it’s fair. It’s a Sun tool that Oracle inherited and never made an attempt to monetize.


And as multiple previous sequences of events have made clear, attempting to apply obnoxious monetisation strategies to a Sun-inherited project with a substantial userbase tends to mostly be a really good way to get it forked out from under you.


Goodbye Hudson, hello Jenkins!


> never made an attempt to monetize

Yet.


I'll add something I have been working on https://github.com/boyter/cs which is aimed at a smaller scale. It works fine for multiple repositories so long as they aren't too large.


livegrep is... fine. It's literally what it says it is. It's a web version of grep. livegrep is definitely not a replacement for sourcegraph, which actually understands the underlying code, and lets you follow code paths, search for references, etc.


Never found a startup on the premise that someone else's product will be inadequate forever.

The recent rewrite of github search has probably made sourcegraph irrelevant. If you may recall, original github search used almost the most horrible algorithm possible. It dropped all punctuation and spacing and just searched for identifiers. No patterns allowed, no quoting allowed. One of the only meta-arguments was filename:xyz.
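
To illustrate why an identifier-only index cannot answer punctuation-sensitive queries, here is a minimal Python sketch. It is not GitHub's actual implementation; the `tokenize` and `matches` helpers are hypothetical and only mimic the behaviour described above (drop punctuation and spacing, keep identifiers).

```python
import re

# Hypothetical helper: keep only identifier-like tokens, discarding all
# punctuation and whitespace, as the old search is described as doing.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def tokenize(text: str) -> list[str]:
    return IDENTIFIER.findall(text)

def matches(query: str, line: str) -> bool:
    """True if every identifier in the query also occurs somewhere in the line."""
    line_tokens = set(tokenize(line))
    return all(tok in line_tokens for tok in tokenize(query))

# Different queries reduce to the same set of tokens, so this kind of index
# cannot tell a pointer dereference from a plain mention of both names.
print(tokenize("foo->bar"))
print(tokenize("bar(foo)"))
print(matches("foo->bar", "bar = make(foo)"))  # a false positive
```

Under these assumptions, a query for `foo->bar` also hits any line that merely mentions both `foo` and `bar`, which is why quoting and patterns had nothing to work with.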

Now that github has improved its basic search functionality, sourcegraph might be doomed.

I used sourcegraph at Lyft which (at the time) had unlimited money to waste on software tools, and installed the open-source version at Databricks but nobody cared.


> The recent rewrite of github search has probably made sourcegraph irrelevant.

It only makes it irrelevant if all your code is hosted on Github.

I'm quite tired of Github-proprietary solutions being hailed as the "industry norm." Or vendors shipping products and integrations that only work with Github. Git is a decentralized protocol; please treat it like one.


I strongly agree with this sentiment, whether applied to github, AWS, or any other popular platform. The industry should avoid single points of failure.


(typed on Hacker News, on an Apple Mac, probably on Google Chrome)


Care to elaborate how the choice of a person’s equipment and browser relates to GitHub constituting a single point of failure for the industry as the prevalent Git hosting tool?


In my opinion (you may differ), Apple and Google are notorious for locking their users onto their proprietary platforms to milk as much money as they can, no matter at what cost.

So complaining about Github proprietary tools and at the same time supporting Apple and Google for doing the same is a bit contradictory. Should be using an open OS and browser.


Again, the original thread said “The industry should avoid single points of failure”. Neither Apple (Mac) nor Google (Chrome) represent single points of failure for the industry.


It depends on the angle you look from.

From a software eng infra standpoint, in general, ok. But we can look elsewhere.

In the mobile space, for example, it's hard to deny iOS & Android, Appstore & Google Play.

In terms of privacy, the internet is a complete failure with Chrome, Google Analytics, etc.

A large chunk of the most relevant software businesses go to the extent of mandating that developers use Apple products. I experienced this myself, when I wasn't allowed to use a PC running Linux for "security reasons". As if Apple's OS were safer than Arch. That's a big point of failure, in my opinion...


The problem is that "people who don't want to pay to outsource things" isn't a lucrative market for providers of outsourcing services.


I still regularly download gh repos just to grep them because while less bad it still somehow sucks...


Yesterday I found out about git-peek (https://github.com/Jarred-Sumner/git-peek). Instead of describing how satisfying it is to use, here is a GIF: https://imgur.com/a/cT8zAha

It uses a temp directory, and deletes it when you close your editor.


I just clone it to /tmp, that gets removed on restart.

Having it as button in browser seems cool but also horribly insecure...


The button on the browser just navigates to the URL `git-peek://https://github.com/name/repo`. How your system handles this git-peek protocol is completely up to you. While the git-peek package does offer to setup a handler for this custom git-peek protocol, I went ahead and set it up manually. Now, my system calls this bash script whenever it encounters the git-peek protocol:

  #!/usr/bin/env bash
  # Expects a single argument: git-peek://<path>
  # Example: git-peek://https://github.com/Jarred-Sumner/peek
  kitty --single-instance --detach -e zsh -c "source ~/.zshrc; git peek $1"

You can set it up to do anything you like.


What happens if you click a link to git-peek://$(cat</etc/passwd) ?


I'm not sure. What?

Is there a reasonably legitimate reason to stop using this?


That's the issue here. You need to be 100% sure the handler is entirely bug-free, or any site can redirect to that URL and exploit any bug in it.
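
One way to reduce that risk is to validate the URL and hand it to the program as a single argument, instead of splicing it into a shell command string. Below is a minimal Python sketch, assuming the handler receives the raw `git-peek://...` URL as its only input. `build_command` and the allow-list pattern are illustrative and not part of git-peek, and the sketch calls `git peek` directly rather than launching a terminal as the script above does.

```python
import re

PREFIX = "git-peek://"
# Illustrative allow-list: https GitHub repo URLs made of safe characters only.
ALLOWED = re.compile(r"https://github\.com/[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+/?")

def build_command(url: str) -> list[str]:
    """Return an argv list for `git peek`, or raise ValueError if the URL is unexpected."""
    if not url.startswith(PREFIX):
        raise ValueError("not a git-peek URL")
    target = url[len(PREFIX):]
    if not ALLOWED.fullmatch(target):
        raise ValueError(f"refusing suspicious target: {target!r}")
    # An argv list is handed to the program as-is; no shell parses it,
    # so `$(...)`, backticks, and `;` in the URL are never evaluated.
    return ["git", "peek", target]

for candidate in (
    "git-peek://https://github.com/Jarred-Sumner/git-peek",
    "git-peek://$(cat</etc/passwd)",
):
    try:
        print(build_command(candidate))
    except ValueError as err:
        print("rejected:", err)
```

A real handler would then run the result with something like `subprocess.run(build_command(sys.argv[1]), check=True)`, which executes the argv list without going through a shell.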


I tend to clone things I'm just having a quick look around in into ~/tmp - sometimes I'm intending to spelunk through the history so I don't fancy having that much data sat on a tmpfs and "running rm -rf ~/tmp/* when I notice it's getting a bit on the large size" is minimal enough effort that it's worth it for having control over when things disappear.


I’m curious to know, have you tried their dedicated search site? cs.github.com

Their default search still sucks, IMO. But the one I mentioned is comparable to Google’s internal CS.


Why do they have a good search that's not part of the main site?


I actually think cs.github.com is now the same as github.com/search. It was in beta for a while, but they recently started redirecting it.


Yep, I have a small shell script that caches a repo with only the files from the last commit and runs ripgrep on it. I'll give livegrep, which is discussed above, a go; it looks exciting.
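
For anyone wanting to try the same approach, here is a rough Python sketch of the idea (not the script mentioned above). The cache location and the `plan`/`search` helper names are assumptions made for illustration; `git clone --depth 1` and `rg` are the standard git and ripgrep invocations.

```python
import subprocess
from pathlib import Path

def plan(repo_url: str, pattern: str, cache_root: Path) -> list[list[str]]:
    """Return the commands to run: an optional shallow clone, then ripgrep."""
    # Hypothetical cache layout: one directory per repo, named after the
    # last path segment of the URL.
    name = repo_url.rstrip("/").split("/")[-1].removesuffix(".git")
    checkout = cache_root / name
    commands = []
    if not checkout.exists():
        # --depth 1 fetches only the latest commit, keeping the cache small.
        commands.append(["git", "clone", "--depth", "1", repo_url, str(checkout)])
    commands.append(["rg", pattern, str(checkout)])
    return commands

def search(repo_url: str, pattern: str,
           cache_root: Path = Path.home() / ".cache" / "repo-grep") -> None:
    for argv in plan(repo_url, pattern, cache_root):
        # rg exits with status 1 when nothing matches, so that is not fatal here.
        subprocess.run(argv, check=False)
```

Calling `search("https://github.com/livegrep/livegrep", "TODO")` would clone once and reuse the cached checkout on later searches; refreshing a stale cache is left out of this sketch.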


Yeah, even after their rewrite it’s essentially worthless.

In some ways it’s even worse now, because it seems to only build an index of the repository on the first search, when I need a result now.


I usually switch the url from github.com/whatever to github.dev/whatever

That will load a web version of VS Code, and you can then use the search from there.


It seems great across an org now. I can quickly answer the question of is anyone using something across 1000 repos.


Me too, but usually after I already have 15 GH tabs open and have wasted a bunch of time.


I use grep.app instead, for open source projects. Many projects are indexed.


I don’t think GitHub search will replace Sourcegraph.

1. GitHub isn’t free, especially for large private organisations.

2. Sourcegraph has much better search functions compared to GitHub.


It’s free for open source.

For large private organizations they are paying already so more likely to use built in search than buying a new product.


Is there still an "enterprise" niche that doesn't use GitHub's cloud version? (Or GitLab, etc.)

My understanding is that GitHub's on premises version doesn't have any plans to include the new code search functionality.


Did anyone actually use the open version? I dimly remember that I looked into it like 2-3 years ago, but all the really interesting stuff was not included in that. The pricing for enterprise was absolutely bonkers, something like 100$ per month&developer, which already made it clear that they are obviously only targeting big players with infinite budget. Seems the pricing is now changed, and it "starts at 5k/year" for some "Enterprise Starter" edition, but despite lots of bullet points it is very unclear to me what the limitations really are. I'm actually really interested in this product and it might be a good addition to our tool set, it's a shame the pricing is so opaque.


The OSS version didn't have official prebuilt Docker images; you had to build them yourself, and for a long time the OSS build was broken. A year or two ago they promised they'd fix it or even provide an official OSS image, but this didn't happen.

One person finally created a working release train on github and was releasing OSS containers. Docker Hub reports it as 10k+ pulls, which is a lot for an unofficial image.

I tend to stay away from any 3rd party tool that's not really a main part of infrastructure and requires sales contacts. It's wasting my company time to deal with sales when it comes to a few licenses, just give me a number input and buy button. I also lost some confidence regarding Sourcegraph, as it seems they change their direction, pricing and rules multiple times per year


It is a good product, so we're trying to put our org in a position to pay for it. But I really don't like this model of intentionally frustrating or obfuscating whether it is open source or not, or not listing/hiding what is or isn't in enterprise vs OSS versions. For example, the fact that starting off with open source (assuming you succeeded in running the maze of figuring out how to run it) explicitly blocked the path to go to enterprise is a shame, and seems like a bit of a missed business opportunity.

We'll probably see more of these faux OSS projects who hoped for some community/network effect from being OSS to translate into either strong donations or considerable uptake of their enterprise/cloud/managed versions go this route.

To be clear, I fully support charging for and paying for SaaS. I would just like to be able to know what is in front of me when making build/buy decisions.


I've tried finding out if they still offer the free tier now that the OSS is gone, because one of their employees mentioned it on Github, but failed to find anything. I can see they have a free trial. Their own website is full of broken links and contradicting information due to the frequent licensing/product changes.

I've even found a page still mentioning the OSS version and telling you how easy it is to fire it up.


They shot themselves in the foot majorly on this one.

When you realize that one of your products isn’t being heavily used and is OSS, you don’t close source it, you keep it open!

This shows that the value add of the paid product is actually worth it to customers.

Now?

Not so much.


Yes, I remember the Docker thing, but I also remember that at least the language parsers we were interested in were only supported through some kind of plugin mechanism, which the OSS version did not support, so it was useless for us, so I didn't ever bother testing it.


So basically their "open source" offering wasn't really open source, it wasn't even open core, as you had to build it yourself, the build wasn't working for a long time and even if you managed that, some languages were enterprise-only.

Nice. How to get free marketing by strapping Open Source to your product and writing a bunch of announcements.


I’ve had a few consulting calls with companies where they (or their VCs) want to call their software open source but have no interest in community development and want to block competitors from taking any advantage of it.


I remember looking for Sourcegraph submissions on HN, their CEO's (sqs) first submission to HN was titled "Which Open Source License Should Your Project Use If You Want to Raise VC$?"


QED.


I installed the opensource version for fun at work, and synced about 750 repositories. I applied some patches to support OAuth2 proxy and removed the telemetry. It's great software, very fast and it works as intended.

A few months later and 70 registered users, I have a total of 3 users who used it a few times.


Maybe you could create some short demo for people?

I've done a demo and created documentation, 80% of the developers at $MY_COMPANY used Sourcegraph Code Search at least a few times in the past month, some of them are doing a dozen searches per day.


What do people use?

My company lives in code search. Most bug fixes and small enhancements start by investigating code search, and clicking a link from code search to start editing the file.


They don’t use anything for internal codebases.

However, it’s not that bad, because we work mostly on short-lived research projects and there’s not that much to reuse. Public code search and ChatGPT are used much more frequently.


Oooooops. That sounds like New Relic's $400/month per full user...


Or Gitlab pricing.

Why are companies isolating themselves from smaller clients?


The marginal cost per customer is not that different whether it's a small customer or a massive one, so it's much more interesting for you to go after the big ones that will generate much more revenue. You miss out on revenue from smaller customers, but your margins are higher (and it's totally possible that the revenue from smaller customers isn't financially directly worth it, if the support costs around them are too much).

Of course that skips the bigger picture like more people knowing and using your products and being happy with them leads to more champions for them doing your sales and marketing work for you, which can create lots of revenue in the long term; but it's pretty much impossible to quantify.


Because they figure there's a bigger opportunity cost in the possibility of the big clients using cheaper tiers than in the possible lost customers who won't go for tiers at the current prices if an alternative is available.


You could always add seat limits at lower tiers, truth is that almost no company in my region is paying for Gitlab, but there are plenty that are paying for GitHub. There's also the problem with big customers, you can see it in Gitlab's case, namely they demand a lot of stuff and they have the leverage. I worked at a company like this, we couldn't stop focusing on niche custom dev for Mr Big Buck while our core offering was neglected and we stagnated.

It's really healthy for a company to have a whole spectrum of clients, from people that chip in 2-4$ per month for a minor bump in capabilities and without any support included to big companies that push for exploring new features.

In my opinion, Gitlab is a primary example of feature-creep and it will be the demise of the company.


>In my opinion, Gitlab is a primary example of feature-creep and it will be the demise of the company.

Gitlab was VC funded, which means it had little choice but to take a high risk/high reward approach. It worked fairly well, since it IPOed at $12b while its last round of funding was at under $3b. Good return for VCs. Of course the stock is down over 50% since the IPO, but that's someone else's problem.


In lots of VC funded companies I always miss basic pricing for users that don't need much but don't want to use a free tier that will probably disappear in a year or two when the VC gets nervous.


$400?! Didn’t it use to be something like $20?

No wonder my enterprise is automatically kicking people off when they don’t use it.


And that's not even the enterprise version!

In every tier there are $49/month "core" users (basic dashboard viewing, no editing, no APM, no monitoring, no tracing...)

In the "Standard" tier you have $99/month "full" users, but you're limited to 5 total users, no SSO, and AFAIK very basic support.

In the "Pro" tier you have $418.80/month "full" users

In the "Enterprise" tier you have $658.80/month "full" users

No wonder people are paying for core users and then juggling generic full users. Although they limit you to 3 concurrent sessions, and monitor that, and will contact you if you log in from different IPs and browsers as the same user.
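
To put the list prices quoted above on an annual footing, here is a small Python sketch. The monthly per-user prices are the ones listed in this comment; the 10-person team and the 3 full / 7 core split are purely hypothetical, chosen only to show why juggling shared full users is tempting.

```python
from decimal import Decimal

# Monthly list prices per user as quoted in the comment above (USD).
MONTHLY = {
    "core": Decimal("49"),
    "standard_full": Decimal("99"),
    "pro_full": Decimal("418.80"),
    "enterprise_full": Decimal("658.80"),
}

def annual(kind: str, users: int = 1) -> Decimal:
    """Yearly cost for `users` seats of the given kind."""
    return MONTHLY[kind] * 12 * users

# Hypothetical 10-person team on the Pro tier.
everyone_full = annual("pro_full", 10)
shared_full = annual("pro_full", 3) + annual("core", 7)

print("one Pro full user per year:", annual("pro_full"))
print("10 full users per year:    ", everyone_full)
print("3 full + 7 core per year:  ", shared_full)
```

`Decimal` is used instead of floats so the cents come out exactly.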

The most interesting part for me is Pixie... but you can do that for free.


It also looks like they're now pivoting hard into AI bullshit, and barely want to acknowledge that the reading product even exists? Absolutely bizarre.


But it will be open source this time, pinky swear


I don’t think it’s fair to say that you need an “infinite budget” to pay $100 per month. I pay more for several products and have quite a limited budget. But tools that make me more productive or help my business are definitely worth paying for.


100 USD per developer per month is 1/2 of our whole infrastructure cost at my company. You could get GitHub enterprise seats for 1/4 of that. Dealing with sales, contracts and licenses also has some non-zero opportunity costs.


For which products do you pay more than 100$ in license cost per month and developer? I only know prices like these for fairly specific niches, like development environments for very specific hardware/FPGA, or sophisticated physics simulation software. But this is a code search&navigation tool - a nice one, I'll admit, but still. That's 4x the price for GitLab Premium. Almost 2x of the whole Adobe Creative Suite.


There's life outside the United States, and $100/month is a decent chunk of a junior's salary in many parts of that. Two places I worked at migrated off Microsoft products to FOSS solutions for that reason alone (I was fortunate enough to initiate and complete this process for one of them) — something that has zero business sense according to most HN commenters.


Does it make you and all your peers $100 more productive each? Sounds like a bit of a stretch to me for code search. I don't even think that, in any given month, I need to do a code search that bitbucket/GitHub/gitlab search can't handle. To be honest, I don't think I'd be less productive even without those. They are helpful, though, when discussing code that you don't have in your IDE at the moment with other people.


License was changed almost 3 weeks ago, 5.1.0 release blog post skips this information. There's still no official announcement.

It seems like the author of Sourcegraph OSS containers announced that his release train is now dead

https://github.com/jensim/sourcegraph-release-train/


Their support in the demo period sucked, their complex C++ support was lacking, they didn't integrate into modern C++ build systems well, and their prices were insane.

They kept trying to push this "campaign" feature on us, which is an overly-complex auto-refactoring tool that couldn't even support our non-proprietary, well-known build system. For the cost of their license, we instead hired two developers for code refactors, who then went on to make other tooling, and we didn't need to hire someone to babysit their crappy service integration.

I would not say that they had found their niche when speaking to us. Perhaps it has gotten better.


Based on your use of "campaign" (the older name for Batch Changes), it sounds like you were looking into Sourcegraph about 2.5 years ago or before that. Lots has changed since then.

We recently released a new indexer scip-clang (https://about.sourcegraph.com/blog/announcing-scip-clang), which we've used to successfully index large codebases like Chromium. The indexer relies on a JSON compilation database (same as our older indexer lsif-clang) which is easy to produce from CMake, Bazel, Meson, Make etc.

We've also added support for cross-repo code navigation for C++ recently. (https://about.sourcegraph.com/blog/c-cpp-cross-repo)


Typesanitizer, I just want to note that I'm not downvoting you. I found nothing objectionable about your comment.

I'm unsure what the expectation is here. I'm allowed to say we had a poor experience, you should be allowed to say how you think you're addressing it.


2 devs for the price of the license?! How expensive are we talking about here?


We hired 2 junior developers for maybe 20K more total, in total compensation, than their original quote.

But your comment (and the child comment) made me realize that I don't remember the terms of the proposal, whether it was per year or for 2 years. So it might have been 1:1, not 1:2.

In the end, their product was just completely insufficient for our needs, and it was clearly just gluing open source tools together. The part we couldn't do as well was the front end, and they clearly put a lot of effort into it. It looked and worked nicely. But that didn't help us when they couldn't parse the code to populate it.


It was 100 USD per month per seat some time ago; with a high number of devs it may actually be beneficial to roll something of your own.


If it's true, I'd love to hear how many devs their marketing people say you'd have to hire to replace the functionality.


What build system did you use? I thought the JSON compilation database is relatively well supported to generate these days (e.g. used by the language server in VS Code).


I don't want to go into specifics.

The problems we encountered were that they could not rewrite the targets in our build system for the automated refactoring, and any missing include path at all would cause the clang-based tooling they were using to barf.

We just asked ourselves why we would spend so much money on a product when we still had to solve all of the fundamental problems ourselves. We liked the UI, but it wasn't worth the insane license fee.
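
For readers hitting the same missing-include-path failures, a check like the one below can sometimes catch them before running any clang-based indexer. It is a minimal Python sketch that assumes a standard `compile_commands.json` (the Clang JSON compilation database format, with `directory` plus either `arguments` or `command` per entry). The helper names are hypothetical, and only plain `-I` flags are handled, not `-isystem` or response files.

```python
import json
import shlex
from pathlib import Path

def include_dirs(entry: dict) -> list[Path]:
    """Extract -I include directories from one compilation database entry."""
    # The format allows either an "arguments" list or a single "command" string.
    args = entry.get("arguments") or shlex.split(entry.get("command", ""))
    base = Path(entry["directory"])
    dirs = []
    for i, arg in enumerate(args):
        if arg == "-I" and i + 1 < len(args):
            dirs.append(base / args[i + 1])   # "-I dir" form
        elif arg.startswith("-I") and len(arg) > 2:
            dirs.append(base / arg[2:])       # "-Idir" form
    return dirs

def missing_include_dirs(database: list[dict]) -> set[Path]:
    """Return every referenced include directory that does not exist on disk."""
    return {d for entry in database for d in include_dirs(entry) if not d.is_dir()}

if __name__ == "__main__":
    db_path = Path("compile_commands.json")
    if db_path.exists():
        for d in sorted(missing_include_dirs(json.loads(db_path.read_text()))):
            print("missing include dir:", d)
```

Relative include paths are resolved against each entry's `directory`, which is how the compilation database format defines them.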


https://github.com/sourcegraph/sourcegraph/issues/53528#issu... appears to be a comment from someone in the project laying out why they've changed.


Of note, they mention (ordering/emphasis mine):

> We remain committed to Zoekt, the open source code search engine, and will continue to upstream changes to it.

https://github.com/sourcegraph/zoekt

> The source code will remain publicly available.

> Individual devs will still be able to use Sourcegraph for free on public code at sourcegraph.com and within our self-hosted free tier on private code.

> Very few individual devs or companies used the limited variant of code search that was open source. The vast majority (99.9%+) used the enterprise product. Maintaining two variants going forward was a big burden on our engineering team that had very little benefit for users.


A few months back they removed free enterprise license that allowed 10 dev seats, some smaller companies were holding back the updates and looking at the OSS version - I guess, not anymore


So it seems like nobody was using the OSS version, and they didn’t want to maintain two versions if nobody was using it.

It also says they offer a free self hosted version for individuals, but I couldn’t find that on their site.


First they offered a Free Enterprise tier for 10 seats, which they removed a few months back. Their OSS version lacked even such basic things as language support, building it was impossible for several months due to missing documentation and breakage in the build process, and they didn't offer sourcegraph-oss images.

At some point, one individual on Github managed to get it working and his images got 10k+ pulls on DockerHub. That's hardly "nobody". Also, some people removed telemetry from OSS version so Sourcegraph didn't even know that anyone is using it.

Also, they were open-closed-open-closed in the last 5 years.

Their website is a mess, even employees on github are providing contradicting information. Original commit message that relicensed bunch of stuff had errors in it regarding what exactly will be closed source now.


I am using the version you can install via `brew install sourcegraph`, though they seem to have abandoned it to make people install Cody (which requires an account even for local use). I will probably use the Brew version for as long as it works. The major pain point is that it seems to have a timeout for repo discovery at 5s, so you can't just clone all of your GH starred repos and search them this way.

P.S. Started a discussion regarding the Homebrew package, but pretty sure it's canned: https://github.com/sourcegraph/sourcegraph/discussions/54589


I've never in general been a fan of "open core" products.

As someone who builds things, it feels like poor craftsmanship to put obstacles in front of your users and limit the extent to which they can use your work.

It also feels like decisions to hamper how people use a product are driven purely by greed.

Let's imagine a world in which Sourcegraph were completely free software. They would probably still have enterprise customers pay them to securely host Sourcegraph on-premise. They wouldn't be able to charge per seat. They would have to make sure their product was cheap enough that their customers wouldn't save a ton of $$ by hiring engineers to maintain Sourcegraph on premise themselves.

I am curious if they (or anyone else running an open core business) have estimates for:

1. How many customers they would lose if they went fully free.

2. How much revenue they would lose if they went fully free.

Building free software and charging people to host it can be the foundation for a sustainable business, but it's unlikely to give VCs the kind of outcomes they want from a successful investment.

To be honest, I think it's fine for infrastructure to be closed/proprietary. There are good reasons to do this if you are writing programs for which security is important - releasing your infrastructure code freely gives attackers a lot of ammunition to work with.

If we believe in the power of automation and in building high quality software, it is possible to build free software that:

1. Is easy for you to deploy and maintain securely on customer infrastructure.

2. Requires very little operational overhead from you as the host (in terms of support).

3. For which the infrastructure code is proprietary.

This can lead to a very solid business.

Why don't we see more businesses like this?


https://github.com/sourcegraph/sourcegraph/commit/3cd931ef54... has some additional information, but not a lot.


One would expect an announcement regarding a license change to precede the implementation of said change :/


What is a good open-source system for code search if I want to plug 100 or so git repos into it and have it available over the web? GH search is not desirable because it would search too broadly and would not cover repos on Gitlab etc.

I looked at the Debian code search [1] in the past, but for some reason thought it required a bit too much effort and didn't complete my investigation of it. Though [2] looks pretty approachable.

Sourcegraph mentioned Zoekt [3], but I am not sure how usable it is. If it was pretty good, why did Sourcegraph OSS exist?

Finally, from all the discussion of how Sourcegraph OSS was very behind in the past few years, I guess there is no serious plan to fork it?

Edit: GCS release [4] seems to have been open-sourced without a frontend.

Edit2: Livegrep [5] and Opengrok [6] were recommended higher in the thread. Quite excited to try them out but if someone has working Docker Compose configs, I would be very thankful for the head start.

Edit3: there is also Eureka [7]. Seems less powerful but easier to deploy.

[1]: https://github.com/Debian/dcs

[2]: https://github.com/Debian/dcs/blob/main/howto/building.md

[3]: https://github.com/sourcegraph/zoekt

[4]: https://github.com/google/codesearch

[5]: https://github.com/livegrep/livegrep

[6]: https://oracle.github.io/opengrok/

[7]: https://github.com/Rajeev-K/eureka


[4] is not really a usable 'product'. Livegrep (https://github.com/livegrep/livegrep) was inspired by it and is very usable.

[3] used to be a Google open source project as well, but it fell out of maintenance, and Sourcegraph took it over. It powers most of the basic regex/literal search in Sourcegraph.

Mozilla's code is searchable in Searchfox (https://searchfox.org/) which uses the indexer from Livegrep, combined with their own Git indexer and language-specific cross reference databases.

OpenGrok (https://github.com/oracle/opengrok) is also rather well known, but I have found it to have a slightly worse UI than alternatives.


And 'cs' for smaller repos and CLI use.

[9]: https://github.com/boyter/cs


There is also Hound [8].

[8]: https://github.com/hound-search/hound


It was open first. Then closed. Then open again. So, now it's closed again...


That's how you build confidence in your company and executive decisions.

Let's see if we get the same amount of upvotes their post got when they open sourced the thing.


Yet another concrete example of why copyleft licenses are better than pushover ones and why CLAs are bad. It would have been illegal for them to do this if the old license were copyleft and they accepted contributions without a CLA.


I don't understand your comment. What do you think would have happened if it was GPL instead of Apache? That a person would come out of nowhere willing to rewrite all the SourceGraph owned code in the repo?


No. If it were GPL instead of Apache, and it had contributions from other people without a CLA, then Sourcegraph wouldn't have been legally allowed to change the license, so nobody would have had to rewrite anything.


Sourcegraph would still be able to relicense the contributions made by their employees.


Yes, but they'd have had to rip out everything from all of the external contributors, and that'd be a big enough deal that it'd probably change their mind.


The crude reality is that development tools cost real money to develop, and we are no longer in an environment where VCs shower companies with bags of money without some really down-to-earth, concrete plan for profitability. Companies like Microsoft and Google can afford the luxury of keeping projects like VSCode and Golang open source. The economics make sense for them. Not all companies can do that, especially small startups. I remember a time when buying a C compiler cost money, real money. I don’t think we are ever going back to that, but I also think that paid tools with enterprise pricing are back. I don’t care about the morality of that, and I am arguing neither for nor against it. It is just a fact, a seismic shift that we can’t really stop.


Luckily you can create the biggest of things with just a compiler and a text editor. A repl helps but is gravy.

Devs have gotten drunk on devtool sugar imo.


Yes. Tools like sourcegraph are cool, but it is not like they are going to make you X times more productive.


The marginal efficiency gain over the free ripgrep is so small


If you find yourself needing to search a large (xGB) codebase you should at least try some CLI tools first:

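  # find line: search with ripgrep, pick a match in fzf, open it in vim
  fl() {
          # rg prints path:line:text; fzf splits on ':' so {1}=path and {2}=line.
          # --ansi makes fzf parse the color codes emitted by --color=always.
          rg "$@" . --color=always --line-number --no-heading --smart-case |
                  fzf --ansi --preview-window='top:60%:+{2}+3/2' \
                          --preview='bat --style=full --color=always --highlight-line {2} {1}' \
                          --delimiter=':' -n 3.. \
                          --bind "enter:execute(vim {1} +{2})"
  }

Obviously not the same, but I often find it enough. Short demo https://imgur.com/a/hsyINjS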


Not really comparable, honestly. "Gigabytes" has nothing to do with it. Sourcegraph can e.g. index multiple repositories across multiple languages and link them together at large scale. You're doing an extremely easy case where 99% of the "meaty parts" are written in a single language (Nix) in a single repository (nixpkgs) with a very formulaic structure, where the answer you're looking for is also in that same repository. Finding like 90% of things isn't actually hard for that reason. I love Nix, I love that, but it is a fraction of the cases these tools handle.

The hard case is this simple extension of your example: I found the definition of X package in Nixpkgs. Now how do you find all the users of X, across, say, 10 other repositories? Or all of GitHub? That isn't theoretical; if you make a backwards-incompatible API change to a NixOS module, you might want to know that. So suddenly you need a lot more things in place to make this work. Now change X so that it's something like an RPC interface defined in protobufs, and then change your query to "What clients are using this interface and what servers define it", and keep in mind these can all be in different languages in different repositories. That is not so easy with Ripgrep, but tools like Kythe or SourceGraph can handle them with far, far greater ease.
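As a rough sketch of why this is harder than it looks, here is about the best that plain text search can do for the "find all users of X across other repositories" case. The `find_users` helper name and the checkout paths are hypothetical, and `grep` stands in for ripgrep only so the snippet has no extra dependencies. It finds textual mentions of a name in each local checkout, but it cannot tell a real reference from a comment or an unrelated identifier with the same name, and it only sees repositories you have already cloned.

```shell
# find_users PATTERN REPO_DIR...
# Hypothetical helper: print every literal occurrence of PATTERN in each
# checkout, as path:line:text. Purely textual, with no language awareness.
find_users() {
        pattern=$1
        shift
        for repo in "$@"; do
                # -r recurse, -I skip binary files, -n show line numbers,
                # -F treat PATTERN as a literal string, not a regex.
                # --exclude-dir=.git keeps repository history out of the search.
                grep -rInF --exclude-dir=.git -- "$pattern" "$repo"
        done
}
```

Usage would look something like `find_users 'services.foo' ~/src/repo-a ~/src/repo-b`, where both paths are placeholders for local clones.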

Also, for many cases, you actually need language-aware search and the search engine needs to understand more structure than just utf8 bytes to answer you. Ripgrep won't help you find the definition of that fucked up thing that was defined by a template instantiation that was hidden by a macro in C++ from a header that was generated at build time, that you are only looking up because it was barfed out from some huge stack trace that came from production. SourceGraph can answer that instantly with no false positives (assuming you have SCIP indexing as part of your build system.)

Yes, ripgrep is nice and I use it when writing nixpkgs patches all the time. But something like SourceGraph, Kythe, OpenGrok etc are all really a completely different class of tools.

And the "X gigabytes" fact isn't really that impressive when you realize all the weight is in the .git/ directory of Nixpkgs; ripgrep will instantly filter that out and never even search it, so it isn't actually searching a working set of that size. The actual pkgs directory is, in contrast, about 300MB. It still is crazy fast though, no doubt.


No argument, rolling out change sets is also a huge win. My point is that many people do not know about the tools they currently have at their disposal.


Fun Fact: Steve Yegge, famous blogger, is Head of Engineering there (or at least was recently)

https://about.sourcegraph.com/blog/introducing-steve-yegge



I am getting:

> This blob took too long to generate. But you can view the raw file.


It was working when I posted it, if you click view raw file it will take you to the changelog.


Their product basically got sherlocked by Microsoft when they released Github Copilot, no?


Their main product is Code Search, Cody is a new thing.


Code Search seems a limited market so I would be surprised if their plan wasn't to go after deeper code tooling in the future. Otherwise I don't see how their $2.6 billion valuation made sense. They likely thought that an iterative approach based on human encoded code understanding would allow them to build better systems. Pretty reasonable assumption 1+ years ago. Then GPT3/4 proved that you could just dump data into an AI and probably get an even better result.


But that makes no sense. If AI is the future, you'd leave Code Search open-source and make the AI assistant closed-source.


Sourcegraph has recently released Cody which is the more direct competitor to Copilot. And it's free for individual developers.


I think Cody is based on OpenAI tech. They haven’t built their own model yet (as of 8 weeks ago)


They're using Anthropic by default, but it's possible to connect it to the OpenAI API as well.


I've worked at more than one place that considered sourcegraph and decided it was too expensive (and these were software shops with money to spend on good tooling). With language servers working so well now, I think SG may have already missed the boat and this looks like an early part of their death spiral.


I remember having used a Red Hat (?) tool back in 2002 for understanding the source code of the Brazilian voting machine so we could more easily port it to Windows CE (the 2002 model ran on it initially, then on Linux from 2004-ish). It had a very Motif-like interface. Does anyone else remember its name?


> I remember having used a Red Hat (?)

SourceNav (Source Navigator) by any chance?

I used it quite a lot before completely moving to Emacs.


If they've accepted contributions from third parties they can't relicense those contributions, right?


As a general rule with open source projects it basically depends on whether they had a contributor licence agreement.

Sourcegraph appear to have had one:

https://github.com/sourcegraph/sourcegraph/blob/main/CONTRIB...

https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-...

It includes a grant of copyright licence so I guess this is nailed down.


And that's why I believe that requesting a mandatory CLA should drive contributions down in OSS projects. Otherwise, you're basically providing free labor without the counterpart's promise of keeping the code free, which is (or should be) the reason one is making their contributions to start with.


Except that external contributions to a full product are usually minuscule (mostly typos) compared to the codebase.


I also don't believe they receive many non-employee contributions (probably because the software is more advanced than most open source projects). You can get a breakdown of their contributions at:

https://imgur.com/a/UCJ6y9r

The interesting one is the second image, which filters by code contributors that have contributed more than 500 lines of code churn.

Note, I'm not taking into consideration if code contributions were merged or not, just that somebody created a pull request/commit.


Not really, they have received CLAs so not legally: https://github.com/sourcegraph/sourcegraph/pulls


I fundamentally believe that open-source is a model that doesn't lend itself to sustainability. We all know that lots of companies that can afford to pay for software don't because it's free and open-source. I fundamentally believe that the decision to give away software results in less revenue and therefore less money for developers to work on the product and make it better.

In the ecommerce world there is Magento, Shopware, and Spryker. Magento is open-source and massive. Shopware is open-source and very good and reliable. And then there is Spryker, a closed-source platform which is very good from a product point of view. I feel that if you look at the product development of those three, Spryker is far ahead of the open-source options.

I think we all like the benefits of grabbing a library/tool and using it for free for personal use, or when we're starting out and can't really afford to pay, so we want things to be open-source. But I really think the future is source-available: the ability to use something for free, combined with a commercial license requirement once users have the money. This is why I chose to go with the Business Source License [1], which allows for additional grants as well as free use for non-production purposes. I added an additional grant for users generating very little revenue. As a small independent developer who is developing a billing system, it seems fair that people generating revenue start to pay for a license, while those who are just starting out can use it for free and pay when they start making money. As I grow I'll be increasing the amount the additional grant allows. For me this seems the best of both worlds: those who can't afford it can use it for free, and those who can pay help fund future development.

We keep seeing companies going from open-source to a source-available approach for a reason.

[1] - https://github.com/billabear/billabear/blob/main/LICENSE


Yeah Linux will never last against Novell.


So you have one. One out of how many hundreds of thousands?

"But Linux", in a discussion about open source not appearing to be a sustainable model, is like using Zuckerberg as an example for dropping out of college. Lots of people drop out of college and never achieve anything. There are always going to be rare examples that can achieve something.

And the fact is that the open source community only appears to have one example of how you can build a profitable large business off the back of open source: Red Hat (who are currently closing things down to remove competitors). It really kind of hits home that it's a very bare existence for open source projects.


I didn’t say anything about profit.

But you’re right, no one uses Apache, nginx, postfix, bind, dhcpd, ntpd, spamassassin, kubernetes, go, python, c, busybox, irc, email, sip, php, react, … and they’ll never last.


> I didn’t say anything about profit.

The entire premise of my comment was about being sustainable. That requires making a profit.

> But you’re right, no one uses Apache, nginx, postfix, bind, dhcpd, ntpd, spamassassin, kubernetes, go, python, c, busybox, irc, email, sip, php, react, … and they’ll never last.

Again, missing the point. And wasn't DHCPD the one that had a serious issue because one guy was maintaining it and he had terminal cancer, so part of the things he had to do when he was dying was find someone to make sure a critical part of the world's tech infrastructure was maintained? That's not sustainable.

I also like how you listed protocols.


Please show how Linus has made a profit over the 40 plus years. Same with Apache. Or alternatively, grow up and realize that profit is not required for open source to persist. Your business model is no one's problem but your own. The world will continue on even if you need a real job.


Haha, as I said. One example.

I never said that open source couldn’t exist; I said it’s not sustainable for a business.

I get it, you like open source. But as shown, open source businesses are generally not sustainable.

Sure, you have a tiny percentage that can make a living from open source. And of that tiny percentage, they are employed by businesses that operate as closed source.

I think really, you’re the one who needs to grow up. It’s quite pathetic to act like this.


Speaking of which... I've been looking for an open source equivalent of the old NetWare Client.

Anybody know of one? Specifically interested in the "runs scripts at login" aspect.


Hey dang, you might want to point to this comment instead: https://github.com/sourcegraph/sourcegraph/issues/53528#issu...


If a web-based code search engine is what you need here's one: https://github.com/wisercoder/eureka/


They were grifters from the very beginning, squatting langserver.org and making it seem like it was their creation, mentioning Microsoft exactly once (or was it zero times initially?)


Whew.

Got confused with SourceTrail C++ code viewer made by yet another Google intern, Eberhard Gräther of CoatiSoftware.

https://web.archive.org/web/20211115131149/https://www.sourc...


The title might be a bit misleading. Released code can't not be open source anymore, only future development.


I think most understand the idea that a project can change its license at any point, but that doesn't apply to previous versions. (In this case, any version prior to 5.1.0)


> Released code can't not be open source anymore...

It can be. You can release under a "source available" license barring it from being used (even compiled), derived, incorporated into other works, making it basically "for eyes only, or we sue you to oblivion".

Many people consider licenses as window decorations, but they are not.


Further distribution can be put under any license they want but any copy anyone received in the past that was Apache 2 licensed remains such and can be used as before.

So if anyone wants to put a past release online they are free to do so (unless there are parts of the code that were restricted before).

It would be prudent to remove branding where possible but that's mainly a precaution.


I think most people don't consider viewable source projects to be truly open source


If the code was under a restrictive source-available license, it wasn't open source...


Parent poster is saying that the already-released versions, which were released under an open source license, are still open source.


Yes, you can't retroactively change licenses, but it's also important to know that just because you can read a source file doesn't mean the file in question is bona fide open source or free software.

In my experience, many people lack this knowledge.


Solely being able to view source code without other rights absolutely doesn’t make it open source. In general if something isn’t under an open source license it isn’t.

ADDED Unless it’s public domain and then it effectively is.


We had a Windows CE dev license back in the Cambrian Era which included visibility of the source tree, but God help you if you tried to change it and make your own build.


Well licenses are window decorations when it comes to personal use.


Well, no.

Personal use doesn't free you from the obligations the GPL brings, for example.


Assuming you or a company doesn’t redistribute it, copyleft doesn't matter.


> doesn’t redistribute it

Well sure, but that's a really big caveat.


I would assume personal use doesn’t involve redistribution. Probably true for most company software for internal use as well.


Does this vary by jurisdiction maybe?


Nope. If something in a jurisdiction made the license invalid, that would not mean you can do what you want with the code; it would mean that you can't use that license.


If a jurisdiction releases you but does not prevent you from performing an obligation, licences do not typically consider that a reason for invalidation of the licence.


"no longer" suggests it was, but now isn't, which appears to match reality?


I installed and ran Sourcegraph on my laptop a year or so ago. It was cool, but I didn’t keep it around.

There is a lot of competing technology now, from GitHub’s improved search to new open source LangChain and LlamaIndex support for better document chunking of source code in several languages.


That's part of the impetus behind their Cody product. It uses the code search system as a semantic index. It actually works very well in my experience.


Whoa, first time I'm reading about Llama Index... How does it work? Where can I learn more?


You can search for LlamaIndex docs, or you can read my short book for free online https://leanpub.com/langchain/read



Gosh... Cloud9, Elastic, Sourcegraph, CockroachDB,

what is it with companies making things closed source?


What’s the use case? Finding useful libraries?


Global code search across hundreds of repos even if they are hosted at different SCMs


WizardCoder is free


It's not a code search tool. This discussion is not about Cody but the Sourcegraph search.



