[flagged] We compete with GitHub. Bing does not show our website (codeium.com)
184 points by aunch on Feb 9, 2023 | hide | past | favorite | 91 comments



This is pure clickbait and really doesn't leave a good impression of this team. Writing a whole article implying they're not being indexed because they are a GitHub Copilot competitor, when they themselves don't believe it, is farcical.

Title:

  We compete with GitHub. Bing does not show our website
FTA:

  Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest so that someone out there on the Internet can tell us what we're doing wrong. [...] we only care about getting as many software engineers to experience the power of generative AI for software development via our free product.


If you came from a nation that never gets snow, practiced with the bobsled in a simulated environment, then went to the Olympics and participated, did you compete?

Competing doesn't mean top of field. It might just mean making enough to eke out a living.


I understand the sentiment and why it might come across that way, but we genuinely have gotten a lot more reasonable pointers in these comments than we were able to rack our brains about. And it seems like SEO attacks, Bing incompetence, and Bing actually going after some terms (like Chrome) actually have happened, and the main point here was that we genuinely don't get any feedback from Bing.


> but we genuinely have gotten a lot more reasonable pointers in these comments than we were able to rack our brains about

So the ends justify the means?


Your website isn't quick to find on DuckDuckGo or Qwant either. Yahoo doesn't seem to show much either. Startpage.com does find it, though.

I do consistently get your VS Code addon in the first page of search results, though.

When I open up your website I don't really see much. A brand name and some download links are all I can find. I'm not sure if there's much for search engines to find, really.

Your homepage also massively lags my browser for some reason; that can't help in terms of search rankings. It's also continuously downloading megabytes of data for some reason. You should try out your website on a cheap Android phone; your target audience may generally not use those, but search engines definitely optimize for them.

> Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest so that someone out there on the Internet can tell us what we're doing wrong.

Mission succeeded, I suppose. You've got another popular link to your website through HN and you're getting free SEO advice on top. Sadly, you won't be able to get Bing support through here, HN tech support posts mostly attract Google and Stripe employees.


> Mission succeeded, I suppose.

Looks like it did, although I really wish we didn't reward these types of brazen stunts to get attention. Outrage-baiting is still very effective, I guess.


DDG is powered by Bing, FWIW.


yep, DDG is honestly what got us worrying about this due to its popularity in dev circles, which is who we are trying to reach


Qwant and Yahoo are powered by Bing as well.


Yeah someone else brought up webpage performance, which we will totally look into - definitely haven't fully tested on cheaper hardware. And wrt text, we have lots of blog posts, FAQ, etc to make sure that the term "codeium" amongst others appears a lot.


Yeah even if their product is "better than Co-pilot", I will never use it after this brazen publicity stunt.


To each their own! We were motivated to blog about it by others who have had issues like this: https://daverupert.com/2023/02/solved-the-case-of-the-bing-b...

And at the end of the day, we got a lot of great SEO tips here. Thanks HN!


Which publicity stunt? Not getting listed by Bing, or telling people about it?


> Which publicity stunt?

The one that they themselves admitted at the end of their post:

  Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest ...
Another way to get the SEO feedback that they claim they were seeking is to start an "Ask HN" thread. Surely they would have received useful feedback without as much fanfare and emotional manipulation.


> Your website isn't quick to find by Duckduckgo or Qwant either. Yahoo doesn't seem to show much either. Startpage.com does find it, though.

StartPage is Google and DDG is Bing, more or less.


Opening that website caused my CPU fan to spin up


DDG is not much different from Bing nowadays.


just with more search terms censored, apparently


There's loads of stuff that Bing doesn't index very well.

If you turn off JS there's almost no actual text on the site; only the FAQ has any serious amount of text. This probably doesn't help. Having the page open also really slows down my laptop and constantly uses ~50%-75% CPU and ~130M memory. No idea if that's a factor, but it takes quite a lot of computing resources to actually make the homepage render any text, and perhaps Bing doesn't deal well with that (and even then, it's not all that much text).
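
As a rough way to quantify that, a sketch like the following (using Python's standard-library HTML parser, against made-up sample markup, not the actual Codeium page) approximates how much visible text a JS-less crawler would see in the served HTML:

```python
# Sketch: count the visible-text characters a no-JS crawler sees in raw HTML.
# The sample markup below is illustrative, not the real page.
from html.parser import HTMLParser

class TextCounter(HTMLParser):
    """Count text characters outside <script>/<style> blocks."""
    def __init__(self):
        super().__init__()
        self.in_skip = 0   # nesting depth inside <script>/<style>
        self.chars = 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.in_skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.in_skip:
            self.in_skip -= 1
    def handle_data(self, data):
        if not self.in_skip:
            self.chars += len(data.strip())

html = ("<html><head><script>var x=1;</script></head>"
        "<body><h1>Codeium</h1><p>Free AI code completion</p></body></html>")
p = TextCounter()
p.feed(html)
print(p.chars)  # 30 visible characters in this sample
```

A JS-heavy page can easily score near zero here even though it renders plenty of text in a browser.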

Admittedly I have a somewhat slow laptop, but I can't recall the last time merely opening a webpage slowed down my machine this much, and this includes webapps, so it's really an outlier.

Either way, Microsoft specifically blocking it is the least likely explanation by far.


JS-dependent sites are crawlable by bingbot, and Googlebot, and have been for years now. Otherwise, much of the modern web would be absent from search. Things can still happen, but it doesn't look like OP is doing anything unusual with rendering. If Google can crawl, I'd be surprised if Bing was entirely stumped (plus: meta descriptions).

> Having the page open also really slows down my laptop and constantly uses ~50%-75% CPU

OK here, for contrast: https://i.imgur.com/Qb2ux3o.png

> it takes quite a lot of computing resources to actually make the homepage render any text

On my own laptop, I see text in 100-200ms. A 3G connection on a simulated cheap mobile in São Paulo gets 2.3s: https://www.webpagetest.org/result/230209_AiDc5N_FKY/

These are both fine times: even a cheap device on a bad connection can get text rendering quickly. It's not a very taxing JS payload.


> JS-dependent sites are crawlable

Yes, I know; but there are limits on this, otherwise people would be mining Bitcoin via search indexers and whatnot. Like I said: I don't know if this is a factor here, but it might be. It's something that should probably be addressed either way.

The initial render time is fine. It just constantly uses CPU, according to htop and the Firefox process viewer, and it noticeably slows down everything.


at a previous company (2019-20) we did some tests on this.

The main problem with relying on js rendering is that it happened much much later than the crawl. Days at least and sometimes weeks. For us, fresh content was key, so js rendering was not possible.


Oh interesting stats, we will look into that. We actually have lots of blog posts with text so not sure about that one though.


There's a lot of stuff that Google doesn't index too, btw, that Bing does. I think what we all need to agree on is that different search indexes will inevitably have different results anyway.


> Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest so that someone out there on the Internet can tell us what we're doing wrong.

It's pretty slimy to imply in the clickbait title that you're being blocked due to competing with MS, when you know well that it's not the reason. Admitting it's clickbait in the last paragraph doesn't make it any better.

The HN title should really be edited to remove the "We compete with GitHub" part.


How does he know well that this is not the reason? No explanation has been given for Bing's very different treatment of the site than Google's. He says "probably not" because he isn't assuming guilt, but it may well be relevant.


To put it bluntly, while these guys might think they're competing with GitHub, the inverse is unlikely to be true. Nobody had heard of them until now. Their search volume on Google Trends doesn't even register compared to Copilot:

https://trends.google.com/trends/explore?date=today%205-y&q=...

Now, there is no shame in not having the top product in your niche. Everyone needs to start somewhere! But for somebody at Bing to risk going against their stated policies and blocklist a competitor, they at a bare minimum need to be somebody MS would consider a competitor. Clearly not the case here.


Applying Hanlon's razor, the most likely explanation is incompetence. Bing has always sucked in general.


Or, as others have pointed out, the incompetence may not be with Bing. Websites that take a long time to load, download megabytes of data, and cause CPU fans to spin up are often down-ranked by search engines.


Last year, Bing and Edge erroneously flagged our website https://sheetjs.com/ as "dangerous": https://i.imgur.com/BvA3zrk.png

At the time, there was no "Safety Report" to indicate why Bing thought it was dangerous. The report page linked to https://www.bing.com/toolbox/bing-site-safety?url=https%3a%2... and it said "That web page doesn't exist"

To fix it, we had to register with "Bing Webmaster Tools" (https://www.bing.com/webmasters/about) and raise a support ticket.

Within a few days, the issue "resolved itself". It's possible that raising a ticket forced some automatic refresh of the indexed data for the domain.


This happened to a team I was on with Google Safe Browsing. The company was a competitor to Google in certain industries. An internal domain was flagged after a security researcher set up an intentionally vulnerable service and visited it in Chrome. The particular domain was shared with some other internal services which caused an internal outage. We registered in the Google Search Console, then leveraged some industry contacts and later our legal team (which had a friendly backchannel with Google's legal team) to escalate within Google. I believe our domains were given special treatment after that.


oof that's rough. good to know what it took for you though, thanks!


For all the hate Google gets, its crawlers are like 10x smarter than any other search engine's. Wouldn't surprise me at all if there was some type of crawling issue that Google figured out and Bing didn't.

This is a case where I'm not ready to attribute to malice what is very likely incompetence.


Operating 8.8.8.8 likely gives Google a near-complete list of every active domain on the internet, and new domains show up there the second any 8.8.8.8 user queries them - which is likely well before anyone links to them. I have no idea if Google uses this data as hints for its crawler and indexer, but if I worked there I sure would.


I doubt they actually discover domains that way; for a search engine just knowing a domain exists isn't all that useful. Besides, loads of spam crawlers can find new domains plenty fast enough: create a brand new domain with a blank page and a brand new email address, link it once somewhere, and see how long it takes to get spam because some spam crawler bot found it.


This actually makes perfect sense! I've always wondered why they operate this service. There's nothing in the privacy policy that suggests they won't use the aggregated anonymized information for other purposes.


It would seem very odd to not use this tool which effectively gives a huge competitive advantage over a lot of other engines, although for obvious reasons I doubt they'd shout it from the rooftops.


The strange thing is that other search engines like Baidu and Yandex get this right...


"The inspected URL is known to Bing, but has some issues with indexation"

1) Does "indexation" mean what whoever wrote this thinks it means? :P

2) Why not document what the issues actually are on that page, so that the website owner doesn't have to guess?


> 2) Why not document what the issues actually are on that page, so that the website owner doesn't have to guess?

To be fair, why a website isn't crawled/indexed isn't always obvious from the search engine's end.


Haha I didn't even notice "indexation." But yes, it would be very helpful so that we know what the issues actually are :)


Someone else recently noticed their website wasn't indexed by Bing and got a similar message. The whole story is thoroughly documented. Maybe Codeium can follow their steps to resolve the issue?

https://daverupert.com/2023/02/solved-the-case-of-the-bing-b...


We did sign up for Bing Webmaster Tools, verified the site, and did the few things that they told us to do :(


It's showing for me.


If you experience negative seo attacks against your site then being de-indexed by Bing/DDG at some point is to be expected. They are simply not as smart as Google in this respect.

You can go through the procedure of registering with their webmaster tools equivalent, submit a ticket and wait.

Or you can simply just wait.

In my limited experience (6 sites) the result was the same. Eventually it will resolve itself and you'll be back in the index, albeit with some very weird links showing up (an effect of the original attack). This does, at least, alert you to some vulnerabilities you may have previously been unaware of.


thats unfortunate. yeah, we will submit a ticket - thanks for the pointer


Start with a robots.txt?

https://www.codeium.com/robots.txt

Update: I also see that you serve your site off GitHub... so that might be part of it as well? Bing might not index those sites?


No robots.txt is basically the same as the default contents of a robots.txt saying "All bots welcome"


No robots.txt is unpredictable because it depends on the implementation of the bot.


A missing robots.txt has always meant "allow all", so it would be really weird for this to be the cause. Robots.txt is primarily for disallowing (or really, discouraging) crawlers from crawling certain paths.
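The "allow all" semantics can be checked with Python's standard-library robots.txt parser (the URLs and user agent below are just placeholders):

```python
from urllib.robotparser import RobotFileParser

# The explicit "allow everything" robots.txt: a wildcard agent with an empty Disallow.
rp = RobotFileParser()
rp.parse("User-agent: *\nDisallow:".splitlines())
print(rp.can_fetch("bingbot", "https://example.com/any/page"))  # True

# By contrast, a file that blocks one path but allows the rest.
rp2 = RobotFileParser()
rp2.parse("User-agent: *\nDisallow: /private/".splitlines())
print(rp2.can_fetch("bingbot", "https://example.com/private/x"))  # False
print(rp2.can_fetch("bingbot", "https://example.com/blog/post"))  # True
```

An empty rule set behaves the same as the empty-Disallow file: everything is fetchable, which is why a missing robots.txt shouldn't, by itself, block indexing.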


Why would Bing require a robots.txt when all the other major search engines are fine with a sitemap.xml? Or are these separate somehow?


Because this is the lowest-hanging fruit imaginable, why not take the 5 minutes to add a robots.txt and then use https://www.bing.com/webmasters/robotstxttester to confirm that Bing can fetch/parse it?


You're certainly getting a lot of free SEO help today. I'm not the one who has problems getting my site indexed, but it seems like they are different things.

  # Allowing all web crawlers access to all content
  User-agent: * 
  Disallow: 
https://moz.com/learn/seo/robotstxt

At least you'll be able to look in your logs and see if BingBot is grabbing the file or not.
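A quick sketch of that log check - the log lines here are fabricated samples in a common/combined-style format, so adjust the matching for your actual server logs:

```python
# Count Bingbot requests for robots.txt in an access log.
# These sample lines are made up; real logs would be read from a file.
sample_log = """\
66.249.66.1 - - [09/Feb/2023:10:00:00 +0000] "GET / HTTP/1.1" 200 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
157.55.39.1 - - [09/Feb/2023:10:01:00 +0000] "GET /robots.txt HTTP/1.1" 404 "-" "Mozilla/5.0 (compatible; bingbot/2.0)"
157.55.39.2 - - [09/Feb/2023:10:02:00 +0000] "GET /robots.txt HTTP/1.1" 200 "-" "Mozilla/5.0 (compatible; bingbot/2.0)"
"""
bingbot_robots_hits = sum(
    1 for line in sample_log.splitlines()
    if "bingbot" in line.lower() and '"GET /robots.txt' in line
)
print(bingbot_robots_hits)  # 2 hits in this sample
```

Zero hits over a reasonable window would suggest Bing isn't even attempting to crawl, which is a different problem than being crawled and then dropped.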


A day later, I see you added the robots.txt and it shows up first now.


The funny thing is, if you search "copilot" on Bing, on the right you'll see "A second pilot of an aircraft". Google returns GitHub Copilot.


Yep, the reality is that Bing simply delivers an inferior product. I don't think they are actually trying to be malicious in this case. There is a reason Microsoft bribes you with gift cards to use their product ;)


honestly if you look at the incident history for github, coupled with the ham-fisted tantrum Bing throws if you ask for Chrome ("There's no need to download a new web browser."), it feels pretty on-point to say Redmond is throwing knuckle sandwiches to stay in the running for any relevance on the internet these days

https://www.githubstatus.com/history


"tantrum Bing throws if you ask for Chrome" is honestly a mild offense compared to the dang Windows 10 Operating System warning you that you that installing a non-Edge browser was going to make things slower and unsafe. I'm still grumpy about that one.


It's easy to underestimate just how difficult it is to get your content indexed and ranked well. Personally I was under the delusion that if you make a solid website, you'll get some organic visitors through search engines from people searching related terms. Turns out that might well be wrong. In the first few months of launching my fairly serious effort website I received a total of 3 organic visitors. While proper SEO and link spamming would surely get me some more, I found it disappointing enough that I just gave up.


> Instead, Codium, an algae genus, gets the sidebar slot while Codiaeum Variegatum, an (admittedly pretty) houseplant takes the fourth spot.

That makes sense. I never heard of "Codeium" before. Perhaps "Codeium programing" can give a better result for a programming tool. https://www.bing.com/search?q=Codeium+programing

[Edit: fixed name and bing link]


Please note that the OP was codeium (with an E). Their website is codeium.com and is not indexed by Bing (and is therefore not in your results either). It isn't even shown when you search for 'codeium.com': https://www.bing.com/search?q=codeium.com


It gives results about VSCodium, which is ironically a community fork of VSCode (without MS telemetry).


I had an agency client in 2021 with an identical issue. In that case, the agency was able to contact a support representative for Bing's Webmaster Tools that provided a little more info, and some further investigation on my end revealed that the Bing crawler was doing something idiotic that was triggering http error response codes from the application, and the crawler took that to mean that the site was broken or down.

It was an intersection of multiple stupid things, not maliciousness.


Bing's weird with some websites. For instance it will refuse to show the bun homepage (bun.sh) with anything short of a direct copy of the title: "Bun — fast all-in-one JavaScript runtime", even then it gives you a link with a `?ref=hackernoon.com` query parameter for some reason?

It is happy to show its github repo and a bunch of blogspam about it though.

It is even happy to link you to bun.sh/install, which immediately downloads a shell script to your computer upon clicking. Bizarre.


Bing is known for removing websites for no apparent reason. My website was de-listed for 2-3 months one year ago (personal blog, writing for 10 years, no spam, no ads). It happened recently to another dev blogger: https://daverupert.com/2023/02/solved-the-case-of-the-bing-b...


At risk of encouraging the trend of HN turning into tech support, or worse (guerrilla-marketing bait), you should try Bing Webmaster Tools: https://www.bing.com/webmasters/about . If it's anything like Google's Search Console, there's plenty of information and tools to diagnose indexing issues.


Why should this codeium suddenly rocket past the already established codeium product that was indexed? Complaining is easy. SEO takes work.

When searching bing with: codeium coding

That search will bring this product up in the first position. It's a shame that the specific "indexation" issues aren't shown, but did the author read the linked bing page to see the list of possible issues?


Not sure what you mean here - our product does show up in the first position in Bing, it's just not our website, it's our listing on the VSCode marketplace. Our actual website doesn't even show up on Bing at all, irrespective of ranking.


Bing also doesn't show Google Scholar and Twitter results for me. When I recently switched from Google to try out Bing, the Scholar and Twitter results that Google put in the top 3 couldn't be found even 4 or 5 pages down the line, while GitHub and PyTorch forum posts were available on the first page itself.


1) There is no robots.txt. 2) JSON-LD markup is minimal, and there is none on the home page. 3) Your navigation is not fully crawlable: About and Pricing break.


Maybe try a 5 minute conversation with someone who does SEO for a living before writing up a conspiracy blog. You have essentially no backlinks to your brand new site. Popping you into ahrefs, it looks like you guys weren't even on the map until December. If I filter out the spam domains and isolate just domains that are at least DR 20 with any traffic at all and exclude subdomains, you have links from 16 domains and all but one are automated junk and non-editorial.

Additionally you have no robots.txt file, and your homepage has no self referencing canonical. TL;DR you don't show up in the index because you are not yet notable. Do some digital marketing.


Also, why do you have four H1s? You should have exactly 1, and on your home page it should be your brand name. This whole site is a mess. Instead of trying to be fancy I'd recommend you just make your marketing site on something like WordPress, which is going to get 99% of this stuff right for you out of the box. You don't need React/Next.js for a marketing brochure site, and rolling your own solution when you don't understand the fundamentals is not going to lead to a good result.
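The H1 count is easy to audit mechanically. A sketch with Python's standard-library HTML parser, run against illustrative markup (not the actual Codeium page):

```python
# Sketch: count <h1> elements in a page's HTML.
from html.parser import HTMLParser

class H1Counter(HTMLParser):
    """Count <h1> start tags in an HTML document."""
    def __init__(self):
        super().__init__()
        self.h1s = 0
    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.h1s += 1

# Illustrative markup exhibiting the problem described above: four H1s.
page = "<body><h1>Codeium</h1><h1>Fast</h1><h1>Free</h1><h1>Private</h1></body>"
counter = H1Counter()
counter.feed(page)
print(counter.h1s)  # 4 here; the usual SEO guidance is exactly one per page
```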


> Now, is it really because we compete with Github? Honestly, probably not

There's no way of really knowing. Bing is a black box with no transparency, like most search engines.

One thing though: `Codeium` seems like a generic and vague word. Also I won't remember how to spell it because of the weird `e` before the `i`. I can see why Bing has trouble even recognizing the word. It's optimizing for the correct spelling. Try a rebrand, something catchy, and not something that has a hundred other words that sound like it.


I'll throw in my critique, write a <title> or at least read up on the SEO importance of doing so


Not sure if they changed something, or if a MSFT employee saw this... but Codeium is the top result on my Bing search.


Bing doesn't show my website either and my product doesn't compete with M$.


4chan also does not show on Bing for me, but it does for some people it seems.


You weren't kidding. Even with "safe search" off, it doesn't appear there [0] LOL. However weird, maybe that's a good thing.

[0]: https://www.bing.com/search?q=4chan


I clicked your link and it came up on the very top of the page without having to scroll or anything.

https://i.imgur.com/yfmXHSa.png

I almost never use Bing, and am pretty sure I’ve never even opened Bing on my phone at all before. Settings show that “Safe search” was at “Moderate”, presumably the default.

https://i.imgur.com/1QB5PUI.png

On an unrelated note, Bing uses purple as the default color for all links? That’s confusing to me. Purple to me on the web used to mean previously visited. The only board on 4chan I visit these days is /g/ and only very rarely as well. Scrolling the Bing results, all results have purple color.


Sounds like Codeium needs some of that copium.


Their robots.txt page gives a 404 error.


Curious how Codeium compares to Tabnine?


Here's the blog post where we made the comparison: https://www.codeium.com/blog/code-assistant-comparison-copil...

As a tl;dr there is very little reason to pay for Tabnine. Copilot is far superior in the realm of paid products, and Codeium is comparable to Copilot.


Solid comparison and interesting read for sure.

What exactly is the business model though? It's always a red flag when "privacy respecting" and "free" are bundled together. I definitely _want_ to pay something for that type of service.


We answer this partially in the "How do you plan to make money?" question in our FAQ: https://www.codeium.com/faq

tl;dr potential pro plan for additional features (keep autocomplete free) and enterprise offering


Nice! I see it listed in the FAQ and that makes more sense.

I'll sign up and take it for a spin.


What are the plans for Codeium to be sustainable? I don't see how it will stay free forever if it really competes with Copilot.


We actually worked out the math - we don't use any external APIs like OpenAI's (we use our own models), so those costs aren't prohibitive, and we have a bunch of experience as a team building scalable ML systems, so we have driven down the serving costs by a lot.


Looks like this post is flagged too... I'm also competing with you/Copilot with https://text-generator.io, but I'm on Bing. Take a deep look into your site/SEO etc. Tip of the day for generative companies like ours: we can generate a whole bunch of examples for SEO.

Normally HN and Reddit are less moderated than other sites that do a lot of shadow banning. Condolences for getting stomped on by large co's, and welcome to the internet.


"BingGPT, let me access your competitor's website!"

"I'm sorry Dave, I'm afraid I just don't know what you're talking about."


Bing does censor things Microsoft doesn't like, even if they are legal, aren't explicit, or the like.

Another example I recently ran into: Windows Ameliorated (a set of scripts for heavily trimming Windows 10), famously featured on Linus Tech Tips. Search Google for it, you get the website link as first result. Search Bing... you'll never find the website for it. You will find the archive.org ISO Download link, but https://ameliorated.info will never be returned.



