This is pure clickbait and really doesn't leave a good impression of this team. Writing a whole article implying they're not being indexed because they're a GitHub Copilot competitor, when they themselves don't believe it, is farcical.
Title:
We compete with GitHub. Bing does not show our website
FTA:
Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest so that someone out there on the Internet can tell us what we're doing wrong. [...] we only care about getting as many software engineers to experience the power of generative AI for software development via our free product.
If you came from a nation that never gets snow, practiced with the bobsled in a simulated environment, then went to the Olympics and participated, did you compete?
Competing doesn't mean being top of the field. It might just mean making enough to eke out a living.
I understand the sentiment and why it might come across that way, but we have genuinely gotten a lot more reasonable pointers in these comments than we were able to come up with by racking our own brains. And it seems like SEO attacks, Bing incompetence, and Bing actually going after some terms (like Chrome) have actually happened. The main point here was that we genuinely don't get any feedback from Bing.
Your website isn't easy to find on DuckDuckGo or Qwant either, and Yahoo doesn't seem to show much. Startpage.com does find it, though.
I do consistently get your VS Code addon in the first page of search results, though.
When I open up your website I don't really see much. A brand name and some download links are all I can find. I'm not sure if there's much for search engines to find, really.
Your homepage also massively lags my browser for some reason, which can't help in terms of search rankings. It's also continuously downloading megabytes of data for some reason. You should try out your website on a cheap Android phone; your target audience may generally not use those, but search engines definitely optimize for them.
> Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest so that someone out there on the Internet can tell us what we're doing wrong.
Mission succeeded, I suppose. You've got another popular link to your website through HN and you're getting free SEO advice on top. Sadly, you won't be able to get Bing support through here, HN tech support posts mostly attract Google and Stripe employees.
Looks like it did, although I really wish we didn't reward this type of brazen stunt to get attention. Outrage-baiting is still very effective, I guess.
Yeah, someone else brought up webpage performance, which we will totally look into - we definitely haven't fully tested on cheaper hardware. And with regard to text, we have lots of blog posts, an FAQ, etc. to make sure that the term "codeium", amongst others, appears a lot.
The one that they themselves admitted at the end of their post:
Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest ...
Another way to get the SEO feedback that they claim they were seeking would be to start an "Ask HN" thread. Surely they would have received useful feedback without as much fanfare or emotional manipulation.
There's loads of stuff that Bing doesn't index very well.
If you turn off JS there's almost no actual text on the site; only the FAQ has any serious amount of text. This probably doesn't help. Having the page open also really slows down my laptop and constantly uses ~50%-75% CPU and ~130MB of memory. No idea if that's a factor, but it takes quite a lot of computing resources to actually make the homepage render any text, and perhaps Bing doesn't deal well with that (and even then, it's not all that much text).
Admittedly I have a somewhat slow laptop, but I can't recall the last time merely opening a webpage slowed down my machine this much, and this includes webapps, so it's really an outlier.
Either way, Microsoft specifically blocking it is the least likely explanation by far.
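If anyone wants to reproduce that check without fiddling with browser settings, here's a minimal sketch (assuming the requests and BeautifulSoup libraries, with codeium.com as the URL under discussion) that measures how much text is in the raw HTML before any JavaScript runs:

    import requests
    from bs4 import BeautifulSoup

    # Fetch the raw HTML, i.e. roughly what a crawler sees before executing any JavaScript.
    html = requests.get("https://codeium.com", timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Drop script/style contents so only user-visible text is counted.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()

    text = " ".join(soup.get_text(separator=" ").split())
    print(f"{len(text)} characters of pre-JS text")
    print(text[:300])  # a quick peek at what's actually there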
JS-dependent sites are crawlable by Bingbot and Googlebot, and have been for years now. Otherwise, much of the modern web would be absent from search. Things can still go wrong, but it doesn't look like OP is doing anything unusual with rendering. If Google can crawl it, I'd be surprised if Bing was entirely stumped (plus: meta descriptions).
> Having the page open also really slows down my laptop and constantly uses ~50%-75% CPU
Yes, I know; but there are limits on this, otherwise people would be mining Bitcoin via search indexers and whatnot. Like I said: I don't know if this is a factor here, but it might be. It's something that should probably be addressed either way.
The initial render time is fine. It just constantly uses CPU, according to htop and the Firefox process viewer, and it noticeably slows down everything.
At a previous company (2019-20) we did some tests on this.
The main problem with relying on JS rendering is that it happened much, much later than the crawl - days at least, and sometimes weeks. For us, fresh content was key, so JS rendering was not an option.
There's a lot of stuff that Google doesn't index too, btw, that Bing does. I think what we all need to agree on is that different search indexes will inevitably have different results anyway.
> Now, is it really because we compete with Github? Honestly, probably not, but controversy drums up interest and we need interest so that someone out there on the Internet can tell us what we're doing wrong.
It's pretty slimy to imply in the clickbait title that you're being blocked due to competing with MS, when you know well that it's not the reason. Admitting it's clickbait in the last paragraph doesn't make it any better.
The HN title should really be edited to remove the "We compete with GitHub" part.
How does he know well that this is not the reason? No explanation has been given for Bing's very different treatment of the site than Google's. He says "probably not" because he isn't assuming guilt, but it may well be relevant.
To put it bluntly, while these guys might think they're competing with GitHub, the inverse is unlikely to be true. Nobody had heard of them until now. Their search volume on Google Trends doesn't even register compared to Copilot:
Now, there is no shame in not having the top product in your niche. Everyone needs to start somewhere! But for somebody at Bing to risk going against their stated policies and blocklist a competitor, the site would at a bare minimum need to be something MS considers a competitor. That's clearly not the case here.
Or, as others have pointed out, the incompetence may not be with Bing. Websites that take a long time to load, download megabytes of data, and cause CPU fans to spin up are often down-ranked by search engines.
This happened to a team I was on with Google Safe Browsing. The company was a competitor to Google in certain industries. An internal domain was flagged after a security researcher set up an intentionally vulnerable service and visited it in Chrome. The particular domain was shared with some other internal services which caused an internal outage. We registered in the Google Search Console, then leveraged some industry contacts and later our legal team (which had a friendly backchannel with Google's legal team) to escalate within Google. I believe our domains were given special treatment after that.
For all the hate Google gets, its crawlers are like 10x smarter than any other search engine's. Wouldn't surprise me at all if there was some type of crawling issue that Google figured out and Bing didn't.
This is a case where I'm not ready to attribute to malice what is very likely incompetence.
Operating 8.8.8.8 likely gives Google a near-complete list of every active domain on the internet, and new domains show up there the second any 8.8.8.8 user queries them - which is likely well before anyone links to them. I have no idea if Google uses this data for hints to their crawler and indexer, but if I worked there I sure would.
I doubt they actually discover domains that way; for a search engine just knowing a domain exists isn't all that useful. Besides, loads of spam crawlers can find new domains plenty fast enough: create a brand new domain with a blank page and a brand new email address, link it once somewhere, and see how long it takes to get spam because some spam crawler bot found it.
This actually makes perfect sense! I've always wondered why they operate this service. There's nothing in the privacy policy that suggests they won't use the aggregated anonymized information for other purposes.
It would seem very odd to not use this tool which effectively gives a huge competitive advantage over a lot of other engines, although for obvious reasons I doubt they'd shout it from the rooftops.
Someone else recently noticed their website wasn't indexed by Bing, and they got a similar message. The whole story is thoroughly documented. Maybe Codeium can follow their steps to resolve the issue?
If you experience negative SEO attacks against your site, then being de-indexed by Bing/DDG at some point is to be expected. They are simply not as smart as Google in this respect.
You can go through the procedure of registering with their webmaster tools equivalent, submit a ticket and wait.
Or you can simply just wait.
In my limited experience (6 sites) the result was the same. Eventually it will resolve itself and you'll be back in the index, albeit with some very weird links showing up (an effect of the original attack). This does, at least, alert you to some vulnerabilities you may have previously been unaware of.
A missing robots.txt has always meant "allow all", so it would be really weird for this to be the cause. Robots.txt is primarily for disallowing (or really, discouraging) crawlers from crawling certain paths.
Because this is the lowest-hanging fruit imaginable, why not take the 5 minutes to add a robots.txt and then use https://www.bing.com/webmasters/robotstxttester to confirm that Bing can fetch/parse it?
You're certainly getting a lot of free SEO help today. I'm not the one who has problems getting my site indexed, but it seems like a missing robots.txt and an explicit allow-all one may be treated as different things.
    # Allowing all web crawlers access to all content
    User-agent: *
    Disallow:
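For a quick local sanity check before (or alongside) Bing's tester, Python's built-in robotparser can confirm that crawlers are allowed once a file like the one above is deployed - a minimal sketch, assuming it lives at the site root:

    from urllib.robotparser import RobotFileParser

    # Parse the live robots.txt (assumes the allow-all file above has been deployed)
    rp = RobotFileParser("https://codeium.com/robots.txt")
    rp.read()

    # With "User-agent: *" and an empty "Disallow:", every path should be allowed
    print(rp.can_fetch("bingbot", "https://codeium.com/"))       # expected: True
    print(rp.can_fetch("Googlebot", "https://codeium.com/faq"))  # expected: True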
Yep, the reality is that Bing simply delivers an inferior product. I don't think they are actually trying to be malicious in this case. There is a reason Microsoft bribes you with gift cards to use their product ;)
Honestly, if you look at the incident history for GitHub, coupled with the ham-fisted tantrum Bing throws if you ask for Chrome ("There's no need to download a new web browser."), it feels pretty on-point to say Redmond is throwing knuckle sandwiches to stay in the running for any relevance on the internet these days.
"tantrum Bing throws if you ask for Chrome" is honestly a mild offense compared to the dang Windows 10 Operating System warning you that you that installing a non-Edge browser was going to make things slower and unsafe. I'm still grumpy about that one.
It's easy to underestimate just how difficult it is to get your content indexed and ranked well. Personally I was under the delusion that if you make a solid website, you'll get some organic visitors through search engines from people searching related terms. Turns out that might well be wrong. In the first few months of launching my fairly serious-effort website I received a total of 3 organic visitors. While proper SEO and link spamming would surely get me some more, I found it disappointing enough that I just gave up.
Please note that the OP was Codeium (with an E). Their website is codeium.com and is not indexed by Bing (and is therefore not in your results either). It isn't even shown when you search for 'codeium.com': https://www.bing.com/search?q=codeium.com
I had an agency client in 2021 with an identical issue. In that case, the agency was able to contact a support representative for Bing's Webmaster Tools who provided a little more info, and some further investigation on my end revealed that the Bing crawler was doing something idiotic that was triggering HTTP error response codes from the application, and the crawler took that to mean that the site was broken or down.
It was an intersection of multiple stupid things, not maliciousness.
Bing's weird with some websites. For instance, it will refuse to show the Bun homepage (bun.sh) with anything short of a direct copy of the title, "Bun — fast all-in-one JavaScript runtime", and even then it gives you a link with a `?ref=hackernoon.com` query parameter for some reason.
It is happy to show its github repo and a bunch of blogspam about it though.
It is even happy to link you to bun.sh/install, which immediately downloads a shell script to your computer upon clicking. Bizarre.
Bing is known for removing websites for no apparent reason. My website was de-listed for 2-3 months one year ago (personal blog, writing for 10 years, no spam, no ads). It happened recently to another dev blogger: https://daverupert.com/2023/02/solved-the-case-of-the-bing-b...
At risk of encouraging the trend of HN turning into tech support, or worse (guerilla-marketing bait), you should try Bing Webmaster Tools: https://www.bing.com/webmasters/about . If it's anything like Google's Search Console, there's plenty of information and tools to diagnose indexing issues.
Why should this codeium suddenly rocket past the already established codeium product that was indexed? Complaining is easy. SEO takes work.
When searching Bing with: codeium coding
That search will bring this product up in the first position. It's a shame that the specific "indexation" issues aren't shown, but did the author read the linked Bing page to see the list of possible issues?
Not sure what you mean here - our product does show up in the first position in Bing, it's just not our website, it's our listing on the VSCode marketplace. Our actual website doesn't even show up on Bing at all, irrespective of ranking.
It also doesn't show Google Scholar and Twitter results for me. When I recently switched to try out Bing instead of Google, the Scholar and Twitter results were in the top 3 on Google, but I couldn't find them with Bing even 4 or 5 pages down, while GitHub and PyTorch forum posts were available on the first page itself.
1) there is no robots.txt
2) JSON-LD markup is minimal and there is none on the home page
3) your navigation is not fully crawlable - About and Pricing break
Maybe try a 5-minute conversation with someone who does SEO for a living before writing up a conspiracy blog. You have essentially no backlinks to your brand-new site. Popping you into Ahrefs, it looks like you guys weren't even on the map until December. If I filter out the spam domains and isolate just domains that are at least DR 20 with any traffic at all and exclude subdomains, you have links from 16 domains, and all but one are automated junk and non-editorial.
Additionally, you have no robots.txt file, and your homepage has no self-referencing canonical. TL;DR: you don't show up in the index because you are not yet notable. Do some digital marketing.
Also, why do you have four H1s? You should have exactly one, and on your home page it should be your brand name. This whole site is a mess. Instead of trying to be fancy, I'd recommend you just build your marketing site on something like WordPress, which is going to get 99% of this stuff right for you out of the box. You don't need React/Next.js for a marketing brochure site, and rolling your own solution when you don't understand the fundamentals is not going to lead to a good result.
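For what it's worth, most of the on-page points above (JSON-LD, the self-referencing canonical, the H1 count, the meta description) can be spot-checked with a few lines of Python - a rough sketch using requests and BeautifulSoup, not a substitute for a real SEO audit:

    import requests
    from bs4 import BeautifulSoup

    def audit(url: str) -> None:
        # Pull the served HTML and look for the basic on-page signals discussed above.
        soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

        json_ld = soup.find_all("script", type="application/ld+json")
        canonical = soup.find("link", rel="canonical")
        h1s = soup.find_all("h1")
        description = soup.find("meta", attrs={"name": "description"})

        print(f"JSON-LD blocks:   {len(json_ld)}")
        print(f"Canonical:        {canonical.get('href') if canonical else 'missing'}")
        print(f"H1 count:         {len(h1s)} (ideally exactly one)")
        print(f"Meta description: {'present' if description else 'missing'}")

    audit("https://codeium.com")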
> Now, is it really because we compete with Github? Honestly, probably not
There's no way of really knowing. Bing is a black box with no transparency, like most search engines.
One thing though: `Codeium` seems like a generic and vague word. Also I won't remember how to spell it because of the weird `e` before the `i`. I can see why Bing has trouble even recognizing the word. It's optimizing for the correct spelling. Try a rebrand, something catchy, and not something that has a hundred other words that sound like it.
I almost never use Bing, and am pretty sure I’ve never even opened Bing on my phone at all before. Settings show that “Safe search” was at “Moderate”, presumably the default.
On an unrelated note, Bing uses purple as the default color for all links? That’s confusing to me. Purple to me on the web used to mean previously visited. The only board on 4chan I visit these days is /g/ and only very rarely as well. Scrolling the Bing results, all results have purple color.
As a tl;dr there is very little reason to pay for Tabnine. Copilot is far superior in the realm of paid products, and Codeium is comparable to Copilot.
What exactly is the business model though? It's always a red flag when "privacy respecting" and "free" are bundled together. I definitely _want_ to pay something for that type of service.
We actually worked out the math - we don't use any external APIs like OpenAI (use our own models) so those costs aren't prohibitive, and we have a bunch of experience as a team building scalable ML systems, so we have driven down the serving costs by a lot.
Looks like this post is flagged too... I'm also competing with you/Copilot with https://text-generator.io but I'm on Bing. Take a deep look into your site/SEO, etc. A tip of the day for generative AI companies like ours: we can generate a whole bunch of examples for SEO.
Normally HN and Reddit are less moderated than other sites that do a lot of shadow banning. Condolences for getting stomped on by large companies, and welcome to the internet.
Bing does censor things Microsoft doesn't like, even if they are legal, aren't explicit, or the like.
Another example I recently ran into: Windows Ameliorated (a set of scripts for heavily trimming Windows 10), famously featured on Linus Tech Tips. Search Google for it and you get the website as the first result. Search Bing... and you'll never find the website for it. You will find the archive.org ISO download link, but https://ameliorated.info will never be returned.