
Jude from Golden here. Agreed that WP is one of the most amazing things ever built and interesting to see your various lenses on our mission.

To the cynical self: see the Dropbox launch on HN back in the day. PS I’m in no way claiming we are Dropbox :>

To the angry self: There are various constraints that we want to free ourselves from by starting fresh on this problem. We believe the constraints are too many not to build something new here. There are things that can be reused to build on what has been done already (linking out to WP, WP linking back to us when appropriate, the namespace being similar/forked, various policies being built on/forked or rewritten, lessons learnt, content summarization with AI, fact cross-checking, etc.).

To the academic self: we want to cover 10bn+ topics; Google Knowledge Graph is around 3bn+ entities. We are not attempting to map all lamp posts in San Francisco, which would make a useful data set for a self-driving car company, but we do want to map all businesses, concepts, science topics, people of interest, species, products, services, etc. Instead of notability, we are aiming more at a validation model, i.e. ‘does this entity exist’. There is also a difference between an ‘article’ on WP and an ‘entity’ on Golden in our model. So I believe there is space for positive coexistence between Golden and WP. We will still want discussion around validation and the ‘what next after 10bn entities are done’ debate.

In terms of the morality part, we wish to be at a more open standard than WP. The trade for the common user: we open up all the pages under CC BY-SA 4.0, aim for hopefully 1000x more entity cardinality, and open source useful queries, in exchange for less work per topic than alternatives (due to the leverage of automation over alternatives). So I think we are on strong moral ground here, otherwise I would not work on it. We also have paid helpers to fill in gaps on our side, and we are using part of our revenue to increase content at an ever faster speed. We reviewed the micropayments model and we don’t think it will work (see the Lunyr failure and others on that front).




Thanks for replying! I agree micropayments are incredibly difficult to make work, as is evidenced by the race to the bottom advertising model that seems to be everywhere all-the-time.

I hear from you that you want to essentially cover 10 billion topics, and essentially validate that they exist, but that says nothing about validating the content of what someone is saying about it, nor organizing it, etc.

I hear lots of AI buzzwords, but essentially I don't see any staff that would leverage all the thinking humanity has done on organizing, validating, and cataloging information. Where are the information scientists? The librarians? The archivists? The journalists? The philosophers? Etc etc etc.

Essentially you are talking about a profoundly /human/ endeavor that requires input (IMHO) from many corners of human knowledge to do in any way that begins to approach wikipedia (or even an encyclopedia, much less a library) in terms of quality and scale.

I hear buzzwords, and see an alarming lack of acknowledgement of how difficult these questions are (or even that they exist).

However, you have the $$ and the people, and I'm just here hiding behind a keyboard criticizing. Clearly you've convinced more people of your ideas than I have of mine, and by all means it's a noble goal, so I wish you the best of luck, and it will be interesting to see what you and Golden look like in a decade or so!


Thanks so much. In terms of the hard questions, after today's launch madness calms down a little I'll tackle the hardest comments/questions here. In the coming months we will do some technical blog posts to explain how we will tackle the problem space. Many of the problems we have not figured out yet, and we welcome the community to contact us with new ideas. I 100% agree some of the problems are very hard. To give a glimpse into some problems we have solved so far, please test out the AI-assisted editor, the magic table cells in the editor for auto-filling tables, the citation tool (paste an academic paper into the citation UI), the event detection on the timeline UI, and the AI suggestions, to see some of the early results we have on automating the problem. Topic prediction, taxonomic detection, claim validation, structured data extraction, auto field detection/suggestions, crosslinking, spelling/grammar checking, sentiment checking, event detection, tense detection, quality on human edit feedback and ultimately prose writing (see recent OpenAI auto-writing research) [non-exhaustive list] - some we have solved and some not yet, but we will keep working on it. Generally speaking, keen to work on something difficult for the next 10 years...


Hi Jude,

On your page you said "If an extremely niche topic is valuable to just a handful of people and positively contributes to society, it will have a home on Golden."

Who will make that judgement call of what "contributes to society", and who will be paying their salary?

You also said "We believe this advanced query tool is extremely useful for investment funds, large consultancies and large companies, so please get in touch if you want to experience one of the best query tools out there."

That sounds great but it's a far cry from "human knowledge". There wasn't much about advanced query tools for academics, nonprofits, activists, or government employees.

Sorry to be so cynical but one can only hear so much of "making the world a better place", to quote Silicon Valley.


> To the academic self: we want to cover 10bn+ topics; Google Knowledge Graph is around 3bn+ entities. We are not attempting to map all lamp posts in San Francisco, which would make a useful data set for a self-driving car company, but we do want to map all businesses, concepts, science topics, people of interest, species, products, services, etc. Instead of notability, we are aiming more at a validation model, i.e. ‘does this entity exist’. There is also a difference between an ‘article’ on WP and an ‘entity’ on Golden in our model. So I believe there is space for positive coexistence between Golden and WP. We will still want discussion around validation and the ‘what next after 10bn entities are done’ debate.

How does this differ from wikidata.org?


For one difference, it will have to be profitable to pay back investors.


> we wish to be at a more open standard than WP

As far as I can see, CC-BY-SA license only applies to the text and not the knowledge graph that users contribute.


Hey judegomila,

You're using my images - hundreds of them, it seems - in breach of the licence. Where do I send my invoice?

I'm sure as you wish to be "on strong moral ground", you won't want to deny me what you owe me.


So, you're aiming for an exit as per Metaweb/Freebase?


JG here. No, just a damn good website for you all.


If you've taken VC, then surely there's an exit in about 5 years?


JG: I hope not :>


Like a lot of others here I really wish you good luck, this can become amazing!

Like a lot of others here I'm also afraid of what will happen once VC starts to demand profit, now.

For the record, I'm in no way against successful companies being wildly profitable, quite the contrary: I see that as a guard against being forced to do dumb things.

What I am wary of, however, is companies being forced by VC to do all kinds of crazy stuff, like back when Quora decided to publish everything one looked at and I left there and then, never to return, or when shortly after WhatsApp joined Facebook talk started about "integration" and I immediately started moving my account and all groups elsewhere.

What I could hope for[0] - especially with companies that are hoping to crowdsource a lot of data - would be some kind of effective guarantee and/or escrow to prevent short sighted plays by VCs or hostile takeover by Google, Facebook or similar companies, i.e. that companies would "tie themselves to the mast" to escape the siren songs.

[0]: but don't really expect in most cases as it would limit a number of profitable exits. An upside I could see would be that it would be easier to get crowdsourced data, from both companies as well as from individual contributors if one could believe that the data would stay accessible and not be abused.


Don't let your VC see this because they're definitely expecting to see a return on their investment reasonably soon.


All the best.



