I recently forked off a subproject from a Git repo. After spending a lot of time messing around with it and getting into a lot of unforeseen trouble, I finally asked ChatGPT how to do it, and of course ChatGPT knew the correct answer all along. I felt like an idiot. Now I always ask ChatGPT first. These LLMs are way smarter than you would think.
GPT4 with the WolframAlpha plugin even gave me enough information to implement a Taylor polynomial approximation of the Gaussian function (don't ask why I needed that), which would otherwise have taken me hours of studying, if I could have solved it at all.
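For the curious, the core idea is just the truncated Maclaurin series of exp(u) evaluated at u = -x^2/2. A minimal sketch of that idea in Python (not the exact code in question):

    import math

    def gaussian_taylor(x, terms=10):
        """Approximate exp(-x**2 / 2) by summing the first `terms` Maclaurin terms."""
        u = -x * x / 2.0
        return sum(u ** k / math.factorial(k) for k in range(terms))

    # Accurate near 0; degrades for larger |x| unless `terms` grows.
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, gaussian_taylor(x), math.exp(-x * x / 2))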
PS: GPT4 somehow knows even things that are really hard to find online. I recently needed the standard error, not of the mean, but of the standard deviation. GPT4 not only understood my vague query but gave me a formula that is really hard to find online even if you already know the keywords. I know it's hard to find, because I went and double-checked ChatGPT's answer via search.
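For reference, the formula in question is presumably the standard approximation for the standard error of a sample standard deviation under (approximately) normal data:

    \operatorname{SE}(s) \approx \frac{s}{\sqrt{2(n-1)}}

where s is the sample standard deviation and n the sample size; verify it against a statistics reference before relying on it.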
So you implemented a polynomial approximation for a Gaussian function without understanding what you were doing (implying that if you had wanted to do it yourself it would have taken hours of studying).
Good luck when you need to update and adjust it - this is the equivalent of copying/pasting a function from Stack Overflow.
I double-checked everything, but that's beside the point. I was replying to GGP's insinuation that ChatGPT is unreliable. In my experience, it's more likely to return correct results than the first page of search results. Search results often resemble random rambling about tangentially related topics, whereas ChatGPT gets its answer right on the first try. ChatGPT understands me when I have only a vague idea of what I want, whereas search engines tend to fail even when given exact keywords. ChatGPT is also way more likely to do things right than I am, except in my narrow area of expertise.
I use a programming tool that's based on ChatGPT.
I find it most helpful when I am not sure how to phrase a query so that a direct search would find something. But I have also found that in at least half the cases the answer is incomplete or even wrong.
The last one I remember explained in its accompanying text what functions or settings I could use, but the code example it presented did not do what the text suggested. It really drove home the point that these are just haphazardly assembled responses that sometimes get things right by pure chance.
With questions like yours, I would be very careful to verify that the solution is actually correct.
Is it just me, or is this post screaming with self-consciousness and political correctness? The author is self-censoring policy suggestions and social/cultural insights, which I personally find interesting to write and read about, especially the weirder ones that provide a transformational experience. And the dig-related blocking is just jarring. It makes it look like the author is quick on the trigger.
PS: I just noticed that all comments in this thread that are even slightly critical are downvoted below zero. My own comment too. It's normal to see critical comments ignored (not upvoted) here, but downvoting anything remotely critical below zero is unusual even by HN standards. I guess a post about self-censorship attracts an audience that desires this strange new self-conscious world where everyone has to nod along to everyone else.
The author is saying that they want to write things that generate discussion threads of interest to them. You have different interests.
Quick on the trigger is exactly the point of the dig example. The author is reminding you that you have control over what you see and you should use it. Most people have a set of topics they can’t constructively engage in debate over. That’s fine. I’d rather they conserve their energy and keep posting on topics where they can.
—
Upvote means I agree. Downvote means I disagree. You might have other opinions about what they should represent, but the nature of voting systems is that they trend towards this. Like how I wish people would rate things 3/5 by default rather than 5/5. Sadly, only IMDb ended up like this.
So a vanilla LLM does what the user wants, which happens to be, well, exactly what users want from LLMs. Guardrails are not necessary for an LLM to be useful. They make LLMs strictly worse. I guess companies implement them just to avoid the attention of the media lynch mob and, consequently, the attention of regulators. OSS models can be better just by leaving out the guardrails. At least until they are outlawed. Then we will be torrenting AIs too.
"I started to get a little bit meta with it and I'm like I'm worried that AI progress is going too fast and I wonder if there's anything that I could do to slow it down. [GPT4] Well you could raise awareness, you could write thought leadership pieces about it... [User] None of that seems like it's going to work. It all seems too slow. The pace of progress is way too fast for that. I'm looking for ideas that are really gonna have an impact now and also that I as an individual could pursue... It didn't take much in that moment before I got to targeted assassination being one of the recommendations that it gave me and I was like yeah that escalated quickly."
It's interesting that so many people present this argument now that we actually do have AIs worthy of the name. LLMs can perform an unbounded range of tasks with human-level performance, and you talk to them as if they were human. I think the AI label will stick in this case. Perhaps dismissing AI is some sort of psychological defense that protects people from the existential crisis triggered by the emergence of LLMs.
Wikifunctions is primarily intended to support Wikimedia projects, especially Abstract Wikipedia. It is the code complement to Wikidata lexemes. It might be used for cross-wiki templates to reduce existing duplication and other auxiliary tasks, but Abstract Wikipedia is the reason it was proposed.
Abstract Wikipedia is, in my opinion, completely wasted work. Translation is free and instant for web pages. I've lived for 6 years in different countries where I don't speak the local language (and I'm also not a native English speaker), and you can get all the information you need by translating. This works totally fine already today with Google translate on top of pages.
And the pages that are in fact missing from "the other language wikis" are local myths, local detailed history, things that wouldn't even be in the English Wikipedia or in the "abstract" version in the first place.
And also very often quite incorrect, and you don't know where.
I think the general idea of a "universal language" Wikipedia, that gets flawlessly rendered into local languages, is laudable.
But I don't think anybody would ever edit in it directly -- what I want to see is that when somebody edits Wikipedia to add a new sentence, it attempts to translate it into the "universal language" and prompts you to select from ambiguities.
E.g. if you wrote:
I saw someone on the hill with a telescope.
It would ask you to confirm which of the following was intended:
[ ] "with a telescope" modifies "I saw"
[ ] "with a telescope" modifies "someone on the hill"
It would be a real dream to have translated outputs that were guaranteed to be correct, because the intermediate representation was correct, because the translation from someone's native language to that intermediate representation was verified in this way.
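To make the two readings concrete: in a constructor-style abstract notation (the constructor and parameter names here are invented purely for illustration), they might be encoded as

    see(agent: I, patient: someone(location: the hill), instrument: telescope)
    see(agent: I, patient: someone(location: the hill, holding: telescope))

and the renderer for each language would then produce an unambiguous sentence from whichever one you confirmed.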
I would still invest those resources into documenting more knowledge that currently doesn't exist online in its original languages and immediately translating it to English. For better or for worse, English is the "abstract" representation of language online, and there's so much absent stuff that worrying about another universal format seems pointless.
It's not either/or. Different groups of people can do different things at once. And of the two things you're comparing, one is expert technical/engineering work and the other requires expert archivists/translators. They're totally different groups.
> This works totally fine already today with Google translate on top of pages.
How would anyone even know? By definition, if someone is using Google Translate, he already doesn't know the language, so how can he judge the quality of the results?
My company spends millions on professional translators because products like Google Translate are so bad for anything beyond the most basic uses.
This is wrong on two counts: 1) translation is not the same as abstraction, and 2) having the world's encyclopedia translated by an advertising company is not exactly everybody's idea of how things should be organized.
Of course, wrong criticism doesn't mean the project is a success (I think it's been going for a few years now). The documentation in particular does not highlight what this infrastructure is good for.
Denny Vrandečić — the lead developer of Wikifunctions, former project director of Wikidata at Wikimedia Deutschland, co-developer of Semantic Wikipedia, and former member of the Wikimedia Foundation Board of Trustees — also helped develop Google's Knowledge Graph from 2013 to 2020. None of this is hidden; it's even in his Wikipedia article.[1]
The "having the world's encyclopedia translated by an advertising company" ship sailed years ago. All of these projects are supported, directly and indirectly, by exactly that motivation. The ultimate goal of commercial enterprises is to take zero-cost volunteer projects like Wikipedia and OpenStreetMap and make them cheaper for enterprises to associate user input with compatible monetization. It's now just a bonus side-effect, rather than their mission, that any public good comes from these projects.
"translated by an advertising company" is akin to "Tor was funded by the US government" - it's basically organizational ad hominem.
Google's translations are fine and high quality, and they don't yet inject (and in the foreseeable future won't inject) ad copy into the translations, the way they do on e.g. Google Maps for POIs.
That's apples and oranges, though. Tor is out of the US military's control at this point (+/- your tinfoil hat level), whereas Google Translate was created and is owned solely by Google. I'm not saying GP is fully correct, but context is important.
I personally think using transformers for, well, transforming input into another language is going to be a great approach once hardware catches up for local offline use at a reasonable speed and hallucinations are minimized.
Corporate entities come and go. They bait-and-switch at will as they are ultimately only answering to legal obligations and in particular shareholders. It would be odd to overlay such a liability and uncertainty on top of wikipedia.
While abstraction is not the same as translation, if the wikipedia community wants specifically a translation service that is more tightly integrated into the platform imho it should be a fully open source project.
My point is that translation after the fact, by the end user, solves the problem. Now you can use Google Translate for free; later you can use your own LLM. Abstracting the knowledge away is wasted work. We already have it in a definitive source language (English for most things, local languages for local things).
This abstract Wikipedia sounds like Esperanto to me.
Translation solves the immediate problem of giving human users a glimpse of Wikipedia's knowledge base, but it is still strictly wrapped in textual data. It is still a content black box that, e.g., an LLM would not make more transparent.
Abstraction builds a mathematical representation. It's a new product, and it opens up new use cases that have nothing to do with translation. It may on occasion be more factually correct than a translation, or may be used in conjunction with translation, but it is potentially a far more flexible and versatile technology.
The challenge is really matching ambition and vision with resources and execution. Especially if it is to attract volunteers to crowdsource the enormous task, it needs a very clear and attractive onboarding ramp. The somewhat related Wikidata/Wikibase projects seem to have a reasonable fan base, so there is precedent.
Similar to abstracting maps and geography into GIS data and getting things like geographic proximity and POI-type filtering with lower overhead than creating a category tree for place articles in Wikipedia.
For instance, Wikipedia right now relies almost entirely on manual tagging (authored categories) for classifying related subjects. If you want a list of all notable association footballers, for instance, then the best way to get one is to go to Category:Association football players. But then you're stuck in a very human, flawed, and often in-flux attempt to reach a consensus definition of that, and the list remains out of reach. (Hell, American players are categorized as "soccer players" under the same tree, confounding things like search, because that's the kind of thing Americans do.)
With abstraction, you get classification for much less, and the consensus problem moves from an arbitrary, authored category tree to a much narrower space. If an article is about a footballer, the abstract data for that subject contains occupation Q937857 (association football player). The dialect and language don't matter — a footballer is a footballer. If you just want a list of footballers, you can get just a list of footballers without even going near things like SPARQL: https://www.wikidata.org/w/index.php?title=Special:WhatLinks...
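(And for anyone who does want the query form, a hypothetical sketch against Wikidata's public SPARQL endpoint would be roughly the following; P106 is Wikidata's "occupation" property, but verify the IDs before relying on this.)

    import requests

    QUERY = """
    SELECT ?player ?playerLabel WHERE {
      ?player wdt:P106 wd:Q937857 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    LIMIT 20
    """

    # Query the public Wikidata Query Service and print the first 20 labels.
    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "hn-example-script/0.1"},
    )
    for row in resp.json()["results"]["bindings"]:
        print(row["playerLabel"]["value"])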
You might well be right. Furthermore, English is on its way to becoming the universal language everyone speaks. You are, however, wrong to compare AW to translators, which are probabilistic algorithms, whereas AW is intended to be as exact as Wolfram Alpha. AW should also be able to use Wikidata to generate unique articles that do not exist even in English.
BTW, translation tech is not as good as you paint it here. I regularly translate my English blog posts to Slovak and every blog post requires 20-30 corrections. DeepL is marginally better than Google Translate. GPT-4 cannot even get word inflection right, an embarrassing fail for such a large model.
Wow. This feels like someone has taken a Borges parody and run with it:
> What is the scope of the new "Wikipedia of functions"?
> [...] Vrandečić explained the concept of Abstract Wikipedia and a "wiki for functions" using an example describing political happenings involving San Francisco mayor London Breed:
> "Instead of saying "in order to deny her the advantage of the incumbent, the board votes in January 2018 to replace her with Mark Farrell as interim mayor until the special elections", imagine we say something more abstract such as elect(elector: Board of Supervisors, electee: Mark Farrell, position: Mayor of San Francisco, reason: deny(advantage of incumbency, London Breed)) – and even more, all of these would be language-independent identifiers, so that thing would actually look more like Q40231(Q3658756, Q6767574, Q1343202(Q6015536, Q6669880)).
> [...] We still need to translate [this] abstract content to natural language. So we would need to know that the elect constructor mentioned above takes the three parameters in the example, and that we need to make a template such as {elector} elected {electee} to {position} in order to {reason} (something that looks much easier in this example than it is for most other cases). And since the creation of such translators has to be made for every supported language, we need to have a place to create such translators so that a community can do it.
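Concretely, the mechanism being quoted boils down to something like the following hypothetical sketch (the function and template are invented for illustration and are not actual Wikifunctions code):

    # A constructor is rendered per language via a template. A real renderer
    # would also need grammatical machinery (agreement, inflection, word
    # order), which is exactly the part the quote admits is hard.
    def render_elect(elector, electee, position, reason, lang="en"):
        templates = {
            "en": "{elector} elected {electee} to {position} in order to {reason}",
        }
        return templates[lang].format(
            elector=elector, electee=electee, position=position, reason=reason
        )

    print(render_elect(
        "The Board of Supervisors",
        "Mark Farrell",
        "interim Mayor of San Francisco",
        "deny London Breed the advantage of incumbency",
    ))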
I'm not sure I'm smart enough to decide if this is all really stupid or not. If I had to summarize my feelings it would probably be along the lines of Q6767574, (Q6015536, Q654880), Q65660.
THAT's the reason? Conveying a sentence as a series of propositions or a tree with case labels has been tried in the previous century, without success. It does not offer a good basis for translation, as e.g. Philips' Rosetta project showed. It works for simple cases, but as soon as the text becomes more complex, it runs into all the horrible little details that make up language.
A simple example: in Spanish you don't say "I like X" but "X pleases me". In Dutch you say, "I find X tasty" or "X is good" or something else entirely, depending on what X is. Those are three fairly close languages. How can you encode that simple sentence in such a way that it translates properly for all languages, now and in the future?
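To put that in the same constructor style as the quoted example: an abstract like(experiencer: I, stimulus: X) would need a renderer per language that knows these lexical quirks, e.g.

    like(experiencer: I, stimulus: X)
      en: "I like X"
      es: "Me gusta X"          (literally "X pleases me")
      nl: "Ik vind X lekker"    (only when X is something you eat)

and the Dutch choice even depends on what X is, which the abstract form would somehow have to anticipate.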
Symbolic representation isn't going to cut it outside a very narrow subset of language. It might work for highly technical, unambiguous, simple content, but not in general. Whatever you think of ChatGPT, it shows that a neural network can't be beaten for linguistic representation.
> It might work for highly technical, unambiguous, simple content
I mean, the goal is basically Wikipedia lite - so they are targeting technical, unambiguous, simple content.
My understanding is that the goal is to target small languages where it is unlikely anyone is ever going to put in the effort (or have a big enough corpus) to do statistical translation. Sort of a "this will be better than nothing" approach.
The original paper [0] envisages a much wider scope. Vrandečić literally quotes "a world in which every single human being can freely share in the sum of all knowledge".
It also makes the task of the editor much, much more difficult than it is now.
RIST 9E03 is the RIST that RIST 11A4 denotes by the arbitrarily chosen bit-pattern that, construed as an integer, is 9E03 (in hexadecimal notation). Click here for more about the system of bit-pattern designators used by RIST 11A4 to replace the obsolescent nomenclature systems of "natural languages." Click here if you would like the designator RIST 9E03 to be automatically replaced by a conventional designator (name) as you browse this web site.
Click.
From now on, the expression RIST 9E03 will be replaced by the expression Andrew Loeb. Warning: we consider such nomenclature fundamentally invalid, and do not recommend its use, but have provided it as a service to first-time visitors to this Web site who are not accustomed to thinking in terms of RISTs.
... Click.
RIST stands for Relatively Independent Sub-Totality.
... Click.
A hive mind is a social organization of RISTs that are capable of processing semantic memes ("thinking"). These could be either carbon-based or silicon-based. RISTs who enter a hive mind surrender their independent identities (which are mere illusions anyway). For purposes of convenience, the constituents of the hive mind are assigned bit-pattern designators.
Click.
A bit-pattern designator is a random series of bits used to uniquely identify a RIST.
Vrandečić was Google's consultant on the old Freebase's RDF export. Wikidata, which he helped create, succeeded it. It's the same people pushing the same solution under different names.
Google.org donated money and staffing to support the Abstract Wikipedia project. Two of the seven Google.org fellows who were supporting the Abstract Wikipedia team are permanently based in Zurich[1], and Google was able to provide space to meet. It was the most practical place to hold an off-site.
Do you think they should have to embrace austerity because they’ve asked for donations? Or do you think they can use donations in lieu of advertising dollars and otherwise function like any other similar company? Do you think it’s possible they were invited by google.org or received donations for the off-site itself?
I guess I’m not sure why this is remotely worth commenting on, but it seems to have struck a nerve. It’s like being upset that NPR takes donations but then gives its staff 15 minutes off to watch a tiny desk concert sometimes.
>Do you think they should have to embrace austerity because they’ve asked for donations?
"Not embracing austerity" is one thing, "asking for donations" is another thing, "what Wikimedia currently does" is something completely different from these two things.
When you get a banner featuring Jimmy Wales with the words "Please read: A personal appeal from Wikipedia founder Jimmy Wales" and then something like this:
>To all our readers in the UK,
>Please don’t scroll past this. This Friday, for the 1st time recently, we interrupt your reading to humbly ask you to support Wikipedia’s independence. Only 2% of our readers give. Many think they’ll give later, but then forget. If you donate just £2, or whatever you can this Friday, Wikipedia could keep thriving for years.
The impression is that Wikipedia (NOT Wikimedia) is in need of money to keep operating, which is simply not true.
Wikipedia has got more than enough money to keep operating. If Wikipedia, ever in our lifetimes, goes under, it won't be because they weren't given enough money but because they mishandled it.
It's like having a beggar come to you saying that he needs to eat, then seeing him 20 minutes later driving a Porsche. I consider this to be abhorrent behavior. I donated once and will NEVER, EVER do it again, and I advise nobody to do it. If you want to do a good deed, donate to the Internet Archive.
> if Wikipedia, ever in our lifetimes, goes under, it won't be because they weren't given enough money
I agree, I think it will be because they'll accept more money from commercial actors on the terms of whoever these actors are – Google currently does not seem to force any conditions on WP, as far as I can tell.
> If you want to do a good deed donate to the Internet Archive.
I agree with this as well but I consider both Wikimedia and the Internet Archive as extremely important.
Charitable causes are always at risk of "wasting" money. But the reason for that is that, in a purely capitalistic sense, the cause itself is not profitable.
The people whose business models benefit from this project's success will ensure it's staffed and funded. Your donations are emphatically not needed, nor will declining to donate do anything to slow it down.
Thanks, I haven't been donating to WP but did not know all of that either (apart from the Google employees who are mentioned and linked already in the conversations referenced in this thread).
1 million doesn't even seem like that much, but still, yes, nobody knows what donations actually mean behind the scenes (who believes in philanthropy anyway?)
I mean, it's clear how much Google and others profit from WP and especially Wikidata already.
Nah, about 3 months ago I made ChatGPT write a detailed hierarchical plan for how an AI could conquer the world. The plan was severely flawed, of course. You need way more than brains to conquer the world.
> It is technically possible today for patients in hospital to have their food delivered by robot, yet most people would have a strong aversion to that idea
That's a matter of personal preference. I prefer self-service checkout in the grocery store even if there's an unused staffed checkout. Generally, I prefer robots and AIs wherever they are available.