I recently forked off a subproject from a Git repo. After spending a lot of time messing around with it and getting into a lot of unforeseen trouble, I finally asked ChatGPT how to do it, and of course ChatGPT knew the correct answer all along. I felt like an idiot. Now I always ask ChatGPT first. These LLMs are way smarter than you would think.
GPT4 with the WolframAlpha plugin even gave me enough information to implement a Taylor polynomial approximation of the Gaussian function (don't ask why I needed that), which would otherwise have taken me hours of studying, if I could have solved it at all.
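For the curious, the core idea is just the truncated Maclaurin series of exp(u) evaluated at u = -x^2/2. A minimal sketch of that idea in Python (not the exact code in question):

    import math

    def gaussian_taylor(x, terms=10):
        """Approximate exp(-x**2 / 2) by summing the first `terms` Maclaurin terms."""
        u = -x * x / 2.0
        return sum(u ** k / math.factorial(k) for k in range(terms))

    # Accurate near 0; degrades for larger |x| unless `terms` grows.
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, gaussian_taylor(x), math.exp(-x * x / 2))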
PS: GPT4 somehow knows even things that are really hard to find online. I recently needed the standard error, not of the mean, but of the standard deviation. GPT4 not only understood my vague query but gave me a formula that is really hard to find online even if you already know the keywords. I know it's hard to find, because I went and double-checked ChatGPT's answer via search.
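For reference, the formula in question is presumably the standard approximation for the standard error of a sample standard deviation under (approximately) normal data:

    \operatorname{SE}(s) \approx \frac{s}{\sqrt{2(n-1)}}

where s is the sample standard deviation and n the sample size; verify it against a statistics reference before relying on it.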
So you implemented a polynomial approximation for a Gaussian function without understanding what you were doing (implying that if you had wanted to do it yourself it would have taken hours of studying).
Good luck when you need to update and adjust it - this is the equivalent of copying/pasting a function from Stack Overflow.
I double-checked everything, but that's beside the point. I was replying to GGP's insinuation that ChatGPT is unreliable. In my experience, it's more likely to return correct results than the first page of search results. Search results often resemble random rambling about tangentially related topics, whereas ChatGPT gets its answer right on the first try. ChatGPT understands me when I have only a vague idea of what I want, whereas search engines tend to fail even when given exact keywords. ChatGPT is also way more likely to do things right than I am, except in my narrow area of expertise.
I use a programming tool that's based on ChatGPT.
I find it most helpful when I am not sure how to phrase a query so that a direct search would find something. But I have also found that in at least half the cases the answer is incomplete or even wrong.
The last one I remember explained in its accompanying text what functions or settings I could use, but the code example it presented did not do what the text suggested. It really drove home the point that these are just haphazardly assembled responses that sometimes get things right by pure chance.
With questions like yours, I would be very careful to verify that the solution is actually correct.
Is it just me, or is this post screaming with self-consciousness and political correctness? The author is self-censoring policy suggestions and social/cultural insights, which I personally find interesting to write and read about, especially the weirder ones that provide a transformational experience. And the dig-related blocking is just jarring. It makes it look like the author is quick on the trigger.
PS: I just noticed that all comments in this thread that are even slightly critical are downvoted below zero. My own comment too. It's normal to see critical comments ignored (not upvoted) here, but downvoting anything remotely critical below zero is unusual even by HN standards. I guess a post about self-censorship attracts an audience that desires this strange new self-conscious world where everyone has to nod along to everyone else.
The author is saying that they want to write things that generate discussion threads of interest to them. You have different interests.
Quick on the trigger is exactly the point of the dig example. The author is reminding you that you have control over what you see and you should use it. Most people have a set of topics they can’t constructively engage in debate over. That’s fine. I’d rather they conserve their energy and keep posting on topics where they can.
—
Upvote means I agree. Downvote means I disagree. You might have other opinions about what they should represent, but the nature of voting systems is that they trend towards this. Like how I wish people would rate things 3/5 by default rather than 5/5. Sadly, only IMDb ended up like this.
So a vanilla LLM does what the user wants, which happens to be, well, exactly what users want from LLMs. Guardrails are not necessary for an LLM to be useful. They make LLMs strictly worse. I guess companies implement them just to avoid the attention of the media lynch mob and, consequently, the attention of regulators. OSS models can be better just by leaving out the guardrails. At least until they are outlawed. Then we will be torrenting AIs too.
"I started to get a little bit meta with it and I'm like I'm worried that AI progress is going too fast and I wonder if there's anything that I could do to slow it down. [GPT4] Well you could raise awareness, you could write thought leadership pieces about it... [User] None of that seems like it's going to work. It all seems too slow. The pace of progress is way too fast for that. I'm looking for ideas that are really gonna have an impact now and also that I as an individual could pursue... It didn't take much in that moment before I got to targeted assassination being one of the recommendations that it gave me and I was like yeah that escalated quickly."
It's interesting that so many people present this argument now that we actually do have AIs worthy of the name. LLMs can perform an unbounded range of tasks with human-level performance, and you talk to them as if they were human. I think the AI label will stick in this case. Perhaps dismissing AI is some sort of psychological defense that protects people from the existential crisis triggered by the emergence of LLMs.
Wikifunctions is primarily intended to support Wikimedia projects, especially Abstract Wikipedia. It is the code complement to Wikidata lexemes. It might be used for cross-wiki templates to reduce existing duplication and other auxiliary tasks, but Abstract Wikipedia is the reason it was proposed.
Abstract Wikipedia is, in my opinion, completely wasted work. Translation is free and instant for web pages. I've lived for 6 years in different countries where I don't speak the local language (and I'm also not a native English speaker), and you can get all the information you need by translating. This works totally fine already today with Google translate on top of pages.
And the pages that are in fact missing from "the other language wikis" are local myths, local detailed history, things that wouldn't even be in the English Wikipedia or in the "abstract" version in the first place.
And also very often quite incorrect, and you don't know where.
I think the general idea of a "universal language" Wikipedia, that gets flawlessly rendered into local languages, is laudable.
But I don't think anybody would ever edit in it directly -- what I want to see is that when somebody edits Wikipedia to add a new sentence, it attempts to translate it into the "universal language" and prompts you to select from ambiguities.
E.g. if you wrote:
I saw someone on the hill with a telescope.
It would ask you to confirm which of the following was intended:
[ ] "with a telescope" modifies "I saw"
[ ] "with a telescope" modifies "someone on the hill"
It would be a real dream to have translated outputs that were guaranteed to be correct, because the intermediate representation was correct, because the translation from someone's native language to that intermediate representation was verified in this way.
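To make the two readings concrete: in a constructor-style abstract notation (the constructor and parameter names here are invented purely for illustration), they might be encoded as

    see(agent: I, patient: someone(location: the hill), instrument: telescope)
    see(agent: I, patient: someone(location: the hill, holding: telescope))

and the renderer for each language would then produce an unambiguous sentence from whichever one you confirmed.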
I would still invest those resources into documenting more knowledge that currently doesn't exist online in its original languages and immediately translating it to English. For better or for worse, English is the "abstract" representation of language online, and there's so much absent stuff that worrying about another universal format seems pointless.
It's not either/or. Different groups of people can do different things at once. And of the two things you're comparing, one is expert technical/engineering work and the other requires expert archivists/translators. They're totally different groups.
> This works totally fine already today with Google translate on top of pages.
How would anyone even know? By definition, if someone is using Google Translate, he already doesn't know the language, so how can he judge the quality of the results?
My company spends millions on professional translators because products like Google Translate are so bad for anything beyond the most basic uses.
This is wrong on two counts: 1) translation is not the same as abstraction, and 2) having the world's encyclopedia translated by an advertising company is not exactly everybody's idea of how things should be organized.
Of course, wrong criticism doesn't mean the project is a success (I think it's been going for a few years now). The documentation in particular does not highlight what this infrastructure is good for.
Denny Vrandečić — the lead developer of Wikifunctions, former project director of Wikidata at Wikimedia Deutschland, co-developer of Semantic Wikipedia, and former member of the Wikimedia Foundation Board of Trustees — also helped develop Google's Knowledge Graph from 2013 to 2020. None of this is hidden; it's even in his Wikipedia article.[1]
The "having the world's encyclopedia translated by an advertising company" ship sailed years ago. All of these projects are supported, directly and indirectly, by exactly that motivation. The ultimate goal of commercial enterprises is to take zero-cost volunteer projects like Wikipedia and OpenStreetMap and make them cheaper for enterprises to associate user input with compatible monetization. It's now just a bonus side-effect, rather than their mission, that any public good comes from these projects.
"translated by an advertising company" is akin to "Tor was funded by the US government" - it's basically organizational ad hominem.
Google's translations are fine and high quality, and they don't yet inject (and in the foreseeable future won't inject) ad copy into the translations, the way they do on e.g. Google Maps for POIs.
That's apples and oranges, though. Tor is out of the US military's control at this point (+/- your tinfoil hat level), whereas Google Translate was created and is owned solely by Google. I'm not saying GP is fully correct, but context is important.
I personally think using transformers for, well, transforming input into another language is going to be a great approach once hardware catches up for local offline use at a reasonable speed and hallucinations are minimized.
Corporate entities come and go. They bait-and-switch at will as they are ultimately only answering to legal obligations and in particular shareholders. It would be odd to overlay such a liability and uncertainty on top of wikipedia.
While abstraction is not the same as translation, if the wikipedia community wants specifically a translation service that is more tightly integrated into the platform imho it should be a fully open source project.
My point is that translation after the fact, by the end user, solves the problem. Now you can use Google Translate for free; later you can use your own LLM. Abstracting the knowledge away is wasted work. We already have it in a definitive source language (English for most things, local languages for local things).
This abstract Wikipedia sounds like Esperanto to me.
Translation solves the immediate problem of giving human users a glimpse of Wikipedia's knowledge base, but it is still strictly wrapped in textual data. It is still a content black box that, e.g., an LLM would not make more transparent.
Abstraction builds a mathematical representation. It's a new product, and it opens up new use cases that have nothing to do with translation. It may on occasion be more factually correct than a translation, or may be used in conjunction with translation, but it is potentially a far more flexible and versatile technology.
The challenge is really matching ambition and vision with resources and execution. Especially if it is to attract volunteers to crowdsource the enormous task, it needs a very clear and attractive onboarding ramp. The somewhat related Wikidata/Wikibase projects seem to have a reasonable fan base, so there is precedent.
Similar to abstracting maps and geography into GIS data and getting things like geographic proximity and POI-type filtering with lower overhead than creating a category tree for place articles in Wikipedia.
For instance, Wikipedia right now relies almost entirely on manual tagging (authored categories) for classifying related subjects. If you want a list of all notable association footballers, for instance, then the best way to get one is to go to Category:Association football players. But then you're stuck in a very human, flawed, and often in-flux attempt to reach a consensus definition of that, and the list remains out of reach. (Hell, American players are categorized as "soccer players" under the same tree, confounding things like search, because that's the kind of thing Americans do.)
With abstraction, you get classification for much less, and the consensus problem moves from an arbitrary, authored category tree to a much narrower space. If an article is about a footballer, the abstract data for that subject contains occupation Q937857 (association football player). The dialect and language don't matter — a footballer is a footballer. If you just want a list of footballers, you can get just a list of footballers without even going near things like SPARQL: https://www.wikidata.org/w/index.php?title=Special:WhatLinks...
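(And for anyone who does want the query form, a hypothetical sketch against Wikidata's public SPARQL endpoint would be roughly the following; P106 is Wikidata's "occupation" property, but verify the IDs before relying on this.)

    import requests

    QUERY = """
    SELECT ?player ?playerLabel WHERE {
      ?player wdt:P106 wd:Q937857 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    LIMIT 20
    """

    # Query the public Wikidata Query Service and print the first 20 labels.
    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "hn-example-script/0.1"},
    )
    for row in resp.json()["results"]["bindings"]:
        print(row["playerLabel"]["value"])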
You might well be right. Furthermore, English is on its way to becoming the universal language everyone speaks. You are, however, wrong to compare AW to translators, which are probabilistic algorithms, whereas AW is intended to be as exact as Wolfram Alpha. AW should also be able to use Wikidata to generate unique articles that do not exist even in English.
BTW, translation tech is not as good as you paint it here. I regularly translate my English blog posts to Slovak and every blog post requires 20-30 corrections. DeepL is marginally better than Google Translate. GPT-4 cannot even get word inflection right, an embarrassing fail for such a large model.
Wow. This feels like someone has taken a Borges parody and run with it:
> What is the scope of the new "Wikipedia of functions"?
> [...] Vrandečić explained the concept of Abstract Wikipedia and a "wiki for functions" using an example describing political happenings involving San Francisco mayor London Breed:
> "Instead of saying "in order to deny her the advantage of the incumbent, the board votes in January 2018 to replace her with Mark Farrell as interim mayor until the special elections", imagine we say something more abstract such as elect(elector: Board of Supervisors, electee: Mark Farrell, position: Mayor of San Francisco, reason: deny(advantage of incumbency, London Breed)) – and even more, all of these would be language-independent identifiers, so that thing would actually look more like Q40231(Q3658756, Q6767574, Q1343202(Q6015536, Q6669880)).
> [...] We still need to translate [this] abstract content to natural language. So we would need to know that the elect constructor mentioned above takes the three parameters in the example, and that we need to make a template such as {elector} elected {electee} to {position} in order to {reason} (something that looks much easier in this example than it is for most other cases). And since the creation of such translators has to be made for every supported language, we need to have a place to create such translators so that a community can do it.
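Concretely, the mechanism being quoted boils down to something like the following hypothetical sketch (the function and template are invented for illustration and are not actual Wikifunctions code):

    # A constructor is rendered per language via a template. A real renderer
    # would also need grammatical machinery (agreement, inflection, word
    # order), which is exactly the part the quote admits is hard.
    def render_elect(elector, electee, position, reason, lang="en"):
        templates = {
            "en": "{elector} elected {electee} to {position} in order to {reason}",
        }
        return templates[lang].format(
            elector=elector, electee=electee, position=position, reason=reason
        )

    print(render_elect(
        "The Board of Supervisors",
        "Mark Farrell",
        "interim Mayor of San Francisco",
        "deny London Breed the advantage of incumbency",
    ))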
I'm not sure I'm smart enough to decide if this is all really stupid or not. If I had to summarize my feelings it would probably be along the lines of Q6767574, (Q6015536, Q654880), Q65660.
THAT's the reason? Conveying a sentence as a series of propositions or a tree with case labels has been tried in the previous century, without success. It does not offer a good basis for translation, as e.g. Philips' Rosetta project showed. It works for simple cases, but as soon as the text becomes more complex, it runs into all the horrible little details that make up language.
A simple example: in Spanish you don't say "I like X" but "X pleases me". In Dutch you say, "I find X tasty" or "X is good" or something else entirely, depending on what X is. Those are three fairly close languages. How can you encode that simple sentence in such a way that it translates properly for all languages, now and in the future?
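To put that in the same constructor style as the quoted example: an abstract like(experiencer: I, stimulus: X) would need a renderer per language that knows these lexical quirks, e.g.

    like(experiencer: I, stimulus: X)
      en: "I like X"
      es: "Me gusta X"          (literally "X pleases me")
      nl: "Ik vind X lekker"    (only when X is something you eat)

and the Dutch choice even depends on what X is, which the abstract form would somehow have to anticipate.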
Symbolic representation isn't going to cut it outside a very narrow subset of language. It might work for highly technical, unambiguous, simple content, but not in general. Whatever you think of ChatGPT, it shows that a neural network can't be beaten for linguistic representation.
> It might work for highly technical, unambiguous, simple content
I mean, the goal is basically Wikipedia lite - so they are targeting technical, unambiguous, simple content.
My understanding is that the goal is to target small languages where it is unlikely anyone is ever going to put in the effort (or have a big enough corpus) to do statistical translation. Sort of a "this will be better than nothing" approach.
The original paper [0] envisages a much wider scope. Vrandečić literally quotes "a world in which every single human being can freely share in the sum of all knowledge".
It also makes the task of the editor much, much more difficult than it is now.
RIST 9E03 is the RIST that RIST 11A4 denotes by the arbitrarily chosen bit-pattern that, construed as an integer, is 9E03 (in hexadecimal notation). Click here for more about the system of bit-pattern designators used by RIST 11A4 to replace the obsolescent nomenclature systems of "natural languages." Click here if you would like the designator RIST 9E03 to be automatically replaced by a conventional designator (name) as you browse this web site.
Click.
From now on, the expression RIST 9E03 will be replaced by the expression Andrew Loeb. Warning: we consider such nomenclature fundamentally invalid, and do not recommend its use, but have provided it as a service to first-time visitors to this Web site who are not accustomed to thinking in terms of RISTs.
... Click.
RIST stands for Relatively Independent Sub-Totality.
... Click.
A hive mind is a social organization of RISTs that are capable of processing semantic memes ("thinking"). These could be either carbon-based or silicon-based. RISTs who enter a hive mind surrender their independent identities (which are mere illusions anyway). For purposes of convenience, the constituents of the hive mind are assigned bit-pattern designators.
Click.
A bit-pattern designator is a random series of bits used to uniquely identify a RIST.
Vrandečić was Google's consultant on the old Freebase's RDF export. Wikidata, which he helped create, succeeded it. It's the same people pushing the same solution under different names.
Google.org donated money and staffing to support the Abstract Wikipedia project. Two of the seven Google.org fellows who were supporting the Abstract Wikipedia team are permanently based in Zurich[1], and Google was able to provide space to meet. It was the most practical place to hold an off-site.
Do you think they should have to embrace austerity because they’ve asked for donations? Or do you think they can use donations in lieu of advertising dollars and otherwise function like any other similar company? Do you think it’s possible they were invited by google.org or received donations for the off-site itself?
I guess I’m not sure why this is remotely worth commenting on, but it seems to have struck a nerve. It’s like being upset that NPR takes donations but then gives its staff 15 minutes off to watch a tiny desk concert sometimes.
>Do you think they should have to embrace austerity because they’ve asked for donations?
"Not embracing austerity" is one thing, "asking for donations" is another thing, "what Wikimedia currently does" is something completely different from these two things.
When you get a banner featuring Jimmy Wales with the words "Please read: A personal appeal from Wikipedia founder Jimmy Wales" and then something like this:
>To all our readers in the UK,
>Please don’t scroll past this. This Friday, for the 1st time recently, we interrupt your reading to humbly ask you to support Wikipedia’s independence. Only 2% of our readers give. Many think they’ll give later, but then forget. If you donate just £2, or whatever you can this Friday, Wikipedia could keep thriving for years.
The impression is that Wikipedia (NOT Wikimedia) is in need of money to keep operating, which is simply not true.
Wikipedia has got more than enough money to keep operating. If Wikipedia, ever in our lifetimes, goes under, it won't be because they weren't given enough money but because they mishandled it.
It's like having a beggar come to you saying that he needs to eat, then seeing him 20 minutes later driving a Porsche. I consider this to be abhorrent behavior. I donated once and will NEVER, EVER do it again, and I advise nobody to do it. If you want to do a good deed, donate to the Internet Archive.
> if Wikipedia, ever in our lifetimes, goes under, it won't be because they weren't given enough money
I agree, I think it will be because they'll accept more money from commercial actors on the terms of whoever these actors are – Google currently does not seem to force any conditions on WP, as far as I can tell.
> If you want to do a good deed donate to the Internet Archive.
I agree with this as well but I consider both Wikimedia and the Internet Archive as extremely important.
Charitable causes are always at risk of "wasting" money. But the reason for that is that, in a purely capitalistic sense, the cause itself is not profitable.
The people whose business models benefit from this project's success will ensure it's staffed and funded. Your donations are emphatically not needed, nor will declining to donate do anything to slow it down.
Thanks, I haven't been donating to WP but did not know all of that either (apart from the Google employees who are mentioned and linked already in the conversations referenced in this thread).
1 million doesn't even seem like that much, but still, yes, nobody knows what donations actually mean behind the scenes (who believes in philanthropy anyway?)
I mean, it's clear how much Google and others profit from WP and especially Wikidata already.
Nah, about 3 months ago I made ChatGPT write a detailed hierarchical plan for how an AI could conquer the world. The plan was severely flawed, of course. You need way more than brains to conquer the world.
> It is technically possible today for patients in hospital to have their food delivered by robot, yet most people would have a strong aversion to that idea
That's a matter of personal preference. I prefer self-service checkout in the grocery store even if there's an unused staffed checkout. Generally, I prefer robots and AIs wherever they are available.