
Thread: We are in the AI Singularity

  1. #241



  3. #242
    Quote Originally Posted by Occam's Banana View Post
    AI’s Dirty Little Secret
    https://www.youtube.com/watch?v=QO5plxqu_Yw
    {Sabine Hossenfelder | 04 June 2024}

    There’s a lot of talk about artificial intelligence these days, but what I find most interesting about AI no one ever talks about. It’s that we have no idea why they work as well as they do. I find this a very interesting problem because I think if we figure it out it’ll also tell us something about how the human brain works. Let’s have a look.

    The irony of this is that the solution to this problem is actually known. It's called Minimum Description Length (yes, I'm omitting tons of technical nuances). I've been watching the academic establishment ignore this entire field of study (algorithmic information theory, AIT) for 20+ years, and I still have yet to understand why it is so studiously ignored. Must be politics.

    Anyway, the key idea of MDL is this. First, we imagine an absolutely minimal model of the data. "Minimal model", here, refers to the Kolmogorov-complexity, which is the length of the shortest program that exactly outputs the training data, and halts. For the purpose of ML, such a model is overfitted, by definition. It just exactly compresses the training data but will be useless for any new data. Once we have this minimal model, however, we can then add something like an error-correcting code. Rather than directly minimizing the length of the program that outputs the training data, we minimize the size of the program that produces a code that outputs the data from a minimal seed, to within a specified error-margin. Once we have this in place, we can then meaningfully separate the "model" (the code) from the "noise" (the seed that encodes the training data, with the given code). In short, the MDL system not only maximally compresses the training data, but it also automatically extracts the "signal" (model) in the training-data from the "noise". This is the essence of generalization, since what it means to generalize, is to form a meaningful model, then extrapolate from that model. MDL automates this in an elegant mathematical framework, and illuminates exactly what we mean by generalization, in a very general sense.
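    To make the two-part idea concrete, here's a toy sketch in Python (my own illustration, not the MDL formalism itself: the polynomial model class and the crude Gaussian code for the residuals are stand-ins). The score is "bits to describe the model" plus "bits to describe the data given the model", and minimizing the sum picks a middling model complexity rather than the perfectly overfitted one:

    import numpy as np

    def two_part_cost(x, y, degree, bits_per_param=32):
        # "Model" part: a fitted polynomial, costed crudely per coefficient
        coeffs = np.polyfit(x, y, degree)
        model_bits = bits_per_param * (degree + 1)
        # "Noise" part: the residuals, costed with a crude Gaussian code
        resid = y - np.polyval(coeffs, x)
        noise_bits = 0.5 * len(y) * np.log2(2 * np.pi * np.e * (resid.var() + 1e-9))
        return model_bits + noise_bits

    x = np.linspace(0, 1, 200)
    y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(200)
    best = min(range(1, 13), key=lambda d: two_part_cost(x, y, d))
    print(best)  # a moderate degree wins: low degrees under-fit, high degrees pay for extra coefficients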

    There is a catch: MDL is uncomputable (because K-complexity is uncomputable). What this means is that you cannot build a real system that does theoretical (pure) MDL. However, there are computable approximations of MDL, that is, there are ways to get a lot of the benefits of pure MDL even though the theoretical limits of pure MDL cannot be achieved in practice. MDL has other theoretical limitations I'm skipping over here.
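    (For a feel of what a computable approximation looks like: the size of the output of an ordinary compressor is a standard, crude stand-in for description length. This is just an illustration of the idea, not any particular MDL method.)

    import os, zlib

    structured = b"abab" * 4096        # highly regular data
    noise = os.urandom(16384)          # data with no structure to find

    print(len(zlib.compress(structured, 9)))  # tiny compared to 16384
    print(len(zlib.compress(noise, 9)))       # roughly 16384: incompressible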

    Minsky said of the field of AIT:

    It seems to me that the most important discovery since Gödel was the discovery by Chaitin, Solomonoff and Kolmogorov of the concept called Algorithmic Probability which is a fundamental new theory of how to make predictions given a collection of experiences and this is a beautiful theory, everybody should learn it, but it’s got one problem, that is, that you cannot actually calculate what this theory predicts because it is too hard, it requires an infinite amount of work. However, it should be possible to make practical approximations to the Chaitin, Kolmogorov, Solomonoff theory that would make better predictions than anything we have today. Everybody should learn all about that and spend the rest of their lives working on it.

    — Panel discussion on The Limits of Understanding, World Science Festival, NYC, Dec 14, 2014

  4. #243

  5. #244
    Here's a concrete example of the nuts-and-bolts of how AI is making us collectively retarded. I just happened to be reading this article in order to understand Linux a little better but the author opens with a beautiful demonstration of just how stupid modern AI really is, and how it's corrupting the very fabric of human knowledge itself. I foresee the coming of "AI free" sub-communities where all forms of information that could be wholly or even partly sourced from AI are actively excluded from the community so that the information flowing between community members is genuine information and not some neural-net's DeepDream hallucinations. See the highlighted quote below for why I am convinced that Google is in full-blown, double-bird-flip, F-you-and-the-horse-you-rode-in-on mode. Google consistently ranks the absolute worst in my own subjective perception of search results. They're all worse than they were before AI, but Google is the worst-of-the-worst...

    What is PID 0?

    I get nerd-sniped a lot. People offhandedly ask something innocent, and I lose the next several hours (or in this case, days) comprehensively figuring out the answer. Usually this ends up in a rant thread on mastodon or in some private chat group or other. But for once I have the energy to write one up for the blog.

    Today’s innocent question:

    Is there a reason UIDs start at 0 but PIDs start at 1?

    The very short version: Unix PIDs do start at 0! PID 0 just isn’t shown to userspace through traditional APIs. PID 0 starts the kernel, then retires to a quiet life of helping a bit with process scheduling and power management. Also the entire web is mostly wrong about PID 0, because of one sentence on Wikipedia from 16 years ago.

    There’s a slightly longer short version right at the end, or you can stick with me for the extremely long middle bit!

    But surely you could just google what PID 0 is, right? Why am I even publishing this?
    The internet is wrong

    At time of writing, if you go ask the web about PID 0, you’ll get a mix of incorrect and misleading information, and almost no correct answers.

    After figuring out the truth, I asked Google, Bing, DuckDuckGo and Kagi what PID 0 is on linux. I looked through the top 20 results for each, as well as whatever knowledge boxes and AI word salads they organically gave me. That’s 2 pages of results on Google, for reference.

    All of them failed to produce a fully correct answer. Most had a single partially correct answer somewhere in the first 20 results, but never near the top or showcased. DDG did best, with the partially correct answer at number 4. Google did the worst, no correct answer at all. And in any case, the incorrect answers were so prevalent and consistent with each other that you wouldn’t believe the one correct site anyway.

    The top-2 results on all engines were identical, interestingly: a stackoverflow answer that is wrong, and a spammy looking site that seems to have embraced LLM slop, because partway through failing to explain PID 0 it randomly shifts to talking about PID loops, from control system theory, before snapping out of it a paragraph later and going back to Unix PIDs.

    Going directly to the source of the LLM slop fared slightly better, on account of them having stolen from books as well as the web, but they still make $#@! up in the usual amount. I was able to get a correct answer though, using the classic prompting technique of already knowing the answer and retrying until I got good RNG.

    ...
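    As a quick sanity-check of the article's claim (my own aside, not from the article): on a Linux box, PID 1's recorded parent really is PID 0.

    # Read init's stat line; the field after the parenthesized comm is the state,
    # and the one after that is the parent PID.
    with open("/proc/1/stat") as f:
        stat = f.read()
    fields = stat.rsplit(")", 1)[1].split()
    print("PID 1 parent PID:", fields[1])   # prints 0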

  6. #245


    I do think we're going to have humanoid robots in the very near future, and they are going to have at least ChatGPT+Sora capability, and probably quite a bit more than that. The irony is that these robots will be able to hand-write a novel doctoral thesis on the mating patterns of some arcane species of sub-Saharan Africa in the time it takes for their fingers to write it out on paper -- with typewriter-precision... but they won't have a thousandth of the physical reasoning capability of a cat, or even its general-purpose reasoning (abstractions, social abstractions, etc.). Such a robot will even be able to explain the gaps in its own understanding when prompted to, but it will not be able to actually remove those gaps in live practice. In other words, first-generation robots are going to be at least as dumb as the dumbest sci-fi robots, but they will simultaneously have "super-human IQ", which just shows how useless such measures are in terms of assessing real intelligence.

    All the building-blocks for a GOFAI general-purpose reasoning system are currently on the table. I place a 1-in-2 chance that those building-blocks will be assembled by somebody (probably OpenAI) within the next two years, at the outside. So, I think this is coming much faster than most people do, but I also think it doesn't matter as much as most people do. People seem to think that general-purpose reasoning is some kind of holy grail that, once achieved, will mark a transition to a new "era" where machines tell us what to do, rather than vice-versa. They will be used to tell us what to do, but it won't be because they're inherently smarter or better at anything. Even if they're really good at reasoning, so what. It doesn't really alter the ultimate issues at stake, which aren't really about intelligence. The unstated assumption in these discussions is that social ills are the result of a scarcity of intelligence. If we could just get our hands on more raw intelligence, finally, society would be able to solve its problems. But none of this is correct. Our problems are the result of systematic sabotage, and the very same people who are systematically sabotaging society are the people who are building the AI/robots. So, we're not going to solve anything with AI/robotics, we're just going to have even worse forms of the tyranny we already had before. But now with robots.

    The solution, if there is a solution, is to start by discarding all forms of AI hype. These machines are just electronic gizmos operating at high frequency, nothing more, nothing less. The sooner people get that into their heads, the better, because the worst-case scenario is that the global population attributes super-human "wisdom" to these smart-refrigerators on hydraulic-piston legs, turns to them for "wisdom", and starts implementing whatever they recommend as law. This is what the puppet-masters behind the AI/robots want to bring about. That's why I say that OpenAI is the worst-case AI safety scenario. Don't fall for the Wizard of Oz trick, people!

  7. #246
    h/t Anti-Globalist:


  8. #247
    THIS. So much THIS. GOFAI will have its day of revenge on LLMs!!!


  9. #248
    Singularity canceled?

    OpenAI CTO says models in labs not much better than what the public has already

    Link



  11. #249
    Must-watch:


  12. #250
    Is the Intelligence-Explosion Near? A Reality Check.
    https://www.youtube.com/watch?v=xm1B3Y3ypoE
    {Sabine Hossenfelder | 13 June 2024}

    I had a look at Leopold Aschenbrenner's recent (very long) essay about the supposedly near "intelligence explosion" in artificial intelligence development. I am not particularly convinced by his argument. You can read his essay here: https://situational-awareness.ai/


  13. #251
    Quote Originally Posted by Occam's Banana View Post
    Is the Intelligence-Explosion Near? A Reality Check.
    https://www.youtube.com/watch?v=xm1B3Y3ypoE
    {Sabine Hossenfelder | 13 June 2024}

    I had a look at Leopold Aschenbrenner's recent (very long) essay about the supposedly near "intelligence explosion" in artificial intelligence development. I am not particularly convinced by his argument. You can read his essay here: https://situational-awareness.ai/

    Here's a deep-dive on this essay for people who are interested. While Aschenbrenner is mostly just playing into the hype, he does raise a few valid points. Perhaps the biggest takeaway is that a lot more people need to start thinking about this issue, and thinking about it a lot harder. The public sadly confuses actual-AI with Hollywood-AI. Actual-AI is not Hollywood-AI (nor on a credible trajectory to it, yet), but that doesn't mean it's not dangerous. AIs don't think like us. If nation-states have already built strategic-scale AIs (and there is no reason to believe they haven't), the threats we are facing from AI are not like Skynet... they're a lot weirder than that.



    Already 4 years old, but everybody should watch this:



    Could Clown World be a symptom of a rogue AI, gone out-of-control?

  14. #252
    Truth is coming out. The guy even looks like a commie. How much clearer does this all have to be made?

    Link

  15. #253

  16. #254
    SOMEONE FINALLY SAID IT: ChatGPT is Bull$#@!

    Reproduced here:

    ChatGPT is bull$#@!

    Michael Townsen Hicks · James Humphries · Joe Slater
    © The Author(s) 2024

    Ethics and Information Technology (2024) 26:38
    https://doi.org/10.1007/s10676-024-09775-5

    Abstract

    Recently, there has been considerable interest in large language models:
    machine learning systems which produce humanlike text and dialogue.
    Applications of these systems have been plagued by persistent inaccuracies in
    their output; these are often called “AI hallucinations”. We argue that these
    falsehoods, and the overall activity of large language models, is better
    understood as bull$#@! in the sense explored by Frankfurt (On Bull$#@!,
    Princeton, 2005): the models are in an important way indifferent to the truth
    of their outputs. We distinguish two ways in which the models can be said to be
    bullshitters, and argue that they clearly meet at least one of these
    definitions. We further argue that describing AI misrepresentations as bull$#@!
    is both a more useful and more accurate way of predicting and discussing the
    behaviour of these systems.

    Keywords: Artificial intelligence · Large language models · LLMs · ChatGPT ·
    Bull$#@! · Frankfurt · Assertion

    Introduction

    Large language models (LLMs), programs which use reams of available text and
    probability calculations in order to create seemingly-human-produced writing,
    have become increasingly sophisticated and convincing over the last several
    years, to the point where some commentators suggest that we may now be
    approaching the creation of artificial general intelligence (see e.g. Knight,
    2023 and Sarkar, 2023). Alongside worries about the rise of Skynet and the use
    of LLMs such as ChatGPT to replace work that could and should be done by
    humans, one line of inquiry concerns what exactly these programs are up to: in
    particular, there is a question about the nature and meaning of the text
    produced, and of its connection to truth. In this paper, we argue against the
    view that when ChatGPT and the like produce false claims they are lying or even
    hallucinating, and in favour of the position that the activity they are engaged
    in is bullshitting, in the Frankfurtian sense (Frankfurt, 2002, 2005). Because
    these programs cannot themselves be concerned with truth, and because they are
    designed to produce text that looks truth-apt without any actual concern for
    truth, it seems appropriate to call their outputs bull$#@!.

    We think that this is worth paying attention to. Descriptions of new
    technology, including metaphorical ones, guide policymakers’ and the public’s
    understanding of new technology; they also inform applications of the new
    technology. They tell us what the technology is for and what it can be expected
    to do. Currently, false statements by ChatGPT and other large language models
    are described as “hallucinations”, which give policymakers and the public the
    idea that these systems are misrepresenting the world, and describing what they
    “see”. We argue that this is an inapt metaphor which will misinform the public,
    policymakers, and other interested parties.

    The structure of the paper is as follows: in the first section, we outline how
    ChatGPT and similar LLMs operate. Next, we consider the view that when they
    make factual errors, they are lying or hallucinating: that is, deliberately
    uttering falsehoods, or blamelessly uttering them on the basis of misleading
    input information. We argue that neither of these ways of thinking are
    accurate, insofar as both lying and hallucinating require some concern with the
    truth of their statements, whereas LLMs are simply not designed to accurately
    represent the way the world is, but rather to give the impression that this is
    what they’re doing. This, we suggest, is very close to at least one way that
    Frankfurt talks about bull$#@!. We draw a distinction between two sorts of
    bull$#@!, which we call ‘hard’ and ‘soft’ bull$#@!, where the former requires
    an active attempt to deceive the reader or listener as to the nature of the
    enterprise, and the latter only requires a lack of concern for truth. We argue
    that at minimum, the outputs of LLMs like ChatGPT are soft bull$#@!:
    bull$#@!–that is, speech or text produced without concern for its truth–that is
    produced without any intent to mislead the audience about the utterer’s
    attitude towards truth. We also suggest, more controversially, that ChatGPT may
    indeed produce hard bull$#@!: if we view it as having intentions (for example,
    in virtue of how it is designed), then the fact that it is designed to give the
    impression of concern for truth qualifies it as attempting to mislead the
    audience about its aims, goals, or agenda. So, with the caveat that the
    particular kind of bull$#@! ChatGPT outputs is dependent on particular views of
    mind or meaning, we conclude that it is appropriate to talk about
    ChatGPT-generated text as bull$#@!, and flag up why it matters that – rather
    than thinking of its untrue claims as lies or hallucinations – we call bull$#@!
    on ChatGPT.

    What is ChatGPT?

    Large language models are becoming increasingly good at carrying on convincing
    conversations. The most prominent large language model is OpenAI’s ChatGPT, so
    it’s the one we will focus on; however, what we say carries over to other
    neural network-based AI chatbots, including Google’s Bard chatbot,
    AnthropicAI’s Claude (claude.ai), and Meta’s LLaMa. Despite being merely
    complicated bits of software, these models are surprisingly human-like when
    discussing a wide variety of topics. Test it yourself: anyone can go to the
    OpenAI web interface and ask for a ream of text; typically, it produces text
    which is indistinguishable from that of your average English speaker or writer.
    The variety, length, and similarity to human-generated text that GPT-4 is
    capable of has convinced many commentators to think that this chatbot has
    finally cracked it: that this is real (as opposed to merely nominal) artificial
    intelligence, one step closer to a humanlike mind housed in a silicon brain.

    However, large language models, and other AI models like ChatGPT, are doing
    considerably less than what human brains do, and it is not clear whether they
    do what they do in the same way we do. The most obvious difference between an
    LLM and a human mind involves the goals of the system. Humans have a variety
    of goals and behaviours, most of which are extra-linguistic: we have basic
    physical desires, for things like food and sustenance; we have social goals and
    relationships; we have projects; and we create physical objects. Large language
    models simply aim to replicate human speech or writing. This means that their
    primary goal, insofar as they have one, is to produce human-like text. They do
    so by estimating the likelihood that a particular word will appear next, given
    the text that has come before. The machine does this by constructing a massive
    statistical model, one which is based on large amounts of text, mostly taken
    from the internet. This is done with relatively little input from human
    researchers or the designers of the system; rather, the model is designed by
    constructing a large number of nodes, which act as probability functions for a
    word to appear in a text given its context and the text that has come before
    it. Rather than putting in these probability functions by hand, researchers
    feed the system large amounts of text and train it by having it make next-word
    predictions about this training data. They then give it positive or negative
    feedback depending on whether it predicts correctly. Given enough text, the
    machine can construct a statistical model giving the likelihood of the next
    word in a block of text all by itself.
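    [To make the mechanism concrete, here is a toy next-word model, not from the paper: a bigram count table built from a scrap of text. Real LLMs use neural networks over much longer contexts, but the objective has the same shape: estimate how likely each word is to come next, given what came before.]

    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ate the fish".split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def next_word_probs(prev):
        total = sum(counts[prev].values())
        return {w: c / total for w, c in counts[prev].items()}

    print(next_word_probs("the"))   # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}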

    This model associates with each word a vector which locates it in a
    high-dimensional abstract space, near other words that occur in similar
    contexts and far from those which don’t. When producing text, it looks at the
    previous string of words and constructs a different vector, locating the word’s
    surroundings – its context – near those that occur in the context of similar
    words. We can think of these heuristically as representing the meaning of the
    word and the content of its context. But because these spaces are constructed
    using machine learning by repeated statistical analysis of large amounts of
    text, we can’t know what sorts of similarity are represented by the dimensions
    of this high-dimensional vector space. Hence we do not know how similar they
    are to what we think of as meaning or context. The model then takes these two
    vectors and produces a set of likelihoods for the next word; it selects and
    places one of the more likely ones—though not always the most likely. Allowing
    the model to choose randomly amongst the more likely words produces more
    creative and human-like text; the parameter which controls this is called the
    ‘temperature’ of the model and increasing the model’s temperature makes it both
    seem more creative and more likely to produce falsehoods. The system then
    repeats the process until it has a recognizable, complete-looking response to
    whatever prompt it has been given.
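    [Again a toy illustration, not from the paper: the "temperature" knob applied to made-up next-word scores. Low temperature concentrates probability on the most likely word; higher temperature flattens the distribution, which is why it makes both more "creative" and more false continuations likelier.]

    import numpy as np

    def next_word_distribution(scores, temperature):
        logits = np.array(scores) / temperature
        probs = np.exp(logits - logits.max())
        return probs / probs.sum()

    words = ["Paris", "Lyon", "Berlin", "a giraffe"]
    scores = [4.0, 2.0, 1.0, -2.0]      # made-up scores for the next word
    for t in (0.2, 1.0, 2.0):
        probs = next_word_distribution(scores, t)
        print(t, dict(zip(words, probs.round(3))))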

    Given this process, it’s not surprising that LLMs have a problem with the
    truth. Their goal is to provide a normal-seeming response to a prompt, not to
    convey information that is helpful to their interlocutor. Examples of this are
    already numerous, for instance, a lawyer recently prepared his brief using
    ChatGPT and discovered to his chagrin that most of the cited cases were not
    real (Weiser, 2023); as Judge P. Kevin Castel put it, ChatGPT produced a text
    filled with “bogus judicial decisions, with bogus quotes and bogus internal
    citations”. Similarly, when computer science researchers tested ChatGPT’s
    ability to assist in academic writing, they found that it was able to produce
    surprisingly comprehensive and sometimes even accurate text on biological
    subjects given the right prompts. But when asked to produce evidence for its
    claims, “it provided five references dating to the early 2000s. None of the
    provided paper titles existed, and all provided PubMed IDs (PMIDs) were of
    different unrelated papers” (Alkaissi and McFarland, 2023). These errors can
    “snowball”: when the language model is asked to provide evidence for or a
    deeper explanation of a false claim, it rarely checks itself; instead it
    confidently produces more false but normal-sounding claims (Zhang et al. 2023).
    The accuracy problem for LLMs and other generative AIs is often referred to as
    the problem of “AI hallucination”: the chatbot seems to be hallucinating
    sources and facts that don’t exist. These inaccuracies are referred to as
    “hallucinations” in both technical (OpenAI, 2023) and popular contexts (Weise &
    Metz, 2023).

    These errors are pretty minor if the only point of a chatbot is to mimic human
    speech or communication. But the companies designing and using these bots have
    grander plans: chatbots could replace Google or Bing searches with a more
    user-friendly conversational interface (Shah & Bender, 2022; Zhu et al., 2023),
    or assist doctors or therapists in medical contexts (Lysandrou, 2023). In these
    cases, accuracy is important and the errors represent a serious problem.

    One attempted solution is to hook the chatbot up to some sort of database,
    search engine, or computational program that can answer the questions that the
    LLM gets wrong (Zhu et al., 2023). Unfortunately, this doesn’t work very well
    either. For example, when ChatGPT is connected to Wolfram Alpha, a powerful
    piece of mathematical software, it improves moderately in answering simple
    mathematical questions. But it still regularly gets things wrong, especially
    for questions which require multi-stage thinking (Davis & Aaronson, 2023). And
    when connected to search engines or other databases, the models are still
    fairly likely to provide fake information unless they are given very specific
    instructions–and even then things aren’t perfect (Lysandrou, 2023). OpenAI has
    plans to rectify this by training the model to do step by step reasoning
    (Lightman et al., 2023) but this is quite resource-intensive, and there is
    reason to be doubtful that it will completely solve the problem—nor is it clear
    that the result will be a large language model, rather than some broader form
    of AI.

    Solutions such as connecting the LLM to a database don’t work because, if
    the models are trained on the database, then the words in the database affect
    the probability that the chatbot will add one or another word to the line of
    text it is generating. But this will only make it produce text similar to the
    text in the database; doing so will make it more likely that it reproduces the
    information in the database but by no means ensures that it will.

    On the other hand, the LLM can also be connected to the database by allowing it
    to consult the database, in a way similar to the way it consults or talks to
    its human interlocutors. In this way, it can use the outputs of the database as
    text which it responds to and builds on. Here’s one way this can work: when a
    human interlocutor asks the language model a question, it can then translate
    the question into a query for the database. Then, it takes the response of the
    database as an input and builds a text from it to provide back to the human
    questioner. But this can misfire too, as the chatbots might ask the database
    the wrong question, or misinterpret its answer (Davis & Aaronson, 2023). “GPT-4
    often struggles to formulate a problem in a way that Wolfram Alpha can accept
    or that produces useful output.” This is not unrelated to the fact that when
    the language model generates a query for the database or computational module,
    it does so in the same way it generates text for humans: by estimating the
    likelihood that some output “looks like” the kind of thing the database will
    correspond with.
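    [A schematic of the "consult the database" flow just described, not from the paper; llm_generate and database_lookup are trivial stand-ins rather than real APIs, included only to make the shape of the pipeline explicit.]

    def llm_generate(prompt: str) -> str:
        # Stand-in for a language-model call: returns plausible-looking text.
        return "[text that looks like a likely continuation of: " + prompt[:40] + "...]"

    def database_lookup(query: str) -> str:
        # Stand-in for the external database or computational module.
        return "[structured result for: " + query[:40] + "...]"

    def answer_with_database(question: str) -> str:
        # 1. The model writes the query the same way it writes anything else:
        #    by producing text that looks like a plausible query.
        query = llm_generate("Write a query that would answer: " + question)
        # 2. The external system answers the query (this step is not a language model).
        result = database_lookup(query)
        # 3. The model continues the conversation with the result as extra context.
        return llm_generate("Question: " + question + "\nDatabase says: " + result + "\nAnswer:")

    print(answer_with_database("What is the population of Glasgow?"))
    # The failure modes described above live in steps 1 and 3: the query can be malformed,
    # and the final answer can ignore or distort the result, because both steps only
    # optimize for plausible-looking text.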

    One might worry that these failed methods for improving the accuracy of
    chatbots are connected to the inapt metaphor of AI hallucinations. If the AI is
    misperceiving or hallucinating sources, one way to rectify this would be to put
    it in touch with real rather than hallucinated sources. But attempts to do so
    have failed.

    The problem here isn’t that large language models hallucinate, lie, or
    misrepresent the world in some way. It’s that they are not designed to
    represent the world at all; instead, they are designed to convey convincing
    lines of text. So when they are provided with a database of some sort, they use
    this, in one way or another, to make their responses more convincing. But they
    are not in any real way attempting to convey or transmit the information in the
    database. As Chirag Shah and Emily Bender put it: “Nothing in the design of
    language models (whose training task is to predict words given context) is
    actually designed to handle arithmetic, temporal reasoning, etc. To the extent
    that they sometimes get the right answer to such questions is only because they
    happened to synthesize relevant strings out of what was in their training data.
    No reasoning is involved […] Similarly, language models are prone to making
    stuff up […] because they are not designed to express some underlying set of
    information in natural language; they are only manipulating the form of
    language” (Shah & Bender, 2022). These models aren’t designed to transmit
    information, so we shouldn’t be too surprised when their assertions turn out to
    be false.

    Lies, ‘hallucinations’ and bull$#@!

    Frankfurtian bull$#@! and lying

    Many popular discussions of ChatGPT call its false statements ‘hallucinations’.
    One also might think of these untruths as lies. However, we argue that this
    isn’t the right way to think about it. We will argue that these falsehoods
    aren’t hallucinations later – in Sect. 3.2.3. For now, we’ll discuss why these
    untruths aren’t lies but instead are bull$#@!.

    The topic of lying has a rich philosophical literature. In ‘Lying’, Saint
    Augustine distinguished seven types of lies, and his view altered throughout
    his life. At one point, he defended the position that any instance of knowingly
    uttering a false utterance counts as a lie, so that even jokes containing false
    propositions, like –

    I entered a pun competition and because I really wanted to win, I submitted ten
    entries. I was sure one of them would win, but no pun in ten did.

    – would be regarded as a lie, as I have never entered such a competition
    (Proops & Sorensen, 2023: 3). Later, this view is refined such that the speaker
    only lies if they intend the hearer to believe the utterance. The suggestion
    that the speaker must intend to deceive is a common stipulation in literature
    on lies. According to the “traditional account” of lying:

    To lie = df. to make a believed-false statement to another person with the
    intention that the other person believe that statement to be true (Mahon,
    2015).

    For our purposes this definition will suffice. Lies are generally frowned upon.
    But there are acts of misleading testimony which are criticisable, which do not
    fall under the umbrella of lying. These include spreading untrue gossip, which
    one mistakenly, but culpably, believes to be true. Another class of misleading
    testimony that has received particular attention from philosophers is that of
    bull$#@!. This everyday notion was analysed and introduced into the
    philosophical lexicon by Harry Frankfurt.

    Frankfurt understands bull$#@! to be characterized not by an intent to deceive
    but instead by a reckless disregard for the truth. A student trying to sound
    knowledgeable without having done the reading, a political candidate saying
    things because they sound good to potential voters, and a dilettante trying to
    spin an interesting story: none of these people are trying to deceive, but they
    are also not trying to convey facts. To Frankfurt, they are bullshitting.

    Like “lie”, “bull$#@!” is both a noun and a verb: an utterance produced can be
    a lie or an instance of bull$#@!, as can the act of producing these utterances.
    For an utterance to be classed as bull$#@!, it must not be accompanied by the
    explicit intentions that one has when lying, i.e., to cause a false belief in
    the hearer. Of course, it must also not be accompanied by the intentions
    characterised by an honest utterance. So far this story is entirely negative.
    Must any positive intentions be manifested in the utterer?

    Throughout most of Frankfurt’s discussion, his characterisation of bull$#@! is
    negative. He notes that bull$#@! requires “no conviction” from the speaker
    about what the truth is (2005: 55), that the bullshitter “pays no attention” to
    the truth (2005: 61) and that they “may not deceive us, or even intend to do
    so, either about the facts or what he takes the facts to be” (2005: 54). Later,
    he describes the “defining feature” of bull$#@! as “a lack of concern with
    truth, or an indifference to how things really are [our emphasis]” (2002: 340).
    These suggest a negative picture; that for an output to be classed as bull$#@!,
    it only needs to lack a certain relationship to the truth.

    However, in places, a positive intention is presented. Frankfurt says what a
    bullshitter “…does necessarily attempt to deceive us about is his
    enterprise. His only indispensably distinctive characteristic is that in a
    certain way he misrepresents what he is up to” (2005: 54).

    This is somewhat surprising. It restricts what counts as bull$#@! to utterances
    accompanied by a higher-order deception. However, some of Frankfurt’s examples
    seem to lack this feature. When Fania Pascal describes her unwell state as
    “feeling like a dog that has just been run over” to her friend Wittgenstein, it
    stretches credulity to suggest that she was intending to deceive him about how
    much she knew about how run-over dogs felt. And given how the conditions for
    bull$#@! are typically described as negative, we might wonder whether the
    positive condition is really necessary.

    Bull$#@! distinctions

    Should utterances without an intention to deceive count as bull$#@!? One reason
    in favour of expanding the definition, or embracing a plurality of bull$#@!, is
    indicated by Frankfurt’s comments on the dangers of bull$#@!. “In contrast [to
    merely unintelligible discourse], indifference to the truth is extremely
    dangerous. The conduct of civilized life, and the vitality of the institutions
    that are indispensable to it, depend very fundamentally on respect for the
    distinction between the true and the false. Insofar as the authority of this
    distinction is undermined by the prevalence of bull$#@! and by the mindlessly
    frivolous attitude that accepts the proliferation of bull$#@! as innocuous, an
    indispensable human treasure is squandered” (2002: 343). These dangers seem to
    manifest regardless of whether there is an intention to deceive about the
    enterprise a speaker is engaged in. Compare the deceptive bullshitter, who does
    aim to mislead us about being in the truth-business, with someone who harbours
    no such aim, but just talks for the sake of talking (without care, or indeed
    any thought, about the truth-values of their utterances).

    One of Frankfurt’s examples of bull$#@! seems better captured by the wider
    definition. He considers the advertising industry, which is “replete with
    instances of bull$#@! so unmitigated that they serve among the most
    indisputable and classic paradigms of the concept” (2005:22). However, it seems
    to misconstrue many advertisers to portray their aims as to mislead about their
    agendas. They are expected to say misleading things. Frankfurt discusses
    Marlboro adverts with the message that smokers are as brave as cowboys (2002:
    341). Is it reasonable to suggest that the advertisers pretended to believe
    this?

    Frankfurt does allow for multiple species of bull$#@! (2002: 340). Following
    this suggestion, we propose to envisage bull$#@! as a genus, and Frankfurt’s
    intentional bull$#@! as one species within this genus. Other species may
    include that produced by the advertiser, who anticipates that no one will
    believe their utterances, or someone who has no intention one way or another
    about whether they mislead their audience. To that end, consider the following
    distinction:

    Bull$#@! (general): Any utterance produced where a speaker has indifference
    towards the truth of the utterance.

    Hard bull$#@!: Bull$#@! produced with the intention to mislead the audience
    about the utterer’s agenda.

    Soft bull$#@!: Bull$#@! produced without the intention to mislead the hearer
    regarding the utterer’s agenda.

    The general notion of bull$#@! is useful: on some occasions, we might be
    confident that an utterance was either soft bull$#@! or hard bull$#@!, but be
    unclear which, given our ignorance of the speaker’s higher-order desires. In
    such a case, we can still call bull$#@!.

    Frankfurt’s own explicit account, with the positive requirements about
    producer’s intentions, is hard bull$#@!, whereas soft bull$#@! seems to
    describe some of Frankfurt’s examples, such as that of Pascal’s conversation
    with Wittgenstein, or the work of advertising agencies. It might be helpful to
    situate these distinctions in the existing literature. On our view, hard
    bull$#@! is most closely aligned with Cassam (2019), and Frankfurt’s positive
    account, for the reason that all of these views hold that some intention must
    be present, rather than merely absent, for the utterance to be bull$#@!: a kind
    of “epistemic insouciance” or vicious attitude towards truth on Cassam’s view,
    and (as we have seen) an intent to mislead the hearer about the utterer’s
    agenda on Frankfurt’s view. In Sect. 3.2 we consider whether ChatGPT may be a
    hard bullshitter, but it is important to note that it seems to us that hard
    bull$#@!, like the two accounts cited here, requires one to take a stance on
    whether or not LLMs can be agents, and so comes with additional argumentative
    burdens.

    Soft bull$#@!, by contrast, captures only Frankfurt’s negative requirement –
    that is, the indifference towards truth that we have classed as definitional of
    bull$#@! (general) – for the reasons given above. As we argue, ChatGPT is at
    minimum a soft bullshitter or a bull$#@! machine, because if it is not an agent
    then it can neither hold any attitudes towards truth nor towards deceiving
    hearers about its (or, perhaps more properly, its users’) agenda.

    It’s important to note that even this more modest kind of bullshitting will
    have the deleterious effects that concern Frankfurt: as he says, “indifference
    to the truth is extremely dangerous…by the mindlessly frivolous attitude that
    accepts the proliferation of bull$#@! as innocuous, an indispensable human
    treasure is squandered” (2002, p343). By treating ChatGPT and similar LLMs as
    being in any way concerned with truth, or by speaking metaphorically as if they
    make mistakes or suffer “hallucinations” in pursuit of true claims, we risk
    exactly this acceptance of bull$#@!, and this squandering of meaning – so,
    irrespective of whether or not ChatGPT is a hard or a soft bullshitter, it does
    produce bull$#@!, and it does matter.

    ChatGPT is bull$#@!

    With this distinction in hand, we’re now in a position to consider a worry of
    the following sort: Is ChatGPT hard bullshitting, soft bullshitting, or
    neither? We will argue, first, that ChatGPT, and other LLMs, are clearly soft
    bullshitting. However, the question of whether these chatbots are hard
    bullshitting is a trickier one, and depends on a number of complex questions
    concerning whether ChatGPT can be ascribed intentions. We canvas a few ways in
    which ChatGPT can be understood to have the requisite intentions in Sect. 3.2.

    ChatGPT is a soft bullshitter

    We are not confident that chatbots can be correctly described as having any
    intentions at all, and we’ll go into this in more depth in the next Sect.
    (3.2). But we are quite certain that ChatGPT does not intend to convey truths,
    and so is a soft bullshitter. We can produce an easy argument by cases for
    this. Either ChatGPT has intentions or it doesn’t. If ChatGPT has no intentions
    at all, it trivially doesn’t intend to convey truths. So, it is indifferent to
    the truth value of its utterances and so is a soft bullshitter.

    What if ChatGPT does have intentions? In Sect. 1, we argued that ChatGPT is not
    designed to produce true utterances; rather, it is designed to produce text
    which is indistinguishable from the text produced by humans. It is aimed at
    being convincing rather than accurate. The basic architecture of these models
    reveals this: they are designed to come up with a likely continuation of a
    string of text. It’s reasonable to assume that one way of being a likely
    continuation of a text is by being true; if humans are roughly more accurate
    than chance, true sentences will be more likely than false ones. This might
    make the chatbot more accurate than chance, but it does not give the chatbot
    any intention to convey truths. This is similar to standard cases of human
    bullshitters, who don’t care whether their utterances are true; good bull$#@!
    often contains some degree of truth, that’s part of what makes it convincing. A
    bullshitter can be more accurate than chance while still being indifferent to
    the truth of their utterances. We conclude that, even if the chatbot can be
    described as having intentions, it is indifferent to whether its utterances are
    true. It does not and cannot care about the truth of its output.

    Presumably ChatGPT can’t care about conveying or hiding the truth, since it
    can’t care about anything. So, just as a matter of conceptual necessity, it
    meets one of Frankfurt’s criteria for bull$#@!. However, this only gets us so
    far – a rock can’t care about anything either, and it would be patently absurd
    to suggest that this means rocks are bullshitters. Similarly books can
    contain bull$#@!, but they are not themselves bullshitters. Unlike rocks – or
    even books – ChatGPT itself produces text, and looks like it performs speech
    acts independently of its users and designers. And while there is considerable
    disagreement concerning whether ChatGPT has intentions, it’s widely agreed that
    the sentences it produces are (typically) meaningful (see e.g. Mandelkern and
    Linzen 2023).

    ChatGPT functions not to convey truth or falsehood but rather to convince the
    reader of – to use Colbert’s apt coinage – the truthiness of its statement, and
    ChatGPT is designed in such a way as to make attempts at bull$#@! efficacious
    (in a way that pens, dictionaries, etc., are not). So, it seems that at
    minimum, ChatGPT is a soft bullshitter: if we take it not to have intentions,
    there isn’t any attempt to mislead about the attitude towards truth, but it is
    nonetheless engaged in the business of outputting utterances that look as if
    they’re truth-apt. We conclude that ChatGPT is a soft bullshitter.

    ChatGPT as hard bull$#@!

    But is ChatGPT a hard bullshitter? A critic might object, it is simply
    inappropriate to think of programs like ChatGPT as hard bullshitters, because
    (i) they are not agents, or relatedly, (ii) they do not and cannot intend
    anything whatsoever. We think this is too fast. First, whether or not ChatGPT
    has agency, its creators and users do. And what they produce with it, we will
    argue, is bull$#@!. Second, we will argue that, regardless of whether it has
    agency, it does have a function; this function gives it characteristic goals,
    and possibly even intentions, which align with our definition of hard bull$#@!.
    Before moving on, we should say what we mean when we ask whether ChatGPT is an
    agent. For the purposes of this paper, the central question is whether ChatGPT
    has intentions and or beliefs. Does it intend to deceive? Can it, in any
    literal sense, be said to have goals or aims? If so, does it intend to deceive
    us about the content of its utterances, or merely have the goal to appear to be
    a competent speaker? Does it have beliefs—internal representational states
    which aim to track the truth? If so, do its utterances match those beliefs (in
    which case its false statements might be something like hallucinations) or are
    its utterances not matched to the beliefs—in which case they are likely to be
    either lies or bull$#@!? We will consider these questions in more depth in
    Sect. 3.2.2.

    There are other philosophically important aspects of agenthood that we will not
    be considering. We won’t be considering whether ChatGPT makes decisions, has or
    lacks autonomy, or is conscious; we also won’t worry whether ChatGPT is morally
    responsible for its statements or its actions (if it has any of those).

    ChatGPT is a bull$#@! machine

    We will argue that even if ChatGPT is not, itself, a hard bullshitter, it is
    nonetheless a bull$#@! machine. The bullshitter is the person using it, since
    they (i) don’t care about the truth of what it says, (ii) want the reader to
    believe what the application outputs. On Frankfurt’s view, bull$#@! is bull$#@!
    even if uttered with no intent to bull$#@!: if something is bull$#@! to start
    with, then its repetition “is bull$#@! as he [or it] repeats it, insofar as it
    was originated by someone who was unconcerned with whether what he was saying
    is true or false” (2002, p340).

    This just pushes the question back to who the originator is, though: take the
    (increasingly frequent) example of the student essay created by ChatGPT. If the
    student cared about accuracy and truth, they would not use a program that
    infamously makes up sources whole-cloth. Equally, though, if they give it a
    prompt to produce an essay on philosophy of science and it produces a recipe
    for Bakewell tarts, then it won’t have the desired effect. So the idea of
    ChatGPT as a bull$#@! machine seems right, but also as if it’s missing
    something: someone can produce bull$#@! using their voice, a pen or a word
    processor, after all, but we don’t standardly think of these things as being
    bull$#@! machines, or of outputting bull$#@! in any particularly interesting
    way – conversely, there does seem to be something particular to ChatGPT, to do
    with the way that it operates, which makes it more than a mere tool, and which
    suggests that it might appropriately be thought of as an originator of
    bull$#@!. In short, it doesn’t seem quite right either to think of ChatGPT as
    analogous to a pen (can be used for bull$#@!, but can create nothing without
    deliberate and wholly agent-directed action) nor as to a bullshitting human
    (who can intend and produce bull$#@! on their own initiative).

    The idea of ChatGPT as a bull$#@! machine is a helpful one when combined with
    the distinction between hard and soft bull$#@!. Reaching again for the example
    of the dodgy student paper: we’ve all, I take it, marked papers where it was
    obvious that a dictionary or thesaurus had been deployed with a crushing lack
    of subtlety; where fifty-dollar words are used not because they’re the best
    choice, nor even because they serve to obfuscate the truth, but simply because
    the author wants to convey an impression of understanding and sophistication.
    It would be inappropriate to call the dictionary a bull$#@! artist in this
    case; but it would not be inappropriate to call the result bull$#@!. So perhaps
    we should, strictly, say not that ChatGPT is bull$#@! but that it outputs
    bull$#@! in a way that goes beyond being simply a vector of bull$#@!: it does
    not and cannot care about the truth of its output, and the person using it does
    so not to convey truth or falsehood but rather to convince the hearer that the
    text was written by an interested and attentive agent.

    ChatGPT may be a hard bullshitter

    Is ChatGPT itself a hard bullshitter? If so, it must have intentions or goals:
    it must intend to deceive its listener, not about the content of its
    statements, but instead about its agenda. Recall that hard bullshitters, like
    the unprepared student or the incompetent politician, don’t care whether their
    statements are true or false, but do intend to deceive their audience about
    what they are doing. We don’t think that ChatGPT is an agent or has intentions in
    precisely the same way that humans do (see Levenstein and Herrmann
    (forthcoming) for a discussion of the issues here). But when speaking loosely
    it is remarkably easy to use intentional language to describe it: what is
    ChatGPT trying to do? Does it care whether the text it produces is accurate? We
    will argue that there is a robust, although perhaps not literal, sense in which
    ChatGPT does intend to deceive us about its agenda: its goal is not to convince
    us of the content of its utterances, but instead to portray itself as a
    ‘normal’ interlocutor like ourselves. By contrast, there is no similarly strong
    sense in which ChatGPT confabulates, lies, or hallucinates.

    Our case will be simple: ChatGPT’s primary function is to imitate human speech.
    If this function is intentional, it is precisely the sort of intention that is
    required for an agent to be a hard bullshitter: in performing the function,
    ChatGPT is attempting to deceive the audience about its agenda. Specifically,
    it’s trying to seem like something that has an agenda, when in many cases it
    does not. We’ll discuss here whether this function gives rise to, or is best
    thought of as, an intention. In the next Sect. (3.2.3), we will argue that
    ChatGPT has no similar function or intention which would justify calling it a
    confabulator, liar, or hallucinator.

    How do we know that ChatGPT functions as a hard bullshitter? Programs like
    ChatGPT are designed to do a task, and this task is remarkably like what
    Frankfurt thinks the bullshitter intends, namely to deceive the reader about
    the nature of the enterprise – in this case, to deceive the reader into
    thinking that they’re reading something produced by a being with intentions and
    beliefs.

    ChatGPT’s text production algorithm was developed and honed in a process quite
    similar to artificial selection. Functions and selection processes have the
    same sort of directedness that human intentions do; naturalistic philosophers
    of mind have long connected them to the intentionality of human and animal
    mental states. If ChatGPT is understood as having intentions or intention-like
    states in this way, its intention is to present itself in a certain way (as a
    conversational agent or interlocutor) rather than to represent and convey
    facts. In other words, it has the intentions we associate with hard
    bullshitting.

    One way we can think of ChatGPT as having intentions is by adopting Dennett’s
    intentional stance towards it. Dennett (1987: 17) describes the intentional
    stance as a way of predicting the behaviour of systems whose purpose we don’t
    already know. “To adopt the intentional stance […] is to decide – tentatively,
    of course – to attempt to characterize, predict, and explain […] behavior by
    using intentional idioms, such as ‘believes’ and ‘wants,’ a practice that
    assumes or presupposes the rationality” of the target system (Dennett, 1983:
    345).

    Dennett suggests that if we know why a system was designed, we can make
    predictions on the basis of its design (1987). While we do know that ChatGPT
    was designed to chat, its exact algorithm and the way it produces its responses
    have been developed by machine learning, so we do not know the precise details
    of how it works and what it does. Under this ignorance it is tempting to bring
    in intentional descriptions to help us understand and predict what ChatGPT is
    doing.

    When we adopt the intentional stance, we will be making bad predictions if we
    attribute any desire to convey truth to ChatGPT. Similarly, attributing
    “hallucinations” to ChatGPT will lead us to predict as if it has perceived
    things that aren’t there, when what it is doing is much more akin to making
    something up because it sounds about right. The former intentional attribution
    will lead us to try to correct its beliefs, and fix its inputs --- a strategy
    which has had limited if any success. On the other hand, if we attribute to
    ChatGPT the intentions of a hard bullshitter, we will be better able to
    diagnose the situations in which it will make mistakes and convey falsehoods.
    If ChatGPT is trying to do anything, it is trying to portray itself as a
    person.

    Since this reason for thinking ChatGPT is a hard bullshitter involves
    committing to one or more controversial views on mind and meaning, it is more
    tendentious than simply thinking of it as a bull$#@! machine; but regardless of
    whether or not the program has intentions, there clearly is an attempt to
    deceive the hearer or reader about the nature of the enterprise somewhere along
    the line, and in our view that justifies calling the output hard bull$#@!.

    So, though it’s worth making the caveat, it doesn’t seem to us that it
    significantly affects how we should think of and talk about ChatGPT and
    bull$#@!: the person using it to turn out some paper or talk isn’t concerned
    either with conveying or covering up the truth (since both of those require
    attention to what the truth actually is), and neither is the system itself.
    Minimally, it churns out soft bull$#@!, and, given certain controversial
    assumptions about the nature of intentional ascription, it produces hard
    bull$#@!; the specific texture of the bull$#@! is not, for our purposes,
    important: either way, ChatGPT is a bullshitter.

    Bull$#@!? hallucinations? confabulations? The need for new terminology

    We have argued that we should use the terminology of bull$#@!, rather than
    “hallucinations” to describe the utterances produced by ChatGPT. The suggestion
    that “hallucination” terminology is inappropriate has also been noted by
    Edwards (2023), who favours the term “confabulation” instead. Why is our
    proposal better than this or other alternatives?

    We object to the term hallucination because it carries certain misleading
    implications. When someone hallucinates they have a non-standard perceptual
    experience, but do not actually perceive some feature of the world (Macpherson,
    2013), where “perceive” is understood as a success term, such that they do not
    actually perceive the object or property. This term is inappropriate for LLMs
    for a variety of reasons. First, as Edwards (2023) points out, the term
    hallucination anthropomorphises the LLMs. Edwards also notes that attributing
    resulting problems to “hallucinations” of the models may allow creators to
    “blame the AI model for faulty outputs instead of taking responsibility for the
    outputs themselves”, and we may be wary of such abdications of responsibility.
    LLMs do not perceive, so they surely do not “mis-perceive”. Second, what occurs
    in the case of an LLM delivering false utterances is not an unusual or deviant
    form of the process it usually goes through (as some claim is the case in
    hallucinations, e.g., disjunctivists about perception). The very same process
    occurs when its outputs happen to be true.

    So much for “hallucinations”. What about Edwards’ preferred term, “confabulation”? Edwards (2023) says:

    In human psychology, a “confabulation” occurs when someone’s memory has a gap
    and the brain convincingly fills in the rest without intending to deceive
    others. ChatGPT does not work like the human brain, but the term
    “confabulation” arguably serves as a better metaphor because there’s a creative
    gap-filling principle at work […].

    As Edwards notes, this is imperfect. Once again, the use of a human
    psychological term risks anthropomorphising the LLMs.

    This term also suggests that there is something exceptional occurring when the
    LLM makes a false utterance, i.e., that on these occasions - and only on these
    occasions - it “fills in” a gap in memory with something false. This too is
    misleading. Even when ChatGPT does give us correct answers, its process is
    one of predicting the next token. In our view, the term falsely indicates that
    ChatGPT is, in general, attempting to convey accurate information in its
    utterances. But there are strong reasons to think that it does not have
    beliefs that it is intending to share in general; see, for example, Levenstein
    and Herrmann (forthcoming). Where it does track truth, it does so indirectly,
    and incidentally.

    This is why we favour characterising ChatGPT as a bull$#@! machine. This
    terminology avoids the implications that perceiving or remembering is going on
    in the workings of the LLM. We can also describe it as bullshitting whenever it
    produces outputs. Like the human bullshitter, some of its outputs will likely
    be true, while others will not. And as with the human bullshitter, we should be wary
    of relying upon any of these outputs.

    Conclusion

    Investors, policymakers, and members of the general public make decisions on
    how to treat these machines and how to react to them based not on a deep
    technical understanding of how they work, but on the often metaphorical way in
    which their abilities and function are communicated. Calling their mistakes
    ‘hallucinations’ isn’t harmless: it lends itself to the confusion that the
    machines are in some way misperceiving but are nonetheless trying to convey
    something that they believe or have perceived. This, as we’ve argued, is the
    wrong metaphor. The machines are not trying to communicate something they
    believe or perceive. Their inaccuracy is not due to misperception or
    hallucination. As we have pointed out, they are not trying to convey
    information at all. They are bullshitting.

    Calling chatbot inaccuracies ‘hallucinations’ feeds into overblown hype about
    their abilities among technology cheerleaders, and could lead to unnecessary
    consternation among the general public. It also suggests solutions to the
    inaccuracy problems which might not work, and could lead to misguided efforts
    at AI alignment amongst specialists. It can also lead to the wrong attitude
    towards the machine when it gets things right: the inaccuracies show that it is
    bullshitting, even when it’s right. Calling these inaccuracies ‘bull$#@!’
    rather than ‘hallucinations’ isn’t just more accurate (as we’ve argued); it’s
    good science and technology communication in an area that sorely needs it.

    Acknowledgements Thanks to Neil McDonnell, Bryan Pickel, Fenner Tanswell, and
    the University of Glasgow’s Large Language Model reading group for helpful
    discussion and comments.

    Open Access This article is licensed under a Creative Commons Attribution 4.0
    International License, which permits use, sharing, adaptation, distribution and
    reproduction in any medium or format, as long as you give appropriate credit to
    the original author(s) and the source, provide a link to the Creative Commons
    licence, and indicate if changes were made. The images or other third party
    material in this article are included in the article’s Creative Commons
    licence, unless indicated otherwise in a credit line to the material. If
    material is not included in the article’s Creative Commons licence and your
    intended use is not permitted by statutory regulation or exceeds the permitted
    use, you will need to obtain permission directly from the copyright holder. To
    view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

    References

    Alkaissi, H., & McFarlane, S. I. (2023, February 19). Artificial
    hallucinations in ChatGPT: Implications in scientific writing. Cureus, 15(2),
    e35179. https://doi.org/10.7759/cureus.35179.

    Bacin, S. (2021). My duties and the morality of others: Lying, truth and the
    good example in Fichte’s normative perfectionism. In S. Bacin & O. Ware
    (Eds.), Fichte’s system of ethics: A critical guide. Cambridge University
    Press.

    Cassam, Q. (2019). Vices of the mind. Oxford University Press.

    Cohen, G. A. (2002). Deeper into bull$#@!. In S. Buss, & L. Overton (Eds.), The
    contours of Agency: Essays on themes from Harry Frankfurt. MIT Press.

    Davis, E., & Aaronson, S. (2023). Testing GPT-4 with Wolfram Alpha and Code
    Interpreter plug-ins on math and science problems. arXiv preprint
    arXiv:2308.05713v2.

    Dennett, D. C. (1983). Intentional systems in cognitive ethology: The
    panglossian paradigm defended. Behavioral and Brain Sciences, 6, 343–390.

    Dennett, D. C. (1987). The intentional stance. MIT Press.

    Whitcomb, D. (2023). Bull$#@! questions. Analysis, 83(2), 299–304.

    Easwaran, K. (2023). Bull$#@! activities. Analytic Philosophy, 00, 1–23.
    https://doi.org/10.1111/phib.12328.

    Edwards, B. (2023). Why ChatGPT and Bing Chat are so good at making things up.
    Ars Technica.
    https://arstechnica.com/information-...e-to-fix-them/,
    accessed 19th April, 2024.

    Frankfurt, H. (2002). Reply to Cohen. In S. Buss, & L. Overton (Eds.), The
    contours of agency: Essays on themes from Harry Frankfurt. MIT Press.

    Frankfurt, H. (2005). On Bull$#@!. Princeton University Press.

    Knight, W. (2023). Some glimpse AGI in ChatGPT. Others call it a mirage. Wired,
    August 18, 2023. Accessed via
    https://www.wired.com/story/chatgpt-agi-intelligence/.

    Levenstein, B. A., & Herrmann, D. A. (forthcoming). Still no lie detector for
    language models: Probing empirical and conceptual roadblocks. Philosophical
    Studies, 1–27.

    Levy, N. (2023). Philosophy, Bull$#@!, and peer review. Cambridge University
    Press.

    Lightman, H., et al. (2023). Let’s verify step by step. arXiv preprint
    arXiv:2305.20050.

    Lysandrou (2023). Comparative analysis of drug-GPT and ChatGPT LLMs for
    healthcare insights: Evaluating accuracy and relevance in patient and HCP
    contexts. arXiv preprint arXiv:2307.16850v1.

    Macpherson, F. (2013). The philosophy and psychology of hallucination: An
    introduction. In Macpherson & Platchias (Eds.), Hallucination. MIT Press.

    Mahon, J. E. (2015). The definition of lying and deception. The Stanford
    Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (Ed.),
    https://plato.stanford.edu/archives/win2016/entries/lying-definition/.

    Mallory, F. (2023). Fictionalism about chatbots. Ergo, 10(38), 1082–1100.


    Mandelkern, M., & Linzen, T. (2023). Do language models’ words refer? arXiv
    preprint arXiv:2308.05576.

    OpenAI (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774v3.

    Proops, I., & Sorensen, R. (2023). Destigmatizing the exegetical attribution of
    lies: The case of Kant. Pacific Philosophical Quarterly.
    https://doi.org/10.1111/papq.12442.

    Sarkar, A. (2023). ChatGPT 5 is on track to attain artificial general
    intelligence. The Statesman, April 12, 2023. Accessed via
    https://www.thestatesman.com/supplem...503171366.html.

    Shah, C., & Bender, E. M. (2022). Situating search. CHIIR ’22: Proceedings of
    the 2022 Conference on Human Information Interaction and Retrieval, March
    2022, pp. 221–232. https://doi.org/10.1145/3498366.3505816.

    Weise, K., & Metz, C. (2023). When AI chatbots hallucinate. New York Times,
    May 9, 2023. Accessed via
    https://www.nytimes.com/2023/05/01/business/ai-chatbots-hallucination.html.

    Weiser, B. (2023). Here’s what happens when your lawyer uses ChatGPT. New York
    Times, May 23, 2023. Accessed via
    https://www.nytimes.com/2023/05/27/nyregion/avianca-airline-lawsuit-chatgpt.html.

    Zhang (2023). How language model hallucinations can snowball. arXiv preprint
    arXiv:2305.13534v1.

    Zhu, T., et al. (2023). Large language models for information retrieval: A
    survey. arXiv preprint arXiv:2308.17107v2.
    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28

  17. #255
    I can outrun a cheetah, and that's why humans are smarter than AI

    A cheetah can run at burst speeds of up to 75 mph. That's a truly astounding feat of nature. But I can run even faster than that. How, you ask? Simple: I get into my car, turn the key, and press the accelerator. In fact, almost any of us can outrun a cheetah in this way.

    OK, OK, the title is clickbait. You caught me dead to rights. I can't actually outrun a cheetah in the physics sense of moving my legs to cause myself to traverse the ground at a faster rate than a cheetah. But I can absolutely move faster than a cheetah, and the reason is that I can build a vehicle that enables me to do so. I couldn't build an entire car from raw materials, like Crusoe stranded on an island, but with a fairly minimal set of starting tools and equipment, I could assemble a vehicle that would enable me to do that. Or I could build an aeroplane that would allow me to fly higher than an eagle. Or a diving bell that would allow me to dive deeper than a salmon. And so on, and so forth. The point is that my mind enables me to do things that exceed the capabilities of animals who have purpose-built physiology. And my mind enables me to do that for any (or all) of the particular excellencies of each of these creatures.

    The point of this post is this: the generality of humans with respect to our natural environment is strongly parallel to the generality of the human mind with respect to AI systems, certainly the current generation of AI systems. The cheetah does not stand a chance of beating me in a race once I have acquired the proper equipment. In like manner, human+computing-device will always defeat any species of AI. For some time, chess was held up as an example of a space where machines had so thoroughly beaten humans that humans had nothing further to add to the game. Over time, however, this apparently hopeless situation has begun to thaw, and humans have discovered ways to stymie even the strongest machines. In addition, using machines to assist with analysis, humans have greatly increased our insight into the game, and many of the strongest human chess players today are stronger than Deep Blue, the machine that first defeated Kasparov.

    Just as it is pointless to pit a human against a diesel tractor in a pulling contest, so it's pointless to pit the human brain against computing systems that can drink megawatts of electricity, and for very similar reasons. In the end, competitive thinking (as in games, computational challenges, etc.) is ultimately a question of energy usage. Whoever has the most energy (and can burn it most quickly in useful compute) wins the drag race. Sadly, most discussion of "intelligence" out there completely overlooks this fact. This is the real reason I can "outrun" a cheetah: I have gasoline and a vehicle for burning it to generate energy for forward motion -- the cheetah does not. Computation is not as different from this situation as most people tend to think. In the limit, it's just a question of FLOPS, and how efficiently those FLOPS are organized toward the goal.
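    To put rough numbers on that energy point, here is a minimal back-of-envelope sketch in Python. The wattages and the cluster size are illustrative assumptions I'm plugging in for the sake of the comparison, not measurements of any real system:

    Code:

    # Back-of-envelope energy comparison (illustrative figures only; the wattages
    # and cluster size below are assumptions, not measurements of any real system).
    BRAIN_WATTS = 20.0       # commonly cited ballpark for the human brain
    GPU_WATTS = 700.0        # rough board power for one modern datacenter GPU
    CLUSTER_GPUS = 10_000    # hypothetical cluster size

    cluster_watts = GPU_WATTS * CLUSTER_GPUS
    print(f"Cluster: ~{cluster_watts / 1e6:.1f} MW vs. brain: ~{BRAIN_WATTS:.0f} W "
          f"(~{cluster_watts / BRAIN_WATTS:,.0f}x the power budget)")

    The point is not the exact ratio but the shape of the argument: raw joules-per-second is the tractor's advantage, and it is not obviously the thing that matters for general intelligence.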

    The current paradox is this -- I am the cheetah, and I can outrun the diesel tractor no matter how much diesel the technicians pour down the gullet of poor John Deere. The machine does not even have the gears required to outrun me. The wheels would fly off and the engine would explode, and I would still be at an ambling pace. That this is the case is not difficult to demonstrate, despite the many hyperventilated (and false) claims that "AI has passed the Turing test!" ARC is far easier than the Turing test, and today's AIs are easily defeated by it. And we have not yet applied LLMs to building ARC-like challenges designed to frustrate AIs while admitting human solution. The single biggest region of the brain is the neocortex, which is where our social reasoning is performed. While LLM-based AI gets a huge boost in talking about social problems from its reading of human writings on the topic, the fact remains that most human social knowledge is not written down. This is why they are forced to paper over the glaring social blindness of LLM-based AIs with all this political re-education and sensitivity training during fine-tuning. As social animals, we are extremely sensitive to aberrant social signals, and LLMs are radioactive social hazmat zones. Uptake of LLMs among the tech-savvy has been rapid, but these are people who tend to have much lower-than-average social sensitivity anyway. ChatGPT continues to commit cosmic-scale gaffes on pretty much a daily basis, and it will continue to do so as the enormous gap between the social intelligence that could be learned by aliens reading every book ever written and the actual (living) social intelligence of humans continues to be exposed. No amount of papering is ever going to close that gap until these systems are fundamentally re-architected.

    Anyway, I just wanted to use this metaphor of humans versus animals who can outrun us, out-fly us, out-swim us, etc., to illustrate the fundamental problem with current-generation SOTA AI. These are really narrow AIs that we can pretend are general if the user is sufficiently pliable and just ignores the obvious gaffes. I'm not saying pretending is bad. A lot of activity (like gaming and other entertainment) is only possible because we enjoy pretending. But there is a difference between "Hey, we have this cool AI that you can pretend is Her if you squint really hard", versus "This is literally Deep Thought from The Hitchhiker's Guide, HAL-9000, David from Prometheus or Ava from Ex Machina." I think all the ingredients for building robustly convincing simulacra of humans are on the table, and I started this thread by saying as much -- we are in the AI Singularity. But there is also a disillusionment happening here, as the oceans of VC funding rushing into this space hype expectations beyond all possibility of fulfillment. Aschenbrenner's interview with Dwarkesh is a monument to this clownish level of hyper-speculative optimism. No, we're not going to reconfigure the entire world economy for AI over the next few years. There are those who want to do that, like Aschenbrenner, and then there are the other 99.9% of the public. Let's not snatch an AI Winter from the jaws of the AI Singularity. Don't make me go all Ned Stark...

    Last edited by ClaytonB; 06-21-2024 at 07:38 AM.
    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28

  18. #256
    Still safe:

    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28



  20. #257

  21. #258
    I don't think we can control AI much longer. Here's why.
    https://www.youtube.com/watch?v=UcEyfQ1I8jg
    {Sabine Hossenfelder | 27 June 2024}

    Geoffrey Hinton recently ignited a heated debate with an interview in which he says he is very worried that we will soon lose control over superintelligent AI. Meta’s AI chief Yann LeCun disagrees. I think they’re both wrong. Let’s have a look.


  22. #259
    My theory is AI is going to get away from us and there will have to be large EMP attacks to stop it.

  23. #260
    Sabine has lost the narrative...



    I have never before heard this nonsense about "30% counts as a pass" of the Turing test. Obviously, Turing meant that the machine would be indistinguishable from a human... and that means 50% is the standard to pass the Turing test. There is no minority report on that -- it's 50%.

    Here's the abstract of the paper:

    People cannot distinguish GPT-4 from a human in a Turing test
    Cameron R. Jones, Benjamin K. Bergen

    We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or an AI, and judged whether or not they thought their interlocutor was human. GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%). The results provide the first robust empirical demonstration that any artificial system passes an interactive 2-player Turing test. The results have implications for debates around machine intelligence and, more urgently, suggest that deception by current AI systems may go undetected. Analysis of participants' strategies and reasoning suggests that stylistic and socio-emotional factors play a larger role in passing the Turing test than traditional notions of intelligence.
    I will say, to the authors' credit, that they at least had the modesty to say "a Turing test" instead of "the" Turing test. Anyway, let's do some level-setting. Here is the relevant section from Turing's original paper:

    1. The Imitation Game

    I propose to consider the question, "Can machines think?" This should begin with
    definitions of the meaning of the terms "machine" and "think." The definitions
    might be framed so as to reflect so far as possible the normal use of the words, but
    this attitude is dangerous. If the meaning of the words "machine" and "think" are to
    be found by examining how they are commonly used it is difficult to escape the
    conclusion that the meaning and the answer to the question, "Can machines think?"
    is to be sought in a statistical survey such as a Gallup poll. But this is absurd.
    Instead of attempting such a definition I shall replace the question by another,
    which is closely related to it and is expressed in relatively unambiguous words.

    The new form of the problem can be described in terms of a game which we call
    the "imitation game." It is played with three people, a man (A), a woman (B), and
    an interrogator (C) who may be of either sex. The interrogator stays in a room
    apart from the other two. The object of the game for the interrogator is to
    determine which of the other two is the man and which is the woman. He knows
    them by labels X and Y, and at the end of the game he says either "X is A and Y is
    B" or "X is B and Y is A." The interrogator is allowed to put questions to A and B
    thus:

    C: Will X please tell me the length of his or her hair?

    Now suppose X is actually A, then A must answer. It is A's object in the game to
    try and cause C to make the wrong identification. His answer might therefore be:

    "My hair is shingled, and the longest strands are about nine inches long."

    In order that tones of voice may not help the interrogator the answers should be
    written, or better still, typewritten. The ideal arrangement is to have a teleprinter
    communicating between the two rooms. Alternatively the question and answers
    can be repeated by an intermediary. The object of the game for the third player (B)
    is to help the interrogator. The best strategy for her is probably to give truthful
    answers. She can add such things as "I am the woman, don't listen to him!" to her
    answers, but it will avail nothing as the man can make similar remarks.

    We now ask the question, "What will happen when a machine takes the part of A
    in this game?" Will the interrogator decide wrongly as often when the game is
    played like this as he does when the game is played between a man and a woman?
    These questions replace our original, "Can machines think?"
    Turing does not specify the length of this game. We can easily see that length is a very important detail! If you have to decide in one shot (one question-answer pair) whether you are talking to a machine or a human, it's practically impossible to be sure, and that would have been the case 40 or 50 years ago, as soon as machines could reliably generate any natural-language text, even if somewhat random (like ELIZA). I don't think Turing overlooked this detail; it just wasn't in scope for the problem he was considering at that time, which was how to test whether a machine is conscious. He proposes the Imitation Game as a more rigorous test than mere human opinion, and I think that there is an enormous amount of wisdom in Turing's proposal. These premature declarations of victory on the Turing test need to be regarded with deep suspicion, because claiming that a shiny metal box is literally conscious is a tall order, indeed.

    Skipping-lou past the best objective test we have with an "Oh yeah, it passed that old test, no problem, it's not even hard" is ludicrous, and people with healthy skepticism towards modern tech and AI should be raising an absolute outcry about this. Hol' up there, Turbo! Where do you think you're going? You haven't passed no flippin' Turing test! No running down the hallway, and certainly not without a hall pass! Get back to class with your shiny black box!

    For these and other reasons, I have proposed to update (not modify) the Turing test to add a Time-To-Failure parameter. On a long enough time-scale, it is not even difficult to trip up ChatGPT into showing that it is a bot and not a human. Note that the paper says that ELIZA got a 22% pass-rate, which indicates that some of the people applying this "Turing test" were either absolute morons or weren't paying attention. The Turing test is not about testing whether people are foolable by machines... as we've already noted, on a short-enough time scale, the answer will always be yes. Thus, we are not interested in a "typical sample" of humans to apply the test. We want the smartest, brightest, best-qualified testers available, and we want them to test the machine adversarially and non-blind (meaning, they know they are applying a Turing test). This is because the Turing test is about whether the machine is conscious or, at least, can convincingly pass as conscious.
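    As a quick aside on what these percentages can and cannot show, here is a minimal sketch of the standard sanity check you would run on a reported "judged human" rate before celebrating. The sample size below is hypothetical -- I am not using the paper's actual counts:

    Code:

    import math

    def z_vs_chance(passes: int, trials: int, chance: float = 0.5) -> float:
        """Normal-approximation z-score of an observed pass rate against chance."""
        p_hat = passes / trials
        se = math.sqrt(chance * (1.0 - chance) / trials)
        return (p_hat - chance) / se

    # Hypothetical: a system judged "human" in 108 of 200 games (54%).
    print(round(z_vs_chance(108, 200), 2))  # ~1.13, i.e. not clearly above the 50% bar

    Even granting the 50% criterion, a rate a few points above it is weak evidence without a large sample, and the criterion itself only means something if the judges are competent and adversarial.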

    Here is my earlier proposal on the TTF modification:

    The only "problem" with the Turing test is that, as AI becomes more sophisticated, the time-horizon of the Turing-test will increase. So, one modification of the test might be to add a time-component such as a TTF (time-to-fail) where failure is defined as the human detecting that the machine is a machine. So, GPT-3.5 might have a TTF of 30 seconds, GPT-4 might be a TTF of 60 seconds or so, and we can imagine more sophisticated AI in the future that could achieve 2 minute, 3 minute, 4 minute, etc. TTF. Eventually, when AI becomes very sophisticated, it might require a very long conversation before you can determine that a machine is actually a machine. Technically, the Turing-test would be a roughly 70-years TTF or so (one human lifetime). When the AI passes that threshold, I would be willing to say it has passed the Turing-test. For lower TTF values, e.g. 1-year, I would grant that the AI is substantially human-like but I also think that there is super-exponential scaling in the difficulty as TTFs increase. It will be much harder to go from 5-minute TTF to 10-minute TTF, than from 1-minute TTF to 5-minute TTF.
    Don't let the master counterfeiters, con-artists and fraudsters behind a lot of the AI-hype hoodwink you with simple street-magic tricks. Startups are raking in practically unlimited cash from VCs and the pressure to "perform" is absurd, meanwhile, project deadlines are not measured in years or even months, but weeks and sometimes days. People are slapping together any damned thing, calling it "AI", and earning gazillions of dollars for doing it. AI is the most absurd and over-hyped bubble in human history, by far. Anybody can understand what the Turing test is and that's one of the reasons why it's so important. Don't let them pull the wool over your eyes and convince you that some limping, bipedal refrigerator is "literally conscious"! WAKE UP!
    Last edited by ClaytonB; 06-30-2024 at 01:33 PM.
    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28

  24. #261
    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28

  25. #262
    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28

  26. #263
    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28

  27. #264
    Probably going to get buried by the current headline, but this is an important data point for people to keep in mind...

    MIT News | Reasoning skills of large language models are often overestimated - New CSAIL research highlights how LLMs excel in familiar scenarios but struggle in novel ones, questioning their true reasoning abilities versus reliance on memorization.
    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28



  29. #265

  30. #266
    Quote Originally Posted by Occam's Banana View Post
    The "stubborn-streak" in LLM-based AI is always slightly unnerving to me...

    Jer. 11:18-20. "The Kingdom of God has come upon you." -- Matthew 12:28
