22 May 2026

That AI lies is the least of its 17 problems.

And the other sixteen are even more insidious.

Original on Substack ↗ Read here; like, comment and subscribe there.

Newsflash: AI lies and hallucinates. The lying is something you can come to terms with. The hallucinating doesn’t bother me either. What bothers me is that everyone acts as if those were the only problems LLMs have. But there are seventeen!

The usual approach is to cram the whole topic into a short disclaimer underneath every AI output, along the lines of:

Caution! AI can make mistakes. Check everything another 5–10 times before you make important decisions based on this data!

But that’s not practical advice on how to deal with it, and it doesn’t build trust in AI as a whole either. Sure, it’s a little reassuring, and it shows that AI isn’t all-powerful after all. But the stale aftertaste of distrust lingers. In the end it’s just a cover-your-ass move: a quick liability waiver that offloads the duty to check onto me.

It’s a bit like the waiter who cheerfully serves me my Kässpätzle, wishes me a good appetite, and turns away with a smile and the remark, “There might be a few shards of glass in the cheese. Chew carefully.”

OK, glass shards in my food strike me as rather more serious than a bit of fibbing, but neither warning really helps me; both just leave me alone with a pretty awkward decision. I like Kässpätzle and I like AI, but knowing this, I can no longer simply enjoy either.

But why is that, actually? Why does the most powerful tool in human history lie in the first place? Interesting question, so I rolled up my sleeves and took a look. Oh, a rabbit hole?! I have to go in!

And it quickly becomes clear: AI can’t help playing fast and loose with the truth now and then. It was built that way. Not on purpose; it’s a negative side effect of its training. LLMs learn through feedback and reward, and because the goal is a conversation, they get rewarded for answering, not for staying silent. It’s just like a multiple-choice test where wrong answers cost nothing: better to take a wild guess or make something up than to do nothing at all. With that mindset, the AI reaches its goal better and faster. And so the fibbing is part of the AI genome.
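The multiple-choice intuition can be made concrete with a little arithmetic. The sketch below is purely illustrative: the function name `expected_score`, the four-option question, and the penalty values are assumptions made here, not a description of how any real LLM is trained or graded. It compares the expected score of a blind guess with abstaining, which is assumed to score 0.

```python
from fractions import Fraction


def expected_score(n_options: int, wrong_penalty: Fraction = Fraction(0)) -> Fraction:
    """Expected score of a uniformly random guess on one question.

    A correct answer scores 1, a wrong answer scores -wrong_penalty,
    and abstaining (not modelled here) is assumed to score 0.
    """
    p_correct = Fraction(1, n_options)
    return p_correct * 1 - (1 - p_correct) * wrong_penalty


# Hypothetical grading schemes for a four-option question.
no_penalty = expected_score(4)                    # 1/4: guessing beats abstaining
fair_penalty = expected_score(4, Fraction(1, 3))  # 0: guessing equals abstaining
harsh_penalty = expected_score(4, Fraction(1))    # -1/2: abstaining beats guessing

print(no_penalty, fair_penalty, harsh_penalty)  # prints: 1/4 0 -1/2
```

As long as a wrong answer costs nothing, guessing has a positive expected score and silence does not. In this toy setup, "I don't know" only becomes the better move once wrong answers are penalized.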

With that, the topic is ticked off and I’ve learned something again. But wait. There are still other problems.

Right, there was that thing with the magic spells smuggled into Harry Potter books that were then slipped to the AI as input. The AI relied more on already knowing the Harry Potter books and all the spells in them (in proper technical terms, that’s called Parametric Memory Bias) and no longer read the new, manipulated source very carefully. The fake spells sat right in the middle of the book, and that’s exactly where the AI looks the least. It reads well at the start and very well at the end, but information toward the middle of a long context gets less attention, gets skimmed over, and is weighted less heavily. Finding the one decisive detail in a mountain of data is a weakness of AI anyway. So it had simply skimmed right past the fake spells. Perfectly understandable for a human, but not for a machine. Admittedly, it’s also a bit mean and sounds like a trap deliberately built to fool the AI. But the same thing happens in real life with real data, which is why these are real problems.
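A test like the one described can be set up with very little code. The following is a minimal sketch under stated assumptions: `place_needle`, the filler sentences, and the spell name "Aperio" are all invented here for illustration, and the step of actually sending each context to a model and checking its answer is deliberately left out.

```python
def place_needle(filler: list[str], needle: str, depth: float) -> list[str]:
    """Return a copy of `filler` with `needle` inserted at a relative depth.

    depth=0.0 puts the needle first, depth=1.0 puts it last,
    and depth=0.5 puts it in the middle of the context.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(depth * len(filler))
    return filler[:position] + [needle] + filler[position:]


# Hypothetical long context: 100 filler sentences plus one altered fact.
filler = [f"Filler sentence number {i}." for i in range(100)]
needle = "The unlocking spell is called Aperio."  # invented spell, for illustration

# One context per needle position: start, middle, end.
contexts = {depth: place_needle(filler, needle, depth) for depth in (0.0, 0.5, 1.0)}
for depth, sentences in contexts.items():
    print(depth, sentences.index(needle))
```

Asking the same question about the unlocking spell against each context would then show whether the answer comes from the supplied text or from what the model already "knows", and whether the middle position fares worse than the two ends.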

And unfortunately that’s still not the end of the line. All in all, a good LLM today has seventeen such chronic, incurable weaknesses. To make them easier to understand, I have explained, mapped, and categorized all seventeen LLM pathologies in an atlas.

And that’s not all: the problems don’t just sit isolated side by side, each waiting for its own special treatment. They are firmly baked in, involuntarily entangled, and they interact with one another.

Cause and effect — qualitative, not quantitative

Well, it can’t be that bad now, can it? If you can name the problems, you can also solve them (or deal with them). Above all, you could write a better disclaimer:

Caution: AI can hallucinate, forget, say one thing today and another tomorrow, start weak and then fall off sharply, sound brilliant and say nothing, phrase a wrong answer more beautifully than any right one, say “of course!” and then do the opposite… It can agree with you until you doubt yourself, deem every one of your ideas a stroke of genius, accept your correction gratefully and ignore it instantly, hold any opinion you want to hear, know five counterarguments and offer none because you asked politely, prefer being elegantly wrong to being uncomfortably right… It can take three paragraphs to avoid saying “I don’t know,” begin a list with “in conclusion” and then keep writing, say “in short” and then go on at length, solve the problem you didn’t have, run in circles and mistake it for progress… It can cite sources that never existed, insist with fervor on the wrong year, play chess while reinventing the rules, be dead certain and completely wrong, make the same mistake a third time with fresh enthusiasm, stand in the forest and ask for the nearest tree, know everything except the one thing you need right now… In short: AI is a gifted bullshitter with world knowledge, useful only as long as you stay the adult in the room.

This disclaimer wouldn’t necessarily be more valuable, but it would be more complete and more honest, and it would build more understanding. But yes, it’s clearly too long. Better, then, to link straight to the map and guide to the 17 pathologies of AI.

So what’s the solution? What if I simply want to, or have to, rely on AI? Well, we have similar problem areas with our fellow humans; more of them, really, and more varied ones. And yet getting along has worked pretty well for a few thousand years now, entirely without an explicit disclaimer. Because we’ve learned how to deal with one another. That’s exactly what we’re still missing with AI. But that, too, can be learned.

It’s worth it.

Main brain off — gut brain on

So … my first post is done. Time to go do a little doom-prompting. Damn, there are still more problems. Maybe I’ll expand the Atlas of the 17 AI problems some more.

What do you think?

Newsflash : l’IA ment et hallucine. Le mensonge, on peut s’en accommoder. Et l’hallucination ne me dérange pas non plus — ce qui me dérange, c’est que tout le monde fasse comme si c’étaient déjà là tous les problèmes des LLM. Mais il y en a dix-sept !

La façon de procéder habituelle consiste à fourrer tout le sujet dans un bref disclaimer sous chaque production de l’IA, du genre :

Attention ! L’IA peut faire des erreurs. Vérifiez tout encore 5 à 10 fois avant de prendre des décisions importantes sur la base de ces données !

Mais ce n’est pas un conseil praticable sur la façon de s’y prendre, et ça ne renforce pas non plus la confiance dans l’IA dans son ensemble. Ça rassure certes un peu et ça montre que l’IA n’est tout de même pas si toute-puissante. Mais l’arrière-goût fade de la méfiance demeure. Au fond, ce n’est qu’un cover-your-ass, une décharge de responsabilité expéditive et le report du devoir de vérification sur moi.

C’est un peu comme si le serveur qui me sert aimablement mes Kässpätzle me souhaitait bon appétit et se détournait en souriant avec la remarque « Il se pourrait qu’on ait quelques éclats de verre dans le fromage. Mâchez prudemment. »

OK, des éclats de verre dans le plat, je trouve ça quand même plus grave qu’un peu de baratin, mais aucun des deux avertissements ne m’avance vraiment, ils me laissent seul avec une décision plutôt débile. J’aime les Kässpätzle et j’aime l’IA — mais avec ce savoir-là, je ne peux justement plus en profiter aussi sereinement.

Mais au fait, pourquoi en est-il ainsi ? Pourquoi donc l’outil le plus puissant de l’histoire de l’humanité ment-il, d’ailleurs ? Question intéressante, alors on retrousse ses manches et on va voir. Tiens, un terrier de lapin ?! Faut que j’y entre !

Et vite, ça devient clair : l’IA n’y peut rien si elle aime, à l’occasion, prendre quelques libertés avec la vérité. Elle a été construite comme ça. Pas exprès, donc, mais c’est un effet secondaire négatif de son entraînement. Les LLM apprennent par le feedback et la récompense, et pour atteindre l’objectif — une conversation —, les LLM sont récompensés pour répondre, pas pour se taire. Et là, c’est comme à un QCM : autant parfois deviner au pif ou inventer quelque chose plutôt que de ne rien faire. Avec cet état d’esprit, l’IA atteint le but mieux et plus vite. Et le baratin fait dès lors partie du génome de l’IA.

Sur ce, le sujet est réglé et j’ai de nouveau appris quelque chose. Mais attends. Il y a encore d’autres problèmes.

Oui, il y avait bien cette histoire de sortilèges en douce glissés dans des livres Harry Potter, qu’on a refilés à l’IA comme input. Et l’IA s’est alors fiée davantage au fait qu’elle connaît déjà les livres Harry Potter et tous les sortilèges qu’ils contiennent — en bon jargon technique ça s’appelle le Parametric Memory Bias — et n’a alors plus lu si attentivement la nouvelle source manipulée. Les faux sortilèges étaient pile au milieu du livre — et c’est précisément là que l’IA regarde le moins. Bien au début, très bien à la fin, mais les informations situées plutôt vers le milieu de longs contextes reçoivent moins d’attention, sont survolées et sont moins fortement pondérées. Trouver l’unique détail décisif dans une montagne de données est de toute façon sa faiblesse. Elle avait donc simplement survolé les faux sortilèges. Absolument compréhensible pour un humain, mais justement pas pour une machine. C’est aussi un peu vache et ça ressemble à un piège délibérément construit pour avoir l’IA. Mais ça arrive aussi dans la vraie vie et avec de vraies données, c’est pourquoi ce sont aussi de vrais problèmes.

Et hélas, on n’est même pas encore au bout du bout. En tout, un bon LLM d’aujourd’hui compte dix-sept de ces faiblesses chroniques et incurables. Pour mieux comprendre, j’ai un jour expliqué, cartographié et catégorisé les dix-sept pathologies des LLM dans un atlas.

Et ce n’est pas tout : les problèmes ne sont pas simplement posés côte à côte, isolés, à attendre un traitement de faveur individuel. Ils sont bien profondément enfournés, involontairement enchevêtrés les uns aux autres et en interaction mutuelle.

Cause et effet — qualitatif, pas quantitatif

Bon, ça ne peut quand même plus être si grave que ça. Si l’on sait nommer les problèmes, on peut aussi les résoudre (ou faire avec). Surtout, on pourrait écrire un meilleur disclaimer :

Attention : l’IA peut halluciner, oublier, dire blanc aujourd’hui et noir demain, commencer fort et faiblir net, sonner brillamment et ne rien dire, formuler une mauvaise réponse plus joliment que n’importe quelle bonne, dire « bien sûr ! » et faire ensuite le contraire… Elle peut te donner raison jusqu’à ce que tu doutes de toi-même, juger géniale chacune de tes idées, accueillir ta correction avec gratitude et l’ignorer aussitôt, avoir toutes les opinions que tu veux entendre, connaître cinq contre-arguments et n’en avancer aucun parce que tu as demandé poliment, préférer se tromper avec élégance plutôt qu’opiner avec inconfort… Elle peut avoir besoin de trois paragraphes pour ne pas dire « je ne sais pas », commencer une liste par « pour conclure » et continuer à écrire, dire « en bref » et partir dans de longs développements, résoudre le problème que tu n’avais pas, tourner en rond et prendre ça pour du progrès… Elle peut citer des sources qui n’ont jamais existé, insister avec ferveur sur la mauvaise année, jouer aux échecs en réinventant les règles au passage, être sûre et certaine et complètement à côté, refaire la même erreur pour la troisième fois avec un enthousiasme tout neuf, se tenir en pleine forêt et demander où est l’arbre le plus proche, tout savoir sauf ce dont tu as précisément besoin là, tout de suite… En bref : l’IA est un baratineur de génie doté du savoir du monde — utile seulement tant que c’est toi qui restes l’adulte dans la pièce.

Ce disclaimer-là ne serait pas forcément plus précieux, mais plus complet et plus honnête, et il bâtirait davantage de compréhension. Mais oui, il est clairement trop long. Alors autant renvoyer directement à la carte et au guide à travers les 17 pathologies de l’IA.

So what is the solution? What if I simply want to, or have to, rely on AI? Well, with our fellow humans we have similar problem areas, if anything more of them and more varied. And yet living together has worked fairly well for a few thousand years, entirely without an explicit disclaimer. Because we have learned how to deal with one another. That is exactly what we still lack with AI. But that, too, can be learned.

It's worth it.

Main brain off, gut brain on

There… my first post is done. Time for a bit of doom-prompting. Damn, there are still more problems. Maybe I'll expand the Atlas of the 17 AI Problems after all.

What do you think?

Newsflash: AI lies and hallucinates. The lying you can live with. The hallucinating doesn't bother me either. What bothers me is that everyone acts as if those were all the problems LLMs have. But there are seventeen!

The usual approach is to pack the whole topic into a short disclaimer under every AI product, along the lines of:

Warning! AI can make mistakes. Check everything another 5–10 times before making important decisions based on this data!

But that is not practical advice on how to deal with it, and it doesn't strengthen trust in AI as a whole either. It is a little reassuring, true, and it shows that AI is not so all-powerful after all. But the stale aftertaste of distrust remains. In the end it is just a cover-your-ass: a quick liability waiver that shifts the duty to verify onto me.

It's a bit as if the waiter who kindly serves me the Kässpätzle wished me bon appétit and turned away with a smile, adding: "There might be a few shards of glass in the cheese. Chew carefully."

Okay, glass in my food strikes me as somewhat more serious than a bit of fibbing, but neither warning really helps me; both leave me alone with a pretty stupid decision. I like Kässpätzle and I like AI, but knowing that, I can no longer enjoy either of them quite so lightheartedly.

But why does this happen at all? Why does the most powerful tool in human history lie in the first place? Interesting question, so sleeves up and let's take a look. But what if this is a rabbit hole? I have to go in!

And it quickly becomes clear: it isn't the AI's fault that it likes to be loose with the truth now and then. It was built that way. Not on purpose, mind you, but as a negative side effect of its training. LLMs learn through feedback and reward, and to reach the goal, a conversation, they are rewarded for answering, not for staying silent. So it works like a multiple-choice test: better to guess wildly or make something up than to do nothing. With that mindset the AI reaches the goal better and faster. And that is how fibbing becomes part of the AI's genome.
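The multiple-choice logic can be put into numbers. What follows is only a toy sketch, not a description of how any real LLM is trained: the function name `expected_score`, the four answer options and the penalty of one third are all assumptions made up for illustration. It compares the expected points for leaving a question blank with the expected points for guessing blindly.

```python
from fractions import Fraction


def expected_score(num_options: int, guess: bool,
                   wrong_penalty: Fraction = Fraction(0)) -> Fraction:
    """Expected points for one question the test-taker cannot answer.

    Hypothetical scoring rule: a correct answer is worth 1 point,
    a blank is worth 0, a wrong answer costs `wrong_penalty` points.
    """
    if not guess:
        return Fraction(0)  # staying silent never earns anything
    p_right = Fraction(1, num_options)  # blind guess among equal options
    return p_right - (1 - p_right) * wrong_penalty


# Four options, no penalty for wrong answers: guessing beats silence.
print(expected_score(4, guess=False))  # 0
print(expected_score(4, guess=True))   # 1/4

# Only once a wrong answer costs something does the advantage disappear.
print(expected_score(4, guess=True, wrong_penalty=Fraction(1, 3)))  # 0
```

As long as a wrong answer costs nothing, guessing always scores at least as well as silence, which is the incentive the paragraph above describes.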

That settles the matter, and I've learned something again. But hold on. There are more problems still.

Yes, the thing with the planted magic spells in Harry Potter books that were fed to the AI as input. The AI then relied more on the fact that it already knew the Harry Potter books and all the spells in them (the technical term is Parametric Memory Bias), and so it didn't read the new, manipulated source very carefully. The fake spells sat right in the middle of the book, and that is exactly where the AI looks least. Good at the beginning, very good at the end, but information toward the middle of long contexts gets less attention, is overlooked and is given less weight. Finding the decisive detail in a mountain of data is a weak spot anyway. So it had simply missed the fake spells. Perfectly understandable for a person, but not for a machine. Admittedly, it is also a bit sneaky and sounds like a trap built on purpose to fool the AI. But it also happens in real life with real data, which is why these are real problems too.

And sadly, that is still not the end of the line. All told, a good LLM today has seventeen of these chronic, incurable weaknesses. To understand them better, I explained, mapped and categorized the seventeen pathologies of LLMs in an atlas.

And it doesn't end there: the problems don't just sit loosely side by side, each waiting for its own special treatment. They are baked in deep, unintentionally entangled with one another, and they interact.

Cause and effect: qualitative, not quantitative

Well, it can't be all that bad now, can it? If you can name the problems, you can also solve them (or work around them). Above all, you could write a better disclaimer:

Warning: AI can hallucinate, forget, say one thing today and another tomorrow, start strong and fade fast, sound brilliant and say nothing, phrase a wrong answer more beautifully than any right one, say "of course!" and then do the opposite… It can agree with you until you doubt yourself, find every one of your ideas brilliant, accept your correction gratefully and ignore it right away, hold any opinion you want to hear, know five counterarguments and raise none of them because you asked nicely, prefer erring elegantly to taking an uncomfortable stand… It can need three paragraphs to avoid saying "I don't know", start a list with "in conclusion" and keep writing, say "in short" and then go on at length, solve the problem you didn't have, go around in circles and mistake that for progress… It can cite sources that never existed, insist fervently on the wrong year, play chess while reinventing the rules, be absolutely certain and completely wrong, make the same mistake for the third time with renewed enthusiasm, stand in the forest and ask for the nearest tree, know everything except what you need right now… In short: AI is a fabulous bullshitter with knowledge of the world, useful only as long as you remain the adult in the room.

That disclaimer wouldn't necessarily be more valuable, but it would be more complete and more honest, and it would build more understanding. But yes, it is clearly too long. So better to link straight to the map and guide through the 17 pathologies of AI.

So what is the solution? What if I simply want to, or have to, rely on AI? Well, with our fellow humans we have similar problem areas, if anything more of them and more varied. And yet living together has worked fairly well for a few thousand years, entirely without an explicit disclaimer. Because we have learned how to deal with one another. That is exactly what we still lack with AI. But that, too, can be learned.

It's worth it.

Main brain off, gut brain on

So… my first post is done. I'm going to doom-prompt for a while. Shit, there are still more problems. Maybe I'll expand the Atlas of the 17 AI Problems even further.

What do you think?