Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 4:27 pm
by ConfusedJew
Is there an existing database with links or PDFs of all the Holocaust denial texts and sources?

Holocaust denial literature is fringe, often self-published, and not widely available in mainstream academic or library databases. That means LLMs often summarize its positions from secondary sources rather than quoting them directly, which can lead to hallucinations and errors. Deniers often write in a technical-sounding style, so simplification can misrepresent their points.

By building a Retrieval-Augmented Generation (RAG) pipeline, LLMs can be connected to that database and automatically pull exact passages from the texts in order to rebut them, which greatly reduces the risk of making up quotations.

Have you guys built anything like this yourself already? I don't have the time to go through all the material by hand obviously, but I'm trying to figure out if there's a way to use technology to be more effective and efficient.
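
A minimal sketch of the retrieve-then-quote idea described above, assuming a small in-memory corpus. Everything here is hypothetical and for illustration only: the function names (`build_index`, `retrieve`, `quote_is_verbatim`), the placeholder documents, and the simple keyword scoring, which stands in for the embedding-based retriever a real RAG setup would more likely use. The verbatim check only guards against fabricated quotations; it does nothing to stop a model from misreading or mis-summarizing a passage it did retrieve.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(corpus):
    """Split each document into passages on blank lines.

    Returns (passages, df): passages are (doc_id, index, text) tuples so a
    hit can be cited; df counts how many passages contain each token.
    """
    passages = []
    for doc_id, text in corpus.items():
        parts = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
        passages.extend((doc_id, i, p) for i, p in enumerate(parts))
    df = Counter()
    for _, _, passage in passages:
        df.update(set(tokenize(passage)))
    return passages, df


def retrieve(query, passages, df, k=2):
    """Return the top-k passages as (score, doc_id, index, text)."""
    n = len(passages)
    query_tokens = set(tokenize(query))
    scored = []
    for doc_id, i, passage in passages:
        tf = Counter(tokenize(passage))
        # Rare tokens weigh more than common ones (a rough TF-IDF).
        score = sum(tf[t] * math.log(1 + n / df[t]) for t in query_tokens if t in tf)
        if score > 0:
            scored.append((score, doc_id, i, passage))
    scored.sort(key=lambda s: (-s[0], s[1], s[2]))
    return scored[:k]


def quote_is_verbatim(quote, retrieved):
    """True only if the quote appears word for word in a retrieved passage
    (whitespace differences are ignored)."""
    needle = " ".join(quote.split())
    return any(needle in " ".join(p.split()) for _, _, _, p in retrieved)


# Placeholder corpus (assumption): real use would load the actual texts.
corpus = {
    "doc_a": "The bridge opened in 1932.\n\nIts main span measures 503 metres.",
    "doc_b": "The tunnel was completed in 1992.\n\nIt carries two lanes in each direction.",
}

passages, df = build_index(corpus)
hits = retrieve("how long is the main span of the bridge", passages, df)
for score, doc_id, i, passage in hits:
    print(f"{doc_id}[{i}] score={score:.2f}: {passage}")
```

In a setup like this, the retrieved passages would be pasted into the LLM prompt, and any quotation in the model's answer would be passed through `quote_is_verbatim` before it is posted; a quote that fails the check gets discarded or looked up by hand.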

Re: Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 6:29 pm
by Wetzelrad
Your AI already makes up quotes from Holocaust affirmation literature. Why should anyone want that problem to be extended to revisionist literature?
https://www.codohforum.com/viewtopic.php?p=13146#p13146

Your AI also misrepresents the content of that literature in other ways. I have also seen it on several occasions make reverse arguments from what revisionists wrote, as if it had read and understood the logic but flipped the facts around.
https://www.codohforum.com/viewtopic.php?p=9762#p9762

Even so. From having tried Grok, I can confirm that it can access revisionist texts (from archive.org or niche websites) and derive information from them. Unfortunately it is still an AI so it misrepresents what it reads. I will share a link to one example below, which was a question I asked in the context of our conversation from two days ago. Grok was able to find a reference to a wall crack where cyanide was not found (or was below the detection level), but it reported this to me as if there were multiple cracks where cyanide was found. (This is definitely wrong. See page 263 of the Rudolf Report or 323 of TCOA.) From this example alone it becomes apparent that AI is not able to accurately summarize what it reads.
https://x.com/i/grok?conversation=1971416809543696632

The mistake Grok made is probably similar to what your AI did when it wrote the post I was responding to. You should check your AI's sources. Probably it will tell you that it already is using revisionist literature to formulate its responses. The reason it's forced to lie to create its responses is because it has instructions to affirm the Holocaust narrative.

Re: Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 6:47 pm
by HansHill
It feels psychotic the extent to which you will go to actively avoid engaging with revisionist material.

Were you as serious as you pretend to be, you would have made progress through at least a handful of the ~50 Holocaust Handbook series available for free via CODOH

Your latest sack of horsesh#t is laughable:

- Fringe: irrelevant because you successfully made your way here which means you can make your way to the HH series
- Self-published: irrelevant because you are willing to engage with "self-published" forum posters like us
- Not widely available: see point 1
- Academic or library databases: see point 1
- "have you guys tried anything like that?" - See point 1
- "I don't have time" - Then until you find time, you must accept you will continue to be second best in every exchange you find yourself in.

Re: Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 7:48 pm
by Archie
ConfusedJew wrote: Sat Sep 27, 2025 4:27 pm Is there an existing database with links or PDFs of all the Holocaust denial texts and sources?
Oh, you mean like Holocaust Handbooks which we have referred you to repeatedly over the last five months?
Holocaust denial literature is fringe, often self-published, and not widely available in mainstream academic or library databases. That means LLMs often summarize its positions from secondary sources rather than quoting them directly, which can lead to hallucinations and errors. Deniers often write in a technical-sounding style, so simplification can misrepresent their points.
Most of our material is available online for free, unlike the material from mainstream commercial and academic publishers. Revisionist material is more not less accessible because we are not as concerned with turning a profit. (As an aside, I will mention that good university libraries actually do have revisionist books in their catalogs).

The output from LLMs is mostly from online secondary sources like Wikipedia, Reddit, Amazon, Goodreads, Google results, etc. From my interactions with it, in most cases I do not think it is using the original texts of revisionist or mainstream authors. That is, if you ask it about Hilberg or Van Pelt, I don't think it's "reading" their books. It's summarizing what other people have said about Hilbert of Van Pelt, along with a big dose of often very wrong interpolations. The AI companies certainly have access to tons of revisionist material (their bots crawl our sites constantly), but they have put their thumb on the scale on the Holocaust and artificially suppress that material, similar to what Google does. If you search for info on a revisionist argument on Google, generally it will give you a bunch of Jewish sites attacking revisionists without showing you the revisionist side. And obviously this bias is obviously already baked in to sources like Wikipedia. So of course it is not going to reliably summarize revisionist thought. But I would add that it isn't very reliable on the mainstream literature, at least in terms of the minutiae which require detailed consultation of the original texts and documents.
By building a Retrieval-Augmented Generation (RAG), LLMs can be connected to that database and automatically pull exact passages from the text in order to rebut which eliminates the risk of making up quotations.

Have you guys built anything like this yourself already? I don't have the time to go through all the material by hand obviously, but I'm trying to figure out if there's a way to use technology to be more effective and efficient.
Dude, just read some books and form your own opinion about it. Stop it with the shortcuts that clearly aren't working.

It's been five months now. If you had simply started reading like a normal person you could have easily read a half a dozen books by now or gone through an equivalent amount of online sources. Instead you have decided to waste your time going in circles, posting nonsense from GPT which you lack the context to understand and which you are too lazy to cross-check.

Re: Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 9:54 pm
by ConfusedJew
HansHill wrote: Sat Sep 27, 2025 6:47 pm Were you as serious as you pretend to be, you would have made progress through at least a handful of the ~50 Holocaust Handbook series available for free via CODOH
Have you read any books by Raul Hilberg, Saul Friedlander, Yehuda Bauer, Christopher Browning, or Deborah Lipstadt?
Your latest sack of horsesh#t is laughable:

- Fringe: irrelevant because you successfully made your way here which means you can make your way to the HH series
- Self-published: irrelevant because you are willing to engage with "self-published" forum posters like us
- Not widely available: see point 1
- Academic or library databases: see point 1
- "have you guys tried anything like that?" - See point 1
- "I don't have time" - Then until you find time, you must accept you will continue to be second best in every exchange you find yourself in.
I'm explaining why LLMs need to be trained specifically on your content. ChatGPT isn't trained on self-published and fringe content. I'm looking for ways to deal with this more accurately and efficiently.

Re: Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 9:57 pm
by ConfusedJew
Archie wrote: Sat Sep 27, 2025 7:48 pm If you search for info on a revisionist argument on Google, generally it will give you a bunch of Jewish sites attacking revisionists without showing you the revisionist side. And obviously this bias is obviously already baked in to sources like Wikipedia. So of course it is not going to reliably summarize revisionist thought.
The issue is that it isn't directly trained on the material so I'm trying to figure out a way to train it directly on the material so that I can respond more directly. It can figure out the science, but unless it's trained on the data and arguments, it will not be that effective.

Re: Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 10:05 pm
by Nazgul
ConfusedJew wrote: Sat Sep 27, 2025 9:57 pm It can figure out the science, but unless it's trained on the data and arguments, it will not be that effective.
It cannot understand the science, but only skim the surface at best. You need to understand the science.

Re: Retrieval Augmented Generation

Posted: Sat Sep 27, 2025 10:51 pm
by Archie
ConfusedJew wrote: Sat Sep 27, 2025 9:54 pm Have you read any books by Raul Hilberg, Saul Friedlander, Yehuda Bauer, Christopher Browning, or Deborah Lipstadt?
Hilberg - yes. I read the 1961 edition of Destruction (skimmed some parts, other key parts I have read many times and checked the sources). I read some of Politics of Memory. I've read his Zundel testimony. And lots of revisionist commentary on him.
S. Friedlander - no
Bauer - I have a copy of American Jewry and the Holocaust on my shelves and have read parts of it. I have read several of his articles.
Browning - yes. I read Path to Genocide and took many pages of notes. I read several other articles. I plan on reading Origins of the Final Solution at some point.
Lipstadt - yes. Denying the Holocaust and Beyond Belief.

Re: Retrieval Augmented Generation

Posted: Sun Sep 28, 2025 12:30 am
by ConfusedJew
Nazgul wrote: Sat Sep 27, 2025 10:05 pm It cannot understand the science, but only skim the surface at best. You need to understand the science.
It's not perfect, but it can really get to the root of issues. I want to upload all of the scientific documents to see if it can identify where the crux of the disagreement is.

Which are the most important documents or writings on the chemistry side? I'll look for the most important ones there and the most important mainstream rebuttals to see where the gaps and disagreements are.