Reference Manual

Bibliography and Dataset Directory

This reference manual provides a formalized catalog of the theoretical frameworks, cognitive heuristics, computational architectures, and datasets utilized by the Taxonomy Explorer. Curated for the information architect and computational researcher, this volume emphasizes logical rigor and bibliographic precision.

1 Theoretical Frameworks: Argumentation Theory and Rhetoric

1.1 Pragma-Dialectics and Ethical Logic

Developed by Frans H. van Eemeren and Rob Grootendorst at the University of Amsterdam, Pragma-Dialectics conceptualizes argumentation as a critical discussion aimed at resolving a difference of opinion. The framework establishes ten prescriptive rules for critical engagement; a violation of any of these rules constitutes a fallacy.

The Freedom Rule

Parties must not prevent each other from advancing standpoints or casting doubt on them.

Fallacy: Ad Hominem (character attack) or Ad Baculum (threatening the opponent to silence a standpoint).

The Burden-of-Proof Rule

A party who advances a standpoint is obliged to defend it if requested.

Fallacy: Evading the Burden of Proof (failing to provide evidence) or Shifting the Burden of Proof.

The Standpoint Rule

Attacks on a standpoint must relate to the actual standpoint advanced by the protagonist.

Fallacy: Straw Man (attacking a distorted version of the claim).

The Relevance Rule

Standpoints may only be defended using argumentation specifically related to that standpoint.

Fallacy: Ignoratio Elenchi (irrelevant conclusion or non-sequitur evidence).

The Unexpressed Premise Rule

Parties may not falsely present something as an unexpressed premise or deny a premise they have left implicit.

Fallacy: Denying an Implicit Premise or falsely camouflaging an unpopular idea.

The Starting Point Rule

No party may falsely present a premise as an accepted starting point or deny an accepted starting point.

Fallacy: Falsely Presenting/Denying a Starting Point (arguing from unagreed-upon premises).

The Argument Scheme Rule

A defense is only conclusive if it employs an appropriate, correctly applied argument scheme.

Fallacy: Faulty Analogy or Argumentum Ad Populum.

The Validity Rule

Reasoning must be logically valid or capable of being made valid by making implicit premises explicit.

Fallacy: Affirming the Consequent or Denying the Antecedent (confusing necessary and sufficient conditions).

The Closure Rule

A failed defense must lead the protagonist to retract their standpoint; a successful defense must lead the antagonist to retract their doubt.

Fallacy: Refusing to retract a standpoint that was not successfully defended, or refusing to retract doubt after a successful defense.

The Usage Rule

Formulations must be clear and non-ambiguous; parties must interpret formulations as accurately as possible.

Fallacy: Purposeful Ambiguity or Equivocation.

1.2 Rhetorical Categories in Writing

Iqbal et al. (2023) examine the automated analysis of rhetorical categories in student essays. To capture cognitive levels (knowledge, comprehension, application) that structure-only models miss, the authors use Bloom's Taxonomy in place of traditional Rhetorical Structure Theory (RST).

1.3 Principles of Definitions and Learning

Stipulative vs. Analytical Definitions

Stipulative definitions are synthetic and self-generated (frequent in mathematics), whereas analytical definitions are explanatory and derived from pre-existing linguistic usage.

Idiosyncrasy

Defined as "peculiarity," this refers to the totality of personal quirks or the structural/behavioral characteristics specific to an individual or group.

Hebbian Learning

Formulated by Donald Hebb (1949), this principle of neural plasticity posits that the synaptic connection between two neurons strengthens when the presynaptic neuron repeatedly takes part in firing the postsynaptic one. It is commonly summarized as "Neurons that fire together, wire together."
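As a rough illustration, the simplest textbook form of a Hebbian update raises a connection weight in proportion to the product of presynaptic and postsynaptic activity. The function name `hebbian_update`, the learning rate `eta`, and the activity values below are assumptions chosen for the example; they are not taken from Hebb's text.

```python
def hebbian_update(weight, pre, post, eta=0.1):
    """Return the weight after one plain Hebbian step: w + eta * pre * post."""
    return weight + eta * pre * post


# Both neurons active on three consecutive steps: the weight grows each time.
w_coactive = 0.0
for _ in range(3):
    w_coactive = hebbian_update(w_coactive, pre=1.0, post=1.0)

# Presynaptic neuron active but postsynaptic neuron silent: no change.
w_idle = hebbian_update(0.0, pre=1.0, post=0.0)

print(w_coactive > w_idle)
```

This plain rule only ever increases weights when both activities are positive, which is why practical models usually add some form of normalization or decay.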

2 Directory of Cognitive Biases and Heuristics

2.1 Information Processing Biases

Anchoring Bias

The tendency to rely too heavily on the first piece of information acquired (focalism).

Common Source Bias

The tendency to combine or compare research studies that come from the same source or that rely on the same methodologies or data.

Conservatism Bias

The tendency to insufficiently revise one's belief when presented with new evidence.

Functional Fixedness

Limiting the perception of an object to its traditional or intended function.

Law of the Instrument

Over-reliance on a familiar tool or method regardless of its suitability ("Maslow's Hammer").

Apophenia

The tendency to perceive meaningful connections between unrelated phenomena.

Clustering Illusion

Overestimating the importance of small patterns or "streaks" in random data.

Illusory Correlation

Inaccurately perceiving a relationship between two unrelated events.

Pareidolia

Perceiving significant stimuli (faces, messages) in vague or random data, such as clouds or audio recordings.

Availability Heuristic

The tendency to overestimate the likelihood of events based on the ease with which examples come to mind.

Anthropocentric Thinking

Using human analogies to reason about less familiar biological phenomena.

Anthropomorphism

Attributing human-like traits, emotions, or intentions to non-human entities or objects.

Baader-Meinhof Phenomenon

The illusion that a recently noticed item has suddenly increased in frequency (Frequency Illusion).

Salience Bias

Focusing on items that are emotionally striking or prominent while ignoring unremarkable but relevant data.

Selection Bias

Errors arising when a statistical sample is not chosen at random, making it unrepresentative of the population.

2.2 Self-Perception and Egocentric Biases

Bias Blind Spot

Perceiving oneself as less biased than others or identifying more biases in others than in oneself.

Dunning-Kruger Effect

The tendency for unskilled individuals to overestimate their ability and for experts to underestimate theirs.

Illusion of Transparency

Overestimating the degree to which one's personal mental state is known by others.

Overconfidence Effect

Excessive confidence in the accuracy of one's own answers, often disproportionate to actual performance.

2.3 Social and Attributional Biases

Fundamental Attribution Error

Overemphasizing personality-based explanations for others' behavior while underemphasizing situational factors.

Halo Effect

Allowing a single positive or negative trait to influence the overall perception of a person's character.

Ingroup Bias

Giving preferential treatment to individuals perceived as members of one's own group.

Just-World Hypothesis

Rationalizing injustices as being deserved by the victim to maintain the belief that the world is fundamentally just.

2.4 Economic and Decision-Making Biases

Hyperbolic Discounting

A preference for immediate payoffs over later, larger rewards, leading to time-inconsistent decision-making.

Loss Aversion

The tendency to perceive the disutility of giving up an object as greater than the utility of acquiring it.

Sunk Cost Fallacy

Justifying increased investment in a decision based on cumulative prior investment (Escalation of Commitment).

The IKEA Effect and Effort Justification

Effort justification is the tendency to attribute greater value to an outcome based on the effort required to achieve it. A specific case is the IKEA Effect, in which consumers place a disproportionately high value on products they partially assembled themselves, regardless of quality.
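Hyperbolic discounting, listed at the start of this section, can be made concrete with the standard one-parameter form V = A / (1 + kD), where A is the reward, D the delay, and k a discount rate. The sketch below shows the time inconsistency that results: the same pair of rewards is ranked differently once both are pushed further into the future. The value k = 1.0 and the reward amounts are assumptions for illustration only.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Present value under hyperbolic discounting: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def prefers_larger_later(small, large, delay_small, delay_large, k=1.0):
    """True when the larger, later reward has the higher discounted value."""
    return hyperbolic_value(large, delay_large, k) > hyperbolic_value(small, delay_small, k)


# 100 now versus 120 after one period: the immediate reward wins.
near_choice = prefers_larger_later(100, 120, delay_small=0, delay_large=1)

# Same rewards, both delayed by ten more periods: the larger reward now wins.
far_choice = prefers_larger_later(100, 120, delay_small=10, delay_large=11)

print(near_choice, far_choice)
```

Under exponential discounting the ranking would be the same in both cases, so the reversal is specific to the hyperbolic form.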

3 Computational Frameworks: NL2FOL and Autoformalization

3.1 NL2FOL Pipeline Architecture

The Natural Language to First-Order Logic (NL2FOL) framework is a neurosymbolic pipeline that translates unstructured text into formal symbolic logic across five stages:

Semantic Decomposition — Breaking arguments into constituent claims and implications.

Entity Extraction — Identifying noun phrases or surrogates as logical entities.

Relation Classification — Using Natural Language Inference (NLI) to determine subset, equality, or unrelated statuses between entities.

Property Extraction — Identifying traits and relationships as logical predicates.

Background Knowledge Retrieval — Identifying real-world contextual relationships (e.g., StandingAt => At).
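To see why the background knowledge stage matters, the following sketch checks a toy argument by brute force, with and without a rule of the kind shown above (StandingAt => At). It is a deliberate simplification: it treats the predicates as propositional atoms and enumerates truth assignments instead of calling an SMT solver. The helper `find_counter_model` and the atom names are hypothetical.

```python
from itertools import product


def find_counter_model(atoms, premises, conclusion):
    """Return an assignment that satisfies every premise but falsifies the
    conclusion, or None when no such assignment exists (argument is valid)."""
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(premise(model) for premise in premises) and not conclusion(model):
            return model
    return None


atoms = ["standing_at", "at"]
claim = lambda m: m["standing_at"]                         # extracted claim
implication = lambda m: m["at"]                            # extracted implication
background = lambda m: (not m["standing_at"]) or m["at"]   # StandingAt => At

# Without the background rule a counter-model exists, so the argument
# would be wrongly flagged as fallacious.
without_background = find_counter_model(atoms, [claim], implication)

# With the rule added as a premise, no counter-model remains.
with_background = find_counter_model(atoms, [claim, background], implication)

print(without_background, with_background)
```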

3.2 SMT Integration and Reasoning

Logical validity is verified using the CVC4 Satisfiability Modulo Theories (SMT) solver. The translation is governed by Algorithm 1, which facilitates autoformalization:

Tokenization & Processing

The FOL formula is split into tokens; predicates and arguments are identified recursively.

Compilation

The formula is converted from infix to prefix notation.

Sort Unification

Variables and predicates are assigned sorts to ensure types are compatible (UnifySort).

Verification

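The SMT compiler formats the formula into SMT-LIB syntax. The solver then checks the negation of the formula: if the negation is satisfiable, the resulting counter-model shows that the argument is invalid and it is flagged as a fallacy; if it is unsatisfiable, the argument is valid.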
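The tokenization and compilation steps above can be sketched in a few lines. This is not Algorithm 1 itself: it is a minimal precedence-climbing parser under stated assumptions, namely quantifier-free formulas, the ASCII operators `~`, `&`, `|`, and `->`, and no sort declarations. All function names are hypothetical, and the output is only the assertion line of an SMT-LIB script, not a complete solver input.

```python
import re

# One token is an operator, a parenthesis, or an atom such as P or P(a, b).
TOKEN = re.compile(r"\s*(->|[&|~()]|[A-Za-z_]\w*(?:\([^()]*\))?)")
PRECEDENCE = {"&": 2, "|": 1, "->": 0}
SMT_NAME = {"&": "and", "|": "or", "->": "=>"}


def tokenize(formula):
    """Split an infix formula into operator, parenthesis, and atom tokens."""
    formula = formula.strip()
    tokens, pos = [], 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def atom_to_prefix(atom):
    """Rewrite P(a, b) as (P a b); leave a bare name unchanged."""
    if "(" not in atom:
        return atom
    name, args = atom[:-1].split("(", 1)
    return "(" + " ".join([name] + [a.strip() for a in args.split(",")]) + ")"


def to_prefix(tokens):
    """Convert infix tokens to an SMT-LIB style prefix string."""
    pos = 0

    def parse_unary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "~":
            return f"(not {parse_unary()})"
        if tok == "(":
            inner = parse_binary(0)
            pos += 1  # skip the closing parenthesis
            return inner
        return atom_to_prefix(tok)

    def parse_binary(min_prec):
        nonlocal pos
        left = parse_unary()
        while pos < len(tokens) and tokens[pos] in PRECEDENCE:
            op = tokens[pos]
            prec = PRECEDENCE[op]
            if prec < min_prec:
                break
            pos += 1
            # "->" is right-associative; "&" and "|" are left-associative.
            right = parse_binary(prec if op == "->" else prec + 1)
            left = f"({SMT_NAME[op]} {left} {right})"
        return left

    return parse_binary(0)


def negated_assertion(formula):
    """Wrap the prefix form in the negated assertion passed to the solver."""
    return f"(assert (not {to_prefix(tokenize(formula))}))"


print(negated_assertion("(StandingAt(x, y) & (StandingAt(x, y) -> At(x, y))) -> At(x, y)"))
```

A full script would also need `declare-sort` and `declare-fun` lines produced by the sort unification step, followed by `(check-sat)`.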

3.3 Performance Metrics and Error Analysis

The neurosymbolic pipeline generalizes better than end-to-end LLM classification on the out-of-domain LOGIC-CLIMATE challenge set, although it scores lower on the in-domain LOGIC dataset.

Dataset	NL2FOL (GPT-4o) F1	End-to-End LLM F1
LOGIC	78%	96%
LOGIC-CLIMATE	80%	58%

Note: The high F1-score for end-to-end LLMs on the LOGIC dataset suggests potential training leakage, as the dataset is compiled from public web sources.
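For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it from confusion counts; the counts in the usage lines are invented for illustration and are not taken from the evaluation above.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 = 2 * precision * recall / (precision + recall), or 0.0 if undefined."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 fallacies caught, 2 false alarms, 2 fallacies missed.
example_f1 = f1_score(true_pos=8, false_pos=2, false_neg=2)
print(round(example_f1, 2))
```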

Failure Modes (Case Study Analysis)

Missing Background Knowledge (54%)

Primary failure occurs in the Background Knowledge Retriever, where implicit context is missed or added incorrectly.

NLI Limitations

Difficulty in identifying concurrent relationships where multiple properties entail a third.

LLM Imprecision

False translations or errors in the Claim and Implication Parser, such as failing to identify when a sentence contains no claim.

4 Dataset and Corpus Directory

4.1 LOGIC Dataset

Contains 2,449 examples of common logical fallacies across 13 categories, including Ad Hominem, False Causality, False Dilemma, Faulty Generalization, and Ad Populum. The LOGIC-CLIMATE challenge set (1,079 examples) tests out-of-domain generalization using examples drawn from climate-change news articles.

4.2 SNLI (Stanford Natural Language Inference)

Provides "Valid" (non-fallacious) benchmarks. The Explorer utilizes the entailment class (approx. 170,000 pairs). Valid benchmarks are constructed by combining the premise and hypothesis using transitions (e.g., "Premise. Consequently, Hypothesis").
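The construction described above amounts to simple string templating. The sketch below follows the "Premise. Consequently, Hypothesis" pattern literally; the function name `make_valid_example`, the default transition word, and the sample sentences are assumptions for illustration, and the hypothesis casing is left untouched.

```python
def make_valid_example(premise, hypothesis, transition="Consequently"):
    """Join an entailment pair into one argument: 'Premise. Transition, Hypothesis'."""
    premise = premise.strip()
    # Make sure the premise ends as a complete sentence before the transition.
    if not premise.endswith((".", "!", "?")):
        premise += "."
    return f"{premise} {transition}, {hypothesis.strip()}"


valid_example = make_valid_example("A man plays a guitar", "A person makes music.")
print(valid_example)
```

Varying the `transition` argument (for example "Therefore" or "Thus") would give surface diversity without changing the entailment relation.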

4.3 COCOLOFA

A specialized dataset consisting of news comments containing common logical fallacies, serving as a primary source for the Taxonomy Explorer's training on informal discourse.

5 Automated Text Analysis and NLP Toolset

5.1 Linguistic Sophistication and Cohesion Tools

ARTE

Automatically calculates various readability formulas, including Flesch-Kincaid and CAREC.

CLA

Analyzes texts using large custom dictionaries, supporting n-grams and wildcards.

CRAT

Includes 700+ indices for lexical sophistication and source-to-summary text overlap.

GAMET

Provides incidence counts for structural and mechanics errors (grammar, spelling, punctuation).

SEANCE

Offers 254 indices for sentiment analysis, with filters for parts of speech and negation.

SiNLP

A simple tool for basic word/sentence counts, TTR, and custom dictionary analysis.

TAACO

Calculates 150 indices of local and global cohesion, including type-token ratios.

TAADA

Annotates lexical features related to decoding, including phoneme and syllable counts.

TAALED

Calculates lexical diversity indices using lemma forms and POS disambiguation.

TAALES

Measures 400+ indices of lexical sophistication for single words and n-grams.

TAASSC

Analyzes syntactic complexity, focusing on clausal and phrasal complexity.

TAMMI

Measures morphological variety and complexity based on the MorphoLex database.
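Several of the tools above report type-token ratio (TTR), the number of distinct word forms divided by the total number of words. The sketch below is a minimal version for orientation only; the regular-expression tokenizer and lowercasing are assumptions, and the listed tools apply their own tokenization, lemmatization, and length corrections.

```python
import re


def type_token_ratio(text):
    """Return distinct tokens divided by total tokens, or 0.0 for empty input."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Five tokens, four distinct types ("the" repeats).
sample_ttr = type_token_ratio("The cat saw the dog")
print(sample_ttr)
```

Raw TTR falls as texts get longer, which is one reason dedicated tools offer length-adjusted diversity indices.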

6 Official Bibliography and Source Registry

6.1 Primary Academic Citations

Eemeren, F. H. van, & Grootendorst, R. (1996). Fundamentals of Argumentation Theory: A Handbook of Historical Backgrounds and Contemporary Developments. Lawrence Erlbaum Associates.
Eemeren, F. H. van, Grootendorst, R., & Snoeck Henkemans, A. F. (2002). Argumentation: Analysis, Evaluation, Presentation. Lawrence Erlbaum Associates.
Iqbal, S., Rakovic, M., Chen, G., Li, T., Ferreira Mello, R., Fan, Y., Fiorentino, G., Radi Aljohani, N., & Gasevic, D. (2023). Towards automated analysis of rhetorical categories in students essay writings using Bloom's taxonomy. In I. Hilliger, H. Khosravi, B. Rienties, & S. Dawson (Eds.), LAK 2023 Conference Proceedings (pp. 418–429). Association for Computing Machinery. doi:10.1145/3576050.3576112
Lalwani, A., Kim, T., Chopra, L., Hahn, C., Jin, Z., & Sachan, M. (2024). Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection. ACL Anthology. github.com/lovishchopra/NL2FOL
Benson, B. (2019). Why Are We Yelling? The Art of Productive Disagreement. Portfolio/Penguin. ISBN 978-0525540106.
Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.
Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases. Science, 185(4157), 1124–1131. doi:10.1126/science.185.4157.1124
Gigerenzer, G., & Brighton, H. (2009). Homo Heuristicus: Why Biased Minds Make Better Inferences. Topics in Cognitive Science, 1(1), 107–143. doi:10.1111/j.1756-8765.2008.01006.x
Walton, D. (2008). Informal Logic: A Pragmatic Approach. (2nd ed.) Cambridge University Press.
Walton, D., Reed, C., & Macagno, F. (2008). Argumentation Schemes. Cambridge University Press.
Toulmin, S. E. (1958/2003). The Uses of Argument. (Updated ed.) Cambridge University Press.
Haselton, M. G., & Nettle, D. (2006). The Paranoid Optimist: An Integrative Evolutionary Model of Cognitive Biases. Personality and Social Psychology Review, 10(1), 47–66.
Simon, H. A. (1956). Rational Choice and the Structure of the Environment. Psychological Review, 63(2), 129–138.
Mercier, H., & Sperber, D. (2017). The Enigma of Reason. Harvard University Press.
Hamblin, C. L. (1970). Fallacies. Methuen. (Reprinted by Vale Press, 2004.)
Ariely, D. (2008). Predictably Irrational: The Hidden Forces That Shape Our Decisions. HarperCollins.
Cialdini, R. B. (2006). Influence: The Psychology of Persuasion. (Revised ed.) Harper Business.
Pratkanis, A., & Aronson, E. (2001). Age of Propaganda: The Everyday Use and Abuse of Persuasion. W. H. Freeman.

6.2 Digital and Encyclopedia References

Humanities LibreTexts. (2021). 10.6: Pragma-Dialectics — A Fancy Word for a Close Look at Argumentation. Writing Spaces — Readings on Writing I.
Wikipedia. (n.d.). List of cognitive biases. en.wikipedia.org
Wikipedia. (n.d.). Liste kognitiver Verzerrungen. de.wikipedia.org
Benson, B. (2016). Cognitive Bias Cheat Sheet. Medium. medium.com
Manoogian III, J. (2016). Cognitive Bias Codex [Infographic]. Based on Benson's organizational framework. Wikimedia Commons
Your Logical Fallacy Is. (n.d.). yourlogicalfallacyis.com
Stanford Encyclopedia of Philosophy. (n.d.). Fallacies. plato.stanford.edu
RationalWiki. (n.d.). List of cognitive biases. rationalwiki.org