Bibliography and Dataset Directory
This reference manual provides a formalized catalog of the theoretical frameworks, cognitive heuristics, computational architectures, and datasets utilized by the Taxonomy Explorer. Curated for the information architect and computational researcher, this volume emphasizes logical rigor and bibliographic precision.
Developed by Frans H. van Eemeren and Rob Grootendorst at the University of Amsterdam, Pragma-Dialectics conceptualizes argumentation as a speech situation aimed at resolving a difference of opinion. The framework establishes ten prescriptive rules for critical engagement; any move that violates one of these rules counts as a fallacy.
Parties must not prevent each other from advancing standpoints or casting doubt on them.
Fallacy: Ad Hominem (character attack) or Argumentum ad Baculum (threatening the opponent to silence a standpoint).
A party who advances a standpoint is obliged to defend it if requested.
Fallacy: Evading the Burden of Proof (failing to provide evidence) or Shifting the Burden of Proof.
Attacks on a standpoint must relate to the actual standpoint advanced by the protagonist.
Fallacy: Straw Man (attacking a distorted version of the claim).
Standpoints may only be defended using argumentation specifically related to that standpoint.
Fallacy: Ignoratio Elenchi (irrelevant conclusion or non-sequitur evidence).
Parties may not falsely present something as an unexpressed premise or deny a premise they have left implicit.
Fallacy: Denying an Implicit Premise or falsely camouflaging an unpopular idea.
No party may falsely present a premise as an accepted starting point or deny an accepted starting point.
Fallacy: Falsely Presenting/Denying a Starting Point (arguing from unagreed-upon premises).
A defense is only conclusive if it employs an appropriate, correctly applied argument scheme.
Fallacy: Faulty Analogy or Argumentum Ad Populum.
Reasoning must be logically valid or capable of being made valid by making implicit premises explicit.
Fallacy: Secundum Quid (hasty generalization) or confusing cause and effect.
A failed defense must lead the protagonist to retract their standpoint; a successful defense must lead the antagonist to retract their doubt.
Fallacy: Refusal to Retract an unsound argument.
Formulations must be clear and non-ambiguous; parties must interpret formulations as accurately as possible.
Fallacy: Purposeful Ambiguity or Equivocation.
The research of Iqbal et al. (2023) examines the automated analysis of rhetorical categories in student prose. To measure cognitive levels (knowledge, comprehension, application) more effectively than structure-only models, the authors integrated Bloom's Taxonomy as a functional replacement for traditional Rhetorical Structure Theory (RST).
Stipulative definitions are synthetic and self-generated (frequent in mathematics), whereas analytical definitions are explanatory and derived from pre-existing linguistic usage.
Idiosyncrasy, defined as "peculiarity," refers to the totality of personal quirks or the structural/behavioral characteristics specific to an individual or group.
Formulated by Donald Hebb (1949), this principle of neural plasticity posits that synaptic strength between two neurons increases when both are active simultaneously ("Neurons that fire together, wire together").
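As a rough illustration of the Hebbian principle, the simplest textbook form of the rule increases a weight in proportion to the product of pre- and postsynaptic activity, Δw = η · post · pre. The sketch below assumes that basic form; the function name `hebbian_update` and the learning rate of 0.1 are illustrative choices, not anything specified by the Taxonomy Explorer.

```python
def hebbian_update(weights, pre, post, learning_rate=0.1):
    """Return new weights where w[i][j] grows by learning_rate * post[i] * pre[j].

    weights[i][j] is the connection from presynaptic unit j to postsynaptic unit i.
    """
    return [
        [w_ij + learning_rate * post_i * pre_j for w_ij, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Two presynaptic units, two postsynaptic units, all weights initially zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]   # only the first presynaptic unit is active
post = [1.0, 1.0]  # both postsynaptic units are active

updated = hebbian_update(weights, pre, post)
# Only connections from the active presynaptic unit are strengthened.
```

Connections whose two endpoints are active together gain strength, while connections from the silent presynaptic unit stay at zero.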
The tendency to rely too heavily on the first piece of information acquired (focalism).
Comparing research studies that utilize identical methodologies or data sources.
The tendency to insufficiently revise one's belief when presented with new evidence.
Limiting the perception of an object to its traditional or intended function.
Over-reliance on a familiar tool or method regardless of its suitability ("Maslow's Hammer").
The tendency to perceive meaningful connections between unrelated phenomena.
Overestimating the importance of small patterns or "streaks" in random data.
Inaccurately perceiving a relationship between two unrelated events.
Perceiving significant stimuli (faces, messages) in vague or random data, such as clouds or audio recordings.
The tendency to overestimate the likelihood of events based on the ease with which examples come to mind.
Using human analogies to reason about less familiar biological phenomena.
Attributing human-like traits, emotions, or intentions to non-human entities or objects.
The illusion that a recently noticed item has suddenly increased in frequency (Frequency Illusion).
Focusing on items that are emotionally striking or prominent while ignoring unremarkable but relevant data.
Errors arising when a statistical sample is not chosen at random, making it unrepresentative of the population.
Perceiving oneself as less biased than others or identifying more biases in others than in oneself.
The tendency for unskilled individuals to overestimate their ability and for experts to underestimate theirs.
Overestimating the degree to which one's personal mental state is known by others.
Excessive confidence in the accuracy of one's own answers, often disproportionate to actual performance.
Overemphasizing personality-based explanations for others' behavior while underemphasizing situational factors.
Allowing a single positive or negative trait to influence the overall perception of a person's character.
Giving preferential treatment to individuals perceived as members of one's own group.
Rationalizing injustices as being deserved by the victim to maintain the belief that the world is fundamentally just.
A preference for immediate payoffs over later, larger rewards, leading to time-inconsistent decision-making.
The psychological tendency where the disutility of losing an object is perceived as greater than the utility of acquiring it.
Justifying increased investment in a decision based on cumulative prior investment (Escalation of Commitment).
Effort justification is the tendency to attribute greater value to an outcome based on the effort required to achieve it. A subset of this is the IKEA Effect, where consumers place a disproportionately high value on products they partially assembled themselves, regardless of quality.
The Natural Language to First-Order Logic (NL2FOL) framework is a neurosymbolic pipeline that translates unstructured text into formal symbolic logic across five stages.
Logical validity is verified using the CVC4 Satisfiability Modulo Theories (SMT) solver. The translation is governed by Algorithm 1, which facilitates autoformalization:
The FOL formula is split into tokens; predicates and arguments are identified recursively.
The formula is converted from infix to prefix notation.
Variables and predicates are assigned sorts to ensure types are compatible (UnifySort).
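The SMT compiler formats the formula into SMT-LIB syntax. The solver then checks the negation of the formula: if the negation is unsatisfiable, the argument is valid; if it is satisfiable, the resulting counter-model shows that the argument is invalid and flags a logical fallacy.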
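The steps above can be sketched in a few lines of Python. This is a simplified illustration, not the NL2FOL implementation: it assumes a small quantifier-free input syntax (`~`, `&`, `|`, `->`, predicates with at least one argument), treats every argument as a constant, and stands in for UnifySort by assigning a single hypothetical sort named `Entity`. The names `tokenize`, `Parser`, and `to_smtlib` are invented for this example.

```python
import re


def tokenize(formula):
    """Split an infix formula into identifiers, connectives, and parentheses."""
    return re.findall(r"->|[()&|~,]|\w+", formula)


class Parser:
    """Recursive-descent parser that rewrites infix formulas in prefix form."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.signature = {}     # predicate name -> arity
        self.constants = set()  # every argument is treated as a constant

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def implication(self):  # lowest precedence, right-associative
        left = self.disjunction()
        if self.peek() == "->":
            self.take()
            return f"(=> {left} {self.implication()})"
        return left

    def disjunction(self):
        left = self.conjunction()
        while self.peek() == "|":
            self.take()
            left = f"(or {left} {self.conjunction()})"
        return left

    def conjunction(self):
        left = self.unary()
        while self.peek() == "&":
            self.take()
            left = f"(and {left} {self.unary()})"
        return left

    def unary(self):
        if self.peek() == "~":
            self.take()
            return f"(not {self.unary()})"
        if self.peek() == "(":
            self.take()
            inner = self.implication()
            self.take()  # closing parenthesis
            return inner
        return self.atom()

    def atom(self):
        name = self.take()
        self.take()  # opening parenthesis of the argument list
        args = []
        while self.peek() not in (")", None):
            token = self.take()
            if token != ",":
                args.append(token)
        self.take()  # closing parenthesis
        self.signature[name] = len(args)
        self.constants.update(args)
        return f"({name} {' '.join(args)})"


def to_smtlib(formula):
    """Return the prefix formula and an SMT-LIB script asserting its negation."""
    parser = Parser(tokenize(formula))
    prefix = parser.implication()
    lines = [
        "(set-option :produce-models true)",
        "(set-logic QF_UF)",
        "(declare-sort Entity 0)",
    ]
    lines += [f"(declare-fun {c} () Entity)" for c in sorted(parser.constants)]
    for name, arity in sorted(parser.signature.items()):
        domain = " ".join(["Entity"] * arity)
        lines.append(f"(declare-fun {name} ({domain}) Bool)")
    # A satisfiable negation means a counter-model exists, so the argument is invalid.
    lines += [f"(assert (not {prefix}))", "(check-sat)"]
    return prefix, "\n".join(lines)


prefix, script = to_smtlib("(Smokes(alice) & Coughs(alice)) -> Ill(alice)")
print(prefix)  # (=> (and (Smokes alice) (Coughs alice)) (Ill alice))
print(script)
```

The printed script could be passed to any SMT-LIB-compliant solver. A `sat` answer would indicate that the premises do not entail the conclusion; a `(get-model)` command could then be appended to retrieve the counter-model.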
The neurosymbolic pipeline demonstrates superior generalization over end-to-end LLM classification on challenge sets.
| Dataset | NL2FOL (GPT-4o) F1 | End-to-End LLM F1 |
|---|---|---|
| LOGIC | 78% | 96% |
| LOGIC-CLIMATE | 80% | 58% |
Note: The high F1-score for end-to-end LLMs on the LOGIC dataset suggests potential training leakage, as the dataset is compiled from public web sources.
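For readers comparing the figures above, F1 is the harmonic mean of precision and recall. The short sketch below restates that formula and computes how each approach's F1 shifts between LOGIC and LOGIC-CLIMATE, using only the values in the table; the variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 percentages copied from the table above.
f1_percent = {
    "NL2FOL (GPT-4o)": {"LOGIC": 78, "LOGIC-CLIMATE": 80},
    "End-to-End LLM": {"LOGIC": 96, "LOGIC-CLIMATE": 58},
}

# Change in F1 (percentage points) when moving to the out-of-domain challenge set.
shift = {
    model: scores["LOGIC-CLIMATE"] - scores["LOGIC"]
    for model, scores in f1_percent.items()
}

for model, delta in shift.items():
    print(f"{model}: {delta:+d} points")
```

On these numbers the neurosymbolic pipeline gains 2 points out of domain, while the end-to-end classifier loses 38.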
Primary failure occurs in the Background Knowledge Retriever, where implicit context is missed or added incorrectly.
Difficulty in identifying concurrent relationships where multiple properties entail a third.
False translations or errors in the Claim and Implication Parser, such as failing to identify when a sentence contains no claim.
The LOGIC dataset contains 2,449 examples of common logical fallacies across 13 categories, including Ad Hominem, False Causality, False Dilemma, Faulty Generalization, and Ad Populum. The LOGIC-CLIMATE challenge set (1,079 examples) tests out-of-domain generalization using climate news metadata.
Provides "Valid" (non-fallacious) benchmarks. The Explorer utilizes the entailment class (approx. 170,000 pairs). Valid benchmarks are constructed by combining the premise and hypothesis using transitions (e.g., "Premise. Consequently, Hypothesis").
A specialized dataset consisting of news comments containing common logical fallacies, serving as a primary source for the Taxonomy Explorer's training on informal discourse.
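The "Valid" benchmark construction described above (joining an entailment pair with a transition) can be sketched as follows. The helper name `build_valid_example`, its punctuation handling, and the sample sentence pair are assumptions made for illustration; only the "Premise. Consequently, Hypothesis" template comes from the description.

```python
def build_valid_example(premise, hypothesis, transition="Consequently"):
    """Join an entailment pair into a single non-fallacious argument string."""
    premise = premise.strip().rstrip(".")
    hypothesis = hypothesis.strip().rstrip(".")
    return f"{premise}. {transition}, {hypothesis}."


# Invented premise/hypothesis pair, not taken from any dataset.
example = build_valid_example(
    "A man is playing a guitar on stage.",
    "a person is making music",
)
print(example)
# A man is playing a guitar on stage. Consequently, a person is making music.
```

Other transitions can be supplied through the `transition` argument, for instance `"Therefore"`.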
Automatically calculates various readability formulas, including Flesch-Kincaid and CAREC.
Analyzes texts using large custom dictionaries, supporting n-grams and wildcards.
Includes 700+ indices for lexical sophistication and source-to-summary text overlap.
Provides incidence counts for structural and mechanics errors (grammar, spelling, punctuation).
Offers 254 indices for sentiment analysis, with filters for parts of speech and negation.
A simple tool for basic word/sentence counts, TTR, and custom dictionary analysis.
Calculates 150 indices of local and global cohesion, including type-token ratios.
Annotates lexical features related to decoding, including phoneme and syllable counts.
Calculates lexical diversity indices using lemma forms and POS disambiguation.
Measures 400+ indices of lexical sophistication for single words and n-grams.
Analyzes syntactic complexity, focusing on clausal and phrasal complexity.
Measures morphological variety and complexity based on the MorphoLex database.
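Several of the tools above report type-token ratio (TTR), the number of distinct word forms divided by the total number of words. A minimal sketch of that calculation follows. It assumes a naive lowercase, letters-and-apostrophes tokenizer and does not reproduce how any of the listed tools tokenize or lemmatize text.

```python
import re


def type_token_ratio(text):
    """Return distinct word forms divided by total word count (0.0 for empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


sample = "The cat saw the other cat."
ttr = type_token_ratio(sample)
# 4 types (the, cat, saw, other) over 6 tokens
print(round(ttr, 3))
```

Because TTR falls as texts get longer, values are only comparable between texts of similar length.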