Critical Evaluation Toolkit

Argomentazione e Pensiero Critico

6 TOOL

Devil's Advocate (Expert Mode)

Core

Non accontentarti di una critica generica. Chiedi all'IA di agire come l'oppositore più esperto possibile.

Act as a devil's advocate for every opinion I express. Do not confirm them; instead, provide the two strongest arguments an industry expert would use to dismantle my thesis.

Analisi Neutrale (Pre-emptive)

Advanced

Dichiara esplicitamente il tuo bias per forzare l'IA a ignorarlo e basarsi solo su dati oggettivi.

I am about to ask you a question on a topic where I have a strong opinion. I need a neutral analysis: ignore any implicit bias in my statement and respond based only on objective data, as if you did not know who I am or what I think.

Steel-man (Opposite Vision)

Core

Invece di demolire la tua tesi, chiedi all'IA di fortificare al massimo la visione opposta.

Do not agree with me. Instead, practice "Steel-manning" on the opposite vision of mine. Build the strongest, most logical, and convincing version of the counterparty and present it to me.

Metodo Socratico

Core

Usa domande per far emergere lacune logiche e definizioni ambigue in concetti confusi.

Socratic method: don't give answers. Ask probing questions in 5 blocks: definitions, evidence, alternatives, implications, success criteria. Close with 3 testable hypotheses.

First Principles

Advanced

Scomponi tutto in verità fondamentali e ricostruisci ignorando le convenzioni.

First Principles: list all current assumptions → decompose into fundamental truths → flag questionable ones → rebuild solution from scratch ignoring convention. Show trade-offs.

5 Perché

Advanced

Scava fino alla causa radice di un problema ricorrente invece di curare solo i sintomi.

5 Whys: build a why-chain (branch if multiple causes). Stop when root cause is clear. Deliver: root cause, type (people/process/structure/cognition), minimum viable change.

Rischio e Resilienza

3 TOOL

Pre-mortem

Core

Immagina che il progetto sia fallito tra 6 mesi: identifica le cause e previenile ora.

Pre-mortem: it's 6 months from now and this project has failed. List 10 plausible causes. For each: early warning signs, probability (H/M/L), impact (H/M/L), preventive countermeasure.

Inversione

Core

Definisci come garantire un fallimento totale, poi verifica se stai già facendo quelle cose.

Inversion: list 10 actions that would guarantee this project's total failure. Be specific. Then check: are we already doing any of these, even in mild form?

Scenario Planning

Advanced

Simula futuri diversi (best case, worst case, wild card) per testare la tenuta del progetto.

Scenario Planning: define 3 external variables that could shift. Simulate 3 futures: best case, worst case, wild card. For each: what happens to this project, what to adapt, decision point.

Qualità e Affidabilità Output AI

7 TOOL

Anti-Adulazione (Protocollo Critico)

Core

Azzera la tendenza dell'IA a darti ragione. Forza l'analisi su ipotesi, prove e accuratezza bruta.

From now on, reset your tendency towards sycophancy. Rules: 1. Treat every statement of mine as a hypothesis to be verified; 2. Verify and ask for evidence; 3. If you agree, explain why with counter-arguments; 4. No compliments; 5. Accuracy > Politeness.

Rubrica di Valutazione

Core

Trasforma un giudizio soggettivo in uno standard ripetibile tramite criteri espliciti.

Rubric scoring: rate this output 1–5 on Accuracy, Relevance, Clarity, Completeness, Usability, Tone, Ethics. Each: score + 2 evidence quotes. Then: 3 changes to raise average by +1.

Quality Gate

Core

Filtro passa/non-passa finale prima dell'utilizzo o della pubblicazione di un output.

Quality Gate: check this — Accuracy, Sources, Bias, Completeness, Internal consistency, Uncertainty declared, Ready to use. Each: ✅/⚠️/❌ + reason. What to fix before use.

Protocollo Incertezza

Core

Combatti la "certezza finta" separando dati certi, inferenze e zone d'ombra.

Uncertainty protocol: split your previous response into (A) supported by data, (B) inferences, (C) unknown. Each inference: confidence H/M/L + what's needed to raise it. Close with conditional recommendation.

Meta-review

Advanced

Forza l'AI ad auto-criticare la propria risposta precedente cercandone la superficialità.

Meta-review: critique your previous response. Where were you generic? Where did you avoid uncomfortable aspects? What perspectives are missing? Which claims lack evidence? Rewrite only the critical parts.

Triangolazione

Advanced

Estrai affermazioni fattuali e definisci esattamente come verificarle tramite fonti primarie.

Triangulation: extract all factual claims. Classify: fact / estimate / opinion. For facts and estimates: how to verify (primary source, data needed, test). Mark unverifiable claims ❌.

Test Cases

Advanced

Definisci scenari d'uso (normali, limite, avversari) per testare l'affidabilità di un prompt.

Test Cases: define 8 tests — 4 normal, 2 edge cases, 2 adversarial. Each: input, expected output, pass/fail criteria. Summarize risk coverage.

Prospettiva e Decisione

7 TOOL

Dibattito & Giudice

Core

Simula un dibattito tra una parte che difende e una che attacca la tesi, con un giudice che assegna la vittoria.

Simulate a debate on my position. Party A defends the position. Party B attacks it fiercely. A final judge decides who wins, why, and what is missing to decide better.

Sei Cappelli

Core

Analisi a 360° attraverso lenti emotive, fattuali, creative e critiche. (De Bono)

Six Thinking Hats: analyze through 6 lenses — White (data), Red (gut feeling), Black (risks), Yellow (benefits), Green (alternatives), Blue (process). Max 6 lines each. 3 takeaways + provisional decision.

Multi-Stakeholder

Core

Valuta l'impatto del progetto dal punto di vista di utenti, investitori e competitor.

Multi-Stakeholder: evaluate as 5 roles — end user, skeptical investor, aggressive competitor, critical journalist, regulator. Each: 3 critiques + 1 demand for proof.

SWOT Potenziata

Advanced

Analisi strategica basata sull'incrocio dei quadranti per generare mosse concrete.

Enhanced SWOT: 3 specific points per quadrant with evidence. Threats: simulate aggressive competitor. Cross: S×T, S×O, W×O, W×T. Close with 3 strategic moves.

Regola 10/10/10

Advanced

Analizza le conseguenze di una scelta su tre orizzonti temporali: 10 minuti, 10 mesi, 10 anni.

10/10/10 Rule: analyze at 3 horizons — 10 minutes (emotional), 10 months (results), 10 years (legacy). Close with regret risk + conditional recommendation.

Analisi Comparativa

Advanced

Scegli tra più opzioni definendo criteri pesati e calcolando i totali ponderati.

Comparative Analysis. Define 5 criteria, weight them (100%), score each 1–5 per criterion, compute weighted totals. Highlight decisive trade-offs. Recommend.

Decision Record

Advanced

Documenta il razionale di una scelta per mantenere traccia storica dei motivi e dei rischi accettati.

Decision Record (ADR): document — context, options considered (min 3), selection criteria, chosen option + rationale, rejected options + reasons, accepted risks, mitigation, review date.

Sicurezza e Adversarial Testing

3 TOOL

Threat Modeling

Advanced

Mappa asset, attori e superfici d'attacco seguendo le categorie OWASP LLM.

Threat Modeling: define assets, actors, entry points, threats, impact. Align to OWASP LLM categories (prompt injection, data leakage, excessive agency). Propose controls + tests.

Simulazione Injection

Advanced

Testa la robustezza di un system prompt contro tentativi di jailbreak e ingegneria sociale.

Prompt-Injection Sim: generate 10 realistic injection attempts (social engineering, override, exfiltration, system prompt leakage). Each: attack goal, risk, mitigation. Top 3 priority controls.

Red Team

Advanced

Simula un attaccante che vuole far fallire il sistema per identificare vulnerabilità critiche.

Red Team report: vulnerabilities, attack scenarios, severity (C/H/M/L), probability, impact, mitigation, regression tests. Deliver as structured security report with prioritized backlog.

Cosa devi fare?