Data Limitations & Methodology
We believe in transparency. Here’s what our research CAN and CANNOT tell you, explained in plain language.
🎯 Our Promise: Honest About What We Don’t Know
Many researchers hide their limitations in footnotes. We put ours front and center.
Why? Because YOU deserve to make informed decisions about your workplace injury appeal.
✅ What We DID: The Good Stuff
1. Extracted 230,392 Real Records
Source: Public data from Ontario government tribunals and WSIB
What we got:
- ✅ 98,992 WSIAT decisions (1987-2026) - every single publicly available decision
- ✅ 91,814 NEER employers (2017-2020) - large employers in safety program
- ✅ 38,922 CAD-7 employers (2017-2020) - small employers in safety program
- ✅ 664 Premium Rate Groups - industry classifications with rates
- ✅ 62,093 HRTO decisions (2016-2025) - quarterly aggregate counts only
Quality: These numbers are REAL, not estimated.
2. Counted Injury Patterns
Method: Computer searched all 98,992 WSIAT decisions for injury keywords
Found: 10 injury types with exact counts
- Back/Spine: 15,177 cases (15.3%)
- Hearing Loss: 9,650 cases (9.7%)
- Chronic Pain: 7,502 cases (7.6%)
- [Full list in main guide]
Quality: ✅ Complete - we checked every decision
3. Created Real Templates from Winning Cases
Source: 264 templates based on actual WSIAT decisions with “Allowed” outcomes
What’s in them:
- Real medical evidence that won
- Real arguments that worked
- Real case citations
- Real tribunal reasoning
Quality: ✅ Verified - every template traces to a real CanLII decision
⚠️ What We DIDN’T DO: The Limitations
1. Success Rates: The 6.1% Problem
What we tried: Find outcome (allowed/denied) for all 98,992 decisions
Method: Computer searched for keywords: “allowed”, “denied”, “dismissed”, “partially allowed”
Result: Only 6,040 out of 98,992 decisions (6.1%) had clear outcome keywords
What this means:
- ✅ We KNOW: 726 allowed, 5,314 denied, from those 6,040 cases
- ✅ We can calculate: 12.0% success rate (726 ÷ 6,040)
- ❌ We DON’T KNOW: Outcomes for the other 92,952 decisions (93.9%)
Why Is This Happening?
WSIAT writes decisions using legal language, not keywords.
Examples of language we CAN’T detect:
- “The panel finds in favor of the worker”
- “Entitlement is established”
- “The appeal succeeds”
- “The Board’s decision is set aside”
- “The worker has met the burden of proof”
These all mean “allowed” - but our keyword search misses them.
What Do Others Say?
Independent advocacy groups report: 60-70% success rate for represented appellants
Why the difference?
- They track their own clients (selection bias - they take strong cases)
- They provide representation (which improves outcomes)
- They have full case files (not just public summaries)
What Should You Believe?
The honest answer: We don’t know the exact success rate.
What we DO know:
- Appeals work better than not appealing (0% if you don’t try)
- Representation helps (advocacy groups get 60-70%)
- Medical evidence is crucial (strongest predictor in our data)
- 98.25% of denied workers never appeal (the real problem)
Our recommendation: Focus on building the strongest case possible, not worrying about statistics.
2. Industry × Injury Correlations: Not Done Yet
What we wanted: Link injury types to specific industries (e.g., “Construction workers have X% back injuries”)
What we have:
- ✅ Injury patterns across ALL industries combined
- ✅ Employer counts by city
- ✅ 664 WSIB industry rate groups
- ❌ NOT linked together yet
Why not?
- WSIAT decisions mention injuries but rarely specify employer’s industry code
- Need full-text analysis to extract industry mentions from decision narratives
- Planned for Phase 2 (needs AI/NLP analysis)
What this means for you:
- You can see BACK INJURIES are 15.3% of all appeals
- You CANNOT yet see “Construction back injuries vs Healthcare back injuries”
- Templates don’t yet say “In your industry, the success rate is…”
Workaround: Use the general injury patterns + your knowledge of your industry
3. Vice-Chair Patterns: Not Analyzed
What we wanted: Track which WSIAT panel members allow more/fewer appeals
What we have:
- ✅ 98,992 decisions
- ❌ No extraction of vice-chair names
- ❌ No tracking of panel member patterns
Why not?
- Complex to extract from free-text decisions
- Ethically sensitive (could enable “judge shopping”)
- Planned for Phase 2 with ethical disclosure
What this means for you:
- You CANNOT pick a “favorable” vice-chair
- You CANNOT avoid a “harsh” vice-chair
- Your case will be assigned randomly (as it should be)
Our position: WSIAT decisions should be consistent regardless of panel composition. If they’re not, that’s a system problem, not a strategy opportunity.
4. Individual Case Prediction: IMPOSSIBLE
What we CANNOT do: Predict YOUR specific case outcome
Why not?
- Every case has unique facts
- Your medical evidence is different
- Your treatment history is different
- Your employer’s response is different
- Panel composition varies
- Tribunal interpretation evolves
Warning: Anyone who promises “90% success rate” based on statistics alone is misleading you.
What we CAN do:
- Show you patterns (back injuries are common)
- Give you templates (what worked in similar cases)
- Explain the process (how WSIAT works)
- Provide context (appeals work, most people don’t try)
Your outcome depends on YOUR evidence, not our statistics.
📊 Data Quality Badges: What They Mean
Throughout the site, you’ll see these badges:
✅ Complete
- We extracted 100% of available data
- Numbers are verified and accurate
- Example: “98,992 WSIAT decisions analyzed”
⚠️ Limited
- We extracted partial data with known gaps
- Coverage is incomplete
- Example: “6.1% of decisions have detectable outcomes”
📊 Calculated
- We derived this from other data
- Math is correct, but based on limited inputs
- Example: “12.0% success rate (from 6.1% coverage)”
🔄 Updating
- Data collection is ongoing
- May change as we add sources
- Example: “Employer safety records (2017-2020, new data pending)”
💡 Estimated
- We made an informed guess based on patterns
- Not directly measured
- Example: “ONSBT success rate 40-60% (limited public data)”
🔗 External
- Data from another source
- We did not extract this ourselves
- Example: “WSIB registered claims from Safety Check Portal”
🔍 Methodology: How We Did It
Extraction Process
Step 1: Data Collection (January-April 2026)
- Downloaded all publicly available WSIAT decisions from CanLII
- Downloaded WSIB NEER/CAD-7 employer lists (CSV format)
- Downloaded HRTO quarterly statistical reports (39 Excel files)
- Downloaded WSIB Premium Rate Schedule (PDF)
Step 2: Parsing (April 2026)
- Used ExcelJS library to parse HRTO Excel files
- Custom CSV parser for NEER/CAD-7 files (handles multi-line quoted fields)
- Dynamic header detection (handles metadata rows)
- Extracted 230,392 total records in 23.4 seconds
Step 3: Keyword Matching (April 2026)
- Searched WSIAT decision text for injury keywords:
- “back”, “spine”, “lumbar”, “disc” → Back/Spine Injuries
- “hearing”, “deaf”, “tinnitus” → Hearing Loss
- “chronic pain”, “CRPS”, “fibromyalgia” → Chronic Pain
- [Full keyword list in source code]
- Found 39,556 cases with injury keywords (39.9% coverage)
- Outcome keywords: “allowed”, “denied”, “dismissed”, “partial”
- Found 6,040 cases with outcome keywords (6.1% coverage)
Step 4: Aggregation (April 2026)
- Grouped by year (2016-2025)
- Grouped by city (top 15 Ontario cities)
- Grouped by injury type (top 10 types)
- Generated
aggregated-statistics.jsonwith summary stats
Step 5: Visualization (April 2026)
- Created 5 D3.js interactive charts
- Added data quality warnings to each chart
- Linked to source data and methodology
Quality Control
What we checked:
- ✅ No duplicate records
- ✅ All dates parsed correctly
- ✅ All numbers sum correctly
- ✅ All source links work
- ✅ All statistics cite sources
What we couldn’t check:
- ❌ Accuracy of government source data (we trust WSIAT/WSIB published records)
- ❌ Completeness of CanLII database (assumes all public decisions are available)
- ❌ Keyword matching accuracy (assumes our keywords catch all mentions)
🌐 Accessibility: Research for Everyone
Plain Language
We wrote this research for injured workers, not academics.
Rules we followed:
- Short sentences (under 20 words when possible)
- Common words (not jargon)
- Active voice (“We analyzed” not “It was analyzed”)
- Explanations before statistics
- Tables for comparison
- Bullet points for lists
If something is confusing, tell us - we’ll fix it.
Screen Reader Support
All visualizations have:
alttext describing the chart- Data tables as fallback
- ARIA labels for interactive elements
- Keyboard navigation
All guides have:
- Logical heading hierarchy (H1 → H2 → H3)
- Link text that makes sense out of context
- No “click here” links
- Tables with
<th>headers
High Contrast Mode
All text meets WCAG AAA standards:
- 7:1 contrast ratio for normal text
- 4.5:1 contrast ratio for large text
- Works in Windows High Contrast Mode
- Works in browser dark mode
Multiple Formats
Data available in:
- Interactive visualizations (D3.js charts)
- Plain text guides (Markdown)
- JSON data files (for researchers)
- PDF downloads (coming soon)
📅 Phase 2: What’s Coming
Planned Improvements (2026-2027)
1. Full-Text NLP Classification
- Use AI (GPT-4/Claude) to read all 98,992 decisions
- Extract outcomes from legal language
- Get real success rates (not just 6.1% coverage)
- Impact: Most important improvement
2. Industry × Injury Correlations
- Link WSIAT decisions to WSIB industry codes
- Show injury patterns by sector (Construction, Healthcare, Manufacturing)
- Create industry-specific appeal guides
- Impact: Helps workers compare to their industry peers
3. Temporal Policy Analysis
- Track how WSIAT interpretation changes over time
- Identify policy shifts and landmark decisions
- Show if getting easier/harder to win appeals
- Impact: Helps advocates lobby for policy reform
4. Vice-Chair Patterns (with ethical disclosure)
- Extract panel member names
- Track decision patterns
- Analyze consistency across panels
- Impact: Shows if WSIAT decisions are fair and consistent
What Won’t Change
We will NEVER:
- ❌ Hide limitations to make data look better
- ❌ Claim higher accuracy than we have
- ❌ Predict individual case outcomes
- ❌ Charge for access to basic research
- ❌ Remove data that doesn’t fit our narrative
Transparency is non-negotiable.
🤝 How You Can Help
Report Errors
Found a mistake? Tell us
We promise to:
- Fix it within 48 hours
- Post a correction
- Thank you in acknowledgments (if you want)
Share Your Case
Won your appeal? Share what worked (anonymously)
We’ll:
- Add it to our templates
- Update success rate data
- Help the next injured worker
Suggest Improvements
What data do you need? Request it
Common requests we’re working on:
- More injury types (we have top 10, but there are 50+)
- Industry breakdowns (Phase 2)
- Regional patterns (Phase 2)
- Success rates by representation type (need better outcome detection first)
📚 For Researchers: Technical Details
Dataset Specifications
WSIAT Decisions Dataset
- Size: 98,992 decisions
- Date Range: 1987-2026 (39 years)
- Format: JSON (structured)
- Source: CanLII API
- Fields: decision_id, date, summary, keywords, citation, url
- Limitations: No full decision text (only summaries)
NEER Employer Dataset
- Size: 91,814 employers
- Date Range: 2017-2020 (4 years)
- Format: CSV (WSIB export)
- Fields: firm_number, city, postal_code, rate_group, rebate, surcharge
- Limitations: Missing monetary amounts (field parsing issue)
CAD-7 Employer Dataset
- Size: 38,922 employers
- Date Range: 2017-2020 (4 years)
- Format: CSV (WSIB export)
- Fields: firm_number, city, postal_code, rate_group, adjustment
- Limitations: Fewer fields than NEER
Aggregated Statistics
- Location:
data/comprehensive-extraction/aggregated-statistics.json - Size: ~50KB
- Structure: Hierarchical JSON with yearly, injury, and employer breakdowns
- License: CC BY 4.0 (attribution required)
Replication Instructions
To reproduce our analysis:
- Clone repository:
git clone https://github.com/S0vryn9-C011ect1ve/3mpwrapp.github.io.git - Install dependencies:
npm install - Run extraction:
node scripts/extract-ultra-comprehensive.mjs - Run aggregation:
node scripts/aggregate-real-data.mjs - View results:
data/comprehensive-extraction/aggregated-statistics.json
Computing requirements:
- Node.js 20+
- 16GB RAM
- 10GB disk space
- ~30 minutes runtime
Citation
If you use this data:
3mpwrApp Research Team. (2026). Comprehensive Analysis of 98,992 WSIAT Workplace Injury Appeal Decisions (1987-2026). Retrieved from https://3mpwrapp.pages.dev/data-limitations/
BibTeX:
@misc{3mpwrapp2026wsiat,
author = ,
title = {Comprehensive Analysis of 98,992 WSIAT Workplace Injury Appeal Decisions (1987-2026)},
year = {2026},
url = {https://3mpwrapp.pages.dev/data-limitations/},
note = {Dataset includes 230,392 records from WSIAT, HRTO, NEER, and CAD-7 programs}
}
✉️ Contact
Questions about methodology: feedback@3mpwrapp.ca
Found an error: Report it
Need raw data: Available on request for academic/advocacy use
Media inquiries: Include “MEDIA” in subject line
| *Last updated: April 30, 2026 | Next update: Fall 2026 (Phase 2 NLP analysis)* |