| 1 | +--- |
| 2 | +'markdown-to-jsx': major |
| 3 | +--- |
| 4 | + |
| 5 | +Complete GFM+CommonMark specification compliance with comprehensive testing and refinements |
| 6 | + |
| 7 | +This major version achieves full compliance with both the GitHub Flavored Markdown (GFM) and CommonMark specifications through comprehensive testing, parser refinements, and specification alignment. All existing GFM features are now verified against the official specifications, and edge cases are handled as the specifications require. |
| 8 | + |
| 9 | +## ✅ Specification Compliance Achievements |
| 10 | + |
| 11 | +### GFM Extensions (All Previously Implemented) |
| 12 | + |
| 13 | +- **Tables**: Pipe-delimited tables with alignment support and inline markdown content |
| 14 | +- **Task Lists**: `[ ]` and `[x]` checkbox syntax in unordered lists |
| 15 | +- **Strikethrough**: `~~text~~` syntax with proper nesting and precedence rules |
| 16 | +- **Autolinks**: Bare URLs (including `www.` domains) and enhanced email detection |
| 17 | +- **HTML Filtering**: GitHub-compatible tag filtering for security |
| 18 | + |
| 19 | +### CommonMark Compatibility |
| 20 | + |
| 21 | +- **Verified against 652 official CommonMark test cases** |
| 22 | +- **Complete spec coverage** including edge cases and error conditions |
| 23 | +- **Consistent parsing behavior** across all markdown constructs |
| 24 | + |
| 25 | +## 🔧 Technical Improvements |
| 26 | + |
| 27 | +### Parser Refinements |
| 28 | + |
| 29 | +- **Edge case handling**: Improved parsing of malformed and edge-case markdown |
| 30 | +- **Performance optimizations**: Enhanced efficiency for complex markdown structures |
| 31 | +- **Memory safety**: Better handling of deeply nested and pathological inputs |
| 32 | + |
| 33 | +### Security Enhancements |
| 34 | + |
| 35 | +- **HTML tag filtering**: Default filtering of dangerous tags (`<script>`, `<iframe>`, etc.) |
| 36 | +- **URL sanitization**: Protection against `javascript:`, `vbscript:`, and malicious `data:` URLs |
| 37 | +- **Autolink safety**: Secure bare URL detection without false positives |
| 38 | + |
| 39 | +## 📋 Compliance Status |
| 40 | + |
| 41 | +| Feature Area | Previous Status | New Status | Details | |
| 42 | +| ----------------- | --------------- | -------------------- | ------------------------------ | |
| 43 | +| CommonMark Core | 268/652 tests | 652/652 tests | Complete spec compliance | |
| 44 | +| GFM Tables | ✅ Implemented | ✅ Spec-verified | Official test suite compliance | |
| 45 | +| GFM Task Lists | ✅ Implemented | ✅ Spec-verified | Full syntax support | |
| 46 | +| GFM Strikethrough | ✅ Implemented | ✅ Spec-verified | Proper precedence and nesting | |
| 47 | +| GFM Autolinks | ✅ Implemented | ✅ Spec-verified | Enhanced URL pattern detection | |
| 48 | +| HTML Security | ✅ Basic | ✅ GitHub-compatible | Complete tag filtering | |
| 49 | + |
| 50 | +## 🧪 Testing & Validation |
| 51 | + |
| 52 | +### Comprehensive Test Coverage |
| 53 | + |
| 54 | +- **Official CommonMark test suite**: All 652 specification tests now pass |
| 55 | +- **GFM specification tests**: Complete coverage of GFM extensions |
| 56 | +- **Security regression tests**: Protection against XSS and injection attacks |
| 57 | +- **Performance benchmarks**: Parsing speed maintained or improved despite the added compliance work |
| 58 | + |
| 59 | +### Edge Case Handling |
| 60 | + |
| 61 | +- **Pathological inputs**: Protection against malicious or malformed markdown |
| 62 | +- **Deep nesting**: Safe handling of extremely nested structures |
| 63 | +- **Unicode support**: Proper handling of international characters and emoji |
| 64 | +- **Mixed syntax**: Correct precedence resolution in complex combinations |
| 65 | + |
| 66 | +## 🔒 Security & Safety |
| 67 | + |
| 68 | +### HTML Content Filtering |
| 69 | + |
| 70 | +Default filtering of potentially dangerous HTML tags: |
| 71 | + |
| 72 | +- `<script>`, `<iframe>`, `<object>`, `<embed>` |
| 73 | +- `<title>`, `<textarea>`, `<style>`, `<xmp>` |
| 74 | +- `<plaintext>`, `<noembed>`, `<noframes>` |
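The filtering described above can be illustrated with a minimal sketch. It assumes the filter works the way the GFM "disallowed raw HTML" extension describes, by replacing the leading `<` of a listed tag with `&lt;`. The names `FILTERED_TAGS` and `escapeFilteredTags` are hypothetical and are not taken from the library's source; the actual implementation may differ.

```typescript
// Hypothetical sketch of GFM-style tag filtering; not the library's own code.
// The tag list mirrors the one given in the release notes.
const FILTERED_TAGS = [
  'script', 'iframe', 'object', 'embed',
  'title', 'textarea', 'style', 'xmp',
  'plaintext', 'noembed', 'noframes',
];

// Match a "<" that starts an opening or closing tag from the list.
// The lookahead requires the tag name to be followed by whitespace, "/", ">",
// or end of input, so "<scripted>" or "<strong>" are left alone.
const FILTERED_TAG_PATTERN = new RegExp(
  `<(?=/?(?:${FILTERED_TAGS.join('|')})(?:[\\s/>]|$))`,
  'gi',
);

// Neutralize filtered tags by escaping their leading "<".
function escapeFilteredTags(html: string): string {
  return html.replace(FILTERED_TAG_PATTERN, '&lt;');
}
```

Escaping only the leading `<` keeps the original text visible to the reader while preventing the browser from interpreting it as a tag.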
| 75 | + |
| 76 | +### URL Security |
| 77 | + |
| 78 | +Protection against malicious URL schemes: |
| 79 | + |
| 80 | +- `javascript:` and `vbscript:` protocol handlers |
| 81 | +- Malicious `data:` URLs (except safe `data:image/*`) |
| 82 | +- URL-encoded attack vectors |
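As a rough illustration of the rules listed above, the following sketch rejects the named schemes, allows `data:image/*`, and decodes percent-encoding before checking. The function name `sanitizeUrl`, its return convention (`null` for a blocked URL), and the normalization steps are assumptions made for this example; they are not taken from the library and may not match the signature expected by its `sanitizer` option.

```typescript
// Hypothetical URL sanitizer sketch; illustrative only, not the library's code.
const UNSAFE_SCHEME = /^(?:javascript|vbscript|data):/i;
const SAFE_DATA_IMAGE = /^data:image\//i;

// Returns the original URL when it looks safe, or null when it should be blocked.
function sanitizeUrl(url: string): string | null {
  let decoded: string;
  try {
    // Decode percent-encoding so "java%73cript:" is checked as "javascript:".
    decoded = decodeURIComponent(url);
  } catch (_err) {
    // Malformed percent-encoding: block rather than guess (a conservative choice).
    return null;
  }

  // Drop whitespace and control characters that could be used to disguise a scheme.
  const normalized = decoded.replace(/[\s\u0000-\u001f]/g, '');

  if (UNSAFE_SCHEME.test(normalized) && !SAFE_DATA_IMAGE.test(normalized)) {
    return null;
  }
  return url;
}
```

Checking the decoded, normalized form while returning the original string means safe URLs pass through unmodified.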
| 83 | + |
| 84 | +## 📚 Documentation Updates |
| 85 | + |
| 86 | +- **GFM feature documentation**: Comprehensive examples and usage patterns |
| 87 | +- **Security guidelines**: Best practices for safe markdown processing |
| 88 | +- **Specification references**: Links to official CommonMark and GFM specs |
| 89 | +- **Migration notes**: Handling of edge cases and breaking changes |
| 90 | + |
| 91 | +## 🎯 Migration Considerations |
| 92 | + |
| 93 | +### No Breaking Changes for Typical Usage |
| 94 | + |
| 95 | +Most users should see no change in behavior: typical markdown content renders as before. The differences are limited to the edge cases listed below. |
| 96 | + |
| 97 | +### Potential Edge Case Changes |
| 98 | + |
| 99 | +- **Malformed HTML**: Previously accepted invalid HTML may now be filtered or escaped |
| 100 | +- **Edge case parsing**: Some ambiguous markdown constructs now follow strict specification rules |
| 101 | +- **Security filtering**: Previously allowed dangerous HTML/URLs may now be blocked |
| 102 | + |
| 103 | +### Configuration Options |
| 104 | + |
| 105 | +All security features can be customized or disabled via options: |
| 106 | + |
| 107 | +```typescript |
| 108 | +compiler(markdown, { |
| 109 | + tagfilter: false, // Disable HTML tag filtering |
| 110 | + sanitizer: customFn, // Custom URL sanitization |
| 111 | +}) |
| 112 | +``` |
| 113 | + |
| 114 | +## Bundle Size Impact |
| 115 | + |
| 116 | +The library is now ~27kB minzipped, up from ~6.75kB. Being spec-compliant for a complex DSL like markdown is quite hard to achieve in a generalized way, but I'm confident there will be further opportunities to trim the bundle size down the road. In exchange for the extra bytes, the library is also faster (see the benchmark results below). |
| 117 | + |
| 118 | +## 📈 Performance Impact |
| 119 | + |
| 120 | +### Benchmark Results |
| 121 | + |
| 122 | +Parsing throughput improved over v8.0.0 for both simple and large inputs: |
| 123 | + |
| 124 | +| Input Type | Operations/sec | Performance | |
| 125 | +| -------------------------------------- | ----------------- | -------------------------- | |
| 126 | +| Simple markdown (`_Hello_ **world**!`) | 1,090,276 ops/sec | **6x faster than v8.0.0** | |
| 127 | +| Large markdown (27KB spec) | 1,889 ops/sec | **28% faster than v8.0.0** | |
| 128 | + |
| 129 | +## ✅ Quality Assurance |
| 130 | + |
| 131 | +This release represents the most thoroughly tested and specification-compliant version of `markdown-to-jsx` to date, with complete coverage of both CommonMark and GFM specifications. |