Add recipe on remark and HTML · NodeGo/unifiedjs.github.io@d68f26c · GitHub

Commit d68f26c

Add recipe on remark and HTML
1 parent dc12a67 commit d68f26c

4 files changed: +180 -2 lines changed

dictionary.txt

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ stdout
2020
stringifier
2121
stringify
2222
syntaxes
23+
XSS
2324

2425
// Names, products, etc.
2526
BundlePhobia

doc/learn/remark-html.md

Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
1+
---
2+
group: recipe
3+
index: 6
4+
title: HTML and remark
5+
description: How to use remark to turn markdown into HTML, and to allow embedded HTML inside markdown
6+
tags:
7+
- remark
8+
- html
9+
- plugin
10+
- markdown
11+
- html
12+
- parse
13+
author: Titus Wormer
14+
authorTwitter: wooorm
15+
authorGithub: wooorm
16+
published: 2021-03-09
17+
modified: 2021-03-09
18+
---
19+
20+
## HTML and remark
21+
22+
remark is a markdown compiler.
23+
It’s concerned with HTML in two ways:
24+
25+
1. markdown is often turned into HTML
26+
2. markdown sometimes has embedded HTML
27+
28+
When dealing with HTML and markdown, we will use both remark and rehype.
29+
This article shows some examples of how to do that.
30+
31+
### Contents
32+
33+
* [How to turn markdown into HTML](#how-to-turn-markdown-into-html)
34+
* [How to turn HTML into markdown](#how-to-turn-html-into-markdown)
35+
* [How to allow HTML embedded in markdown](#how-to-allow-html-embedded-in-markdown)
36+
* [How to properly support HTML inside markdown](#how-to-properly-support-html-inside-markdown)
37+
38+
### How to turn markdown into HTML
39+
40+
remark handles markdown: it can parse and serialize it.
41+
But it’s **not** for HTML.
42+
That’s the job of rehype, which exists to parse and serialize HTML.
43+
44+
To turn markdown into HTML, we need [`remark-parse`][remark-parse],
45+
[`remark-rehype`][remark-rehype], and [`rehype-stringify`][rehype-stringify]:
46+
47+
```javascript
48+
var unified = require('unified')
49+
var remarkParse = require('remark-parse')
50+
var remarkRehype = require('remark-rehype')
51+
var rehypeStringify = require('rehype-stringify')
52+
53+
unified()
54+
.use(remarkParse) // Parse markdown content to a syntax tree
55+
.use(remarkRehype) // Turn markdown syntax tree to HTML syntax tree, ignoring embedded HTML
56+
.use(rehypeStringify) // Serialize HTML syntax tree
57+
.process('*emphasis* and **strong**')
58+
.then((file) => console.log(String(file)))
59+
.catch((error) => {
60+
throw error
61+
})
62+
```
63+
64+
This turns `*emphasis* and **strong**` into
65+
`<em>emphasis</em> and <strong>strong</strong>`, but it does not support HTML
66+
embedded inside markdown (such as `*emphasis* and <strong>strong</strong>`).
67+
68+
This solution **is safe**: content you don’t trust cannot cause an XSS
69+
vulnerability.
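
Why the pipeline above is safe can be sketched without any dependencies. The block below is a simplified illustration, not the actual `remark-rehype` implementation: the tree is written by hand, and `markdownTree`, `toHtmlNode`, and `tagNames` are hypothetical names introduced for this example. It assumes embedded HTML shows up in the markdown syntax tree as `html` nodes, which are dropped unless `allowDangerousHtml` is set.

```javascript
// Simplified illustration only: NOT the actual remark-rehype implementation.
// A hand-written markdown tree for: *emphasis* and <strong>strong</strong>
var markdownTree = {
  type: 'paragraph',
  children: [
    {type: 'emphasis', children: [{type: 'text', value: 'emphasis'}]},
    {type: 'text', value: ' and '},
    {type: 'html', value: '<strong>'},
    {type: 'text', value: 'strong'},
    {type: 'html', value: '</strong>'}
  ]
}

// Hypothetical lookup from markdown node types to HTML tag names.
var tagNames = {paragraph: 'p', emphasis: 'em', strong: 'strong'}

// Hypothetical helper: map a markdown node to an HTML node.
// Returns `null` for nodes that are dropped.
function toHtmlNode(node, options) {
  if (node.type === 'text') return {type: 'text', value: node.value}
  if (node.type === 'html') {
    // Embedded HTML is only kept, as an unparsed `raw` string, when asked for.
    return options.allowDangerousHtml ? {type: 'raw', value: node.value} : null
  }
  return {
    type: 'element',
    tagName: tagNames[node.type],
    children: node.children
      .map(function (child) {
        return toHtmlNode(child, options)
      })
      .filter(Boolean)
  }
}

var safeTree = toHtmlNode(markdownTree, {allowDangerousHtml: false})
var types = safeTree.children.map(function (child) {
  return child.type
})
console.log(types) // The two `html` nodes are gone, only their text survives
```

Because the `html` nodes never reach the HTML syntax tree, nothing an author writes as embedded HTML can end up in the output.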
70+
71+
### How to turn HTML into markdown
72+
73+
We can also do the inverse.
74+
To turn HTML into markdown, we need [`rehype-parse`][rehype-parse],
75+
[`rehype-remark`][rehype-remark], and [`remark-stringify`][remark-stringify]:
76+
77+
```javascript
78+
var unified = require('unified')
79+
var rehypeParse = require('rehype-parse')
80+
var rehypeRemark = require('rehype-remark')
81+
var remarkStringify = require('remark-stringify')
82+
83+
unified()
84+
.use(rehypeParse) // Parse HTML to a syntax tree
85+
.use(rehypeRemark) // Turn HTML syntax tree to markdown syntax tree
86+
.use(remarkStringify) // Serialize markdown syntax tree
87+
.process('<em>emphasis</em> and <strong>strong</strong>')
88+
.then((file) => console.log(String(file)))
89+
.catch((error) => {
90+
throw error
91+
})
92+
```
93+
94+
This turns `<em>emphasis</em> and <strong>strong</strong>` into
95+
`*emphasis* and **strong**`.
96+
97+
### How to allow HTML embedded in markdown
98+
99+
Markdown is a content format that’s great for the more basic things:
100+
it’s nicer to write `*emphasis*` than `<em>emphasis</em>`.
101+
But it’s limited: its terse syntax supports only a handful of constructs.
102+
Luckily, for more complex things, markdown allows HTML inside it.
103+
A common example of this is to include a `<details>` element.
104+
105+
HTML embedded in markdown can be allowed when going from markdown to HTML
106+
by configuring [`remark-rehype`][remark-rehype] and
107+
[`rehype-stringify`][rehype-stringify]:
108+
109+
```javascript
110+
var unified = require('unified')
111+
var remarkParse = require('remark-parse')
112+
var remarkRehype = require('remark-rehype')
113+
var rehypeStringify = require('rehype-stringify')
114+
115+
unified()
116+
.use(remarkParse)
117+
.use(remarkRehype, {allowDangerousHtml: true}) // Pass raw HTML strings through.
118+
.use(rehypeStringify, {allowDangerousHtml: true}) // Serialize the raw HTML strings
119+
.process('*emphasis* and <strong>strong</strong>')
120+
.then((file) => console.log(String(file)))
121+
.catch((error) => {
122+
throw error
123+
})
124+
```
125+
126+
This solution **is not safe**: content you don’t trust can cause XSS
127+
vulnerabilities.
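
The risk can be made concrete with a small, dependency-free sketch. This is a simplified illustration, not the actual `rehype-stringify` implementation: `escapeText` and `serialize` are hypothetical helpers, and the tree is a hand-written guess at what untrusted markdown such as `hi <img src=x onerror=alert(1)>` might look like after the `remark-rehype` step with `allowDangerousHtml: true`.

```javascript
// Simplified illustration only: NOT the actual rehype-stringify implementation.
// Hypothetical helper: escape characters that are special in HTML text.
function escapeText(value) {
  return value.replace(/&/g, '&amp;').replace(/</g, '&lt;')
}

// Hypothetical helper: serialize a tiny HTML syntax tree.
function serialize(node, options) {
  if (node.type === 'text') return escapeText(node.value)
  if (node.type === 'raw') {
    // With `allowDangerousHtml`, the string is emitted verbatim;
    // otherwise it is treated like text and escaped.
    return options.allowDangerousHtml ? node.value : escapeText(node.value)
  }
  var content = node.children
    .map(function (child) {
      return serialize(child, options)
    })
    .join('')
  return '<' + node.tagName + '>' + content + '</' + node.tagName + '>'
}

// Hand-written tree with one unparsed `raw` string from untrusted input.
var tree = {
  type: 'element',
  tagName: 'p',
  children: [
    {type: 'text', value: 'hi '},
    {type: 'raw', value: '<img src=x onerror=alert(1)>'}
  ]
}

console.log(serialize(tree, {allowDangerousHtml: true}))
// => <p>hi <img src=x onerror=alert(1)></p>
console.log(serialize(tree, {allowDangerousHtml: false}))
// => <p>hi &lt;img src=x onerror=alert(1)></p>
```

In the first case the attacker-controlled `onerror` handler reaches the browser untouched, which is why both options carry "dangerous" in their name.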
128+
129+
### How to properly support HTML inside markdown
130+
131+
To properly support HTML embedded inside markdown, we need another plugin:
132+
[`rehype-raw`][rehype-raw].
133+
This plugin will take the strings of HTML embedded in markdown and parse them
134+
with an actual HTML parser.
135+
136+
```javascript
137+
var unified = require('unified')
138+
var remarkParse = require('remark-parse')
139+
var remarkRehype = require('remark-rehype')
140+
var rehypeRaw = require('rehype-raw')
141+
var rehypeStringify = require('rehype-stringify')
142+
143+
unified()
144+
.use(remarkParse)
145+
.use(remarkRehype, {allowDangerousHtml: true})
146+
.use(rehypeRaw) // *Parse* the raw HTML strings embedded in the tree
147+
.use(rehypeStringify)
148+
.process('*emphasis* and <strong>strong</strong>')
149+
.then((file) => console.log(String(file)))
150+
.catch((error) => {
151+
throw error
152+
})
153+
```
154+
155+
This solution **is not safe**: content you don’t trust can cause XSS
156+
vulnerabilities.
157+
158+
But because we now have a complete HTML syntax tree, we can sanitize that tree.
159+
For a safe solution, add [`rehype-sanitize`][rehype-sanitize] right before
160+
`rehype-stringify`.
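
What "sanitizing a tree" means can be sketched as an allowlist walk over the nodes. The block below is a simplified illustration, not how `rehype-sanitize` is implemented: `allowedTagNames` and `sanitize` are hypothetical, the allowlist is far smaller than any real schema, and the input tree is written by hand to stand in for the output of `rehype-raw`.

```javascript
// Simplified illustration only: NOT how rehype-sanitize is implemented.
// Hypothetical allowlist; a real schema is far more detailed.
var allowedTagNames = ['p', 'em', 'strong', 'details', 'summary']

// Hypothetical helper: return a cleaned copy of an HTML syntax tree node,
// or `null` when the node should be removed.
function sanitize(node) {
  if (node.type === 'text') return {type: 'text', value: node.value}
  // Drop anything that is not text or an allowed element (scripts, raw strings, etc.).
  if (node.type !== 'element' || allowedTagNames.indexOf(node.tagName) === -1) {
    return null
  }
  return {
    type: 'element',
    tagName: node.tagName,
    properties: {}, // Drop every attribute, including event handlers.
    children: node.children.map(sanitize).filter(Boolean)
  }
}

// Hand-written tree standing in for parsed, untrusted content.
var tree = {
  type: 'element',
  tagName: 'p',
  properties: {onClick: 'alert(1)'},
  children: [
    {
      type: 'element',
      tagName: 'strong',
      properties: {},
      children: [{type: 'text', value: 'strong'}]
    },
    {
      type: 'element',
      tagName: 'script',
      properties: {},
      children: [{type: 'text', value: 'alert(2)'}]
    }
  ]
}

var clean = sanitize(tree)
console.log(JSON.stringify(clean, null, 2))
```

This only works because `rehype-raw` has already turned the embedded HTML strings into real element nodes; unparsed `raw` strings cannot be inspected this way. In the pipeline above, the equivalent step would be a `.use()` of `rehype-sanitize` between `rehypeRaw` and `rehypeStringify`, rather than a hand-rolled walker like this one.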
161+
162+
[remark-parse]: https://github.com/remarkjs/remark/tree/main/packages/remark-parse
163+
164+
[remark-stringify]: https://github.com/remarkjs/remark/tree/main/packages/remark-stringify
165+
166+
[remark-rehype]: https://github.com/remarkjs/remark-rehype
167+
168+
[rehype-parse]: https://github.com/rehypejs/rehype/tree/main/packages/rehype-parse
169+
170+
[rehype-stringify]: https://github.com/rehypejs/rehype/tree/main/packages/rehype-stringify
171+
172+
[rehype-remark]: https://github.com/rehypejs/rehype-remark
173+
174+
[rehype-raw]: https://github.com/rehypejs/rehype-raw
175+
176+
[rehype-sanitize]: https://github.com/rehypejs/rehype-sanitize

doc/learn/remove-node.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
group: recipe
3-
index: 5
3+
index: 7
44
title: Remove a node
55
description: How to remove nodes in any unist tree
66
tags:

generate/pipeline/article.js

Lines changed: 2 additions & 1 deletion
@@ -41,7 +41,8 @@ module.exports = unified()
4141
JSX: null,
4242
MDX: null,
4343
PR: 'Pull request',
44-
XML: 'Extensible Markup Language'
44+
XML: 'Extensible Markup Language',
45+
XSS: 'Cross Site Scripting'
4546
})
4647
.use(rehypeLink)
4748
.use(rewriteUrls, {origin})
