# parse-entities

[![Build][build-badge]][build]
[![Coverage][coverage-badge]][coverage]
[![Downloads][downloads-badge]][downloads]
[![Size][size-badge]][size]

Parse HTML character references.

## Contents

* [What is this?](#what-is-this)
* [When should I use this?](#when-should-i-use-this)
* [Install](#install)
* [Use](#use)
* [API](#api)
  * [`parseEntities(value[, options])`](#parseentitiesvalue-options)
* [Types](#types)
* [Compatibility](#compatibility)
* [Security](#security)
* [Related](#related)
* [Contribute](#contribute)
* [License](#license)

## What is this?

This is a small and powerful decoder of HTML character references (often called
entities).

## When should I use this?

You can use this for spec-compliant decoding of character references.
It’s small and fast enough to do that well.
You can also use this when making a linter: each warning it emits comes with a
reason and positional info on where it happened.

## Install

This package is [ESM only][esm].
In Node.js (version 14.14+, 16.0+), install with [npm][]:

```sh
npm install parse-entities
```

In Deno with [`esm.sh`][esmsh]:

```js
import {parseEntities} from 'https://esm.sh/parse-entities@3'
```

In browsers with [`esm.sh`][esmsh]:

```html
<script type="module">
  import {parseEntities} from 'https://esm.sh/parse-entities@3?bundle'
</script>
```

## Use

```js
import {parseEntities} from 'parse-entities'

console.log(parseEntities('alpha &amp; bravo'))
// => alpha & bravo

console.log(parseEntities('charlie &copycat; delta'))
// => charlie ©cat; delta

console.log(parseEntities('echo &copy; foxtrot &#8800; golf &#x1D306; hotel'))
// => echo © foxtrot ≠ golf 𝌆 hotel
```

## API

This package exports the identifier `parseEntities`.
There is no default export.

### `parseEntities(value[, options])`

Parse HTML character references.

##### `options`

Configuration (optional).

###### `options.additional`

Additional character to accept (`string?`, default: `''`).
This allows other characters, without error, when following an ampersand.

###### `options.attribute`

Whether to parse `value` as an attribute value (`boolean?`, default: `false`).
This results in slightly different behavior: per the HTML spec, a nonterminated
named reference in an attribute value is left as text when it is followed by
`=` or an alphanumeric character.

###### `options.nonTerminated`

Whether to allow nonterminated references (`boolean`, default: `true`).
For example, `&copycat` for `©cat`.
This behavior is compliant with the spec but can lead to unexpected results.
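To make the `&copycat` case concrete, here is a minimal, self-contained sketch of longest-prefix matching for nonterminated names. It is not this package's implementation: `decodeLegacy` and the four-entry `legacy` map are hypothetical and only cover a tiny subset of the names that may omit the semicolon.

```javascript
// Hypothetical, tiny subset of names that may appear without a semicolon.
const legacy = new Map([
  ['amp', '&'],
  ['copy', '©'],
  ['lt', '<'],
  ['gt', '>']
])

// Simplified illustration only; the second parameter loosely mirrors
// the `nonTerminated` option described above.
function decodeLegacy(value, nonTerminated = true) {
  return value.replace(/&([A-Za-z]+)(;?)/g, (match, name, semicolon) => {
    // Properly terminated and known: always decode.
    if (semicolon && legacy.has(name)) return legacy.get(name)
    if (!nonTerminated) return match

    // Otherwise try the longest known prefix of the name.
    for (let end = name.length; end > 0; end--) {
      const prefix = name.slice(0, end)
      if (legacy.has(prefix)) {
        return legacy.get(prefix) + name.slice(end) + semicolon
      }
    }

    return match
  })
}

console.log(decodeLegacy('charlie &copycat; delta'))
// => charlie ©cat; delta

console.log(decodeLegacy('charlie &copycat; delta', false))
// => charlie &copycat; delta
```

The sketch shows why the default can surprise: `&copy` is recognized inside a longer word, so text that was never meant as a reference gets decoded.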

###### `options.position`

Starting `position` of `value` (`Position` or `Point`, optional).
Useful when dealing with values nested in some sort of syntax tree.
The default is:

```js
{line: 1, column: 1, offset: 0}
```
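As a rough illustration of why a starting point matters when `value` is nested in a larger document, the sketch below advances a point over some text. It assumes the unist convention (1-indexed `line` and `column`, 0-indexed `offset`); `advancePoint` is a hypothetical helper and not part of this package.

```javascript
// Hypothetical helper: move a unist-style point past `text`.
function advancePoint(start, text) {
  let {line, column, offset} = start

  for (let index = 0; index < text.length; index++) {
    if (text.charCodeAt(index) === 10 /* line feed */) {
      line++
      column = 1
    } else {
      column++
    }

    offset++
  }

  return {line, column, offset}
}

// Assume `value` starts at line 3, column 5 of some larger document.
const start = {line: 3, column: 5, offset: 20}
const value = 'a &amp; b'

// Where the ampersand sits in the larger document.
const ampersand = advancePoint(start, value.slice(0, value.indexOf('&')))

console.log(ampersand)
// => { line: 3, column: 7, offset: 22 }
```

Passing such a `start` as `options.position` is what lets handlers report places in the surrounding document rather than places relative to `value`.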

###### `options.warning`

Error handler ([`Function?`][warning]).

###### `options.text`

Text handler ([`Function?`][text]).

###### `options.reference`

Reference handler ([`Function?`][reference]).

###### `options.warningContext`

Context used when calling `warning` (`'*'`, optional).

###### `options.textContext`

Context used when calling `text` (`'*'`, optional).

###### `options.referenceContext`

Context used when calling `reference` (`'*'`, optional).

##### Returns

`string` — decoded `value`.

#### `function warning(reason, point, code)`

Error handler.

###### Parameters

* `this` (`*`) — refers to `warningContext` when given to `parseEntities`
* `reason` (`string`) — human-readable reason for emitting a parse error
* `point` ([`Point`][point]) — place where the error occurred
* `code` (`number`) — machine-readable code for the error

The following codes are used:

| Code | Example            | Note                                          |
| ---- | ------------------ | --------------------------------------------- |
| `1`  | `foo &amp bar`     | Missing semicolon (named)                     |
| `2`  | `foo &#123 bar`    | Missing semicolon (numeric)                   |
| `3`  | `Foo &bar baz`     | Empty (named)                                 |
| `4`  | `Foo &#`           | Empty (numeric)                               |
| `5`  | `Foo &bar; baz`    | Unknown (named)                               |
| `6`  | `Foo &#128; baz`   | [Disallowed reference][invalid]               |
| `7`  | `Foo &#xD800; baz` | Prohibited: outside permissible unicode range |
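For a linter, one possible shape for a warning handler is sketched below. The labels are copied from the table above; `report`, `onWarning`, and `warningLabels` are hypothetical names, and the final call only simulates how the handler would be invoked, with a made-up reason string, so the sketch runs without the package.

```javascript
// Labels taken from the table of codes above.
const warningLabels = {
  1: 'Missing semicolon (named)',
  2: 'Missing semicolon (numeric)',
  3: 'Empty (named)',
  4: 'Empty (numeric)',
  5: 'Unknown (named)',
  6: 'Disallowed reference',
  7: 'Prohibited: outside permissible unicode range'
}

// Hypothetical context object; it would be passed as `warningContext`.
const report = {messages: []}

// A regular function (not an arrow function), so that `this` can be bound
// to the context.
function onWarning(reason, point, code) {
  this.messages.push({
    code,
    label: warningLabels[code] || 'Unknown code',
    reason,
    line: point.line,
    column: point.column
  })
}

// Simulated call, mirroring the parameter list above.
onWarning.call(report, 'Example reason', {line: 1, column: 5, offset: 4}, 1)

console.log(report.messages[0].label)
// => Missing semicolon (named)
```

With the package itself, the handler and context would be passed as `{warning: onWarning, warningContext: report}`.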

#### `function text(value, position)`

Text handler.

###### Parameters

* `this` (`*`) — refers to `textContext` when given to `parseEntities`
* `value` (`string`) — string of content
* `position` ([`Position`][position]) — place where `value` starts and ends

#### `function reference(value, position, source)`

Character reference handler.

###### Parameters

* `this` (`*`) — refers to `referenceContext` when given to `parseEntities`
* `value` (`string`) — decoded character reference
* `position` ([`Position`][position]) — place where `source` starts and ends
* `source` (`string`) — raw source of character reference
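The two handlers can share one context to build a flat token list, as in the sketch below. The names `tokens`, `onText`, and `onReference` are hypothetical, and the calls at the end are written by hand for the input `a &amp; b` to mirror the parameter lists above; the actual call order and positions come from the package.

```javascript
// Hypothetical shared context; it would be passed as both `textContext`
// and `referenceContext`.
const tokens = []

function onText(value, position) {
  this.push({
    type: 'text',
    value,
    start: position.start.offset,
    end: position.end.offset
  })
}

function onReference(value, position, source) {
  this.push({
    type: 'reference',
    value,
    source,
    start: position.start.offset,
    end: position.end.offset
  })
}

// Hand-written calls simulating the input 'a &amp; b'.
onText.call(tokens, 'a ', {
  start: {line: 1, column: 1, offset: 0},
  end: {line: 1, column: 3, offset: 2}
})
onReference.call(tokens, '&', {
  start: {line: 1, column: 3, offset: 2},
  end: {line: 1, column: 8, offset: 7}
}, '&amp;')
onText.call(tokens, ' b', {
  start: {line: 1, column: 8, offset: 7},
  end: {line: 1, column: 10, offset: 9}
})

const decoded = tokens.map((token) => token.value).join('')

console.log(decoded)
// => a & b
```

Keeping `source` next to `value` lets a tool that rewrites documents preserve the original spelling of each reference.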

## Types

This package is fully typed with [TypeScript][].
It exports the additional types `Options`, `WarningHandler`,
`ReferenceHandler`, and `TextHandler`.
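In plain JavaScript, these types can be referenced from JSDoc comments so that an editor or `tsc --checkJs` checks the configuration. The sketch below assumes the package is installed for type checking; at runtime the comments are ignored, and the option values are arbitrary.

```javascript
/**
 * @typedef {import('parse-entities').Options} Options
 */

// The annotation asks the type checker to validate this object against
// the exported `Options` type; the values are just examples.
/** @type {Options} */
const options = {
  attribute: true,
  nonTerminated: false
}

console.log(Object.keys(options))
// => [ 'attribute', 'nonTerminated' ]
```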

## Compatibility

This package is at least compatible with all maintained versions of Node.js.
As of now, that is Node.js 14.14+ and 16.0+.
It also works in Deno and modern browsers.

## Security

This package is safe: it matches the HTML spec to parse character references.

## Related

* [`wooorm/stringify-entities`](https://github.com/wooorm/stringify-entities)
  — encode HTML character references
* [`wooorm/character-entities`](https://github.com/wooorm/character-entities)
  — info on character references
* [`wooorm/character-entities-html4`](https://github.com/wooorm/character-entities-html4)
  — info on HTML4 character references
* [`wooorm/character-entities-legacy`](https://github.com/wooorm/character-entities-legacy)
  — info on legacy character references
* [`wooorm/character-reference-invalid`](https://github.com/wooorm/character-reference-invalid)
  — info on invalid numeric character references

## Contribute

Yes please!
See [How to Contribute to Open Source][contribute].

## License

[MIT][license] © [Titus Wormer][author]

<!-- Definitions -->

[build-badge]: https://github.com/wooorm/parse-entities/workflows/main/badge.svg

[build]: https://github.com/wooorm/parse-entities/actions

[coverage-badge]: https://img.shields.io/codecov/c/github/wooorm/parse-entities.svg

[coverage]: https://codecov.io/github/wooorm/parse-entities

[downloads-badge]: https://img.shields.io/npm/dm/parse-entities.svg

[downloads]: https://www.npmjs.com/package/parse-entities

[size-badge]: https://img.shields.io/bundlephobia/minzip/parse-entities.svg

[size]: https://bundlephobia.com/result?p=parse-entities

[npm]: https://docs.npmjs.com/cli/install

[esmsh]: https://esm.sh

[license]: license

[author]: https://wooorm.com

[esm]: https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c

[typescript]: https://www.typescriptlang.org

[warning]: #function-warningreason-point-code

[text]: #function-textvalue-position

[reference]: #function-referencevalue-position-source

[invalid]: https://github.com/wooorm/character-reference-invalid

[point]: https://github.com/syntax-tree/unist#point

[position]: https://github.com/syntax-tree/unist#position

[contribute]: https://opensource.guide/how-to-contribute/