.Dd September  7, 2021
.Dt DEHTML 1
.Os
.
.Sh NAME
.Nm dehtml
.Nd extract text from HTML
.
.Sh SYNOPSIS
.Nm
.Op Fl s
.Op Ar
.
.Sh DESCRIPTION
The
.Nm
utility extracts text
from HTML documents.
Text inside
.Sy <title> ,
.Sy <style>
and
.Sy <script>
tags is discarded.
Numeric and common named HTML entities
are converted to the characters they represent.
.
.Pp
The arguments are as follows:
.Bl -tag -width Ds
.It Fl s
Collapse whitespace outside of
.Sy <pre>
tags.
.El
.
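.Sh EXAMPLES
A minimal sketch, not taken from the original page:
extract the text of a hypothetical file
.Pa page.html ,
collapsing whitespace,
assuming the result is written to standard output:
.Bd -literal -offset indent
dehtml -s page.html > page.txt
.Ed
.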
.Sh BUGS
There is no way to extract image alt text.