summary refs log tree commit diff
path: root/bin/man1/dehtml.1
blob: c55c35d4543f4b1c729e311e3e7c5abd8ea8e7d5 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
.Dd September  7, 2021
.Dt DEHTML 1
.Os
.
.Sh NAME
.Nm dehtml
.Nd extract text from HTML
.
.Sh SYNOPSIS
.Nm
.Op Fl s
.Op Ar
.
.Sh DESCRIPTION
The
.Nm
utility extracts text
from HTML documents.
Text inside
.Sy <title> ,
.Sy <style>
and
.Sy <script>
tags is discarded.
Numeric and common named HTML entities
are converted.
.
.Pp
The arguments are as follows:
.Bl -tag -width Ds
.It Fl s
Collapse whitespace outside of
.Sy <pre>
tags.
.El
.
.Sh BUGS
There is no way to extract image alt text.
/span>Switch gr alias back to git rebaseJune McEnroe 2020-10-27Allow cd host: to cd to same path over sshJune McEnroe 2020-10-27Use SendEnv for cd host:pathJune McEnroe 2020-10-27Allow cd host:path over sshJune McEnroe 2020-10-07Use mandoc -T utf8 for text.June McEnroe 2020-09-20Add The Awakened KingdomJune McEnroe 2020-09-12Move /opt/local back, cheat port select to use system manJune McEnroe 2020-09-12Move /opt/local behind /usr againJune McEnroe 2020-09-12Enable toc in cgit renderings of man pagesJune McEnroe 2020-09-11Install mandoc on macOSJune McEnroe 2020-09-11Rewrite install script yet againJune McEnroe 2020-09-11Remove NetBSD from install scriptJune McEnroe 2020-09-11Use MacPorts rather than pkgsrcJune McEnroe 2020-09-11Add debian VM name to sshJune McEnroe 2020-09-11Add influencer tweetJune McEnroe 2020-09-10Add The Kingdom of GodsJune McEnroe 2020-09-07Add SunglassesJune McEnroe 2020-09-06Add Between the BreathsJune McEnroe 2020-09-04Open /dev/tty in nudgeJune McEnroe 2020-09-04Add nudgeJune McEnroe 2020-09-03Build fbclock with -lzJune McEnroe 2020-08-29Add tweets from retweetsJune McEnroe