path: root/bin/man1/dehtml.1
Diffstat (limited to 'bin/man1/dehtml.1')
-rw-r--r-- bin/man1/dehtml.1 38
1 files changed, 38 insertions, 0 deletions
diff --git a/bin/man1/dehtml.1 b/bin/man1/dehtml.1
new file mode 100644
index 00000000..c55c35d4
--- /dev/null
+++ b/bin/man1/dehtml.1
@@ -0,0 +1,38 @@
+.Dd September  7, 2021
+.Dt DEHTML 1
+.Os
+.
+.Sh NAME
+.Nm dehtml
+.Nd extract text from HTML
+.
+.Sh SYNOPSIS
+.Nm
+.Op Fl s
+.Op Ar
+.
+.Sh DESCRIPTION
+The
+.Nm
+utility extracts text
+from HTML documents.
+Text inside
+.Sy <title> ,
+.Sy <style>
+and
+.Sy <script>
+tags is discarded.
+Numeric and common named HTML entities
+are converted.
+.
+.Pp
+The arguments are as follows:
+.Bl -tag -width Ds
+.It Fl s
+Collapse whitespace outside of
+.Sy <pre>
+tags.
+.El
+.
+.Sh BUGS
+There is no way to extract image alt text.
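
The behaviour described in the man page above can be approximated with a short Python sketch. This is not the dehtml implementation, only an illustration built on the standard library's html.parser. The names TextExtractor and extract_text are hypothetical, and collapsing each whitespace run to a single space is an assumption about what -s does.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical stand-in for dehtml, not its actual implementation."""

    # Elements whose text the man page says is discarded.
    SKIPPED = {"title", "style", "script"}

    def __init__(self, collapse=False):
        # convert_charrefs=True decodes numeric and named entities
        # before handle_data is called.
        super().__init__(convert_charrefs=True)
        self.collapse = collapse  # assumed analogue of the -s flag
        self.skip_depth = 0
        self.pre_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag == "pre":
            self.pre_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "pre" and self.pre_depth:
            self.pre_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return  # inside <title>, <style> or <script>
        if self.collapse and not self.pre_depth:
            # Assumption: a whitespace run becomes one space.
            data = re.sub(r"\s+", " ", data)
        self.chunks.append(data)


def extract_text(html, collapse=False):
    """Return the text of an HTML string, optionally collapsing whitespace."""
    parser = TextExtractor(collapse)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)
```

Calling extract_text(page, collapse=True) plays the role of running the utility with -s. Like the utility, the sketch ignores attributes, so image alt text is not extracted.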