CLI Usage

Complete reference for the lectito command-line tool.

Basic Syntax

lectito [OPTIONS] <INPUT>

The INPUT can be:

  • A URL (starts with http:// or https://)
  • A local file path
  • - to read from stdin

Examples

URL Extraction

lectito https://example.com/article

Local File

lectito article.html

Stdin Pipe

curl https://example.com | lectito -
cat page.html | lectito -
wget -qO- https://example.com | lectito -

Options

-o, --output <FILE>

Write output to a file instead of stdout.

lectito https://example.com/article -o article.md

-f, --format <FORMAT>

Specify output format. Available formats:

Format            Description
markdown or md    Markdown (default)
json              Structured JSON
text or txt       Plain text
html              Cleaned HTML

lectito https://example.com/article -f json

--timeout <SECONDS>

HTTP request timeout in seconds (default: 30).

lectito https://example.com/article --timeout 60

--user-agent <USER_AGENT>

Custom User-Agent header.

lectito https://example.com/article --user-agent "MyBot/1.0"

--config <PATH>

Path to site configuration file (TOML format).

lectito https://example.com/article --config site-config.toml

-v, --verbose

Enable verbose debug logging.

lectito https://example.com/article -v

-h, --help

Display help information.

lectito --help

-V, --version

Display version information.

lectito --version

Common Workflows

Extract and Save Article

lectito https://example.com/article -o articles/article.md

Batch Processing Multiple URLs

n=0
while IFS= read -r url; do
    n=$((n + 1))
    lectito "$url" -o "articles/$(date +%s)-$n.md"
done < urls.txt

Extract to JSON for Processing

lectito https://example.com/article --format json | jq '.metadata.title'

Extract from Multiple Files

for file in articles/*.html; do
    [ -e "$file" ] || continue
    lectito "$file" -o "processed/$(basename "$file" .html).md"
done

Custom Timeout for Slow Sites

lectito https://slow-site.com/article --timeout 120

Output Formats

Markdown (Default)

When --frontmatter is used, the output starts with a TOML frontmatter block containing the article metadata:

+++
title = "Article Title"
author = "John Doe"
date = "2025-01-17"
excerpt = "A brief description..."
+++

# Article Title

Article content here...

JSON

Structured output with all metadata:

{
    "metadata": {
        "title": "Article Title",
        "author": "John Doe",
        "date": "2025-01-17",
        "excerpt": "A brief description..."
    },
    "content": "<div>...</div>",
    "text_content": "Article content here...",
    "word_count": 500
}
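
Given the structure shown above, a jq filter can pull several fields in one pass. The sketch below assumes jq is installed and that real output carries the same keys as the sample (metadata.title, metadata.author, word_count). The name summarize_article is a hypothetical helper, not part of lectito.

```shell
# Hypothetical helper: read lectito's JSON from stdin and print
# "title<TAB>author<TAB>word_count" on one line.
# Assumes the keys shown in the sample output above.
summarize_article() {
    jq -r '[.metadata.title, .metadata.author, .word_count] | @tsv'
}
```

A possible use is `lectito https://example.com/article -f json | summarize_article`, which yields one tab-separated line per article and is easy to append to a report file.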

Plain Text

Just the article text without formatting:

Article Title

Article content here...

Exit Codes

Code    Meaning
0       Success
1       Error (invalid URL, network failure, etc.)
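
Because any failure maps to a non-zero exit code, a script can branch on the result directly. The following is a minimal sketch, assuming only the -o option documented above. The wrapper name extract_to is hypothetical.

```shell
# Hypothetical wrapper: extract one URL to a file and surface lectito's exit code.
# Usage: extract_to <url> <output-file>
extract_to() {
    if lectito "$1" -o "$2"; then
        printf 'ok: %s -> %s\n' "$1" "$2"
    else
        status=$?   # exit code of the failed lectito call
        printf 'failed (exit %s): %s\n' "$status" "$1" >&2
        return "$status"
    fi
}
```

For example, `extract_to https://example.com/article article.md || echo "skipping"` lets a larger script continue after a failed extraction.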

Error Handling

The CLI prints error messages to stderr and exits with code 1:

lectito https://invalid-domain-xyz.com
# Error: failed to fetch URL: dns error: failed to lookup address information

If a page's readability score falls below the threshold, extraction fails:

lectito https://example.com/page
# Error: content not readable: score 15.2 < threshold 20.0
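
Since errors go to stderr, they can be logged separately from the extracted article. A minimal sketch, assuming nothing beyond the behavior described above. The helper name extract_logged and the log file name are hypothetical.

```shell
# Hypothetical helper: article text stays on stdout, error messages are
# appended to a log file. Usage: extract_logged <url> <log-file>
extract_logged() {
    # 2>> redirects only stderr, so stdout can still be piped or saved
    lectito "$1" 2>>"$2"
}
```

For example, `extract_logged https://example.com/page errors.log > page.md` keeps the article in page.md and collects any error message in errors.log. The function returns lectito's own exit code.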

Tips

  1. Use timeouts: Set --timeout so a slow or unresponsive site cannot stall a batch run
  2. Batch operations: When extracting many URLs, run several lectito processes in parallel
  3. Save to file: Use -o to avoid terminal rendering overhead
  4. JSON for parsing: Use JSON output when processing with other tools
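
The timeout and batch tips can be combined in a small shell function. This is a sketch under stated assumptions: extract_parallel, the article-N.md naming scheme, and the 60-second timeout are illustrative choices, and only the documented --timeout and -o options are used.

```shell
# Hypothetical helper: extract every URL listed in a file, running a few
# lectito processes at a time.
# Usage: extract_parallel <urls-file> <output-dir> [max-jobs]
extract_parallel() {
    max_jobs=${3:-4}
    mkdir -p "$2"
    n=0
    # "|| [ -n "$url" ]" also handles a last line without a trailing newline
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue          # skip blank lines
        n=$((n + 1))
        # Run in the background; stdin is detached so the URL list is not consumed
        lectito "$url" --timeout 60 -o "$2/article-$n.md" < /dev/null &
        # After each full batch, wait for it to finish before starting more
        if [ $((n % max_jobs)) -eq 0 ]; then
            wait
        fi
    done < "$1"
    wait                                   # wait for the final partial batch
}
```

Calling `extract_parallel urls.txt articles 4` processes the list four URLs at a time. This simple batching does not report which individual extractions failed, so check stderr or the output directory afterwards.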

Next Steps