Async vs Sync

Understanding Lectito's async and synchronous APIs.

Overview

Lectito provides both synchronous and asynchronous APIs:

Function            Async/Sync   Use Case
parse()             Sync         Parse HTML from string
parse_with_url()    Sync         Parse with URL context
fetch_and_parse()   Async        Fetch from URL then parse
fetch_url()         Async        Fetch HTML from URL

When to Use Each

Use Sync APIs When

  • You already have the HTML as a string
  • You're using your own HTTP client
  • Network throughput is not a concern, because no fetching is involved
  • You're integrating into synchronous code

Use Async APIs When

  • You need to fetch from URLs
  • You're already using async/await
  • You want concurrent fetches
  • Performance matters for network operations

Synchronous Parsing

Parse HTML that you already have:

use lectito_core::parse;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let html = "<html>...</html>";
    let article = parse(html)?;
    Ok(())
}

Asynchronous Fetching

Fetch and parse in one operation:

use lectito_core::fetch_and_parse;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://example.com/article";
    let article = fetch_and_parse(url).await?;
    Ok(())
}

Manual Fetch and Parse

Use your own HTTP client:

use lectito_core::parse;
use reqwest::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new();
    let response = client.get("https://example.com/article")
        .send()
        .await?;

    let html = response.text().await?;
    let article = parse(&html)?;

    Ok(())
}

Concurrent Fetches

Fetch multiple articles concurrently:

use lectito_core::fetch_and_parse;
use futures::future::join_all;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let urls = vec![
        "https://example.com/article1",
        "https://example.com/article2",
        "https://example.com/article3",
    ];

    let futures: Vec<_> = urls.into_iter()
        .map(|url| fetch_and_parse(url))
        .collect();

    let articles = join_all(futures).await;

    for article in articles {
        match article {
            Ok(a) => println!("Got: {:?}", a.metadata.title),
            Err(e) => eprintln!("Error: {}", e),
        }
    }

    Ok(())
}
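
Because join_all returns one Result per URL, it is often convenient to separate the successes from the failures before doing anything else. The helper below is a minimal sketch of that step. `partition_results` is a hypothetical name, not part of Lectito, and it is generic so it does not assume anything about Lectito's article or error types.

```rust
/// Split per-URL results into successes and failures, preserving order.
/// Hypothetical helper for illustration; not part of lectito_core.
fn partition_results<T, E>(results: Vec<Result<T, E>>) -> (Vec<T>, Vec<E>) {
    let mut succeeded = Vec::new();
    let mut failed = Vec::new();
    for result in results {
        match result {
            Ok(value) => succeeded.push(value),
            Err(err) => failed.push(err),
        }
    }
    (succeeded, failed)
}
```

In the example above, this could be called as `let (articles, errors) = partition_results(join_all(futures).await);`, after which the two vectors can be reported or retried separately.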

Batch Processing

Process URLs with concurrency limits:

use futures::stream::{self, StreamExt};
use lectito_core::fetch_and_parse;

async fn process_urls(urls: Vec<String>) -> Result<(), Box<dyn std::error::Error>> {
    // Turn each URL into a future, then poll at most 5 of them at a time.
    // Results arrive in completion order, not input order.
    let mut results = stream::iter(urls)
        .map(|url| async move { fetch_and_parse(&url).await })
        .buffer_unordered(5); // 5 concurrent requests

    while let Some(article) = results.next().await {
        // The `?` stops processing at the first failed URL.
        println!("Processed: {:?}", article?.metadata.title);
    }

    Ok(())
}

Sync Code in Async Context

If you need to use sync parsing in async code:

use lectito_core::parse;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Fetch with your async HTTP client
    let html = fetch_html().await?;

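    // parse() is synchronous and CPU-bound. Calling it directly is fine for
    // typical pages; for very large documents, consider moving it to
    // tokio::task::spawn_blocking so it does not stall the executor.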
    let article = parse(&html)?;

    Ok(())
}

async fn fetch_html() -> Result<String, Box<dyn std::error::Error>> {
    // Your async fetching logic
    Ok(String::from("<html>...</html>"))
}

Performance Considerations

Parsing (Sync)

Parsing is CPU-bound and runs synchronously:

use lectito_core::parse;
use std::time::Instant;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let html = "<html>...</html>";

    let start = Instant::now();
    let article = parse(html)?;
    let duration = start.elapsed();

    println!("Parsed in {:?}", duration);

    Ok(())
}
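
A single timing is noisy, so it can help to average over several runs. The following is a minimal sketch using only the standard library. `time_average` is a hypothetical helper, not part of Lectito, and it accepts any closure, so the parse call is supplied by the caller.

```rust
use std::time::{Duration, Instant};

/// Run `work` `runs` times and return the last result together with the
/// mean wall-clock duration per run. Returns `None` when `runs` is zero.
/// Hypothetical helper for illustration; not part of lectito_core.
fn time_average<T>(runs: u32, mut work: impl FnMut() -> T) -> Option<(T, Duration)> {
    let mut last = None;
    let start = Instant::now();
    for _ in 0..runs {
        last = Some(work());
    }
    let total = start.elapsed();
    // `last` is `None` only when `runs` is zero, so the division is safe.
    last.map(|value| (value, total / runs))
}
```

With Lectito this might be used as `time_average(100, || parse(html))`. For meaningful numbers, build in release mode and use a representative document rather than a tiny placeholder string.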

Fetching (Async)

Fetching is I/O-bound and benefits from async:

use lectito_core::fetch_and_parse;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let start = std::time::Instant::now();
    let article = fetch_and_parse("https://example.com/article").await?;
    let duration = start.elapsed();

    println!("Fetched and parsed in {:?}", duration);

    Ok(())
}

Choosing the Right Approach

Scenario                Recommended Approach
Have HTML string        parse() (sync)
Need to fetch URL       fetch_and_parse() (async)
Custom HTTP client      Your client + parse() (sync)
Batch URL processing    fetch_and_parse() with concurrent futures
CLI tool                Depends on your runtime setup
Web server              fetch_and_parse() (async) for better throughput

Feature Flags

To disable async features and reduce dependencies:

[dependencies]
lectito-core = { version = "0.1", default-features = false, features = ["markdown"] }

This removes reqwest and tokio dependencies. You'll need to fetch HTML yourself.
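
With the async features disabled, the HTML has to come from somewhere else, such as a file on disk or standard input. The sketch below shows one way to obtain it with only the standard library. `load_html` is a hypothetical helper, not part of Lectito, and the resulting string is what you would then hand to parse().

```rust
use std::fs;
use std::io::{self, Read};
use std::path::Path;

/// Load HTML from `path`, or from standard input when `path` is `None`.
/// Hypothetical helper for illustration; not part of lectito_core.
fn load_html(path: Option<&Path>) -> io::Result<String> {
    match path {
        Some(p) => fs::read_to_string(p),
        None => {
            let mut buffer = String::new();
            io::stdin().read_to_string(&mut buffer)?;
            Ok(buffer)
        }
    }
}
```

A caller would then write something like `let article = parse(&load_html(Some(path))?)?;`, assuming the surrounding function returns a boxed error as in the earlier examples.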

Next Steps