Async vs Sync
Understanding Lectito's async and synchronous APIs.
Overview
Lectito provides both synchronous and asynchronous APIs:
| Function | Async/Sync | Use Case |
|---|---|---|
parse() | Sync | Parse HTML from string |
parse_with_url() | Sync | Parse with URL context |
fetch_and_parse() | Async | Fetch from URL then parse |
fetch_url() | Async | Fetch HTML from URL |
When to Use Each
Use Sync APIs When
- You already have the HTML as a string
- You're using your own HTTP client
- Performance is not critical
- You're integrating into synchronous code
Use Async APIs When
- You need to fetch from URLs
- You're already using async/await
- You want concurrent fetches
- Performance matters for network operations
Synchronous Parsing
Parse HTML that you already have:
use lectito_core::parse;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let html = "<html>...</html>";
let article = parse(html)?;
Ok(())
}
Asynchronous Fetching
Fetch and parse in one operation:
use lectito_core::fetch_and_parse;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let url = "https://example.com/article";
let article = fetch_and_parse(url).await?;
Ok(())
}
Manual Fetch and Parse
Use your own HTTP client:
use lectito_core::parse;
use reqwest::Client;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = Client::new();
let response = client.get("https://example.com/article")
.send()
.await?;
let html = response.text().await?;
let article = parse(&html)?;
Ok(())
}
Concurrent Fetches
Fetch multiple articles concurrently:
use lectito_core::fetch_and_parse;
use futures::future::join_all;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let urls = vec![
"https://example.com/article1",
"https://example.com/article2",
"https://example.com/article3",
];
let futures: Vec<_> = urls.into_iter()
.map(|url| fetch_and_parse(url))
.collect();
let articles = join_all(futures).await;
for article in articles {
match article {
Ok(a) => println!("Got: {:?}", a.metadata.title),
Err(e) => eprintln!("Error: {}", e),
}
}
Ok(())
}
Batch Processing
Process URLs with concurrency limits:
use lectito_core::fetch_and_parse;
use futures::stream::{StreamExt, try_stream};
async fn process_urls(urls: Vec<String>) -> Result<(), Box<dyn std::error::Error>> {
let stream = try_stream! {
for url in urls {
let article = fetch_and_parse(&url).await?;
yield article;
}
};
let mut stream = stream.buffer_unordered(5); // 5 concurrent requests
while let Some(article) = stream.next().await {
println!("Processed: {:?}", article?.metadata.title);
}
Ok(())
}
Sync Code in Async Context
If you need to use sync parsing in async code:
use lectito_core::parse;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Fetch with your async HTTP client
let html = fetch_html().await?;
// Parse is sync, but that's fine in async context
let article = parse(&html)?;
Ok(())
}
async fn fetch_html() -> Result<String, Box<dyn std::error::Error>> {
// Your async fetching logic
Ok(String::from("<html>...</html>"))
}
Performance Considerations
Parsing (Sync)
Parsing is CPU-bound and runs synchronously:
use lectito_core::parse;
use std::time::Instant;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let html = "<html>...</html>";
let start = Instant::now();
let article = parse(html)?;
let duration = start.elapsed();
println!("Parsed in {:?}", duration);
Ok(())
}
Fetching (Async)
Fetching is I/O-bound and benefits from async:
use lectito_core::fetch_and_parse;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let start = std::time::Instant::now();
let article = fetch_and_parse("https://example.com/article").await?;
let duration = start.elapsed();
println!("Fetched and parsed in {:?}", duration);
Ok(())
}
Choosing the Right Approach
| Scenario | Recommended Approach |
|---|---|
| Have HTML string | parse() (sync) |
| Need to fetch URL | fetch_and_parse() (async) |
| Custom HTTP client | Your client + parse() (sync) |
| Batch URL processing | fetch_and_parse() with concurrent futures |
| CLI tool | Depends on your runtime setup |
| Web server | fetch_and_parse() (async) for better throughput |
Feature Flags
To disable async features and reduce dependencies:
[dependencies]
lectito-core = { version = "0.1", default-features = false, features = ["markdown"] }
This removes reqwest and tokio dependencies. You'll need to fetch HTML yourself.
Next Steps
- Output Formats - Working with different output formats
- Configuration - Advanced configuration options
- Basic Usage - Core usage patterns