Options

ReadabilityOptions

ReadabilityOptions controls how article extraction behaves. Most callers should start with ReadabilityOptions::default() and override only the fields that solve a specific problem.

pub struct ReadabilityOptions {
    pub max_elems_to_parse: Option<usize>,
    pub nb_top_candidates: usize,
    pub char_threshold: usize,
    pub content_selector: Option<String>,
    pub site_profiles: Vec<String>,
    pub mobile_viewport_width: Option<usize>,
    pub classes_to_preserve: Vec<String>,
    pub keep_classes: bool,
    pub disable_json_ld: bool,
    pub link_density_modifier: f32,
}

Defaults:

ReadabilityOptions {
    max_elems_to_parse: None,
    nb_top_candidates: 5,
    char_threshold: 500,
    content_selector: None,
    site_profiles: Vec::new(),
    mobile_viewport_width: Some(480),
    classes_to_preserve: Vec::new(),
    keep_classes: false,
    disable_json_ld: false,
    link_density_modifier: 0.0,
}

content_selector is the most direct override: set it when the caller already knows which element of the document holds the article. site_profiles accepts TOML profile strings that provide host-scoped content roots, removal selectors, metadata hints, cleanup settings, and fallback behavior. char_threshold controls when an extraction attempt is accepted, and nb_top_candidates controls how many candidates remain in play during candidate selection.
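The "start from the defaults and override one field" advice can be sketched with Rust's struct update syntax. The struct below is a local stand-in that mirrors the fields and defaults listed above so the example compiles on its own; in real code you would import the crate's own type instead. The helper name options_for_known_layout and the idea that content_selector takes a CSS selector string are assumptions made for illustration, not part of the documented API.

```rust
// Local stand-in mirroring the documented struct, so this sketch compiles
// without the real crate. Replace it with the crate's own type in real code.
#[derive(Debug, Clone, PartialEq)]
pub struct ReadabilityOptions {
    pub max_elems_to_parse: Option<usize>,
    pub nb_top_candidates: usize,
    pub char_threshold: usize,
    pub content_selector: Option<String>,
    pub site_profiles: Vec<String>,
    pub mobile_viewport_width: Option<usize>,
    pub classes_to_preserve: Vec<String>,
    pub keep_classes: bool,
    pub disable_json_ld: bool,
    pub link_density_modifier: f32,
}

impl Default for ReadabilityOptions {
    // Values copied from the "Defaults" listing above.
    fn default() -> Self {
        Self {
            max_elems_to_parse: None,
            nb_top_candidates: 5,
            char_threshold: 500,
            content_selector: None,
            site_profiles: Vec::new(),
            mobile_viewport_width: Some(480),
            classes_to_preserve: Vec::new(),
            keep_classes: false,
            disable_json_ld: false,
            link_density_modifier: 0.0,
        }
    }
}

/// Hypothetical helper: build options for a page whose article container
/// is already known. Only `content_selector` is overridden; every other
/// field keeps its default value.
pub fn options_for_known_layout(selector: &str) -> ReadabilityOptions {
    ReadabilityOptions {
        content_selector: Some(selector.to_string()),
        ..ReadabilityOptions::default()
    }
}
```

Calling options_for_known_layout("article.post-body") yields options that differ from the defaults in exactly one field. Keeping the override that narrow makes it easier to see later which setting was added to solve which problem.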

ReadableOptions

ReadableOptions only affects is_probably_readable. It does not change full article extraction.

pub struct ReadableOptions {
    pub min_content_length: usize,
    pub min_score: f32,
}

Use lower thresholds for short-form content. Use higher thresholds when false positives are more expensive than missed articles.

Defaults:

ReadableOptions {
    min_content_length: 140,
    min_score: 20.0,
}
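As a rough illustration of the threshold advice above, the sketch below builds one looser and one stricter ReadableOptions value from the defaults. The struct is again a local stand-in mirroring the documented fields, the helper names short_form_options and strict_options are hypothetical, and the specific numbers are illustrative only; suitable values depend on the pages being checked.

```rust
// Local stand-in mirroring the documented struct, so this sketch compiles
// without the real crate. Replace it with the crate's own type in real code.
#[derive(Debug, Clone, PartialEq)]
pub struct ReadableOptions {
    pub min_content_length: usize,
    pub min_score: f32,
}

impl Default for ReadableOptions {
    // Values copied from the "Defaults" listing above.
    fn default() -> Self {
        Self {
            min_content_length: 140,
            min_score: 20.0,
        }
    }
}

/// Hypothetical preset for short-form content: both thresholds are lowered
/// so that brief posts are less likely to be rejected.
/// The numbers are illustrative, not recommended values.
pub fn short_form_options() -> ReadableOptions {
    ReadableOptions {
        min_content_length: 80,
        min_score: 10.0,
    }
}

/// Hypothetical preset for pipelines where a false positive costs more
/// than a missed article: both thresholds are raised.
/// The numbers are illustrative, not recommended values.
pub fn strict_options() -> ReadableOptions {
    ReadableOptions {
        min_content_length: 300,
        min_score: 40.0,
    }
}
```

Either value would then be passed wherever is_probably_readable accepts its options. Because ReadableOptions does not affect full article extraction, these presets can be tuned without changing extracted output.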