SEO Report
Sync articles, detect duplicate metadata, and generate SEO reports from Google Search Console (GSC) data.
sync_articles_to_db
def sync_articles_to_db(
session:Session, # Active database session
website_id:int, # Parent website ID
url_file_mapping:dict, # URL → file path mapping
)->None:
Insert articles into the database, skipping already existing ones.
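The "insert, skipping existing ones" behaviour can be sketched without the library's own models. The example below is only an illustration under stated assumptions: it swaps the `Session` for the standard-library `sqlite3` module, and the `article` table, its columns, and the `sync_articles` helper are hypothetical names, not part of seo_rat.

```python
import sqlite3

def sync_articles(conn, website_id, url_file_mapping):
    """Insert one row per URL, skip URLs already stored, return rows added."""
    # Hypothetical schema: the URL is the primary key, so duplicates conflict.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS article ("
        "url TEXT PRIMARY KEY, file_path TEXT NOT NULL, website_id INTEGER NOT NULL)"
    )

    def row_count():
        return conn.execute("SELECT COUNT(*) FROM article").fetchone()[0]

    before = row_count()
    # INSERT OR IGNORE silently drops rows whose primary key already exists.
    conn.executemany(
        "INSERT OR IGNORE INTO article (url, file_path, website_id) VALUES (?, ?, ?)",
        [(url, path, website_id) for url, path in url_file_mapping.items()],
    )
    conn.commit()
    return row_count() - before

conn = sqlite3.connect(":memory:")
first = sync_articles(conn, 1, {"/a": "posts/a.md", "/b": "posts/b.md"})
second = sync_articles(conn, 1, {"/b": "posts/b.md", "/c": "posts/c.md"})
print(first, second)
```

Running the sync twice with an overlapping mapping adds only the URLs that were not already present, which is the property the docstring describes.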
Delete articles in the wrong website
find_duplicate_metadata
def find_duplicate_metadata(
session:Session, # Active database session
field:str, # Metadata field to check
website_id:int, # Parent website ID
similarity_threshold:float=0.9, # Minimum similarity to flag
)->list:
Find articles with duplicate or very similar metadata field values.
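To make the role of `similarity_threshold` concrete, here is a minimal sketch of pairwise similarity flagging. It assumes a `difflib.SequenceMatcher` ratio as the similarity measure, which may differ from what `find_duplicate_metadata` actually uses; `find_similar_pairs` and the URL-to-text input shape are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def find_similar_pairs(values, similarity_threshold=0.9):
    """Return (url_a, url_b, ratio) for every pair at or above the threshold."""
    pairs = []
    # Compare every unordered pair of articles once.
    for (url_a, text_a), (url_b, text_b) in combinations(values.items(), 2):
        # Case-insensitive comparison; ratio is 1.0 for identical strings.
        ratio = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if ratio >= similarity_threshold:
            pairs.append((url_a, url_b, round(ratio, 3)))
    return pairs

titles = {
    "/a": "Python SEO Guide",
    "/b": "python seo guide",
    "/c": "Baking Bread at Home",
}
pairs = find_similar_pairs(titles)
print(pairs)
```

The pairwise loop is quadratic in the number of articles, which is usually acceptable for a single website but worth keeping in mind for large sites.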
Set FocusKeyword to the top query from GSC
analyze_links
def analyze_links(
content:str, # Page content
domain:str, # Site domain to classify links
)->dict:
Analyze internal and external links in content.
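One plausible way to split links into internal and external is to compare each link's host with `domain`. The sketch below assumes Markdown-style `[text](url)` links and a result dict with `internal` and `external` keys; both the `classify_links` helper and that return shape are assumptions, not the documented output of `analyze_links`.

```python
import re
from urllib.parse import urlparse

# Captures the URL part of a Markdown link: [text](url)
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)")

def classify_links(content, domain):
    """Group link URLs by whether their host belongs to `domain`."""
    internal, external = [], []
    for url in MD_LINK.findall(content):
        host = urlparse(url).netloc.lower()
        # Links with no host (relative paths) and subdomains count as internal.
        if not host or host == domain or host.endswith("." + domain):
            internal.append(url)
        else:
            external.append(url)
    return {"internal": internal, "external": external}

content = (
    "See [home](/index.html), [docs](https://example.com/docs) "
    "and [ref](https://python.org)."
)
links = classify_links(content, "example.com")
print(links)
```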
check_heading_structure
def check_heading_structure(
headers:list, # Headers from [`extract_headers`](https://abdelkareemkobo.github.io/seo_rat/content_parser.html#extract_headers)
)->dict:
Check heading structure for H2 presence and skipped levels.
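The two checks named here, H2 presence and skipped levels, can be illustrated as follows. This sketch assumes each header is a `(level, text)` tuple, which may not match what `extract_headers` returns; `check_levels` and the keys of its result are hypothetical.

```python
def check_levels(headers):
    """Report whether an H2 exists and which level jumps skip a level."""
    levels = [level for level, _ in headers]
    skipped = []
    # A jump such as H2 -> H4 skips H3; moving back up (H4 -> H2) is fine.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            skipped.append((prev, cur))
    return {"has_h2": 2 in levels, "skipped_levels": skipped}

headers = [(1, "Title"), (2, "Intro"), (4, "Detail"), (2, "Next")]
structure = check_levels(headers)
print(structure)
```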
analyze_article
def analyze_article(
article:Article, # Article to analyze
domain:str, # Site domain
is_quarto:bool, # Whether content is Quarto
title_is_h1:bool=False, # Whether title counts as H1
desc_field:str='description', # Frontmatter description field
title_field:str='title', # Frontmatter title field
date_field:str='date', # Frontmatter date field
)->dict:
Run all SEO checks for a single article and return its report dict.
generate_seo_report
def generate_seo_report(
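session:Session, # Active database session
website_id:int, # Parent website ID
domain:str, # Site domain
is_quarto:bool, # Whether content is Quarto
title_is_h1:bool=False, # Whether title counts as H1
desc_field:str='description', # Frontmatter description field
title_field:str='title', # Frontmatter title field
date_field:str='date', # Frontmatter date field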
include_insights:bool=False, # Include query intents, trends, green keywords
days:int=90, # Days for insights lookback
query_limit:int=100, # Max queries for insights
)->dict:
Generate a comprehensive SEO report for all articles in a website.
print_issues_report
def print_issues_report(
report:dict, # Result from [`generate_seo_report`](https://abdelkareemkobo.github.io/seo_rat/seo_report.html#generate_seo_report)
)->None:
Print SEO issues sorted from most issues to fewest.
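The ordering described above can be sketched with a plain sort on issue count. The report shape is not documented here, so this example assumes a simplified mapping of URL to a list of issue strings; `sort_by_issue_count` and the sample data are hypothetical and do not reflect the real structure returned by `generate_seo_report`.

```python
def sort_by_issue_count(issues_by_url):
    """Return (url, issues) pairs, most issues first."""
    return sorted(issues_by_url.items(), key=lambda item: len(item[1]), reverse=True)

# Hypothetical report data: URL -> list of issue descriptions.
issues_by_url = {
    "/a": ["missing description"],
    "/b": ["missing description", "no H2", "title too long"],
    "/c": [],
}
ranked = sort_by_issue_count(issues_by_url)
for url, issues in ranked:
    print(f"{url}: {len(issues)} issue(s)")
    for issue in issues:
        print(f"  - {issue}")
```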