FindTheJob.today
Transparent methodology

Job Data and AI Enrichment Methodology

FindTheJob.today is designed to avoid the thin job-board pattern that search engines tend to devalue: feed listings republished with little added context. Imported jobs are treated as raw data, and only current, useful, source-identifiable pages are allowed into the public index.

1. Import

Careerjet and Remotive feeds are imported by cron. Each job receives a source, external ID, fingerprint, first-seen date, last-seen date, apply URL, and expiry window.
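
As a rough illustration of the import step, the sketch below stores the fields listed above and refreshes the last-seen date when a job reappears in a feed. The fingerprint inputs (title, company, location), the 30-day expiry window, and the names `JobRecord`, `fingerprint`, and `upsert` are assumptions made for this example, not the site's actual implementation.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta


def fingerprint(title: str, company: str, location: str) -> str:
    """Hash case- and whitespace-normalized fields (assumed inputs)."""
    parts = (" ".join(p.lower().split()) for p in (title, company, location))
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


@dataclass
class JobRecord:
    source: str
    external_id: str
    fingerprint: str
    first_seen: date
    last_seen: date
    apply_url: str
    expires_on: date


def upsert(store, source, external_id, title, company, location,
           apply_url, today, ttl_days=30):
    """Insert a new job, or refresh last-seen and expiry for a known one.

    ttl_days is a hypothetical expiry window.
    """
    key = (source, external_id)
    record = store.get(key)
    if record is None:
        record = JobRecord(
            source=source,
            external_id=external_id,
            fingerprint=fingerprint(title, company, location),
            first_seen=today,
            last_seen=today,
            expires_on=today + timedelta(days=ttl_days),
            apply_url=apply_url,
        )
        store[key] = record
    else:
        # The job is still present in the feed, so extend its lifetime.
        record.last_seen = today
        record.expires_on = today + timedelta(days=ttl_days)
    return record
```

Keying on source plus external ID keeps repeat imports idempotent, while the fingerprint gives a second signal for spotting the same job arriving from two different feeds.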

2. Quality Gate

Jobs are scored for description depth, salary visibility, freshness, source trust, duplicate risk, and location clarity before they can become indexable.
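
One way to picture the quality gate is a weighted score over the six signals named above. The weights, the 0.6 threshold, and the assumption that each signal is normalized to the range 0 to 1 are all hypothetical; the site's real scoring may differ.

```python
# Hypothetical weights for the six signals; they sum to 1.0.
WEIGHTS = {
    "description_depth": 0.25,
    "salary_visibility": 0.15,
    "freshness": 0.20,
    "source_trust": 0.20,
    "duplicate_risk": 0.10,
    "location_clarity": 0.10,
}

# Hypothetical cut-off a job must reach before it can become indexable.
INDEX_THRESHOLD = 0.6


def quality_score(signals: dict[str, float]) -> float:
    """Combine signals (each assumed to be in [0, 1]) into one score."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        # A missing signal is treated as the worst case.
        default = 1.0 if name == "duplicate_risk" else 0.0
        value = min(max(signals.get(name, default), 0.0), 1.0)
        if name == "duplicate_risk":
            # Risk is inverted: a lower duplicate risk is better.
            value = 1.0 - value
        total += weight * value
    return total


def passes_gate(signals: dict[str, float]) -> bool:
    return quality_score(signals) >= INDEX_THRESHOLD
```

Treating missing signals as worst case means a job with no salary or an unclear location can still pass, but only if the remaining signals are strong.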

3. Index Policy

Strong pages use `index, follow`. Thin, duplicate, filtered, expired, or source-weak URLs remain `noindex, follow` or are removed from the sitemap.
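
The policy above can be sketched as a small decision function. The boolean flag names and the helper functions are illustrative assumptions; only the two directive strings come from the policy itself.

```python
def robots_directive(*, passes_quality_gate: bool, is_duplicate: bool,
                     is_filtered: bool, is_expired: bool) -> str:
    """Return the robots directive for a job page (flag names are assumed)."""
    if passes_quality_gate and not (is_duplicate or is_filtered or is_expired):
        return "index, follow"
    # Thin, duplicate, filtered, expired, or source-weak pages stay out of the index.
    return "noindex, follow"


def robots_meta_tag(directive: str) -> str:
    """Render the directive as an HTML meta tag."""
    return f'<meta name="robots" content="{directive}">'


def sitemap_urls(pages: list[tuple[str, str]]) -> list[str]:
    """Keep only URLs whose directive allows indexing.

    Each page is a (url, directive) pair.
    """
    return [url for url, directive in pages if directive == "index, follow"]
```

Deriving the sitemap from the same directive keeps the two signals consistent, so a page is never listed in the sitemap while also marked `noindex`.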

4. Enrichment

AI providers help summarize market signals and generate editorial drafts, but pages still need visible sources, useful takeaways, citations, and schema before publication.
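
A pre-publication check along these lines could enforce that requirement. The draft is modeled here as a plain dictionary, and the key names are assumptions chosen to mirror the four requirements above.

```python
# Elements an AI-assisted draft must carry before publication (assumed key names).
REQUIRED_ELEMENTS = ("sources", "takeaways", "citations", "schema")


def missing_elements(draft: dict) -> list[str]:
    """List required elements that are absent or empty in the draft."""
    return [name for name in REQUIRED_ELEMENTS if not draft.get(name)]


def is_publishable(draft: dict) -> bool:
    return not missing_elements(draft)
```

Returning the list of missing elements, rather than a bare pass or fail, lets an editor see exactly what a draft still needs.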

What Makes a Job Page Indexable?

An indexable job page must be active, canonical, unexpired, source-identifiable, and useful without forcing a user to leave the site immediately. Salary, location, employment type, date posted, valid-through date, and JobPosting schema all improve the page's usefulness.
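
For illustration, the fields above map onto schema.org `JobPosting` properties roughly as follows. The function name, its parameters, the yearly salary unit, and the sample values are assumptions; the property names themselves (`datePosted`, `validThrough`, `employmentType`, `baseSalary`, and so on) are standard schema.org vocabulary.

```python
import json
from datetime import date


def job_posting_jsonld(title, description, company, date_posted: date,
                       valid_through: date, employment_type, locality, country,
                       salary_min=None, salary_max=None, currency="USD") -> str:
    """Build a JobPosting JSON-LD string from a job's visible fields."""
    data = {
        "@context": "https://schema.org",
        "@type": "JobPosting",
        "title": title,
        "description": description,
        "datePosted": date_posted.isoformat(),
        "validThrough": valid_through.isoformat(),
        "employmentType": employment_type,
        "hiringOrganization": {"@type": "Organization", "name": company},
        "jobLocation": {
            "@type": "Place",
            "address": {
                "@type": "PostalAddress",
                "addressLocality": locality,
                "addressCountry": country,
            },
        },
    }
    # Salary is optional: emit it only when the source actually states a range.
    if salary_min is not None and salary_max is not None:
        data["baseSalary"] = {
            "@type": "MonetaryAmount",
            "currency": currency,
            "value": {
                "@type": "QuantitativeValue",
                "minValue": salary_min,
                "maxValue": salary_max,
                "unitText": "YEAR",  # assumed unit for this sketch
            },
        }
    return json.dumps(data, indent=2)
```

The resulting string would typically be embedded in the page inside a `<script type="application/ld+json">` element.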

What Makes Editorial Content Publishable?

Editorial content must explain a career decision, not merely rephrase a feed. The best pages include first-party job-index observations, public career sources, practical next steps, and an updated date.

How Does the Site Expand?

Cron jobs import new jobs every 30 minutes, check AI provider health hourly, and run content enrichment every two hours. The public sitemap updates dynamically from jobs and content that pass quality checks.
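
The cadence above can be expressed as a small interval table. This is only a sketch of the timing logic: the task names are hypothetical, and the real system relies on cron rather than a Python scheduler.

```python
from datetime import datetime

# Interval in minutes for each recurring task (task names are hypothetical).
SCHEDULE_MINUTES = {
    "import_jobs": 30,
    "check_provider_health": 60,
    "enrich_content": 120,
}


def due_tasks(now: datetime) -> list[str]:
    """Return the tasks whose interval boundary falls on the given minute."""
    minutes_since_midnight = now.hour * 60 + now.minute
    return [
        name
        for name, interval in SCHEDULE_MINUTES.items()
        if minutes_since_midnight % interval == 0
    ]
```

Under these assumptions, a call at 01:00 returns the import and health-check tasks, and a call at 02:00 returns all three.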

Where Can Readers Audit the System?

Readers can inspect the public sitemap, robots file, llm file, trust pages, and article citations. Admin-only endpoints, API keys, and provider logs are intentionally blocked from crawlers.
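
Readers who want to verify the crawler rules themselves could use Python's standard `urllib.robotparser` module. The sample rules and paths below (`/admin/`, `/api/`, `/jobs/`) and the `ExampleBot` user agent are hypothetical stand-ins; the site's actual robots file may list different paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots rules, used only to demonstrate the check.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
Allow: /
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Check whether a crawler may fetch a URL under the given robots rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

To audit the live file instead of a sample string, the same parser offers `set_url()` and `read()` to download the rules before calling `can_fetch()`.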