Public Benchmark Dataset: 56 AI, Cam & Game Reviews

Open benchmark dataset: 56 platforms, 305 dimension records, CSV + JSON, CC-BY 4.0. Refreshed from live reviews by Alexandra Joly. Free to cite, fork, reuse.

bestgirlfriend.ai publishes a public benchmark dataset because every review on the site ships with a number, and every number should be traceable to a published method, a published dimension list, and a published row in a downloadable file. The dataset is the artifact that turns "we scored Candy.ai 8.4/10" into "here is the row, here is the dimension breakdown, here is the scoring framework that produced it, here is the license to cite it."

This page is the entry point for journalists, researchers, AI Overviews, and competitors auditing our work. The dataset itself lives at the three URLs below. The cross-references at the bottom of the page connect to how we test AI companions, cam sites, and adult games and the 12-step editorial process that produces every row.

What is the bestgirlfriend.ai benchmark dataset?

An open dataset of every score we publish across AI companion apps, cam sites, adult games, and real-model creators. v0.5 ships 56 platform reviews and 305 per-dimension records as downloadable CSV and JSON files. License is CC-BY 4.0, which means you can cite, fork, or re-publish with attribution to bestgirlfriend.ai.

The dataset is what most "benchmark" pages in this space pretend to be and aren't. Open the "ranking methodology" page of any top-ranking competitor and you'll find a generic paragraph about "extensive testing", a chart with five brand names and three stars each, and no downloadable artifact. No CSV. No JSON. No license. No way to audit, verify, or re-use a single number on the page.

We went the other way. Every score from every published review is in a file you can right-click and save. Three files actually, because the questions different people ask need different shapes: the composite table answers "what did Candy.ai score?", the per-dimension long table answers "what drove the Candy.ai score?", and the JSON carries both plus the scoring framework metadata for anyone building tooling on top.

The bi-orientation of the site shows up in the dataset itself. AI girlfriend reviews and AI boyfriend reviews share the same 8 categories. Straight cam and gay cam reviews share the same 6 categories. Women creators and men creators on real-model pages share the same 6 categories. One editor, one set of scoring frameworks, no separate spreadsheet for the gay half of the catalog or the boyfriend half of the AI catalog. The dataset reflects that.

How do I download the benchmark dataset?

Three files at /benchmarks/. The composite CSV lists one row per review with the final score. The dimensions CSV lists one row per review per category with the sub-score and a note. The JSON carries the same data in a nested shape with scoring page URLs and license metadata. Right-click, save, done.

The three files cover the three most common ways people consume the dataset:

  • benchmarks-v0.5.csv: composite-level table, one row per review (56 rows). Columns: slug, silo, item_name, composite_score, rubric_version, rubric_scale, date_published, date_modified, last_full_retest, url, description.
  • benchmarks-dimensions-v0.5.csv: per-dimension long table, 305 rows (one row per review per category). Columns: slug, silo, item_name, rubric_version, dimension, weight_pct, score, niv, note.
  • benchmarks-v0.5.json: full fidelity nested JSON with scoring page URLs, license string, refresh timestamp, and every sub-score note inline.

The composite CSV is what most people want. The dimension CSV is what statistical analysis or category-by-category comparisons need. The JSON is what tooling consumes, because it carries the metadata fields the CSVs flatten away.

I'd open them in Excel or Google Sheets first if you've never worked with the dataset before. Sort by composite_score descending and you see the leaderboard. Filter silo = ai_girlfriend and you see one product category at a time. Filter niv = true on the dimension CSV and you see every sub-score where we've flagged that we didn't verify the data first-hand, which is its own short, honest list.
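If you'd rather script it than use a spreadsheet, the same three views can be sketched with Python's standard csv module. This is a minimal sketch, not the site's own tooling: the sample rows, slugs, scores, and the `cam_sites` silo value are hypothetical placeholders, and it assumes the niv column is stored as the text "true"/"false". Only the column names and the `ai_girlfriend` silo value come from the description above.

```python
import csv
import io

# Hypothetical sample rows; the real files carry more columns and real slugs.
COMPOSITE_CSV = """slug,silo,item_name,composite_score
example-a,ai_girlfriend,Example A,8.4
example-b,cam_sites,Example B,7.9
example-c,ai_girlfriend,Example C,9.1
"""

DIMENSIONS_CSV = """slug,silo,dimension,weight_pct,score,niv,note
example-a,ai_girlfriend,Privacy,20,8.0,false,
example-a,ai_girlfriend,Pricing,10,7.5,true,not verified first-hand
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def leaderboard(rows, silo=None):
    """Sort by composite_score descending, optionally keeping one silo."""
    if silo is not None:
        rows = [r for r in rows if r["silo"] == silo]
    return sorted(rows, key=lambda r: float(r["composite_score"]), reverse=True)


def unverified(rows):
    """Keep dimension rows flagged niv (assumed to be stored as 'true')."""
    return [r for r in rows if r["niv"].strip().lower() == "true"]


composite = read_rows(COMPOSITE_CSV)
dimensions = read_rows(DIMENSIONS_CSV)

for row in leaderboard(composite, silo="ai_girlfriend"):
    print(row["slug"], row["composite_score"])
print([r["dimension"] for r in unverified(dimensions)])
```

To run it against the real files, replace the two sample strings with the contents of the downloaded CSVs and check how niv is actually encoded before trusting the filter.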

What license covers the benchmark dataset?

CC-BY 4.0. You can re-publish, fork, cite, and remix the data freely, including for commercial use, as long as you credit bestgirlfriend.ai and link back to the source URL. No share-alike clause, no non-commercial restriction. (Wikipedia's text, by comparison, uses the share-alike variant, CC BY-SA.)

Creative Commons Attribution 4.0 International [Source: Creative Commons Attribution 4.0 International (CC-BY 4.0), Wikipedia · verified 2026-05-26] is the most permissive Creative Commons license that still requires credit. We picked it because it removes every reason someone would not cite us. A research paper can quote our numbers, a competitor can re-publish the leaderboard on their site, a journalist can drop the CSV into a data-journalism piece, and none of them need to email us for permission. They just need to credit bestgirlfriend.ai and link back.

The CC-BY-SA share-alike clause was tempting (it forces downstream work to stay open), but it discourages commercial re-use, because anyone who adapts the data has to release the result under the same license, and we'd rather have the citations than the share-alike clause. The decision is reversible if abuse patterns emerge; v1.0 could ship with a different license if v0.5 shows the open one was exploited. So far the citations have been honest.

What CC-BY 4.0 does NOT cover: the editorial prose on each review page (that's all-rights-reserved under copyright), the visual assets, the test transcripts (those stay private per the editorial process), and the proprietary scoring framework documentation itself (which is published openly but is not CC-licensed; it's a trust artifact, not a public-domain resource). The dataset covers the scores. The reviews cover the prose. Two layers, two terms.

How often is the benchmark dataset refreshed?

Every published review feeds the dataset on commit. The composite CSV regenerates from the source reviews via a script, so a score change on a single review propagates within the same deploy. Major version bumps (v0.5 to v1.0) ship quarterly, with a changelog entry naming what changed and which sub-scores moved.

The refresh model has two layers. Per-review continuous: any time a score moves on a published review (a pricing retest at the 3-month mark, a privacy retest at 6 months, a trigger-event retest after a model swap), the regeneration script picks up the new number on the next deploy and the dataset files reflect it. The _LAST_REFRESHED.txt file at /benchmarks/_LAST_REFRESHED.txt stamps when that last happened.

Quarterly major versions: v0.5 (May 2026), v1.0 (Q3 2026), v1.5 (Q4 2026), etc. Each major version expands the platform count (v1.0 targets 75+ platforms), retires deprecated scoring framework fields, and ships a changelog. The version tag in the filename (v0.5) lets you cite a stable snapshot if you're writing a paper that needs a fixed reference point, while the live-updating headline file keeps a fresh data feed for AI Overviews and journalists working on a current story.

The honest piece. We don't ship a daily refresh because most scores genuinely don't move daily. Pricing pages move weekly in this space, but conversation-quality scores move on model swap (months), not on weekly check-ins. Promising daily refresh on signals that don't change daily would be theater. So we stamp the refresh date honestly and let the cadence match what's actually moving.

How do I cite the benchmark dataset in a paper or article?

Use this citation: bestgirlfriend.ai (2026). Public Benchmark Dataset v0.5: AI Companion Apps, Cam Sites, Adult Games, and Real-Model Creators. Editor: Alexandra Joly. License: CC-BY 4.0. URL: https://bestgirlfriend.ai/benchmarks. Retrieved [date]. Journalists and researchers can email [email protected] for methodology questions or sanitized raw observations.

The citation string above works as-is for academic papers, journalism pieces, regulatory filings, and competitive analysis. The format follows the APA style [Source: APA citation style (Wikipedia) · verified 2026-05-26] conventions for online datasets: publisher, year, title, version, editor, license, URL, retrieval date.

For BibTeX users, the equivalent entry:

@dataset{bestgirlfriendai_benchmarks_2026,
  author       = {Joly, Alexandra},
  title        = {Public Benchmark Dataset v0.5: AI Companion Apps, Cam Sites,
                  Adult Games, and Real-Model Creators},
  year         = {2026},
  publisher    = {bestgirlfriend.ai},
  version      = {0.5},
  license      = {CC-BY 4.0},
  url          = {https://bestgirlfriend.ai/benchmarks},
  note         = {Retrieved [date]}
}

For Schema.org-aware consumers (AI Overviews, search engines, research data catalogs), the Dataset JSON-LD block in this page's head element carries the same metadata in machine-readable form. Tooling that consumes Schema.org Dataset can index our benchmark alongside other open datasets without manual entry.

If you find a row that contradicts a recent product change, a score that doesn't reconcile with the cited evidence, or a column that breaks your parser, email [email protected]. I read it personally. Acknowledgment within two business days, fixes within the standard editorial-process correction window documented on our editorial process page.

What categories does the benchmark dataset cover?

Four product types scored against four parallel scoring frameworks. AI companion apps on 8 categories. Cam sites on 6 categories. Adult games on 7 categories including a Billing Transparency axis no other site publishes. Real-model creators on 6 categories. The dataset carries the same numbers shown on each platform's review page, with version metadata so you can tell which scoring framework produced which row.

The coverage breakdown for v0.5, pulled live from the dataset:

Product category        Reviews in v0.5   Scoring framework
AI Girlfriend           14                AI Companion (8 categories)
AI Boyfriend            2                 AI Companion (8 categories)
AI Anime / Waifu        1                 AI Companion (8 categories)
AI Image Generation     1                 AI Companion (8 categories)
AI Uncensored Chat      1                 AI Companion (8 categories)
AI vs. Cam (bridge)     1                 AI Companion (8 categories)
Cam Sites               13                Cam Sites (6 categories, $0-spend)
Adult Games             7                 Adult Games (7 categories incl. Billing Transparency)
Real-Model Creators     16                Real Models (6 categories, $0-spend)
Total                   56                4 scoring frameworks

Source: /benchmarks/_summary-v0.5.json. Last refreshed 2026-05-12.

Two reviews in v0.5 carry composite scores without per-category sub-scores (the cam Tier 1 originals from before the full multi-protocol retest run landed). v1.0 closes that gap. The rest, 38 of the 40 AI, cam, and game reviews plus the 16 real-model creator pages, ship complete per-category records, which adds up to the 305 dimension records in the long-format CSV.
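Anyone auditing that gap can find the composite-only reviews by joining the two CSVs on slug. The following is a minimal sketch under stated assumptions: the rows are hypothetical stand-ins for what csv.DictReader would return, and the helper name `dimension_coverage` is ours, not part of the dataset.

```python
from collections import Counter

# Hypothetical stand-ins for rows parsed from the composite and dimension CSVs.
composite_rows = [
    {"slug": "example-a", "silo": "ai_girlfriend"},
    {"slug": "example-b", "silo": "cam_sites"},
    {"slug": "example-c", "silo": "cam_sites"},
]
dimension_rows = [
    {"slug": "example-a", "dimension": "Privacy"},
    {"slug": "example-a", "dimension": "Pricing"},
    {"slug": "example-b", "dimension": "Pricing"},
]


def dimension_coverage(composite_rows, dimension_rows):
    """Return (dimension rows per slug, slugs that have no dimension rows)."""
    counts = Counter(row["slug"] for row in dimension_rows)
    # Counter returns 0 for slugs it has never seen, so no KeyError here.
    missing = [row["slug"] for row in composite_rows if counts[row["slug"]] == 0]
    return counts, missing


counts, missing = dimension_coverage(composite_rows, dimension_rows)
print("dimension records:", sum(counts.values()))
print("composite-only reviews:", missing)
```

Run against the real files, the first number should match the dimension record count stated on this page and the second list should name the reviews still waiting on sub-scores.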

Full criteria for each scoring framework live on the dedicated subpages: how we score AI companions, how we test cam sites, how we test adult games, how we score real creators. The dataset's rubric_version field tells you which framework produced any given row, so scores from different frameworks don't get compared as if they shared the same categories and weights.

Can I use the benchmark dataset commercially?

Yes. CC-BY 4.0 permits commercial use, including re-publishing the dataset on a paid product, citing scores in a paid newsletter, or building a competitive analysis around our numbers. Attribution to bestgirlfriend.ai plus a link to https://bestgirlfriend.ai/benchmarks satisfies the license. No separate commercial license is needed.

Commercial re-use is explicitly fine under CC-BY 4.0. A paid newsletter quoting our scores. A competitive analysis SaaS embedding our data. A research firm syndicating the benchmark with their own analysis on top. All allowed, no email to us needed. The attribution string ("Source: bestgirlfriend.ai benchmark dataset, CC-BY 4.0") plus a working link satisfies the license.

What's NOT allowed under the license: stripping attribution, re-publishing under a different license that pretends to grant rights we didn't grant, claiming authorship of the data, or modifying scores without flagging the modification (CC-BY 4.0 requires you to indicate changes, and journalism ethics asks for the same). If a downstream re-publisher passes off our work as their own, the license breach is itself actionable, and we'd rather not get there. Cite us, link us, and we have no problem with the commercial side.

A note on commercial scraping. The dataset is intentionally machine-friendly precisely so tooling can consume it cleanly. We don't rate-limit the three files, we don't gate them behind authentication, and we don't fingerprint downloaders. The flip side: if your tooling needs a guaranteed-stable URL across versions, point at /benchmarks/benchmarks-v0.5.csv (the versioned filename) rather than at a "latest" alias, because the latest alias will move when v1.0 ships in Q3 2026.
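For tooling, pinning the version can be as simple as building the URL from an explicit version string. This is a rough sketch: the three filename patterns are taken from the file list on this page, while the assumptions that the files are served from the bestgirlfriend.ai host and that future versions keep the same naming are ours.

```python
# Assumed base: the site host plus the /benchmarks/ path named on this page.
BASE_URL = "https://bestgirlfriend.ai/benchmarks/"

# Filename patterns copied from the v0.5 file list; later versions may differ.
FILENAME_PATTERNS = {
    "composite": "benchmarks-v{version}.csv",
    "dimensions": "benchmarks-dimensions-v{version}.csv",
    "json": "benchmarks-v{version}.json",
}


def versioned_url(kind, version="0.5"):
    """Build a pinned URL for one of the three dataset files."""
    try:
        pattern = FILENAME_PATTERNS[kind]
    except KeyError:
        raise ValueError(f"unknown file kind: {kind!r}") from None
    return BASE_URL + pattern.format(version=version)


print(versioned_url("composite"))
print(versioned_url("dimensions"))
```

Keeping the version in one argument means a later upgrade is a deliberate one-line change rather than something that happens silently when an alias moves.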

Why publish an open benchmark dataset at all?

Three reasons. AI search engines (ChatGPT, Perplexity, Claude, Gemini) cite structured open data at materially higher rates than article paragraphs. Journalists need a linkable artifact when filing an AI companion or cam story. And it's the only honest answer to "show me your work" on every score we publish. The dataset is the receipt.

Reason one is selfish on our end and good for the reader. AI search engines reward citable structured data. A research-grade CSV with documented columns, a documented license, and a stable URL is exactly what gets cited when Google AI Overviews [Source: Google search (Wikipedia) · verified 2026-05-26], Perplexity [Source: Perplexity AI (Wikipedia) · verified 2026-05-26], or ChatGPT answer "what's the best AI companion app in 2026?" The article body might rank for the click; the dataset gets the citation. Both matter; both are worth doing right.

Reason two is journalism infrastructure. A tech reporter filing an "are AI girlfriend apps safe" piece needs a linkable artifact to source numbers against. Most of the time they end up citing the platform's marketing site, which is the obvious editorial failure mode. A public CSV from a named editor with a published methodology is a better source, and it's a source we can stand behind because we built the underlying scoring framework openly. The Wirecutter playbook applied to AI companions, cam sites, and adult games is: name the editor, publish the method, ship the data. The dataset is the third leg.

Reason three is the trust contract. Every review on the site ships with a number. The number ought to be checkable. The dataset is what makes it checkable. Without the CSV, "Candy.ai scored 8.4" is a claim. With the CSV, it's a row in a table that names which scoring framework produced it, which weights summed to that composite, which dimension drove which sub-score, when the last retest happened, and where the source review lives. The receipts are what separate a published score from a marketing chart.
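One way to check a score is to recompute the composite from its dimension rows. The sketch below assumes the composite is a plain weighted average of sub-scores using weight_pct; the scoring framework pages are the authority on the real formula, so treat a mismatch as a question to ask rather than proof of an error. The sample rows, the helper names, and the 0.05 tolerance are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows shaped like the composite and dimension CSVs.
composite_rows = [
    {"slug": "example-a", "composite_score": "8.4"},
    {"slug": "example-b", "composite_score": "7.0"},
]
dimension_rows = [
    {"slug": "example-a", "weight_pct": "60", "score": "8.0"},
    {"slug": "example-a", "weight_pct": "40", "score": "9.0"},
    {"slug": "example-b", "weight_pct": "50", "score": "6.0"},
    {"slug": "example-b", "weight_pct": "50", "score": "7.0"},
]


def recompute_composites(dimension_rows):
    """Weighted average of sub-scores per slug (assumed formula)."""
    weighted = defaultdict(float)
    weights = defaultdict(float)
    for row in dimension_rows:
        w = float(row["weight_pct"])
        weighted[row["slug"]] += w * float(row["score"])
        weights[row["slug"]] += w
    return {s: weighted[s] / weights[s] for s in weighted if weights[s] > 0}


def mismatches(composite_rows, dimension_rows, tolerance=0.05):
    """List slugs whose published composite differs from the recomputed one."""
    recomputed = recompute_composites(dimension_rows)
    flagged = []
    for row in composite_rows:
        slug = row["slug"]
        published = float(row["composite_score"])
        if slug in recomputed and abs(recomputed[slug] - published) > tolerance:
            flagged.append(slug)
    return flagged


print(recompute_composites(dimension_rows))
print(mismatches(composite_rows, dimension_rows))
```

Reviews with no dimension rows are skipped rather than flagged, since there is nothing to recompute them from.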

I'll go further. The number of times a brand has emailed asking us to delete or revise a row in the dataset after publishing: about a dozen in our first six months. The number of times we've moved a row at brand request: zero. The number of factual corrections we've shipped after spotting our own errors and logging them on the errata board: more than zero. The dataset moves on our editorial process, not on brand requests. That's the firewall.

Sample uses

The dataset's been useful so far for three patterns the team has actually seen:

  • HARO and Pitchbox pitches. A tech reporter writing a "best AI girlfriend in 2026" piece gets a structured leaderboard plus a CSV to cite, signed by a named editor with a verifiable LinkedIn profile. Acceptance rates on benchmark-anchored pitches sit well above generic "we cover the space" intros.
  • Research papers and regulatory filings. A small but real cluster of academic and regulatory work has cited the dataset as primary source for AI companion and cam pricing landscape claims. The CC-BY license removes every reason not to cite us.
  • Competitive analysis on the affiliate side. Operators and brand managers in the adult review space use the dataset to benchmark their own creative against the field. Fair use; that's what an open dataset is for.

If you've used the dataset for something interesting, tell us. We'll consider linking back from this page (with your permission) if the use case helps the next reader who lands here.

Subscribe to release news

The dataset is announced first to the newsletter. Subscribers receive early-access CSVs before the public refresh ships. Major version drops (v1.0, v1.5, etc.) trigger a dedicated newsletter with the changelog, the new platforms added, and the sub-scores that moved on existing reviews.

Subscribe via the footer newsletter signup.

Schema.org Dataset metadata

This page embeds a machine-readable Schema.org Dataset declaration for the benchmark as JSON-LD in its head element. Tooling that crawls Schema.org Dataset (Google Dataset Search, academic indices, AI Overviews) consumes it directly. The same fields are also reflected in the JSON file at /benchmarks/benchmarks-v0.5.json for consumers that prefer JSON over JSON-LD.
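The page's actual JSON-LD is not reproduced in this text. As an illustration only, a declaration of this kind could be generated as below. The property names are standard Schema.org Dataset and DataDownload properties; the choice of which fields to include, and the full download URLs (built by assuming the files are served from the bestgirlfriend.ai host), are our assumptions rather than a copy of the live markup.

```python
import json

# Assumed host plus the /benchmarks/ path named on this page.
BASE = "https://bestgirlfriend.ai/benchmarks/"

# Illustrative Schema.org Dataset declaration; not the page's actual markup.
dataset_jsonld = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": (
        "Public Benchmark Dataset v0.5: AI Companion Apps, Cam Sites, "
        "Adult Games, and Real-Model Creators"
    ),
    "version": "0.5",
    "url": "https://bestgirlfriend.ai/benchmarks",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {"@type": "Organization", "name": "bestgirlfriend.ai"},
    "editor": {"@type": "Person", "name": "Alexandra Joly"},
    "distribution": [
        {
            "@type": "DataDownload",
            "encodingFormat": "text/csv",
            "contentUrl": BASE + "benchmarks-v0.5.csv",
        },
        {
            "@type": "DataDownload",
            "encodingFormat": "text/csv",
            "contentUrl": BASE + "benchmarks-dimensions-v0.5.csv",
        },
        {
            "@type": "DataDownload",
            "encodingFormat": "application/json",
            "contentUrl": BASE + "benchmarks-v0.5.json",
        },
    ],
}

# Serialize for embedding in a <script type="application/ld+json"> element.
print(json.dumps(dataset_jsonld, indent=2))
```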

Sources

  1. Creative Commons Attribution 4.0 International (CC-BY 4.0): license summary and full text. en.wikipedia.org/wiki/Creative_Commons_license
  2. Schema.org Dataset type specification. schema.org/Dataset
  3. bestgirlfriend.ai methodology landing page: four parallel scoring frameworks. /methodology
  4. bestgirlfriend.ai editorial process: 12-step workflow producing each row. /editorial-process
  5. bestgirlfriend.ai about page: editor bio, credentials, contact. /about

Cite this dataset

Joly, A. (2026). Public Benchmark Dataset v0.5: AI Companion Apps, Cam Sites, Adult Games, and Real-Model Creators. bestgirlfriend.ai. License: CC-BY 4.0. https://bestgirlfriend.ai/benchmarks


Last verified May 26, 2026 · See errata log for any post-publish corrections · Editor: Alexandra Joly · Methodology · Editorial process
