AI Poop-Analysis App Database Offered for Sale

404 Media reports that a Reddit post on r/DHExchange by user Ill_Car_7351 offered "150k+ labeled and classified images of 💩 from roughly 25K different people." The images reportedly originated from an app called PoopCheck, operated by a company identified as Soft All Things; 404 Media cites the app's claim that "Our AI analyzes your poop using the Bristol Stool Scale and advanced pattern recognition." The app featured a public community with about 151,317 "shared stools," per 404 Media. The article's author contacted the Reddit poster and documented both the attempts to obtain access to the dataset and the reactions from forum commenters. The reporting highlights an offer to sell a highly sensitive, user-generated image dataset collected by a health-adjacent consumer app.
What happened
404 Media reports that a Reddit post on r/DHExchange by user Ill_Car_7351 advertised "150k+ labeled and classified images of 💩 from roughly 25K different people," quoting the post verbatim. Per 404 Media, the dataset allegedly came from an app called PoopCheck, produced by a company referred to as Soft All Things. The article quotes the app's advertising: "Our AI analyzes your poop using the Bristol Stool Scale and advanced pattern recognition." 404 Media also reports that the app displayed about 151,317 "shared stools" in its community and included a leaderboard feature.
Editorial analysis - technical context
Companies and researchers building image-based medical or health datasets should treat images of bodily outputs as high-risk, sensitive data. Industry-pattern observations: training ML models on such user-submitted images typically requires rigorous deidentification and informed consent, plus IRB review when the images are used for medical research, even if no faces or names are present.
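To make the consent and deidentification point concrete, here is a minimal Python sketch of a pre-training filter. It is illustrative only and does not describe how PoopCheck or any real app handles data: the `Submission` record, its field names, and the `prepare_for_training` helper are all hypothetical. The sketch drops submissions that lack research consent, replaces the user ID with a keyed hash, and leaves metadata out of the training record.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Submission:
    """Hypothetical record for one user-submitted image (all fields assumed)."""
    user_id: str
    image_path: str
    label: str
    research_consent: bool
    metadata: dict = field(default_factory=dict)  # e.g. EXIF timestamps, GPS


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest so raw user IDs never enter the dataset."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_training(submissions, secret_key: bytes) -> list:
    """Keep only consented submissions and copy over non-identifying fields."""
    prepared = []
    for sub in submissions:
        if not sub.research_consent:
            continue  # no consent for research use: exclude entirely
        prepared.append({
            "subject": pseudonymize(sub.user_id, secret_key),
            "image_path": sub.image_path,
            "label": sub.label,
            # metadata is intentionally not copied into the training record
        })
    return prepared
```

A filter like this is only one layer. Keyed hashing is pseudonymization rather than full deidentification, and metadata embedded in the image files themselves would still need to be stripped separately.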
Industry context
The 404 Media piece frames this episode as part of a broader underground market for app-collected user data. Industry-pattern observations: marketplace demand for niche, hard-to-obtain labeled datasets can create incentives for secondary sales of data originally gathered for consumer features.
What to watch
Practitioners and data stewards should monitor the legal and ethical boundaries around reusing consumer-submitted, health-adjacent images, the platform terms that enable community sharing, and whether any entity claims ownership of such collections or offers to commercialize them. 404 Media documents the offer and forum reactions but does not cite statements from PoopCheck or Soft All Things on the dataset sale.
Scoring Rationale
This is notable for practitioners because it surfaces a rare, sensitive dataset and illustrates marketplace incentives that affect dataset provenance and compliance. The story is not a systemic industry shift but has clear ethical and legal implications for ML projects using user-submitted health-adjacent images.
