Hackathon landscape analysis

Python pipeline that scraped, structured, clustered, and scored 4496 projects from the Gemini API Developer Competition on Devpost. Findings published as a LinkedIn post.

Stack

Python · BeautifulSoup · Playwright · pandas · scikit-learn · Claude API

Artifacts

Public demo / repo

TL;DR

  • Multi-stage pipeline: scrape → structure → cluster (TF-IDF + KMeans) → score → AI narrative (Claude).
  • 4496 projects analyzed; 4292 scored (95.5% success rate).
  • Key finding: Developer Tools and Health & Wellness scored highest; Media/Creative Tools was the most crowded and least differentiated domain.
  • Findings published as a LinkedIn post; analysis also informed product positioning for a later project.

Reusable patterns

  • Multi-stage pipeline with raw artifact preservation: each stage is resumable and independently debuggable.
  • TF-IDF + KMeans as fast baseline for domain clustering on short text (project descriptions).
  • Custom scoring rubric (innovation, impact, scalability) as a proxy for "interesting", used to triage projects before reading their full write-ups.
  • State tracking (meta.json) for resumable scraping: restart a long run without re-fetching already-scraped pages.
  • Claude API for narrative deepening: LLM adds qualitative insight on top of quantitative cluster outputs.
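
The state-tracking pattern above can be sketched as follows. This is a minimal illustration, not the project's actual code: only the file name meta.json comes from the write-up, while the {"done": [...]} layout, the function names, and the page file naming are assumptions made for the example.

```python
import json
from pathlib import Path


def load_state(meta_path: Path) -> dict:
    """Return the saved scrape state, or a fresh one if no file exists yet."""
    if meta_path.exists():
        return json.loads(meta_path.read_text(encoding="utf-8"))
    return {"done": []}  # assumed layout: a list of already-scraped URLs


def save_state(meta_path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash never leaves a half-written file."""
    tmp = meta_path.with_name(meta_path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(meta_path)


def scrape_all(urls, fetch, out_dir: Path) -> int:
    """Fetch every URL not yet recorded in meta.json; return how many were fetched."""
    out_dir.mkdir(parents=True, exist_ok=True)
    meta_path = out_dir / "meta.json"
    state = load_state(meta_path)
    done = set(state["done"])
    fetched = 0
    for i, url in enumerate(urls):
        if url in done:
            continue  # scraped in an earlier run, skip the network call
        html = fetch(url)
        # Hypothetical naming scheme; it is stable only if the URL order is stable.
        (out_dir / f"page_{i:05d}.html").write_text(html, encoding="utf-8")
        done.add(url)
        state["done"] = sorted(done)
        save_state(meta_path, state)  # checkpoint after every page
        fetched += 1
    return fetched
```

Calling scrape_all a second time with the same URL list fetches nothing, which is the property that makes a long run safe to interrupt and restart.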

Context

The Gemini API Developer Competition on Devpost received 4496 project submissions — too many to read manually.

Goal: understand the landscape systematically — dominant domains, under-served areas, technology signals, and award candidates.

Decisions

  • Multi-stage pipeline (fetch → structure → analyze → deepen): each stage writes its outputs to disk before the next one starts, so failures are cheap and runs are resumable.
  • Retry + jitter in the scraper (6 attempts, 0.7s ± 0.15s delays): steady throughput without triggering rate limits.
  • TF-IDF + KMeans for unsupervised clustering: fast, interpretable, no labeled data needed.
  • Custom scoring rubric: weighted composite of innovation, impact, and scalability signals extracted from project text.
  • Claude API for Phase 2 deepening: the quantitative pipeline surfaces structure; Claude adds narrative and qualitative context.
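
The retry-with-jitter decision can be illustrated with a short sketch. The 6 attempts and the 0.7s ± 0.15s delay come from the list above; everything else is an assumption, including the function names, the injected fetch callable, and the choice to apply the jittered delay between retries (the write-up does not say whether it applies between retries, between requests, or both).

```python
import random
import time


def jittered_delay(base: float = 0.7, jitter: float = 0.15) -> float:
    """Pick a delay uniformly from [base - jitter, base + jitter] seconds."""
    return random.uniform(base - jitter, base + jitter)


def fetch_with_retry(fetch, url, attempts=6, base=0.7, jitter=0.15, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, sleeping a jittered delay between tries.

    `fetch` is any callable that returns the page or raises on failure.
    `sleep` is injectable so the function can be exercised without real waiting.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except Exception as exc:  # a real scraper would catch narrower error types
            last_error = exc
            if attempt < attempts:
                sleep(jittered_delay(base, jitter))
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error
```

Randomizing the delay keeps requests from arriving at a fixed cadence, which is the usual reason to add jitter on top of a constant pause.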

Architecture

  • fetch: scrapes gallery index pages + project detail pages with retry, jitter, and raw HTML preservation.
  • structure: normalizes the scraped JSONL into a master table and extracts tech tags, links (repo/demo/video), and team members.
  • analyze: TF-IDF + KMeans (18 clusters), domain rule classification, composite scoring, award candidate nomination.
  • deepen (Phase 2): Claude API generates narrative summaries on top of quantitative clusters.
  • All intermediate outputs written to disk (dump/, structured/, analysis/) — pipeline is fully resumable at any stage.

Outcome

  • 4292 of 4496 projects scored (95.5% success rate).
  • Key finding: Developer Tools (avg score 55.9) and Health & Wellness (55.6) were the highest-quality domains; Media/Creative Tools (48.5% of all submissions, avg 46.6) was the most crowded and least differentiated.
  • 31.98% of projects self-declare as agentic and 32.25% are multimodal; both were identified as emerging capability clusters worth tracking.
  • Findings published as a LinkedIn post; analysis also used for competitive positioning in a later product.
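
The domain averages above are built on a per-project composite score. As a rough sketch of what a weighted composite of the three rubric signals could look like, assuming each signal is already a number on a 0-100 scale: the weights below are hypothetical, since the write-up names the signals but not their weighting. The second helper reproduces the reported coverage figure from the two counts given above.

```python
# Hypothetical weights: the source names the three signals but not how they are weighted.
WEIGHTS = {"innovation": 0.4, "impact": 0.35, "scalability": 0.25}


def composite_score(signals: dict) -> float:
    """Weighted mean of the rubric signals, each assumed to lie on a 0-100 scale."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS) / total_weight


def coverage_pct(scored: int, total: int) -> float:
    """Share of projects that received a score, as a percentage rounded to one decimal."""
    return round(100 * scored / total, 1)


print(coverage_pct(4292, 4496))  # 95.5
```

Dividing by the sum of the weights keeps the composite on the same 0-100 scale as its inputs even if the weights are later changed and no longer sum to 1.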

Slides

Only a fraction behaves as a real agent. 31.98% self-declare as agentic; only 3.56% pass a strict verification.
