$ ai_lab

Talk to this portfolio. It talks back.

A real LLM (Groq Llama 3.3 70B) sits in the corner with a portfolio-pinned system prompt. Streams answers as it generates — no spinner-then-paragraph. A scripted matcher handles the common questions for free; the LLM picks up everything else. Every ask is captured into the live observatory below.

Rate-limited; no chat history is stored, only aggregate counts.

$ request_lifecycle

Click → token → bubble — in one diagram

  1. Visitor: click → question
  2. Nuxt frontend: Vue chat panel
  3. Fastify gateway: budget check + system prompt
  4. Groq Llama 3.3 70B: token stream
  5. SSE stream: tokens render as they arrive
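The last hop of the lifecycle is plain server-sent events. A minimal sketch of the token framing, assuming one SSE event per model token (`sseFrame` is a hypothetical helper, not the gateway's actual code):

```typescript
// Frame one model token as one SSE event so the browser can render it
// the moment it arrives. Per the SSE wire format, each event is a
// "data: <payload>" line terminated by a blank line.
function sseFrame(token: string): string {
  return `data: ${JSON.stringify({ token })}\n\n`;
}

// Gateway side (assumed Fastify handler shape): relay the upstream
// Groq stream one frame at a time, e.g.
//   for await (const token of groqStream) reply.raw.write(sseFrame(token));
```

On the client, an `EventSource` (or a `fetch` reader) appends each decoded token to the open chat bubble, which is what makes the answer appear word by word instead of all at once.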

$ try_it

Six prompts that show what it can do

Click any prompt below — the chat opens and the answer streams in. The first three route to the LLM (evaluative / personality); the rest hit the scripted matcher (deterministic, instant).
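The scripted-vs-LLM split can be sketched as a tiny router. This is an illustrative shape only; the patterns and canned answers below are made up, not the portfolio's real table:

```typescript
// Scripted-first routing: a keyword hit returns a canned answer
// instantly at zero LLM cost; anything else falls through to the model.
type Route = { kind: "scripted"; answer: string } | { kind: "llm" };

// Hypothetical entries for illustration.
const scripted: Array<[RegExp, string]> = [
  [/contact|email|reach/i, "You can reach him via the contact page."],
  [/stack|tech/i, "Nuxt + Fastify + Groq Llama 3.3 70B."],
];

function route(question: string): Route {
  for (const [pattern, answer] of scripted) {
    if (pattern.test(question)) return { kind: "scripted", answer };
  }
  return { kind: "llm" }; // free text goes to Groq
}
```

Deterministic matching is why the scripted prompts answer instantly: no network round trip, no token budget spent.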

$ chatbot_observatory

[live widget: aggregate ask counts load here]

$ how_it_works

What's actually wired up behind the chat

The bot looks like a friendly chip-picker. Under it: a pinned system prompt, a three-layer abuse budget, and a scripted-first fallback chain so the free Groq tier never goes dark mid-day.

System prompt — stacked sections

Built from typed PortfolioContext fields, not a wall of free-text. Each layer is one assertable section, so the bot can't drift on facts and tests can pin each field.

  1. 01 Identity: name, headline, current role, contact path.
  2. 02 Pitch + signature sentence: the one-liner the bot leads with on intro.
  3. 03 Strengths + signature work: three concrete projects with taglines and routes.
  4. 04 Voice: descriptors, signature sentence, banned hype words.
  5. 05 Journey + work style: motivation, career arc, how tasks are approached.
  6. 06 Interview stories: STAR-condensed examples, used verbatim on behavioural Qs.
  7. 07 Role fit: what he's interested in, ideal env, non-fit signals.
  8. 08 Tone + content rules: hard caps on length, banned phrases, fallback to chips.

8 typed sections compose every prompt sent to the model.
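A sketch of what "typed sections, not a wall of free-text" can look like. The field names and section bodies here are assumptions for illustration; only `PortfolioContext` is named in the source:

```typescript
// Each prompt section is built from a typed field, so a test can pin
// one field and the composed prompt can't silently drift on facts.
interface PortfolioContext {
  identity: { name: string; headline: string };
  pitch: string;
  voice: { banned: string[] };
  // ...remaining sections elided
}

function composePrompt(ctx: PortfolioContext): string {
  return [
    `## 01 Identity\n${ctx.identity.name} - ${ctx.identity.headline}`,
    `## 02 Pitch\n${ctx.pitch}`,
    `## 04 Voice\nNever use: ${ctx.voice.banned.join(", ")}`,
  ].join("\n\n");
}
```

Because composition is a pure function of the context object, asserting on any single section is a one-line test, which is the point of stacking typed sections instead of editing one free-text blob.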

Abuse budget — narrowing funnel

One curious visitor can drain the free Groq daily cap in seconds. Four gates squeeze the request before it reaches the model — keyed on a daily-rotated SHA of the IP, so the raw address is never persisted.
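The privacy half of the funnel, the daily-rotated SHA of the IP, can be sketched in a few lines. This is an assumed scheme consistent with the description, not the gateway's actual code:

```typescript
import { createHash } from "node:crypto";

// Rate-limit key: SHA-256 of the IP mixed with a salt that rotates
// daily. The raw address is never persisted, and keys from different
// days cannot be correlated with each other.
function rateKey(ip: string, now: Date = new Date()): string {
  const daySalt = now.toISOString().slice(0, 10); // e.g. "2024-05-01"
  return createHash("sha256").update(`${daySalt}:${ip}`).digest("hex");
}
```

Each gate in the funnel then counts requests against this key; when a counter crosses its cap, the request is answered from the scripted layer instead of reaching the model.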

Fallback chain — flowchart

Scripted-first means common asks return instantly with zero LLM cost. Free-text falls through to Groq. If the model is over budget or unreachable, the visitor sees a calm chip re-prompt, never a raw 503/429.
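The chain above reduces to three outcomes: scripted hit, model answer, or chip re-prompt. A minimal sketch, with `matchScripted` and `askModel` as hypothetical injected dependencies:

```typescript
// Fallback chain: scripted first (free, instant), then the model, and
// on budget or outage errors a calm chip re-prompt rather than a raw
// 503/429 surfacing to the visitor.
type Reply =
  | { kind: "scripted"; text: string }
  | { kind: "llm"; text: string }
  | { kind: "chips"; text: string };

async function answer(
  question: string,
  matchScripted: (q: string) => string | null,
  askModel: (q: string) => Promise<string>,
): Promise<Reply> {
  const hit = matchScripted(question);
  if (hit !== null) return { kind: "scripted", text: hit }; // zero LLM cost
  try {
    return { kind: "llm", text: await askModel(question) };
  } catch {
    // Over budget or unreachable: swallow the error, offer chips.
    return { kind: "chips", text: "Try one of these prompts instead:" };
  }
}
```

Catching at this single choke point is what guarantees the visitor never sees a bare HTTP error, whatever the free tier is doing that day.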