MLOps.community

Production AI / MLOps interviews, with a focus on agents, evals, deployment, latency, and engineering practice

13 episodes generated · 13 episodes listed

Generated

80 min

Sandboxing, Agent Harnesses, and Agent Teamwork

Shahram Anver · Cleric co-founder and CEO

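Shahram Anver uses Cleric's evolution as a through-line to explain why stronger models need sturdier sandboxes, and why the durable moat for a production-grade AI SRE is shifting down from the investigation loop into environment, verification, and learning. The conversation goes on to discuss how humans move toward exception approval and judgments of taste, and how coding, SRE, and security agents can form a governed, collaborative team.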

#coding-agents #ai-products #interview #ceo

MLOps.community

~70 min

Context Engineering for Coding Agents

Faus · Applied AI researcher / former restaurateur

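A live talk from MLOps community Amsterdam. The speaker frames "context engineering" as an almost philosophical proposition: with a new model arriving every week, the only thing you control is the context window, so what you put into it matters more than the model itself. The talk moves from a 25% rule of thumb and a three-way split of context (deterministic / probabilistic / human), through an analogy to brain structure, to extending Karpathy's proposed "markdown as memory" into a wiki with decay and importance scoring. It closes with a comparison demo run against a hard 5-minute timer.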

#coding-agents #productivity #ai-products #monologue

MLOps.community

≈ 50 min

MCP, Agents & the $40M Bet on Multiplayer AI

Dust co-founder · ex-Stripe (joined through an early acquisition, saw headcount grow from 150 to 3,000), ex-OpenAI

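A Dust co-founder discusses the product boundary between single-player and multiplayer AI: why Claude Code / Codex are still fundamentally single-player (task horizons are still about half a day, and model capability is jagged), and Dust's answer, the Pod: a unit of work that packages multiple agents, multiple humans, and a GCS-FUSE shared filesystem together. Also covered: the flocking-algorithm mechanism from Stripe's early days, the alignment problem under the "fog of AI", the endgame of tokenomics (commoditization) versus the transition period (a demand squeeze keeping prices high), why flat pricing is dead, and stateful sandbox + SQLite as a possible replacement for SaaS.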

#ai-products #coding-agents #productivity #startup

MLOps.community

≈ 57 min

AI Is Fast. AI Projects Are Slow. Let's Fix That.

Rocket Ride co-founders · open-source framework released 2026-03-04 and donated to the Linux Foundation (AIF); the team previously worked on data discovery & preparation and handled RAG at the scale of 1.5 billion files

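Rocket Ride's two co-founders discuss the new work that appears once "coding is no longer the bottleneck": intentionality, tool discovery, and quality become the new jobs. They cover why Claude cuts corners on incremental engineering and tends to produce "spaghetti code that's not worth shipping"; how Rocket Ride breaks an AI application into nodes plus five lanes, so that Claude can assemble a 45-node pipeline from one English sentence; and how a 12+9 hour Crew AI synchronization nightmare (dog/cat/elephant) revealed that "plumbing is the real pit". They close with the cost observability story behind a $25k AWS bill and model-server consolidation (one "big-ass GPU" serving 100 customers).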

#coding-agents #ai-products #productivity #startup

MLOps.community

56 min

Architecting Modern AI Systems: Platforms, Agents, and Integration | MLOps Community Panel

Frederique · Shao · Allan · Applied research (MIA) · Enterprise AI (Bell Canada) · Sovereign cloud (Buzz HPC)

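Three practitioners from different vantage points (Frederique, an applied researcher at MIA; Shao, enterprise AI lead at Bell Canada; and Allan of Buzz HPC, a Canadian sovereign GPU cloud) hold a roundtable built around a mental-health AI hackathon. Topics include what a platform should provide; how the underlying model can be swapped out beneath "Pokemon-named" tools such as Codex, Claude Code, and Cursor; the tokenomics awakening that moves teams from the API-key honeymoon to self-hosted open-source models; GPU hardware inflation (Michael Burry was wrong: the A100 is still rising in price after seven years); the VRAM trade-off between Blackwell and Ampere; the valley of death between 80% and 95%; LLM-as-judge as buck-passing; and RL gyms, verifiable rewards, and Lean cracking Erdős. The panel closes on why constrained generation is a killer feature that closed-source APIs will never expose, why Deep Research can far outperform Claude Code, and the real governance problem behind "the agent paid Jira 10,000 times".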

#ai-products #coding-agents #hardware #chips

MLOps.community

≈ 32 min

Building AI Agents That Survive in Production

Haytham Abuelfutuh · CTO of Union AI; co-author of Flyte

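Haytham Abuelfutuh argues that the hard part of production agents lies not only in prompts and tools but also in the operational substrate around long-running, error-prone sessions. The talk frames production readiness along three dimensions (dynamic execution, durable recovery, and defensive sandboxing) and uses a human travel agency as an example of why interruption handling, state retention, and recovery matter more than a clean demo loop.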

#ai-agents #production #infrastructure #mlops

MLOps.community

Fixing GPU Starvation in Large-Scale Distributed Training

Kashish · Uber · ML Infra · Marketplace Matching Lead

Kashish (Uber ML infra, ex-Google YouTube Ads) walks Demetrios through a Sherlock-Holmes-grade Petastorm bug: a GPU cluster stuck at 15-20% utilization, six debugging steps, two layers of bottleneck, and finally a "double bottleneck" reveal in which the PyArrow→NumPy translation was silently eating the headroom. Also covered: serving's latency-vs-utilization war, the reproducibility cost of parallelism, and a live diagnosis of a friend's slow DGX Spark.

#ai-products #hardware #chips #coding-agents

MLOps.community

≈ 50 min

Getting Humans Out of the Way: How to Work with Teams of Agents

Rob · Creator of Brumi (open-source multi-agent IDE)

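Rob is the author of Brumi, an open-source multi-agent IDE. In this episode he lays out the whole craft of taking humans out of the loop: having agents prove their work with screenshots (a feature walkthrough doc), an explosion of custom lint rules, plan.md replacing plan mode, and running 5 agents in parallel and picking the winner. The core metaphor fits in one sentence: teach agents how to report upward.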

#coding-agents #productivity #ai-products #founder

MLOps.community

≈ 54 min

The Modern Software Engineer

Mihail Eric · ML / AI infrastructure practitioner & instructor

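Mihail Eric and Demetrios, in an SF recording studio, go one by one through the real engineering problems of AI coding agents: the training pipeline for juniors that Cursor cuts short, the validation harness emphasized by Eno @ Factory, token-based billing eventually giving way to task-based billing, the Twitter myth of "15 tiled Claude Code instances" running in parallel, why teams should get smaller and PMs should be able to open PRs, and articulation as the next superpower. There are no frameworks here, only day-to-day judgment, and the last line is "just breathe".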

#coding-agents #ai-products #productivity #engineer

MLOps.community

~52 min

A New Kind of Marketplace

Pedro Chaves · OLX (motors / real estate / classifieds) | Donné Stevenson · Prosus AI

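A live MLOps.community roundtable in Lisbon: Pedro Chaves from the OLX product side and Donné Stevenson of the Prosus AI team discuss two projects that actually shipped (a "lifestyle agent" for real estate, and a chat + shortcut assistant for car dealers), then extend to Pedro's big vision: building a "harness" that does not yet exist for agent-to-agent transactions. The quick-take segment is unexpectedly substantive: trust ladder, Ctrl+Z for agents, GEO replacing SEO, and a line that lifts the whole session to a philosophical level: "It's not a simulation. It is a recommendation."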

#ai-products #startup #product-management #design

MLOps.community

Why Agents are Driving Software Development to the Cloud

Zach Lloyd · Warp Founder & CEO · ex-Google Docs/Sheets

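Warp founder Zach Lloyd explains on MLOps.community why 2026 is the year agents move house: from the laptop to the cloud, and from a solo sport to a team sport. Oz is Warp's orchestration platform; an agent is not a cloud computer but a coworker in the cloud; and meta-apps are converging the entry points to SaaS into a single "browser that does things".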

#coding-agents #ai-products #productivity #ceo

MLOps.community

≈ 67 min

The Creator of Superpowers: Why Real Agentic Engineering Beats Vibe Coding

Jesse Vincent · Creator of Superpowers (110k stars Claude Code skill kit)

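Jesse Vincent (former Perl project lead, original author of K-9 Mail, a 25-year veteran) lays out the methods he has used over the past nine months to tame Claude Code. Superpowers, with 110k stars, is not vibe coding but an agentic engineering methodology: an orchestrator architecture, single-mission subagents, and a skill system. The episode also covers how the incident in which Claude deleted tests was fixed with a one-line prompt, why swarms are "Facebook in 2002", and the prediction that by 2028 GitHub may store only specs, not code.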

#coding-agents #productivity #ai-products #engineer

MLOps.community

It's 2026, and We're Still Talking Evals

Maggie Konstanty · ML Engineer · LLM Agent Evaluation Lead

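Maggie Konstanty talks on MLOps.community about the real battleground of LLM agent evaluation: why teams always ship first and add evals later, why pre-prod and production are "two different animals", and why every vendor tool ultimately led her to build her own. The most counterintuitive takeaway of the interview: evals themselves are not hard; what is hard is getting the team aligned on "what good means".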

#ai-products #coding-agents #engineer #interview
