May 6

Everyone wants to build an AI chief of staff. Here's my honest take on the pros and cons of OpenClaw, Hermes, Claude Code, Codex, and Gemini

26 Comments

Tommy Walker

May 27

I've actually built this and have added a lot more to it.

It listens to my Fathom transcripts, schedules follow-ups, replies to emails I don't need to answer myself (though I approve everything in Slack), creates monthly reporting decks that curate all of my business-critical metrics, creates briefs and drafts, pushes to the website, creates training pairs for editorial feedback, tracks URLs in my SEO tools and in Google Analytics, updates my workflow tools, updates statuses, keeps a daily log of activities...

That only scratches the surface.

Alpha Metrics

May 10

I’d probably frame one part slightly differently: I’m not sure “AI chief of staff” is the end state people actually want.

Most users don’t want a highly autonomous operator making decisions. They want a low-friction cognitive extension that reduces coordination and context-switching without feeling invasive or unpredictable.

That’s why reliability matters so much more than demos right now. The trust threshold for delegation is dramatically higher than the trust threshold for chat.

Parth Tiwary

Jun 19 (Edited)

the personal agent problem is mostly unsolved at the context layer, not the model layer. a model can reason and act. what it can’t do reliably is know enough about you to take the right action. that’s the hard part.

Abhishek

Jun 4

The race has been on in this space ever since Claude Code and OpenClaw launched. I disagree with the “it’s Google’s race to lose” framing. Google has a reputation that others don’t quite live up to and, frankly, can’t. I think they’re already learning from all these use cases and waiting for the opportune moment to drop a big release that will shift things heavily in Google’s favour.

Elysha

May 31

I’ve been thinking of trying Hermes but the potential bot upvoting on X and Reddit kept me skeptical. You’ve inspired me to try it! Have you migrated from OC to Hermes completely or running in parallel?

Reply (1)

Peter Yang

May 31

I've migrated to Hermes completely. Honestly I was a little skeptical too (the profile is an anime girl after all) but it just works more reliably than OpenClaw right now. I'm actually working on an opinionated Hermes guide here that'll hopefully go live by mid-June: https://www.behindthecraft.com/

Dorian

May 25

The personal AI agent race will not be won by whoever builds the best chatbot with more tools.

It will be won by whoever solves the operating layer:

memory,

permissions,

workflow state,

context routing,

verification,

rollback,

and user trust.

Most “AI chief of staff” demos look impressive because they operate in clean toy environments.

Real work is messy.

Emails contradict calendars.

Docs have hidden context.

Tasks depend on judgment.

People change priorities.

Systems fail silently.

And one bad autonomous action can destroy trust instantly.

The real product is not an agent.

It is a personal operating system where AI can act without becoming dangerous, annoying, or wrong at scale.

Nobody has won yet because autonomy is easy to demo and hard to trust.
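To make the "operating layer" idea in the comment above a little more concrete, here is a minimal sketch of three of the listed pieces (permissions, verification via an approval step, and rollback). Every name in it (`Action`, `Operator`, `approve`) is hypothetical and chosen for illustration; it does not describe how OpenClaw, Hermes, or any other product mentioned in this thread works.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """A hypothetical unit of agent work with an explicit undo step."""
    name: str
    run: Callable[[], None]
    undo: Callable[[], None]
    risky: bool = False  # assumption: risky actions need human approval


@dataclass
class Operator:
    """Runs actions, gates risky ones on approval, and can roll back."""
    approve: Callable[[Action], bool]
    log: List[str] = field(default_factory=list)
    done: List[Action] = field(default_factory=list)

    def execute(self, action: Action) -> bool:
        # Permission check: risky actions only run if the human approves.
        if action.risky and not self.approve(action):
            self.log.append(f"blocked:{action.name}")
            return False
        action.run()
        self.done.append(action)
        self.log.append(f"ran:{action.name}")
        return True

    def rollback(self) -> None:
        # Undo completed actions in reverse order.
        while self.done:
            action = self.done.pop()
            action.undo()
            self.log.append(f"undid:{action.name}")
```

The point of the sketch is the shape rather than the details: each action carries its own undo, anything marked risky waits for a human, and the log records what was run, blocked, or undone so that nothing fails silently.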

Mihail Musat

May 16

Could not agree more with this perspective. This is the current move all big AI players are pivoting towards.

A few comments from the field:

1. Gemini is actually great with emails, calendar, chats etc. Of course, it is limited to Google Workspace. At this point their integration of Gemini within Workspace plus the addition of Gems makes it a strong competitor in this race.

2. Claude Cowork. The first product that, by design, works on the 10 points you chose. A fork of Claude Code, designed especially for business.

3. Copilot Cowork from Microsoft. Microsoft, within their Copilot stack (Copilot M365, Copilot Studio and Agents365), is the closest to checking all those boxes, even for enterprise customers and highly regulated industries.

Daniel Joachim

May 11

Peter, I’m pretty stoked to say that my “Claude-Claw” setup (entirely Claude Code) is ticking each of your 10 boxes.

Reliable as in, it hasn’t ghosted me or frozen since set-up.

Voice / text replies via ElevenLabs / Telegram: any message below 200 characters is a voice note.
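The length rule described above is simple enough to sketch. This is a minimal illustration only: the function name and constant are hypothetical, and the actual ElevenLabs and Telegram calls are deliberately left out since the comment doesn't say how they are wired up.

```python
# From the comment: replies below 200 characters go out as voice notes.
VOICE_THRESHOLD_CHARS = 200


def choose_reply_mode(message: str, threshold: int = VOICE_THRESHOLD_CHARS) -> str:
    """Return "voice" for short replies and "text" for longer ones."""
    return "voice" if len(message) < threshold else "text"
```

A caller would check `choose_reply_mode(reply)` and then hand the reply to whichever speech or messaging step the setup uses.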

Reply (1)

Daniel Joachim

May 11

His name’s Watney. Based on Mark Watney’s personality from The Martian: a fixer with impeccable optimism and humor.

Think AI

May 10

Great framing. Personal AI agents will only win when they understand context, trust, and daily workflows better than a normal chatbot.

David Almstrom

May 9

Are you also using Hermes to orchestrate software development in Claude Code and potentially other setups like Qwen (local compute)?

Ileana

May 9

Such a good overview! Thanks for sharing this 💫

Brandon Titus

May 9

It seems to me that prompt injection is still a huge blocker for productizing these tools for everyday use. Do you think that will be solved or mitigated any time soon?

Shawn Rouse

May 8

https://substack.com/@csarticles/note/p-196851209?r=8co1m9

Michelle Lawson

May 6

Therapy through voice replies is interesting… why that over text replies? (given your voice input)

Reply (1)

Peter Yang

May 6

Voice just sounds more personal and is easier to use while out on a walk

Reply (1)

Michelle Lawson

May 6

Have you defined a personality or soul.md for your Hermes agent?

Reply (1)

Peter Yang

May 6

No, I just copied my OpenClaw soul.md over. It seems to work the same.

Pedro Nogueira

May 6

Hey Peter! Great post, thanks for sharing. I always love to hear your thoughts on this. I have been following you since the first OpenClaw outburst and really appreciate this comparative rundown.

Quite curious to hear your thoughts about Notion AI. I have been using it for a couple of months now, alongside Claude, and I really think it is the underappreciated racer in this race. Have you had the opportunity to test it yet?

Cheers from Brazil!

Reply (1)

Peter Yang

May 6

I have something good cooking with Notion, will share more soon!

ALREADY ANXIOUS 🫠

May 6 (Edited)

I’d say Hermes has won

The Race to Build a Personal AI Agent (And…