The goal is not to replace experts with AI. I will say that until I am tired, and then I will say it again.
The goal is to make experts more effective — faster orientation, better decisions, more capacity for the work only humans should own. Anything else is expensive noise with a GPU bill.
The collaboration failure mode
Most “AI assistance” I see in the wild is a firehose wearing a smile. More tokens. More context. More summaries of summaries. Experts do not need more text. They need better structure at decision time.
Our comparative studies keep showing the same pattern: unconstrained AI output can degrade expert performance. Not because experts are fragile — because expertise depends on curation. Flood the channel and you destroy the scan path.
Augmentation requires editing. Timing. Priority. Humility about what should be shown first.
Design principles I actually use
Mirror expert foraging, do not fight it. Present hierarchy before detail. Surface provenance where the eye naturally checks trust — titles, sources, relationships, confidence, not footnote graveyards.
Preserve agency. The best loops are suggest → expert decides → system learns. Copilots that auto-act on high-stakes workflows without explicit human commitment are not brave. They are liability cosplay.
Make reasoning scannable. Experts trust outputs when they can verify the reasoning chain quickly — not read a novel, verify a path.
Instrument accountability. Human-AI decisions need receipts: who approved, what model, what retrieval, what policy version. Collaboration without accountability is theater.
Training humans, not just models
We treat collaboration as a skill — because it is. Engineers, clinicians, operators, and analysts all need practice negotiating with probabilistic tools: when to accept, when to challenge, when to escalate, when to ignore.
Longitudinal work here is early but important. If juniors learn AI dependence before they learn domain perception, we may gain speed and lose judgment formation. That trade deserves honest study, not marketing.
Connection to production work
This research is not academic wallpaper for me. It shows up in agent UI choices, retrieval ordering, governance hooks, and training programs I have led in enterprise settings. The test is always the same: did the human leave the interaction more capable, or more dependent?
Open question
What if we measured AI success not by automation rate, but by expert amplification rate — decisions improved, recovery time shortened, learning accelerated?
I think we would build different systems. Maybe better ones.
Until then, I am building for collaboration that respects the human in the loop — not as a compliance checkbox, but as the point of the whole exercise.