Adaptive Music Middleware
By StoneKey
Music that thinks human.
Feltwork: adaptive game music middleware trained on human improvisational intelligence. Built for the compound emotional states no current system can handle.
Current adaptive music systems tag a scene as tense or calm, trigger a pre-composed segment, and loop it. The emotional logic is algorithmic. It sounds like it.
What these systems cannot handle are compound emotional states. Grief alongside beauty. Loneliness shifting into connection. Fragile hope coexisting with unresolved fear. These are the states that make a scene unforgettable. They require a human being who has felt them.
"An AI trained on finished recordings learns what music sounds like. It does not learn how a musician decides what comes next when the emotional ground shifts."
Each training session captures what no public dataset has ever contained. A musician improvises. At the same moment, every emotional decision is narrated in real time — not reconstructed afterward. The result is a record of musical intelligence at the moment it happens.
Live piano recorded simultaneously as audio and MIDI. Not just what notes were played but how they were felt — velocity, timing, dynamics, the space between registers.
The musician narrates every emotional transition as it happens. Labels are time-aligned to the musical event, not reconstructed afterward. The decision is captured at the moment it is made.
A structured labelling protocol developed specifically for this problem. The methodology is what makes this dataset different from any other narrated performance recording.
Sessions are labelled by independent annotators without access to the original labels. Agreement rate between annotators is the primary measure of schema stability.
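For illustration only, here is a minimal sketch of what a time-aligned annotation record and a simple agreement check could look like. The field names, the two-second matching window, and the overlap-based agreement measure are assumptions made for this example, not the actual Feltwork schema or its agreement statistic.

```python
from dataclasses import dataclass

# Illustrative only: the real Feltwork schema is not reproduced here.
# Each record pairs a musical event (from the audio/MIDI stream) with the
# emotional label narrated at that moment.
@dataclass
class AnnotatedEvent:
    t: float                # seconds from session start
    labels: frozenset[str]  # emotional state(s) active at t; more than one = compound state
    transition: bool        # True if the narration marks a shift at this event

def agreement_rate(a: list[AnnotatedEvent], b: list[AnnotatedEvent],
                   window: float = 2.0) -> float:
    """Fraction of annotator A's events that annotator B matches within `window`
    seconds with at least one shared label. A crude proxy for inter-annotator
    agreement; a kappa-style statistic could replace it."""
    if not a:
        return 0.0
    matched = sum(
        1 for ev in a
        if any(abs(ev.t - other.t) <= window and ev.labels & other.labels for other in b)
    )
    return matched / len(a)

# Toy example: two annotators labelling the same passage independently.
annotator_a = [
    AnnotatedEvent(12.4, frozenset({"grief", "beauty"}), True),  # compound state
    AnnotatedEvent(47.9, frozenset({"fragile hope"}), True),
]
annotator_b = [
    AnnotatedEvent(13.1, frozenset({"grief"}), True),
    AnnotatedEvent(48.3, frozenset({"fragile hope", "fear"}), True),
]
print(f"agreement: {agreement_rate(annotator_a, annotator_b):.2f}")  # 1.00 on this toy data
```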
What you are about to watch is not a demonstration of Feltwork. It is proof of the human capability Feltwork is designed to capture.
Nicholas Clarke improvises on grand piano. Stevie Clarke narrates a scene in real time. Neither has foreknowledge of where the other will go. The emotional transitions you hear are real, unscripted, and unrehearsed. This is the practice the annotation schema is built to encode — the ability to navigate emotional territory in real time, at the moment of decision, without a script.
Performed live by Nicholas Clarke. Single take. The emotional arc emerged in the moment — it was not pre-planned.
A structured improvisation was performed with deliberate emotional architecture. Three listeners were asked only: what emotional shifts do you hear?
All three correctly identified each emotional state, suggesting that the performances encode emotionally recoverable structure.
Session 001 was then submitted to Gemini 1.5 Pro with no labels or guidance as an independent structural check. The model identified 13 emotional transitions, 4 compound emotional states, and 3 unexpected musical responses — consistent with the annotator's labels. This is a plausibility signal, not a model validation.
The signal does not need to be argued. It is in the recording.
The research demonstration is currently in production as part of Phase 1 dataset construction: short structured sessions with the emotional arc stated on camera before recording, transitions narrated in real time, and independent annotators labelling without access to the original labels.
Session 001 is encoded. The schema is locked. Recording is underway.
If you want to be notified when the research demo is available, use the contact form below.
The moat is not the music. It is not the middleware. It is the dataset.
A proprietary corpus of narrated real-time improvisational decisions, built through structured recording sessions using a methodology developed for this specific purpose. Timestamped, emotionally labelled, and built on a practice that takes years to develop. Not a budget. Not a scraper. Years.
Most AI startups begin with zero proprietary data. StoneKey begins with a years-long archive of improvised performances and a structured methodology for building a dataset no one else can replicate.
Session 001 is complete, annotated, and locked. Phase 1 target: 102 sessions.
Decision-level training data. Not songs. Moments of human musical reasoning, timestamped to the event.
Session 001's annotations include 4 compound emotional states and 3 unexpected musical responses. The schema is locked and validated.
No public dataset contains real-time emotional decision narration alongside live performance.
Feltwork outputs musical steering instructions rather than generating raw audio. Low compute, real-time compatible, integrates into existing middleware. Studios do not rebuild their pipeline. They add an emotional conductor to what they already have.
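As a rough illustration of what "steering instructions, not raw audio" can mean in practice, the sketch below maps a hypothetical instruction onto generic real-time parameters of the kind existing middleware already exposes. Every name here, from the class to the parameter keys, is invented for the example; this is not Feltwork's output format or any specific middleware's API.

```python
from dataclasses import dataclass
import json

# Hypothetical shape of a steering instruction. Feltwork's actual output format
# is not documented here; this only illustrates driving existing middleware
# parameters instead of generating audio.
@dataclass
class SteeringInstruction:
    timestamp: float           # game time in seconds
    target_state: list[str]    # emotional state(s) to move toward; two or more = compound
    intensity: float           # 0.0 to 1.0, how strongly to lean into the target
    transition_seconds: float  # how long the shift should take

def to_middleware_params(instr: SteeringInstruction) -> dict[str, float]:
    """Translate a steering instruction into flat real-time parameters, the kind
    existing audio middleware already consumes (parameter names are invented)."""
    return {
        "emotion_intensity": instr.intensity,
        "transition_time": instr.transition_seconds,
        # one parameter per active emotional label, so compound states stay additive
        **{f"state_{label.replace(' ', '_')}": 1.0 for label in instr.target_state},
    }

instr = SteeringInstruction(
    timestamp=314.2,
    target_state=["grief", "beauty"],  # a compound state, not a single tag
    intensity=0.7,
    transition_seconds=8.0,
)
print(json.dumps(to_middleware_params(instr), indent=2))
```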
AI can generate music that sounds human.
Feltwork generates music that thinks human.
Investors, studios, co-founder candidates, collaborators. If you felt something watching the demo and you want to talk about what comes next, fill out the form.
Currently seeking: a technical co-founder with a background in sequence modelling, audio ML, or reinforcement learning. Equal founding partnership. Remote-friendly.