Request for comments: Simple Universal Prompting System

Status Update: Architecture Refactor and Tool Integration
Major Changes Since Last Update

Tool Usage Support: The system is now planned to support capturing model-generated tool calls, executing them in Python, and feeding the responses back into workflows. Execution is currently blocking (the system pauses during tool calls). That is sufficient for deploying in eval mode with a batch size of one, or for a kludgy but functional SIMD-like data generation use case that accepts occasional blocks, particularly for on-machine Python tool usage.
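To make the blocking behaviour concrete, here is a minimal sketch of a capture, execute, feed-back loop. It is not the project's implementation: the control tokens `<tool>`, `</tool>`, and `<eos>`, and the names `generate_with_blocking_tools` and `scripted_model`, are all hypothetical, and a scripted token list stands in for the model.

```python
from typing import Callable, List

# Hypothetical control tokens, chosen only for this illustration.
TOOL_OPEN, TOOL_CLOSE, EOS = "<tool>", "</tool>", "<eos>"


def generate_with_blocking_tools(
    next_token: Callable[[List[str]], str],
    tool: Callable[[str], str],
    max_steps: int = 64,
) -> List[str]:
    """Generate tokens, pausing generation whenever a tool call completes."""
    context: List[str] = []
    captured: List[str] = []
    capturing = False
    for _ in range(max_steps):
        token = next_token(context)
        context.append(token)
        if token == EOS:
            break
        if token == TOOL_OPEN:
            # Start capturing the tool call body.
            capturing, captured = True, []
        elif token == TOOL_CLOSE and capturing:
            capturing = False
            # Blocking step: nothing else is generated until the callback returns.
            response = tool(" ".join(captured))
            # Feed the response back into the context as forced tokens.
            context.extend(response.split())
        elif capturing:
            captured.append(token)
    return context


def scripted_model(script: List[str]) -> Callable[[List[str]], str]:
    """Stand-in for a real model: replays a fixed token script, then emits EOS."""
    remaining = iter(script)

    def next_token(context: List[str]) -> str:
        return next(remaining, EOS)

    return next_token


if __name__ == "__main__":
    model = scripted_model([TOOL_OPEN, "2", "+", "2", TOOL_CLOSE, "done"])
    print(generate_with_blocking_tools(model, lambda query: "result 4"))
```

With a batch size of one the pause only costs latency. In a batched setting every sequence in the batch would wait on the slowest tool call, which is the "occasional blocks" trade-off mentioned above.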

Core TTFA Redesign: Broke the monolith apart into separate responsibilities so that it is modular under the chain of responsibility pattern. I am REALLY good at vector indexing, but frankly I think I can still only manage one or so problem at a time. This is still in the planning phase, as it is much easier to plan these interactions out now than to code them.

  • Main Orchestration Autonoma (MOA): Coordinator and program controller - watches for control tokens, manages PC state, no token replacement, does jumps and zone advancement.

Four injection transforms are contained within the MOA. They listen to the passing tokens and the current Program Counter in order to inject tokens in the right order. They resolve in priority order under a chain of responsibility, replacing the model’s token stream as needed, and all of them resolve before the MOA resolves flow control. In essence this gives a tiny extendable OS kernel for this specific job:

  • Timeout Triggering Autonoma - forces zone advancement on timeout
  • Prompt Feeding Autonoma - teacher-forces prompt tokens
  • Stream Input Feeding Autonoma - feeds tool responses
  • Stream Output Capturing Autonoma - captures tool calls
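As a rough sketch of how this priority-ordered chain could look in plain Python (ignoring the vectorized/batched form the real design needs): each transform either returns a replacement token or passes, the first replacement wins, and only then does the orchestrator handle flow control. All class names, the `State` fields, and the `<advance>` control token are assumptions made for illustration, not taken from the spec.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Protocol, Sequence

ADVANCE = "<advance>"  # hypothetical control token that ends the current zone


@dataclass
class State:
    pc: int = 0                 # program counter: index of the current zone
    steps_in_zone: int = 0      # tokens emitted since the zone started
    timeout: int = 4            # max tokens per zone before forced advancement
    prompt: Deque[str] = field(default_factory=deque)      # tokens to teacher-force
    tool_input: Deque[str] = field(default_factory=deque)  # pending tool response
    captured: List[str] = field(default_factory=list)      # captured tool call


class Transform(Protocol):
    def handle(self, token: str, state: State) -> Optional[str]:
        """Return a replacement token, or None to pass to the next transform."""
        ...


class TimeoutTrigger:
    def handle(self, token: str, state: State) -> Optional[str]:
        return ADVANCE if state.steps_in_zone >= state.timeout else None


class PromptFeeder:
    def handle(self, token: str, state: State) -> Optional[str]:
        return state.prompt.popleft() if state.prompt else None


class StreamInputFeeder:
    def handle(self, token: str, state: State) -> Optional[str]:
        return state.tool_input.popleft() if state.tool_input else None


class StreamOutputCapturer:
    def handle(self, token: str, state: State) -> Optional[str]:
        if token != ADVANCE:
            state.captured.append(token)
        return None  # observes the stream, never replaces a token


class Orchestrator:
    """Stand-in for the MOA: runs the chain, then resolves flow control."""

    def __init__(self, transforms: Sequence[Transform]) -> None:
        self.transforms = list(transforms)  # already in priority order

    def step(self, token: str, state: State) -> str:
        for transform in self.transforms:
            replacement = transform.handle(token, state)
            if replacement is not None:
                token = replacement
                break  # first transform that claims the token wins
        # Flow control happens only after every transform has resolved.
        if token == ADVANCE:
            state.pc += 1
            state.steps_in_zone = 0
        else:
            state.steps_in_zone += 1
        return token


if __name__ == "__main__":
    moa = Orchestrator(
        [TimeoutTrigger(), PromptFeeder(), StreamInputFeeder(), StreamOutputCapturer()]
    )
    state = State(timeout=3, prompt=deque(["Q:", "hi"]))
    print([moa.step(tok, state) for tok in ["x", "y", "a", "b", "c"]])
    print(state.pc, state.captured)
```

Here the orchestrator itself never replaces a token; it only reacts to whatever the chain produced, which mirrors the "no token replacement" role described for the MOA.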

Documentation is updated on the repo.

Tool Interface added in SACS/SFCS

toolbox = program.new_toolbox(input_buffer_size=10000, output_buffer_size=10000)
search_tool = toolbox.setup_tool(web_search_callback)

program.capture("search_query", search_tool)  # Capture tool call from last zone in sequence. Do not expect this to be GPU fast when resolving.
program.feed("search_results")                # Feed response. Exhausts input buffer.

Implementation Status

  • All specifications now complete. Outstanding complexity problems resolved. Parsing chain clear. Compilation chain clear.
  • Open source is likely beneficial now since tool usage is confirmed as possible. One stop solution after all.

Next Steps

  • Double-check that the spec/documentation is self-consistent to avoid stupid oversights.
  • Develop contribution guidelines
  • Get repo ready for open source with licenses, etc.
  • Figure out how to set up CI again.
  • Code frontend parsers to reduce down to ZCP, ensure no unexpected issues occur before creating backend.
  • Include VERY thorough error checking in the parsers, with descriptive messages. I trust myself very little.