Simms AI v1: On the Design of a Self-Trained Conversational Artificial Neural Network
Simms AI v1 was a Markov chain-based chatbot, the culmination of several of my previous projects, including WordNet, an application that attempted to spider websites and build associations between related sites based on keywords rather than mutual hyperlinking. The basic design of Simms is a centralized neural network database paired with a distributed processing array that parses input into long-term memory, constructs a response when necessary, and returns it to the module that delivered the input. Its neural structure is based on that of the human brain, in that each neuron or “node” maintains between one and thousands of connections to other nodes. To speed the development of its associative memory, I employed a variety of techniques, including spidering Wikipedia, reading chat logs, and engaging in conversation with human users, which quickly demonstrated both the power and the limitations of the approach. Later in the project, I discovered that I had in effect implemented a Markov chain-based chat system, in which the edge weights in the graph were trained from the input corpora.
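To make the Markov chain idea concrete, here is a minimal sketch of a word graph whose edge weights are transition counts learned from a corpus, followed by a weighted random walk that builds a reply. It is an illustration only, not the original code: Simms AI v1 was written in PHP with MySQL storage, whereas this sketch uses Python and an in-memory dictionary, and the names `train` and `respond` are hypothetical.

```python
import random
from collections import defaultdict


def train(corpus, graph=None):
    """Count word-to-word transitions; each count acts as an edge weight."""
    if graph is None:
        # graph[word][next_word] -> number of times next_word followed word
        graph = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            graph[current][following] += 1
    return graph


def respond(graph, seed, max_words=12, rng=random):
    """Walk the graph from `seed`, picking each next word in proportion
    to the weight of the edge leading to it."""
    word = seed.lower()
    output = [word]
    for _ in range(max_words - 1):
        edges = graph.get(word)
        if not edges:
            break  # dead end: this word was never followed by anything
        candidates = list(edges)
        weights = [edges[c] for c in candidates]
        word = rng.choices(candidates, weights=weights, k=1)[0]
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    graph = train(["the cat sat", "the cat ran", "the dog sat"])
    print(respond(graph, "the", max_words=3))
```

In this toy corpus, "cat" follows "the" twice and "dog" follows it once, so a walk starting at "the" should choose "cat" about twice as often as "dog". Feeding in more text only increments counts, which is roughly what training the edge weights from the input corpora amounts to in a first-order chain.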
Simms AI was also one of my first forays into building a distributed system: its components managed communication and dispatched jobs among the several local compute nodes on which it ran. My experience building Simms AI later contributed to my interest in pursuing a Ph.D. in distributed systems.
Speed and working database size became primary concerns as Simms’s neural network exceeded 300,000 nodes connected by two million connections, a suboptimal node-connection ratio caused by poor text-cleaning routines and inefficient storage. Careful analysis revealed fundamental flaws in the neural storage system, which I planned to address in designing Simms AI v2, the next iteration of the Simms project. In particular, I planned to add more context to the edges between word nodes, including ideas for weighting edges with “emotions”. Simms AI was implemented in PHP and used MySQL to store its “memory”, another source of inefficiency. A full technical report about Simms AI v1 is available below.