This paper presents a case study of eleven sessions in which a single developer used large language models (LLMs), specifically GPT-3.5 and GPT-4, to assist with structured software development tasks. The study examines model performance in generating, testing, and documenting code through interactive, test-oriented workflows. Drawing on domain features from the D20 role-playing system, the sessions explore how prompt design, task complexity, and stepwise refinement influence code quality and reliability. Results suggest that LLMs can support modular development when guided through incremental tasks and used in ways analogous to pair programming or test-driven development. Practices such as schema definition, explicit testing, and conversational decomposition appeared to improve output consistency and functional correctness. The paper proposes a practical framework for working with LLMs in development contexts and discusses broader considerations for software teams, including developer training, productivity, and ethical concerns.