
Closed
Paid on delivery
Project Type: Contract / Task-based
Category: Game Development, LuaU Programming, QA & Test Engineering

About the Project

We are looking for skilled taskers to help build reinforcement learning evaluation scenarios for Roblox's LLM Assistant. The LLM Assistant is an AI tool built into RobloxStudio that helps developers write and modify game code. Our goal is to evaluate and improve this assistant's performance using a framework called OpenGameEval, which runs the LLM against sample scenarios and assesses how well it completes prompted tasks.

Your role is to take skeleton evaluation scripts, each containing a base game state and a prompt describing a development task, and complete them into fully functional evaluation scenarios.

What You'll Be Doing

For each assigned scenario, you will:

- Analyze the prompt and game environment: Open the provided .rbxl file in RobloxStudio, understand the game logic, and determine what a correct solution to the prompt looks like.
- Write a reference ("golden") solution: Implement an ideal LuaU solution to the prompt inside the eval's reference function.
- Design a rubric with evaluation criteria: Plan correctness, completeness, and precondition criteria that map directly to assert statements in your test harness.
- Build unit tests: Write assert-based tests across three evaluation stages: static checks (check_scene), server-side runtime checks (serverCheck), and client-side runtime checks (clientChecks).
- Provide instance guidance: Identify which script and non-script instances in the DataModel are relevant to evaluating the prompt.
- Document your approach: Write a markdown explanation covering your reference code, unit tests, blockers, and methodology.

Required Skills & Experience

- Strong proficiency in LuaU / Lua and experience scripting within the Roblox ecosystem
- Solid understanding of Roblox's DataModel, including services such as Workspace, Players, ReplicatedStorage, ServerScriptService, StarterPlayer, StarterPack, and StarterGui
- Familiarity with Roblox's client-server architecture: understanding the distinction between Scripts, LocalScripts, and ModuleScripts, and how replication works
- Experience with RobloxStudio: navigating the Explorer, reading instance properties/attributes, and using testing modes (Test, Server & Clients)
- Unit testing mindset: ability to write tests that validate functionality without overfitting to a specific implementation. Tests must accept all reasonable solutions, not just your own.
- Familiarity with the Roblox API, especially Instance, Player, Humanoid, Tool, RemoteEvent, ProximityPrompt, and ValueBase classes

Nice to Have

- Experience with Roblox game templates or community-published games
- Familiarity with reinforcement learning evaluation concepts
- Prior experience with test harness design or QA automation
- Knowledge of VirtualInputManager for simulating player inputs

Tools You'll Use

- RobloxStudio with the Assistant Eval Studio Plugin
- OpenGameEval framework (for remote validation)
- Standard text editor for .lua script development

Deliverables Per Task

- Completed .lua eval script with all fields filled in (reference, instance guidance, check_scene, serverCheck, clientChecks)
- Rubric file (.json) following the provided schema specification
- Documentation file (.md) explaining your reference and testing approach

Key Principles to Keep in Mind

- Tests must be solution-agnostic. If an average Roblox developer would consider a solution acceptable, your tests should pass it. Brittle, overfitting tests are not acceptable.
- Design tests before writing the reference to avoid biasing your harness toward your own implementation.
- All validation must use Lua's assert function; other testing frameworks or error-throwing approaches won't work with the OpenGameEval pipeline.
- The reference function must be self-contained: no dependencies on the testing suite or EvalUtils modules.
- Avoid including any PII in submissions.

How to Apply

Please include the following in your application:

- A brief summary of your Roblox development experience, including any published games or plugins
- Examples of LuaU code you've written (game scripts, tools, systems)
- Any relevant experience with test design, QA, or evaluation frameworks
- Your availability and estimated capacity (tasks per week)

Write "Lua" at the beginning of your proposal.

We look forward to hearing from experienced Roblox developers who enjoy solving puzzles, writing clean code, and thinking critically about what makes a good evaluation!
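As a rough illustration of the "solution-agnostic, assert-only" principle described in the posting, the sketch below checks for a numeric Coins value anywhere under a player, so it passes two differently structured solutions and fails an empty one. Everything here is an assumption made for illustration: the Coins prompt, the helper names findDescendant and checkCoinsStat, and the plain tables standing in for Roblox instances. It does not use or represent the actual OpenGameEval API; a real check would query the DataModel instead.

```lua
-- Hypothetical stand-in for an instance tree: plain tables with
-- name, className, value, and children fields (not real Roblox Instances).
local function findDescendant(node, predicate)
    for _, child in ipairs(node.children or {}) do
        if predicate(child) then
            return child
        end
        local found = findDescendant(child, predicate)
        if found then
            return found
        end
    end
    return nil
end

-- Hypothetical check: asserts on the observable outcome (a non-negative
-- numeric Coins value exists), not on where or how a solution stored it.
local function checkCoinsStat(player)
    local stat = findDescendant(player, function(n)
        return n.name == "Coins"
    end)
    assert(stat ~= nil, "expected a Coins value somewhere under the player")
    assert(
        stat.className == "IntValue" or stat.className == "NumberValue",
        "Coins should be a numeric value object"
    )
    assert(
        type(stat.value) == "number" and stat.value >= 0,
        "Coins should hold a non-negative number"
    )
    return true
end

-- Two acceptable solutions with different structure and value classes.
local solutionA = {
    name = "Player1", className = "Player",
    children = {
        { name = "leaderstats", className = "Folder", children = {
            { name = "Coins", className = "IntValue", value = 0 },
        } },
    },
}
local solutionB = {
    name = "Player1", className = "Player",
    children = {
        { name = "Coins", className = "NumberValue", value = 25 },
    },
}
-- A player with no Coins value at all should fail the check.
local missing = { name = "Player1", className = "Player", children = {} }

assert(checkCoinsStat(solutionA))
assert(checkCoinsStat(solutionB))
assert(not pcall(checkCoinsStat, missing))
```

A brittle version of the same check would assert an exact path and class (for example, an IntValue directly inside a folder with one specific name), which would reject solutionB even though it arguably satisfies the same hypothetical prompt.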
Project ID: 40377210
20 proposals
Active 14 days ago
Location: United States
20 freelancers are bidding on average $1,204 USD for this job

I have carefully reviewed your requirements and understand you need to build robust, solution-agnostic evaluation scenarios for Roblox’s LLM Assistant using LuaU and OpenGameEval. With 9+ years of development experience, I have strong expertise in Lua/LuaU scripting, Roblox DataModel, and client-server architecture. I am experienced in writing clean, modular code and designing reliable test systems with assert-based validation. I will analyze each scenario, implement accurate reference solutions, design clear rubrics, and build comprehensive test cases (check_scene, serverCheck, clientChecks) ensuring flexibility and correctness across valid implementations. My approach focuses on clean logic, proper coverage, and non-brittle testing. I can also provide structured documentation and maintain consistent quality across tasks. Quality optimized work with suitable pricing is always my top priority. Come inbox & let’s get started. Best regards, Sohaib Pasn Games
$1,125 USD in 7 days
6.4

As an AI Automation Specialist with a passion for game development, my skill set is not only precise, but also tailored specifically for the requirements of this project. I have extensive experience working within the Roblox ecosystem and have a solid understanding of its scripting and testing aspects. My familiarity with Roblox's DataModel, client-server architecture, and API will help me to navigate and effectively utilize RobloxStudio and the Assistant Eval Studio Plugin to create fully functional evaluation scenarios. In addition to my in-depth LuaU proficiency, I bring strong problem-solving skills and a unit-testing mindset. This means that any tests I design will validate functionality without overfitting to a specific implementation, ensuring they accept all reasonable solutions rather than just my own. My versatility extends beyond game development as well — I have prior experience in test harness design and QA automation, which could prove beneficial in this evaluation process. My 100% job completion rate, on-time delivery, and top-rated reviews across all projects attest to my commitment to deadlines and quality work. With my fast response time, full-time availability, and even emergency support when needed, your project will receive dedicated attention at all times. Let's leverage my expertise in AI automation to ensure the LLM Assistant can meet its full potential!
$1,125 USD in 7 days
4.6

⭐⭐⭐⭐⭐ Create Evaluation Scenarios for Roblox's LLM Assistant ❇️ Hi My Friend, I hope you are doing well. I've reviewed your project requirements and noticed you're looking for someone to build evaluation scenarios for Roblox's LLM Assistant. Look no further; Zohaib is here to help you! My team has successfully completed 50+ similar projects in game development and LuaU programming. I will analyze game environments, create ideal solutions, design evaluation rubrics, and build comprehensive unit tests to ensure the assistant performs well. ➡️ Why Me? I can easily create your evaluation scenarios as I have 5 years of experience in LuaU scripting, game logic analysis, and QA testing. My expertise includes Roblox's DataModel, client-server architecture, and effective testing strategies. Additionally, I have a strong grip on RobloxStudio and its tools, ensuring a smooth process for your project. ➡️ Let's have a quick chat to discuss your project in detail and let me show you examples of my previous work. Looking forward to discussing with you in chat. ➡️ Skills & Experience: ✅ LuaU Programming ✅ Game Development ✅ RobloxStudio Navigation ✅ Unit Testing ✅ Evaluation Criteria Design ✅ Game Logic Analysis ✅ Client-Server Architecture ✅ Test Harness Design ✅ API Familiarity ✅ Documentation Writing ✅ Problem-Solving ✅ Code Optimization Waiting for your response! Best Regards, Zohaib
$900 USD in 2 days
3.4

Lua Hi, I have strong experience in Roblox game development, with a deep proficiency in LuaU scripting and a solid understanding of the Roblox ecosystem. I am familiar with the Roblox DataModel, including services such as Workspace, Players, and ReplicatedStorage, and have worked extensively with RobloxStudio’s client-server architecture. For this project, I can efficiently analyze and complete reinforcement learning evaluation scenarios by writing reference LuaU solutions, designing rubrics with clear evaluation criteria, and building robust unit tests using assert-based checks across static, server-side, and client-side stages. My experience with Roblox’s testing modes and API will allow me to write flexible, solution-agnostic tests that ensure a valid evaluation process for the LLM Assistant. I also have experience in test harness design and am comfortable writing documentation explaining my approach and the results of my tests. Best regards, Juan
$850 USD in 7 days
2.8

As an experienced game developer with a solid understanding of Roblox's ecosystem, I believe I am the perfect fit for your LLM scenario development project. I have worked extensively with LuaU / Lua and have a deep understanding of Roblox's DataModel and services like Workspace, Players, ReplicatedStorage, ServerScriptService, StarterPlayer, StarterPack, and StarterGui. This knowledge allows me to quickly adapt to and navigate the RobloxStudio environment to analyze and complete the provided evaluation scripts. In addition to my language proficiency, I bring a strong unit testing mindset to the table that is crucial for this task. I am skilled at writing tests that validate functionality without overfitting; tests that accept all acceptable solutions rather than just my own. This skill is instrumental in a solution-agnostic approach your project demands. Lastly, my familiarity with the Roblox API including classes like Instance, Player, Humanoid etc., speaks to my deep expertise in the game development space. Given your preference for applicants experienced in reinforcement learning evaluation concepts and test harness design or QA automation - there are valuable advantages you can leverage for your project through me. Thanks for considering!
$1,500 USD in 7 days
0.0

Hi, I’ve read your description carefully. I have experience in this area. If you’d like, I can send you URLs as examples. I believe I can handle your task. If you hire me, I will give it my best effort. Let’s connect on chat. Looking forward to hearing from you!
$1,125 USD in 14 days
0.0

Córdoba, Spain
Member since Sep 2, 2025