LLM Trainer for Simple Prompt tasks ,Any basic programming- Multiple languages

Upwork

Remoto

•

1 hora atrás

•

Nenhuma candidatura

Sobre

Job Title: LLM Trainer JD for Agent Completion tasks *Remote | 4-hour overlap with PST mornings required* Test 1 – Agent Completion Task Design 9 Single-Choice Questions (task sub-categories, tool usage, error handling, multi-turn interactions). 11 Scenario Questions (decide best assistant response using tool descriptions). Duration: 30 minutes. Focus on: Lazy User, Error Recovery, State Dependency, Task Switching. Must follow Agent Guidelines. Test 2 – Language Assessment Language-based evaluation. Duration: 30 minutes. Candidates must respond in their native language 🚫 Countries That Cannot Be Used (Restricted) As per compliance, candidates from the following countries are restricted and cannot be considered: OFAC / OFAC+ Countries: Cuba, Iran, North Korea, Syria, Crimea, Donetsk, Luhansk, Zaporizhzhia regions of Ukraine Russia, China, Afghanistan, Yemen, Central African Republic, Eritrea, Guinea, Guinea-Bissau, DR Congo, Venezuela, Liberia, Libya, Papua New Guinea, Somalia, South Sudan, Belarus, Myanmar, Sudan, Burundi, etc. ⚠️ Please avoid submitting profiles from these regions for any roles. ✅ Countries Allowed (Greenlist + Long Tail) Candidates can be considered from: Greenlist: Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, Mexico. Long Tail: Argentina, UAE, Vietnam, Ethiopia, Chile, Bulgaria, Croatia, Lithuania, Poland, Slovenia, Thailand, South Korea, Romania, Nepal, Philippines, India, Hungary, Malaysia, Indonesia, South Africa, Czech Republic, Estonia, Latvia, etc. 🌍 Countries by Language (Allowed Pools) French: Ivory Coast, Senegal, Cameroon, Madagascar, Tunisia, Morocco, Philippines. Portuguese: US, Brazil, Poland, Romania, Philippines, Morocco. Spanish: Mexico, Argentina, Colombia, Peru, Chile. Chinese: US, India, Brazil, Thailand, Malaysia. German: US, India, Romania, Poland, Hungary, Philippines, South Africa, Czech Republic. Italian: US, India, Brazil, Mexico, Argentina, Colombia. Dutch: US, India, Poland, Suriname, South Africa, Indonesia, Hungary, Philippines. Japanese: US, India, Brazil, Thailand, Vietnam, Malaysia. Korean: US, India, Vietnam, Malaysia, Philippines, Indonesia. Danish: Poland, Philippines. Norwegian: US, Lithuania, Poland, Latvia, Estonia, Philippines. Swedish: Poland, Estonia, Latvia, Philippines. ### *About the Role* This position is within a project with one of the foundational LLM companies. The goal is to assist these foundational LLM companies in enhancing their Large Language Models. One way we help these companies improve their models is by providing them with high-quality proprietary data. This data serves two main purposes: first, as a basis for fine-tuning their models, and second, as an evaluation set to benchmark the performance of their models or competitor models. For example, in the case of Agent Completion (AC) data generation, your task will be to simulate high-quality multi-turn conversations between a user and a smart assistant that utilizes function-calling tools to accomplish user goals. You will craft these dialogues by playing both the assistant and the user, while simulating tool use where necessary to guide the assistant through complex decision-making and real-world reasoning scenarios. What does day-to-day look like: Design multi-turn conversations that simulate real interactions between users and AI assistants using apps like calendar, email, maps, and drive. Emulate both the user and the assistant, including the assistant's tool calls (only when corrections are needed). Carefully select when and how the assistant uses available tools, ensuring logical flow and proper usage of function calls. Craft dialogues that demonstrate natural language, intelligent behavior, and contextual understanding across multiple turns. Generate examples that showcase the assistant’s ability to gracefully complete feasible tasks, recognize infeasible ones, and maintain engaging general chat when tools aren’t required. Ensure all conversations adhere to defined formatting and quality guidelines, using an internal playbook. Iterate on conversation examples based on feedback to continuously improve realism, clarity, and value for training purposes. Collaborate with peers and reviewers to maintain consistency and high standards in deliverables. Requirements: 3+ years of overall professional experience in a technical or analytical field. Experience in any programming language or tech stack is acceptable; a strong grasp of APIs, data formats (e.g., JSON), and logical thinking is more critical than specific toolsets. Strong general technical reasoning skills and the ability to model real-world assistant behavior using tool-based APIs. Ability to break down complex tasks and simulate realistic dialogues that reflect user expectations and assistant limitations. Excellent written communication skills in English, with a focus on clarity, tone, and instructional coherence. Creativity and attention to detail in crafting realistic scenarios and responses. Experience working with or around LLMs, virtual assistants, or function-calling frameworks is a plus. Ability to follow detailed guidelines and formatting standards with high consistency. Native or professional proficiency in at least one of the following 16 locales is required: pt_BR – Portuguese (Brazil) fr_FR – French (France) it_IT – Italian (Italy) de_DE – German (Germany) es_MX – Spanish (Mexico) zh_CN – Chinese (Simplified, China) zh_HK – Chinese (Traditional, Hong Kong) zh_TW – Chinese (Traditional, Taiwan) ja_JP – Japanese (Japan) ko_KR – Korean (South Korea) vi_VN – Vietnamese (Vietnam) tr_TR – Turkish (Turkey) nl_NL – Dutch (Netherlands) sv_SE – Swedish (Sweden) nb_NO – Norwegian Bokmål (Norway) da_DK – Danish (Denmark) Perks of Freelancing With JUPITER AI LABS: OPEN POSITIONS ARE : – LLM Trainer – Agentic tasks – Japanese (Japan) – LLM Trainer – Agentic tasks – Chinese (Simplified, China) – LLM Trainer – Agentic tasks – Chinese (Traditional, Taiwan) – LLM Trainer – Agentic tasks – Danish (Denmark) Penguin – LLM Trainer – Agentic tasks – Korean (South Korea) – LLM Trainer – Agentic tasks – Chinese (Traditional, Hong Kong) – LLM Trainer – Agentic tasks – Dutch (Netherlands) – LLM Trainer – Agentic tasks – Swedish (Sweden) – LLM Trainer – Agentic tasks – Norwegian Bokmål (Norway) Work in a fully remote environment. Opportunity to work on cutting-edge AI projects with leading LLM companies. Offer Details: Commitments Required: At least 4 hours per day and minimum 20 hours per week with overlap of 4 hours with PST. Engagement Type: Contractor assignment (no medical/paid leave) Duration of Contract: 6 Weeks Perks of working with us: Work in a fully remote environment. Opportunity to work on cutting-edge AI projects with leading LLM companies. Offer Details: Commitments Required: At least 4 hours per day and minimum 20 hours per week with overlap of 4 hours with PST. (We have 3 options of time commitment: 20 hrs/week, 30 hrs/week or 40 hrs/week) Engagement type : Contractor assignment (no medical/paid leave) Engagement length - 6-8 weeks in first phase Hourly rate: $12-20 or more USD/hr or monthly pay as per consensus Are you available from fixed pay of [160-168 hrs] for 8 hrs daily for one month cycle. If yes the payment will be in this way, if you work from 1 Mar to 31 Mar , our system pay by 30 Apr-5 May as per upwork payment , and similar the cycle continue every month,Opportunity: Full-time, 4 - 5 hours (PST overlap)