$105 / hr
Hourly contract
Remote
The goal of this work is to improve the quality, usefulness, and reliability of general-purpose conversational AI systems. These systems are used across a wide range of everyday and professional scenarios, and their effectiveness depends on how clearly, accurately, and helpfully they respond to real user questions.
In financial contexts, even small errors or unclear reasoning can have significant downstream impact. This project focuses on evaluating and improving how conversational AI systems reason about, explain, and respond to finance-related queries. Your expertise helps ensure model outputs reflect real-world financial knowledge, sound quantitative reasoning, and clear professional communication.
What You’ll Do
Write and refine prompts to guide model behavior in financial contexts
Evaluate LLM-generated responses to finance-related user queries for accuracy, reasoning quality, and clarity
Conduct fact-checking using trusted public sources, financial references, and external tools
Annotate model responses by identifying strengths, areas of improvement, and factual or conceptual inaccuracies
Assess tone, completeness, and appropriateness of responses for real-world financial use cases
Ensure model responses align with expected conversational behavior and system guidelines
Apply consistent evaluation standards by following clear taxonomies, benchmarks, and detailed evaluation guidelines
Who You Are
You have a minimum of 5 years of real-world professional experience in finance, supported by a Bachelor’s, Master’s, or PhD degree in a related field (e.g., Finance, Accounting, Economics, or Business)
You have experience in one or more of the following sub-domains:
Investment Banking
Corporate Finance
Accounting & Auditing
Asset Management
You have significant experience using large language models (LLMs) and understand how and why people use them
You have excellent writing skills and can clearly explain complex financial concepts
You have strong attention to detail and consistently notice subtle issues others may overlook
Nice-to-Have Specialties
Prior experience with RLHF, model evaluation, or data annotation work
Experience writing or editing high-quality financial content
Experience translating complex financial ideas for non-expert audiences
Familiarity with evaluation rubrics, benchmarks, or quality scoring systems
What Success Looks Like
You identify financial inaccuracies, faulty assumptions, and weak reasoning in model responses
Your feedback directly improves the accuracy and usefulness of finance-related AI outputs
You deliver clear, reproducible evaluation artifacts that customers can act on

