AI Test & Evaluation Specialist

Work Role ID: 672  |  Workforce Element: AI / Data

What does this work role do? Performs testing, evaluation, verification, and validation of AI solutions to ensure they are developed to be, and remain, robust, resilient, responsible, secure, and trustworthy; and communicates results and concerns to leadership.

CORE KSATs
KSAT ID  Description  KSAT Type
22 * Knowledge of computer networking concepts and protocols, and network security methodologies. Knowledge
108 * Knowledge of risk management processes (e.g., methods for assessing and mitigating risk). Knowledge
1157 * Knowledge of national and international laws, regulations, policies, and ethics as they relate to cybersecurity. Knowledge
1158 * Knowledge of cybersecurity principles. Knowledge
1159 * Knowledge of cyber threats and vulnerabilities. Knowledge
6900 * Knowledge of specific operational impacts of cybersecurity lapses. Knowledge
6935 * Knowledge of cloud computing service models, including Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS). Knowledge
6938 * Knowledge of cloud computing deployment models in private, public, and hybrid environments, and the difference between on-premises and off-premises environments. Knowledge
ADDITIONAL KSATs
KSAT ID  Description  KSAT Type
40 Knowledge of organization’s evaluation and validation requirements. Knowledge
182 Skill in determining an appropriate level of test rigor for a given system. Skill
508 Determine level of assurance of developed capabilities based on test results. Task
550 Develop test plans to address specifications and requirements. Task
694 Make recommendations based on test results. Task
765B Perform AI architecture security reviews, identify gaps, and develop a risk management plan to address issues. Task
858B Record and manage test data. Task
858A Test, evaluate, and verify hardware and/or software to determine compliance with defined specifications and requirements. Task
942 Knowledge of the organization’s core business/mission processes. Knowledge
1133 Knowledge of service management concepts for networks and related standards (e.g., Information Technology Infrastructure Library, current version [ITIL]). Knowledge
5120 Conduct hypothesis testing using statistical processes. Task
5848 Assess technical risks and limitations of planned tests on AI systems. Task
5850 Assist integrated project teams to identify, curate, and manage data. Task
5851 Build assurance cases for AI systems that support the needs of different stakeholders (e.g., acquisition community, commanders, and operators). Task
5858 Conduct AI risk assessments to ensure models and/or other solutions are performing as designed. Task
5866 Create or customize existing Test and Evaluation Master Plans (TEMPs) for AI systems. Task
5873 Determine methods and metrics for quantitative and qualitative measurement of AI risks so that sensitivity, specificity, likelihood, confidence levels, and other metrics are identified, documented, and applied. Task
5876 Develop machine learning code testing and validation procedures. Task
5877 Develop possible solutions for technical risks and limitations of planned tests on AI solutions. Task
5889 Identify and submit exemplary AI use cases, best practices, failure modes, and risk mitigation strategies, including after-action reports. Task
5896 Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI. Task
5901 Measure the effectiveness, security, robustness, and trustworthiness of AI tools. Task
5910 Provide quality assurance of AI products throughout their lifecycle. Task
5914 Report test and evaluation deficiencies and possible solutions to appropriate personnel. Task
5916 Select and use the appropriate models and prediction methods for evaluating AI performance. Task
5919 Test AI tools against adversarial attacks in operationally realistic environments. Task
5920 Test components to ensure they work as intended in a variety of scenarios for all aspects of the AI application. Task
5921 Test how users interact with AI solutions. Task
5922 Test the reliability, functionality, security, and compatibility of AI tools within systems. Task
5923 Test the trustworthiness of AI solutions. Task
5926 Use models and other methods for evaluating AI performance. Task
6060 Ability to collect, verify, and validate test data. Ability
6170 Ability to translate data and test results into evaluative conclusions. Ability
6311 Knowledge of machine learning theory and principles. Knowledge
6490 Skill in assessing the predictive power and subsequent generalizability of a model. Skill
6630 Skill in preparing Test & Evaluation reports. Skill
6641 Skill in providing Test & Evaluation resource estimates. Skill
7003 Knowledge of AI security risks, threats, and vulnerabilities and potential risk mitigation solutions. Knowledge
7004 Knowledge of AI Test & Evaluation frameworks. Knowledge
7006 Knowledge of best practices from industry and academia in test design activities for verification and validation of AI and machine learning systems. Knowledge
7009 Knowledge of coding and scripting in languages that support AI development and use. Knowledge
7012 Knowledge of current test standards and safety standards that are applicable to AI (e.g., MIL-STD-882E, DO-178C, ISO 26262). Knowledge
7020 Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable). Knowledge
7024 Knowledge of how AI is developed and operated. Knowledge
7025 Knowledge of how AI solutions integrate with cloud or other IT infrastructure. Knowledge
7028 Knowledge of how to automate development, testing, security, and deployment of AI/machine learning-enabled software to the DoD. Knowledge
7029 Knowledge of how to collect, store, and monitor data. Knowledge
7030 Knowledge of how to deploy test infrastructures with AI systems. Knowledge
7034 Knowledge of interactions and integration of DataOps, MLOps, and DevSecOps in AI. Knowledge
7036 Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government. Knowledge
7037 Knowledge of machine learning operations (MLOps) processes and best practices. Knowledge
7038 Knowledge of metrics to evaluate the effectiveness of machine learning models. Knowledge
7040 Knowledge of Protected Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability considerations for AI solutions. Knowledge
7041 Knowledge of remedies against unintended bias in AI solutions. Knowledge
7044 Knowledge of testing, evaluation, validation, and verification (T&E V&V) tools and procedures to ensure systems are working as intended. Knowledge
7045 Knowledge of the AI lifecycle. Knowledge
7048 Knowledge of the benefits and limitations of AI capabilities. Knowledge
7051 Knowledge of the possible impacts of machine learning blind spots and edge cases. Knowledge
7053 Knowledge of the user experience (e.g., decision making, user design, and human-computer interaction) as it relates to AI systems. Knowledge
7054 Knowledge of tools for testing the robustness and resilience of AI products and solutions. Knowledge
7065 Skill in explaining AI concepts and terminology. Skill
7067 Skill in identifying low-probability, high-impact risks in machine learning training data sets. Skill
7069 Skill in identifying risk over the lifespan of an AI solution. Skill
7070 Skill in integrating AI Test & Evaluation frameworks into test strategies for specific projects. Skill
7075 Skill in testing and evaluating machine learning algorithms or AI solutions. Skill
7076 Skill in testing for bias in data sets and AI system outputs, as well as determining whether historically or often underrepresented and marginalized groups are properly represented in the training, testing, and validation data sets and AI system outputs. Skill
7077 Skill in translating operational requirements for AI systems into testing requirements. Skill
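
ILLUSTRATIVE EXAMPLES (non-normative)

The sketches below illustrate a few of the KSATs above in code. They are hedged, minimal examples written for this document, not prescribed DoD procedures, approved tools, or official test designs.

Hypothesis testing and effectiveness metrics (KSATs 5120, 5873, 7038). A minimal sketch that compares a candidate classifier against a baseline on a shared test set, reporting standard metrics and an exact McNemar test. The binary 0/1 labels, the 0.05 significance level, and the synthetic data at the bottom are all assumptions made only to keep the script self-contained and runnable.

```python
import numpy as np
from scipy.stats import binomtest
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def compare_models(y_true, y_baseline, y_candidate, alpha=0.05):
    """Report standard metrics and an exact McNemar test for two classifiers
    scored on the same held-out test set."""
    for name, y_pred in (("baseline", y_baseline), ("candidate", y_candidate)):
        print(f"{name}: acc={accuracy_score(y_true, y_pred):.3f}  "
              f"prec={precision_score(y_true, y_pred):.3f}  "
              f"rec={recall_score(y_true, y_pred):.3f}  "
              f"f1={f1_score(y_true, y_pred):.3f}")

    # Exact McNemar test on discordant pairs: under H0 (equal error rates),
    # "only the candidate is correct" ~ Binomial(n_discordant, 0.5).
    base_ok = np.asarray(y_baseline) == np.asarray(y_true)
    cand_ok = np.asarray(y_candidate) == np.asarray(y_true)
    only_candidate = int(np.sum(cand_ok & ~base_ok))
    n_discordant = int(np.sum(cand_ok != base_ok))
    result = binomtest(only_candidate, n_discordant, p=0.5, alternative="two-sided")
    verdict = "reject H0" if result.pvalue < alpha else "fail to reject H0"
    print(f"McNemar exact test: p={result.pvalue:.4f} ({verdict} at alpha={alpha})")


if __name__ == "__main__":
    # Synthetic stand-in data: labels plus two noisy prediction vectors.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 500)
    baseline = np.where(rng.random(500) < 0.80, y, 1 - y)    # ~80% accurate
    candidate = np.where(rng.random(500) < 0.88, y, 1 - y)   # ~88% accurate
    compare_models(y, baseline, candidate)
```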
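
Predictive power and generalizability (KSATs 6490, 7038). A minimal sketch of stratified k-fold cross-validation as one way to gauge how a model's score might hold up on unseen data. The dataset, model, ROC AUC scoring, and five-fold split are placeholder assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Placeholder data and model; in practice these come from the system under test.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000)

# Stratified folds preserve class balance; the fold-to-fold spread is a rough
# indicator of how stable the score is likely to be on unseen data.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"ROC AUC per fold: {np.round(scores, 3)}")
print(f"mean={scores.mean():.3f}  std={scores.std():.3f}")
```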
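
Adversarial robustness probe (KSATs 5919, 7054). A toy sketch of an FGSM-style perturbation against a linear classifier, comparing clean and adversarial accuracy. Operational evaluations would target the deployed model under realistic conditions, typically with a dedicated adversarial-ML toolkit; the model, data, and the eps = 0.2 perturbation budget here are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy model under test; a real evaluation would target the deployed system.
X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# For logistic regression, the gradient of the cross-entropy loss with respect
# to the input is (p - y) * w, so the FGSM step is eps * sign((p - y) * w).
w, b = clf.coef_[0], clf.intercept_[0]
p = 1.0 / (1.0 + np.exp(-(X_te @ w + b)))
grad = (p - y_te)[:, None] * w[None, :]
eps = 0.2                                  # assumed perturbation budget
X_adv = X_te + eps * np.sign(grad)

print(f"clean accuracy:       {clf.score(X_te, y_te):.3f}")
print(f"adversarial accuracy: {clf.score(X_adv, y_te):.3f}")
```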
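
Bias and representation check (KSATs 7041, 7076). A minimal sketch that reports group representation in an evaluation set and a disparate-impact ratio over model outputs. The column names, the hypothetical data, and the 0.8 ("four-fifths") review threshold are assumptions; real bias testing would cover the training, testing, and validation splits and use domain-appropriate fairness metrics.

```python
import pandas as pd


def representation_report(df: pd.DataFrame, group_col: str, pred_col: str) -> None:
    """Print group representation and a simple demographic-parity check."""
    # Share of each group in the evaluation set.
    shares = df[group_col].value_counts(normalize=True)
    print("group representation:")
    print(shares.round(3).to_string())

    # Positive-prediction rate per group and the disparate-impact ratio
    # (min rate / max rate); ratios below ~0.8 are commonly flagged for review.
    rates = df.groupby(group_col)[pred_col].mean()
    print("positive-prediction rate by group:")
    print(rates.round(3).to_string())
    ratio = rates.min() / rates.max()
    status = "flag for review" if ratio < 0.8 else "within assumed threshold"
    print(f"disparate impact ratio: {ratio:.2f} ({status})")


if __name__ == "__main__":
    # Hypothetical evaluation frame: one protected-attribute column and the
    # model's binary predictions for each record.
    demo = pd.DataFrame({
        "group": ["A"] * 60 + ["B"] * 40,
        "pred":  [1] * 36 + [0] * 24 + [1] * 18 + [0] * 22,
    })
    representation_report(demo, "group", "pred")
```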