Trade-offs of AI Programming Assistant Use in Assessments
Updated by Shayna Pittman
- Allow optional use: Candidates who choose not to use GPT-4 (or haven't acquired that skill) will be at a very large disadvantage. To quantify against your 3 current scenarios: it's a HUGE advantage on one (Prorating Subscriptions, where copy/paste into GPT-4 plus minor cleanup gets farther than any of your current team got and than >90% of unassisted candidates get), a significant advantage on another (architecture debugging: HUGE on the communication, a medium advantage on the brainstorming), and a small advantage on the third (Reviewing Audit Settings). Note, though, that GPT-3.5, Bard, and Anthropic's Claude all perform 1/2 to 1 seniority tier worse than GPT-4.
- Require and expect use: We could implement a combination of increased scope/complexity and shortened timeboxes, putting candidates on a level playing field. The downside is that candidates who aren't yet good with these tools won't advance, even if they're otherwise stellar engineers. This option seems to be where more companies will move as AI assistance shifts from a nice-to-have to a must-have skill in hiring.
- Medium term: Once AI assistant skills are a must-have, we have ideas (and would love your ideas!) for new scenarios that specifically evaluate those skills (e.g. starting with subtly-broken code like the sketch after this list, leaning into more code review, etc.)
- Ban use (current status): The upside is an apples-to-apples comparison and a high-signal measure of relevant problem-solving skills that has a track record of success with the team. Downsides include getting no signal on candidates' skill with AI assistants, and potential candidate confusion about your posture towards those tools on the job (though I believe we could greatly mitigate this by reviewing the communication going out to candidates).
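
To make the "subtly-broken code" idea concrete, here's a minimal hypothetical sketch (in Python; illustrative only, not one of the actual scenarios): a subscription-proration helper with a plausible-looking off-by-one that a candidate, with or without an AI assistant, would need to spot and fix.

```python
from calendar import monthrange
from datetime import date

def prorated_charge(monthly_price: float, start: date) -> float:
    """Charge for the remainder of the month, starting on `start` (inclusive)."""
    days_in_month = monthrange(start.year, start.month)[1]
    # Subtle bug (flagged here only for illustration): this excludes the start
    # day itself, so every prorated charge comes up one day short.
    remaining_days = days_in_month - start.day
    return monthly_price * remaining_days / days_in_month

# Starting on Jan 31 should still bill one day (~0.97), but this returns 0.0.
print(prorated_charge(30.0, date(2024, 1, 31)))
```

A scenario seeded like this rewards exactly the skills we'd want to measure: reading code critically, writing a test that exposes the boundary case, and checking an assistant's suggested fix rather than accepting it blindly.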