CISA’s chief data officer: Bias in AI models won’t be the same for every agency

Monitoring and logging are critical for agencies as they assess datasets, though “bias-free data might be a place we don’t get to,” the federal cyber agency’s CDO says.

As chief data officer for the Cybersecurity and Infrastructure Security Agency, Preston Werntz has made it his business to understand bias in the datasets that fuel artificial intelligence systems. With a dozen AI use cases listed in CISA’s inventory and more on the way, one data-related realization stands out.

“Bias means different things for different agencies,” Werntz said during a virtual agency event Tuesday. Bias that “deals with people and rights” will be relevant for many agencies, he added, but for CISA, the questions become: “Did I collect data from a number of large federal agencies versus a small federal agency [and] did I collect a lot of data in one critical infrastructure sector versus in another?”

Internal gut checks of this kind are likely to become increasingly important for chief data officers across the federal government. CDO Council callouts in President Joe Biden’s AI executive order cover everything from the hiring of data scientists to the development of guidelines for performing security reviews.

For Werntz, those added AI-related responsibilities come with an acknowledgment that “bias-free data might be a place we don’t get to,” making it all the more important for CISA to “have that conversation with the vendors internally about … where that bias is.”

“I might have a large dataset that I think is enough to train a model,” Werntz said. “But if I realize that data is skewed in some way and there’s some bias … I might have to go out and get other datasets that help fill in some of the gaps.”

Given the high-profile nature of agency AI use cases — and critiques that inventories are not fully comprehensive or accurate — Werntz said there’s an expectation of additional scrutiny on data asset purchases and AI procurement. As CISA acquires more data to train AI models, that will have to be “tracked properly” in the agency’s inventory so IT officials “know which models have been trained by which data assets.” 

Adopting “data best practices and fundamentals” and monitoring for model drift and other potentially problematic AI behaviors are also top of mind for Werntz, who emphasized the importance of performance and security logging. That comes back to having an awareness of AI models’ “data lineage,” especially as data is “handed off between systems.”

Beyond CISA’s walls, Werntz said he’s focused on sharing lessons learned with other agencies, especially when it comes to how they acquire, consume, deploy and maintain AI tools. He’s also keeping an eye out for technologies that will support data-specific efforts, including those involving tagging, categorization and lineage.

“There’s a lot of onus on humans to do this kind of work,” he said. “I think there’s a lot of AI technologies that can help us with the volume of data we’ve got.” CISA wants “to be better about open data,” Werntz added, making more of it available to security researchers and the general public. 

The agency also wants its workforce to be trained on commercial generative AI tools, with some guardrails in place. As AI “becomes more prolific,” Werntz said, internal trainings are all about “changing the culture” at CISA to instill more comfort in working with the technology.

“We want to adopt this. We want to embrace this,” Werntz said. “We just need to make sure we do it in a secure, smart way where we’re not introducing privacy and safety and ethical kinds of concerns.” 
