The most important digital currency in ecommerce is data. As personal information is collected and shared, risks to individual privacy climb, and UNC Charlotte computer scientist Liyue Fan is researching novel ways to better protect individuals’ personal data.
Through a five-year National Science Foundation Faculty Early Career Development (CAREER) Award, Fan is investigating data-sharing algorithms that can be applied to digital platforms to better safeguard personal privacy. Fan, an assistant professor of computer science in the College of Computing and Informatics, says data sharing is important to society because it gives researchers a more holistic view of an overall population. “It will lead to unleashing the true power of artificial intelligence,” she said.
Because data currently is highly decentralized, it is fragmented across local and regional systems, such as those in hospitals, banks and schools. For this reason, privacy-protecting data sharing is important to integrating these decentralized data.
“Privacy is a fundamental human right,” said Fan. “While certain organizations and governmental agencies may be subject to privacy laws and regulations, individuals are still concerned that personal information can be leaked or misused.”
Protecting individual privacy is challenging. In 2017, personal information for 147 million people was exposed in an Equifax breach, and in 2019, survey results from the Pew Research Center indicated a majority of Americans “think their personal data is now less secure, that data collection poses more risks than benefits and believe it is not possible to go through daily life without being tracked.”
Fan confirms these concerns are justified. “As fine-grained personal data is continuously collected, traditional privacy solutions may fail to provide adequate privacy protection.”
“My research explores the tradeoffs between privacy and utility,” she explained. “I am seeking new ways to provide rigorous privacy guarantees without compromising the usefulness of the data.”
Data privacy: risks versus rewards
Retailers use data to build customer profiles for targeted advertising, often sharing or selling information to others. You’ve experienced this reality when, after you search the web for a product or service, it begins to show up in your social media feeds.
In some cases, sharing personal information digitally enhances users’ experiences and leads to a benefit, such as coupons, better services or exclusive offers. The rewards, however, are not without risks. Exposed location data could lead to a person being stalked or a home being burglarized. Exposed health data could lead to increased insurance premiums or service denials.
“Data sharing enables a wide range of beneficial applications and research studies. However, we want to protect the privacy of the individuals who contribute data, so we are developing a set of algorithmic tools to balance data privacy with information usefulness,” said Fan. She and her students have shown that privacy-enhanced location traces can be shared to support mental health applications while mitigating privacy risks, such as the disclosure of identity and locations.
Building trust for data sharing
On a positive note, citizen participation in research data collection can have benefits. For example, wearable devices are becoming more prevalent in health care settings. Collecting behavioral lifestyle data could result in better outcomes related to physical or mental health and performance.
In addition, high-quality representative data can enable researchers to develop solutions that go beyond a one-size-fits-all approach, for example in precision medicine.
Data shared by underrepresented populations is especially valuable, said Fan. “Research has shown providing individuals with privacy protection helps build trust and increases data sharing.”
Fan’s goal, through her research, is to bridge the gap between the two types of data collectors: trusted and untrusted. Traditional privacy-protecting data sharing solutions assume there is a trusted data aggregator. For example, the U.S. Census Bureau collects raw data from respondents and performs aggregation and privacy protection.
Her focus is on untrusted data collectors, such as websites and mobile applications. When individuals do not trust the data collector, it is challenging to balance individual privacy and information usefulness.
Practical use of Fan’s research
In developing computer algorithms to run on individual data records, Fan focuses on enabling privacy mechanisms to add protection with mathematical rigor.
She is employing specialized, optimized randomization techniques based upon the application’s utility. She works with a provable privacy notion called differential privacy, which provides rigorous protection by introducing randomness in the computation. Recently, differential privacy has been deployed at Google and Apple.
“By leveraging the domain application’s utility, we can find ways to optimize the design of differential privacy algorithms,” Fan stated.
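For readers curious how randomness can protect an individual answer while keeping the aggregate useful, the following is a minimal Python sketch of randomized response, a textbook differential privacy technique suited to untrusted collectors. It is not Fan's algorithm. The function names, the privacy parameter epsilon and the simulated 30% "yes" rate are assumptions made for illustration only.

```python
import math
import random


def randomized_response(truth: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true answer with probability e^eps / (e^eps + 1); otherwise flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return truth if rng.random() < p_truth else not truth


def estimate_fraction(reports: list[bool], epsilon: float) -> float:
    """Debias the noisy reports to estimate the true fraction of 'yes' answers."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = f * p + (1 - f) * (1 - p); solve for f.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so the demo is repeatable
    truths = [i < 3000 for i in range(10000)]  # hypothetical data: 30% answer "yes"
    reports = [randomized_response(t, 1.0, rng) for t in truths]
    print(round(estimate_fraction(reports, 1.0), 3))
```

Each person randomizes an answer on his or her own device, so the collector never sees the raw value, yet the population-level estimate stays close to the truth. A smaller epsilon means stronger privacy and a noisier estimate, which is the privacy-utility tradeoff described above.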
In Fan’s work with geospatial data, her algorithm protects the actual user location by randomly reporting plausible locations in its proximity. Because the actual location stays hidden, the approach offers privacy protection while still sharing data the collector would find useful.
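One well-known way to report a random nearby location is the planar Laplace mechanism from the geo-indistinguishability literature, sketched below in Python. This is a swapped-in illustration, not the algorithm Fan developed: it adds random noise rather than choosing among plausible places. The function names, the meters-per-degree constant, the epsilon value and the coordinates are assumptions for the example.

```python
import math
import random

METERS_PER_DEGREE_LAT = 111_320.0  # rough average; an assumption for this sketch


def sample_offset_m(epsilon: float, rng: random.Random) -> tuple[float, float]:
    """Draw an (east, north) offset in meters from a planar Laplace distribution."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # The radial density eps^2 * r * exp(-eps * r) is Gamma(shape=2, scale=1/eps).
    radius = rng.gammavariate(2.0, 1.0 / epsilon)
    return radius * math.cos(angle), radius * math.sin(angle)


def perturb_location(lat: float, lon: float, epsilon: float,
                     rng: random.Random) -> tuple[float, float]:
    """Return a noisy (lat, lon); uses a flat-earth approximation, poor near the poles."""
    east_m, north_m = sample_offset_m(epsilon, rng)
    new_lat = lat + north_m / METERS_PER_DEGREE_LAT
    new_lon = lon + east_m / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return new_lat, new_lon


if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative coordinates; epsilon is per meter, so the mean offset is 2 / 0.01 = 200 m.
    print(perturb_location(35.3071, -80.7352, 0.01, rng))
```

Here epsilon is expressed per meter, so a smaller value pushes the reported point farther from the true one on average. The collector still learns the general neighborhood, which may be enough for many applications.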
“My research can benefit a wide range of application domains that rely upon individual contributed data, from geospatial to health care to social and behavioral science,” stated Fan. “I collaborate with experts in these domains to address specific privacy and utility needs.”
Fan’s NSF CAREER Award totals $574,877. These grants are awarded to only about 500 faculty each year across all disciplines nationally and support early-career faculty who have the potential to serve as academic role models and leaders in research and education. UNC Charlotte faculty have received 13 CAREER Awards in the past five years.