A researcher from Queen’s University Belfast has developed an innovative new algorithm that will help make artificial intelligence (AI) fairer and less biased when processing data.
Companies often use AI technologies to sift through huge amounts of data in situations such as an oversubscribed job vacancy or in policing when there is a large volume of CCTV data linked to a crime.
AI sorts through the data, grouping it into a manageable number of clusters, which are groups of data with common characteristics. It is then much easier for an organization to review each cluster manually and either shortlist or reject the entire group.
However, while AI can save time, the process is often biased in terms of race, gender, age, religion and country of origin.
Dr Deepak Padmanabhan from Queen’s has been leading an international project, working with experts at the Indian Institute of Technology Madras (Savitha Abraham and Sowmya Sundaram), to tackle the discrimination problem within clustering algorithms.
A researcher in the School of Electronics, Electrical Engineering and Computer Science and the Institute of Electronics, Communications and Information Technology at Queen’s, Dr Padmanabhan explains: “AI techniques for data processing, known as clustering algorithms, are often criticized as being biased in terms of ‘sensitive attributes’ such as race, gender, age, religion and country of origin. It is important that AI techniques are fair while aiding shortlisting decisions, to ensure that they are not discriminatory on such attributes.”
It has been reported that job applications with white-sounding names received 50 per cent more call-backs than those with black-sounding names. Studies also suggest that call-back rates tend to fall substantially for workers in their 40s and beyond. Another discriminatory trend is the ‘motherhood penalty’, where working mothers are disadvantaged in the job market while working fathers do better, in what is known as the ‘fatherhood bonus’.
Dr Padmanabhan says: “When a company is faced with a process that involves lots of data, it is impossible to manually sift through this. Clustering is a common process to use in processes such as recruitment where there are thousands of applications submitted. While this may cut back on time in terms of sifting through large numbers of applications, there is a big catch. It is often observed that this clustering process exacerbates workplace discrimination by producing clusters that are highly skewed.”
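To make the idea of a "highly skewed" cluster concrete, the short Python sketch below compares how a sensitive attribute is distributed inside each cluster with how it is distributed across the whole dataset. It is only an illustration and is not the FairKM algorithm: the function name `attribute_skew`, the toy applicant records and the cluster labels are all hypothetical, and the skew measure (largest gap in proportions) is one simple choice among many.

```python
from collections import Counter


def attribute_skew(records, labels, attribute):
    """For each cluster, return the largest gap between the cluster's share
    of any attribute value and that value's share in the whole dataset.

    Illustrative measure only (an assumption, not taken from FairKM).
    0.0 means the cluster mirrors the dataset; larger means more skew.
    """
    overall = Counter(r[attribute] for r in records)
    total = len(records)

    # Group the attribute values by the cluster each record was assigned to.
    clusters = {}
    for record, label in zip(records, labels):
        clusters.setdefault(label, []).append(record[attribute])

    skew = {}
    for label, values in clusters.items():
        counts = Counter(values)
        skew[label] = max(
            abs(counts[v] / len(values) - overall[v] / total)
            for v in overall
        )
    return skew


# Hypothetical toy data: six applicants and the cluster each was placed in.
applicants = [
    {"gender": "F", "age_band": "under40"},
    {"gender": "F", "age_band": "40plus"},
    {"gender": "M", "age_band": "under40"},
    {"gender": "M", "age_band": "under40"},
    {"gender": "F", "age_band": "40plus"},
    {"gender": "M", "age_band": "40plus"},
]
cluster_labels = [0, 0, 1, 1, 0, 1]

# Check several sensitive attributes, not just one.
for attr in ("gender", "age_band"):
    print(attr, attribute_skew(applicants, cluster_labels, attr))
```

In this toy example each cluster contains applicants of only one gender, so rejecting a whole cluster would reject every applicant of that gender, even though gender was never used as an explicit criterion.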
Over the last few years, ‘fair clustering’ techniques have been developed, but these prevent bias in only a single chosen attribute, such as gender. Dr Padmanabhan has now created a method that, for the first time, can achieve fairness across many attributes at once.
Dr Padmanabhan comments: “Our fair clustering algorithm, called FairKM, can be invoked with any number of specified sensitive attributes, leading to a much fairer process.
“In a way, FairKM takes a significant step towards algorithms assuming the role of ensuring fairness in shortlisting, especially in terms of human resources. With a fairer process in place, the selection committees can focus on other core job-related criteria.
“FairKM can be applied across a number of data scenarios where AI is being used to aid decision making, such as pro-active policing for crime prevention and detection of suspicious activities. This, we believe, marks a significant step forward towards building fair machine learning algorithms that can deal with the demands of our modern democratic society.”
Savitha Abraham, researcher at the Indian Institute of Technology Madras, commented: “Fairness in AI techniques is of significance in developing countries such as India. These countries experience drastic social and economic disparities and these are reflected in the data.
“Employing AI techniques directly on raw data results in biased insights, which influence public policy and this could amplify existing disparities. The uptake of fairer AI methods is critical, especially in the public sector, when it comes to such scenarios.”
The research, conducted at Queen’s University’s Computer Science building, will be presented in Copenhagen in April 2020 at EDBT 2020, a conference renowned for data science research.