Big Data, Big Problem, Big Bias

Published: Feb. 24, 2021, 8:28 p.m.

Data can do, and has done, great things to improve life for people everywhere. That's only accelerated in the digital age (yes, data existed before computers; we just gathered it with our senses) as we've been able to gather more data, faster, from more people. Learning how to sort and analyze that data quickly has also been a game-changer in forming policy and developing new products. We've been able to better target where a problem may lie in a company, which roads need repair, how to distribute medicine more effectively, and the list goes on.

Of course, as with so many things, there is a downside. We've often looked at how companies profit off the data you generate for them, sometimes without your consent to gather it in the first place. And even if you have consented to the gathering, you might not appreciate some of the ways that data gets used; many people would withdraw that consent if they knew how it was being utilized. However, there is another downside to the way data is currently handled that we haven't discussed nearly as much: bias in how the data gets sorted, and the fact that it gets sorted into certain categories at all.

Governments, companies, and other organizations often sort data into categories based on race, cultural background, income, and shopping habits. What do you notice about all of those? They are all attributes of people. Yes, it is often useful to classify and sort information into categories. Yet aren't people more than the sum of a few superficial attributes? Aren't people more than their race? More than their paycheck? TARTLE would like to think so.

What are some examples? Some universities sort applicants into these kinds of categories and then run them through a set of predictive algorithms to determine who they should and shouldn't admit. So you have a kid from the inner city: low income, no father, a couple of petty robberies on his rap sheet. The algorithm rejects him. It's easy to see why. Yet what if this kid is eager to turn his life around and do better, to get out of a crappy cycle? What if all he needs is a chance? The algorithm won't catch that. It doesn't care that the kid is a human being and not a collection of attributes.
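To make that concrete, here is a minimal, hypothetical sketch of this kind of attribute-based screen. Every name, weight, and threshold below is invented for illustration; no real university's model is being described. The point is structural: whatever the model can't see, it can't weigh.

```python
# Hypothetical sketch of attribute-based screening. All names, weights,
# and thresholds are invented for illustration only.

from dataclasses import dataclass

@dataclass
class Applicant:
    household_income: int      # annual, USD
    neighborhood_risk: float   # 0.0 (low) to 1.0 (high), from some vendor score
    prior_offenses: int        # a raw count, stripped of all context

def admit(applicant: Applicant) -> bool:
    """Score an applicant on a few coarse attributes and apply a cutoff."""
    score = 0.0
    score += 1.0 if applicant.household_income > 50_000 else 0.0
    score -= applicant.neighborhood_risk
    score -= 0.5 * applicant.prior_offenses
    return score > 0.0

# The kid from the example: low income, high-risk zip code, two petty offenses.
kid = Applicant(household_income=18_000, neighborhood_risk=0.9, prior_offenses=2)
print(admit(kid))  # False -- his motivation to change isn't a feature,
                   # so no value of it could ever flip the decision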

Another example: at least one town used predictive analytics to determine who in the area was likely to be a criminal. That led to a lot of harassment when police officers took the information their algorithm spit out and started trying too hard to catch those people in a crime. In addition to the obvious injustice of being treated like a criminal before ever committing a crime, that also meant resources weren't getting directed where they needed to be. A number of crimes might have been prevented if the police weren't focused on the people their computers said were likely to be criminals. Not to mention that by repeatedly agitating some people, you might actually create a couple of criminals when they lash out.

This has, of course, infected the corporate world as well. Some companies grade their employees' productivity based in part on how much digital interaction they engage in. People at these companies can be rated as productive simply because they send a lot of emails and participate in the company's group chat. Of course, the emails could be a series of memes saved on your phone, and the group chat could be about your new car or any number of silly, irrelevant things that have absolutely nothing to do with productivity. It's possible this is the worst metric ever.
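Here is a tiny sketch of what such a metric amounts to, with invented weights and message counts, assuming the simplest version: activity volume standing in for output.

```python
# Hypothetical "digital interaction" productivity metric.
# Weights and message counts are invented for illustration.

def productivity_score(emails_sent: int, chat_messages: int) -> float:
    """Naive metric: message volume stands in for actual output."""
    return emails_sent + 0.5 * chat_messages

# Two invented employees: one ships work quietly, one posts memes all day.
quiet_engineer = productivity_score(emails_sent=4, chat_messages=10)    # 9.0
meme_poster    = productivity_score(emails_sent=60, chat_messages=200)  # 160.0

# The metric rates the meme poster nearly 18x "more productive" --
# it counts messages, not outcomes, which is exactly the problem.
print(quiet_engineer, meme_poster)
```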

All of these examples point to a central and significant problem, a problem that pervades Big Data: forgetting that behind all of those data points is a person, a person who probably will not fit perfectly into the box an algorithm tries to shove them into. What is the solution? How can we gather and analyze data without losing sight of the people behind it? By going to the people themselves. By getting to know them, asking them questions, and learning what their goals really are, instead of letting algorithms decide that for everyone. That is the mission of TARTLE: to get organizations to go to the source of the data, to go to you, so they can get real information, information that will actually contribute to understanding what is really going on in the world. Something no algorithm will ever be able to do.

What's your data worth? www.tartle.co
