Data Ethics

... sometimes machine learning models can go wrong. They can have bugs. They can be presented with data that they haven't seen before and behave in ways we don't expect. Or ... they can be used for something that we would much prefer they were never, ever used for.

... no one really agrees on what right and wrong are, whether they exist, how to spot them, which people are good and which bad, or pretty much anything else.

If anything, this is a call to humility, self-examination, and thoughtful dialog. Though we are increasingly living in a polarized world where one is judged by what particular slogans they choose, what party they belong to, who they follow on Facebook, and so on, we have the choice not to be such human beings. But in my experience that is easier said than done. It's too easy to shout at, rather than talk with, the "other" side, because in blissful ignorance we can continue believing we got it right without ever being challenged. It's hard to really reason out our world views and argue with those who don't agree. If we did, we might find we have more in common than imagined and make further progress as a society in figuring these things out ... things like right and wrong, good and evil, justice and injustice, and how we can get along with each other despite our differences.

The point of this chapter is simple: the goal of ML isn't to find the model with the lowest loss ... it is to build a model that drives the right kind of actions.


Recourse and accountability

In a complex system, it is easy for no one person to feel responsible for outcomes.

As deep learning practitioners, we have better insight than most into what kinds of actions will be taken as a result of our model's outputs. Therefore, if we care about people in general, we'll care about those outcomes as much as our model's validation loss.


Feedback Loops

Feedback loops can occur when your model is controlling the next round of data you get.

... an algorithm can interact with its environment to create a feedback loop, making predictions that reinforce actions taken in the real world, which lead to predictions even more pronounced in the same direction.

Part of the problem here is the centrality of metrics in driving a financially important system.
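
To make the loop concrete, here is a toy simulation (everything in it - the content categories, click probabilities, and counts - is invented) of a recommender that serves whatever it has already seen clicked, so its own recommendations generate its next round of training data:

```python
import random

# Toy simulation of a recommendation feedback loop (all names and numbers invented).
# The "model" recommends content in proportion to the clicks it has already seen,
# and its own recommendations produce the next round of click data.
clicks = {"moderate": 50, "extreme": 50}          # start roughly balanced
click_prob = {"moderate": 0.10, "extreme": 0.12}  # "extreme" content is slightly stickier

random.seed(0)
for day in range(30):
    weights = [clicks[k] for k in clicks]         # policy: recommend what got clicked before
    for _ in range(1000):                         # 1,000 recommendations per day
        item = random.choices(list(clicks), weights=weights)[0]
        if random.random() < click_prob[item]:
            clicks[item] += 1                     # the model's choice becomes its next training signal

print(clicks)  # the slight engagement edge compounds: "extreme" steadily pulls ahead
```

Nothing about the data is "wrong" here; optimizing a single engagement metric is what drives the divergence.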

See the "Meetup" example on p.105

Once people join a single conspiracy-minded [Facebook] group, they are algorithmically routed to a plethora of others. Join an anti-vaccine group, and your suggestions will include anti-GMO, chemtrail watch, flat Earther (yes, really), and "curing cancer naturally" groups. Rather than pulling a user out of the rabbit hole, the recommendation engine pushes them farther in.

FYI, I think most social media has a net-negative effect on us as human beings. In particular, I try to avoid Facebook, Instagram, TikTok, and Snapchat while doing my best to limit my only social media account, a Twitter account, to things relevant to data science and public health (and that ain't easy).


Bias

There are different kinds of "data ethics" bias; here are four types:

Historical Bias

... comes from the fact that people are biased, processes are biased, and society is biased. [It] is a fundamental, structural issue with the first step of data generation process and can exist even given perfect sampling and feature selection.

Any dataset involving humans can have this kind of bias: medical data, sales data, housing data, political data, and so on.

Important: Maybe the best way to understand historical bias in your dataset is by spending time looking at both the outcomes and how they might be used.
Important: Make sure your data is representative of what your model will see, and evaluate any automatic "labeling" features in your system (see the gorillas example on pp.107-108).

So what this showed is that the developers failed to utilize datasets containing enough darker faces, or test their product with darker faces.

A good reminder that your model will only be as good as the data you trained it on! Sound familiar?

... the vast majority of AI researchers and developers are young white men. Most projects that we have seen do most user testing using friends and families of the immediate product development group. Given this, the kinds of problems we just discussed should not be surprising.

I think, at the very least, we need to be as forthright about our dataset as we are about model performance. That way, expectations can be managed and a confidence level assigned to the results - perhaps with a threshold that could trigger human intervention.
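
As a concrete version of that last idea, here is a minimal sketch of a confidence threshold that triggers human intervention; the threshold value and the names are my own placeholders, not anything from the book:

```python
# Hypothetical sketch of a confidence threshold that triggers human intervention.
# The 0.9 threshold and the field names are made-up placeholders.
CONFIDENCE_THRESHOLD = 0.9

def route_prediction(label: str, confidence: float) -> dict:
    """Decide whether a prediction is acted on automatically or sent to a person."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return {"action": "automatic", "label": label, "confidence": confidence}
    # Below the threshold, don't trust the model on its own.
    return {"action": "human_review", "label": label, "confidence": confidence}

print(route_prediction("approved", 0.97))  # handled automatically
print(route_prediction("approved", 0.62))  # routed to a human reviewer
```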

Measurement bias

... occurs when our models make mistakes because we are measuring the wrong thing, or measuring it the wrong way, or incorporating that measurement into the model inappropriately.

To me this is perhaps the most insidious bias, because I think it's the hardest to figure out.

Aggregation bias

... occurs when models do not aggregate data in a way that incorporates all of the appropriate factors, or when a model does not include the necessary interaction terms, nonlinearities, or so forth.

These are features that are not included though they would actually improve model performance if they were.
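
As a small illustration, assuming scikit-learn is available, here is one way to add a missing interaction term explicitly (the feature values are invented):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical example: two features whose combination matters
# (say, a treatment indicator and a subgroup indicator).
X = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [3.0, 0.0],
              [4.0, 1.0]])

# interaction_only=True adds the x1*x2 column without x1**2 or x2**2,
# giving a linear model a chance to capture the subgroup-specific effect.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_with_interactions = poly.fit_transform(X)
print(X_with_interactions)  # columns: x1, x2, x1*x2
```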

Representation bias

When there is a clear, easy-to-see underlying relationship, a simple model will often assume that this relationship holds all the time.

Essentially, models can see a real imbalance in the data and make it out to be bigger than it is.


Disinformation

It is not necessarily about getting someone to believe something false, but rather often used to sow disharmony and uncertainty, and to get people to give up on seeking the truth. Receiving conflicting accounts can lead people to assume that they can never know whom or what to trust.

Disinformation will unfortunately be one of the greatest legacies of President Trump. A step backwards for American society. A culture that will back you if you tell them what they want to hear, even if you're a compulsive liar and base your statements on "gut feel" rather than facts and logic.

While most of us like to think of ourselves as independent-minded, in reality we evolved to be influenced by others in our in-group, and in opposition to those in our out-group. Online discussions can influence our viewpoints, or alter the range of what we consider acceptable viewpoints. Humans are social animals, and as social animals, we are extremely influenced by the people around us. Increasingly, radicalization occurs in online environments; so influence is coming from people in the virtual space of online forums and social networks.

The biggest takeaway here is that I am not as independent-minded as I think I am. Knowing thyself is perhaps the best preventative against being swallowed up by disinformation. Limiting social media is another.

Disinformation through autogenerated text is a particularly significant issue.

As an NLP guy, this one scares me since part of my work is to summarize text. Knowing this, the first step I've taken is to make sure all business owners know the risk of text-generation algorithms producing text that is false and/or not reflective of the inputs, as in the case of abstractive summarization. The second step was to introduce human beings into the process, with a workflow that has them review at least the most potentially wrong summarizations before reports go out (sketched below).
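
Here is a minimal sketch of that second step. The word-overlap heuristic and the 0.6 threshold are placeholders I made up for illustration - the real check could be anything that flags summaries poorly supported by their source:

```python
# Hypothetical sketch: flag generated summaries whose content is poorly supported
# by the source document so a human reviews them before reports go out.
# The word-overlap heuristic and the 0.6 threshold are invented placeholders.
def support_score(source: str, summary: str) -> float:
    """Fraction of summary words that also appear in the source text."""
    source_words = set(source.lower().split())
    summary_words = summary.lower().split()
    if not summary_words:
        return 0.0
    return sum(w in source_words for w in summary_words) / len(summary_words)

def needs_human_review(source: str, summary: str, threshold: float = 0.6) -> bool:
    return support_score(source, summary) < threshold

doc = "Sales rose 4 percent in March driven by strong demand in the northeast region."
print(needs_human_review(doc, "Sales rose 4 percent in March"))            # False
print(needs_human_review(doc, "Sales fell sharply after a factory fire"))  # True
```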


What to do???

You must assume that any personal data that Facebook or Android keeps are data that governments around the world will try to get or that thieves will try to steal.

Data use and storage are things you need to think about.

I think these are good questions to ask/answer in any project to ensure good outcomes:

  • Whose interests, desires, skills, experiences, and values have we simply assumed rather than actually consulted?
  • Who are all the stakeholders who will be directly affected by our product? How have their interests been protected? How do we know what their interests really are - have we asked?
  • Which groups and individuals will be indirectly affected in significant ways?
  • Who might use this product that we didn't expect to use it, or for purposes we didn't initially intend?

See pp.119-120 for a bunch of good questions to put into your practice!

When everybody on a team has similar backgrounds, they are likely to have similar blind spots around ethical risks.

... first come up with a process, definition, set of questions etc., which is designed to resolve a problem. Then try to come up with an example in which the apparent solution results in a proposal that no one would consider acceptable. This can then lead to further refinement of the solution.

Thinking about all these things may lead one to analysis paralysis or, even worse, complete apathy. We need to start with something and be okay with criticism and refactoring. Additionally, we need to be thoughtful even when our criticism of others' systems is spot on. I don't think most folks set out to make something racist or misogynistic or whatever, so instead of calling them a "Hitler" on Twitter when we see something that looks to us like fascism, maybe a phone call and a one-on-one chat is the better and more productive move.


Resources

  1. https://book.fast.ai - The book's website; it's updated regularly with new content and recommendations on everything from which GPUs to use to how to run things locally and on the cloud, etc.

  2. https://forums.fast.ai/c/data-ethics/47 - Forum subcategory for all things "data ethics".