Media and Machine Learning

--

A recap and analysis of the 2017 Media and Machine Learning Conference at Bloomberg by Kristina Pedersen

On Tuesday, April 25th, NYC Media Lab hosted a daylong conference on Media and Machine Learning at Bloomberg. There were ten panels over the course of the day in which experts, editors, and executives from both media and machine learning institutions discussed topics including current practices in automating media, the state of the art in machine learning, the dark side of machine learning, the role of platforms and bots, experience and interface design, and how journalism as a practice is evolving in an age of automated media.

I want to preface some of the thoughts and ideas discussed at the conference with a bit of an abridged and unauthorized history of efficiency as a singular aesthetic in the free market’s dogma of Progress. (I wrote an essay about this aesthetic as it relates to contemporary ‘experience products.’ The essay, ‘The Experience Economy is for Likers,’ will be published in the ‘Speed’ issue of Under the Influence Magazine in May.) We, as a society, decided to collectively believe in scientific progress, and that this progress, above all else, should be our greatest pursuit. We act like our faith is not a faith at all but rather that any narrative synthesized (by humans) from data or statistics is somehow more ‘real’ than any other human-synthesized narrative, as if an average or a correlation is a capital-T Truth and not just an average or correlation. We fetishize narrative because we experience time in a spatial way, giving preference to linear thinking. But no narrative is more ‘real’ than any other. Trusting data to make a decision is like trusting astrology to make a decision: as long as you believe in its rules, it’s right most of the time. Heidegger speaks of these limitations in terms of ontological access: if we are not questioning the limitations of our access as matter in observing other matter, then scientific progress is as faith-based as any religious practice. To make a long story long, here’s a sketch of the relationship between efficiency and the free market as described in my essay (skip it if you like; the full essay will be back for blood soon):

“…like the homely and prudent children of God, we keep our heads bowed as we blindly pursue efficiency, the paradoxical nirvana of Progress. This internalised, faith-like pursuit eventually collapses in on itself without ever reaching its crux. Because pursuing progress — though an assiduously righteous pursuit — is pursuing the end of progress. Because pursuing absolute efficiency is pursuing nonexistence.

Our pursuit of efficiency begins when the individual discovers itself. Freshly washed and dried in the values of a new Free Market, expressing individuality through economic self-sufficiency is now both his liberty and his mandate. Before this, the Middle Ages were still a time of ‘target incomes’: ‘the typical reaction to economic good times was to take more days off’ (Graeber 498) and it wasn’t until the development of the market economy that people started seeing themselves as ‘isolated beings who defined their relations with the world not in terms of social relations but in terms of property rights’ (Graeber 497). Those who owned factories or ships, and not necessarily those of any given birth, came to wield power.

The promise of Progress comforted individuals in an unprecedented way: that an individual’s newfound freedom to generate his own ‘human capital’ through hard work alone (systemic boosters and barriers be damned) would allow him to move fluidly across classes and accumulate private property (power). However, while scientific progress (the invention of antiseptics mid 19th century alone) eliminated not all but most premature and infection-related death, enlightenment values both eliminated and further encouraged exponentially greater efforts of enhanced survival: ‘the basic problem of survival — though solved — is solved in such a way that it is not disposed of, but rather forever cropping up again at a higher level’ (Debord 40).

Efficiency as a pursuit in and of itself is pursuing survival if being the most efficient is the highest held value of competing bodies. An incorporated body of individuals is surely more efficient than an individual, thus corporate enterprises eventually came to have privilege above the individual in public policy (and constitutionally exercise many of the same rights). ‘The destruction of an ever-increasing number of smaller enterprises for the sake of growth of ever larger corporations was an economic necessity that one might regret but that one had to accept as if it were the outcome of natural law’ (Fromm 8). In an article about the epic unravelling of antitrust laws for The Washington Post, Eleanor M Fox quotes John Sherman, author of the Sherman Antitrust Act, in 1890: ‘If we will not tolerate a monarch, we will not tolerate a king of trade.’ But during the Reagan and Carter presidencies, infamous for their deregulation, antitrust became ‘pro-efficiency’ and not ‘anti-bigness’. Efficiency and Progress, colonised by competing states in a newly global economy, put the individual back in her place: not as an isolated single actor competing amongst other single actors, but as an isolated actor serving a larger state in its competition for economic dominance through greater efficiency.”

The pursuit of efficiency is, as it has come to exist in the market, morally bankrupt. And I think, in a general way, we all already see that — though surely none of us want to be caught dead denying the possibilities and opportunities of machine learning and artificial intelligence. The speakers at the Media and Machine Learning conference were, for the most part, self-proclaimed optimists themselves. And these are all extremely smart people creating mind-boggling machines that will directly affect (and are already affecting) the automation of media, so you have to hope that they are at least a little bit convinced that they are making the world ‘better’. Here are some of the most interesting things we discussed.

Good enough.

The conference began with an opening address from Bloomberg’s Editor-in-Chief John Micklethwait. He spoke with unapologetic enthusiasm about the possibilities of machine translation and the automated generation of news articles in multiple languages. His address posed the question that set the tone for the entire discussion of automated media: what is going to be good enough for the consumer? Not what is going to be ‘great’ or even what is going to be ‘better,’ but what is going to be good enough? He substantiated this question with the anecdote that a machine had generated and translated six times more articles than a human translator could produce in a day (with, say, one human overseer per six automated articles, as opposed to one journalist generating or translating every article). Though natural language processing algorithms get smarter and better the more they are used, the product still doesn’t come near human levels of accuracy: Micklethwait specifically noted that the system had mistranslated ‘ties’ as ‘multiple neckties’ instead of ‘connections.’ He then went on to say that, in an ideal situation for himself, this would be good enough for consumers. It is true, verb conjugation in the Romance languages is pretty arduous and almost stupid and definitely inefficient, but how can we so blatantly miss how the celebration of ‘good enough’ rings dissonant with the current obsession with ‘Truth’ in media? The issue, to me, isn’t with job creation or destruction (we address this later). Rather, it seems disingenuous to think that ‘Truth’ is simply a matter of facts versus lies and that language has nothing to do with either. I think we get ourselves into dangerous waters when we assume an absolute truth or lack of bias in automation, or even that automated media is the diametric opposite of editorial media.
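To make the ‘ties’ failure concrete, here is a toy sketch of why an ambiguous word goes wrong (this is purely hypothetical and not a description of Bloomberg’s system, which was not detailed at the conference): a word-for-word lookup has no basis for choosing between senses, while even a crude use of surrounding context prefers ‘connections’ over ‘neckties.’

```python
# Hypothetical illustration of word-sense ambiguity in translation.
# The sense inventory and cue words below are invented for the example.

SENSES = {
    "ties": {
        "neckties": {"suit", "shirt", "collar", "wear"},
        "connections": {"economic", "diplomatic", "political", "countries"},
    }
}

def pick_sense(word, context_words):
    """Pick the sense whose cue words overlap most with the sentence context."""
    candidates = SENSES.get(word)
    if not candidates:
        return word
    return max(candidates, key=lambda s: len(candidates[s] & set(context_words)))

sentence = "the two countries strengthened economic ties".split()
print(pick_sense("ties", sentence))  # -> 'connections', not 'neckties'
```

A lookup with no context (an empty `context_words`) would have no way to break the tie, which is roughly the situation the mistranslation anecdote describes.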

Indeed, Natural Language Processing (NLP) was a big topic in the first few panels of the conference. Hilary Mason, Founder of Fast Forward Labs, had a funny example of a company she worked with that completes all its government compliance filing automatically: it is machines reading the regulations, machines making filings re: the regulations, and it’s ultimately machines processing (‘reading’) the filings. No human ever looked at the regulations or the paperwork until TBD. Machines are speaking to each other yet we are trying to teach them human language. They don’t need human language, we need them to need human language. In the media world, machines generate news articles with information (data) they understand as machines amongst machines, and they are then tasked with generating an ‘article’ made of ‘language’ so that we humans can understand the information (in our symbol-reading/interpreting-based way of learning), information that machines not only know but simply are. Machines don’t think algorithms: they don’t think about things and come to conclusions. They simply are algorithms: they are at once the data and the conclusion. (And yes, today machines are generating automated news articles).

We don’t actually know how machines ‘think’.

We don’t even fully know how humans think, but at least humans can try to explain themselves, however futile the attempt. Machine learning is the end of reason. As I mentioned earlier, we, as humans, understand things and think in a very linear way. We understand everything through how we understand time (which, as Heidegger postulates, is really based on spatial experience, e.g. the sun, orbits, the universe, etc.). But machines and computations aren’t part of the physical universe. Machines don’t ‘understand’ ‘time’ — they only process it as a measurement. They don’t Experience. Thus machines do not at all think in the linear, reason-based way that we do. (Again, we know so little about our own brains to begin with!) It is possible that, besides simple computational decision-making, machines could have a kind of associative way of ‘thinking,’ because many algorithms are built around a network metaphor. Data is key, and how do you generate data? One way is through connectivity and engagement. (If you provoke more engagement, you can create more data and make your algorithm stronger and more efficient. But the provocation, the manipulation, is key. This raises the question: what kind of data is provoked data [data created through provocation/manipulation]? Is the computation — the decision — then a representation of an alternative, manipulated reality and not necessarily any given ‘true’ reality?)

But none of that may even be true. We don’t know how machines learn and think, and they can’t exactly explain it to us, which is the real problem. When a doctor gives a diagnosis, she can explain how she came to that diagnosis — she can explain her thinking — and when an investor makes a decision, he too can explain the ‘reason’ behind his decision, the pattern of thinking that brought him to his conclusion, from beginning to end. But when a computer makes a decision, it can’t explain why because 1) it doesn’t even know what why is (a machine doesn’t Experience, and meaning comes from experience) and 2) it doesn’t naturally have a method of communication (call it a language, call it a visual aid, whatever) to express its thinking; it only has DATA + ALGORITHM + what you program it to know or do. It can only explain itself through its very being — which is a symbol of at once 1) itself (its algorithmic essence), 2) its decision, and 3) its reason why. This is why NLP is such a huge field.
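To make the DATA + ALGORITHM point concrete, here is a minimal, hypothetical sketch (the feature names, weights, and threshold are all invented, not taken from any real system): the model can report its decision and dump its parameters, but it has nothing resembling a doctor’s chain of reasoning to offer.

```python
# A minimal sketch of the explainability gap: the model's 'reasoning' is
# nothing but learned numbers. All names and values here are hypothetical.

weights = {"age": 0.8, "income": -1.2, "num_accounts": 0.3}  # learned, not reasoned
bias = 0.1

def decide(applicant):
    """Score the applicant and return a decision, with no accompanying 'why'."""
    score = bias + sum(weights[k] * applicant[k] for k in weights)
    return "approve" if score > 0 else "deny"

applicant = {"age": 0.4, "income": 0.9, "num_accounts": 0.2}
print(decide(applicant))  # the decision
print(weights)            # the only 'explanation' the model can offer: its parameters
```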

Ethics.

The ethical rendezvous here is that algorithms are not unbiased, yet they can offer no ethically based solutions. We all remember Microsoft’s bot Tay (RIP) (see: ‘Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day’ in The Verge). If bias goes in, then bias comes out. The same is true for the algorithms of platforms (which prompted the nationwide discussion of bias bubbles), but platforms have no liability for the content published on them, or else it would be impossible for them to allow free expression. If an algorithm is policing and censoring, the question of course becomes ‘who owns that algorithm?’ and then ‘who voted them into power?’ When a user verbally abuses a bot like Alexa or Siri, in a sexual way or otherwise, is it a bot’s place to tell a human what they can and can’t do? Or possibly what they should and shouldn’t do? The obvious answer (today) is: of course not. But then we are left with a giant moral black hole. Progress has no answer for what you should or should not do; it can only offer data that represents averages of what is or is not, empirically speaking, and some institutions choose to create narratives around this information that interpret, synthesize, and suggest what you should or should not do.
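A minimal, invented sketch of ‘bias in, bias out’ (the groups, labels, and data below are made up purely for illustration): the ‘learning’ step does nothing malicious, it just memorizes the skew already present in its training labels, and that skew comes straight back out as the decision rule.

```python
# Toy illustration of 'bias in, bias out': the model mirrors whatever skew
# is in its training labels. All data here is invented.
from collections import Counter

# Hypothetical historical decisions, skewed against group 'b' for no stated reason.
training = [("a", "hire"), ("a", "hire"), ("a", "reject"),
            ("b", "reject"), ("b", "reject"), ("b", "hire")]

def fit(examples):
    """'Learn' by memorizing the majority label seen for each group."""
    by_group = {}
    for group, label in examples:
        by_group.setdefault(group, Counter())[label] += 1
    return {g: counts.most_common(1)[0][0] for g, counts in by_group.items()}

model = fit(training)
print(model)  # {'a': 'hire', 'b': 'reject'} -- the input skew, returned as output
```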

For example, climate change has been empirically demonstrated. But the fact that we should do anything about it at all is based on the assumption that either 1) it’s important for human beings to continue existing, or 2) it’s important that the earth continue existing, or 3) we — and everything around us — matter at all, or 4) other. The imperative is man-made; the data is data. So in terms of algorithms, where data from human behavior is biased by each user’s manifestation of bias, Francesco Marconi, manager of strategy and development at the Associated Press, said, “At the end of the day, it’s a human decision behind all automated results.” The point of crisis becomes coping with the reality that algorithmic results and entities are biased, and thus their policing of bias would also contain bias.

Labor.

Throughout the panels, there weren’t a ton of great conversations about job destruction and creation, though when discussing the theoretical possibilities of absolute automation the subject can’t really be avoided. It was disappointing (but not surprising) that the few thoughts shared by panelists on the matter were in no way innovative and revolved around the fetishization of wage labor: they basically said that we will just train people who used to work in factories to program robots. This kind of thinking misses the ENTIRE POINT of automation, which is to liberate humans from labor, or at least to alleviate it. Purposefully, I think, it also categorically ignores the digital labor performed by ‘consumers’ (who should really be called users or prosumers) that makes the algorithms run and learn in the first place.

Users built the Google and Facebook algorithmic empires, and they did it for free. And Facebook is not a landlord: they don’t rent space to advertisers; they sell audiences. Their business model is built on users as products. So not only are these entities running on users’ data, but they are doubly profiting off this data. What, then, are the possibilities for a user to own their own data? (And by ‘data’ I don’t just mean the social objects they create and post — images, the copy of their statuses, etc. — but also the data generated by their engagement and user behavior.) Sure, let the platforms keep using the data to feed the algorithms — to power their machines — because nothing in life is free. But should we continue accepting that this ‘work’ we do is also being exploited? Imagine an LLC-esque operating system that captures all the data residue of your digital labor, and in which you can opt in to having it all be ‘for sale.’ This would be an interesting step towards universal income. Platforms wouldn’t need the advertising-sponsored business model anymore if we had a data opt-in model. (There are a lot of assumptions I’m making in this largely theoretical scenario, but I think the possibility is something worth thinking about.)
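Here is a very rough sketch of what that opt-in could look like in code (the class name, fields, and behavior are all hypothetical, meant only to make the thought experiment concrete): every trace of digital labor is recorded against the user, and none of it is exposed for sale unless the user has explicitly opted in.

```python
# Hypothetical sketch of an opt-in data ledger; not an existing system.
from dataclasses import dataclass, field

@dataclass
class DataLedger:
    user_id: str
    opted_in: bool = False
    events: list = field(default_factory=list)  # the 'residue' of digital labor

    def record(self, event: dict):
        """Capture one trace of user activity."""
        self.events.append(event)

    def for_sale(self):
        """Expose the data only if the user has explicitly opted in."""
        return self.events if self.opted_in else []

ledger = DataLedger(user_id="u123")
ledger.record({"type": "click", "target": "article/42"})
print(ledger.for_sale())  # [] -- nothing leaves without consent
ledger.opted_in = True
print(ledger.for_sale())  # the user's labor, now explicitly on the market
```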

Finally, the day ended on a very interesting note. Lev Manovich, new media theorist and professor of Computer Science at the Graduate Center, CUNY, posed a paralyzing question to the final panel. We had spent all day talking about how we can innovate and ‘fix’ various aspects of machine learning within media and within news, but how, Dr. Manovich asked, “can we fix the news?” No one, not Jason Kint (CEO of Digital Content Next), Mark Hansen (Director of the Brown Institute for Media Innovation at Columbia), Jared Sandberg (Senior Executive Director of Bloomberg Digital), nor Monika Jedrzejowksa (Privacy Counselor at Hearst) had an answer to this question. All day we had been ignoring the fact that the news, the media, and their associated institutions are for-profit businesses and enterprises, fundamentally biased and driven by their mandate to make money. I’m not being naive; I don’t think making money is inherently evil. But to have left (and to continue leaving) this point of crisis out of the conversation as we rapidly move forward with general machine learning and automated media is disingenuous and betrays any grand pursuit of Progress for humanity we are all pretending exists.
