Video media and monkey brains
Let’s run a quick thought experiment. If you travelled back in time 30,000 years (when storytelling first emerged1) and played Riverdale season 3 for your ancestors, what would they think? If you said “they would literally have no idea what was going on” then I would emphatically agree - the plot of Riverdale season 3 was largely nonsense.
However if you relayed to them a story that was coherent and understandable given the nuances of their time, evolutionary psychology would argue that their reaction would largely match yours.
This is fairly insane to think about. Thousands of years ago natural selection shaped the human brain to respond to certain stimuli in a certain way, and thousands of years later your brain is still responding in the same way but now compelling you to binge watch Bridgerton. Now I don’t know about you but I spend an insane amount of time watching videos on the internet. And you know what - I lied - I actually know about you too: in 2020 the average adult spent 29 hours per week watching videos2. If we’re assuming you pace yourself (which we both know you don’t) that’s over 4 hours per day.
It seems then that these evolutionary underpinnings are defining a rather large chunk of our lives. So it’s surprising to me that the question “why are human beings obsessed with video?” is relatively unexplored in most of the analyses I’ve read about video media companies like Netflix, YouTube and TikTok. What buttons in our monkey brains are these companies pushing and what can we learn from this? As always we can turn to evolution for a deeply dissatisfying and complicated answer. I’ll first break down why we react to video in the ways we do, and then we can apply these nuggets of wisdom to the wider video media industry at large.
Reason 1: You find learning fun, whether you admit it or not
So let’s return to your ancestors. In many ways their world would have been so wildly different to ours it would’ve been hard to grasp that you were related at all. To survive the men would go out to hunt and the women to scavenge for berries. At the end of a hard day they would return to their cave and presumably discuss how awful their lives were. However there are some behaviours that would have been surprisingly familiar - most notably wanting to have fun through play and storytelling.
These traits reveal an inclination of natural selection: to favour living things who enjoy simulations3. Let’s think about ‘play’. As a species we make games out of pretty much anything as children, have conjured a wide variety of sports again seemingly just for fun, and are now being increasingly assaulted online by the burgeoning digital game industry.
Animals who find play gratifying are advantaged because they are incentivised to explore, learn and practice skills in safe simulated environments. If you play fight with one of the other kids when you’re young, you acquire skills you can use in a real fight when you’re older. ‘Play’ is a common practice we see even in many non-human animals.
My 5 year old self would shudder to hear this, but in many ways play has actually always been about learning.
The evolution of storytelling has a very similar…… story. Let’s imagine you were in the position of a caveman. If you need to learn what a wolf is and how to beat one through trial and error then, well, you’re going to have a bad time. But if you can safely learn this ahead of the encounter, you’re probably still going to have a bad time, but your chances of survival are much better. Stories allowed humans to learn information in safe simulated environments without needing to rely on first-hand experience4. And when you apply this to every possible thing that you’d otherwise need to learn first-hand - where the best berries are, that touching fire is bad, that Gronk has syphilis - the benefits really compound. So historically it paid to be really into certain stories, again for the same reason as play: because you could learn from them.
Notably our ancestors evolved to find these simulations intrinsically enjoyable, without needing even to be conscious of the fact they were learning something. This was necessary because children in particular would not have had the capacity to understand nor care about the importance of the inherent lessons within5. Plus it’s important to note that such a predisposition was useful for survival specifically for the environment at the time, and you can see this with the dominant stories of the time centred around things like human behaviour, birth & death, topography and animal characteristics & behaviours6.
So now human beings still like simulations! Especially the simulations which teach things that were historically useful for survival. However this is where Ted Sarandos’ luck ran out. As the makers of Pokemon Go well know, despite initial enjoyment a lot of human beings will quickly get bored of a simulation and move on to the next thing. It is the simulations which employ novelty that are generally best at holding an individual’s attention. Novelty triggers curiosity, another evolutionary adaptation that drives us to learn about our environment. To the human brain novelty shouts ‘You there boy! Here is something unknown and mysterious! Pay heed lest it literally be the death of you!’.
We should note though that novelty is not all powerful. By definition simulations need to represent something, or else they lose all real world value. Accordingly a person couldn’t shout nonsense (even the most novel nonsense you had ever heard) and expect to have the rest of the tribe gather round eagerly to listen (though the early 21st century seems to be disproving this theory).
Our desire to investigate novelty is also balanced by the ‘mere exposure effect’, which dictates that repeated exposure to a stimulus increases liking for that stimulus (more than 200 published studies have confirmed this effect)7. While a desire to investigate the novel and a preference for the familiar might seem paradoxical, it can be reconciled when we realise there is such a thing as ‘too novel’. For example a Scandinavian study showed that whether humans find something interesting depends on the presence of both novel and familiar elements8. This makes some evolutionary sense - while novelty evoking curiosity for the purposes of exploration is advantageous for survival, you obviously don’t want animals absolutely yeeting themselves into the complete unknown.
And so rests the first gift of natural selection upon which the video media industry sits. Natural selection favours beings who enjoy simulations because this preps said beings for the real world. The simulations which are best at holding attention are those which teach things historically useful for survival (like human behaviour) and which provoke just enough (and not too much) curiosity through novelty.
Reason 2: You live under the tyranny of sex & emotion
‘Aesthetics’ shockingly were not an obsession solely born of the 2010 residents of Tumblr, but have long been admired by all humans. The evolutionary psychology behind this was one of the trickier things to tease out, so I’m going to first talk about the ‘aesthetics’ that are clearly about sex and then talk about the ‘aesthetics’ that are maybe about sex but also maybe not.
This is an obvious one but as humans sex is incredibly captivating to us, and for good reason. As a slight tangent I sometimes find it insane that while the porn industry is the absolute behemoth it is, it’s rarely talked about in tech circles given society threatens to make social pariahs of those who invite such impolite conversation. Well lucky for you I have very little social standing to begin with, so for me this is by and large an empty threat. Take that circles!
Sex is the evolutionary psychology darling - having sex with the right people is what should create offspring that go on to survive and also have sex with the right people, such that your genes might continue on ad infinitum. Natural selection hence favours creatures who enjoy sex (given this incentivises reproduction), and who are attracted to people based on certain signals. Features which are typically attractive will generally signal fitness - for example facial symmetry, glossy hair, a clear complexion9. There’s a reason all influencers are starting to look the same.
There are also things we find attractive which can be more unique to us as individuals - and some of that is related to finding genetic diversity attractive. For example a man’s scent can signal to a woman if his immune system is different and would be complementary to her own. This is supported by studies which show that while group consensus exists when it comes to attractiveness, there is some variance which appears to take the form of a normal distribution10.
And so it is no wonder that when we see attractive people our brain tells us to pay attention, and that it therefore pays to have a significant store of attractive people on your network. But we'll return to TikTok later.
However there are many things that we find ‘aesthetic’ that aren’t obviously about sex - for example a beautiful piece of art like Monet’s ‘Water Lilies’ or Shrek 2.
The wider evolutionary psychology community (all 3 of them) seems to have more mixed opinions on the origins of the human aesthetic sense when it doesn’t relate to other humans, but one theory is that it’s the human equivalent of the peacock’s tail11. The peacock’s tail is incredibly costly to grow and obvious to predators - and yet evolution demanded bigger and flashier (this is common knowledge, but watch the video below because it’s even bigger and flashier than you think). Supposedly this is all in service of the lascivious peahen, who was able to use the tail to discern the fittest peacock, capable of growing and maintaining such a tail despite the risks it posed. And so the argument goes that perhaps humans use certain features of the brain (for example an aesthetic sense) to demonstrate their ability to maintain the feature at an excessive cost. It is indeed striking that while non-human primates have up to 20 distinct calls, the average human knows about 60,000 words but uses only around 4,000 in daily speech. Accordingly one suggestion is that finding non-human objects ‘aesthetic’ is not so different from finding other humans ‘aesthetic’ - it is reserved for ‘objects’ which signal high fitness qualities in the creator - for example co-ordination, creativity, an ability to learn difficult skills and/or free time.
I’m not sure how convinced I am by this, mainly because of the detachment between the ‘object of aesthetic admiration’ and the creator. I like Monet’s paintings and yet don’t feel like I want to have sex with him more because of that fact. If aesthetic appreciation of non-human things is not about sex though, the next best alternative explanation seems to be that it is merely a by-product of a mix of adaptations - for example intelligence, an ability to simulate, emotionality and our constant search for connection12.
Personally I’m inclined to believe emotionality plays a significant role in aesthetic appreciation. Emotions evolved to motivate us towards outcomes that would help us survive - we are happy when we gain status, or resources, or are socially accepted (for example) and this feels good, and we are sad when the opposite happens and this feels bad13. Feeling these emotions encourages us to take and repeat the actions that make us happy and to avoid the actions that make us sad. I’ll add here this is a slight oversimplification (there are more emotions than happiness and sadness), and this isn’t always the case (many people like sad movies for example). However it’s helpful to understand that generally emotions can be a motivator for us seeking out stimuli, on the basis that certain stimuli will make us feel some type of way. For example there are studies to show bored individuals will seek more exciting stimuli whereas stressed individuals will seek more relaxing stimuli14.
Accordingly the second rule of natural selection that inclines us toward video media is an obsession with attractive people and a general ‘aesthetic’ sense which might also be about sex but I’d say is more likely to be a function of our emotionality.
Reason 3: You’re desperate for connection
I’ve always found it striking how much watching a TV show can make you feel like the characters are genuinely your friends. Many TV shows play on this ‘friends’ dynamic, centring around a group of friends who are just hanging out, but perhaps the most notable show about ‘friends’ is………… ‘New Girl’.
Given our tribal past humans evolved to desperately seek a feeling of ‘belongingness’. This is with good reason - the tribe provided a trusted party whom you could work together with for survival, and so if you didn’t ‘belong’ to a tribe you would be much more likely to die. The consequence is that the human who was constantly looking for connection in other people, and who was alarmed into action by a lack of connection, was more likely to survive.
Now thousands of years later when we watch David Dobrik’s vlog squad, our brain is still thirsty for connection, and will turn even to simulated people and search for signals that they represent an in-group to which we could be a part. There is actually a name for one-sided interactions like this: ‘parasocial relationships’15, and they’re presumably only going to grow in terms of their centrality to our lives as the internet and creator sphere continue to blow up.
The theory of how this works is two-fold. First off all of us, with the obvious exception of Piers Morgan, possess empathy. This evolved because understanding how others feel (and being emotionally responsive) is a superpower when it comes to the tribe getting along and therefore surviving. We can accordingly now empathise with pretty much anyone we see (given the right inputs) and our emotions end up entangled in what happens to them.
What we feel when we watch video media will depend on a character’s behaviours and motivations. Evolution has us apply a moral lens to evaluate the suitability of characters for the purposes of an in-group16. This creates in us either positive or negative sentiment, depending on whether the character meets our criteria, and this determines our ultimate feelings towards the character and their arc. Notice how in Game of Thrones you like the characters who obey a consistent moral code and are willing to put the needs of the group before even their own: traits you might want in a tribe member if your goal is for the tribe to function effectively and ultimately survive.
We should note that while this need to be part of the in-group affects how we relate to the people we see in video media, it also has consequences outside of the canonical context of any given story. If everyone around you is obsessed with The Queen’s Gambit, the chance of you getting into The Queen’s Gambit increases exponentially. This is because we take more of our behaviours from the people around us than we care to admit; again all in service of making sure we will be considered part of the in-group and won’t be left out in the cold.
And so Mother Natural Selection decreed that the final reason for our obsession with video media would be a desperate desire to connect, at all costs, so much so that we form relationships with even the simulated people we see and are significantly influenced by the viewing behaviours of others. The good news is next time you feel like you’re being needy don’t blame yourself: you literally just do be built like that.
Defining ‘good’ content
From understanding why our monkey brains ogle at video content the following questions emerge, which we can use to determine whether a piece of video content might be considered ‘good’:
Does it simulate an environment containing lessons which would have been useful to our ancestors, with just enough novelty to provoke curiosity?
Does it contain people I find attractive, or otherwise lend itself to aesthetic appreciation?
Does it help me to feel like I’m part of an in-group, either through relation to the characters or the audience?
This is useful because we can use these questions to assess any video media company’s propensity to generate quality content. So we no longer need to treat all content as equal, which it obviously is not. Academics everywhere can rejoice in having finally served their purpose!
The prophecy
Before we do this however, let’s take a step back and think about content in more abstract terms. Years ago in a time perhaps even less recognisable than our hunter-gatherer past, 2004, Chris Anderson published his ‘long tail’ thesis17. The thesis basically argues that with the internet, the business model for niche content would become viable given the magic of the internet could reduce both production costs (in that digital goods would be much cheaper to produce than physical goods) and distribution costs (in that it would be much cheaper to send something through the internet than to physically deliver it).
Historically of course the safest thing for video producers had been to create ‘hits’ that would appeal to the widest possible audience, given this would maximise the possible return from the high cost of a given production. However as the internet emerged, and just as the oracle Chris Anderson foretold, more niche content became viable as costs were brought down.
Production costs were made more tolerable in 2 notable cases. Firstly (and thanks to Matthew Ball for this insight) the shift from live TV time slots to on-demand streaming has meant a given piece of video content can keep making money almost ad infinitum - which makes costs much more tolerable given they can continue to be offset over time18. Prior to this when you watched a TV show that aired only at a given time, the opportunity for the content to make up for its costs was largely limited to the ads aired during that time. After it went off-air, the time for it to make money was largely up (short of reruns and other less profitable methods of distribution).
Secondly production cost per unit of content was heavily reduced in the case of the introduction of user-generated content (UGC). We can see that overall content spend is actually strikingly similar between a particular UGC platform and more ‘professional’ producers of content. YouTube paid on average $10bn a year for content over the last 3 years19. This makes them the second highest spender when it comes to overall content spend, of all video streaming services. They are behind only Netflix, who spent an average of $15bn a year over the last 3 years20. However there is one notable difference: Netflix has the rights to ~14,000 titles21 whereas YouTube has 31 million channels22. So on a per unit of content basis, YouTube is spending a lot less.
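To put rough numbers on that per-unit gap, here is a back-of-the-envelope sketch in Python using only the figures quoted above. The function name is made up for illustration, and treating one YouTube channel and one Netflix title as comparable ‘units’ is my simplifying assumption (a channel holds many videos, a title can hold many episodes), so read the output as an order of magnitude rather than a precise figure.

```python
# Figures quoted in the text (annual averages over the last 3 years).
NETFLIX_SPEND_USD = 15e9       # ~$15bn a year on content
NETFLIX_TITLES = 14_000        # ~14,000 titles
YOUTUBE_SPEND_USD = 10e9       # ~$10bn a year on content
YOUTUBE_CHANNELS = 31_000_000  # 31 million channels


def spend_per_unit(total_spend_usd, units):
    """Hypothetical helper: average annual content spend per unit of content."""
    return total_spend_usd / units


netflix_per_title = spend_per_unit(NETFLIX_SPEND_USD, NETFLIX_TITLES)
youtube_per_channel = spend_per_unit(YOUTUBE_SPEND_USD, YOUTUBE_CHANNELS)
ratio = netflix_per_title / youtube_per_channel

print(f"Netflix: ${netflix_per_title:,.0f} per title per year")      # roughly $1.07m
print(f"YouTube: ${youtube_per_channel:,.0f} per channel per year")  # roughly $323
print(f"Netflix spends about {ratio:,.0f}x more per unit")           # roughly 3,300x
```

Even if you quibble with what counts as a unit, a gap of three orders of magnitude is hard to argue away.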
Distribution also experienced benefits which led to more tolerable costs. The greatest enemy of both humanity and as it turns out video media companies, having a finite amount of time, ceased to be an issue. Previously finite airtime restricted how many shows could be aired at once, but with ‘on demand access’ the amount of content that could be created, deployed and ultimately consumed by audiences became uncapped at little additional cost. This solved the finite time problem (for the companies, humanity is still waiting). This was of course essential for the emergence of UGC platforms, who were already prone to generating significantly more content across a broader spectrum of categories compared to the standard producer-driven model.
This splintering of what was possible over time has led to the variety of existing video platforms taking different approaches:
Now let’s return to our trifecta of variables that determine ‘good’ content to evaluate these strategies.
The joy of learning in video today
As a reminder the ‘joy of learning’ argument is that we evolved to intrinsically enjoy the type of simulations which thousands of years ago taught lessons that were useful for survival at the time. So before an internet villain shouts me down under the assumption that I’m arguing ‘To All The Boys I’ve Loved Before’ contains learnings essential for survival today, please note this is in fact not what I’m saying. Our brain is just continuing to chase the same high through pattern matching. The simulations which do this while deploying novelty effectively are the most effective at maintaining the attention of the human brain.
So the next question here is: are there differences between how the content platforms play with this concept? With professionally produced video we mostly see producers (the good ones anyway) deliberately deploy novelty into a ‘simulation’, causing a variety of consequences to unfold within the simulated world. This a) evokes immediate curiosity, and b) offers our monkey brains a model of the world which promises to reveal how things work, leading to intrinsic enjoyment. Such novelty can be deployed literally (e.g. through dragons or aliens), or it can be more subtle and come in the form of general ‘unknowns’ like mysterious character motivations and behaviours.
In The Walking Dead the zombie outbreak creates a brand new world which provokes immediate curiosity in the viewer. What are ‘zombies’? How much of a threat do they pose? Is there a cure? But zombies are just the tip of the iceberg - the main role they play is to create an environment through which deeper unknowns can be answered. They create situations which promise to reveal to the audience the dictates of human behaviour itself. The show tells us that people will prioritise their own survival and the survival of other members of their in-group. It also makes an argument about the world: namely that it is structured around mental and physical strength, the possession of resources and ultimately immediate term needs. From an evolutionary standpoint you can see how finding stories like this enjoyable would have been advantageous in the past: they offer a theorised model of our social environment and the possible consequences of a wide variety of behaviours given a certain set of circumstances. We enjoy the stories that offer us an explanation of how ‘life’ works.
I find it interesting how the content of UGC platforms is distinctly different compared to traditional TV & film. A 2020 study found that the most popular content on TikTok lies in the verticals of comedy, musical performances, beauty and DIY23. By comparison a slightly older (but still relevant) study found that on YouTube ‘People & Blogs’ channels were 74% of new channel creation24. One explanation for the difference we see here is this is the result of democratising the ability to be a creator in the video marketplace. The subjects represent both what creators want (and are able) to create and what audiences demand. Now let’s look at an example.
This of course feels distinctly different from your traditional long form narrative arc. What is the lesson for the monkey brain here? You may be surprised to hear that I think the lessons that make many UGC videos inherently enjoyable are the same as what we see within The Walking Dead: through these sorts of simulation our brains can continue to evaluate and learn about the behaviour of our fellow humans. Aside from being funny, this TikTok tells you that people run on autopilot and make mistakes which can lead to unexpected outcomes. It also tells you that people find it funny when a person’s behaviour is different from what you might expect. As someone who needs to deal with people regularly to survive, this is useful information.
There are however 2 key differences with UGC’s deployment of novelty to hold the individual's attention that I think are worth calling out. The first point which all UGC platforms benefit from is that what is considered novel versus familiar is different per culture at the very least, and per individual at the most. Personalisation supercharges a UGC platform’s ability to find the content that best meets these criteria for a given individual.
Let’s compare this to the high risk endeavour for producer-driven video platforms: to find a plot that ‘works’. ‘Pilots’ are often used to pitch shows to television networks, and this involves test audiences and decisions by experienced executives. However even with this mechanic you've invested a lot of time and money before you find out whether the plot resonates. Of course when it works it works - but a significant amount of film & TV fails in its ability to stay gratifying to the viewer.
By contrast UGC platforms create a pressure cooker in which content that is very low risk to generate (requiring less time and money) is continuously tested with potential audiences to see if it meets the novelty / familiarity threshold for any group of individuals. Eugene Wei describes the ‘For You’ page on TikTok as applying ‘[evolutionary] selection pressure’ to memes. Continuous remixing of media (encouraged by TikTok through features like ‘Duet’, which splices videos together side by side) means new memes are quickly born which inherit traits of the source meme but also mutate themselves25. The algorithm then unleashes a battle royale of new and evolved memes to sort the winners from the losers. Manually testing pilots with severely condensed test audiences seems almost primitive by comparison.
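For the curious, the ‘selection pressure’ idea can be sketched as a toy simulation in Python. To be clear, this is not how TikTok’s algorithm works (I have no idea how it works). Every name here is hypothetical, and the assumption that appeal peaks at a middling level of novelty is borrowed from the novel/familiar argument earlier, not from any data.

```python
import random


def appeal(novelty, sweet_spot=0.5):
    """Toy inverted-U: appeal is highest when novelty sits near the sweet spot."""
    return 1.0 - abs(novelty - sweet_spot)


def remix(novelty, rng, step=0.1):
    """A 'Duet'-style child: inherits the parent's novelty plus a small mutation."""
    return min(1.0, max(0.0, novelty + rng.uniform(-step, step)))


def run_feed(population, generations, rng):
    """Each round every meme gets remixed, then only the most appealing survive."""
    for _ in range(generations):
        children = [remix(meme, rng) for meme in population]
        candidates = population + children
        candidates.sort(key=appeal, reverse=True)
        population = candidates[: len(population)]
    return population


def mean_appeal(population):
    return sum(appeal(meme) for meme in population) / len(population)


rng = random.Random(0)  # fixed seed so the run is repeatable
start = [rng.random() for _ in range(20)]
end = run_feed(start, generations=10, rng=rng)

print(f"mean appeal before: {mean_appeal(start):.2f}")
print(f"mean appeal after:  {mean_appeal(end):.2f}")
```

Because parents compete alongside their remixes, average appeal can only hold steady or climb each round. A pilot tested on one room of people gets one round.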
Secondly TikTok in particular has taken balancing the familiar and the novel almost as a literal instruction. One of the key principles is that videos by different creators literally recycle the same sounds and actions, using the existing context and expectations of the viewer (who is likely to have seen the pattern before) but deploying a small amount of novelty. The novelty can range from a new person doing the exact same song & dance, through to a complete subversion of the expectations set up by the original context.
One of the things that makes this novel/familiar combo so powerful is the ‘mere exposure effect’, which as discussed earlier is the effect which makes us like stimuli the more we are exposed to said stimuli. There are some very pertinent points about how this effect works - it’s strongest when it’s subliminal and the familiarisation duration is short and repeated26. Sound familiar? In this way TikTok and its creators anchor us in what we know and like before hitting us with a bit of novelty to ensure we stay interested.
Sex & emotion in video today
People like looking at attractive people, you didn’t need a thinkboi Substack post to tell you that. This is of course true regardless of platform or medium.
Producer-driven video usually brings in attractive people by virtue of pure access. Hollywood and other such ‘woods’ have access to pretty much the top 1% of attractive people: aka celebrities. However it is in Hollywood’s interest to define attraction according to group consensus, given the industry generally creates one media artefact at a time that they want to scale as much as possible. This means UGC platforms have a comparative advantage: personalisation in combination with the wide breadth of creators means they can also account for variance in what the individual finds attractive.
I would additionally suggest that UGC content offers two things which professional video does to a lesser degree when it comes to servicing feelings of attraction. Firstly it offers a voyeuristic look into the lives of those one finds attractive (almost literally - a significant number of TikToks are literally a look through the creator’s phone camera into their bedroom). When we find someone attractive we want to see and find out more about them. A UK study showed teens generally admit to voyeuristic motivations when it comes to use of social media27. Hollywood historically outsourced this to a tangential industry (the paparazzi), who existed for just long enough to inspire a hit Lady Gaga song before being disrupted by social platforms. From that point celebrities took control of voyeuristic insights into their lives (or at least their PR teams did) on social media. This is an interesting comparison because producer-driven video platforms do not benefit from voyeuristic interest in the attractive people they showcase to the same degree as UGC platforms.
Let’s suppose hypothetically that someone has an interest in Tom Holland, and they derive some pleasure in finding out more and more about him. Hypothetically they might watch a movie on Disney+ because it has Tom Holland in it, but their ability to satiate themselves is capped by the finite runtime of the movie and it offers little in the way of personal insight. They consequently might watch YouTube interviews of Tom Holland once it is over - their continued interest in the actor overflows into and is capitalised upon by another platform. If Tom Holland were a YouTube or TikTok creator, you would benefit from a) more content (given the lower standard and ease of creation), and b) a tendency for the content to err towards personal insight. So satiating voyeuristic needs when it comes to a person of interest is comparatively uncapped on UGC platforms. Hypothetically.
Plus attractive individuals on UGC platforms have social capital built up and stored in their platforms of choice and hence have more of a vested interest in those specific platforms, whereas this is less true for celebrities on producer-driven video platforms (though of course it is in some cases - a notable example would be actors who are contracted to play Marvel characters and are therefore tied to Disney). Here I of course have to call out the TikTok ban scare in the US, and the rush of creators to diversify social capital across platforms in order to derisk. This shows that their vested interests are far from absolute. However creators are subject to the same network effects of mass UGC platforms as the rest of us and this amplifies the ability of said platform to capitalise on our evolutionary interest in certain people.
The second advantage of UGC platforms is the blurring of the parasocial and social relationship specifically in the realm of attraction. As humans attraction is a motivator to engage in a multiplayer game. We are attracted to people so that we know who might benefit our genes if we mated with them. This means breaking down the parasocial into a real social relationship has a certain amount of value to us. Producer-driven video doesn’t facilitate this - I can’t comment ‘Dan Nicky your Bobbie s’ on an episode of Riverdale - however you absolutely can on YouTube and TikTok. Hence attraction can be better ‘realised’ through the intended social dimensions on UGC platforms than professional video platforms.
Outside of sex, as I’ve mentioned before I believe emotionality is a significant part of our aesthetic sense. Why do I return to ‘lo-fi hip hop radio beats to relax/study to’? Because it’s relaxing……. and great to study to. The thing about emotions is they change very frequently and this again means UGC platforms find themselves in a better place strategically. With both shorter content and a more extensive store of content to choose from, it is realistic for a UGC platform to interpret a user’s mood based on the content they choose to watch (and skip) in a given session and to quickly adjust what they watch next accordingly. This is something which is possible but much harder to do with long form content given the number of units watched per session is much lower. There is significant risk and investment required to watch a TV show or movie which may not fully match my emotional needs. This is why trailers exist, but these are very rough translations and don’t really mirror the emotional outcome you will get from watching a given piece of media. If you switch away after 10 minutes then a streaming platform has one incremental unit of data about your emotional needs. In the same amount of time TikTok or YouTube can accumulate many more units of data about what videos you liked and what you didn’t, and can adjust their recommendations accordingly.
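A quick sketch in Python of why short clips give a platform so much more to work with. The clip length, episode length and mood tags are all numbers and labels I made up for illustration, and the ‘mood inference’ is a deliberately naive tally, not a claim about how any real recommender works.

```python
from collections import Counter


def signals_in_session(session_seconds, avg_item_seconds):
    """Watch/skip decisions observed in one session (at least one: the item you started)."""
    return max(1, session_seconds // avg_item_seconds)


def infer_mood(events):
    """events: (mood_tag, watched) pairs. Returns the tag with the best net score."""
    score = Counter()
    for tag, watched in events:
        score[tag] += 1 if watched else -1
    return max(score, key=score.get)


TEN_MINUTES = 600

# Assumed averages: a 45 minute episode versus a 30 second clip.
long_form_signals = signals_in_session(TEN_MINUTES, 45 * 60)  # 1
short_form_signals = signals_in_session(TEN_MINUTES, 30)      # 20

# A hypothetical short-form session: True means watched, False means skipped.
session = [
    ("relaxing", True),
    ("exciting", False),
    ("relaxing", True),
    ("exciting", True),
]
mood = infer_mood(session)  # "relaxing"

print(long_form_signals, short_form_signals, mood)
```

Under these made-up averages the short-form platform gets 20 reads on your mood in the time the streaming service gets one.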
The one leg up producer-driven video has is the ability to command extensive resources for the purpose of creating media that feels immersive - e.g. producing a full score for a given TV show or movie, or investing in CGI and incredibly expensive cameras. There is evidence that immersion (or ‘narrative transportation’) is linked to the emotional outcome which results from a given piece of media28. Your modern amateur content creator by comparison faces significant friction in producing content which can match professional content in terms of immersion. This is partially a function of time constraints (many creators are not full time) and partially a lack of access to the right tools (which can be very expensive and/or unintuitive to the time-poor layperson). However the creator economy is one of the fastest growing buzzwords in the tech industry today, so it would seem it’s only a matter of time before video production goes the same way as professional journalism - democratised through access to the right tools. There is no doubt that the skill and imagination to generate strong emotional outcomes is not missing in user-generated content today:
It’s no coincidence that TikTok is the fastest growing social media app in history29 and is both a) powered by a democratised feed that works agnostic of a follower graph (gold dust for new up and coming creators), and b) one of the first video platforms to focus on video editing tools designed to help the creator entertain, in a way that feels fun and frictionless. Presumably we will see a burgeoning marketplace of tools that help content creators to produce better and better media. But I won’t get into the details of why and how - if you’re interested Li Jin has written about this with more clarity and detail than I ever could30.
The need for connection in video today
Lastly let’s turn to the social value inherent to video and how the various platforms capitalise on this human quirk. D'Arcy Coolican from A16Z has argued that all industries will eventually incorporate a social element given how much we crave connection as human beings31. I strongly agree with this (as a general rule I agree with anyone with an apostrophe in their name).
Professional video has made the parasocial relationship something of an art-form. Making characters with whom an audience will form a connection is one of the most effective ways to make media appealing to an audience. Professional video producers have some degree of advantage over content creators here: they can create contexts in which characters display who they are via certain actions, which are hard to recreate in user-generated video. In Game of Thrones the world is built out to be both ruthless and rife with betrayal, hence you can render a character as noble if they stick to moral principles regardless. Our monkey brains will latch on strongly to such characters given moral motivations make for good tribe members. It is hard to recreate situations in which characters can prove themselves in similar ways in short-form UGC. But this is mainly a limitation of the length of content, which isn’t impossible to circumvent. For example take the following TikTok series:
Additionally professional video platforms hold widely recognised IP which generates value on the social front in the real world. Such IP creates a strong network effect thanks to the existing fan base because, as previously mentioned, we’re inclined to take our behaviours from the people around us to fit into the in-group. If you’ve ever shouted someone down or been shouted down yourself for never having seen The Avengers you know what I’m talking about. While I’ve often thought it is insane that the likes of Netflix and Disney+ have no social features, arguably the widely regarded IP they possess is their social feature.
However UGC platforms hold some tricks up their sleeve too. I won’t mention personalisation again but suffice it to say that if you’re searching for which characters an audience member might mentally assign to their in-group, having a cornucopia of creators to choose from is an advantage. However more than that, as mentioned in the ‘sex & emotion’ section YouTube and TikTok benefit from blurring the line between the parasocial and the social relationship. When I watch a @flossybaby TikTok or a Brian Jordan Alvarez YouTube video, I can comment and receive a like or even a reply back in return. What’s important here is that, arguably, as the number of creators balloons you start to have smaller audiences per creator, at least in theory, in a world of democratic distribution. It’s not obvious that democratic distribution is the future of UGC networks, but I personally think it’ll be hard for future networks to stay appealing to new creators without a TikTok style follower agnostic feed. With smaller audiences per creator, back and forths between creators and fans should in theory become more possible (versus today where you might validly argue it is highly unlikely that most creators will give you any individual attention). In any case this sort of thing is clearly not something I can do as easily with Jake Peralta from Brooklyn 99, no matter how desperate I am for it to happen.
While IP builds a network effect for various platforms IRL, memes generated by UGC platforms have a similar effect. If you speak to a young person nowadays an emergent behaviour is bonding with others on the playground over memes (interestingly even across in-group lines). You definitely don’t want to be the person who isn’t able to get the reference, so this creates a competing network effect for these platforms IRL too.
The creation of these network effects is aided by the propensity for certain content to go viral. The ‘Share’ button is now a staple of the internet and exists equally on YouTube, TikTok, Netflix and Disney+. But it’s much lower friction to watch a recommended meme than a recommended TV show. Plus ‘sharing’ (both in terms of sending and watching a video someone has sent) is much more natural from a phone or computer than a TV screen, given this is where most of your social apps live, and this is territory which UGC dominates.
Lastly UGC platforms have been much more experimental with the social viewing experience, and probably the biggest winner here to date is the TikTok comments section. It’s incredible that Reddit pretty much captures all the social value from the TV shows I enjoy. I’ve spent an unfortunate amount of time on r/rupaulsdragrace and honestly I’m not even a particularly big fan. If you've ever watched the show you’ll understand when I say the show is designed to provoke a reaction, and as a human being desperately interested in tribal belonging I just want emotional reciprocation and to have the way I feel validated.
This is what YouTube and TikTok arguably do with the comment section - they bring the emotional reciprocation into the platform itself. Eugene Wei comments rather beautifully:
Reading the comments on TikTok serves a communal function. It's like hearing the laughter of the crowd at a comedy show. One of the existential challenges of life is truly connecting with other people's thoughts. Who can ever know that series of emotions and thoughts and dreams we call our consciousness? True human connection seems always out of grasp32.
What it all means
So what does this actually mean for the video media wars? Well like so many clamouring babies, content companies demand time and attention from us consumers, and as consumers we have finite time and attention to allocate. As Sylvia Plath wrote so potently in The Bell Jar, life continues to be but a fig tree of opportunities in which you must make your choice before the figs wrinkle and go black: the only difference in 2021 is the tree is filled with the likes of Bella Poarch, HBO’s Girls and YOU on Netflix.
However as Matthew Ball points out, while attention competition exists, the market is not winner takes all and indeed different companies need to capture different amounts of attention depending on the underlying business model33. So the fig tree analogy isn’t perfect (undoubtedly one of the greater tragedies of Sylvia Plath’s work), and it’s not the case that we will crown one of these companies the winner of the video industry Game of Thrones with the others perishing at its feet.
However all platforms need to a) acquire, and b) retain users at some cost to themselves, and we see variation in how efficiently each platform is able to do this. Across platforms deeply valued content is the best driver of acquisition, plus it has lasting value for retention given it is significantly more likely that users will return to it in the future. Comparatively average content is only useful for next period retention, because users watch it and kind of like it, but in a few months forget about it. In 5 years no one is going to revisit Ryan Murphy’s The Politician.
However producer-driven video platforms pay far more per unit of content than UGC platforms, and so they try to make content mainstream and universal to justify the cost. Comparatively, on a per unit of content basis, UGC doesn’t need to reach as wide an audience for the associated costs to be justified in terms of the potential acquisition and retention benefits, because the cost per piece of content produced is simply lower.
Of course this maybe isn’t a problem for subscription-based streaming platforms if they spend more in order to create content that is more deeply valued, right? Well as we can see from the evolutionary factors, UGC platforms actually have an edge when judged against our evolutionary criteria. UGC content:
Has a tighter feedback loop for testing videos, to find the right ‘novelty / familiarity’ threshold for a given individual
Uses repetitive short-form content to create stickiness through mere exposure effect
Better accounts for variance in attraction, given the bigger bank of people to draw from who are all incentivised to stay thanks to network effects
Promises a less one-sided relationship with the people you build parasocial relationships with, and captures value from the voyeuristic motivations that come with attraction in particular
Better accounts for emotional state, given they collect more signals on how a person feels in a single session
Fulfils our desire for emotional reciprocation after viewing videos by offering reactions in-situ, without leaving it to chance that you have someone you can talk to about what you’ve seen
Is better for virality because it’s more mobile native, which is where sharing happens
Plus the only advantages producer-driven video has look like they will soon be eroded:
Producer-driven video is better at immersion which it can use to drive strong emotional outcomes due to access to resources, however creator tools are rapidly improving which will empower UGC creators to match these outcomes more and more
Producer-driven video is long form which allows characters to be put into more complex situations where their behaviour allows us to judge and develop stronger allegiances - however this is possible in UGC too (for example through multi-part series like NPC)
Interestingly we see Netflix going for both a wide library suggested to users via personalised discovery, and a producer-driven approach. This has complications for the content it produces. As one of the first movers in the space they jumped at the opportunity presented from being free of the finite time slots of traditional broadcast television. They were allowed to produce riskier shows because it was a) ok for these shows to take some time to find an audience, and b) possible for different audiences to only be exposed to different sets of shows (at least in theory). Personalisation seemed like a superpower! Plus they could use the data on individuals they got to target the same show to different audiences, using different title cards and trailers depending on what they felt would resonate.
However we find that the producer-driven model here is limiting for Netflix. Given the cost it is in their best interest to create content that meets, and to push users towards, more ‘global’ themes - such that the same content is relevant to more people and they can get more bang for the same price. The existence of the ‘Top 10’ feature on Netflix is evidence that the dream of personalisation exists alongside and separate to the dream of maximising return on investment. This arguably ‘flattens’ the content produced to some degree. They are leaning into content that incorporates multiple niche elements that could appeal to and be targeted toward a range of audience segments, but they can’t go too deep on a particular niche lest they alienate other audiences. So they need a little bit of romance, a little bit of mystery, a little bit of action, in every piece of content and everything starts to feel a bit samey.
And they all lived happily ever after
You’ve probably been gagging for it this whole time, and here it is! I’m going to conclude with everyone’s favourite type of media: graphs. Let’s see how the video wars are playing out given all this context.
First we can see professionally produced video is maintaining a lead by watch time overall.
When looking at TV sets, the leading device for video, we also see that YouTube is the only UGC platform visibly fighting for a share of streaming minutes.
However UGC platforms are leading in terms of monthly active users on a per platform basis343536, and YouTube is the clear leader for time spent on mobile (representing 70% of time spent in the top 5 streaming apps)37. This is notable because media watch time on non-TV platforms is growing while watch time on TV is shrinking38. Plus when we dig into the data we can also see that: a) UGC video watch time has grown while professionally produced video watch time has shrunk YoY, and b) UGC video is almost neck and neck with producer-driven video on overall watch time for younger audiences39. This signals something important: we’ve seen the standard for young audiences become the new normal before.
So if I were a betting man, and I am, I would suggest in the upcoming decades the lines will become more blurred between professional and UGC video, and we’ll see UGC video become more favoured over time. Multiple things could happen here. Producer-driven video platforms could:
Open themselves up to the population of creators in an effort to expand their libraries
Build more social features (e.g. to enable more seamless sharing between viewers)
Start creating contracts with specific actors, or create features that allow audiences to reach their favourite celebrities within their platforms
Lean into more short-form content to capture more data-points per session (see Netflix’s recently announced Fast Laughs, perhaps the worst name I’ve ever heard)
Or something else could change in the industry that flips the script. For example better and cheaper CGI could alleviate the cost constraints which currently stop producer-driven video from being more personalised on a per individual basis (e.g. this could allow platforms to replace actors in any show with people who might be more attractive to the individual).
It of course feels a little unnerving that all these companies are racing to exploit our biological programming. Although on the other hand the things mentioned throughout this post are what we as humans fundamentally desire, and so perhaps it would actually be worse to not have these desires better serviced over time. I’ll leave it for you to decide - I’m going to go watch TV.