15 Mar 2025 - tsp
Last update 15 Mar 2025
24 mins
TL;DR: Is social media really fueling hate on purpose? Contrary to popular belief, there’s no solid evidence of deliberate harmful intent. Instead, algorithms amplify whatever sparks the most engagement—often divisive or emotional posts—simply because we, as users, pay more attention to them. This dynamic functions more like a mirror of society’s existing biases than a direct manipulation scheme. The real tension emerges when the rush for clicks collides with long-term user well-being, revealing that negativity thrives largely through our own online behaviors rather than calculated corporate strategy.
Social media feeds are typically curated by algorithms that prioritize engaging content – posts likely to get clicks, comments, shares, or longer watch time [1]. This engagement-driven design often leads to amplification of emotionally charged or divisive material. In other words, the algorithms don't consciously seek out “hate” or negativity, but they boost whatever sparks user reactions, and extreme or negative posts frequently do. For example, a recent experiment on Twitter (pre-“X”) found that its engagement-based ranking algorithm significantly amplified content with strong emotional and divisive cues – notably, tweets expressing out-group hostility (us-vs-them anger) were shown more in algorithmic feeds than in chronological feeds[2]. Users in this study reported that these algorithmically boosted political tweets made them feel worse about the opposing group, even though the users did not actually prefer such provocative content in their feeds[2]. This shows how the algorithm’s focus on engagement can push polarizing posts beyond what people say they want to see.
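To make the mechanism concrete, here is a minimal, purely illustrative sketch of the difference between a chronological feed and one ranked by predicted engagement. The post data, probabilities, and weights are invented and do not correspond to any platform's real model; the point is only that sorting by predicted reactions naturally favors whatever provokes the strongest response.

```python
# Illustrative sketch only: a toy engagement-ranked feed vs. a chronological feed.
# The posts, probabilities, and weights are invented for demonstration and do not
# reflect any real platform's ranking model.

from dataclasses import dataclass

@dataclass
class Post:
    author: str
    text: str
    age_hours: float
    p_click: float      # predicted probability of a click
    p_comment: float    # predicted probability of a comment
    p_share: float      # predicted probability of a share

def engagement_score(post: Post) -> float:
    # A generic weighted sum of predicted engagement signals.
    # Emotionally charged posts tend to have higher predicted reaction
    # probabilities, so they float to the top under this objective.
    return 1.0 * post.p_click + 2.0 * post.p_comment + 3.0 * post.p_share

posts = [
    Post("news_outlet", "Budget committee publishes quarterly report", 1.0, 0.02, 0.01, 0.005),
    Post("pundit",      "THEY are destroying everything we stand for!", 5.0, 0.08, 0.06, 0.040),
    Post("friend",      "Pictures from our hiking trip",                2.0, 0.05, 0.03, 0.010),
]

chronological = sorted(posts, key=lambda p: p.age_hours)          # newest first
engagement_ranked = sorted(posts, key=engagement_score, reverse=True)

print([p.author for p in chronological])      # ['news_outlet', 'friend', 'pundit']
print([p.author for p in engagement_ranked])  # ['pundit', 'friend', 'news_outlet']
```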
Multiple studies have observed that content evoking moral outrage or negative emotions spreads more virally online. A 2017 peer-reviewed study of Twitter concluded that each additional moral-emotional word in a tweet (terms that convey outrage, disgust, etc.) increased the tweet’s retweet rate by about 20% on average[3]. In practice, this means a tweet containing charged terms about a political opponent or a hot-button issue is more likely to go viral than a neutral post. Similarly, Facebook’s own engineers found that posts prompting the “angry” reaction tended to get disproportionately high reach. In 2018, Facebook’s algorithm was weighting reaction emojis more than likes – with “anger” reactions weighted five times as much as a like – in an effort to promote content that sparked interaction. The result, however, was that the most commented and reacted-to posts were often those that “made people the angriest,” favoring outrage and low-quality, toxic content [1]. Users complained about the prevalence of angry, divisive posts, and Facebook eventually dialed back the weight of the anger emoji (reducing it from 5× a like to zero by 2020) to stop over-promoting anger-inducing content [1]. This case highlights that the algorithm was amplifying negative material because of high engagement, not because Facebook deliberately wanted to spread anger – and when the effect became clear, the platform adjusted the algorithm.
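The reported weighting change can be pictured with a toy scoring function. Only the relative anger weight (five times a like in 2018, reduced to zero by 2020) follows the reporting cited above; the posts, other weights, and reaction counts below are made up.

```python
# Toy illustration of reaction-weighted scoring. Only the relative anger weights
# (5x a like in 2018, reduced to 0 by 2020) follow the public reporting; every
# other number here is invented for demonstration.

def weighted_reactions(reactions: dict, anger_weight: float) -> float:
    weights = {"like": 1.0, "love": 1.0, "haha": 1.0, "wow": 1.0,
               "sad": 1.0, "angry": anger_weight}
    return sum(weights[name] * count for name, count in reactions.items())

calm_post    = {"like": 120, "love": 30, "angry": 2}
outrage_post = {"like": 40, "angry": 60}

for label, w in [("2018-style (angry = 5x like)", 5.0), ("post-2020 (angry = 0)", 0.0)]:
    print(label,
          "| calm:", weighted_reactions(calm_post, w),
          "| outrage:", weighted_reactions(outrage_post, w))

# With the 5x weight the outrage post (40 + 5*60 = 340) outscores the calm post
# (120 + 30 + 10 = 160); with the anger weight at zero the ordering flips.
```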
Research in the misinformation realm reinforces the idea that engagement-driven algorithms inherently favor startling or extreme content. A well-known study in Science analyzed millions of tweets and found that falsehood spread “farther, faster, deeper, and more broadly than the truth” on Twitter, meaning misinformation (which often has sensational or emotionally charged narratives) spreads more widely and quickly than factual news[11]. Other analyses have noted that politically extreme sources tend to generate more user interactions than moderate or centrist sources on social media[11]. In short, posts that trigger outrage, fear, or strong emotions (including hate speech or divisive propaganda) can perform extremely well in terms of likes, shares, and comments. The algorithms, in chasing those metrics, will keep showing such posts to more users – thus amplifying their reach. This algorithmic amplification is often indistinguishable from “promoting” the content, even if promotion was not the intentional goal. Critics argue this creates a dangerous feedback loop: users react to inflammatory material, the platform algorithm boosts it to more people, provoking even more reactions.
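That feedback loop can be sketched as a simple, entirely hypothetical simulation: each round, the ranker grants a post new impressions in proportion to the engagement it earned in the previous round, so even a modest per-impression advantage compounds into a much larger total reach.

```python
# Toy feedback-loop simulation: exposure in the next round is proportional to the
# engagement earned in this round. Engagement rates, boost factor, and round counts
# are invented; this only shows how a modest per-impression advantage compounds.

def simulate_reach(engagement_rate: float, rounds: int = 8,
                   seed_impressions: int = 1000,
                   boost_per_engagement: float = 20.0) -> int:
    total = 0
    impressions = seed_impressions
    for _ in range(rounds):
        total += impressions
        engagements = impressions * engagement_rate
        # The ranker shows the post to more people because it "performed well".
        impressions = int(engagements * boost_per_engagement)
    return total

print("neutral post:", simulate_reach(engagement_rate=0.03))  # reach shrinks each round
print("outrage post:", simulate_reach(engagement_rate=0.06))  # reach grows each round
```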
However, it’s important to note that algorithmic amplification is largely agnostic to the actual sentiment or truthfulness of content – the machine learning systems typically optimize for engagement signals, not whether a post is hateful or beneficial. As one scholarly review put it, “Engagement metrics primarily promote content that fits immediate human social and affective preferences and biases rather than quality content or long-term values.”[1] If a user base tends to interact more with cute cat videos, the algorithm will amplify those; if they interact more with toxic memes, it will amplify those. Unfortunately, human psychology and attention biases mean that, in aggregate, emotionally charged negative content often rises to the top.
There is growing evidence that the content that maximizes short-term engagement is not always conducive to long-term user satisfaction or retention. In the immediate moment, people tend to pay attention to negative, shocking, or hate-filled content, which drives clicks and interaction metrics upward. But over time, a feed flooded with toxic or upsetting posts can alienate users, making them anxious, unhappy, or likely to disengage from the platform. Social media companies have themselves recognized this balance. Facebook’s 2018 overhaul of its News Feed algorithm (the “Meaningful Social Interactions” change) was explicitly aimed at improving long-term user well-being, even at the cost of some short-term engagement. Mark Zuckerberg noted that the change – which showed more posts from friends and family instead of endless viral videos – led people to spend less time on Facebook, but was done because internal research indicated it was “the right thing for people’s well-being.”[5] This suggests Facebook saw that chasing maximal viral engagement (often driven by clickbait or emotionally charged public content) was making the user experience worse in ways that could hurt the platform in the long run.
Evidence from Facebook’s own testing underscores the short-vs-long-term tradeoff. As mentioned, when the company had initially juiced the News Feed with more “provocative” content by overweighting reactions, it indeed caused a spike in interactions – but that came with a wave of user backlash and “decreases in interaction” over time as people became fed up with the divisive tone [1]. In response, Facebook had to course-correct the algorithm to deemphasize outrage. In other words, angry, negative posts drove quick engagement numbers but ultimately started to drive users away, which is not a sustainable outcome for a social network.
Academic studies mirror these observations. The Twitter experiment cited earlier revealed an interesting disconnect: content that the algorithm thought would engage people (e.g. partisan outrage tweets) did get high engagement in the short run, but users reported lower satisfaction with their feeds when exposed to that content[2]. When asked, many users said they would prefer more content that was less angering or divisive, suggesting that what grabs our attention in the moment isn’t necessarily what we want to consume continually[2]. This hints at a potential long-term effect: if people keep getting a diet of content that they find stressful or misaligned with their true preferences, they may use the platform less or feel worse about it over time.
Psychological and social well-being research backs up the idea that constant exposure to negativity can reduce user engagement and health in the long term. Studies of “digital well-being” note that heavy social media use, especially consumption of hostile or negative material, can increase anxiety, depression, and fatigue[11]. Users who feel harassed or who only see toxic discourse might eventually withdraw – effectively churning out of the platform to protect their mental health. This outcome is obviously bad for a company’s long-term user retention. Former Facebook executive Tim Kendall testified that the services he and others built ended up “tear[ing] people apart with alarming speed and intensity,” eroding shared understanding and even, in his view, pushing society toward conflict[11]. While extreme, his remarks underline that if a platform becomes synonymous with hate and hostility, it risks losing broad user trust and participation over time.
Platforms have a strong incentive to avoid that fate. Their business depends not just on one-time clicks, but on keeping users coming back day after day. If the timeline is full of anger-inducing or disturbing posts, many users eventually tune out. This is why we’ve seen moves like Facebook’s pivot to “well-being” metrics, Instagram experimenting with hiding “like” counts (to reduce pressure and negativity), or YouTube adjusting recommendations to down-rank what it calls “borderline” content. All these changes reflect an understanding that sustainable engagement requires balancing short-term attention grabs with long-term user satisfaction. As one research article observed, if algorithms were tweaked to optimize for more meaningful or reflective user preferences (instead of raw clicks), we might see “a reduction in angry, partisan, and out-group hostile content” in feeds – albeit with potential trade-offs like smaller-scale echo chambers [2]. The challenge for platforms is finding that equilibrium where users are both engaged and happy enough to stick around.
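One way to picture this equilibrium-seeking is a ranking objective that blends a short-term engagement prediction with a longer-term satisfaction signal (for example, survey feedback or “show me less of this” clicks). The sketch below is hypothetical: the blend weight alpha and all scores are invented, and it is only meant to illustrate the trade-off the studies above describe.

```python
# Hypothetical blended ranking objective: mix predicted short-term engagement with a
# predicted long-term satisfaction signal. The alpha values and post scores are
# invented; the point is only that shifting weight toward satisfaction reorders feeds.

def blended_score(p_engage: float, p_satisfy: float, alpha: float) -> float:
    """alpha = 1.0 ranks purely on engagement, alpha = 0.0 purely on satisfaction."""
    return alpha * p_engage + (1.0 - alpha) * p_satisfy

posts = {
    "partisan outrage thread": (0.30, 0.10),  # (predicted engagement, predicted satisfaction)
    "friend's holiday photos": (0.12, 0.60),
    "long-form explainer":     (0.08, 0.45),
}

for alpha in (1.0, 0.5):
    ranking = sorted(posts, key=lambda name: blended_score(*posts[name], alpha), reverse=True)
    print(f"alpha={alpha}: {ranking}")

# alpha=1.0 puts the outrage thread first; at alpha=0.5 the friend's photos win,
# mirroring the "meaningful interactions" style of re-weighting discussed above.
```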
A crucial question is whether social media companies intentionally design their algorithms to promote hate speech or extremely divisive content, or if these outcomes are unintended side effects. In public discourse, critics often accuse the platforms of knowingly pushing harmful content because “it’s profitable.” However, peer-reviewed research has not produced evidence that companies are actively manipulating algorithms specifically to spread hate speech. Instead, studies indicate that algorithms amplify engagement, whatever its source, and that user behavior is a major driver of what content circulates.
Several large-scale analyses suggest that the algorithmic effect on what people see, while significant, is often secondary to users’ own choices (whom you follow, what you click) in determining exposure to content. A Facebook-based study published in Science found that a user’s social network and personal choices influenced the content they saw far more than the News Feed ranking algorithm did [1,7]. In other words, if you only friend or follow people who share hateful or extreme posts, you will mostly see that kind of content regardless of the algorithm – and if you follow diverse, civil voices, the algorithm isn’t likely to override that by shoving hate posts at you out of the blue. Likewise, research on Google and YouTube has shown that user preferences heavily shape outcomes. One study found users clicked on more partisan news sources than a neutral Google Search algorithm would typically recommend, implying people self-select into bias beyond what the algorithm suggests[1]. And regarding YouTube, a 2021 peer-reviewed study tracked hundreds of thousands of users and concluded that the platform’s recommendation algorithm rarely drives people from mainstream videos into extremist rabbit holes. In that study, only about 1 in 100,000 viewers who began watching moderate content later progressed to far-right videos via recommendations – a vanishingly small fraction [1,8]. Most people who watched far-right or hate content on YouTube arrived there by deliberately searching for it or via external links, not because YouTube’s algorithm force-fed it to them [1,8]. And those who were prone to consuming such content often subscribed to those channels or consistently sought them out, meaning the demand was user-driven.
These findings support the view that platforms are not secretly plotting to promote hate, but their neutral goal of maximizing engagement can incidentally result in harmful content getting amplified. No known peer-reviewed study documents a case of engineers tweaking code with the express purpose “let’s show users more hate speech.” In fact, outright hate speech (e.g. racial slurs, direct harassment) violates the terms of service of major platforms and is subject to removal when detected. So, if “hate content” is being promoted, it’s usually content that skirts the line of policy — divisive, misleading, or inflammatory material that provokes reaction without using bannable language. The algorithms don’t have a built-in moral compass to down-rank something just for being socially harmful; they only see the engagement metrics. As one comprehensive review notes, current evidence “neither shows that algorithms cause echo chambers, nor that echo chambers cause polarization” on their own [1]. That is, algorithms by themselves are not proven to create hardened hate-filled filter bubbles out of neutral users. The polarization and extreme discourse we observe online arise from a mix of human tendencies, social dynamics, and algorithmic amplification – with the algorithm part being largely reactive to what engages users, rather than an aggressive agenda set by the company to promote one type of content.
It’s also worth noting that a lack of transparency makes it difficult for independent researchers to fully answer this question. Platforms closely guard their algorithmic data, citing security concerns (if PR agencies knew exactly how the algorithms work, they could and would exploit them) and competition, which means academics often have to rely on observational or limited data. A report by researchers at NYU’s Cybersecurity for Democracy project pointed out that without greater data access, we can’t definitively assess how much algorithms might favor or suppress certain content[11]. Thus, absence of evidence is not exactly evidence of absence – but the patterns we do see (engagement-driven ranking, user choice driving exposure, etc.) align more with unintended consequences than with deliberate malice. To date, the consensus in the scientific community is that algorithmic promotion of hate is an emergent phenomenon, not an intentional design – a byproduct of algorithms optimizing for engagement in a media ecosystem where outrage often wins attention[1].
Social media companies vehemently deny that they knowingly push hateful or harmful content – and they argue it’s not even in their interest to do so. In response to allegations (especially after whistleblower and media reports in 2021), Facebook’s CEO Mark Zuckerberg publicly stated that the idea the company allows or encourages “angry” content for profit is “deeply illogical.” He explained that Facebook’s revenue comes from advertising, and that “advertisers consistently tell us they don’t want their ads next to harmful or angry content”[5]. In other words, showing users a bunch of toxic posts might increase engagement metrics briefly, but it would scare away advertisers (and potentially users), undermining the business. “I don’t know any tech company that sets out to build products that make people angry or depressed,” Zuckerberg wrote, emphasizing that Facebook’s “moral, business, and product incentives all point in the opposite direction” of promoting harmful content[5].
Concrete actions by the companies back up these claims of incentive alignment. Zuckerberg pointed to the News Feed overhaul in 2018 (mentioned above) as evidence: Facebook knowingly made a change that led people to spend less time on the platform (a short-term engagement drop), because the company believed it would improve user experience and well-being in the long term[5]. This sacrifice of “time spent” for a healthier feed is not what one would expect if the company’s strategy was to ruthlessly maximize engagement at any cost. Facebook has also invested heavily in AI systems and human moderators to detect and remove hate speech (reporting routinely that the vast majority of hate content taken down was removed before users reported it, thanks to automated detection). While critics can argue about the effectiveness of these measures, the company line is that allowing toxic hate speech to flourish is bad for business and something they actively work against.
Other platforms make similar assertions. YouTube, for instance, has publicly detailed efforts to curb the spread of what it calls “borderline content” – videos that don’t explicitly break rules but come close (e.g. conspiracy theories, incendiary propaganda). In late 2019, YouTube announced it had implemented over 30 changes to its recommendation algorithm to reduce recommendations of such borderline and harmful content, resulting in a reported 70% drop in watch time from recommendations for that material[6]. This is essentially YouTube saying: we don’t want to purely optimize for watch time if it means sending users into a toxic spiral that could damage trust in the platform. It’s an example of aligning the algorithm with long-term quality metrics over short-term popularity.
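Demotion of this kind is typically described as a re-scoring step applied after a classifier flags a video as borderline. The sketch below is a generic illustration of that pattern under assumed threshold and penalty values, not a description of YouTube’s actual system.

```python
# Generic sketch of a "borderline content" demotion step in a recommender pipeline.
# The classifier scores, threshold, and demotion factor are all hypothetical; this
# does not describe YouTube's actual implementation.

def adjusted_rank_score(base_score: float, p_borderline: float,
                        threshold: float = 0.8, demotion: float = 0.3) -> float:
    # base_score: whatever the recommender would otherwise use (e.g. predicted watch time)
    # p_borderline: output of a policy classifier trained on human ratings
    if p_borderline >= threshold:
        return base_score * demotion   # still eligible, but pushed far down the list
    return base_score

candidates = [
    ("cooking tutorial",         0.70, 0.05),
    ("conspiracy 'documentary'", 0.90, 0.92),
    ("news analysis",            0.60, 0.20),
]

ranked = sorted(candidates,
                key=lambda c: adjusted_rank_score(c[1], c[2]),
                reverse=True)
print([title for title, _, _ in ranked])
# ['cooking tutorial', 'news analysis', "conspiracy 'documentary'"]
```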
Twitter (now X) has also publicly grappled with this issue. In 2018, then-CEO Jack Dorsey acknowledged problems with the platform’s incentive structures and funded research into measuring the “health” of conversations on Twitter (beyond just engagement numbers). Twitter introduced prompts like “Want to read the article first?” before retweeting links, attempting to slow down impulsive sharing of potentially inflammatory content. And in a transparency report in 2021, Twitter’s team revealed findings that its home timeline algorithm tended to amplify tweets from political right-leaning accounts more than left-leaning ones – they didn’t fully understand why, but it prompted discussion about whether the algorithm should be adjusted to avoid any unintentional bias. All these moves suggest Twitter is aware that more engagement does not always mean a better experience, and that it must continually check its algorithms to ensure they’re not encouraging the wrong things.
In summary, social media companies publicly insist they do not have a motive to promote hate or extreme negativity, and they point to various product changes and policies aimed at reducing the visibility of harmful content. They stress that their long-term success depends on user trust and comfort on the platform, not just raw engagement figures. While skeptics might question the sincerity or completeness of these claims, it is clear that at least on the record, platforms see themselves as needing to limit hateful or excessively divisive content to maintain a healthy user base and advertising ecosystem.
The debate over algorithms and harmful content has also been informed by investigative journalism, whistleblower revelations, and independent research. These sources sometimes paint a less flattering picture of platform behavior, though it’s important to distinguish between accusations and proven facts. One of the most striking disclosures came from Facebook’s internal documents (the “Facebook Papers”) leaked in 2021, which included a 2018 presentation by Facebook researchers warning that “our algorithms exploit the human brain’s attraction to divisiveness. If left unchecked, Facebook would feed users more and more divisive content in an effort to gain user attention & increase time on the platform.”[4]. This internal memo essentially confirmed that Facebook’s own data scientists observed a tendency for the News Feed algorithm to push polarizing, controversial material because it maximized engagement. The memo suggested interventions to mitigate these effects. However, according to reporting by The Wall Street Journal, Facebook’s leadership shelved or slowed many of those proposed fixes, partly out of concern that reducing the virality of divisive content would disproportionately affect conservative pages and spark political backlash[4]. This revelation has fueled public cynicism — the idea that Facebook’s growth and avoidance of controversy took priority over clamping down on harmful algorithmic tendencies.
Whistleblower testimonies have echoed these claims. Frances Haugen, a former Facebook product manager who leaked documents in 2021, testified that Facebook routinely chose “profit over safety,” alleging that when the company had opportunities to make the platform less hate-filled or angry, it often resisted if that meant lowering engagement. She pointed out, for instance, that the company disbanded its Civic Integrity team after the 2020 U.S. election and loosened certain safeguards, after which polarizing content and misinformation surged again. Haugen’s core argument was that Facebook knew its algorithm’s emphasis on engagement was contributing to the spread of hate speech and misinformation, yet was slow to change because those divisive posts kept users clicking and scrolling in the short term[5,4]. These claims grabbed headlines and led to Congressional hearings. It’s crucial to note, however, that Haugen’s assertions, while backed by internal research slides, are not the same as peer-reviewed scientific conclusions – they represent one insider’s account of corporate behavior. The company strongly disputed her characterization, as discussed earlier, pointing to its investments in content moderation and adjustments for well-being.
From a scientific perspective, many researchers caution against oversimplified narratives. While acknowledging problems, scholars often emphasize that the “algorithm made me do it” explanation for societal hate and polarization is too crude. Empirical studies on polarization have found mixed results regarding social media’s role. For example, one field experiment actually showed that when people on Twitter were exposed to more posts from their political opposites (i.e. breaking their personalized echo chamber), they became more polarized in their views, not less [1,9]. This implies that algorithms which don’t filter out opposing views (thus exposing users to content they strongly dislike) can also increase hostility. In other words, whether algorithms show you more of what you like (creating a comfortable echo chamber) or show you things you hate (which anger you), either scenario can potentially amp up division – it’s not a simple one-directional effect.
Another insight from research is that the supply of extreme content meets a demand. Extremist or hate content often originates from specific fringe communities or media sources. Studies of YouTube’s ecosystem find that surges in far-right video views around 2016-2017 correlated with high interest in those topics and a relative lack of milder alternatives for certain viewpoints[1]. In essence, if millions of users are actively seeking inflammatory content, the algorithms (unless heavily curated otherwise) will serve it to them because that’s what’s being clicked on. Some scholars like Kevin Munger and Joseph Phillips argue that the “rabbit hole” problem (users being radicalized by ever more extreme recommendations) has been overstated, and that audience preferences and social influence outside the platform play a larger role in guiding people to hateful content [1]. However, this view doesn’t exonerate the platforms – it simply suggests that solving the issue is as much about changing user behavior and demand as it is about changing the algorithm.
Given the complexity, scientific critiques often call for more transparency and data-sharing from platforms to enable deeper study. Regulators and researchers are pushing initiatives to audit algorithms for biases or harmful impacts. For instance, the EU’s Digital Services Act now requires major tech platforms to allow independent vetting of their algorithms’ risk impact, precisely because so far, much of what we “know” has come from either internal leaks or limited external studies. Only with fuller data can researchers determine, for example, if an algorithm tweak is subtly pushing hate content or if it’s purely reacting to user signals. Until then, we rely on partial evidence: a combination of peer-reviewed studies (which, so far, suggest no deliberate hate-profiteering but do highlight inadvertent amplification), plus whistleblower and journalistic reports (which suggest some knowing negligence or slow action by companies on this problem).
In examining whether social media platforms actively promote hateful content or simply amplify what users engage with, the most substantiated answer is that it’s largely the latter – amplification of engagement – with a critical caveat that this amplification can indeed flood feeds with harmful material. Peer-reviewed research paints a picture where algorithms are guided by engagement metrics and human behavior: they turbocharge content that triggers reactions (often outrage or fear), which can include hate and extremism, but they are not consciously plotting to prefer “hate speech” specifically [1]. Short-term engagement spikes from negative content are well-documented, whereas the long-term downsides – user fatigue, dissatisfaction, polarization – are increasingly coming to light, prompting both academic and internal recognition that unbridled engagement optimization isn’t sustainable [1,2].
On the other hand, public discourse and some investigative reports have accused companies of nefarious motives. It’s true that internal documents show companies were aware of the problem (that their algorithms can escalate divisive/harmful content)[4], and at times they struggled or hesitated to fully address it due to business concerns or political optics[4]. But it’s a leap from this to say platforms want to promote hate. So far, no empirical study conclusively demonstrates a deliberate corporate strategy to maximize hate speech exposure – and company officials strongly refute that idea, citing business incentives that run counter to it[5]. In fact, platforms have made high-profile changes (like Facebook’s feed overhaul, YouTube’s crackdown on borderline content) that acknowledge the issue and attempt to rein in the very engagement-driven excesses that fuel harmful content [5,6].
In summary, social media algorithms do amplify content that users (in aggregate) respond to – and unfortunately, that can mean a lot of incendiary and negative posts get amplified. Users tend to engage with shocking or emotionally charged content, and the algorithm dutifully serves it up, creating a cycle that can look like promotion of hate. However, framing it as intentional promotion by the platform oversimplifies the situation. The weight of peer-reviewed evidence suggests unintentional amplification and a misalignment between short-term engagement and long-term user well-being, rather than a grand plot to spread hate for profit. Platforms themselves claim to be moving toward models that emphasize user satisfaction and safety over raw engagement, though skeptics argue they could do more, faster. Going forward, ongoing independent research – with greater access to platform data – is needed to continue separating myth from reality and to hold platforms accountable to the effects of their algorithmic choices, whether inadvertent or not.
[1] Metzler, H., & Garcia, D. (2023). Social Drivers and Algorithmic Mechanisms on Digital Media. Perspectives on Psychological Science, 19(5), 735-748.
[2] Smitha Milli et al., Engagement, User Satisfaction, and the Amplification of Divisive Content on Social Media, 24-01 Knight First Amend. Inst. (Jan. 3, 2024)
[3] W.J. Brady, J.A. Wills, J.T. Jost, J.A. Tucker, & J.J. Van Bavel, Emotion shapes the diffusion of moralized content in social networks, Proc. Natl. Acad. Sci. U.S.A. 114 (28) 7313-7318 (2017).
[4] Zhang, S. (Facebook internal research, 2018). "Our algorithms exploit the human brain’s attraction to divisiveness..." (Facebook internal slide, as quoted in WSJ).
[5] Zuckerberg, M. (2021). Zuckerberg says claims about FB prioritising profit over safety untrue, Marketing-Interactive
[6] YouTube Official Blog (2019). Raising authoritative content and reducing borderline content, YouTube’s announcement of algorithm changes
[7] Eytan Bakshy et al., Exposure to ideologically diverse news and opinion on Facebook. Science 348, 1130-1132 (2015). DOI: 10.1126/science.aaa1160
[8] Hosseinmardi H, Ghasemian A, Clauset A, Mobius M, Rothschild DM, Watts DJ. Examining the consumption of radical content on YouTube. Proc Natl Acad Sci U S A. 2021 Aug 10;118(32):e2101967118. doi: 10.1073/pnas.2101967118. PMID: 34341121; PMCID: PMC8364190.
[9] C.A. Bail, L.P. Argyle, T.W. Brown, J.P. Bumpus, H. Chen, M.B.F. Hunzaker, J. Lee, M. Mann, F. Merhout, & A. Volfovsky, Exposure to opposing views on social media can increase political polarization, Proc. Natl. Acad. Sci. U.S.A. 115 (37) 9216-9221 (2018).
[10] Guess, Andrew & Lyons, Benjamin & Nyhan, Brendan & Reifler, Jason. (2018). Avoiding the echo chamber about echo chambers: Why selective exposure to like-minded political news is less prevalent than you think.
[11] Lauer, D. Facebook’s ethical failures are not accidental; they are part of the business model. AI Ethics 1, 395–403 (2021). https://doi.org/10.1007/s43681-021-00068-x
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)
This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/