The impact of toxic trolling comments on anti-vaccine YouTube videos | Scientific Reports – Nature.com

Prevalence of fear and toxicity in YouTube comments

Our objective was to assess the effects of trolling comments on anti-vaccine videos, specifically examining whether toxicity in comments for a video is associated with the level of fear expressed in the comments for the same video on YouTube. We first analysed the relationship between toxicity and fear at the video level. We employed a machine learning approach (Google's Perspective API and a RoBERTa-based model) to quantify fear and toxicity levels in each comment, and we computed their mean for each video (see Methods). Our results demonstrate that highly fearful and highly toxic comments constitute only a portion of all comments. As depicted in Fig. 1a, the top 20th percentile thresholds for the fear and toxicity scores are 0.03 and 0.29, respectively, on a 0 to 1 scale. These scores illustrate highly skewed distributions, with a narrow dynamic range for fear and a wide dynamic range for toxicity.
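The thresholding and per-video averaging described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the nearest-rank percentile rule, and the toy scores are all assumptions, and the scores are taken as already computed by the classifiers.

```python
from collections import defaultdict
from statistics import mean


def top_percentile_threshold(scores, top_fraction=0.2):
    """Smallest score that still belongs to the top `top_fraction` of scores.

    Uses a simple nearest-rank rule (an assumption; the paper does not
    state which percentile definition was used).
    """
    ordered = sorted(scores)
    n_top = max(1, round(len(ordered) * top_fraction))
    return ordered[len(ordered) - n_top]


def video_means(comments):
    """Average fear and toxicity per video.

    `comments` is an iterable of (video_id, fear, toxicity) tuples.
    Returns {video_id: (mean_fear, mean_toxicity)}.
    """
    grouped = defaultdict(list)
    for video_id, fear, toxicity in comments:
        grouped[video_id].append((fear, toxicity))
    return {
        vid: (mean(f for f, _ in rows), mean(t for _, t in rows))
        for vid, rows in grouped.items()
    }


# Hypothetical scores on a 0 to 1 scale.
comments = [
    ("v1", 0.01, 0.10), ("v1", 0.05, 0.50), ("v1", 0.00, 0.20),
    ("v2", 0.02, 0.05), ("v2", 0.02, 0.15),
]
fear_cut = top_percentile_threshold([c[1] for c in comments])
per_video = video_means(comments)
```

Comments scoring at or above `fear_cut` would be the "highly fearful" ones in this sketch; the same function applies to toxicity.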

We then investigated the temporal dynamics of fear and toxicity within comment sections. Figure 1b provides an example of the absence of evident temporal patterns for fear and toxicity, where highly toxic and highly fearful comments appeared sporadically and abruptly. We also analysed the temporal intermittency of such highly toxic and highly fearful comments and found a heavy-tailed distribution for the intervals between highly toxic and highly fearful comments (Fig. 1c) [30].
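As a rough illustration of the interval analysis, the sketch below collects the gaps between consecutive flagged comments and builds an empirical complementary cumulative distribution function (CCDF). The timestamps, their unit, and the helper names are hypothetical; only the 0.29 toxicity threshold comes from the text above.

```python
def intervals_between_events(timestamps, flags):
    """Gaps between consecutive flagged comments, in the unit of `timestamps`."""
    event_times = [t for t, flagged in zip(timestamps, flags) if flagged]
    return [later - earlier for earlier, later in zip(event_times, event_times[1:])]


def empirical_ccdf(values):
    """Sorted (x, P(X >= x)) pairs for the observed values."""
    ordered = sorted(values)
    n = len(ordered)
    ccdf = {}
    for i, x in enumerate(ordered):
        # At the first occurrence of x (index i), n - i values are >= x.
        ccdf.setdefault(x, (n - i) / n)
    return sorted(ccdf.items())


# Hypothetical posting times (arbitrary units) and toxicity scores.
timestamps = [0, 5, 6, 30, 31, 200]
toxicity = [0.90, 0.10, 0.80, 0.70, 0.05, 0.95]
flags = [score >= 0.29 for score in toxicity]  # top 20th percentile threshold

gaps = intervals_between_events(timestamps, flags)
ccdf = empirical_ccdf(gaps)
```

Plotting the `ccdf` pairs on log-log axes is the usual way to inspect a heavy tail: an approximately straight, slowly decaying line indicates that long gaps are far more common than an exponential distribution would predict.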

One may question whether fear and toxicity are inherently correlated within an individual comment. By analysing the scatter plot of fear and toxicity scores for each comment (Fig.1d), we found that this is not the case. To summarise the findings thus far, both toxicity and fear in comments exhibit burst-like dynamics; however, they do not have a one-to-one correspondence and are not the same signal.

Figure 1. Fear and toxicity in YouTube comments at the comment and video levels. (a) Distribution of fear and toxicity scores for each comment, with the top 20th percentile thresholds indicated by vertical lines. Both distributions exhibit strong skewness, suggesting that many comments possess low values. (b) Time series of fear and toxicity scores for the first 100 comments on a specific video. The horizontal lines correspond to the top 20th percentile thresholds from (a), and the data points surpassing these thresholds are marked. (c) A log-log plot of the complementary cumulative distribution functions (CCDFs) of the interval time between highly fearful and between highly toxic comments at the comment level, which indicates that this interval time follows a heavy-tailed distribution. (d) Scatter plot of fear and toxicity scores per comment. (e) Distribution of average fear and toxicity scores per video. (f) Scatter plot of average fear and toxicity scores per video. The red line shows the regression line. The Pearson correlation coefficient between toxicity and fear was 0.10 (p = 0.00), the coefficient of the simple linear regression was 0.06, and the distance correlation was 0.16.

We quantified the distributions of fear and toxicity and their correlation in comments at the video level. Figure 1e shows the distribution of average fear and toxicity scores for comments on each video. These distributions differ from those at the comment level (Fig. 1a) and more closely resemble a Gaussian distribution. Figure 1f shows a scatter plot of fear and toxicity scores, indicating a weak correlation between average fear and toxicity scores at the video level. Based on these findings, we decided to focus on the average fear and toxicity scores compiled across entire or partial comment sections as significant metrics [31].
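The two simplest video-level statistics reported for Fig. 1f, the Pearson correlation and the slope of a single-predictor regression, can be computed as in the sketch below. The function names and the toy per-video averages are assumptions for illustration; the distance correlation is not covered here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simple_slope(xs, ys):
    """Slope of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


# Hypothetical per-video averages (toxicity on x, fear on y).
avg_toxicity = [0.10, 0.25, 0.15, 0.30, 0.20]
avg_fear = [0.020, 0.024, 0.018, 0.026, 0.025]

r = pearson_r(avg_toxicity, avg_fear)
slope = simple_slope(avg_toxicity, avg_fear)
```

The two numbers answer different questions: `r` measures how tightly the points follow a line, while `slope` measures how much average fear changes per unit of average toxicity, which is why a weak correlation and a small positive slope can coexist.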

Next, we examined the features associated with fear at the video level. To account for potential covariates, we conducted an ordinary least squares (OLS) regression analysis with the videos as data points and the average fear of comments per video as the dependent variable. The independent variables in the regression analysis include the video's base features (e.g., view counts), the video's emotion-related features (e.g., fear score in a title), the video's topics, and the comment features (e.g., fear, toxicity), all detailed in Methods.

Figure 2 shows the results of the regression analysis at the video level. Notably, the analysis identified that average toxicity is a significant variable even when controlling for other variables, implying a strong association between toxicity and fear in comments aggregated at the video level. Since the fear and toxicity scores are all on a scale from 0 to 1, the coefficients of the regression indicate how much fear increases when the toxicity score increases from 0 to 1. A large and significant coefficient indicates that the variable is correlated with the degree of fear in the comment section of the video. Additionally, we found that fear in comments was significantly associated with the topics of viruses and children's diseases, which aligns with previous research linking these topics to fear among anti-vaccine groups [5,32]. Furthermore, fear in the title, description, and transcript is positively associated with fear in comments, which supports previous research indicating that the emotional content of videos can be associated with viewers' emotions [33]. By contrast, the analysis revealed that the toxicity of video content is only minimally related to fear in comments. The analysis did not reveal any significant relationship between pseudoscience and fear in comments, suggesting that the scientific nature of the content does not significantly affect fear in comments. See Supplementary Information for models with some features ablated.
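To make the reading of the coefficients concrete, the sketch below fits an OLS model with two predictors by solving the normal equations. It is a simplified stand-in, not the model behind Fig. 2: the paper's regression includes many more covariates and reports confidence intervals and p values, which this sketch omits. The function name, the two predictors chosen, and the synthetic data (generated from known coefficients so the fit can be checked) are assumptions.

```python
def ols_fit(rows, targets):
    """Return [intercept, b1, ..., bp] minimising squared error.

    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan
    elimination with partial pivoting. Fine for a handful of predictors.
    """
    X = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(p)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Synthetic videos: (average comment toxicity, fear score of the title).
rows = [(0.10, 0.00), (0.30, 0.10), (0.20, 0.30), (0.50, 0.20), (0.40, 0.40)]
# Average comment fear generated from known coefficients, without noise.
targets = [0.01 + 0.05 * tox + 0.20 * title_fear for tox, title_fear in rows]

intercept, b_toxicity, b_title_fear = ols_fit(rows, targets)
```

Because both predictors lie on a 0 to 1 scale, `b_toxicity` is the change in average comment fear when average toxicity moves from 0 to 1 with the title's fear score held fixed, which is the interpretation used in the text.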

Figure 2. Results for video-level regression. (a) The coefficients of the variables with 95% CIs. The stars indicate the p values of the t-test: *** for p < 0.001, ** for p < 0.01, and * for p < 0.05. The model intercept parameter is not shown.

Lastly, we explored the impact of early toxic comments on subsequent comments, focusing on emotional aspects. Based on previous research that demonstrated a connection between early comment features and later sentiments in comments [34], we employed a similar approach. A key factor to consider is the window size of comments, which sets the threshold for determining the number of early comments (Fig. 3). We compared various window sizes, with k ∈ {10, 20, 30, 40, 50}, to gauge their effects. The objective was to calculate the average toxicity and fear in comments within these window sizes (see Methods), and subsequently incorporate the features of variable groups 1, 2, and 3 used in the previous regression analysis to estimate the average fear in comments following the threshold k (Model 4 in Fig. 3a).

Considering that YouTube comments are not necessarily arranged in chronological order, we included not only the recency of comments but also their engagement, specifically the number of likes, which strongly affects the order of comments. The number of likes is crucial when assessing the impact of toxic comments because the higher the value, the more likely a comment is to appear at the top of the comment list and, consequently, to influence other comments [19,35]. In this study, we aimed to account for the effect of highly liked comments. Therefore, we used the average toxicity of comments within window k that have a like count in the top 20th percentile or higher as the explanatory variable (i.e., the toxicity of highly liked comments, see Methods) (Model 5 in Fig. 3a). This variable replaced the average toxicity of comments within window k used in Model 4.
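The window-based variables for Model 4 and Model 5 can be sketched as below. This is an illustration under stated assumptions rather than the procedure in Methods: the dictionary keys and function name are hypothetical, and the like-count percentile is computed within the early window, which the text does not specify (it could equally be computed per video or over the whole dataset).

```python
from statistics import mean


def window_features(comments, k, top_fraction=0.2):
    """Split chronologically ordered comments at k and summarise both parts.

    Each comment is a dict with 'fear', 'toxicity' and 'likes' keys.
    """
    if len(comments) <= k:
        raise ValueError("need more than k comments to have a later part")
    early, later = comments[:k], comments[k:]

    # Like-count cut-off for the top fraction, taken within the early window.
    likes_sorted = sorted(c["likes"] for c in early)
    n_top = max(1, round(len(early) * top_fraction))
    like_cut = likes_sorted[len(early) - n_top]
    highly_liked = [c for c in early if c["likes"] >= like_cut]

    return {
        "early_fear": mean(c["fear"] for c in early),
        "early_toxicity": mean(c["toxicity"] for c in early),  # Model 4 style
        "early_toxicity_highly_liked": mean(
            c["toxicity"] for c in highly_liked
        ),  # Model 5 style
        "later_fear": mean(c["fear"] for c in later),  # dependent variable
    }


# Hypothetical comment section with a small window for readability.
comments = [
    {"fear": 0.01, "toxicity": 0.1, "likes": 0},
    {"fear": 0.02, "toxicity": 0.8, "likes": 10},
    {"fear": 0.03, "toxicity": 0.2, "likes": 2},
    {"fear": 0.01, "toxicity": 0.3, "likes": 1},
    {"fear": 0.03, "toxicity": 0.1, "likes": 3},
    {"fear": 0.04, "toxicity": 0.2, "likes": 0},
    {"fear": 0.06, "toxicity": 0.4, "likes": 1},
]
features = window_features(comments, k=5)
```

One such feature row per video, computed for each window size k, would then feed a regression with `later_fear` as the dependent variable alongside the video-level covariates.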

Figure 3b shows the coefficients for all comments (top) and highly liked comments (bottom) across the window sizes k ∈ {10, 20, 30, 40, 50} in the regression analysis. For all comments, after controlling for fear in early comments, the toxicity of early comments is slightly positively related to fear in later comments across all window sizes k, although the relationship is not significant. This result suggests that early toxicity is associated with later fear independently of early fear. Note that early fear is strongly associated with later fear, confirming the contagion of homogeneous emotions. Looking at the toxicity of highly liked comments at the bottom of Fig. 3b, we can also see that only the coefficients for the toxicity of highly liked comments are significant (4 out of 5 cases). Moreover, the coefficient for the toxicity of highly liked comments is particularly large (about 1.3 times that for all comments), indicating that the association for highly liked comments is stronger than that for all comments. It should also be noted that the other video-related variables were controlled for in the regression analysis, as described in "Factors that elicit fear in YouTube comments", suggesting that emotional contagion was likely to occur between comments.

Figure 3. Measuring the association of the toxicity of early comments with the fear in later comments. (a) Illustration of the problem setting. N comments in chronological order for a given video are divided into early and later parts, separated by k. Then, the average fear of comments in the later comment range is predicted by the variables noted in Model 4 and Model 5, respectively, and the coefficients are obtained. (b) Forest plots showing the coefficients of the average toxicity of all comments and of highly liked comments across window sizes k ∈ {10, 20, 30, 40, 50}. Both are positive regardless of k, but only the mean toxicity of highly liked comments is largely significant. The average toxicity of highly liked comments has a high coefficient compared to the average toxicity of all comments (1.3 times higher on average across the five windows).

One might wonder whether the effect also runs in the opposite direction, i.e., is early fear associated with later toxicity in comments? To answer this question, we examined two more models, Model 6 and Model 7. In both models, we assigned the mean toxicity of later comments as the dependent variable, and in Model 7, we modified Model 5 by replacing the average toxicity of highly liked early comments with the mean fear of highly liked early comments as the independent variable (Fig. 4a). Figure 4b suggests that the coefficients for fear in early comments are all positive. Also, the coefficients for fear of highly liked comments are largely significant (4 out of 5 cases). These findings indicate that in the comment section of anti-vaccine YouTube videos, the influence of the toxicity and fear in comments is bidirectional, with the toxicity of early highly liked comments having an influence on the fear in subsequent comments and vice versa.

Figure 4. Measuring the association of the fear of early comments with the toxicity in later comments. (a) Illustration of the problem setting. N comments in chronological order for a given video are divided into early and later parts, separated by k. Then, the mean toxicity of comments in the later comment range is inferred by the variables noted in Model 6 and Model 7, respectively, and the coefficients are obtained. (b) Forest plots showing the coefficients of the fear in comments and the fear in highly liked comments, for k ∈ {10, 20, 30, 40, 50}. Only the coefficients for fear in highly liked comments are largely significant (3 out of 5 cases).
