
Social balance in directed networks


The spectrum of directed null models

Network structure is shaped by node-level preferences as well as pairwise or higher-order wiring mechanisms. A key step towards unveiling the wiring mechanisms is to compare with null models that match the key features of individual nodes. Formally, a null model is an ensemble of random graphs that is constrained by some selected features of the original network. In directed networks, it is natural to consider the signed in- and out-degrees separately since they might originate from different mechanisms. For instance, the positive in-degree can indicate popularity or prestige, while the positive out-degree can result from sociability or influence-seeking behavior25,26. Thus, the in- and out-degrees are not necessarily well correlated in empirical networks as shown in Fig. 1a.

Reciprocity is another key aspect of social systems. As shown in Fig. 1b, for negative links, both the Bitcoin-Alpha and Slashdot networks exhibit moderate correlations between reciprocated degrees and total degrees, indicating heterogeneous reciprocity patterns across nodes. This heterogeneity suggests that nodes with similar total negative degrees may display varying tendencies to engage in mutual negative relationships. For positive links, reciprocity can be either highly homogeneous (Bitcoin-Alpha) or not (Slashdot), depending on the dataset. Although conflicting links typically account for less than 2% of total links, we observed that the tendency to form conflicting links can be highly heterogeneous among nodes (Supplementary Fig. 1). We further check the correlations between all six independent primary node degrees and find that Pearson correlation coefficients are mostly below 0.50, with a few exceptions that can go up to 0.77 (Fig. 1f, g and Supplementary Tables 1–5). These weak to moderate correlations suggest that each link type is influenced differently by social processes. For example, reciprocated positive links may represent mutual friendship, while conflicting links could indicate complex status relationships17. Critically, these correlations demonstrate that constraining one type of degree does not automatically constrain others. For example, maintaining the positive unidirectional in-degree does not guarantee the preservation of the positive reciprocated degree. This motivates the maximally constrained null model, which preserves each primary degree separately. However, note that constraining the network topology makes some primary degrees depend on other degrees. For example, fixing the topology preserves both in- and out-degrees for unidirectional links. Consequently, if the positive unidirectional in-degree is preserved, the corresponding negative unidirectional out-degree is also automatically preserved. A similar relationship exists for signed reciprocated and conflicting degrees, which reduces the number of independent degrees considered in the null model, as described in detail in Supplementary Note 2.1.

On the way to the maximally constrained null model, as we incorporate more constraints into the null model, the fraction of the network that is being randomized decreases (Fig. 3). While the maximally constrained null model matches all node-level features, in principle, we could include additional constraints based on higher-order network properties. Such a non-local null model would eventually capture all characteristics of the empirical social networks. At this point, the non-local null model could serve as a generative model, capable of producing synthetic networks statistically indistinguishable from the observed network across multiple measures. In this sense, the progression from null models to generative models represents a continuum of increasing structural fidelity, reflecting our evolving understanding of the fundamental organizing principles in social networks.

Fig. 3: Randomizing signed directed networks.

The diagram illustrates a progression from topology-disrupted models (left) to generative models (right), with intermediate models. The top network represents the original structure, with positive links in blue and negative links in red. Two example null models are shown: (1) the signed directed model that preserves directed topology and signed in- and out-degrees (39% randomized), and (2) the maximally constrained model that preserves directed topology with signed in-, out-, reciprocated-, and conflicting-degrees (18% randomized). Darker colors highlight randomized links.

However, as we mentioned above, for the purpose of detecting the patterns emerging from wiring mechanisms, we only want to include local constraints at the level of individual nodes. As a strong argument for the maximally constrained null model, the best null model is expected to be the one that is closest to the data without being constrained on higher-order features. Another strong argument for the maximally constrained null model emerges from considering incomplete, “V-shaped” patterns. In a V-shaped pattern, two of the three considered nodes have no relations to each other. Intuitively, such patterns are not expected to provide insights about genuine three-node interaction tendencies. In line with this expectation, the maximally constrained null model matches all V-shaped motifs. This follows from the fact that the count of such a motif is just the number of ways we can pair different types of links at each node, and the null model matches the degree of each link type for each node. Note, however, that V-shaped graphlets can still deviate from the model, echoing the presence of genuine three-node patterns, where all pairs of the three nodes are interacting. Of course, considering such V-shaped patterns does not provide new information in this case in addition to what we learned from the connected triads shown in Fig. 2. We also note that the maximally constrained null model is not the only one that fully captures incomplete motifs. The version that similarly constrains all primary degrees without fixing the topology also has this desirable property.

In this study, we restrict ourselves to presenting only two null models to illustrate how incorporating appropriate constraints reveals the hidden structure of empirical social networks. Based on a previous undirected study1, we consider both network topology and signed degrees as fundamental constraints. We extend this approach to signed directed networks by preserving directed topology and signed in- and out-degrees (signed directed null model). This null model is efficiently generated by applying maximum-entropy randomization to directed networks, maintaining average in- and out-degrees across the ensemble (see Supplementary Note 2.3 for details). Alternatively, the maximally constrained null model considers all primary node degrees from three distinct types of directed links: unidirectional, reciprocated, and conflicting. In addition to directed topology, the maximally constrained null model preserves the signed in-, out-, reciprocated-, and conflicting-degree of each node, when averaged over the ensemble of the null model. We implement this model by decomposing the directed network into three independent subgraphs: (1) unidirectional positive and negative links, (2) reciprocated positive and negative links, and (3) conflicting links. Maximum-entropy randomization is applied to each subgraph separately. The union of the links in the resulting randomized subgraphs provides the complete null model, as formulated in Supplementary Note 2.2.
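The decomposition step described above can be illustrated with a minimal sketch. It covers only the split into the three subgraphs, not the maximum-entropy randomization itself, which is specified in the Supplementary Notes. The dictionary representation of links and the function name `decompose` are assumptions made for illustration.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (source, target); hypothetical representation


def decompose(links: Dict[Edge, int]):
    """Split signed directed links (+1 / -1) into three subgraphs.

    Returns (unidirectional, reciprocated, conflicting), each a dict
    mapping a directed edge to its sign.
    """
    unidirectional, reciprocated, conflicting = {}, {}, {}
    for (u, v), sign in links.items():
        back = links.get((v, u))  # sign of the reverse link, if any
        if back is None:
            unidirectional[(u, v)] = sign   # no reverse link
        elif back == sign:
            reciprocated[(u, v)] = sign     # +/+ or -/-
        else:
            conflicting[(u, v)] = sign      # +/- pair
    return unidirectional, reciprocated, conflicting


# Toy network: A<->B reciprocated (+/+), A<->C conflicting, B->C unidirectional.
links = {
    ("A", "B"): 1, ("B", "A"): 1,
    ("B", "C"): -1,
    ("C", "A"): 1, ("A", "C"): -1,
}
uni, rec, conf = decompose(links)
```

Each of the three dictionaries could then be passed to a separate randomization routine, and the union of the randomized links would form one null model sample.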

Directed notions of balance

Extending the definition of balance to directed networks necessitates understanding the role played by reciprocated and conflicting links. For example, negative reciprocated links can either be considered as equivalent to a negative link in the undirected case, or be interpreted as a sign of balance, due to sign concordance18. However, as in undirected networks, a definition of balance should not be based on isolated links as such an approach ignores the complex interdependencies of social systems. Instead, a comprehensive definition of balance should consider the dynamics among a group of entities, resulting in patterns beyond those given by individual node-level features. Here, we discuss several definitions of strong balance based on (fully connected) triads in directed networks, extending balance theory from the undirected case to the directed case, as illustrated in Fig. 4.

Fig. 4: Definitions of balance for signed directed triads.

An example triad is considered under different definitions of balance. Positive links are shown in blue and negative links are shown in red.

As direct extensions of undirected balance theory, we first propose two definitions that transform directed links into undirected links. The Undirected definition simplifies the network by converting reciprocated +/+ to positive and −/− to negative links, while treating conflicting +/− as negative. In contrast, the Consistency definition focuses on sign agreement, treating both +/+ and −/− as positive, with only +/− considered negative. This distinction depends on the interpretation of reciprocated negative links (−/−): the Undirected approach views them as elevated discord, while the Consistency approach sees them as a form of consistency in negative sentiment. Both definitions consider unidirectional links as undirected and preserve their signs.
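As a minimal sketch of these two mappings, the snippet below collapses each pair of nodes to a single undirected sign and then applies the usual undirected criterion that a triad is balanced when it contains an even number of negative links. Encoding a dyad as a pair of signs (with 0 for an absent direction) and the function names are assumptions made for illustration.

```python
def undirected_sign(s_uv, s_vu, rule):
    """Collapse one dyad to a single sign.

    s_uv, s_vu: +1, -1, or 0 (no link in that direction).
    rule: "undirected" or "consistency".
    """
    if s_uv == 0 and s_vu == 0:
        raise ValueError("triads are assumed to be fully connected")
    if s_uv == 0 or s_vu == 0:
        return s_uv + s_vu          # unidirectional link keeps its sign
    if s_uv != s_vu:
        return -1                   # conflicting +/- is negative in both rules
    if rule == "consistency":
        return 1                    # +/+ and -/- both count as agreement
    return s_uv                     # "undirected": +/+ -> +, -/- -> -


def is_balanced(dyads, rule):
    """A triad is balanced if the product of its three dyad signs is positive."""
    product = 1
    for s_uv, s_vu in dyads:
        product *= undirected_sign(s_uv, s_vu, rule)
    return product > 0


# A triad in which all three pairs are reciprocated negative links (-/-).
all_negative = [(-1, -1), (-1, -1), (-1, -1)]
balanced_undirected = is_balanced(all_negative, "undirected")
balanced_consistency = is_balanced(all_negative, "consistency")
```

For this fully negative reciprocated triad the two rules disagree: it is unbalanced under the Undirected mapping and balanced under the Consistency mapping.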

As an alternative, we build upon the observation that the pattern A → B → C ← A, known as a transitive cycle27, frequently appears in most considered datasets (> 30% of total triads in Epinions, Slashdot, Pardus). Thus, we consider a definition of balance based on transitive cycles (Cycle). This definition considers the consistency of sentiment between the direct interaction from A to C and the indirect interaction from A to C through B. We consider a cycle as balanced if it has an even number of negative links, and unbalanced otherwise. A triad is considered balanced only if all transitive cycles are balanced.
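The Cycle definition can be sketched as follows, assuming a triad is stored as a dictionary mapping directed links to signs. Returning `None` when the triad contains no transitive cycle is an assumption of this sketch, meant to mirror cases where the definition is inconclusive.

```python
from itertools import permutations


def cycle_balance(links):
    """Check balance over all transitive cycles a -> b -> c <- a.

    links: dict mapping (source, target) to +1 or -1 for one triad.
    Returns True/False, or None if no transitive cycle is present.
    """
    nodes = sorted({n for edge in links for n in edge})
    found = False
    for a, b, c in permutations(nodes, 3):
        cycle = [(a, b), (b, c), (a, c)]   # indirect path plus direct link
        if all(edge in links for edge in cycle):
            found = True
            negatives = sum(links[edge] == -1 for edge in cycle)
            if negatives % 2 == 1:         # odd number of negative links
                return False
    return True if found else None


balanced = cycle_balance({("A", "B"): 1, ("B", "C"): -1, ("A", "C"): -1})
unbalanced = cycle_balance({("A", "B"): 1, ("B", "C"): 1, ("A", "C"): -1})
no_cycle = cycle_balance({("A", "B"): 1, ("B", "C"): 1, ("C", "A"): 1})
```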

Another definition of directed balance considers the closed walks involving all three triad nodes (Walk). Closed walks represent paths where information or influence can flow back to its origin. This circular flow within a triad can reinforce or counteract itself, depending on the signs of the links. We consider closed walks that encompass all nodes without repeating nodes and consider them as balanced if a given walk contains an even number of negative links, and unbalanced otherwise. A triad is considered balanced only if all closed walks are balanced.
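A corresponding sketch for the Walk definition is given below. For three nodes, a closed walk that visits every node without repetition is a directed 3-cycle, of which there are two possible orientations. The link dictionary and the `None` return value for triads without any such walk are assumptions of this sketch.

```python
def walk_balance(links):
    """Check balance over closed walks a -> b -> c -> a through all three nodes.

    links: dict mapping (source, target) to +1 or -1 for one triad
    (exactly three distinct nodes are assumed).
    Returns True/False, or None if the triad contains no such closed walk.
    """
    a, b, c = sorted({n for edge in links for n in edge})
    orientations = (
        [(a, b), (b, c), (c, a)],   # one direction around the triad
        [(a, c), (c, b), (b, a)],   # the opposite direction
    )
    found = False
    for walk in orientations:
        if all(edge in links for edge in walk):
            found = True
            negatives = sum(links[edge] == -1 for edge in walk)
            if negatives % 2 == 1:  # odd number of negative links
                return False
    return True if found else None


closed_even = walk_balance({("A", "B"): 1, ("B", "C"): -1, ("C", "A"): -1})
closed_odd = walk_balance({("A", "B"): 1, ("B", "C"): 1, ("C", "A"): -1})
no_walk = walk_balance({("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 1})
```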

The final definition we consider is grounded in status theory17,28, which offers a distinct perspective on balance in signed directed networks, particularly relevant in hierarchical social structures. According to status theory, the sign of a link between two nodes is determined by the perceived difference in their social status (Status). Specifically, a positive link from node A to node B indicates that A perceives B as having a higher status, while a negative link suggests that A views B as having a lower status. In this definition, balance is achieved when all three nodes of a triad can be placed in a consistent status order. While previous studies of status theory have focused less on reciprocated links, real social systems often contain reciprocated links, potentially corresponding to equal status. Thus, here we introduce an extended notion of status theory by considering reciprocated positive or negative links between two nodes as indicators of having equal status. For example, if A positively links to B and B also positively links to A, such a triad is considered as balanced as long as both A and B have higher, lower, or equal status relative to C.
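The extended Status definition can be sketched as a brute-force search for a consistent status order. The sketch assumes that a positive link from u to v requires v to rank strictly above u, a negative link requires v to rank strictly below u, and a reciprocated pair with equal signs requires equal rank. The function names and the link dictionary are hypothetical, and the handling of conflicting pairs follows directly from these rules rather than from a separate statement in the text.

```python
from itertools import product


def link_is_consistent(u, v, sign, links, status):
    """Check a single directed link against a candidate status assignment."""
    if links.get((v, u)) == sign:
        # Reciprocated link with the same sign: equal status (extended rule).
        return status[u] == status[v]
    # Positive link: target is higher; negative link: target is lower.
    return (status[v] - status[u]) * sign > 0


def status_balance(links):
    """Return True if some status order (ties allowed) satisfies every link."""
    nodes = sorted({n for edge in links for n in edge})
    # With n nodes, ranks 0..n-1 are enough to express any order with ties.
    for ranks in product(range(len(nodes)), repeat=len(nodes)):
        status = dict(zip(nodes, ranks))
        if all(link_is_consistent(u, v, s, links, status)
               for (u, v), s in links.items()):
            return True
    return False


# A and B are mutual positive (equal status); both look up to C.
equal_pair_ok = status_balance(
    {("A", "B"): 1, ("B", "A"): 1, ("A", "C"): 1, ("B", "C"): 1}
)
# A and B are equal, but C sits above A and below B: no consistent order.
equal_pair_bad = status_balance(
    {("A", "B"): 1, ("B", "A"): 1, ("A", "C"): 1, ("C", "B"): 1}
)
```

Under these assumed rules, a conflicting pair (u to v positive, v to u negative) places v above u in both directions, so it never creates an inconsistency by itself.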

Note that these prospective definitions of balance already differ even at the level of fully reciprocated configurations with consistent signs (A1, A11, A3, A16 in Fig. 5). In these cases, according to the Undirected definition, directionality plays no role, as mutual links with identical signs can be considered as undirected links without loss of information. In contrast, both the Consistency and Status definitions suggest that triads A3 and A16 should be balanced, while the Undirected definition indicates they should be unbalanced (Fig. 5). This discrepancy indicates that the Consistency and Status definitions should not be considered as extensions of the undirected notion of balance. Instead, these definitions may offer complementary insights into signed directed networks if they demonstrate consistency with empirical data.

Fig. 5: Comparison of the observed triad statistics to null models and balance theories for triads without conflicting links.

Triads are ordered based on their expected balance according to the Undirected definition, with balanced triads presented first, followed by a black line and then unbalanced triads. The “Maximally constrained” and “Signed directed” rows show z-scores quantifying the deviation of triad frequencies in empirical social networks from corresponding null model expectations. Orange (gray) dots indicate significant overrepresentation (underrepresentation) with z > 2 (z < −2), while lighter colors indicate non-significant results (|z| ≤ 2). The “Theory” rows indicate whether each triad configuration is classified as balanced (orange) or unbalanced (gray) according to a certain definition of balance. The spot is left blank if the theory is inconclusive about the balance of that triad. Triads that contain only reciprocal links are shaded in red. Inconclusive triads without significant results are shaded in gray.

Balance is observed in directed social networks

To quantitatively understand the balance in signed directed networks, we consider all signed and directed triadic graphlets present in the empirical social systems, as shown in Fig. 2. As a standard measure, we use the z-score to quantify how the observed triad frequencies deviate from the null models. The z-score is calculated as

$$z=\frac{{f}_{obs}-\langle {f}_{null}\rangle }{\sqrt{{\sigma }_{obs}^{2}+{\sigma }_{null}^{2}}} \tag{1}$$

where \({f}_{obs}\) is the observed frequency of a given triad and \(\langle {f}_{null}\rangle\) is the mean frequency of the same triad type averaged over 1000 independently generated null model networks. The denominator represents the total uncertainty, combining two sources: \({\sigma }_{null}\), the standard deviation of triad frequencies across all null model samples, and \({\sigma }_{obs}\), the estimated shot noise \({\sigma }_{obs}\approx \sqrt{{f}_{obs}}\), assuming a Poisson distribution for the occurrence of each triad type. Note that for frequent motifs, with a large count, the shot noise has a negligible downward impact on the z-score; however, it saves us from misinterpreting small counts. This approach allows us to account for both the variability in the null model and the inherent statistical fluctuations in the observed network, providing a robust measure of the significance of triad frequency deviations. A triad is considered significantly overrepresented when z > 2 and significantly underrepresented when z < −2. Any score with |z| ≤ 2 means that the triad does not deviate substantially from the null model.
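The z-score of Eq. (1) can be computed along the following lines. This is a minimal sketch: the function names are hypothetical, and the use of the population standard deviation for the null-model spread is an assumption, since the text does not specify the estimator.

```python
import math
import statistics


def triad_z_score(f_obs, null_counts):
    """z-score of Eq. (1) with Poisson shot noise on the observed count.

    f_obs: observed count of one triad type.
    null_counts: counts of the same triad type in the null model samples.
    Raises ZeroDivisionError in the degenerate case where f_obs is 0 and
    all null counts are identical.
    """
    mean_null = statistics.mean(null_counts)
    var_null = statistics.pvariance(null_counts)  # assumption: population variance
    var_obs = f_obs                               # sigma_obs^2 = f_obs (Poisson)
    return (f_obs - mean_null) / math.sqrt(var_obs + var_null)


def classify(z):
    """Apply the significance thresholds used in the text."""
    if z > 2:
        return "overrepresented"
    if z < -2:
        return "underrepresented"
    return "not significant"


# Frequent motif: shot noise matters little relative to the gap.
z_large = triad_z_score(100, [40, 60])
# Rare motif: the null model never varies, but only 4 triads are observed.
z_small = triad_z_score(4, [1, 1, 1, 1])
label_large = classify(z_large)
label_small = classify(z_small)
```

In the second call the null-model variance is zero, so without the shot-noise term the score would diverge; with it, the small count is classified as not significant.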

First, we consider the maximally constrained null model, which preserves topology and all primary node degrees. For each triad, we note consistency with a given definition of balance if there are no statistically significant contradicting conclusions regarding over- or under-representation across all empirical datasets. In Fig. 5, we ordered all triads without conflicting links based on whether they are balanced or not under the Undirected definition. We observe consistent alignment within all datasets examined. Moreover, the results are also consistent with the Cycle definition of balance. The exceptions are triads G1–G4, where the results are not significant in most datasets, indicating that these specific configurations occur at frequencies similar to what would be expected by chance, given the preserved network constraints. Regarding triads with conflicting links, at least partially due to their small numbers, the majority lacks sufficient statistical significance to determine over- or underrepresentation compared to the null model, leading to no definite conclusions of balance (Supplementary Fig. 2). Another possible interpretation is that the maximally constrained null model already captures the statistics of triads with conflicting links. This means that once we account for the signed degrees of unidirectional, reciprocated and conflicting links of each node, the frequencies of triads containing conflicting links are found to be fully explained by these lower-order network properties.

On the contrary, when comparing observed frequencies to the signed directed null model, which preserves directed topology and signed in- and out-degrees, we do not observe a clear pattern that aligns with any proposed balance definition (Fig. 5). The most notable trend is the underrepresentation of all triads with conflicting links across datasets, with the sole exception of triad A8 in the Bitcoin-Alpha network (Supplementary Fig. 2). However, this observation does not reveal direct insights about balance in these networks. Instead, it is a consequence of the overrepresentation of reciprocated links and underrepresentation of conflicting links at the link level, as suggested by other studies18. This link-level pattern propagates to the triadic level, resulting in the observed underrepresentation of triads containing conflicting links and the overrepresentation of triads without conflicting links in most cases. We expect similar effects to take place for other null models that are leaving some constraints out of the maximally constrained null model.
