Multi-Stage Machine Learning Model for Hierarchical Tie Valence Prediction

KARANDEEP SINGH, Data Science Group, Institute for Basic Science, South Korea
SEUNGEON LEE, Data Science Group, Institute for Basic Science; School of Computing, KAIST, South Korea
GIUSEPPE (JOE) LABIANCA, Department of Management, UMass Amherst, USA; Department of Management, University of Exeter, UK
JESSE MICHAEL FAGAN, Department of Management, University of Exeter, UK
MEEYOUNG CHA, Data Science Group, Institute for Basic Science; School of Computing, KAIST, South Korea

Individuals interacting in organizational settings involving varying levels of formal hierarchy naturally form a complex network of social ties having different tie valences (e.g., positive and negative connections). Social ties critically affect employees’ satisfaction, behaviors, cognition, and outcomes, yet identifying them solely through survey data is challenging because of the large size of some organizations or the often hidden nature of these ties and their valences. We present a novel deep learning model encompassing NLP and graph neural network techniques that identifies positive and negative ties in a hierarchical network. The proposed model uses human resource attributes as node information and web-logged work conversation data as link information. Our findings suggest that the presence of conversation data improves the tie valence classification by 8.91% compared to employing user attributes alone. This gain came from accurately distinguishing positive ties, particularly for male, non-minority, and older employee groups. We also show a substantial difference in conversation patterns for positive and negative ties, with positive ties being associated with more messages exchanged on weekends and lower use of words related to anger and sadness.
These findings have broad implications for facilitating collaboration and managing conflict within organizational and other social networks.

CCS Concepts: • Applied computing → Business Intelligence; • Information systems → Enterprise information systems.

Additional Key Words and Phrases: Signed link prediction, Sentiment embeddings, Graph neural networks, Tie-Valence prediction, Organizational social network

Both authors contributed equally to this research.

Authors’ addresses: Karandeep Singh, Data Science Group, Institute for Basic Science, South Korea; Seungeon Lee, Data Science Group, Institute for Basic Science; School of Computing, KAIST, South Korea; Giuseppe (Joe) Labianca, Department of Management, UMass Amherst, USA; Department of Management, University of Exeter, UK; Jesse Michael Fagan, Department of Management, University of Exeter, UK; Meeyoung Cha, Data Science Group, Institute for Basic Science; School of Computing, KAIST, South Korea.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. 1556-4681/2023/1-ART $15.00 https://doi.org/10.1145/3579096

ACM Trans. Knowl. Discov. Data.

1 INTRODUCTION

There is growing interest in understanding the role of positive and negative network ties or links (recurring relationships that involve enduring valenced interpersonal judgments) in explaining actors’ attitudes, behaviors, cognition, and outcomes [25]. Identification of tie valences can be helpful for a variety of practical downstream tasks such as recommending products [26, 51] or friends [22], estimating the impact of a publication [7], and predicting the dynamics of complex social networks [24, 49]. However, despite the amount of time that individuals spend within social structures involving hierarchy and some level of competition, such as work organizations, little is known about tie valence within these networks. Work contexts can, for example, control which individual can gain a promotion; create competition for scarce organizational resources; place people into specialized units that are often at odds with other units; and introduce power relationships that are difficult to ignore. While viewing organizations as a nexus of social relationships has gained ground in the past decade, the use of large-scale data for tie valence detection in a setting involving hierarchy and some conflict is still in its infancy.

What has made this research and its applications particularly challenging is that negative ties are counter-normative (i.e., are generally frowned upon) and are often hidden from view [37]. While colleagues in the same rank might notice conflicts, these conflicts are intentionally concealed from higher-ranked individuals. This produces a phenomenon where top management is often unaware of important conflicts occurring below them (e.g., interpersonal or interdepartmental disputes) that might need active attention because these conflicts can grow and draw in others [42], ultimately threatening the organization’s proper functioning [3]. Thus, a critical task is to understand where there are positive ties that can encourage collaboration and employee attachment to the organization and negative ties that undermine organizational solidarity and goal achievement.
Numerous previous studies have shown that the number of positive and negative ties in an organization affects individual and group-based outcomes, including work performance, job satisfaction, and employee turnover [30]. Electronically-mediated communication and electronic information exchange patterns could be reflective of the tie-valences between the concerned parties. Additionally, the advent of digitization has enabled data collection at an unprecedented scale, which could be mined for discovering hidden patterns of interest. This, coupled with the recognition of the importance of understanding workplace ties, has resulted in the development of various data-driven solutions like Microsoft Workplace Analytics [39], OrgMapper [38], and Humanyze Workplace Analytics [28]. These tools utilize corporate data to drive workplace improvements based on a broad set of features, including those from emails, meetings, collaboration activities, unscheduled calls, and instant messages. In some cases, these providers already collect these data legally on behalf of the client organizations for other purposes. These digital information exchanges can be used to learn relationship patterns and create actionable insights while still protecting individuals’ privacy. Yet much of this potential remains underutilized because tie valence prediction is not incorporated into social network analytic tools.

Given the potential offered by these relatively unexplored data, we propose a computational and data-driven approach to the problem of tie-valence prediction in networks involving hierarchy and competition. We utilize disparate sources of real data from an organization and propose a neural network model that extracts the relational sentiment information from unstructured and structured data sources.
More specifically, our model, called exTV (Model to extract Tie Valences), is trained on anonymized work conversation data, employees’ human resources (HR) data, and a sociometric survey of positive and negative ties among a subsample of members. exTV is the first of its kind to utilize anonymized official conversation texts exchanged between members of varying ranks in an organizational hierarchy to learn the numeric embeddings that are representative of people’s sentiments toward one another. This step employs a meticulously designed natural language processing algorithm to handle unstructured textual information and other critical meta information such as messaging frequency and times. Next, these embeddings are used alongside the HR data to build a neural network that identifies relational information among the people and finally delivers the output to be used as the tie valence label.

We explore the model and the underlying data with the results obtained and test what sources and types of data are important for the final classification. In doing so, we discover intricate relationships among individuals of different ranks (e.g., supervisor and subordinate) and individuals of particular profiles (e.g., managers, females, ethnic minorities). We also find that work conversations are a rich resource that uncovers valuable information for discerning additional, and otherwise ambiguous, tie-valences. Our model, exTV, can classify tie valence in the studied organizational network with an AUROC score of 0.8190. We intend to apply our model in the future to understand the rest of our large organization’s network tie valences.
Our model can offer key insights into downstream tasks such as improving organizational restructuring and post-merger integration, increasing workplace attachment, and decreasing conflict and employee attrition rates, leading to improved organizational functioning and employee well-being.

Ethical considerations. Analyzing interpersonal connections in general and tie valence, in particular, can provide valuable insights into the workplace environment even while being conducted ethically in a manner that protects individual privacy. One key is separating the firm gathering the data and conducting the network analyses (the third-party provider) from the client organization that receives recommendations, much as is done in numerous other contexts (e.g., third-party firms often handle client companies’ sexual harassment or whistleblower allegations to maintain confidentiality for all parties and provide neutral assessments). The client only receives anonymized, aggregated results that, while actionable, protect individuals’ data privacy (e.g., identifying where there is increasing interdepartmental conflict in the organization to initiate conflict management techniques quickly). These data are obtained legally because, in most of the world, the data are owned by companies, and employees are made aware that anything they transmit over company networks can be accessed by those companies (with the European Union being the main notable exception).

2 RELATED WORK

Numerous social theories have been proposed to understand positive and negative ties in networks, including in organizational contexts [50]; these theories are often used when attempting tie valence detection. The most commonly used theory is structural balance theory [8, 14], along with its newer variants such as [44]. Sentiment lexicons like [40] and [21] have also been a popular choice for linguistic-features-based sentiment classification.
Other popular methods in link prediction such as the Jaccard coefficient, resource allocation index, and preferential attachment can be used to predict “missing” links [35]. In this work, we focus on the tie-valence of existing edges in an organizational social network.

From a computational perspective, work on predicting network tie valence in organizational contexts is nearly non-existent. Though there is plenty of research on link prediction in social networks, transferring this to organizational contexts is impracticable due to different network dynamics in a formal work setting (e.g., one with competing/collaborating departments and a formal reporting hierarchy). From a machine learning point of view, GNN-based models [41] can operate on and learn representations of graph-structured data and have shown improved performance over traditional deep learning approaches. The strength of GNNs comes from their ability to implicitly learn the graph’s structure and the neighboring contextual information. Additionally, works like SGCN [16], SiGNet [29], and SiGAT [27] have adopted GNNs to handle directional and signed networks. These approaches often incorporate social balance theories into the training process, thereby amalgamating computational and sociological approaches. For instance, SGCN includes structural balance theory, and SiGAT captures both the structural balance theory and status theory [15].

Matrix factorization has also been used commonly for analyzing networks and link prediction tasks [2, 4]. For example, [4] proposes a matrix factorization-based model that also exploits the users’ personality information. [32] rethinks the problem of link prediction by identifying ‘no-relation’ as a possible future status of the node pairs. Studies like [5, 6] introduce advanced graph embedding methods with techniques such as preferential random walks. Research has also been focused on using latent factor models for link prediction tasks, such as [43, 49].
Fig. 1. An example of a signed, directed network with partially missing information (i.e., tie valence from Tom to Sean). People with different roles and attributes form the organization depicted.

3 PROBLEM SETTING

3.1 Dataset

One of the key factors in determining the performance of a machine learning model is the data being utilized for training and testing purposes. Our research problem is not often attempted because employee data from a work organization is rarely available for research consumption. Even rarer is the data pertaining to the employees’ liking and disliking of their fellow colleagues. Understandably, such data are scarce, and what is available is very limited for a computational approach that relies heavily on a machine learning model. The various data sources that we utilize to supervise the training and assess the model output are discussed in the following paragraphs.

Conversation data consist of official conversation exchanges among employees over two years. While the digital conversations could be in any form, such as emails, instant messages, meeting invites, or seminar chat logs, we use emails in this study. The digital exchange of information among employees forms an information exchange network. Along with the exchange patterns, we also have at our disposal the anonymized text of the conversations, where the data is stripped of any personally identifiable information. Each record in this data has multiple features, including message text, ID, timestamp, and the hash digests of the sender and receiver IDs. In total, there are 1,403,303 messages exchanged between 3,404 users. Further, using the exchange patterns, we generate an additional 73 meta features from conversation data that may reflect the link polarity, e.g., the number of total emails between a user pair and the number of weekend emails.

Human resources (HR) data contain the demographic and work-related attributes for each employee.
Features in these data are typical of HR data (e.g., age, gender, department, rank, experience). Other information, such as salary, is anonymized due to privacy concerns. The HR data is only available for 1,022 employees (e.g., it does not cover part-time employees or contractors).

Sociometric survey data from the studied organization contains self-reported workplace relationships in a work unit. The survey contained a questionnaire that inquired about the attitudes of 127 employees towards their colleagues. Responses were provided on a seven-point Likert scale, 1 = ‘dislike a lot’ and 7 = ‘like a lot.’ We utilize these data to build the dichotomized ground truth labels for our research problem; specifically, people in an employee’s “friend” network have positive labels and those in the “avoid” network have negative labels. State-of-the-art techniques for valid, reliable, and ethical social network survey data collection, including using optimal question wording, Social Network Data Labs, and institutional ethical oversight, were employed [1]. A.3 presents additional details of the survey.

Code link: https://github.com/k-s-b/extv
Due to our non-disclosure agreement with the organization and our Institutional Review Board data management protocol, the raw data cannot be shared.

We reconcile and merge these datasets via employees’ anonymized IDs and arrive at the final dataset used as an input to our model. The email and HR data are from an overlapping period, which makes the data unique. This intersection results in data on 98 unique employees with 967 labeled relations among them. Further, these data are split into training, validation, and test sets for model training and analysis purposes. While the final dataset is small in size, our proposed approach is nonetheless able to extract insightful information from the available data annotation.
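The pair-level meta features mentioned for the conversation data (for example, the total number of emails and the number of weekend emails per user pair) could be derived along the following lines. This is a sketch only: the record layout, the sample hashes and timestamps, and the choice to treat Saturday and Sunday in the timestamp's own clock as the weekend are all assumptions, not the schema or procedure used in the study.

```python
from collections import defaultdict
from datetime import datetime


def pair_meta_features(messages):
    """Count total and weekend messages for each directed (sender, receiver) pair.

    `messages` is an iterable of (sender_hash, receiver_hash, iso_timestamp)
    tuples. This layout is hypothetical and only serves the illustration.
    """
    features = defaultdict(lambda: {"n_total": 0, "n_weekend": 0})
    for sender, receiver, timestamp in messages:
        sent_at = datetime.fromisoformat(timestamp)
        pair = features[(sender, receiver)]
        pair["n_total"] += 1
        # datetime.weekday() maps Monday to 0, so 5 and 6 are Saturday and Sunday.
        if sent_at.weekday() >= 5:
            pair["n_weekend"] += 1
    return dict(features)


# Made-up records: one Friday, one Saturday, and one Sunday message.
example = [
    ("a1f3", "9c0e", "2021-03-05T10:15:00"),
    ("a1f3", "9c0e", "2021-03-06T18:40:00"),
    ("9c0e", "a1f3", "2021-03-07T09:05:00"),
]
meta = pair_meta_features(example)
```

Counts are kept per directed pair because the tie valences in this setting are directed; an undirected variant would sort the two hashes before using them as the key.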
A.1 presents selected exploratory analyses on HR and conversation data.

3.2 Signed Edge Prediction Problem

Employees constantly interact with each other in organizational settings, forming a localized social interaction network. Increasingly, these individuals also exchange digital information (e.g., emails, instant messages), forming a virtual social interaction network. The nodes in this network are the individuals, and the traces of digital information exchange determine the edges. When this virtual network is reconciled with the survey responses indicating the perceived link polarity between the people in the offline network, the virtual network’s edges can be annotated as positive or negative ties. Overall, this process constitutes a directed, signed, as well as node- and edge-attributed social network, where the employees are the nodes and their tie-valences the network’s edges.

Given the importance of understanding and predicting the valence of ties between employees, our research problem can be defined as a binary classification task with labels {0, 1}, denoting negative and positive ties. We formulate this problem with a multi-modal deep-learning approach that uses natural language processing (NLP) and graph-based deep-learning techniques. More formally, an organization’s social network can be viewed as a graph $G = (V, E^+, E^-)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of $n$ employees, while $E^+ \subset V \times V$ and $E^- \subset V \times V$ represent the sets of positive and negative edges. A labeled edge between an employee pair $v_i, v_j \in V$, written $e_{i,j}$, $\forall e_{i,j} \in \{E^+ \cup E^-\}$, is representative of the sentiment from $v_i$ to $v_j$. Note that the signs of $e_{i,j}$ and $e_{j,i}$ could be different, as one’s perception of a person as a “friend” may not necessarily be reciprocated. $\mathbf{x}$ denotes the feature matrix that contains each employee’s personal information, $\forall v_i \in V$.
$\mathbf{D}^+$ and $\mathbf{D}^-$ are the matrices that contain edge features, including conversation information, for the edges $\forall e \in E^+$ and $\forall e \in E^-$, respectively.

4 METHODS

The model’s first stage processes the raw email text and aims to extract numeric embeddings that represent the underlying reported sentiments. We first explore standard NLP approaches, including sentiment lexicons like LIWC [40] and VADER [21]. Then, we design a neural network that builds upon state-of-the-art deep learning NLP models while accommodating outputs from the above-mentioned standard approaches. Because the network is composed of sentient people who can influence each other’s attitudes, which can in turn alter their relationships with other people, we need to add a relational component to the machine learning process. Therefore, in the second stage of exTV, the outputs from stage one, along with other designed meta-features, build upon approaches like graph neural networks and matrix factorization that take relational information into account while learning the target embeddings. Further, the “signed” models of these approaches also segregate these relational contexts into positive and negative before learning the embeddings to be further utilized for the final classification. The overall model architecture is depicted in Figure 2.

[Figure 2 diagram; recoverable labels: HR Data (Node Feature), Emails between People, NLP Model (ELECTRA), Sentiment Embedding between A & B, Sentiment Labels, Edge Feature, Construct Network Graph, Graph Neural Network, Edge Prediction (Positive / Negative).]

Fig. 2. Overview of the model: exTV consists of two stages: the NLP stage and the semi-supervised signed GNN stage. The NLP stage extracts text embeddings (“Sentiment Embedding” in this figure) with the context of emails exchanged between the employee-pairs.
The ground truth is the sociometric survey data on the tie-valence between the employees. The network graph is constructed with employees as nodes and the email exchanges as the edges. As the learnt embeddings represent the sentiment between a pair of employees, these embeddings are concatenated with other edge-level features obtained from the input data. Finally, the node features, updated edge features, and other meta features are leveraged in the semi-supervised signed GNN stage to perform the final classification.

4.1 Text-to-Sentiment Embeddings

This step’s goal is to extract numeric embeddings from the email conversation data that are representative of the reported tie-valences among the employees. An employee interacts and exchanges information with other employees via emails (and other means), and the nature of this communication is determined by factors such as department affiliation, roles, rank, and perceived tie-valences. We posit that the email text should carry information that is reflective of the nature of the relationship between people and that it could be utilized for the final classification task.

We begin by cleaning the conversation data (i.e., email in this study, but it could be replaced by other types) of unwanted noise by removing signatures and addresses, automated messages, and salutations. The ground-truth labels (the survey data) only exist for a small portion of the otherwise large network with ample conversation text data (approx. 1 million messages in total). Utilizing the whole network and the accompanying exchange data can be potentially advantageous in discovering important hidden features and in enabling the model to “learn” the structure of the underlying text. To enable this utilization, we fine-tune the pre-trained, state-of-the-art ELECTRA [13] model in an unsupervised fashion. This is accomplished by fine-tuning the model with masked language modeling (MLM).
Eq. 1 gives the training objective of MLM, where $\Pi$ denotes the index set of masked tokens, and $x_{\Pi}$ and $x_{-\Pi}$ denote the sets of masked and unmasked tokens, respectively [33]:

$$\mathcal{L}_{MLM}(x_{\Pi} \mid x_{-\Pi}) = \frac{1}{K} \sum_{k=1}^{K} \log p(x_{\pi_k} \mid x_{-\Pi}) \qquad (1)$$

Here $K$ is the number of masked tokens and $\pi_k$ is the $k$-th masked index.

As strong sentiments are rarely expressed in workplace messages, the lack of informative signal could potentially lead to the model learning over-smooth sentiment embeddings. Letting our model ingest the entire conversation data customizes the model weights to the mostly neutral, formal conversation style of an organizational workplace.

Thereafter, we employ the unsupervised fine-tuned ELECTRA to generate numeric embeddings for all messages pertaining to all pairs of users involved in the email exchange. The email exchange data is naturally unstructured, as a variable number of messages of different lengths are exchanged by each person-pair. We design a methodology where this data can be transformed into a fixed-sized embedding for each pair of users in the exchange data. For each pair, the numeric embeddings for each message obtained from the previous step are stacked and passed through a multi-head self-attention (MHSA) layer as described in Eq. 2. The MHSA step relates different messages in the input and updates the initial embeddings to further “highlight” the underlying sentiment.

Let $F^Q$, $F^K$, and $F^V$ denote matrices for queries, keys, and values in self-attention [45], respectively. Scaled dot-product attention is defined as:

$$\mathrm{Attn}(F^Q, F^K, F^V) = \mathrm{softmax}\left(\frac{F^Q (F^K)^{\top}}{\sqrt{d}}\right) F^V, \qquad (2)$$

which is used to capture the similarity between the query and key vectors, with $d$ the dimension of the key vectors. Instead of performing a single attention function, the queries, keys, and values can be projected with different learned linear projections, on which the attention mechanism can be performed in parallel.
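As a minimal sketch of the scaled dot-product attention in Eq. 2, the pure-Python function below applies it as self-attention to a toy matrix of stacked message embeddings for one person-pair. The function names, the toy values, and the absence of learned projections are illustrative assumptions; this is not the exTV implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one row of attention scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return softmax(Q K^T / sqrt(d)) V for row-major lists of vectors."""
    d = len(keys[0])  # key dimension used for scaling
    width = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each output row is a weighted average of the value vectors.
        attended.append(
            [sum(w * v[col] for w, v in zip(weights, values)) for col in range(width)]
        )
    return attended


# Toy stand-in for the stacked message embeddings of one pair (made-up values).
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = scaled_dot_product_attention(S, S, S)  # self-attention: Q = K = V = S
```

Each row of `updated` is a convex combination of the rows of `S`, which is the sense in which attention relates the different messages of a pair to one another.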
To accomplish this goal of modeling different aspects of interactions between the different messages, multi-head (MH) self-attention is utilized:

$$\mathrm{MH}(F^Q, F^K, F^V) = \mathrm{Concat}(head_1, head_2, \ldots, head_h)\, W^O, \qquad (3)$$

where $head_i = \mathrm{Attn}(F^Q W_i^Q, F^K W_i^K, F^V W_i^V)$; $W_i^Q$, $W_i^K$, $W_i^V$, and $W^O$ are learnable weight matrices; and $h$ is the number of heads. The dense layers (i.e., $W_i^Q$, $W_i^K$, $W_i^V$) are used to project the queries, keys, and values into their vector spaces. Since the queries, keys, and values are all equal to the messages pertaining to a person-pair, i.e., $F^Q = F^K = F^V = S$, we can produce the multi-head attention-aware sentiment embedding matrix as $S' = \mathrm{MH}(S, S, S)$. Inputs to the MHSA layer are padded (and masked) to the length of the longest input in the batch. Next, the output of the MHSA layer is pooled via mean-pooling to arrive at a fixed-sized embedding across all user pairs. Subject to ELECTRA, email messages longer than 512 words are truncated, and those of shorter lengths are padded, though as shown in Figure 5(b), most email messages are shorter than the upper limit of 512 words and there is rarely any information loss due to this limitation.

To strengthen and aid the sentiment-extraction process, we utilize the outputs of two sentiment lexicons: LIWC [40] and VADER [21]. Both LIWC and VADER are fed the concatenated messages for all user-pairs. LIWC reads the input text and outputs the percentage of words that reflect different emotions, thinking styles, and social concerns. VADER considers the polarity and intensity of emotion of the input text and gives four output scores: positive, negative, neutral, and compound (a normalized aggregate of the word-level valence scores). We train XGBoost [9] models on the outputs of LIWC and VADER and then let the neural network learn the weights of the concatenated input of original features and the one-hot encoded decision paths in the extracted tree leaves from the trained XGBoost model.
This enables the model to learn the relational information between different features of sentiment lexicons. Finally, ELECTRA embeddings, outputs from sentiment lexicons, and one-hot encoded leaf embeddings from the sentiment-lexicon XGBoost models are concatenated and fed to a fully connected (FC) layer to produce a fixed-sized embedding vector. The training process is accomplished by performing binary classification against the ground-truth sentiment labels using binary cross-entropy loss. Multiple experiments established that a larger final embedding vector can lead to overfitting. Hence, we apply regularization techniques and maintain a small embedding size to account for the input dataset’s size. The sentiment extraction process is presented in Algorithm 1.

Algorithm 1: Sentiment Extraction
Input: emails between pairs, $m_{i,j}$, $\forall i \in \{1, \ldots, P\}$, $\forall j \in \{1, \ldots, N_i\}$, where $N_i$ denotes the number of emails of pair $i$ and $P$ denotes the number of pairs; pre-trained electra; set of entire unlabeled emails $\mathcal{M}$
Output: email context embeddings $z_i^{email}$, $\forall i \in \{1, \ldots, P\}$
  // Use LIWC and VADER for aggregated emails per pair.
  $F_{LIWC}(i) := \mathrm{LIWC}(\mathrm{CONCAT}(m_{i,j}))$
  $F_{VADER}(i) := \mathrm{VADER}(\mathrm{CONCAT}(m_{i,j}))$
  // Train an XGBoost model, and extract its leaves.
  $XL(f(i)) := \mathrm{LEAVES}(\mathrm{XGB}(f(i)))$, $f \in \{F_{LIWC}, F_{VADER}\}$, $\forall i \in \{1, \ldots, P\}$
  // Unsupervised fine-tuning of ELECTRA via MLM.
  $\mathrm{ELECTRA} = \mathrm{MLM}(\mathrm{electra}, \mathcal{M})$
  for $i \in \{1, 2, \ldots, P\}$ do
    for $j \in \{1, 2, \ldots, N_i\}$ do
      // Get email embeddings with ELECTRA.
      $e_{i,j} = \mathrm{ELECTRA}(m_{i,j})$
    end
    // Update ELECTRA embeddings by MHSA.
    $e'_{i,j} = \mathrm{MHSA}(e_{i,j})$
    // Obtain hidden state h by pooling from the MHSA result.
    $h_i^{E} = W^{E}\, \mathrm{MEAN}(e'_{i,j})$
    // Obtain hidden state h from LIWC.
    $h_i^{L} = W^{L}\, [F_{LIWC}(i), XL(F_{LIWC}(i))]$
    // Obtain hidden state h from VADER.
    $h_i^{V} = W^{V}\, [F_{VADER}(i), XL(F_{VADER}(i))]$
  end
  $z_i^{email} \leftarrow \tanh([h_i^{E}, h_i^{L}, h_i^{V}])$

4.2 Semi-Supervised Graph Neural Networks

Positive and negative links have different dynamics in a network, and social theories like balance theory offer a systematic way to handle these. Specially designed machine learning approaches such as the signed graph neural network models are the computational counterparts of these sociological approaches, where positive and negative links are initially treated independently, often driven by relevant social theories. The message passing architecture in GNNs in general learns the embedding of a node by leveraging its own and the aggregated neighboring information:

$$h_v^{l} = \sigma\left(h_v^{l-1}, \phi\left(\{h_u^{l-1} : u \in \mathcal{N}(v)\}\right)\right) \qquad (4)$$

where $\sigma$ is a non-linearity, $\phi$ is a permutation invariant function, and $\mathcal{N}(v)$ are the neighboring nodes of the target node in the layer $l-1$. This mechanism is applied in signed networks by segregating the nodes by link polarity, and by contrasting the embeddings of these groups of nodes. Any other associated information, such as link direction, topological structure, or associated node features, can easily be incorporated into the model training.

In this work, we employ the SGCN [16], which incorporates one of the most noteworthy signed social network theories, balance theory [8], as the base information aggregator. SGCN segregates the positive and negative neighbors based on balance theory and employs a segregated aggregation mechanism as presented in Algorithm 2. Additionally, we explored GNNs like SiGAT [27], which include both the balance theory and another popular signed social network theory, the status theory [19, 48]. We account for neighboring nodes that can be reached via a 3-hop path, a hyperparameter design choice.

We experimented with two different aggregation mechanisms for improving the representational ability of GNNs.
First, instead of standard order-invariant pooling, we employed an attention-based aggregation mechanism. Second, we designed an aggregator function that can ingest the edge-level features in the node neighborhood. This function enabled the pertinent inclusion of topologically relevant relational information into the training process. However, we discovered that despite a substantial increase in computational complexity, including both of these approaches in the SGCN did not lead to a statistically significant performance gain. After extensive experimentation, we reached the conclusion that, owing to the small size of the dataset, the gains offered by both these approaches are invariably offset by overfitting. We train a multi-layer perceptron for the final classification by concatenating the sentiment embeddings from the NLP stage and relational embeddings from the SGCN stage. Algorithm 2 describes the aggregation process.

5 RESULTS

We evaluated the model’s performance in various test settings. The merged data were randomly split into training, validation, and test sets (with ratios 0.70, 0.15, and 0.15), resulting in 682, 140, and 145 data points, respectively. As summarized in Tables 1 and 2, model performance for both the NLP stage and for the complete model is compared against strong baselines, including XGBoost and various signed GNN models. The embeddings from state-of-the-art NLP-only models are used as baselines for establishing the superiority of the sentiment embeddings extracted via the proposed approach. For the complete model, the 10-run averages are reported with and without sentiment embeddings. We use early stopping on the validation set, and the best Macro-F1, precision, recall, and AUC scores for test sets are reported. For all models, the best model is selected by searching the embedding size in {32, 64, 128} and number of epochs in {50, 100, 200}.
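As a quick consistency check on the reported split and search space, the snippet below recomputes the realized split fractions from the stated counts and enumerates the nine hyperparameter configurations. The `select_best` helper and its `validation_score` argument are hypothetical placeholders for whatever training and scoring routine is used; they are not part of the released code.

```python
from itertools import product

# Split sizes as reported in the text.
reported_split = {"train": 682, "valid": 140, "test": 145}
total = sum(reported_split.values())
fractions = {name: round(count / total, 3) for name, count in reported_split.items()}

# Search space as reported in the text.
embedding_sizes = (32, 64, 128)
epoch_budgets = (50, 100, 200)
search_grid = list(product(embedding_sizes, epoch_budgets))


def select_best(grid, validation_score):
    """Return the (embedding_size, epochs) pair with the highest validation score.

    `validation_score` is a hypothetical callable that trains a model for the
    given configuration and returns its score on the validation set.
    """
    return max(grid, key=validation_score)
```

The counts sum to the 967 labeled relations, and the realized fractions (about 0.705, 0.145, and 0.150) are close to, but not exactly, the nominal 0.70/0.15/0.15 ratios.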
The results demonstrate that our approach outperforms all baselines, achieving the best performance for all reported metrics, along with a best AUROC score of 0.8190. It is worth mentioning here that a logistic regression model (not shown in Tables 1 and 2) yielded a low AUROC score of 0.5610.

To analyze the functioning of the proposed approach, we further explore model behavior under various test conditions. The inclusion of sentiment embeddings clearly delivers a substantial performance boost over the baseline models. Comparing and contrasting the model output with and without text embeddings provides an opportunity to discover unique, informative patterns in the underlying email data. In the absence of datasets similar to the one used in this research, we cannot make a direct empirical comparison between such findings from different organizations, but we aim to provide a roadmap for deploying our model in a live setting. The model code is made available publicly.

We chose the best-performing model among the baselines, SGCN, for this analysis. Specifically, the SGCN model was run with and without sentiment embeddings; then the true positive (TP) and true negative (TN) cases were identified. Further, the TP and TN cases were compared to obtain the set of data points that were the same in both runs (Same, or S) and the set that was newly identified when sentiment embeddings were included (Dif., or D). As the performance gain comes from the sentiment information, the new TP and TN can be attributed to new relational patterns uncovered by the email text. Recall that our data points are the relation edges between people formed by exchanging digital information. In this light, we perform an exploratory analysis of the attributes of the relation edges in the S and D sets, as well as of the individuals involved in these edges.
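One way to read the construction of the S and D sets is sketched below. It assumes that S contains the edges classified correctly in both runs and that D contains the edges classified correctly only when sentiment embeddings are included; the function names and the toy labels are hypothetical.

```python
def correct_edges(y_true, y_pred):
    """Indices of correctly classified edges (true positives and true negatives)."""
    return {i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t == p}


def same_and_diff(y_true, pred_without, pred_with):
    """Split correctly classified edges into the S (Same) and D (Dif.) sets.

    pred_without: predictions of the run without sentiment embeddings
    pred_with:    predictions of the run with sentiment embeddings
    """
    base = correct_edges(y_true, pred_without)
    full = correct_edges(y_true, pred_with)
    same = base & full   # correct in both runs
    diff = full - base   # newly correct once text information is added
    return same, diff


# Toy example: 1 = positive tie, 0 = negative tie.
y_true = [1, 0, 1, 0, 1]
same, diff = same_and_diff(y_true, [1, 0, 0, 1, 0], [1, 0, 1, 0, 0])
```

Edges that neither run classifies correctly fall into neither set, which matches the focus on TP and TN cases.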
Algorithm 2: Balance Theory-based Aggregation

Input: $G = (V, E^{+}, E^{-})$; node feature $x_i, \forall i \in V$; neighbor nodes $j \in N_i$, where $N_i$ denotes the neighbors of $i$; edge feature between positive neighbors $D^{+}_{i,j}, \forall e_{i,j} \in E^{+}$; edge feature between negative neighbors $D^{-}_{i,j}, \forall e_{i,j} \in E^{-}$; number of layers $L$; weight matrices $W^{\mathrm{pos}(l)}$ and $W^{\mathrm{neg}(l)}, \forall l \in \{1 \ldots L\}$; activation function $\sigma$
Output: node embedding vectors determining tie valence $z_i, \forall i \in V$

    // (Optional attention-based) node aggregation for neighbors
 1  $F^{+}(l, i) := \mathrm{ATTENTION}_{j \in N^{+}_i}(h^{\mathrm{pos}(l)}_j)$
 2  $F^{-}(l, i) := \mathrm{ATTENTION}_{j \in N^{-}_i}(h^{\mathrm{pos}(l)}_j)$
    // Initialize the first layer with given node features
 3  $h^{(0)}_i \leftarrow x_i, \forall i \in V$
 4  for $i \in V$ do
        // Aggregate edge information between positive neighbors
 5      $h^{D^{+}}_i = \mathrm{POOL}(D^{+}_{i,j}), \forall j \in N^{+}_i$
        // Aggregate edge information between negative neighbors
 6      $h^{D^{-}}_i = \mathrm{POOL}(D^{-}_{i,j}), \forall j \in N^{-}_i$
        // Obtain positive hidden state for the first layer
 7      $h^{\mathrm{pos}(1)}_i = \sigma\left(W^{\mathrm{pos}(1)} \left[F^{+}(0, i), h^{D^{+}}_i, h^{D^{-}}_i, h^{(0)}_i\right]\right)$
        // Obtain negative hidden state for the first layer
 8      $h^{\mathrm{neg}(1)}_i = \sigma\left(W^{\mathrm{neg}(1)} \left[F^{-}(0, i), h^{D^{+}}_i, h^{D^{-}}_i, h^{(0)}_i\right]\right)$
 9  end
10  if $L > 1$ then
11      for $l \in \{1, 2, \ldots, L\}$ do
12          for $i \in V$ do
                // Obtain positive hidden state from the previous layer
13              $h^{\mathrm{pos}(l)}_i = \sigma\left(W^{\mathrm{pos}(l)} \left[F^{+}(l-1, i), F^{-}(l-1, i), h^{\mathrm{pos}(l-1)}_i, h^{D^{+}}_i, h^{D^{-}}_i\right]\right)$
                // Obtain negative hidden state from the previous layer
14              $h^{\mathrm{neg}(l)}_i = \sigma\left(W^{\mathrm{neg}(l)} \left[F^{+}(l-1, i), F^{-}(l-1, i), h^{\mathrm{neg}(l-1)}_i, h^{D^{+}}_i, h^{D^{-}}_i\right]\right)$
15          end
16      end
17  end
18  $z_i \leftarrow \left[h^{\mathrm{pos}(L)}_i, h^{\mathrm{neg}(L)}_i\right], \forall i \in V$

5.1 Node-level traits

A workplace is usually a unique mixture of individuals from different age groups, nationalities, and ethnicities, who take up different departmental roles according to their knowledge, skills, abilities, and experiences. We explore the HR information of the people in the S and D sets and find that the information uncovered by the inclusion of digital exchange data can exhibit certain patterns.
Specifically, we analyze age distributions, and manager, minority, and gender ratios.

Table 1. Performance comparison for exTV's sentiment embeddings (exTV-NLP). Combinations of the input components also serve as an ablation study.

Model                  F1     Precision  Recall  AUC
LIWC & VADER only      0.396  0.328      0.500   0.5
ELECTRA only           0.517  0.576      0.575   0.575
exTV-NLP (no_leaves)   0.567  0.648      0.579   0.578
exTV-NLP (no_meta)     0.581  0.601      0.580   0.581
exTV-NLP (no_MHSA)     0.579  0.618      0.582   0.582
exTV-NLP               0.615  0.650      0.612   0.612

Table 2. Performance comparison with and without sentiment embeddings from exTV against various baselines. Rows marked (exTV) include the sentiment embeddings.

Model            F1     Precision  Recall  AUC
XGBoost          0.625  0.641      0.621   0.621
XGBoost (exTV)   0.660  0.668      0.655   0.655
SiGAT            0.586  0.619      0.586   0.666
SiGAT (exTV)     0.579  0.581      0.578   0.673
SLF              0.671  0.708      0.662   0.743
SLF (exTV)       0.676  0.685      0.671   0.765
SGCN             0.680  0.704      0.672   0.781
SGCN (exTV)      0.728  0.726      0.730   0.819

[Figure 3, panels: (a) Median age, (b) Ratio of manager, (c) Ratio of minority, (d) Ratio of females]
Fig. 3. Ratios of age, manager, minority, and gender for the S (Same) and D (Dif.) sets. "All" represents the respective values for the entire dataset.

Figures 3(a) to 3(d) present the results of this analysis. It can be observed that people in the D set have a higher median age on average. The minority and female ratios are lower, whereas the ratio of managers is roughly the same. This analysis also points out that, in this data, the identification of ties for people of relatively higher age, non-minorities, and males has a higher ambiguity, and that utilizing communication data mitigates this ambiguity. The discovery of such patterns can be utilized by management to design better programs and policies that promote better communication among employees, tailored to the organization's needs.
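The node-level comparison of Section 5.1 amounts to summarizing HR attributes of the people who appear in each edge set. The sketch below shows one way to compute such a summary; the record layout (`age`, `is_manager`, `is_minority`, `is_female`) and the toy values are hypothetical and are not taken from the paper's data.

```python
from statistics import median


def people_in_edges(edges):
    """Unique individuals appearing as an endpoint of any edge."""
    return sorted({person for edge in edges for person in edge})


def trait_summary(edges, hr_records):
    """Median age and manager/minority/female ratios for people in an edge set.

    hr_records: dict person id -> dict with the hypothetical keys
    'age', 'is_manager', 'is_minority', 'is_female'.
    """
    people = [hr_records[p] for p in people_in_edges(edges)]
    n = len(people)
    return {
        "median_age": median(p["age"] for p in people),
        "manager_ratio": sum(p["is_manager"] for p in people) / n,
        "minority_ratio": sum(p["is_minority"] for p in people) / n,
        "female_ratio": sum(p["is_female"] for p in people) / n,
    }


hr = {
    "u1": {"age": 30, "is_manager": True, "is_minority": False, "is_female": True},
    "u2": {"age": 40, "is_manager": False, "is_minority": False, "is_female": False},
    "u3": {"age": 50, "is_manager": False, "is_minority": True, "is_female": False},
}
summary = trait_summary([("u1", "u2"), ("u2", "u3")], hr)
```

Running the same summary on the S set, the D set, and the full dataset gives the three bars shown per panel in Fig. 3.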
5.2 Edge-level features

The edge-level features are directly associated with the tie valence for a pair of individuals, and studying these features can aid the understanding and prediction of the relational edges. We analyze two types of edge features in our data.

5.2.1 Meta Features. We engineered meta-features from the properties of the network formed by the email exchange information. These features were designed to be informative of the characteristics of the network formed, indicative of the relationship between a pair of employees, and logically comprehensible. Examples include the total number of emails exchanged, average message length, and message frequency. All meta-features are listed in Table 3 of the supplementary information. Another consideration was the day-of-week effect. We segregated the exchanges into weekdays and weekends to examine this pattern further and obtained intriguing results. For the Dif. edges, the average number of emails exchanged during weekdays is lower than for the Same edges, whereas these numbers are substantially higher during the weekends. Additional exploratory analysis is presented in A.1.

[Figure 4, panels: (a) WPS, (b) I, (c) They, (d) Anger, (e) Sexual, (f) Religious, (g) Time, (h) Money]
Fig. 4. Selected LIWC feature values for the S (same edges) and D (different edges) sets. The y-axis represents the LIWC score, which is the percentage of words in the text belonging to that dictionary. The values atop the bars are the respective percentage values. Each subfigure's title is the feature name in the LIWC output. "WPS" (a) stands for words per sentence; "I" (b) and "They" (c) present the percentages of these pronouns; and, similarly, "Anger" (d), "Sexual" (e), "Religious" (f), "Time" (g), and "Money" (h) present the percentages of the corresponding words. This figure clearly highlights differences in the nature of communication in the S and D sets.

5.2.2 LIWC Features.
LIWC [40] is a non-parametric sentiment lexicon that reveals thoughts, feelings, personality, and motivations based on the percentages of words describing different contexts. LIWC takes text as input and outputs a vector of length 93 whose entries are the percentages of words belonging to the respective dictionaries. The individual outputs and their values can be interpreted to analyze the underlying sentiment in the text. We summarize insights from the data in Figures 4(a) to 4(h), which highlight a clear difference in the nature of communication between the S and D sets. The WPS (words per sentence) feature value is substantially lower for the D set, as is the use of first-person singular pronouns (I). However, the use of the third-person plural pronoun (they) is higher. Further, the results also suggest that individuals in the D set communicate less about time and more about money. We also find a difference in the use of words concerning anger, sexuality, swearing, and religion, with such verbiage being absent from the edges in set D.

5.3 Label distributions for S and D sets

We explored the ratio of negative ties for both sets. The values come out to 0.69 and 0.56 for the S and the D sets, respectively. The newly classified edges exhibit a lower ratio of negative ties, that is, a larger share of positive ties. The inclusion of email embeddings resulted in better identification of positive edges or emotions. This finding further implies that the final classifier, ingesting the merged information of network and email embeddings, finds an improved signal for the positive relations. It is worth pointing out here that the distribution of labels in the underlying data is imbalanced in favor of negative ties (2:1), potentially resulting in the deep learning model overfitting on this class, mainly due to the smaller data size.
Despite this, the analysis reveals that our classifier, and hence the NLP model in stage 1, is finding better text-based indicators for the smaller class. This finding signifies that text-based positive sentiment is easier to discern than its negative counterpart.

The data show that continual communication on weekends, with a much lower tendency for anger, anxiety, and profanity, signifies a sound positive relation between individuals. Heavy usage of first-person pronouns can be indicative of a preoccupation with one's thoughts and depression [18], and lower usage of such pronouns can suggest a positive relational perspective. Greater use of group-connotation pronouns also signals a reduced prevalence of depression [46]. For instance, the usage of "they" is higher in the D set. Similarly, the lower WPS count in the Dif. edges might reflect a more casual form of relationship, akin to friendly acquaintances exchanging information on a digital messaging service, rather than employing long, formal sentences.

The model's superiority stems from the sentiment embeddings, as presented in Table 1. Further, a detailed comparison of results with and without sentiment embeddings identifies new relationships. Out of the 85 employees in the test data, the inclusion of sentiment embeddings leads to a large performance gain, with an accurate tie-valence identification for 16 additional individuals (18.82% of all employees). Such findings can greatly aid not only in shaping training programs, but also in the identification and prediction of the tie valences in an organization.

6 DISCUSSION AND CONCLUSION

This paper presented an ensemble model utilizing anonymized employee information to identify tie valences in an organizational social network.
While ensemble models and their applications have been well researched [10, 11, 17, 20, 31, 34], to our knowledge, this is the first deep-learning-based ensemble model that leverages archived message text and employee information to learn and predict tie valences in a context involving hierarchy and organizational structure. While this is a data-driven paper, the computational algorithms employed in this work incorporate the most prominent theories in signed social networks: balance theory [8] and status theory [19, 48]. Our model can be applied by third-party providers in a live setting to unobtrusively analyze traffic on an organization's digital communication platforms and deliver useful insights to the client, including providing feedback on emerging interdepartmental conflicts that could threaten the organization's functioning or tracking increased collaboration in a post-merger context [12, 47], all while protecting individuals' anonymity.

This research is built on a snapshot of a much larger dataset and shows that deep learning can be used to better predict and analyze workplace tie valences. Yet, our findings have implications beyond organizational structures and can be used in any online domain, for example on various online collaboration platforms (e.g., Trello, Slack).

Our work suggests directions for future research. As the size of our dataset was relatively small, we expect the model's learning capability to increase greatly when trained over the entire corporate dataset. Such a larger dataset can be used for modeling. Owing to data limitations, we have employed transductive GNNs for the model's second stage; however, larger datasets will warrant using inductive approaches like GraphSAGE [23]. Similarly, an end-to-end training regimen will allow for improved learning of embeddings in both of the stages.
Findings from this work have direct implications for promoting positively valenced relations and collaboration in organizational and other social networks. The proposed method can be employed in any social network by replacing or augmenting the information exchange mechanism, e.g., social media posts, instant messages, bulletin boards, and calendar invites. The relatively high performance of exTV also renders it helpful in pursuing what-if analysis in social networks.

While the use of personal, conversational data is always fraught with privacy concerns, there are many ways to manage the risk. We suggest separating the provider firm that collects and analyzes the data from the client firm that receives the anonymized, aggregated suggestions for improvement. Combining this with a machine-learning-based method that does not involve humans accessing private data allows the client firm to protect its employees while deriving valuable insights that can improve the organization's functioning and the employees' mental health and career outcomes.

ACKNOWLEDGMENTS

The authors are grateful for the assistance of the organization's CHRO, as well as all the employees who assisted in this data collection (particularly within the IT department), without whom this research would not have been possible. This work was supported by the Institute for Basic Science (IBS), Republic of Korea, under IBS-R029-C2 (Singh, Lee, and Cha).

REFERENCES

[1] Filip Agneessens and Giuseppe (Joe) Labianca. 2022. Collecting survey-based social network information in work organizations. Social Networks 68 (2022), 31–47.
[2] Priyanka Agrawal, Vikas K. Garg, and Ramasuri Narayanam. 2013. Link Label Prediction in Signed Social Networks. In proc. of the IJCAI 2013. 2591–2597.
[3] Chester Irving Barnard. 1968. The functions of the executive. Vol. 11. Harvard University Press.
[4] Ghazaleh Beigi, Suhas Ranganath, and Huan Liu. 2019. Signed Link Prediction with Sparse Data: The Role of Personality Information. In proc.
of the WWW. 1270–1278.
[5] Kamal Berahmand, Elahe Nasiri, Saman Forouzandeh, and Yuefeng Li. 2022. A preference random walk algorithm for link prediction through mutual influence nodes in complex networks. Journal of King Saud University - Computer and Information Sciences 34, 8, Part A (2022), 5375–5387.
[6] Kamal Berahmand, Elahe Nasiri, Mehrdad Rostami, and Saman Forouzandeh. 2021. A modified DeepWalk method for link prediction in attributed social network. Computing 103, 10 (2021), 2227–2249.
[7] Xiaoyan Cai, Junwei Han, and Libin Yang. 2018. Generative Adversarial Network Based Heterogeneous Bibliographic Network Representation for Personalized Citation Recommendation. In proc. of the AAAI. 5747–5754.
[8] Dorwin Cartwright and Frank Harary. 1956. Structural balance: a generalization of Heider's theory. Psychological Review 63, 5 (1956).
[9] Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In proc. of the ACM SIGKDD. 785–794.
[10] Yi-Ling Chen, Ming-Syan Chen, and Philip S. Yu. 2015. Ensemble of Diverse Sparsifications for Link Prediction in Large-Scale Networks. In 2015 IEEE International Conference on Data Mining. 51–60.
[11] Yen-Liang Chen, Chen-Hsin Hsiao, and Chia-Chi Wu. 2022. An ensemble model for link prediction based on graph embedding. Decision Support Systems 157 (2022), 113753.
[12] Chia-Yen (Chad) Chiu, Prasad Balkundi, Bradley P Owens, and Paul E Tesluk. 2020. Shaping positive and negative ties to improve team effectiveness: The roles of leader humility and team helping norms. Human Relations (2020).
[13] Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. In proc. of the ICLR.
[14] James A Davis. 1967. Clustering and structural balance in graphs. Human Relations 20, 2 (1967), 181–187.
[15] James A Davis and Samuel Leinhardt. 1967. The structure of positive interpersonal relations in small groups. (1967).
[16] Tyler Derr, Yao Ma, and Jiliang Tang. 2018. Signed graph convolutional networks. In proc. of the IEEE ICDM. 929–934.
[17] Liang Duan, Shuai Ma, Charu Aggarwal, Tiejun Ma, and Jinpeng Huai. 2017. An Ensemble Approach to Link Prediction. IEEE Transactions on Knowledge and Data Engineering 29, 11 (2017), 2402–2416.
[18] To'Meisha Edwards and Nicholas S. Holtzman. 2017. A meta-analysis of correlations between depression and first person singular pronoun use. Journal of Research in Personality 68 (2017), 63–68.
[19] M Hamit Fişek, Joseph Berger, and Robert Z Norman. 1991. Participation in heterogeneous and homogeneous groups: A theoretical integration. Amer. J. Sociology 97, 1 (1991), 114–142.
[20] Saman Forouzandeh, Kamal Berahmand, Elahe Nasiri, and Mehrdad Rostami. 2021. A Hotel Recommender System for Tourists Using the Artificial Bee Colony Algorithm and Fuzzy TOPSIS Model: A Case Study of TripAdvisor. International Journal of Information Technology & Decision Making 20, 01 (2021), 399–429.
[21] CHE Gilbert and Erric Hutto. 2014. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In proc. of the ICWSM.
[22] Avni Gulati and Magdalini Eirinaki. 2019. With a Little Help from My Friends (and Their Friends): Influence Neighborhoods for Social Recommendations. In proc. of the WWW. 2778–2784.
[23] Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In proc. of the NeurIPS. 1024–1034.
[24] Xiaorong Hao, Tao Lian, and Li Wang. 2020. Dynamic Link Prediction by Integrating Node Vector Evolution and Local Neighborhood Representation. In proc. of SIGIR. 1717–1720.
[25] Nicholas M Harrigan, Giuseppe (Joe) Labianca, and Filip Agneessens. 2020.
Negative ties and signed graphs research: Stimulating research on dissociative forces in social networks. Social Networks 60 (2020), 1–10.
[26] Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua. 2016. Fast Matrix Factorization for Online Recommendation with Implicit Feedback. In proc. of SIGIR. 549–558.
[27] Junjie Huang, Huawei Shen, Liang Hou, and Xueqi Cheng. 2019. Signed graph attention networks. In proc. of the ICANN. 566–577.
[28] Humanyze. 2011. Humanyze Workplace Analytics. https://humanyze.com/
[29] Mohammad Raihanul Islam, B. Aditya Prakash, and Naren Ramakrishnan. 2018. SIGNet: Scalable Embeddings for Signed Networks. In proc. of PAKDD. 157–169.
[30] Giuseppe Joe Labianca. 2014. Negative ties in organizational networks. In Contemporary perspectives on organizational social networks. Emerald Group Publishing Limited.
[31] Kuanyang Li, Lilan Tu, and Lang Chai. 2020. Ensemble-model-based link prediction of complex networks. Computer Networks 166 (2020), 106978.
[32] Xiaoming Li, Hui Fang, and Jie Zhang. 2017. Rethinking the Link Prediction Problem in Signed Social Networks. In proc. of the AAAI. 4955–4956.
[33] Yi Liao, Xin Jiang, and Qun Liu. 2020. Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020. 263–274.
[34] Yu lin He, James N.K. Liu, Yan xing Hu, and Xi zhao Wang. 2015. OWA operator based link prediction ensemble for social network. Expert Systems with Applications 42, 1 (2015), 21–50.
[35] Linyuan Lü and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: statistical mechanics and its applications 390, 6 (2011), 1150–1170.
[36] Scott M. Lundberg, Gabriel G. Erion, Hugh Chen, Alex DeGrave, Jordan M. Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. 2019.
Explainable AI for Trees: From Local Explanations to Global Understanding. Nature Machine Intelligence abs/1905.04610 (2019).
[37] Joshua E Marineau and Giuseppe Joe Labianca. 2021. Positive and negative tie perceptual accuracy: Pollyanna principle vs. negative asymmetry explanations. Social Networks 64 (2021), 83–98.
[38] Maven7. 2017. Maven7 OrgMapper. http://maven7.com/
[39] Microsoft. 2017. Microsoft Workplace Analytics. https://cloudpartners.transform.microsoft.com/practices/workplaceanalytics
[40] James W Pennebaker, Roger J Booth, Ryan L Boyd, and Martha E Francis. 2015. Linguistic Inquiry and Word Count: LIWC2015.
[41] Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. 2008. The graph neural network model. IEEE Transactions on Neural Networks 20, 1 (2008), 61–80.
[42] Kenwyn K Smith. 1989. The movement of conflict in organizations: The joint dynamics of splitting and triangulation. Administrative Science Quarterly (1989), 1–20.
[43] Yiyi Tao, Yiling Jia, Nan Wang, and Hongning Wang. 2019. The FacT: Taming Latent Factor Models for Explainability with Factorization Trees. In proc. of SIGIR. 295–304.
[44] Wiebe Van der Hoek, Louwe Kuijer, and Yì Wáng. 2020. Logics of Allies and Enemies: A Formal Approach to the Dynamics of Social Balance Theory. In proc. of IJCAI 2020.
[45] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NeurIPS. 5998–6008.
[46] Nikhita Vedula and Srinivasan Parthasarathy. 2017. Emotional and Linguistic Cues of Depression from Social Media. In Proceedings of the 2017 International Conference on Digital Health. 127–136.
[47] Vijaya Venkataramani, Giuseppe Joe Labianca, and Travis Grosser. 2013. Positive and negative workplace relationships, social satisfaction, and organizational attachment. Journal of Applied Psychology 98, 6 (2013), 1028.
[48] David Willer. 1999. Network exchange theory. Greenwood Publishing Group.
[49] Pinghua Xu, Wenbin Hu, Jia Wu, and Bo Du. 2019. Link Prediction with Signed Latent Factors in Signed Social Networks. In proc. of the ACM SIGKDD. 1046–1054.
[50] Janice Yap and Nicholas Harrigan. 2015. Why does everybody hate me? Balance, status, and homophily: The triumvirate of signed tie formation. Social Networks 40 (2015), 103–122.
[51] Muhan Zhang and Yixin Chen. 2020. Inductive Matrix Completion Based on Graph Neural Networks. In proc. of the ICLR.

A APPENDIX

A.1 Exploratory Analysis on HR and conversation Data

[Figure 5, panels: (a) Profile and demographic ratios, (b) Number of words in an email, (c) Weekday emails of managers and subordinates, (d) Weekday emails of minorities and non-minorities, (e) Weekday emails of men and women, (f) The label trend of weekday emails. Legend groups: Both Managers / Either Managers / No Managers; Both Minorities / Either Minority / No Minority; Both Male / Either Male / Both Female; Negative / Positive]
Fig. 5. Exploratory analysis. (a) The ratios of different profiles and demographic features. (b) A histogram of the number of words in an email. (c)-(e) Average number of emails exchanged between individuals of different profiles and demographic attributes. (f) Average number of emails exchanged between edges with different tie valences (positive and negative). "Out-" and "In-" emails denote the emails sent and received, respectively.

Figure 5 presents selected exploratory analyses on our dataset. A large majority of the individuals are male (67%); 19% are managers, 16% are ethnic minorities, and 16% are front-line supervisors. The word count of emails has a long-tailed distribution with the peak at 50–100, implying that most of the emails involve shorter conversations.
Combined with the fact that these exchanges are taking place in a formal setting, this makes mining useful information from text more difficult. Figures 5(c) to 5(f) present unique traits (the average number of emails exchanged) of the email exchange patterns between different types of user pairs in our data. In Figure 5(c), it can be observed that managers send and receive the most information among themselves, and subordinates the least. This observation could be attributed to the fact that, by the nature of their role, managers interact with many people and tend to participate in many higher-level meetings. Figure 5(d) shows that, on average, minorities send out more emails to non-minorities (likely driven by numerical probabilities given how few minorities there are), but receive the most emails from minorities themselves, suggesting that email exchange is also driven by social identities. The results in Figure 5(e) show that women are more likely to exchange emails with each other. Finally, Figure 5(f) highlights that, compared to negative ties, people with positive workplace relationships interact substantially more with each other. This suggests that reducing negative ties might improve an organization's effectiveness and productivity by promoting better interactions and exchange.

A.2 Meta Features

Table 3. Designed meta features. All meta features are computed for sent (out) and received (in) messages. <month> is replaced by all months.

Feature name                   Interpretation
total_time_<out/in>            The total period over which emails have been sent/received.
num_email_<out/in>             The number of emails that have been sent/received.
frequency_<out/in>             The frequency at which emails have been sent/received. (num_email_out / total_time_out)
avg_interval_<out/in>          The average interval at which emails have been sent/received.
                               (total_time_out / num_email_out)
med_total_pos_<out/in>         The relative time of the median email that has been sent/received in the total period.
med_year_pos_<out/in>          The year-wise relative time of the median email that has been sent/received.
med_month_pos_<out/in>         The month-wise relative time of the median email that has been sent/received.
num_week_emails_<out/in>       The number of emails that have been sent/received on weekdays.
num_weekend_emails_<out/in>    The number of emails that have been sent/received on weekends.
month_max_num_<out/in>         The maximum number of emails that have been sent/received in a month.
month_max_<month>_<out/in>     The maximum number of emails that have been sent/received in <month>.
avg_length_<out/in>            The average number of characters in an email that has been sent/received.
avg_sentence_len_<out/in>      The average sentence length in an email that has been sent/received.
avg_sentence_num_<out/in>      The average number of sentences in an email that has been sent/received.

A.3 Survey Description

The ground truth network used to train the model was collected via a survey in September 2012 in the new product development unit of this consumer product organization's corporate headquarters. Social network analysis was conducted to help reorganize the unit to speed up new product development. A roster with all 185 employees' names was provided, and the following social network questions were administered, including the positive tie (friend) and negative tie (avoid) questions. Of the 185 sociometric surveys distributed, 144 completed surveys were returned.

• Desired Collaboration – I would be more effective in my work if I were able to collaborate more closely with this person. (respondent will check the appropriate names on the roster)
• Friend – Do you consider this person to be a close friend (e.g., confide in this person)?
(respondent will check the appropriate names on the roster)
• Avoid – Sometimes people at work may make us feel uncomfortable or uneasy and, therefore, we try to avoid interacting with them. Do you try to avoid interacting with this person? (respondent will check the appropriate names on the roster)
• Innovation Ratings – Innovative employees have the ability to effectively generate and implement novel ideas in the workplace. Please rate how innovative you believe each of your coworkers is. (1 = never innovative, 7 = always innovative)

A.4 Shapley Analysis

We present the SHAP analysis [36] of an XGBoost edge-label classifier that utilizes the same information as the full model. As seen in Figure 6, this analysis reconfirms that sentiment embeddings are key features in the classification task. It can also be observed that people who send a relatively higher number of messages on weekends are inclined to give positive workplace relationship feedback. Similarly, older people tend to give more positive feedback as well. On the other hand, belonging to a minority group seems to be correlated with receiving negative feedback in perceived workplace ties.

[Figure 6: SHAP summary plot. Features, top to bottom: SE_9, SE_8, U1_age, U2_age, SE_1, WD_em_in, SE_4, SE_6, SE_0, Num_em_out, Sen_len_in, Sen_num_out, Len_in, U2_minority, WE_em_out. Horizontal axis: SHAP value (impact on model output). Color bar: feature value, from Low to High]
Fig. 6. SHAP summary plot. Left vertical axis: features sorted in descending order of importance. Vertical bar: color legend for feature values. Horizontal axis: how values of different features drive the model output. For display purposes, feature names are shortened. SE_<>: sentiment embeddings; U<>_<age/minority>: user1/user2 age/minority; <WD/WE/Num>_em_<in/out>: number of weekday/weekend/total emails in/out; Sen_<len/num>_<in/out>: sentence length/number in/out.
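For intuition about what a SHAP value measures, the sketch below computes exact Shapley values for a toy value function by averaging each feature's marginal contribution over all feature orderings. This brute-force computation is only an illustration: it is not the tree-based estimator of [36] used for Figure 6, and the value function and feature names (`sentiment`, `age`, `weekend`) are hypothetical.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values via all orderings (feasible only for a few features).

    value_fn: maps a frozenset of present features to a model output.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            # Marginal contribution of f when added to the current coalition.
            totals[f] += value_fn(with_f) - value_fn(coalition)
            coalition = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_model(present):
    """Hypothetical output: additive effects plus one interaction term."""
    score = 0.0
    if "sentiment" in present:
        score += 2.0
    if "age" in present:
        score += 1.0
    if "sentiment" in present and "weekend" in present:
        score += 1.0  # interaction, shared equally between the two features
    return score


phi = shapley_values(toy_model, ["sentiment", "age", "weekend"])
```

The attributions always sum to the difference between the output with all features present and the output with none, which is the property that makes a summary plot like Fig. 6 interpretable as a decomposition of the model output.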
A.5 Additional Exploratory Analysis

This section presents additional exploratory analysis of the valence data. Fig. 7 presents the mean valence score given by managers. The figure presents segregated scores from male and female managers, as well as scores from managers to people in the minority and non-minority sets. Fig. 8 shows the weekend communication trend for different scenarios. An interesting observation here is the pattern for positive and negative edges, with people with positive edges communicating substantially more during the weekend "off-hours".

[Figure 7, panels: (a) Male manager, (b) Female manager, (c) Managers to minorities. Bar labels: To Female Non-Manager, To Male Non-Manager, To Female Manager, To Male Manager; To Male Minority, To Female Minority, To Male Non-Minority, To Female Non-Minority]
Fig. 7. Valence scores by managers

[Figure 8, panels: (a) Managers and subordinates, (b) Minorities and non-minorities, (c) Gender, (d) By label type. Legend groups: Both Managers / Either Managers / No Managers; Both Minorities / Either Minority / No Minority; Both Male / Either Male / Both Female; Negative / Positive]
Fig. 8. Avg. number of weekend emails exchanged

ACM Transactions on Knowledge Discovery from Data (TKDD), Association for Computing Machinery

Multi-Stage Machine Learning Model for Hierarchical Tie Valence Prediction


References (54)

Publisher
Association for Computing Machinery
Copyright
Copyright © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ISSN
1556-4681
eISSN
1556-472X
DOI
10.1145/3579096

Abstract

Multi-Stage Machine Learning Model for Hierarchical Tie Valence Prediction

KARANDEEP SINGH, Data Science Group, Institute for Basic Science, South Korea
SEUNGEON LEE, Data Science Group, Institute for Basic Science; School of Computing, KAIST, South Korea
GIUSEPPE (JOE) LABIANCA, Department of Management, UMass Amherst, USA; Department of Management, University of Exeter, UK
JESSE MICHAEL FAGAN, Department of Management, University of Exeter, UK
MEEYOUNG CHA, Data Science Group, Institute for Basic Science; School of Computing, KAIST, South Korea

Individuals interacting in organizational settings involving varying levels of formal hierarchy naturally form a complex network of social ties having different tie valences (e.g., positive and negative connections). Social ties critically affect employees' satisfaction, behaviors, cognition, and outcomes, yet identifying them solely through survey data is challenging because of the large size of some organizations or the often hidden nature of these ties and their valences. We present a novel deep learning model encompassing NLP and graph neural network techniques that identifies positive and negative ties in a hierarchical network. The proposed model uses human resource attributes as node information and web-logged work conversation data as link information. Our findings suggest that the presence of conversation data improves the tie valence classification by 8.91% compared to employing user attributes alone. This gain came from accurately distinguishing positive ties, particularly for male, non-minority, and older employee groups. We also show a substantial difference in conversation patterns for positive and negative ties, with positive ties being associated with more messages exchanged on weekends and lower use of words related to anger and sadness. These findings have broad implications for facilitating collaboration and managing conflict within organizational and other social networks.
CCS Concepts: • Applied computing → Business Intelligence; • Information systems → Enterprise information systems.

Additional Key Words and Phrases: Signed link prediction, Sentiment embeddings, Graph neural networks, Tie-Valence prediction, Organizational social network

Both authors contributed equally to this research.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

1 INTRODUCTION

There is growing interest in understanding the role of positive and negative network ties or links (recurring relationships that involve enduring valenced interpersonal judgments) in explaining actors' attitudes, behaviors, cognition, and outcomes [25]. Identification of tie valences can be helpful for a variety of practical downstream tasks such as recommending products [26, 51] or friends [22], estimating the impact of a publication [7], and predicting the dynamics of complex social networks [24, 49]. However, despite the amount of time that individuals spend within social structures involving hierarchy and some level of competition, such as work organizations, little is known about tie valence within these networks. Work contexts can, for example, control which individual can gain a promotion; create competition for scarce organizational resources; place people into specialized units that are often at odds with other units; and introduce power relationships that are difficult to ignore. While viewing organizations as a nexus of social relationships has gained ground in the past decade, the use of large-scale data for tie valence detection in a setting involving hierarchy and some conflict is still in its infancy.

What has made this research and its applications particularly challenging is that negative ties are counter-normative (i.e., are generally frowned upon) and are often hidden from view [37]. While colleagues in the same rank might notice conflicts, these conflicts are intentionally suppressed from higher-ranked individuals. This produces a phenomenon where top management is often unaware of important conflicts occurring below them (e.g., interpersonal or interdepartmental disputes) that might need active attention because these conflicts can grow and draw in others [42], ultimately threatening the organization's proper functioning [3]. Thus, a critical task is to understand where there are positive ties that can encourage collaboration and employee attachment to the organization and negative ties that undermine organizational solidarity and goal achievement. Numerous previous studies have shown that the number of positive and negative ties in an organization affects individual and group-based outcomes, including work performance, job satisfaction, and employee turnover [30].
Electronically mediated communication and electronic information exchange patterns could be reflective of the tie valences between the concerned parties. Additionally, the advent of digitization has enabled data collection at an unprecedented scale, which could be mined for discovering hidden patterns of interest. This, coupled with the recognition of the importance of understanding workplace ties, has resulted in the development of various data-driven solutions like Microsoft Workplace Analytics [39], OrgMapper [38], and Humanyze Workplace Analytics [28]. These tools utilize corporate data to drive workplace improvements based on a broad set of features, including those from emails, meetings, collaboration activities, unscheduled calls, and instant messages. In some cases, these providers already collect these data legally on behalf of the client organizations for other purposes. These digital information exchanges can be used to learn relationship patterns and create actionable insights while still protecting individuals' privacy. Yet much of this potential remains underutilized because tie valence prediction is not incorporated into social network analytic tools.

Given the potential offered by these relatively unexplored data, we propose a computational and data-driven approach to the problem of tie-valence prediction in networks involving hierarchy and competition. We utilize disparate sources of real data from an organization and propose a neural network model that extracts the relational sentiment information from unstructured and structured data sources. More specifically, our model, called exTV (Model to extract Tie Valences), is trained on anonymized work conversation data, employees' human resources (HR) data, and a sociometric survey of positive and negative ties among a subsample of members.
exTV is the first of its kind to utilize anonymized official conversation texts exchanged between members of varying ranks in an organizational hierarchy to learn the numeric embeddings that are representative of people's sentiments toward one another. This step employs a natural language processing pipeline designed to handle unstructured textual information and other critical meta information such as messaging frequency and times. Next, these embeddings are used alongside the HR data to build a neural network that identifies relational information among the people and finally delivers the output to be used as the tie valence label.

We explore the model and the underlying data with the results obtained and test which sources and types of data are important for the final classification. In doing so, we discover intricate relationships among individuals of different ranks (e.g., supervisor and subordinate) and individuals of particular profiles (e.g., managers, females, ethnic minorities). We also find that work conversations are a rich resource that uncovers valuable information for discerning additional, and otherwise ambiguous, tie valences. Our model, exTV, can classify tie valence in the studied organizational network with an AUROC score of 0.8190. We intend to apply our model in the future to understand the tie valences in the rest of our large organization's network. Our model can offer key insights into downstream tasks such as improving organizational restructuring and post-merger integration, increasing workplace attachment, and decreasing conflict and employee attrition rates, leading to improved organizational functioning and employee well-being.

Ethical considerations.
Analyzing interpersonal connections in general, and tie valence in particular, can provide valuable insights into the workplace environment while being conducted ethically in a manner that protects individual privacy. One key is separating the firm gathering the data and conducting the network analyses (the third-party provider) from the client organization that receives recommendations, much as is done in numerous other contexts (e.g., third-party firms often handle client companies' sexual harassment or whistleblower allegations to maintain confidentiality for all parties and provide neutral assessments). The client only receives anonymized, aggregated results that, while actionable, protect individuals' data privacy (e.g., identifying where there is increasing interdepartmental conflict in the organization to initiate conflict management techniques quickly). These data are obtained legally because, in most of the world, the data are owned by companies, and employees are made aware that anything they transmit over company networks can be accessed by those companies (with the European Union being the main notable exception).

2 RELATED WORK

Numerous social theories have been proposed to understand positive and negative ties in networks, including in organizational contexts [50]; these theories are often used when attempting tie valence detection. The most commonly used theory is structural balance theory [8, 14], along with its newer variants such as [44]. Sentiment lexicons like LIWC [40] and VADER [21] have also been a popular choice for linguistic-feature-based sentiment classification. Other popular methods in link prediction, such as the Jaccard coefficient, the resource allocation index, and preferential attachment, can be used to predict "missing" links [35]. In this work, we focus on the tie valence of existing edges in an organizational social network. From a computational perspective, work on predicting network tie valence in organizational contexts is nearly non-existent.
Though there is plenty of research on link prediction in social networks, transferring this to organizational contexts is impracticable due to the different network dynamics in a formal work setting (e.g., one with competing/collaborating departments and a formal reporting hierarchy).

From a machine learning point of view, GNN [41] based models can operate on and learn representations of graph-structured data and have shown improved performance over traditional deep learning approaches. The strength of GNNs comes from their ability to implicitly learn the graph's structure and the neighboring contextual information. Additionally, works like SGCN [16], SiGNet [29], and SiGAT [27] have adopted GNNs to handle directional and signed networks. These approaches often incorporate social balance theories into the training process, thereby amalgamating computational and sociological approaches. For instance, SGCN includes structural balance theory, and SiGAT captures both structural balance theory and status theory [15].

Matrix factorization has also been commonly used for analyzing networks and for link prediction tasks [2, 4]. For example, [4] proposes a matrix factorization-based model that also exploits the users' personality information. [32] rethinks the problem of link prediction by identifying 'no-relation' as a possible future status of the node pairs. Studies like [5, 6] introduce advanced graph embedding methods with techniques such as preferential random walks. Research has also focused on using latent factor models for link prediction tasks, such as [43, 49].

Fig. 1. An example of a signed, directed network with partially missing information (i.e., tie valence from Tom to Sean). People with different roles and attributes form the organization depicted.
3 PROBLEM SETTING

3.1 Dataset

One of the key factors in determining the performance of a machine learning model is the data being utilized for training and testing purposes. Our research problem is not often attempted because employee data from a work organization is rarely available for research consumption. Even rarer is data pertaining to the employees' liking and disliking of their fellow colleagues. Understandably, these data are scarce, and especially so for a computational approach that relies heavily on a machine learning model. The various data sources that we utilize to supervise the training and assess the model output are discussed in the following paragraphs.

Conversation data consist of official conversation exchanges among employees over two years. While the digital conversations could be in any form, such as emails, instant messages, meeting invites, or seminar chat logs, we use emails in this study. The digital exchange of information among employees forms an information exchange network. Along with the exchange patterns, we also have at our disposal the anonymized text of the conversations, where the data is stripped of any personally identifiable information. Each record in this data has multiple features, including message text, ID, timestamp, and the hash digests of the sender and receiver IDs. In total, there are 1,403,303 messages exchanged between 3,404 users. Further, using the exchange patterns, we generate an additional 73 meta features from the conversation data that may reflect the link polarity, e.g., the number of total emails between a user pair and the number of weekend emails.

Human resources (HR) data contain the demographic and work-related attributes for each employee. Features in these data are typical of HR data (e.g., age, gender, department, rank, experience). Other information, such as salary, is anonymized due to privacy concerns.
The HR data is only available for 1022 employees (e.g., it does not cover part-time employees or contractors).

Sociometric survey data from the studied organization contains self-reported workplace relationships in a work unit. The survey contained a questionnaire that inquired about the attitudes of 127 employees towards their colleagues. Responses were provided on a seven-point Likert scale, 1 = 'dislike a lot' and 7 = 'like a lot.' We utilize these data to build the dichotomized ground truth labels for our research problem; specifically, people in an employee's "friend" network have positive labels and those in the "avoid" network have negative labels. State-of-the-art techniques for valid, reliable, and ethical social network survey data collection, including using optimal question wording, Social Network Data Labs, and institutional ethical oversight, were employed [1]. A.3 presents additional details of the survey.

Code link: https://github.com/k-s-b/extv
Due to our non-disclosure agreement with the organization and our Institutional Review Board data management protocol, the raw data cannot be shared.

We reconcile and merge these datasets via employees' anonymized IDs and arrive at the final dataset used as an input to our model. The email and HR data are from an overlapping period, which makes the data unique. This intersection results in data on 98 unique employees with 967 labeled relations among them. Further, these data are split into training, validation, and test sets for model training and analysis purposes. While the final dataset is small in size, our proposed approach is nonetheless able to extract insightful information from the available data annotation. A.1 presents selected exploratory analyses on HR and conversation data.
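The reconciliation and labeling steps described above can be illustrated with a small sketch. This is not the authors' pipeline: the function name `build_labeled_edges`, the record layouts, and the toy IDs are hypothetical, and the sketch assumes that a labeled relation is kept only when both employees appear in the HR data and have exchanged at least one email.

```python
def build_labeled_edges(hr, email_pairs, friend, avoid):
    """Join survey nominations with HR and email data on anonymized IDs.

    'friend' nominations get label 1 and 'avoid' nominations get label 0.
    Assumption: a directed pair is kept only if both ends have HR records
    and the two employees exchanged email in either direction.
    """
    labeled = {}
    for label, nominations in ((1, friend), (0, avoid)):
        for src, dst in nominations:
            has_hr = src in hr and dst in hr
            has_email = (src, dst) in email_pairs or (dst, src) in email_pairs
            if has_hr and has_email:
                labeled[(src, dst)] = label
    return labeled


# Toy inputs (hypothetical hashed IDs and attributes).
hr = {"a1": {"age": 41}, "b2": {"age": 29}, "c3": {"age": 35}}
email_pairs = {("a1", "b2"), ("b2", "c3")}
friend = {("a1", "b2"), ("a1", "d4")}   # d4 has no HR record
avoid = {("c3", "b2"), ("b2", "a1")}

edges = build_labeled_edges(hr, email_pairs, friend, avoid)
```

In this toy example the tie from `a1` to `b2` is positive while the reverse tie is negative, which mirrors the point that survey-reported valences are directed and need not be reciprocated.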
3.2 Signed Edge Prediction Problem

Employees constantly interact with each other in organizational settings, forming a localized social interaction network. Increasingly, these individuals also exchange digital information (e.g., emails, instant messages), forming a virtual social interaction network. The nodes in this network are the individuals, and the traces of digital information exchange determine the edges. When this virtual network is reconciled with the survey responses indicating the perceived link polarity between the people in the offline network, the virtual network's edges can be annotated as positive or negative ties. Overall, this process constitutes a directed, signed, as well as node- and edge-attributed social network, where the employees are the nodes and their tie valences the network's edges.

Given the importance of understanding and predicting the valence of ties between employees, our research problem can be defined as a binary classification task with $\{0, 1\}$ labels, denoting negative and positive ties. We formulate this problem with a multi-modal deep-learning approach that uses natural language processing (NLP) and graph-based deep-learning techniques. More formally, an organization's social network can be viewed as a graph $G = (V, E^+, E^-)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of $n$ employees, while $E^+ \subset V \times V$ and $E^- \subset V \times V$ represent the sets of positive and negative edges. A labeled edge $e_{i,j}$ between an employee pair $v_i, v_j \in V$, $\forall e_{i,j} \in \{E^+ \cup E^-\}$, is representative of the sentiment from $v_i$ to $v_j$. Note that the signs of $e_{i,j}$ and $e_{j,i}$ could be different, as one's perception of a person as a "friend" may not necessarily be reciprocated. $x$ denotes the feature matrix that contains each employee's personal information, $\forall v \in V$. $D^+$ and $D^-$ are the matrices that contain edge features, including conversation information, for the edges $\forall e \in E^+$ and $\forall e \in E^-$, respectively.
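As a concrete reading of this formulation, the following minimal sketch stores a signed, directed graph as a node set plus separate positive and negative edge sets. The class name `SignedGraph` and its methods are hypothetical, chosen only for illustration; Tom and Sean follow the example in Fig. 1, while Ann is an invented third employee.

```python
from dataclasses import dataclass, field


@dataclass
class SignedGraph:
    """Directed signed graph G = (V, E+, E-) with {0, 1} tie labels."""
    nodes: set = field(default_factory=set)
    pos_edges: set = field(default_factory=set)  # E+: directed (src, dst) pairs
    neg_edges: set = field(default_factory=set)  # E-: directed (src, dst) pairs

    def add_edge(self, src, dst, label):
        """Add the directed tie src -> dst with label 1 (positive) or 0 (negative)."""
        if label not in (0, 1):
            raise ValueError("label must be 0 (negative) or 1 (positive)")
        self.nodes.update((src, dst))
        edge = (src, dst)
        if label == 1:
            self.neg_edges.discard(edge)
            self.pos_edges.add(edge)
        else:
            self.pos_edges.discard(edge)
            self.neg_edges.add(edge)

    def label(self, src, dst):
        """Return 1, 0, or None when the valence of src -> dst is unknown."""
        if (src, dst) in self.pos_edges:
            return 1
        if (src, dst) in self.neg_edges:
            return 0
        return None


g = SignedGraph()
g.add_edge("Sean", "Tom", 1)   # Sean reports a positive tie to Tom
g.add_edge("Tom", "Ann", 0)    # Tom reports a negative tie to Ann
g.add_edge("Ann", "Tom", 1)    # Ann's view of Tom is not reciprocated
```

Here `g.label("Tom", "Sean")` returns `None`, which corresponds to the kind of missing tie valence the classifier is meant to predict.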
4 METHODS

The model's first stage processes the raw email text and aims to extract numeric embeddings that represent the underlying reported sentiments. We first explore standard NLP approaches, including sentiment lexicons like LIWC [40] and VADER [21]. Then, we design a neural network that builds upon state-of-the-art deep learning NLP models while accommodating outputs from the above-mentioned standard approaches.

Because the network is composed of sentient people who can influence each other's attitudes, which can in turn alter their relationships with other people, we need to add a relational component to the machine learning process. Therefore, in the second stage of exTV, the outputs from stage one, along with other designed meta-features, are passed to approaches like graph neural networks and matrix factorization that take relational information into account while learning the target embeddings. Further, the "signed" variants of these approaches also segregate these relational contexts into positive and negative before learning the embeddings to be further utilized for the final classification. The overall model architecture is depicted in Figure 2.

[Figure 2: model diagram. Blocks: emails between people (e.g., between A and B); NLP model (ELECTRA); sentiment labels; sentiment embedding; HR data; node features; edge features; construction of the network graph; graph neural network; edge prediction (positive or negative).]

Fig. 2. Overview of the model: exTV consists of two stages, the NLP stage and the semi-supervised signed GNN stage. The NLP stage extracts text embeddings ("Sentiment Embedding" in this figure) with the context of emails exchanged between the employee pairs. The ground truth is the sociometric survey data on the tie valence between the employees. The network graph is constructed with employees as nodes and the email exchanges as the edges.
As the learnt embeddings represent the sentiment between a pair of employees, these embeddings are concatenated with other edge-level features obtained from the input data. Finally, the node features, updated edge features, and other meta features are leveraged in the semi-supervised signed GNN stage to perform the final classification.

4.1 Text-to-Sentiment Embeddings

This step's goal is to extract numeric embeddings from the email conversation data that are representative of the reported tie valences among the employees. An employee interacts and exchanges information with other employees via emails (and other means), and the nature of this communication is determined by factors such as department affiliation, roles, rank, and perceived tie valences. We posit that the email text should carry information that is reflective of the nature of the relationship between people and that it could be utilized for the final classification task. We begin by cleaning the conversation data (i.e., email in this study, but this could be replaced by other types) of unwanted noise by removing signatures and addresses, automated messages, and salutations.

The ground-truth labels (the survey data) only exist for a small portion of the otherwise large network with ample conversation text data (approx. 1 million messages in total). Utilizing the whole network and the accompanying exchange data can be potentially advantageous in discovering important hidden features and in enabling the model to "learn" the structure of the underlying text. To enable this utilization, we fine-tune the pre-trained, state-of-the-art ELECTRA [13] model in an unsupervised fashion. This is accomplished by fine-tuning the model with masked language modeling (MLM).
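The input side of masked language modeling can be sketched in a few lines. This is a simplified illustration rather than the authors' training code: the function name `mask_tokens`, the 15% masking rate, and the `[MASK]` placeholder are assumptions borrowed from common BERT-style practice, and whitespace tokenization stands in for a real subword tokenizer.

```python
import random


def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of tokens and return (masked input, targets).

    `targets` maps each masked position to the original token, i.e. what
    the model is trained to recover from the unmasked context.
    The rate and placeholder are illustrative assumptions.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(tokens)))
    masked_positions = rng.sample(range(len(tokens)), n_mask)
    inputs = list(tokens)
    targets = {}
    for pos in masked_positions:
        targets[pos] = inputs[pos]
        inputs[pos] = mask_token
    return inputs, targets


# Hypothetical workplace message, tokenized on whitespace for simplicity.
tokens = "please send the revised budget report before the friday team meeting".split()
inputs, targets = mask_tokens(tokens)
```

Because no labels are needed to build `targets`, this kind of objective can be applied to the entire unlabeled email corpus, which is the property the fine-tuning step relies on.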
Eq. 1 represents the training objective of MLM, where $\Pi$ denotes the indices of the masked tokens, $\pi_k$ is the $k$-th index in $\Pi$, and $w_{\Pi}$ and $w_{-\Pi}$ denote the sets of masked tokens and unmasked tokens, respectively [33]:

$$L_{\mathrm{MLM}}(w_{\Pi} \mid w_{-\Pi}) = \sum_{k=1}^{|\Pi|} \log p(w_{\pi_k} \mid w_{-\Pi}) \qquad (1)$$

As strong sentiments are rarely expressed in workplace messages, the lack of informative signal could potentially lead to the model learning over-smoothed sentiment embeddings. Letting our model ingest the entire conversation data customizes the model weights to the mostly neutral, formal conversation style of an organizational workplace.

Thereafter, we employ the unsupervised fine-tuned ELECTRA to generate numeric embeddings for all messages pertaining to all pairs of users involved in the email exchange. The email exchange data is naturally unstructured, as a variable number of messages of different lengths are exchanged by each person-pair. We design a methodology where this data can be transformed into a fixed-sized embedding for each pair of users in the exchange data. For each pair, the numeric embeddings for each message obtained from the previous step are stacked and passed through a multi-head self-attention (MHSA) layer as described in Eq. 2. The MHSA step relates different messages in the input and updates the initial embeddings to further "highlight" the underlying sentiment.

Let $F^Q$, $F^K$, and $F^V$ denote matrices for queries, keys, and values in self-attention [45], respectively. Scaled dot product attention is defined as:

$$\mathrm{Attn}(F^Q, F^K, F^V) = \mathrm{softmax}\left(\frac{F^Q (F^K)^{\top}}{\sqrt{d_k}}\right) F^V, \qquad (2)$$

where $d_k$ is the dimension of the key vectors. This is used to capture the similarity between the $F^Q$ and $F^K$ vectors. Instead of performing a single attention function, the queries, keys, and values can be projected with different learned linear projections, on which the attention mechanism can be performed in parallel.
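To make the scaled dot product attention of Eq. 2 concrete, here is a minimal single-head sketch in plain Python. It is illustrative only: the function names are hypothetical, matrices are lists of rows, there are no learned projections, and a practical implementation would use a tensor library instead.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for matrices given as lists of rows."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Each output row is a weighted average of the value rows.
        output.append([sum(w * v[c] for w, v in zip(weights, V))
                       for c in range(len(V[0]))])
    return output


# Self-attention over three hypothetical 2-d message embeddings: Q = K = V = S.
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = scaled_dot_product_attention(S, S, S)
```

Each row of `updated` is a convex combination of the rows of `S`, so every message embedding is re-expressed in terms of the other messages exchanged by the same pair.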
To accomplish this goal of modeling different aspects of interactions between the different messages, multi-head (MH) self-attention is utilized:

$$\mathrm{MH}(F^Q, F^K, F^V) = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_H)\, W^O, \qquad (3)$$

where $\mathrm{head}_i = \mathrm{Attn}(F^Q W_i^Q, F^K W_i^K, F^V W_i^V)$, $W_i^Q, W_i^K, W_i^V \in \mathbb{R}^{d \times d_k}$ and $W^O \in \mathbb{R}^{H d_k \times d}$ are learnable weights, and $H$ is the number of heads. The dense layers (i.e., $W_i^Q$, $W_i^K$, $W_i^V$) are used to project the queries, keys, and values into their vector spaces. Since the queries, keys, and values are all equal to the messages pertaining to a person-pair, i.e., $F^Q = F^K = F^V = S$, we can produce the multi-head attention-aware sentiment embedding matrix as $S = \mathrm{MH}(S, S, S)$. The input to the MHSA layer is padded (and masked) to the length of the longest vector in a batch. Next, the output of the MHSA layer is pooled via mean-pooling to arrive at a fixed-sized embedding across all user pairs. Subject to ELECTRA, email messages longer than 512 words are truncated, and those of shorter lengths are padded, though as shown in Figure 5(b), most email messages are shorter than the upper limit of 512 words and there is rarely any information loss due to this limitation.

To strengthen and aid the sentiment-extraction process, we utilize the outputs of two sentiment lexicons: LIWC [40] and VADER [21]. Both LIWC and VADER are fed the concatenated messages for all user-pairs. LIWC reads the input text and outputs the percentage of words that reflect different emotions, thinking styles, and social concerns. VADER considers the polarity and intensity of emotion of the input text and gives four output scores: positive, negative, neutral, and compound (computed by normalizing the other three scores). We train XGBoost [9] models on the outputs of LIWC and VADER and then let the neural network learn the weights of the concatenated input of original features and the one-hot encoded decision paths in the extracted tree leaves from the trained XGBoost model.
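The concatenation of lexicon features with one-hot encoded tree leaves can be sketched as follows. This is an assumption-laden illustration, not the authors' code: the helper names are hypothetical, the leaf indices are taken as already extracted from the trained trees and re-indexed to 0..n-1 within each tree, and the VADER scores are invented values.

```python
def one_hot_leaves(leaf_ids, leaves_per_tree):
    """One-hot encode the leaf reached in each tree and concatenate the blocks."""
    encoded = []
    for leaf, n_leaves in zip(leaf_ids, leaves_per_tree):
        block = [0.0] * n_leaves
        block[leaf] = 1.0   # mark the leaf this sample falls into
        encoded.extend(block)
    return encoded


def lexicon_input(features, leaf_ids, leaves_per_tree):
    """Original lexicon features followed by the one-hot leaf encoding."""
    return list(features) + one_hot_leaves(leaf_ids, leaves_per_tree)


# Hypothetical VADER scores for one user pair: positive, negative, neutral, compound.
vader_scores = [0.12, 0.05, 0.83, 0.40]
# Assumed: two trees with 4 and 3 leaves; the sample lands in leaf 2 and leaf 0.
vector = lexicon_input(vader_scores, leaf_ids=[2, 0], leaves_per_tree=[4, 3])
```

The resulting vector has one slot per original feature plus one slot per leaf, so a downstream dense layer can assign separate weights to raw scores and to the tree regions a sample falls into.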
This enables the model to learn the relational information between different features of the sentiment lexicons. Finally, ELECTRA embeddings, outputs from the sentiment lexicons, and one-hot encoded leaf embeddings from the sentiment-lexicon XGBoost models are concatenated and fed to a fully connected (FC) layer to produce a fixed-sized embedding vector. The training process is accomplished by performing binary classification against the ground-truth sentiment labels using binary cross-entropy loss. Multiple experiments established that a larger final embedding vector can lead to overfitting. Hence, we apply regularization techniques and maintain a small embedding size to account for the input dataset's size. The sentiment extraction process is presented in Algorithm 1.

Algorithm 1: Sentiment Extraction
Input: emails between pairs $m_{i,j}$, $\forall i \in \{1 \ldots N\}$, $\forall j \in \{1 \ldots n_i\}$, where $n_i$ denotes the number of emails of pair $i$ and $N$ denotes the number of pairs; pre-trained electra; set of entire unlabeled emails $M$
Output: email context embeddings $z_i^{email}$, $\forall i \in \{1, \ldots, N\}$
    // Use LIWC and VADER for aggregated emails per pair.
    $F_{LIWC}(i) := \mathrm{LIWC}(\mathrm{CONCAT}(m_{i,j}))$
    $F_{VADER}(i) := \mathrm{VADER}(\mathrm{CONCAT}(m_{i,j}))$
    // Train an XGBoost model, and extract its leaves.
    $XL(F(i)) := \mathrm{LEAVES}(\mathrm{XGB}(F(i)))$, $F \in \{F_{LIWC}, F_{VADER}\}$, $\forall i \in \{1 \ldots N\}$
    // Unsupervised fine-tuning of ELECTRA via MLM.
    ELECTRA = MLM(electra, $M$)
    for $i \in \{1, 2, \ldots, N\}$ do
        for $j \in \{1, 2, \ldots, n_i\}$ do
            // Get email embeddings with ELECTRA.
            $s_{i,j} = \mathrm{ELECTRA}(m_{i,j})$
        end
        // Update ELECTRA embeddings by MHSA.
        $s_{i,j} = \mathrm{MHSA}(s_{i,j})$
        // Obtain hidden state h by pooling from the MHSA result.
        $h_i^{ELECTRA} = W^{ELECTRA}\, \mathrm{MEAN}(s_{i,j})$
        // Obtain hidden state h from LIWC.
        $h_i^{LIWC} = W^{LIWC}\, [F_{LIWC}(i), XL(F_{LIWC}(i))]$
        // Obtain hidden state h from VADER.
        $h_i^{VADER} = W^{VADER}\, [F_{VADER}(i), XL(F_{VADER}(i))]$
    end
    $z_i^{email} \leftarrow \tanh([h_i^{ELECTRA}, h_i^{LIWC}, h_i^{VADER}])$

4.2 Semi-Supervised Graph Neural Networks

Positive and negative links have different dynamics in a network, and social theories like balance theory offer a systematic way to handle these. Specially designed machine learning approaches such as the signed graph neural network models are the computational counterparts of these sociological approaches, where positive and negative links are initially treated independently, often driven by relevant social theories. The message passing architecture in GNNs in general learns the embedding of a node by leveraging its own and the aggregated neighboring information:

$$h_i^{l} = \sigma\left(h_i^{l-1}, \phi\left(\{h_j^{l-1} : v_j \in \mathcal{N}(v_i)\}\right)\right) \qquad (4)$$

where $\sigma$ is a non-linearity, $\phi$ is a permutation-invariant function, and $\mathcal{N}(v_i)$ are the neighboring nodes of the target node $v_i$ in layer $l-1$. This mechanism is applied in signed networks by segregating the nodes by link polarity and by contrasting the embeddings of these groups of nodes. Any other associated information, such as link direction, topological structure, and associated node features, can easily be incorporated into the model training.

In this work, we employ SGCN [16], which incorporates one of the most noteworthy signed social network theories, balance theory [8], as the base information aggregator. SGCN segregates the positive and negative neighbors based on balance theory and employs a segregated aggregation mechanism as presented in Algorithm 2. Additionally, we explored GNNs like SiGAT [27], which includes both balance theory and another popular signed social network theory, status theory [19, 48]. We account for neighboring nodes that can be reached via a 3-hop path, a hyperparameter design choice. We experimented with two different aggregation mechanisms for improving the representational ability of GNNs.
First, instead of standard order-invariant pooling, we employed an attention-based aggregation mechanism. Second, we designed an aggregator function that can ingest the edge-level features in the node neighborhood. This function enabled the inclusion of topologically relevant relational information in the training process. However, we discovered that despite a substantial increase in computational complexity, including both of these approaches in the SGCN did not lead to a statistically significant performance gain. After extensive experimentation, we concluded that, owing to the small dataset, the gains offered by both these approaches are invariably offset by overfitting. We train a multi-layer perceptron for the final classification by concatenating the sentiment embeddings from the NLP stage and the relational embeddings from the SGCN stage. Algorithm 2 describes the aggregation process.

5 RESULTS

We evaluated the model's performance in various test settings. The merged data were randomly split into training, validation, and test sets (with ratios 0.70, 0.15, and 0.15), resulting in 682, 140, and 145 data points, respectively. As summarized in Tables 1 and 2, model performance for both the NLP stage and the complete model is compared against strong baselines, including XGBoost and various signed GNN models. The embeddings from state-of-the-art NLP-only models are used as baselines for establishing the superiority of the sentiment embeddings extracted via the proposed approach. For the complete model, the 10-run averages are reported with and without sentiment embeddings. We use early stopping on the validation set, and the best Macro-F1, precision, recall, and AUC scores for the test sets are reported. For all models, the best model is selected by searching the embedding size in {32, 64, 128} and the number of epochs in {50, 100, 200}.
The results demonstrate that our approach outperforms all baselines on every reported metric, with a best AUROC score of 0.8190. It is worth mentioning that a logistic regression model (not shown in Tables 1 and 2) yielded a low AUROC score of 0.5610.

To analyze the functioning of the proposed approach, we further explore model behavior under various test conditions. The inclusion of sentiment embeddings clearly delivers a substantial performance boost over the baseline models. Comparing and contrasting the model output with and without text embeddings provides an opportunity to discover unique, informative patterns in the underlying email data. In the absence of datasets similar to the one used in this research, we cannot make a direct empirical comparison between such findings from different organizations, but we aim to provide a roadmap for deploying our model in a live setting. The model code is made publicly available.

We chose SGCN, the best-performing model among the baselines, for this analysis. Specifically, the SGCN model was run with and without sentiment embeddings; then the true positive (TP) and true negative (TN) cases were identified. Further, the TP and TN cases were compared to obtain the set of data points that were correctly identified in both runs (Same, or S) and the set that was newly identified only in the run with sentiment embeddings (Dif., or D). As the performance gain comes from the sentiment information, the new TP and TN can be attributed to new relational patterns uncovered by the email text. Recall that our data points are the relation edges between people formed by exchanging digital information. In this light, we perform an exploratory analysis of the attributes of the relation edges in the S and D sets, as well as of the individuals involved in these edges.
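The construction of the S and D sets can be sketched as below. This is an illustrative reading of the procedure described above, assuming binary labels (1 for a positive tie, 0 for a negative tie) and per-edge predictions from the two runs; the function name and the label encoding are assumptions, not taken from the paper.

```python
def same_and_diff_sets(y_true, pred_without, pred_with):
    """Split correctly classified test edges into the S (Same) and D (Dif.) sets.

    S: edges classified correctly both with and without sentiment embeddings.
    D: edges classified correctly only when sentiment embeddings are included.
    A correct prediction is a true positive or a true negative.
    """
    correct_without = {i for i, (t, p) in enumerate(zip(y_true, pred_without)) if t == p}
    correct_with = {i for i, (t, p) in enumerate(zip(y_true, pred_with)) if t == p}
    same = correct_with & correct_without
    diff = correct_with - correct_without
    return same, diff


# Toy example with five test edges.
y_true = [1, 0, 1, 0, 1]
pred_without = [1, 1, 0, 0, 0]   # run without sentiment embeddings
pred_with = [1, 0, 1, 0, 0]      # run with sentiment embeddings
same, diff = same_and_diff_sets(y_true, pred_without, pred_with)
```

Here edges 0 and 3 are correct in both runs and land in S, while edges 1 and 2 are recovered only with the sentiment embeddings and land in D.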
Algorithm 2: Balance Theory-based Aggregation

Input: $G = (V, E^+, E^-)$; node feature $x_i$, $\forall v_i \in V$; neighbor nodes $v_j \in \mathcal{N}_i$, where $\mathcal{N}_i$ denotes the neighbors of $v_i$; edge feature between positive neighbors $D^+_{i,j}$, $\forall e_{i,j} \in E^+$; edge feature between negative neighbors $D^-_{i,j}$, $\forall e_{i,j} \in E^-$; number of layers $L$; weight matrices $W^{B(l)}$ and $W^{U(l)}$, $\forall l \in \{1 \dots L\}$; activation function $\sigma$
Output: node embedding vectors determining tie valence $z_i$, $\forall v_i \in V$

    // (Optional attention-based) node aggregation for neighbors
1   $F^+(l, i) := \mathrm{ATTENTION}_{v_j \in \mathcal{N}_i^+}(h_j^{(l)})$
2   $F^-(l, i) := \mathrm{ATTENTION}_{v_j \in \mathcal{N}_i^-}(h_j^{(l)})$
    // Initialize the first layer with given node features
3   $h_i^{(0)} \leftarrow x_i$, $\forall v_i \in V$
4   for $v_i \in V$ do
        // Aggregate edge information between positive neighbors
5       $h_i^{D^+} = \mathrm{POOL}(D^+_{i,j})$, $\forall v_j \in \mathcal{N}_i^+$
        // Aggregate edge information between negative neighbors
6       $h_i^{D^-} = \mathrm{POOL}(D^-_{i,j})$, $\forall v_j \in \mathcal{N}_i^-$
        // Obtain positive hidden state for the first layer
7       $h_i^{B(1)} = \sigma\left(W^{B(1)} \left[F^+(0, i),\ h_i^{D^+},\ h_i^{D^-},\ h_i^{(0)}\right]\right)$
        // Obtain negative hidden state for the first layer
8       $h_i^{U(1)} = \sigma\left(W^{U(1)} \left[F^-(0, i),\ h_i^{D^+},\ h_i^{D^-},\ h_i^{(0)}\right]\right)$
9   end
10  if $L > 1$ then
11      for $l \in \{1, 2, \dots, L\}$ do
12          for $v_i \in V$ do
                // Obtain positive hidden state from the previous layer
13              $h_i^{B(l)} = \sigma\left(W^{B(l)} \left[F^+(l-1, i),\ F^-(l-1, i),\ h_i^{B(l-1)},\ h_i^{D^+},\ h_i^{D^-}\right]\right)$
                // Obtain negative hidden state from the previous layer
14              $h_i^{U(l)} = \sigma\left(W^{U(l)} \left[F^+(l-1, i),\ F^-(l-1, i),\ h_i^{U(l-1)},\ h_i^{D^+},\ h_i^{D^-}\right]\right)$
15          end
16      end
17  end
18  $z_i \leftarrow [h_i^{B(L)}, h_i^{U(L)}]$, $\forall v_i \in V$

5.1 Node-level traits

A workplace is usually a unique mixture of individuals from different age groups, nationalities, and ethnicities, who take up different departmental roles according to their knowledge, skills, abilities, and experiences. We explore the HR information of the people in the S and D sets and find that the information uncovered by the inclusion of digital exchange data can exhibit certain patterns.
Specifically, we analyze age distributions, and manager, minority, and gender ratios.

Table 1. Performance comparison for exTV’s sentiment embeddings (exTV-NLP). Combinations of the input components also serve as an ablation study.

Model                   F1      Precision   Recall   AUC
LIWC & VADER only       0.396   0.328       0.500    0.5
ELECTRA only            0.517   0.576       0.575    0.575
exTV-NLP (no_leaves)    0.567   0.648       0.579    0.578
exTV-NLP (no_meta)      0.581   0.601       0.580    0.581
exTV-NLP (no_MHSA)      0.579   0.618       0.582    0.582
exTV-NLP                0.615   0.650       0.612    0.612

Table 2. Performance comparison with and without sentiment embeddings from exTV against various baselines.

Model             F1      Precision   Recall   AUC
XGBoost           0.625   0.641       0.621    0.621
XGBoost (exTV)    0.660   0.668       0.655    0.655
SiGAT             0.586   0.619       0.586    0.666
SiGAT (exTV)      0.579   0.581       0.578    0.673
SLF               0.671   0.708       0.662    0.743
SLF (exTV)        0.676   0.685       0.671    0.765
SGCN              0.680   0.704       0.672    0.781
SGCN (exTV)       0.728   0.726       0.730    0.819

Fig. 3. Ratios of age, manager, minority, and gender for the S (Same) and D (Dif.) sets. Panels: (a) median age, (b) ratio of managers, (c) ratio of minorities, (d) ratio of females. “All” represents the respective values for the entire dataset.

Figures 3(a) to 3(d) present the results of this analysis. People in the D set have a higher median age. The minority and female ratios are lower, whereas the ratio of managers is roughly the same. This analysis also indicates that, in this data, the identification of ties for older people, non-minorities, and males is more ambiguous, and that utilizing communication data mitigates this ambiguity. Discovery of such patterns can be utilized by management to design better programs and policies that promote better communication among employees, tailored to the organization’s needs.
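The node-level comparison above amounts to computing a few summary statistics per group of people. A minimal sketch is given below; the record layout (the keys `age`, `is_manager`, `is_minority`, `is_female`) is hypothetical and chosen only for illustration, since the HR schema is not described in the text.

```python
from statistics import median


def summarize_people(people):
    """Median age and manager/minority/female ratios for a group of employees.

    `people` is a non-empty list of dicts with the hypothetical keys
    age (number) and is_manager, is_minority, is_female (booleans).
    """
    n = len(people)
    return {
        "median_age": median(p["age"] for p in people),
        "manager_ratio": sum(p["is_manager"] for p in people) / n,
        "minority_ratio": sum(p["is_minority"] for p in people) / n,
        "female_ratio": sum(p["is_female"] for p in people) / n,
    }


# Toy group of four people (values are made up for the example).
group = [
    {"age": 30, "is_manager": False, "is_minority": True, "is_female": True},
    {"age": 40, "is_manager": True, "is_minority": False, "is_female": False},
    {"age": 50, "is_manager": False, "is_minority": False, "is_female": False},
    {"age": 60, "is_manager": False, "is_minority": False, "is_female": True},
]
summary = summarize_people(group)
```

Calling the same function on the people involved in the S edges, the D edges, and the full dataset would give the three bars per panel that a figure like Fig. 3 compares.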
5.2 Edge-level features

The edge-level features are directly associated with the tie valence for a pair of individuals, and studying these features can aid the understanding and prediction of the relational edges. We analyze two types of edge features in our data.

5.2.1 Meta Features. We engineered meta-features from the properties of the network formed by the email exchange information. These features were designed to be informative of the characteristics of the network formed, indicative of the relationship between a pair of employees, and logically comprehensible. Examples include the total number of emails exchanged, average message length, and message frequency. All meta-features are listed in Table 3 of the supplementary information. Another consideration was the day-of-week effect. We separated the exchanges on weekdays from those on weekends to examine this pattern further and obtained intriguing results. For the Dif. edges, the average number of emails exchanged during weekdays is lower than for the Same edges, whereas these numbers are substantially higher during the weekends. Additional exploratory analysis is presented in Section A.1.

Fig. 4. Selected LIWC feature values for the S (same edges) and D (different edges) sets. Panels: (a) WPS, (b) I, (c) They, (d) Anger, (e) Sexual, (f) Religious, (g) Time, (h) Money. The y-axis represents the LIWC score, which is the percentage of words in the text belonging to that dictionary. The values atop the bars are the respective percentage values. Each subfigure’s title is the feature name in the LIWC output. “WPS” (a) stands for words per sentence; “I” (b) and “They” (c) present the percentage of these pronouns; and, similarly, (d) to (h) present the percentage of “Anger”, “Sexual”, “Religious”, “Time”, and “Money” words. This figure clearly highlights differences in the nature of communication in the S and D sets.

5.2.2 LIWC Features.
LIWC [40] is a non-parametric sentiment lexicon that reveals thoughts, feelings, personality, and motivations based on the percentages of words describing different contexts. LIWC takes text as input and outputs a vector of length 93, whose entries are the percentages of words belonging to the respective dictionaries. The individual outputs and their values can be interpreted to analyze the underlying sentiment in the text. We summarize insights from the data in Figures 4(a) to 4(h), which highlight a clear difference in the nature of communication between the S and D sets. The WPS (words per sentence) feature value is substantially lower for the D set, as is the use of first-person singular pronouns (I). However, the use of the third-person plural pronoun (they) is higher. Further, the results suggest that individuals in the D set also communicate less about time and more about money. We also find that words concerning anger, sexuality, swearing, and religion are absent from the edges in set D.

5.3 Label distributions for the S and D sets

We explored the ratio of negative ties for both sets. The values are 0.69 and 0.56 for the S and D sets, respectively. The newly classified edges thus exhibit a lower ratio of negative ties, that is, a larger share of positive ties. The inclusion of email embeddings resulted in better identification of positive edges or emotions. This finding further implies that the final classifier, which ingests the merged network and email embeddings, finds an improved signal for positive relations. It is worth pointing out that the distribution of labels in the underlying data is imbalanced in favor of negative ties (2:1), potentially causing the deep learning model to overfit on this class, mainly because of the small data size.
Despite this, the analysis reveals that our classifier, and hence the NLP model in stage 1, is finding better text-based indicators for the smaller class. This finding signifies that text-based positive sentiment is easier to discern than its negative counterpart.

The data show that continual communication on weekends, with a much lower tendency for anger, anxiety, and profanity, signifies a sound positive relation between individuals. Heavy usage of first-person pronouns can be indicative of a preoccupation with one’s thoughts and of depression [18], and lower usage of such pronouns can suggest a positive relational perspective. Greater use of group-connotation pronouns also signals a reduced prevalence of depression [46]. For instance, the usage of “they” is higher in the D set. Similarly, the lower WPS count in Dif. edges might indicate a more casual form of relationship, akin to friendly acquaintances exchanging information on a digital messaging service, rather than employing long, formal sentences.

The model’s superiority stems from the sentiment embeddings, as presented in Table 1. Further, a detailed comparison of the results with and without sentiment embeddings identifies new relationships. Out of the 85 employees in the test data, the inclusion of sentiment embeddings leads to a large performance gain, with accurate tie-valence identification for 16 additional individuals (18.82% of all employees). Such findings can greatly aid not only in shaping training programs but also in the identification and prediction of tie valences in an organization.

6 DISCUSSION AND CONCLUSION

This paper presented an ensemble model utilizing anonymized employee information to identify tie valences in an organizational social network.
While ensemble models and their applications have been well researched [10, 11, 17, 20, 31, 34], to the best of our knowledge, this is the first deep-learning-based ensemble model that leverages archived message text and employee information to learn and predict tie valences in a context involving hierarchy and organizational structure. Although this is a data-driven paper, the computational algorithms employed in this work incorporate the most prominent theories in signed social networks: balance theory [8] and status theory [19, 48]. Our model can be applied by third-party providers in a live setting to unobtrusively analyze traffic on an organization’s digital communication platforms and deliver useful insights to the client, including feedback on emerging interdepartmental conflicts that could threaten the organization’s functioning, or tracking of increased collaboration in a post-merger context [12, 47], all while protecting individuals’ anonymity.

This research is built on a snapshot of a much larger dataset and shows that deep learning can be used to better predict and analyze workplace tie valences. Our findings also have implications beyond organizational structures and can be used in other online domains, for example on online collaboration platforms (e.g., Trello, Slack).

Our work suggests directions for future research. As our dataset was relatively small, we expect the model’s learning capability to increase greatly when trained on the entire corporate dataset. Owing to data limitations, we employed transductive GNNs for the model’s second stage; larger datasets will warrant inductive approaches such as GraphSAGE [23]. Similarly, an end-to-end training regimen will allow for improved learning of the embeddings in both stages.
Findings from this work have direct implications for promoting positively valenced relations and collaboration in organizational and other social networks. The proposed method can be employed in any social network by replacing or augmenting the information exchange mechanism, e.g., with social media posts, instant messages, bulletin boards, or calendar invites. The relatively high performance of exTV also makes it helpful for what-if analysis in social networks.

While the use of personal, conversational data is always fraught with privacy concerns, there are many ways to manage the risk. We suggest separating the provider firm that collects and analyzes the data from the client firm that receives the anonymized, aggregated suggestions for improvement. Combining this with a machine-learning-based method that does not involve humans accessing private data allows the client firm to protect its employees while deriving valuable insights that can improve the organization’s functioning and the employees’ mental health and career outcomes.

ACKNOWLEDGMENTS

The authors are grateful for the assistance of the organization’s CHRO, as well as all the employees who assisted in this data collection (particularly within the IT department), without whom this research would not have been possible. This work was supported by the Institute for Basic Science (IBS), Republic of Korea, under IBS-R029-C2 (Singh, Lee, and Cha).

REFERENCES

[1] Filip Agneessens and Giuseppe (Joe) Labianca. 2022. Collecting survey-based social network information in work organizations. Social Networks 68 (2022), 31–47.
[2] Priyanka Agrawal, Vikas K. Garg, and Ramasuri Narayanam. 2013. Link Label Prediction in Signed Social Networks. In Proc. of the IJCAI 2013. 2591–2597.
[3] Chester Irving Barnard. 1968. The functions of the executive. Vol. 11. Harvard University Press.
[4] Ghazaleh Beigi, Suhas Ranganath, and Huan Liu. 2019. Signed Link Prediction with Sparse Data: The Role of Personality Information. In Proc.
of the WWW. 1270–1278.
[5] Kamal Berahmand, Elahe Nasiri, Saman Forouzandeh, and Yuefeng Li. 2022. A preference random walk algorithm for link prediction through mutual influence nodes in complex networks. Journal of King Saud University - Computer and Information Sciences 34, 8, Part A (2022), 5375–5387.
[6] Kamal Berahmand, Elahe Nasiri, Mehrdad Rostami, and Saman Forouzandeh. 2021. A modified DeepWalk method for link prediction in attributed social network. Computing 103, 10 (2021), 2227–2249.
[7] Xiaoyan Cai, Junwei Han, and Libin Yang. 2018. Generative Adversarial Network Based Heterogeneous Bibliographic Network Representation for Personalized Citation Recommendation. In Proc. of the AAAI. 5747–5754.
[8] Dorwin Cartwright and Frank Harary. 1956. Structural balance: a generalization of Heider’s theory. Psychological Review 63, 5 (1956).
[9] Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In Proc. of the ACM SIGKDD. 785–794.
[10] Yi-Ling Chen, Ming-Syan Chen, and Philip S. Yu. 2015. Ensemble of Diverse Sparsifications for Link Prediction in Large-Scale Networks. In 2015 IEEE International Conference on Data Mining. 51–60.
[11] Yen-Liang Chen, Chen-Hsin Hsiao, and Chia-Chi Wu. 2022. An ensemble model for link prediction based on graph embedding. Decision Support Systems 157 (2022), 113753.
[12] Chia-Yen (Chad) Chiu, Prasad Balkundi, Bradley P Owens, and Paul E Tesluk. 2020. Shaping positive and negative ties to improve team effectiveness: The roles of leader humility and team helping norms. Human Relations (2020).
[13] Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. In Proc. of the ICLR.
[14] James A Davis. 1967. Clustering and structural balance in graphs. Human Relations 20, 2 (1967), 181–187.
[15] James A Davis and Samuel Leinhardt. 1967. The structure of positive interpersonal relations in small groups. (1967).
[16] Tyler Derr, Yao Ma, and Jiliang Tang. 2018. Signed graph convolutional networks. In Proc. of the IEEE ICDM. 929–934.
[17] Liang Duan, Shuai Ma, Charu Aggarwal, Tiejun Ma, and Jinpeng Huai. 2017. An Ensemble Approach to Link Prediction. IEEE Transactions on Knowledge and Data Engineering 29, 11 (2017), 2402–2416.
[18] To’Meisha Edwards and Nicholas S. Holtzman. 2017. A meta-analysis of correlations between depression and first person singular pronoun use. Journal of Research in Personality 68 (2017), 63–68.
[19] M Hamit Fişek, Joseph Berger, and Robert Z Norman. 1991. Participation in heterogeneous and homogeneous groups: A theoretical integration. Amer. J. Sociology 97, 1 (1991), 114–142.
[20] Saman Forouzandeh, Kamal Berahmand, Elahe Nasiri, and Mehrdad Rostami. 2021. A Hotel Recommender System for Tourists Using the Artificial Bee Colony Algorithm and Fuzzy TOPSIS Model: A Case Study of TripAdvisor. International Journal of Information Technology & Decision Making 20, 01 (2021), 399–429.
[21] C. J. Hutto and Eric Gilbert. 2014. VADER: A parsimonious rule-based model for sentiment analysis of social media text. In Proc. of the ICWSM.
[22] Avni Gulati and Magdalini Eirinaki. 2019. With a Little Help from My Friends (and Their Friends): Influence Neighborhoods for Social Recommendations. In Proc. of the WWW. 2778–2784.
[23] Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In Proc. of the NeurIPS. 1024–1034.
[24] Xiaorong Hao, Tao Lian, and Li Wang. 2020. Dynamic Link Prediction by Integrating Node Vector Evolution and Local Neighborhood Representation. In Proc. of SIGIR. 1717–1720.
[25] Nicholas M Harrigan, Giuseppe (Joe) Labianca, and Filip Agneessens. 2020.
Negative ties and signed graphs research: Stimulating research on dissociative forces in social networks. Social Networks 60 (2020), 1–10.
[26] Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua. 2016. Fast Matrix Factorization for Online Recommendation with Implicit Feedback. In Proc. of SIGIR. 549–558.
[27] Junjie Huang, Huawei Shen, Liang Hou, and Xueqi Cheng. 2019. Signed graph attention networks. In Proc. of the ICANN. 566–577.
[28] Humanyze. 2011. Humanyze Workplace Analytics. https://humanyze.com/
[29] Mohammad Raihanul Islam, B. Aditya Prakash, and Naren Ramakrishnan. 2018. SIGNet: Scalable Embeddings for Signed Networks. In Proc. of PAKDD. 157–169.
[30] Giuseppe Joe Labianca. 2014. Negative ties in organizational networks. In Contemporary perspectives on organizational social networks. Emerald Group Publishing Limited.
[31] Kuanyang Li, Lilan Tu, and Lang Chai. 2020. Ensemble-model-based link prediction of complex networks. Computer Networks 166 (2020), 106978.
[32] Xiaoming Li, Hui Fang, and Jie Zhang. 2017. Rethinking the Link Prediction Problem in Signed Social Networks. In Proc. of the AAAI. 4955–4956.
[33] Yi Liao, Xin Jiang, and Qun Liu. 2020. Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020. 263–274.
[34] Yu lin He, James N.K. Liu, Yan xing Hu, and Xi zhao Wang. 2015. OWA operator based link prediction ensemble for social network. Expert Systems with Applications 42, 1 (2015), 21–50.
[35] Linyuan Lü and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: statistical mechanics and its applications 390, 6 (2011), 1150–1170.
[36] Scott M. Lundberg, Gabriel G. Erion, Hugh Chen, Alex DeGrave, Jordan M. Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. 2019.
Explainable AI for Trees: From Local Explanations to Global Understanding. Nature Machine Intelligence abs/1905.04610 (2019).
[37] Joshua E Marineau and Giuseppe Joe Labianca. 2021. Positive and negative tie perceptual accuracy: Pollyanna principle vs. negative asymmetry explanations. Social Networks 64 (2021), 83–98.
[38] Maven7. 2017. Maven7 OrgMapper. http://maven7.com/
[39] Microsoft. 2017. Microsoft Workplace Analytics. https://cloudpartners.transform.microsoft.com/practices/workplaceanalytics
[40] James W Pennebaker, Roger J Booth, Ryan L Boyd, and Martha E Francis. 2015. Linguistic Inquiry and Word Count: LIWC2015.
[41] Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. 2008. The graph neural network model. IEEE Transactions on Neural Networks 20, 1 (2008), 61–80.
[42] Kenwyn K Smith. 1989. The movement of conflict in organizations: The joint dynamics of splitting and triangulation. Administrative Science Quarterly (1989), 1–20.
[43] Yiyi Tao, Yiling Jia, Nan Wang, and Hongning Wang. 2019. The FacT: Taming Latent Factor Models for Explainability with Factorization Trees. In Proc. of SIGIR. 295–304.
[44] Wiebe Van der Hoek, Louwe Kuijer, and Yì Wáng. 2020. Logics of Allies and Enemies: A Formal Approach to the Dynamics of Social Balance Theory. In Proc. of IJCAI 2020.
[45] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NeurIPS. 5998–6008.
[46] Nikhita Vedula and Srinivasan Parthasarathy. 2017. Emotional and Linguistic Cues of Depression from Social Media. In Proceedings of the 2017 International Conference on Digital Health. 127–136.
[47] Vijaya Venkataramani, Giuseppe Joe Labianca, and Travis Grosser. 2013. Positive and negative workplace relationships, social satisfaction, and organizational attachment. Journal of Applied Psychology 98, 6 (2013), 1028.
[48] David Willer. 1999. Network exchange theory. Greenwood Publishing Group.
[49] Pinghua Xu, Wenbin Hu, Jia Wu, and Bo Du. 2019. Link Prediction with Signed Latent Factors in Signed Social Networks. In Proc. of the ACM SIGKDD. 1046–1054.
[50] Janice Yap and Nicholas Harrigan. 2015. Why does everybody hate me? Balance, status, and homophily: The triumvirate of signed tie formation. Social Networks 40 (2015), 103–122.
[51] Muhan Zhang and Yixin Chen. 2020. Inductive Matrix Completion Based on Graph Neural Networks. In Proc. of the ICLR.

A APPENDIX

A.1 Exploratory Analysis on HR and Conversation Data

Fig. 5. Exploratory analysis. (a) The ratios of different profiles and demographic features. (b) A histogram of the number of words in an email. (c) to (e) Average number of weekday emails exchanged between individuals of different profiles and demographic attributes: (c) managers and subordinates (Both Managers, Either Manager, No Managers), (d) minorities and non-minorities (Both Minorities, Either Minority, No Minority), (e) men and women (Both Male, Either Male, Both Female). (f) Average number of weekday emails exchanged between edges with different tie valences (Negative, Positive). “Out-” and “In-” emails denote the emails sent and received, respectively.

Figure 5 presents selected exploratory analyses on our dataset. The majority of the individuals are male (67%); 19% are managers, 16% belong to ethnic minorities, and 16% are front-line supervisors. The word count of emails has a long-tailed distribution with a peak at 50–100 words, implying that most of the emails involve shorter conversations.
Combined with the fact that these exchanges are taking place in a formal setting, this makes mining useful information from the text more difficult. Figures 5(c) to 5(f) present unique traits (the average number of emails exchanged) of the email exchange patterns between different types of user pairs in our data. In Figure 5(c), it can be observed that managers exchange the most information among themselves, and subordinates the least. This observation could be attributed to the fact that, by the nature of their role, managers interact with many people and tend to participate in many higher-level meetings. Figure 5(d) shows that, on average, minorities send more emails to non-minorities (likely driven by numerical probabilities, given how few minorities there are) but receive the most emails from minorities themselves, suggesting that email exchange is also driven by social identities. The results in Figure 5(e) show that women are more likely to exchange emails with each other. Finally, Figure 5(f) highlights that, compared to negative ties, people with positive workplace relationships interact substantially more with each other. This suggests that reducing negative ties might improve an organization’s effectiveness and productivity by promoting better interactions and exchange.

A.2 Meta Features

Table 3. Designed meta features. All meta features are computed for sent (out) and received (in) messages. <month> is replaced by all months.

Feature name                   Interpretation
total_time_<out/in>            The total period over which emails have been sent/received.
num_email_<out/in>             The number of emails that have been sent/received.
frequency_<out/in>             The frequency at which emails have been sent/received. (num_email_out / total_time_out)
avg_interval_<out/in>          The average interval at which emails have been sent/received. (total_time_out / num_email_out)
med_total_pos_<out/in>         The relative time of the median email that has been sent/received in the total period.
med_year_pos_<out/in>          The year-wise relative time of the median email that has been sent/received.
med_month_pos_<out/in>         The month-wise relative time of the median email that has been sent/received.
num_week_emails_<out/in>       The number of emails that have been sent/received on weekdays.
num_weekend_emails_<out/in>    The number of emails that have been sent/received on weekends.
month_max_num_<out/in>         The maximum number of emails that have been sent/received in a month.
month_max_<month>_<out/in>     The maximum number of emails that have been sent/received in <month>.
avg_length_<out/in>            The average number of characters in an email that has been sent/received.
avg_sentence_len_<out/in>      The average sentence length in an email that has been sent/received.
avg_sentence_num_<out/in>      The average number of sentences in an email that has been sent/received.

A.3 Survey Description

The ground truth network used to train the model was collected via a survey in September 2012 in the new product development unit of this consumer product organization’s corporate headquarters. Social network analysis was conducted to help reorganize the unit to speed up new product development. A roster with all 185 employees’ names was provided, and the following social network questions were administered, including the positive tie (friend) and negative tie (avoid) questions. Of the 185 sociometric surveys distributed, 144 completed surveys were returned.

• Desired Collaboration – I would be more effective in my work if I were able to collaborate more closely with this person. (respondent will check the appropriate names on the roster)
• Friend – Do you consider this person to be a close friend (e.g., confide in this person)?
(respondent will check the appropriate names on the roster)
• Avoid – Sometimes people at work may make us feel uncomfortable or uneasy and, therefore, we try to avoid interacting with them. Do you try to avoid interacting with this person? (respondent will check the appropriate names on the roster)
• Innovation Ratings – Innovative employees have the ability to effectively generate and implement novel ideas in the workplace. Please rate how innovative you believe each of your coworkers is. (1 = never innovative, 7 = always innovative)

A.4 Shapley Analysis

We present the SHAP analysis [36] of an XGBoost edge-label classifier that utilizes the same information as the full model. As seen in Figure 6, this analysis reconfirms that sentiment embeddings are key features in the classification task. It can also be observed that people who send a relatively higher number of messages on weekends are inclined to give positive workplace relationship feedback. Similarly, older people tend to give more positive feedback as well. On the other hand, belonging to a minority group seems to be correlated with receiving negative feedback in perceived workplace ties.

Fig. 6. SHAP summary plot. Left vertical axis: features sorted in descending order of importance (SE_9, SE_8, U1_age, U2_age, SE_1, WD_em_in, SE_4, SE_6, SE_0, Num_em_out, Sen_len_in, Sen_num_out, Len_in, U2_minority, WE_em_out). Vertical bar: color legend for feature values (low to high). Horizontal axis: SHAP value (impact on model output), showing how values of different features drive the model output. For display purposes, feature names are shortened. SE_<>: sentiment embeddings; U<>_<age/minority>: user1/user2 age/minority; <WD/WE/Num>_em_<in/out>: number of weekday/weekend/total emails in/out; Sen_<len/num>_<in/out>: sentence length/number in/out.
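To make the quantity behind a SHAP plot concrete, the sketch below computes exact Shapley values for a toy three-feature value function by enumerating all coalitions. It is not the tree-specific algorithm of [36] and not the analysis pipeline used for Fig. 6; the value function, its weights, and the feature roles in the comments are made up for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function.

    value_fn maps a frozenset of feature indices to the model output when
    only those features are present. The cost is exponential in n_features,
    so this is only practical for toy examples.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of coalitions of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model: additive effects plus one pairwise interaction.
WEIGHTS = [2.0, -1.0, 0.5]   # e.g. weekend emails, minority flag, age


def toy_value(coalition):
    base = sum(WEIGHTS[j] for j in coalition)
    bonus = 1.0 if {0, 1} <= coalition else 0.0   # interaction of features 0 and 1
    return base + bonus


phi = shapley_values(toy_value, 3)
```

The interaction bonus is split evenly between features 0 and 1, and the values sum to the difference between the full-coalition and empty-coalition outputs, which is the efficiency property that summary plots such as Fig. 6 rely on.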
A.5 Additional Exploratory Analysis

This section presents additional exploratory analysis on the valence data. Fig. 7 presents the mean valence score given by managers. The figure presents separate scores from male and female managers, as well as scores from managers to people in the minority and non-minority sets. Fig. 8 shows the weekend communication trend for different scenarios. An interesting observation here is the pattern for positive and negative edges: people with positive edges communicate substantially more during the weekend “off-hours”.

Fig. 7. Valence scores by managers. Panels: (a) male manager, (b) female manager, (c) managers to minorities. Bars in (a) and (b): to female non-manager, to male non-manager, to female manager, to male manager. Bars in (c): to male minority, to female minority, to male non-minority, to female non-minority.

Fig. 8. Avg. number of weekend emails exchanged. Panels: (a) managers and subordinates (Both Managers, Either Manager, No Managers), (b) minorities and non-minorities (Both Minorities, Either Minority, No Minority), (c) gender (Both Male, Either Male, Both Female), (d) by label type (Negative, Positive).
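The weekday and weekend counts behind figures such as Fig. 8, together with a few of the meta features named in Table 3, could be derived from raw email timestamps along the following lines. This is a minimal sketch: the function name is hypothetical, and measuring total_time in days is an assumption, since the unit is not stated in the text.

```python
from datetime import datetime


def email_meta_features(timestamps):
    """Compute a few Table 3-style meta features for one direction of an edge.

    `timestamps` is a list of datetime objects for the emails sent (or
    received) between a pair of employees.
    """
    if not timestamps:
        return {"num_email": 0, "total_time": 0.0, "frequency": 0.0,
                "avg_interval": 0.0, "num_week_emails": 0, "num_weekend_emails": 0}
    ts = sorted(timestamps)
    num_email = len(ts)
    # Span between first and last email, in days (the unit is an assumption).
    total_time = (ts[-1] - ts[0]).total_seconds() / 86400.0
    # datetime.weekday(): Monday is 0, so Saturday and Sunday are 5 and 6.
    num_weekend = sum(1 for t in ts if t.weekday() >= 5)
    return {
        "num_email": num_email,
        "total_time": total_time,
        "frequency": num_email / total_time if total_time > 0 else 0.0,
        "avg_interval": total_time / num_email,
        "num_week_emails": num_email - num_weekend,
        "num_weekend_emails": num_weekend,
    }


# Toy example: Monday, Saturday, Sunday, and Thursday in September 2012.
stamps = [datetime(2012, 9, 3, 9), datetime(2012, 9, 8, 9),
          datetime(2012, 9, 9, 9), datetime(2012, 9, 13, 9)]
features = email_meta_features(stamps)
```

Running the function separately on sent and received emails yields the `_out` and `_in` variants of each feature.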

Journal

ACM Transactions on Knowledge Discovery from Data (TKDD), Association for Computing Machinery

Published: Feb 28, 2023

Keywords: Signed link prediction
