U.S. Pat. No. 12,148,445

EMOTION DETECTION & MODERATION BASED ON VOICE INPUTS, ERRORS IN GAMEPLAY, PHYSIOLOGICAL INDICATORS OF HEART RATE OR SKIN TONE OF USERS OF A COMMUNICATION SESSION

Assignee: Sony Interactive Entertainment Inc.

Issue Date: September 30, 2021

Illustrative Figure

Abstract

Systems and methods for emotion detection and emotion-based moderation based on voice inputs are provided. A user emotion profile may be stored in memory for a user. The user emotion profile may include one or more moderation rules that specify a moderation action responsive to one or more emotional states. A current communication session associated with the user and one or more other users may be monitored based on the user emotion profile. An emotional state detected as being associated with a subset of the messages may trigger at least one of the moderation rules by corresponding to at least one of the emotional states specified by the user emotion profile. A presentation of at least one of the messages in the subset provided to the user device may be modified in accordance with the moderation action specified by the user emotion profile.

Description


DETAILED DESCRIPTION

Embodiments of the present invention include systems and methods for emotion detection and emotion-based moderation based on voice inputs. A user emotion profile may be stored in memory for a user. The user emotion profile may include one or more moderation rules that specify a moderation action responsive to one or more emotional states. A current communication session associated with the user and one or more other users may be monitored based on the user emotion profile. An emotional state detected as being associated with a subset of the messages may trigger at least one of the moderation rules by corresponding to at least one of the emotional states specified by the user emotion profile. A presentation of at least one of the messages in the subset provided to the user device may be modified in accordance with the moderation action specified by the user emotion profile.

FIG. 1 illustrates a network environment in which a system for emotion detection and emotion-based moderation based on voice inputs may be implemented. The network environment 100 may include one or more content source servers 110 that provide digital content (e.g., games, other applications and services) for distribution, one or more content provider server application program interfaces (APIs) 120, content delivery network server 130, an emotion analytics server 140, and one or more user devices 150. The devices in network environment 100 communicate with each other using one or more communication networks, which may include a local, proprietary network (e.g., an intranet) and/or may be a part of a larger wide-area network. The communications networks may be a local area network (LAN), which may be communicatively coupled to a wide area network (WAN) such as the Internet. The Internet is a broad network of interconnected computers and servers allowing for the transmission and exchange of Internet Protocol (IP) data between users connected through a network service provider. Examples of network service providers are the public switched telephone network, a cable service provider, a provider of digital subscriber line (DSL) services, or a satellite service provider. One or more communications networks allow for communication between the various components of network environment 100.

The servers described herein may include any type of server as is known in the art, including standard hardware computing components such as network and media interfaces, non-transitory computer-readable storage (memory), and processors for executing instructions or accessing information that may be stored in memory. The functionalities of multiple servers may be integrated into a single server. Any of the aforementioned servers (or an integrated server) may take on certain client-side, cache, or proxy server characteristics. These characteristics may depend on the particular network placement of the server or certain configurations of the server.

Content source servers 110 may maintain and provide a variety of digital content and digital services available for distribution over a communication network. The content source servers 110 may be associated with any content provider that makes its content available for access over a communication network. The content source servers 110 may therefore host a variety of different content titles, which may further be associated with object data regarding a digital or virtual object (e.g., activity information, zone information, character information, player information, other game media information, etc.) displayed in a digital or virtual environment during an interactive session.

Such content may include not only digital video and games, but also other types of digital applications and services. Such applications and services may include any variety of different digital content and functionalities that may be provided to user devices 150, including providing and supporting chat and other communication channels. The chat and communication services may be inclusive of voice-based, text-based, and video-based messages. Thus, a user device 150 may participate in a gameplay session concurrent with one or more communication sessions, and the gameplay and communication sessions may be hosted on one or more of the content source servers 110.

The content from content source server 110 may be provided through a content provider server API 120, which allows various types of content source servers 110 to communicate with other servers in the network environment 100 (e.g., user devices 150). The content provider server API 120 may be specific to the particular operating language, system, platform, protocols, etc., of the content source server 110 providing the content, as well as the user devices 150 and other devices of network environment 100. In a network environment 100 that includes multiple different types of content source servers 110, there may likewise be a corresponding number of content provider server APIs 120 that allow for various formatting, conversion, and other cross-device and cross-platform communication processes for providing content and other services to different user devices 150, which may each respectively use different operating systems, protocols, etc., to process such content. As such, applications and services in different formats may be made available so as to be compatible with a variety of different user devices 150. In a network environment 100 that includes multiple different types of content source servers 110, content delivery network servers 130, emotion analytics server 140, user devices 150, and databases 160, there may likewise be a corresponding number of APIs managed by content provider server APIs 120.

The content provider server API 120 may further facilitate access of each of the user devices 150 to the content hosted or services provided by the content source servers 110, either directly or via content delivery network server 130. Additional information, such as metadata, about the accessed content or service can also be provided by the content provider server API 120 to the user device 150. As described below, the additional information (e.g., object data, metadata) can be usable to provide details about the content or service being provided to the user device 150. In some embodiments, the services provided from the content source servers 110 to the user device 150 via the content provider server API 120 may include supporting services that are associated with other content or services, such as chat services, ratings, and profiles that are associated with a particular game, team, community, etc. In such cases, the content source servers 110 may also communicate with each other via the content provider server API 120.

The content delivery network server 130 may include a server that provides resources, files, etc., related to the content from content source servers 110, including various content and service configurations, to user devices 150. The content delivery network server 130 can also be called upon by the user devices 150 that request to access specific content or services. Content delivery network server 130 may include universe management servers, game servers, streaming media servers, servers hosting downloadable content, and other content delivery servers known in the art.

Emotion analytics server 140 may include any data server known in the art that is capable of communicating with the different content source servers 110, content provider server APIs 120, content delivery network server 130, user devices 150, and databases 160. Such emotion analytics server 140 may be implemented on one or more cloud servers that carry out instructions associated with interactive content (e.g., games, activities, video, podcasts, User Generated Content (“UGC”), publisher content, etc.). The emotion analytics servers 140 may further carry out instructions, for example, for monitoring a communication session based on an emotion profile of a user, identifying and categorizing one or more emotional states associated with specific communications in the communication session in real-time, detecting when the emotional states trigger a moderation rule in the emotion profile, and modifying at least one message associated with the emotional state based on a moderation action of the emotion profile.

The user device 150 may include a plurality of different types of computing devices. The user device 150 may be a server that provides an internal service (e.g., to other servers) in network environment 100. In such cases, user device 150 may correspond to one of the content source servers 110 described herein. Alternatively, the user device 150 may be a computing device that may include any number of different gaming consoles, mobile devices, laptops, and desktops. Such user devices 150 may also be configured to access data from other storage media, such as, but not limited to, memory cards or disk drives as may be appropriate in the case of downloaded services. Such user devices 150 may include standard hardware computing components such as, but not limited to, network and media interfaces, non-transitory computer-readable storage (memory), and processors for executing instructions that may be stored in memory. These user devices 150 may also run using a variety of different operating systems (e.g., iOS, Android), applications or computing languages (e.g., C++, JavaScript). An exemplary client device 150 is described in detail herein with respect to FIG. 5. Each user device 150 may be associated with participants (e.g., players) or other types (e.g., spectators) of users in relation to a collection of digital content streams.

While pictured separately, the databases 160 may be stored on any of the servers and devices illustrated in network environment 100, whether on the same server, on different servers, or on any of the user devices 150. Such databases 160 may store or link to various voice-emotion learning models, language-emotion models, or other emotion models, each of which may be specific to different emotions, users, or user groups. Databases 160 may also store other resources used for evaluating emotional states exhibited during communication sessions. In addition, databases 160 may store emotion profiles, as well as moderation rules that may be specific to and further customizable to a particular user, user group or team, user category, game title, game genre, language category, etc. One or more user emotion profiles may also be stored in the databases 160 for each user. In addition to gameplay data regarding the user (e.g., user progress in an activity and/or media content title, user ID, user game characters, etc.), the user emotion profile may include a set of triggers and moderation rules specific to the user.

Exemplary learning models associated with emotion identification may include a variety of detectable indicators, including textual indicators (e.g., diction or word choice, profanity, keywords, terms, phrases, etc.), verbal indicators (e.g., changes in voice intonation, pitch or frequency, rate or tempo of speech, rhythms, volume, etc.), visual indicators (e.g., facial expressions, skin tone (e.g., flushed or pale relative to calm states)), physiological or behavioral indicators (e.g., biometric changes such as heart rate, signs of stress, increased errors in input, etc.), in-game indicators (e.g., errors in gameplay, character behaviors), and other indicators associated with mood, sentiment, emotion, etc., known in the art. The learning models may further be refined for different players, who may exhibit different indicators for a given emotion. A specific player may further be associated with multiple different learning models corresponding to different emotions. The learning models may further be refined based on user input from the user or from others who may provide feedback as to the emotional state of a particular user.
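For illustration only (the specification does not disclose any particular scoring scheme), combining the several classes of indicators above into a single per-emotion score might be sketched as follows; the indicator names and integer weights are assumptions:

```python
# Hypothetical indicator weights for a single "anger" model.
# Integer weights out of 100 are an illustrative assumption only.
ANGER_WEIGHTS = {
    "profanity": 40,            # textual indicator
    "raised_volume": 30,        # verbal indicator
    "elevated_heart_rate": 20,  # physiological indicator
    "gameplay_errors": 10,      # in-game indicator
}

def anger_score(observed_indicators: set) -> int:
    """Combine the indicators detected for a user into a 0-100 score."""
    return sum(weight for name, weight in ANGER_WEIGHTS.items()
               if name in observed_indicators)

print(anger_score({"profanity", "raised_volume"}))  # -> 70
```

A per-player refinement, as described above, could then amount to learning different weights (or different indicator sets) for each user.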

In addition to learning models associated with emotion, a user emotion profile may further include one or more moderation rules that are responsive to different emotional triggers. The moderation rules for the user may specify, for example, certain trigger conditions related to emotion and associated moderation actions. For example, a trigger condition may include detecting indicators of undesired emotions (e.g., high volume, yelling, profanity, hate speech, or racist, sexist, homophobic, or misgendering language), and a corresponding moderation rule may specify reducing the volume, muting or replacing the objectionable language, and modulating the voice tone to be more emotionally neutral (e.g., auto-tuning or otherwise changing a sound parameter of the original voice, or using a voice generator to provide a new voice). When the moderation rules are applied to a communication session, therefore, a user may still participate in the session and comprehend the communications and messages that are exchanged, while avoiding or neutralizing undesired emotions. A user may therefore determine how to protect their own mental and emotional well-being and avoid stress related to such undesired emotions when interacting with other users in communication sessions. Such a user may also specify or customize moderation rules responsive to their own emotional states. Such moderation rules may concern decreasing game difficulty or level of aggressive gameplay, heightened thresholds for different emotional triggers, self-censorship of profanity, neutralizing or modifying one's own voice, and triggering calming effects (e.g., music, visuals, fans and other temperature adjustments, etc.).
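The pairing of trigger conditions with moderation actions described above might be modeled as follows; this is an illustrative sketch, not the claimed implementation, and all class and field names are assumptions:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ModerationRule:
    """Pairs a trigger condition with a moderation action."""
    trigger_emotions: set        # emotional states that trigger this rule
    action: Callable[[str], str] # action applied to a triggering message

@dataclass
class UserEmotionProfile:
    user_id: str
    rules: list = field(default_factory=list)

    def moderate(self, message: str, detected_emotion: str) -> str:
        """Apply every rule whose trigger matches the detected emotion."""
        for rule in self.rules:
            if detected_emotion in rule.trigger_emotions:
                message = rule.action(message)
        return message

# Example: a profile that mutes anger-laden messages.
profile = UserEmotionProfile(
    user_id="player1",
    rules=[ModerationRule({"anger"}, lambda m: "[muted]")],
)
print(profile.moderate("you are terrible!!", "anger"))  # -> [muted]
print(profile.moderate("nice shot", "joy"))             # -> nice shot
```

The action here is a simple function of the message text; a voice-modulating action would operate on audio instead, but the rule/action pairing is the same.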

In some embodiments, a user may specify a moderation rule that results in a visual display indicative of their own or other users' emotional states. Such a visual display may indicate different emotions and levels (e.g., low, medium, high) of such emotions. Different emotions may be visualized in an emotional dashboard, for example, that includes different types of measurement or metric presentation. Such visual displays may be triggered based on custom alert or notification settings as to thresholds associated with indicators for a certain emotion. In addition, the emotion analytics server 140 may also track data regarding emotional indicators over time, which may be used to develop and refine learning models associated with emotions. The user may also request to view the emotional dashboard (associated with their own emotional data or the emotional data of others), which may further break down the emotional data by game title, game session, interaction with different specific users or user groups, time periods, or any other parameter being tracked and made accessible to emotion analytics server 140 (including data tracked by UDS system 200).

In a communication session that involves multiple players each associated with different emotion profiles, therefore, each player may be presented with a different version of the same chat stream, moderated in accordance with their respective user emotion profile. Thus, a chat stream that may include multiple different communications exhibiting different emotions in unmodified, unfiltered form may be transformed into different versions of the chat stream presented at each user device in accordance with the respective user emotion profiles. For example, one user—who may not have expressed any emotional triggers or moderation preferences or rules—may receive an original, unmodified version of the chat stream, including messages as provided by each of the other users. Another user—who may have expressed preferences for avoiding profanity and hate speech—may therefore be presented with a different version of the chat stream from which all profanity and hate speech have been removed or muted. Yet another user—who may have expressed a further preference for avoiding anger-laden communications—may be presented with yet another version of the chat stream in which all anger-laden communications have been neutralized.
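Producing per-user versions of one chat stream, as described above, might be sketched as follows (an illustrative example; the per-user preferences and the "[message neutralized]" placeholder are assumptions):

```python
# Each message carries its raw text plus the emotion detected for it.
messages = [
    {"text": "good game", "emotion": "neutral"},
    {"text": "%$#@ you!", "emotion": "anger"},
]

# Hypothetical per-user preferences: emotions each user wants neutralized.
profiles = {
    "alice": [],        # no rules: receives the original stream
    "bob":   ["anger"], # anger-laden messages are neutralized
}

def render_stream(messages, blocked_emotions):
    """Produce the version of the chat stream one user actually sees."""
    out = []
    for msg in messages:
        if msg["emotion"] in blocked_emotions:
            out.append("[message neutralized]")
        else:
            out.append(msg["text"])
    return out

# One underlying stream, as many rendered versions as there are users.
versions = {user: render_stream(messages, prefs)
            for user, prefs in profiles.items()}
```

Here the moderation happens at presentation time, so the unmodified stream is never shown to a user whose profile filters it.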

Such neutralization may be implemented by auto-tuning the specific communications into a more emotionally-neutral tone or by using a voice generator to generate a new tone. The new tone may correspond to a neutral, computer-generated voice or may correspond to preferred voices (e.g., cartoon character voices) associated with an opposite emotional effect. For example, a user may opt to change a voice detected as indicative of anger (or other negative emotion) to an auto-tuned or computer-generated cartoon voice associated with humor and comical effect. Such neutralization may also apply to textual language, for example, by replacing anger-laden words with replacements (e.g., language associated with neutral, humorous, comical effect, emojis, graphics, other user-generated content (UGC), etc.).
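The textual form of neutralization described above, replacing anger-laden words while leaving the rest of the message intact, might look like this; the replacement map is a hypothetical example, not taken from the specification:

```python
import re

# Hypothetical map from anger-laden words to neutral stand-ins.
REPLACEMENTS = {
    "hate": "dislike",
    "stupid": "silly",
}

def neutralize_text(text: str) -> str:
    """Replace anger-laden words, preserving the rest of the message."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in REPLACEMENTS) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: REPLACEMENTS[m.group(0).lower()], text)

print(neutralize_text("I hate this stupid level"))
# -> I dislike this silly level
```

A fuller implementation could substitute emojis, graphics, or other UGC rather than plain words, as the passage above suggests.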

In some implementations, the user emotion profile may be set not only by the users themselves, but also parents, guardians, supervisors, etc., of the users who may wish their charge to avoid exposure to certain types of emotions and indicators of the same, including profanity, graphic content, bullying language, misgendering language, or other aggressive, hostile, or violent language (e.g., harassment, threats). Thus, the trigger conditions defined by the moderation rules may include various terms, phrases, and indicators of such language or content, while the associated moderation actions may include removing (e.g., deleting, muting, bleeping, auto-tuning, voice-generating) or modifying (e.g., blurring, blocking/blacking-out, auto-correcting or replacing with alternative language, emoji, graphics, UGC) the triggering language from the chat stream presentation provided to the user. For example, misgendering language or dead names may be replaced with the correct pronouns or chosen names associated with the user. As discussed herein, voices detected as being associated with an undesired emotion may also be modified by way of muting, modulating, auto-tuning, or replacement by a voice generator.

In some instances, a user may choose to censor themselves when customizing the moderation rules of their own user emotion profile. For example, the user may specify that certain audiences (e.g., players in young age ranges) be prevented from seeing, hearing, or accessing certain types of content generated by the user. As such, the trigger conditions may include parameters characterizing the other users and different ranges associated with different trigger actions. Female players may have different settings for modifying their own voices, for example, to allow or avoid indicators of their gender based on the identity or category of each of the other players in a communication session. Younger players may wish to modify their voices so as to appear older and more mature. In another example, the user may specify different categories of users (e.g., friends, teammates, current session players, favorites, trusted users, etc.) associated with different moderation rules as applied to their own voice or other voices. Thus, a chat stream may be further customized for the user based on application of the different moderation rules in the associated user emotion profile.

In some embodiments, the associated moderation action may include identifying a user who is the source of the triggering emotion (e.g., a specific player or spectator) and automatically adding the identified user to a block list or initiating another form of punishment or intervention (e.g., blocking access to gameplay session or platform, requiring remedial training as prerequisite to regaining gameplay privileges, in-game punishments affecting game status or gameplay). Some implementations of the trigger conditions may include a predetermined number of permitted uses (e.g., three strikes) before the identified user is blocked or other punishment or intervention imposed. As such, each detected use of triggering emotional indicators may be counted against the predetermined number and result in an update in the current status of the trigger condition (e.g., second strike out of three). Progressive punishments for repeat offenders may also be implemented locally or across the entire communication session, so that players who continue to exhibit the same undesirable behaviors may be subject to increasingly harsher or longer punishment.
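The strike-counting behavior described above might be sketched as follows (illustrative only; the class name and return strings are assumptions):

```python
from collections import defaultdict

class StrikeTracker:
    """Counts trigger uses per user and blocks on reaching a limit."""

    def __init__(self, max_strikes: int = 3):  # e.g., "three strikes"
        self.max_strikes = max_strikes
        self.strikes = defaultdict(int)
        self.block_list = set()

    def record_violation(self, user_id: str) -> str:
        """Count a detected use of a triggering emotional indicator."""
        if user_id in self.block_list:
            return "blocked"
        self.strikes[user_id] += 1
        if self.strikes[user_id] >= self.max_strikes:
            self.block_list.add(user_id)
            return "blocked"
        # Update to the current status of the trigger condition.
        return f"strike {self.strikes[user_id]} of {self.max_strikes}"

tracker = StrikeTracker()
print(tracker.record_violation("player9"))  # -> strike 1 of 3
print(tracker.record_violation("player9"))  # -> strike 2 of 3
print(tracker.record_violation("player9"))  # -> blocked
```

Progressive punishments for repeat offenders could be layered on by mapping the strike count to escalating durations or sanctions.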

Different combinations of the moderation rules (and associated trigger conditions and moderation actions) may be stored in the user emotion profile for each user. Collectively, therefore, the set of moderation rules included in a user emotion profile provides for custom evaluation, modification, and presentation for the specific user. The customization may further be based on current session conditions (e.g., current gameplay status, current game title, current in-game conditions). Because each chat stream presentation is personalized to the specific user based on their respective user emotion profile, a chat stream involving multiple different users may be modified to result in as many different versions of the chat stream as there are users. In some embodiments, emotion analytics server 140 may generate customized bots (including chatbots) programmed to apply a custom user emotion profile associated with a user to online communication sessions in which the user is participating via their respective user device. Such analysis and modification may therefore occur at a remote emotion analytics server 140 or associated bot, locally via a bot or other application operated by the user device 150, or a combination of the two.

In exemplary implementations, a current communication session involving multiple users may be monitored by emotion analytics server 140 in accordance with the respective user emotion profiles associated with each of the users. In monitoring for a particular user, the emotion analytics server 140 may detect in real-time that a subset of the messages (and parts of specific messages) of the current communication session triggers at least one of the moderation rules in the user emotion profile for that user. Such detection may be based on the message meeting one or more trigger conditions specified by the moderation rule. The emotion analytics server 140 may then moderate at least one message in real-time based on a moderation action specified by the triggered moderation rule. Such moderation action may result in modifying the at least one message that exhibits one or more of the trigger conditions. Further, emotion analytics server 140 may modify a presentation of the current communication session that is provided to the user device of the particular user. The presentation of the current communication session may be modified to remove or neutralize the trigger conditions.
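The detect-then-moderate loop described above can be sketched end to end; this is a toy illustration (the detector, profile shape, and action are assumptions), not the claimed monitoring pipeline:

```python
def monitor_session(messages, profile, detect_emotion):
    """Detect emotions per message and apply any triggered moderation."""
    presented = []
    for msg in messages:
        emotion = detect_emotion(msg)  # real-time emotion detection
        moderated = msg
        for rule in profile["rules"]:
            if emotion in rule["trigger_emotions"]:  # trigger condition met
                moderated = rule["action"](moderated)  # moderation action
        presented.append(moderated)
    return presented

# Toy detector: an all-caps message is treated as indicating anger.
detect = lambda m: "anger" if m.isupper() else "neutral"
# Toy profile: neutralize anger by lowercasing the message.
profile = {"rules": [{"trigger_emotions": {"anger"}, "action": str.lower}]}

print(monitor_session(["HELLO", "gg"], profile, detect))  # -> ['hello', 'gg']
```

In the described system the detector would instead be the learning models discussed earlier, and the moderated presentation would be streamed to the particular user's device.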

During the course of a communication session, the user may also tag new emotional indicators for identification as an emotional trigger subject to moderation. Tagging may be based on a selection of textual language communications, transcripts of spoken language, recorded in-game behaviors (e.g., user-generated content as captured by UDS system 200), auto-generated menus of different indicators detected as being present in the current communication session, freeform descriptions, or other selection mechanisms known in the art. The tagged and newly identified emotional indicators may then be used to evaluate incoming communications within the current communication session. Further, the user emotion profile(s) may be updated in real-time, as well as used to update and refine the learning models associated with emotion analysis and identification.

FIG. 2 illustrates an exemplary uniform data system (UDS) 200 that may be used to provide data to a system for emotion detection and emotion-based moderation based on voice inputs. Based on data provided by UDS 200, emotion analytics server 140 can be made aware of the current session conditions, e.g., what in-game objects, entities, activities, and events users and their associated character(s) have engaged with, and can thus support analysis and coordination of emotion detection and emotion-based responses with current gameplay and in-game activities. Each user interaction may be associated with metadata for the type of in-game interaction, location within the in-game environment, and point in time within an in-game timeline, as well as the other players, objects, entities, etc., involved. Thus, metadata can be tracked for any of the variety of user interactions that can occur during a game session, including associated activities, entities, settings, outcomes, actions, effects, locations, and character stats. Such data may further be aggregated, applied to data models, and subject to analytics. Such a UDS data model may be used to assign contextual information to each portion of information in a unified way across games.

For example, various content titles may depict one or more objects (e.g., involved in in-game activities) with which a user can interact and/or UGC (e.g., screen shots, videos, commentary, mashups, etc.) created by peers, publishers of the media content titles and/or third party publishers. Such UGC may include metadata by which to search for such UGC. Such UGC may also include information about the media and/or peer. Such peer information may be derived from data gathered during peer interaction with an object of an interactive content title (e.g., a video game, interactive book, etc.) and may be “bound” to and stored with the UGC. Such binding enhances UGC as the UGC may deep link (e.g., directly launch) to an object, may provide for information about an object and/or a peer of the UGC, and/or may allow a user to interact with the UGC.

As illustrated in FIG. 2, an exemplary console 228 (e.g., a user device 130) and exemplary servers 218 (e.g., streaming server 220, an activity feed server 224, a user-generated content (UGC) server 232, and an object server 226) are shown. In one example, the console 228 may be implemented on the platform server 120, a cloud server, or on any of the servers 218. In another example, a content recorder 202 may be implemented on the platform server 120, a cloud server, or on any of the servers 218. Such content recorder 202 receives and records content (e.g., media) from an interactive content title 230 onto a content ring-buffer 208. Such ring-buffer 208 may store multiple content segments (e.g., v1, v2, and v3), start times for each segment (e.g., V1_START_TS, V2_START_TS, V3_START_TS), and end times for each segment (e.g., V1_END_TS, V2_END_TS, V3_END_TS). Such segments may be stored as a media file 212 (e.g., MP4, WebM, etc.) by the console 228. Such media file 212 may be uploaded to the streaming server 220 for storage and subsequent streaming or use, though the media file 212 may be stored on any server, a cloud server, any console 228, or any user device 130. Such start times and end times for each segment may be stored as a content time stamp file 214 by the console 228. Such content time stamp file 214 may also include a streaming ID, which matches a streaming ID of the media file 212, thereby associating the content time stamp file 214 with the media file 212. Such content time stamp file 214 may be uploaded and stored to the activity feed server 224 and/or the UGC server 232, though the content time stamp file 214 may be stored on any server, a cloud server, any console 228, or any user device 130.
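The content ring-buffer described above, which holds a bounded set of segments together with their start and end timestamps (cf. V1_START_TS/V1_END_TS), can be sketched as follows; the class and method names are illustrative assumptions:

```python
from collections import deque

class ContentRingBuffer:
    """Fixed-capacity buffer of content segments with timestamps."""

    def __init__(self, capacity: int = 3):
        # With maxlen set, the oldest segment drops off when full.
        self.segments = deque(maxlen=capacity)

    def record(self, segment_id: str, start_ts: int, end_ts: int) -> None:
        """Record one content segment with its start and end times."""
        self.segments.append(
            {"id": segment_id, "start": start_ts, "end": end_ts})

    def timestamps(self):
        """Start/end times kept alongside each retained segment."""
        return [(s["id"], s["start"], s["end"]) for s in self.segments]

buf = ContentRingBuffer(capacity=3)
for i, (start, end) in enumerate([(0, 10), (10, 20), (20, 30), (30, 40)], 1):
    buf.record(f"v{i}", start, end)
print(buf.timestamps())  # v1 evicted; v2, v3, v4 remain
```

The retained timestamps correspond to what the passage describes being written out as the content time stamp file, while the segments themselves would be flushed to a media file.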

Concurrent to the content recorder 202 receiving and recording content from the interactive content title 230, an object library 204 receives data from the interactive content title 230, and an object recorder 206 tracks the data to determine when an object begins and ends. The object library 204 and the object recorder 206 may be implemented on the platform server 120, a cloud server, or on any of the servers 218. When the object recorder 206 detects an object beginning, the object recorder 206 receives object data (e.g., if the object were an activity: user interaction with the activity, activity ID, activity start times, activity end times, activity results, activity types, etc.) from the object library 204 and records the activity data onto an object ring-buffer 210 (e.g., ActivityID1, START_TS; ActivityID2, START_TS; ActivityID3, START_TS). Such activity data recorded onto the object ring-buffer 210 may be stored in the object file 216. Such object file 216 may also include activity start times, activity end times, an activity ID, activity results, activity types (e.g., competitive match, quest, task, etc.), and user or peer data related to the activity. For example, an object file 216 may store data regarding an item used during the activity. Such object file 216 may be stored on the object server 226, though the object file 216 may be stored on any server, a cloud server, any console 228, or any user device 130.

Such object data (e.g., the object file 216) may be associated with the content data (e.g., the media file 212 and/or the content time stamp file 214). In one example, the UGC server 232 stores and associates the content time stamp file 214 with the object file 216 based on a match between the streaming ID of the content time stamp file 214 and a corresponding activity ID of the object file 216. In another example, the object server 226 may store the object file 216 and may receive a query from the UGC server 232 for an object file 216. Such query may be executed by searching for an activity ID of an object file 216 that matches a streaming ID of a content time stamp file 214 transmitted with the query. In yet another example, a query of stored content time stamp files 214 may be executed by matching a start time and end time of a content time stamp file 214 with a start time and end time of a corresponding object file 216 transmitted with the query. Such object file 216 may also be associated with the matched content time stamp file 214 by the UGC server 232, though the association may be performed by any server, a cloud server, any console 228, or any user device 130. In another example, an object file 216 and a content time stamp file 214 may be associated by the console 228 during creation of each file 216, 214.
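The first association mechanism above, matching a content time stamp file's streaming ID against the activity IDs of stored object files, can be sketched as follows (the record shapes and ID values are hypothetical):

```python
# Hypothetical records: a content time stamp file and stored object files.
content_time_stamp_file = {"streaming_id": "s-42", "start": 100, "end": 160}
object_files = [
    {"activity_id": "s-41", "activity_type": "quest"},
    {"activity_id": "s-42", "activity_type": "competitive match"},
]

def associate(ts_file, object_files):
    """Find the object file whose activity ID matches the streaming ID."""
    for obj in object_files:
        if obj["activity_id"] == ts_file["streaming_id"]:
            return obj
    return None  # no matching object file stored

match = associate(content_time_stamp_file, object_files)
print(match["activity_type"])  # -> competitive match
```

The timestamp-based variant described above would instead compare start and end times of the two files when no shared ID is available.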

In exemplary embodiments, the media files 212 and activity files 216 may provide information to the emotion analytics server 140 regarding current session conditions. The emotion analytics server 140 may therefore use such media files 212 and activity files 216 to determine whether the current session conditions meet any of the trigger conditions of the moderation rules associated with the current users of a monitored communication session. For example, the media files 212 and activity files 216 may include references to particular characters, locations, and objects displayed and interacted with in a current gameplay session. Based on such files 212 and 216, therefore, the emotion analytics server 140 may identify in-game behaviors that may be indicative of emotion rather than competitive gameplay. For example, unusual behaviors such as attacking a teammate while yelling profanities may be identified as indicators of anger rather than a good faith effort to compete within the context of the game session. As such, evaluation of whether current in-game data meets a trigger condition may include evaluation of certain in-game statuses of the user or other users. In such cases, the emotion analytics server 140 may use media files 212 and activity files 216 to evaluate and determine whether such in-game statuses meet or contribute to the trigger conditions.
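A rule-based check like the teammate-attack example above could be sketched as follows. The event fields, the specific rules, and the single "anger" label are assumptions for illustration; the patent does not specify this logic.

```python
def detect_anger_indicators(events):
    # Flag in-game behaviors that suggest emotion rather than competitive
    # play, e.g. attacking a teammate while yelling profanities (the example
    # given in the text). Event structure is a hypothetical sketch.
    indicators = set()
    for event in events:
        if event.get("action") == "attack" and event.get("target") == "teammate":
            indicators.add("anger")
        if event.get("voice_contains_profanity"):
            indicators.add("anger")
    return indicators

events = [
    {"action": "attack", "target": "opponent"},  # ordinary competitive play
    {"action": "attack", "target": "teammate", "voice_contains_profanity": True},
]
```

A production system would presumably combine such signals with the emotion learning model described in the claims rather than rely on hand-written rules alone.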

FIG. 3 is a flowchart illustrating an exemplary method 300 for emotion detection and emotion-based moderation based on voice inputs. The method 300 of FIG. 3 may be embodied as executable instructions in a non-transitory computer-readable storage medium including but not limited to a CD, DVD, or non-volatile memory such as a hard drive. The instructions of the storage medium may be executed by a processor (or processors) to cause various hardware components of a computing device hosting or otherwise accessing the storage medium to effectuate the method. The steps identified in FIG. 3 (and the order thereof) are exemplary and may include various alternatives, equivalents, or derivations thereof including but not limited to the order of execution of the same.

In step 310, a user emotion profile may be stored in memory (e.g., databases 160) for a particular user. Different users may be associated with different user emotion profiles, as each user emotion profile may be customized to the preferences, habits, and histories of the respective user. Each user emotion profile may include one or more moderation rules, each of which may specify one or more triggering emotional conditions associated with one or more moderation actions. The triggering emotional conditions may pertain to any emotional condition regarding the user, other users, the messages in a given communication session, and associated content session(s). Meanwhile, the moderation actions may include muting, modifying, or replacing verbal and/or textual language from the communication session, presentation modification, and other modifications to the chat stream.
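One possible shape for such a profile, and for checking which of its rules fire, is sketched below. The field names, rule layout, and "any matching emotion triggers the rule" policy are assumptions for the example.

```python
# Hypothetical user emotion profile: rules pair triggering emotional
# conditions with a moderation action.
profile = {
    "user_id": "user_150D",
    "moderation_rules": [
        {"trigger_emotions": {"anger"}, "action": "mute_profanity"},
        {"trigger_emotions": {"anger", "frustration"}, "action": "modulate_voice"},
    ],
}

def actions_for(profile, detected_emotions):
    # A rule fires when any of its triggering emotional conditions is among
    # the detected emotional states (set intersection).
    return [
        rule["action"]
        for rule in profile["moderation_rules"]
        if rule["trigger_emotions"] & set(detected_emotions)
    ]
```

Keeping the rules as data rather than code makes them per-user customizable, which matches the description of profiles tailored to each user's preferences and history.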

In step 320, a current communication session may be monitored by the emotion analytics server 140 based on the user emotion profiles of the participating users. As each user has their own respective user emotion profile, the emotion analytics server 140 may retrieve the associated user emotion profiles from databases 160 and use the associated moderation rules to monitor a communication session in real-time (or close to real-time). Further, where the associated moderation rules may specify trigger conditions involving other or associated sessions (e.g., gameplay sessions), the emotion analytics server 140 may also obtain media files 212 and activity files 216 associated with the user who specified such trigger conditions. The media files 212 and activity files 216 may be used by the emotion analytics server 140 in conjunction with the conditions of the communication session to determine whether the trigger conditions are met.

In step 330, a set of indicators associated with one or more messages in the communication session may be detected as triggering a moderation rule associated with a user. As noted above, the set of indicators may trigger the moderation rule when the trigger conditions specified by that moderation rule are met by the indicators. Because different users may be associated with different moderation rules via their respective user emotion profiles, a set of indicators that triggers a moderation rule for one user may not trigger any moderation rule for another user. In a multiuser communication session where all users have different emotion profiles, different messages may trigger different levels and types of moderation for each different user.
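The per-user divergence described in step 330 can be sketched as a single pass over every participant's rules; the profile layout here is an assumed simplification (one trigger emotion per rule), not the patent's format.

```python
def triggered_actions_per_user(indicators, profiles):
    # The same set of indicators may trigger a rule for one user and none
    # for another, since each user's emotion profile differs.
    return {
        user_id: [r["action"] for r in rules if r["trigger"] in indicators]
        for user_id, rules in profiles.items()
    }

profiles = {
    "user_A": [{"trigger": "anger", "action": "mute"}],
    "user_B": [{"trigger": "sadness", "action": "soften"}],
}
result = triggered_actions_per_user({"anger"}, profiles)
```

Here the anger indicator triggers user_A's rule while user_B, whose profile only reacts to sadness, receives the message unmoderated.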

In step 340, the emotion analytics server 140 may modify the message detected in step 330 by modifying (e.g., filtering, removing, etc.) the trigger conditions in accordance with the moderation actions of the triggered moderation rule. As noted above, such modifications may include adding to, removing, or otherwise changing the trigger conditions exhibited by the message. In some implementations, such modifications may result in filtering the message in part or entirely out of the chat stream of the communication session.

In step 350, the emotion analytics server 140 may modify the presentation of the current communication session provided to the user device 150 in accordance with the moderation actions of the triggered moderation rule. The presentation modification may include changing the content (e.g., specific words, terms, phrases) of triggering communications that is presented to the user device 150, as well as changing how such content is presented to the user device 150. Further, different media elements (e.g., from media files 212, activity files 216, or predefined media specified by moderation rules) may be added to or removed from the chat stream before being presented to the specific user associated with the triggered moderation rule.
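The two modification outcomes described in steps 340-350, dropping a message from the chat stream entirely versus masking the triggering words in place, could be sketched for text messages as below. The action names and the placeholder word list are assumptions for the example, not from the patent.

```python
import re

def apply_moderation(text, action, masked_words=("darn", "heck")):
    # Either filter the message out of the presented chat stream entirely,
    # or mask listed words in place; any other action leaves text unchanged.
    if action == "filter_message":
        return None  # removed from the presented chat stream
    if action == "mute_profanity":
        pattern = re.compile("|".join(map(re.escape, masked_words)), re.IGNORECASE)
        return pattern.sub(lambda m: "*" * len(m.group()), text)
    return text
```

Voice-based moderation (auto-tuning or regenerating speech) would follow the same dispatch structure but operate on audio rather than text.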

FIG. 4 is a diagram illustrating an exemplary implementation of emotion detection and emotion-based moderation based on voice inputs. The diagram of FIG. 4 illustrates user devices 150A-D that may be part of the same communication session being monitored by the emotion analytics server 140. Where the communication session may be associated with a gameplay session, the emotion analytics server 140 may further monitor the gameplay session or otherwise obtain data regarding the gameplay session (e.g., via UDS system 200).

As discussed above, each user of user devices 150A-D may be associated with their own respective user emotion profile(s). The emotion analytics server 140 may therefore monitor all communications originating from user devices 150A-D and evaluate whether such communications meet the triggering emotion conditions in the user emotion profiles of the users participating in the communication session. Where triggering emotion conditions are detected, the associated moderation rule(s) may be applied to the message or messages exhibiting or associated with the triggering emotion condition.

In an exemplary implementation, the user of user device 150D may originate a message identified as being associated with anger (e.g., angry message 410). The emotion analytics server 140 may identify indicators of anger in real-time as being present in or exhibited by angry message 410, which may include a text-based chat message or verbal chat message (including in-video speech). Such indicators may be associated with the angry message 410 itself or may be associated with the context within which the angry message 410 appeared. For example, the angry message 410 may have appeared at the same time that the user of user device 150D suffered an in-game loss or setback. Depending on the emotion profiles of each of the users of user devices 150A-D (including the originator of the angry message 410), the presentation of the angry message 410 may be modified in real-time to result in different modified messages 420A-C sent to user devices 150A-C. Thus, the user devices 150A-C may present different versions of the messages in the communication session in real-time based on the respective emotion profiles.

For example, the emotion profile associated with the user of user device 150D—who originated the angry message 410—may include moderation rules applicable to their own messages, which may be used for self-censorship or otherwise automatically guarding against exposure of certain information (e.g., indicators of age, gender). In such instances, the same moderation rule may be applied to modify an aspect of the angry message 410 within each of the modified messages 420A-C provided to user devices 150A-C. For example, the emotion profile associated with the user of user device 150D may specify that all outgoing voice-based messages from user device 150D be modified in accordance with a moderation action that results in a computer-generated voice and vocal effects corresponding to a favorite game character.

In addition, the emotion profiles associated with the users of user devices 150A-C may further specify different moderation rules applicable to incoming messages. For example, the emotion profile associated with the user of user device 150A may not specify any moderation rules that are triggered by angry message 410, but the user may nevertheless be presented with modified message 420A based on the moderation rules of the user of user device 150D who originated the angry message, i.e., the angry message 410 as voiced by the favorite game character of the user of user device 150D.

Meanwhile, the emotion profile associated with the user of user device 150B may specify a triggering emotional condition as being met by profanity and a corresponding moderation action that mutes any profanity. Applying such a moderation action to the angry message 410 may include identifying specific words or phrases that constitute profanity within the angry message 410 and muting the identified words or phrases to generate modified message 420B, which is then provided to user device 150B.

At the same time, the emotion profile associated with the user of user device 150C may specify a triggering emotional condition as being met by voice-based indicators of anger and a corresponding moderation action that modulates or modifies the voice to remove the voice-based indicators of anger. Applying such a moderation action to the angry message 410 may include identifying specific voice-based words or phrases that exhibit the indicators of anger within the angry message 410 and auto-tuning the identified words or phrases in the angry message 410 to remove such indicators of anger (or using a computer-based voice generator to recreate the content of angry message 410 without the voice-based indicators of anger). As such, modified message 420C may be generated and then provided to user device 150C.
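The FIG. 4 fan-out, in which one originating message yields a distinct modified message per recipient, can be sketched as below. The action names and text transforms are illustrative stand-ins for the voice replacement, profanity muting, and anger neutralization described above.

```python
def moderate_for_recipients(message, recipient_actions):
    # Produce a per-recipient version of a single message, applying each
    # recipient's triggered moderation action (a hypothetical sketch).
    out = {}
    for user_id, action in recipient_actions.items():
        if action == "mute_profanity":
            out[user_id] = message.replace("darn", "****")
        elif action == "neutralize_anger":
            out[user_id] = f"[neutral voice] {message}"
        else:  # no incoming rule triggered; originator's character-voice rule applies
            out[user_id] = f"[game-character voice] {message}"
    return out

versions = moderate_for_recipients(
    "darn it!",
    {"150A": None, "150B": "mute_profanity", "150C": "neutralize_anger"},
)
```

As in the example above, user device 150A receives the character-voiced message, 150B the profanity-muted one, and 150C the anger-neutralized one, all derived from the same angry message.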

FIG. 5 is a block diagram of an exemplary electronic entertainment system that may be used in embodiments of the present invention. The entertainment system 500 of FIG. 5 includes a main memory 505, a central processing unit (CPU) 510, a vector unit 515, a graphics processing unit 520, an input/output (I/O) processor 525, an I/O processor memory 530, a controller interface 535, a memory card 540, a Universal Serial Bus (USB) interface 545, and an IEEE 1394 interface 550. The entertainment system 500 further includes an operating system read-only memory (OS ROM) 555, a sound processing unit 560, an optical disc control unit 570, and a hard disc drive 565, which are connected via a bus 575 to the I/O processor 525.

Entertainment system 500 may be an electronic game console. Alternatively, the entertainment system 500 may be implemented as a general-purpose computer, a set-top box, a hand-held game device, a tablet computing device, or a mobile computing device or phone. Entertainment systems may contain more or fewer operating components depending on a particular form factor, purpose, or design.

The CPU 510, the vector unit 515, the graphics processing unit 520, and the I/O processor 525 of FIG. 5 communicate via a system bus 585. Further, the CPU 510 of FIG. 5 communicates with the main memory 505 via a dedicated bus 580, while the vector unit 515 and the graphics processing unit 520 may communicate through a dedicated bus 590. The CPU 510 of FIG. 5 executes programs stored in the OS ROM 555 and the main memory 505. The main memory 505 of FIG. 5 may contain pre-stored programs and programs transferred through the I/O processor 525 from a CD-ROM, DVD-ROM, or other optical disc (not shown) using the optical disc control unit 570. The I/O processor 525 of FIG. 5 may also allow for the introduction of content transferred over a wireless or other communications network (e.g., 4G, LTE, 3G, and so forth). The I/O processor 525 of FIG. 5 primarily controls data exchanges between the various devices of the entertainment system 500, including the CPU 510, the vector unit 515, the graphics processing unit 520, and the controller interface 535.

The graphics processing unit 520 of FIG. 5 executes graphics instructions received from the CPU 510 and the vector unit 515 to produce images for display on a display device (not shown). For example, the vector unit 515 of FIG. 5 may transform objects from three-dimensional coordinates to two-dimensional coordinates and send the two-dimensional coordinates to the graphics processing unit 520. Furthermore, the sound processing unit 560 executes instructions to produce sound signals that are output to an audio device such as speakers (not shown). Other devices, such as wireless transceivers, may be connected to the entertainment system 500 via the USB interface 545 and the IEEE 1394 interface 550, or may be embedded in the system 500 or included as part of some other component such as a processor.

A user of the entertainment system 500 of FIG. 5 provides instructions via the controller interface 535 to the CPU 510. For example, the user may instruct the CPU 510 to store certain game information on the memory card 540 or other non-transitory computer-readable storage media, or instruct a character in a game to perform some specified action.

The present invention may be implemented in an application that may be operable by a variety of end user devices. For example, an end user device may be a personal computer, a home entertainment system (e.g., Sony PlayStation2® or Sony PlayStation3® or Sony PlayStation4®), a portable gaming device (e.g., Sony PSP® or Sony Vita®), or a home entertainment system of a different albeit inferior manufacturer. The present methodologies described herein are fully intended to be operable on a variety of devices. The present invention may also be implemented with cross-title neutrality, wherein an embodiment of the present system may be utilized across a variety of titles from various publishers.

The present invention may be implemented in an application that may be operable using a variety of devices. Non-transitory computer-readable storage media refer to any medium or media that participate in providing instructions to a central processing unit (CPU) for execution. Such media can take many forms, including, but not limited to, non-volatile and volatile media such as optical or magnetic disks and dynamic memory, respectively. Common forms of non-transitory computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, digital video disk (DVD), any other optical medium, RAM, PROM, EPROM, a FLASH EPROM, and any other memory chip or cartridge.

Various forms of transmission media may be involved in carrying one or more sequences of one or more instructions to a CPU for execution. A bus carries the data to system RAM, from which a CPU retrieves and executes the instructions. The instructions received by system RAM can optionally be stored on a fixed disk either before or after execution by a CPU. Various forms of storage may likewise be implemented as well as the necessary network interfaces and network topologies to implement the same.

The foregoing detailed description of the technology has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the technology to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen in order to best explain the principles of the technology, its practical application, and to enable others skilled in the art to utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the technology be defined by the claims.

Claims

  1. A method for emotion-based moderation, the method comprising: storing a user emotion profile in memory for a user, wherein the user emotion profile includes one or more moderation rules that specify a moderation action responsive to one or more emotional states of other users; monitoring a current communication session associated with a user device of the user based on the user emotion profile, wherein the current communication session includes messages from one or more other user devices of the other users; identifying an emotional state of one of the other users based on analyzing a subset of the messages from the other user using an emotion learning model that correlates emotion to one or more emotional indicators including at least one of error in gameplay, character behavior, or player physiological indicator that includes at least one of a heart rate or a skin tone; detecting that the emotional state identified for the other user triggers at least one of the moderation rules, wherein the detected emotional state corresponds to at least one of the emotional states specified by the user emotion profile; and modifying a presentation of at least one of the messages in the subset being provided to the user device in accordance with the at least one triggered moderation rule, wherein at least a part of the at least one message is filtered out of the modified presentation in accordance with the moderation action specified by the at least one triggered moderation rule from the user emotion profile of the user, and wherein the modified presentation is different from a version provided to each of the other users of the current communication session.
  2. The method of claim 1, wherein monitoring the current communication session includes monitoring emotional states associated with the user and with each of the other users.
  3. The method of claim 2, wherein monitoring the current communication session further includes assigning the emotional states associated with the user and with each of the other users to one or more predetermined categories.
  4. The method of claim 3, wherein the predetermined categories include one or more subcategories corresponding to a gradient between mild and extreme levels of the associated emotional state.
  5. The method of claim 1, wherein the subset of messages is associated with one or more identified users within the current communication session.
  6. The method of claim 5, further comprising imposing a punishment for the identified users, wherein the punishment includes a loss of one or more privileges associated with the current communication session.
  7. The method of claim 1, wherein modifying the presentation of the at least one message includes neutralizing the detected emotional state associated with the at least one message.
  8. A system for emotion-based moderation, the system comprising: memory that stores a user emotion profile for a user, wherein the user emotion profile includes one or more moderation rules that specify a moderation action responsive to one or more emotional states of other users; a communication interface that communicates over a communication network, wherein the communication interface connects to a current communication session associated with a user device of the user, wherein the current communication session includes messages from one or more other user devices of the other users; and a processor that executes instructions stored in memory, wherein the processor executes the instructions to: monitor the current communication session based on the user emotion profile, identify an emotional state of one of the other users based on analyzing a subset of the messages from the other user using an emotion learning model that correlates emotion to one or more emotional indicators including at least one of error in gameplay, character behavior, or player physiological indicator that includes at least one of a heart rate or a skin tone, detect that the emotional state identified for the other user triggers at least one of the moderation rules, wherein the detected emotional state corresponds to at least one of the emotional states specified by the user emotion profile, and modify a presentation of at least one of the messages in the subset being provided to the user device in accordance with the at least one triggered moderation rule, wherein at least a part of the at least one message is filtered out of the modified presentation in accordance with the moderation action specified by the at least one triggered moderation rule from the user emotion profile of the user, and wherein the modified presentation is different from a version provided to each of the other users of the current communication session.
  9. The system of claim 8, wherein the processor monitors the current communication session by monitoring emotional states associated with the user and with each of the other users.
  10. The system of claim 9, wherein the processor monitors the current communication session by further assigning the emotional states associated with the user and with each of the other users to one or more predetermined categories.
  11. The system of claim 10, wherein the predetermined categories include one or more subcategories corresponding to a gradient between mild and extreme levels of the associated emotional state.
  12. The system of claim 8, wherein the subset of messages is associated with one or more identified users within the current communication session.
  13. The system of claim 12, wherein the processor executes further instructions to impose a punishment for the identified users, wherein the punishment includes a loss of one or more privileges associated with the current communication session.
  14. The system of claim 8, wherein the processor modifies the presentation of the at least one message by neutralizing the detected emotional state associated with the at least one message.
  15. A non-transitory, computer-readable storage medium, having embodied thereon a program executable by a processor to perform a method for emotion-based moderation, the method comprising: storing a user emotion profile in memory for a user, wherein the user emotion profile includes one or more moderation rules that specify a moderation action responsive to one or more emotional states of other users; monitoring a current communication session associated with a user device of the user based on the user emotion profile, wherein the current communication session includes messages from one or more other user devices of the other users; identifying an emotional state of one of the other users based on analyzing a subset of the messages from the other user using an emotion learning model that correlates emotion to one or more emotional indicators including at least one of error in gameplay, character behavior, or player physiological indicator that includes at least one of a heart rate or a skin tone; detecting that the emotional state identified for the other user triggers at least one of the moderation rules, wherein the detected emotional state corresponds to at least one of the emotional states specified by the user emotion profile; and modifying a presentation of at least one of the messages in the subset being provided to the user device in accordance with the at least one triggered moderation rule, wherein at least a part of the at least one message is filtered out of the modified presentation in accordance with the moderation action specified by the at least one triggered moderation rule from the user emotion profile of the user, and wherein the modified presentation is different from a version provided to each of the other users of the current communication session.
  16. The non-transitory, computer-readable storage medium of claim 15, wherein monitoring the current communication session further includes assigning the emotional states associated with the user and with each of the other users to one or more predetermined categories.
  17. The non-transitory, computer-readable storage medium of claim 16, wherein the predetermined categories include one or more subcategories corresponding to a gradient between mild and extreme levels of the associated emotional state.
  18. The non-transitory, computer-readable storage medium of claim 15, wherein the subset of messages is associated with one or more identified users within the current communication session.
  19. The non-transitory, computer-readable storage medium of claim 15, further comprising instructions executable to impose a punishment for the identified users, wherein the punishment includes a loss of one or more privileges associated with the current communication session.
