Coach Effectiveness and Personality Assessments: An Exploratory Analysis of Thin Slice Interpersonal Perceptions

Jennifer L. Knight, Daniel R. Czech, Barry Joyner, Tyler McDaniel, Alan D. Zwald, Trevor Egli, Larry D. Bryant and Trey Burdette; Georgia Southern University

### Abstract

Gordon Allport (3) suggested that people are able to form accurate perceptions of others from mere glimpses of their behavior. The concept of interpersonal perception accuracy based solely on thin slices has been brought to mainstream attention by the popular book Blink by Malcolm Gladwell (35). Gladwell (35) proclaims that “decisions made very quickly can be as good as decisions made consciously and deliberately” (p. 14). Research has suggested that expressive behaviors (movement, speech, gesture, facial expressions, posture) contribute to impressions made about the target (8). Likewise, coaching research has identified behaviors that elicit positive perceptions from athletes towards coaches (63, 78). This research examined accuracy, consensus, and self-other agreement of personality assessments and coaching effectiveness based on thin-slice judgments of 30-second video clips of 9 recreation level coaches. Naïve raters (N=206) viewed the clips and rated the targets on coaching effectiveness and personality attributes. Ratings of coaching effectiveness were correlated with expert ratings of effectiveness to measure accuracy. The ratings of attributes were correlated with expert ratings of the same attributes to measure consensus. Gender, race, and level of sport participation of naïve raters were subjected to independent samples t-tests and one-way analyses of variance (ANOVA) to determine if they moderated thin-slice judgments. Results indicated that naïve raters as a group were not accurate in assessment of coaching effectiveness, nor were there significant correlations on consensus or self-other agreement. There were significant differences between sport participation groups on two of the fourteen attributes: competence and confidence.

**Key Words:** Thin-slicing, Coaching Effectiveness, Consensus, Accuracy

### Introduction

In 1937, Gordon Allport (3) introduced the idea that people are able to form accurate perceptions of others from mere glimpses of their behavior. Making judgments from so-called “thin slices” of behavior has become very popular in contemporary social psychological research (6-9). The concept of interpersonal perception accuracy based on thin slices was brought to mainstream attention by the popular book Blink by Malcolm Gladwell (35). Gladwell suggests that most people can thin-slice with surprising success, so that “decisions made very quickly can be as good as decisions made consciously and deliberately” (p. 14). Gladwell provides examples from academic research to support his overall premise, including that of Ambady and Rosenthal (9). Thin slices are brief excerpts of expressive behavior, less than five minutes long, sampled from the behavioral stream (6).

Ambady and Rosenthal (8) suggested that expressive behaviors (movement, speech, gesture, facial expressions, posture) contribute to impressions made about the target. Early researchers were interested in expressive behaviors as indicators of personality (3,4). The cues projected by expressive behavior have been shown to be interpreted accurately from as little as a 2-second nonverbal clip of a target (9).

Ambady and Rosenthal (8) also suggested that the accuracy of thin-slice judgments has practical applications in fields that are interpersonally oriented. When thin-slice ratings predict criterion variables, they can be used, for example, to target biased teachers or gauge expectancies of newscasters. They also suggest that thin-slice judgments can be used in the selection, training, and evaluation of people in fields where interpersonal skills are important. Accurate thin-slice judgments of coaches could therefore be very useful in the selection, training, and evaluation of coaches.

Accuracy in personality and social psychology research can be defined in three ways: the degree of correspondence between a judgment and a criterion, interpersonal consensus, and a construct possessing pragmatic utility (49). These definitions fall into two approaches within the field. The pragmatic approach defines a judgment as accurate if it predicts behavior. This approach looks at personality judgments as necessary tools for social living and evaluates their accuracy in terms of their practical value (31). The constructivist approach focuses on consensus between raters. This approach looks at all judgments as perceptions and evaluates their accuracy in terms of agreement between judges (31). Kenny (45) further explained that target accuracy is broken into three categories: perceiver, generalized, and dyadic. Generalized target accuracy is the correlation between how a person is generally seen by others and how that person generally behaves. Target accuracy can be defined in thin-slice research as the correspondence between participants’ judgments of a target individual and a well-defined external criterion (6,8,9).

Thin-slice judgments have been shown to produce judgments similar to ecologically valid criteria. Ecologically valid criteria are characterized by pragmatic utility in that they are used in everyday decisions about people as an external outcome of observed behavior (9). Support for congruence in this relationship has been shown by significant positive correlations between naïve judgments and outcomes, such as judgments of candidates in job interviews and the effectiveness of teachers (7).

The target accuracy and consensus of naïve raters given thin slices of information appear to be moderated by characteristics of the raters, the traits assessed, and characteristics of the targets. Studies show that individual differences among raters, including gender and ethnicity, can affect judgments based on thin slices of information (6,7,29,73). Previous research is equivocal regarding the accuracy of judgments based on gender. Some research suggests that females are more accurate judges of non-verbal behavior (40), while other research found no difference in judgments of non-verbal behavior based on gender (8). Researchers have found that raters judge targets of a different ethnicity more negatively than targets of the same ethnicity (73).

Another bias can involve the dimensions being rated. One study found accuracy at zero acquaintance for judgments of extraversion, but not conscientiousness (47). Another study found similar correlations for extraversion as well as a relationship for zero-acquaintance ratings of conscientiousness, but not for agreeableness, emotional stability, and culture (14). John and Robins (42) suggest that differences in ratings on traits depend on evaluativeness and observability. Traits that are less evaluative (neutral) and more observable reach greater consensus and accuracy (42). They define observability as the degree to which behaviors relevant to the trait can be easily observed. They define evaluativeness in terms of the degree to which a trait is relatively neutral.

Limitations are also present in the persons being judged. Persons who possess extraversion and good mental health are simpler to judge at first glance than targets who possess introversion or poor mental health; as Flora (28) notes, their “exterior behavior mimics their internal view of themselves. What you see is what you get” (p. 66). Social context can also play a role depending on personality types. Individuals high in self-monitoring limit their expressive behaviors in social situations, which makes judgments of their mood more difficult.

Ambady and Rosenthal (9) researched intuitive judgments on teacher effectiveness. It was determined that thin-slice evaluations by naïve raters of 30 seconds, 5 seconds, and 2 seconds were congruent with evaluations by students and principals who observed the teacher for a semester. It is suggested that the accuracy of the thin-slice judgments can be attributed to raters’ years of experience in classroom situations; therefore, within the coaching context, amount of sport experience may also be an individual difference that moderates interpersonal perception accuracy. Ambady and Rosenthal (8) measured judgments on fourteen personality attributes: accepting, active, attentive, competent, confident, dominant, empathic, enthusiastic, honest, likable, optimistic, professional, supportive, and warm. Teaching, like coaching, is an interpersonal field. Due to similarities in the fields, the same attributes were chosen in this study.

The teaching and coaching environments may have parallels and crossover applications. Often cited in coaching and teaching lore is John Wooden, who was one of the most successful collegiate basketball coaches. Wooden pointed out that coaches are teachers first and profiled ten criteria needed for a successful teacher, among them knowledge, warm personality, and genuine consideration of others (79).

Research in the teaching profession highlights attributes of successful teaching. The list includes a teacher’s enthusiasm and positive attitude, approachability, an environment that is positive, cooperative, and clear-cut, specific objectives, as well as appropriate feedback (20,52,62). Wooden’s (79) coaching philosophy includes all of the aforementioned in his pyramid of success. Bloom (13) explains that coaching, like teaching, can perhaps best be viewed as an interpersonal relations field, which rests primarily on effective communication and interaction among various participants.

Coaching research has identified behaviors that elicit positive perceptions from athletes towards coaches (63,78). Behaviors include positive reinforcement, technical instruction, encouragement, and structuring fun practices. It is theorized that coaches’ behaviors play a significant role in the psychological development of young athletes (64). Youth sport research highlights the positive relationship between specific coaching behaviors and self-esteem, satisfaction, and enjoyment in children (64,67). This has led to a recent theoretical model (19) that emphasizes how coaching behaviors impact youth psychosocial outcomes and highlights the role of athletes’ perceptions.

A recent study explored the characteristics of expert university level coaches and found several personal attributes that these coaches possessed: Commitment to learning; learning from past mistakes; knowledgeable; open-minded; balanced; composed; caring; and genuinely interested in their athletes (72).

Previous research targets the importance of increasing coaches’ self-awareness of their personal behavior while coaching (63,65,74). In a study that coded coaches’ behaviors, the athletes were significantly more successful than the coaches in recalling those behaviors (63). This same research determined that youth athletes’ interpretations of coaches’ behaviors have an even greater impact on psychosocial outcomes than the actual behaviors. At the recreation level, game outcomes bear little significance for athletes’ psychosocial outcomes (reaction to coach, enjoyment, and self-esteem). The measurement of psychosocial outcomes showed a significant relationship between coaches’ behavior and the aforementioned outcomes. Earlier research (13) indicated that the coach is central to the development of expertise in a sport.

Nonverbal behavior can be very significant in an environment where high levels of stress and decision-making are concerned. Perceptions can cause shifts in confidence.

Research indicates that the self-efficacy of athletes who judged opponents’ non-verbal behavior was directly related to those perceptions (39). Outcome expectations may be influenced by perceptions of sporting opponents and have been shown to influence performance levels (24,26,76).

The purpose of this study is to examine the relationship between naïve ratings of thin-slices of coaching and ecologically valid criterion measures, which are end of the season evaluations by supervisors, as well as self measures of coaching attributes and effectiveness. This research will also include the demographic background of the naïve raters and explore the differences among evaluations based on gender, race, and level of sport participation. The following nine research questions are explored: What is the naïve raters’ accuracy in their assessment of coaching effectiveness; What is the consensus between naïve raters and experts on each attribute; What is the self-other agreement between naïve raters and coaches on each attribute; Is there a significant difference in accuracy between male and female raters; Is there a significant difference in consensus between male and female raters; Is there a significant difference in accuracy between races of raters; Is there a significant difference in consensus between races of raters; Is there a significant difference in accuracy between raters’ level of sport participation; Is there a significant difference in consensus between raters’ level of sport participation?

### Methods

#### Participants

There were two samples of participants in this study. Sample A consisted of 206 naïve raters recruited from undergraduate healthful living classes. Raters ranged from 18 to 55 years old (M = 19.6; SD = 4.4) and included 115 men and 91 women. Raters included African-Americans (n = 47), Caucasians (n = 147), Hispanics (n = 6), and other races (n = 4). The naïve raters indicated the highest level of sport in which they participated: none (n = 26); recreation (n = 46); junior varsity (n = 16); varsity/elite (n = 91); and college (n = 20). Sample B consisted of nine coaching students (eight men, one woman) from an undergraduate level coaching course at a southeastern university. There were eight Caucasian coaches and one African-American coach. The average age of the coaches was 20.2 years old (SD = 1.4).

#### Instrumentation

Coach attributions. Naïve raters, coaches, and supervisors rated each coach using an attributional survey (9), which included the following subscales: accepting, active, attentive, competent, confident, dominant, empathic, enthusiastic, honest, likable, optimistic, professional, supportive, and warm. Each coach was rated three times for each attribute on a 9-point Likert scale ranging from not at all (1) to very (9). In previous research, the reliability of the mean of the judges’ ratings of the sum of the mean ratings of the 14 nonverbal variables was .80, assessed by an intraclass correlation (9).

Coach effectiveness. In addition, overall effectiveness of the coach was rated on a 5-point Likert scale: “Overall, how would you rate this coach?” Respondents could answer from very poor (1) to very good (5). Coaches and supervisors completed evaluations with the attributional survey and overall effectiveness questions at the end of the evaluation tool.

#### Procedures

Permission was obtained to use videotapes of coaching sessions by nine students in an undergraduate coaching class, who, as part of their course, were filmed during a practice session to be evaluated by their professor. The students coached recreation level youth football (n = 5) and soccer (n = 4) teams, which ranged in competition level from under six to under fourteen. Consistent with Ambady and Rosenthal’s (9) previous research, three 10-second silent video clips were taken from each coach’s session, from the beginning, middle, and end. The clips feature the coach alone, consistent with previous research, to control for interaction effects in the environment of the target (9).

All of the coaches’ clips were arranged on one videotape in a randomized Latin-square design (8). The final tape consisted of 27 clips: 3 clips for each of the 9 coaches.

Each coach rated him/herself on the attribution scale and effectiveness item.

Supervisors completed the attribution scale and overall effectiveness item on each coach as part of their formal evaluation of the coach. Evaluations were delivered by the supervisors to the professor and picked up by the researcher.

Raters completed a demographic questionnaire and watched the videotape of the 27 10-second clips. Following each clip, raters completed the attributional scale and overall effectiveness question. End-of-season evaluations by the recreation department supervisors, as well as self-evaluations, were used for comparison with the raters’ scores on each of the 14 attributes.

#### Data Analysis

Given that each naïve rater rated each of nine coaches on three occasions, a within-rater mean across three occasions was computed for each coach for each attribute as well as effectiveness. To create an individual difference variable representing target accuracy, 206 correlations between each rater’s mean effectiveness scores and supervisor effectiveness scores (df = 7) were calculated. To create an individual difference variable representing consensus, 206 correlations between each rater’s mean scores and supervisor scores (df = 7) were calculated for each attribute. To create an individual difference variable representing self-other agreement, 206 correlations between each rater’s mean scores and self scores (df = 7) were calculated for each attribute and for effectiveness.
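The per-rater computation described above can be sketched as follows. This is a minimal illustration only: the function names (`pearson_r`, `rater_score`) and all rating values are hypothetical and are not the study's data or software. Each rater's three clip ratings per coach are averaged, and the nine coach means are then correlated with the nine criterion scores, which gives df = 9 - 2 = 7.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # no variance in one set of scores: r is undefined
    return sxy / sqrt(sxx * syy)

def rater_score(clip_ratings, criterion):
    """Correlate one rater's within-rater coach means with criterion scores.

    clip_ratings: one inner list of three clip ratings per coach.
    criterion: one supervisor (or self) score per coach, in the same order.
    """
    coach_means = [sum(r) / len(r) for r in clip_ratings]
    return pearson_r(coach_means, criterion)

# Hypothetical supervisor effectiveness scores for nine coaches (1-5 scale).
supervisor = [4, 5, 4, 3, 5, 4, 5, 4, 5]
# Hypothetical ratings by one naive rater: three clips for each of nine coaches.
rater = [[3, 4, 5], [4, 5, 5], [2, 3, 4], [3, 3, 3], [5, 4, 4],
         [4, 4, 3], [5, 5, 4], [3, 2, 4], [4, 5, 5]]

r = rater_score(rater, supervisor)
df = len(supervisor) - 2
print(f"r = {r:.3f}, df = {df}")
```

Repeating this for every rater would yield the 206 correlations used as the individual difference variable. A rater who gives every coach the same mean rating has no defined correlation, which the sketch returns as `None` rather than guessing how such cases were handled.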

Inferential statistics were utilized to examine moderators of target accuracy, consensus, and self-other agreement. Means were compared using independent sample t-tests for gender comparisons and one-way ANOVAs for comparisons between races and sport participation groups. Post hoc comparisons using Fisher’s LSD were conducted on any significant results ascertained from ANOVAs (p < .01).
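As a rough illustration of the group comparisons, the one-way ANOVA F statistic can be computed from per-rater scores grouped by a moderator. The function name `one_way_anova` and the group values below are hypothetical; this sketch returns only F and its degrees of freedom and does not compute p-values or the Fisher's LSD follow-up.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups and N > k")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical per-rater consensus correlations for three participation groups.
none_group = [0.10, 0.05, -0.02, 0.12]
varsity_group = [0.08, 0.15, 0.02, 0.11, 0.06]
college_group = [0.35, 0.42, 0.28, 0.39]

f_stat, df_b, df_w = one_way_anova([none_group, varsity_group, college_group])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With exactly two groups, this F equals the square of the pooled-variance independent-samples t statistic, so the same function covers the gender comparison in outline. In practice a statistics package such as SciPy (`scipy.stats.f_oneway`, `scipy.stats.ttest_ind`) would also supply the p-values.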

Individual correlations between each naïve rater’s scores on effectiveness and the supervisors’ scores on effectiveness across the nine coaches were calculated and a mean correlation was obtained. This provided an individual difference variable representing target accuracy.

Individual correlations between each naïve rater’s attributional ratings across nine coaches observed and the supervisors’ attributional ratings of these coaches were calculated and a mean correlation was determined to provide an individual difference variable representing consensus. Individual correlations between each naïve rater’s attributional ratings across nine coaches observed and the coaches’ own ratings were calculated and a mean correlation was determined to provide an individual difference variable representing self-other agreement.


### Results

The mean correlations between the naïve raters’ effectiveness ratings and the supervisors’ effectiveness ratings were calculated to estimate target accuracy of the thin slice judgments by the naïve raters (see Table 1).

The mean correlations between the naïve raters’ ratings on each of the fourteen attributes and the supervisors’ ratings on the same attributes were calculated to estimate consensus; results for self-other agreement are also reported (see Table 1). Independent samples t-tests were run on the means for male and female raters to determine differences between the two groups on accuracy. No differences in accuracy were found between groups (see Table 2). Independent samples t-tests on consensus found a significant gender difference (p < .01) on one of the fourteen variables: likeability. Female raters had higher mean consensus than male raters on likeability (see Table 2).

Due to the small sample sizes of the Hispanic, Asian, and Other categories, these categories were not included in analyses of race differences. Independent samples t-tests on differences between Caucasian and African-American raters found no significant differences on accuracy or consensus (p > .01) (see Table 3).

In addition, a one-way ANOVA showed no significant differences between levels of sport participation on accuracy (p > .01) (see Table 4). However, there were significant differences (p < .01) between sport participation groups on consensus for two of the fourteen variables: competence and confidence (see Table 4). Fisher’s LSD post hoc tests indicated that naïve raters who participated in collegiate athletics showed significantly more consensus with supervisor ratings on competence than raters in all other sport participation categories. College raters also showed significantly more consensus with supervisor ratings on confidence than two other sport participation groups: no participation and varsity/elite participation.

### Discussion

There were several constructs of accuracy measured in this study. The first research question examined the target accuracy of the naïve raters. Due to the lack of correlation between the naïve raters’ judgments and the supervisors’ evaluations, the naïve raters as a group were not accurate in their assessments of coaching effectiveness. There are several explanations why this may have occurred. The nine coaches varied across two sports and four age levels. They were not observed directly with the athletes so differences in coaching behaviors due to varying age and sport contexts may have caused some of the variability. Thin-slice judgments in the sport context may have more variables that need to be controlled for than thin-slicing in classroom settings or social settings that have been previously examined. Modeling the Ambady and Rosenthal (8) study, the coaches were presented on muted video clips without athletes present. Ambady and Rosenthal (8) presented teachers alone in the clips they showed to naïve raters to control for biases to the reactions from students being taught. The coaching context requires adaptations to lessons as well as more frequent feedback. There may be a need for more frequent transactions whereas teaching may include more directive communication. Observations of a coach may require this interaction to accurately assess coaching effectiveness. The design of this study did not allow naïve raters to observe direct interactions between the coach and players.

Another explanation that supports the complexity of the sport context is the individual differences in perceptions of effective coaches. Previous research found a negative correlation between body size and perceptions of coaching effectiveness by female gymnasts, while no correlation was found for soccer players or basketball players (21). This study did not survey for particular sport participation, so variation may be due mainly to perceptions of coaching effectiveness in a particular sport. Other research suggests that the personality of the athlete can affect coaching evaluations. Williams et al. (78) found that athletes with higher anxiety and lower self-confidence rated effectiveness of coaches more negatively. This study did not look at the personality makeup of the raters to determine if those attributes moderate accuracy.

Previous research also suggests that mood state can affect evaluations (6). Recent research shows that the mood state of customers can affect their evaluation of salespeople (57). When customers were in a bad mood and the salesperson was perceived as happy, the customer rated the salesperson negatively. Ambady and Gray (6) found that negative mood states affected accuracy of social perceptions.

Another possible explanation why there was not a relationship between naïve raters and coaches on coaching effectiveness is the lack of congruency between the present circumstances of the raters and the environment of the target. The targets were coaching at the recreation level and the raters were college students. If they had participated in recreation level athletics they were many years removed from the situation. Much of the previous research on thin-slicing has used blind raters who are within the context being evaluated. One example is Ambady and Rosenthal’s (8) study on teacher effectiveness. The naïve raters were college students and they were rating college instructors and their judgments were compared to other student evaluations. This current study used college aged naïve raters who evaluated other college student coaches in a youth sport context. Other studies look at social contexts that most people are familiar with on a current basis (7). It may be useful to preface the thin-slicing with the context being rated. The naïve raters were not aware they were judging recreation level coaches. It may have been more useful to use parents of children who are in the recreation level context.

Consensus was defined within this study as the agreement between the naïve raters and the expert on personality attributes. Consensus between naïve raters and experts was not reached on thirteen of the fourteen attributes. Considering how many correlations were measured, it can be expected that one could reach significance solely by chance. Kenny (45) defines consensus as the agreement between two raters. This research treats the naïve raters as one rater and the expert as the second rater. Consensus operationalized this way shows whether naïve raters view a target similarly to a person who has greater knowledge of the target.
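The point about chance significance can be made concrete with a short calculation. The sketch below assumes the fourteen attribute tests are independent, which is a simplification because ratings of related attributes are likely correlated. The alpha levels shown are illustrative, since the criterion applied to these correlations is not stated here, and the function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def expected_false_positives(alpha, n_tests):
    """Expected number of tests that reach significance by chance alone."""
    return alpha * n_tests

n_attributes = 14
for alpha in (0.01, 0.05):
    fwe = familywise_error(alpha, n_attributes)
    exp_fp = expected_false_positives(alpha, n_attributes)
    print(f"alpha = {alpha}: P(at least one) = {fwe:.3f}, expected = {exp_fp:.2f}")
```

Under these assumptions, the chance of at least one spurious result across fourteen tests is considerably larger than the per-test alpha, which is why a single significant attribute should be interpreted cautiously.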

This approach has limitations because the naïve raters are compared with only one knowledgeable rater. Previous research suggests that there is greater accuracy in judgments of a target when there are two or more evaluations from people who know the target (48). Consensus may have been higher if more than one judgment by knowledgeable others could have been averaged to determine consensus. Consensus in Ambady and Rosenthal’s (9) research was operationalized by intercorrelations of naïve raters’ judgments of attributes, which were placed in a 15 × 15 matrix and subjected to a principal components analysis. It is possible that consensus among naïve raters was reached in this study, which would mean they viewed the target similarly. This is a research question that should be considered for future research.

In regard to consensus, there was a moderate relationship between naïve raters and supervisors on the attribute enthusiastic. Previous research on Norman and Goldberg’s (54) Big Five and zero acquaintance found consensus on the extraversion factor of the Big Five (33,46,55). Characteristics suggested by the extraversion category include sociable and energetic. It is possible that enthusiastic may be very similar to, or an expression of, extraversion. It could be easier to observe than the other traits. Researchers (46) suggest that extraversion is processed very quickly. John and Robins (42) suggest that the observability and evaluativeness of the attributes can contribute to accuracy and agreement between raters. The more neutral (less evaluative) and observable an attribute is, the greater the agreement between raters about the target. For example, talkativeness is observable and neutral, while arrogance could be viewed as negative and more difficult to observe. Most of the fourteen attributes in this study were positively charged and difficult to directly observe: accepting, attentive, competent, confident, dominant, empathic, enthusiastic, honest, likable, optimistic, professional, supportive, and warm.

Little research has examined thin-slicing in the sport context. Personal biases of raters could potentially affect judgments of coaches’ attributes. Kenny (45) explains that “personal stereotypes,” such as a rater’s subscription to a widely held view (for example, “all professors are absent-minded”), can be reflected in judgments and do not necessarily change with increasing acquaintance. Current research shows that stereotypes are based on more than gender or race. Kenny (45) explains that appearance cues and nonverbal behaviors are associated with different personality traits.

There was no self-other agreement in this study between the naïve raters’ judgments and the coaches’ self-judgments of personality attributes. Previous perception research found that self-judgments were less accurate in assessing behavior than judgments by others (48,69). Robins and John (58) suggest that mood affects self-judgments, as does the need to protect self-esteem. The coaches in this study were undergraduate college students with no previous coaching experience. Their own perceptions about their coaching may have entered into their answers to the survey questions. Coaching literature has found that coaches are unaware of how they present themselves and behave while coaching (63,65,74). It is possible that the coaches in this study are similarly unaware of their behaviors.

This study is consistent with research literature in which no significant gender differences in target accuracy were found. It agrees with an earlier meta-analysis by Ambady and Rosenthal (8) that examined numerous studies and concluded that, overall, gender did not affect thin-slicing or zero acquaintance judgments. It has been suggested that women are better judges of nonverbal behavior (40). Rosenthal and DePaulo (60) found that women are better judges when the information is presented in more controllable channels. Speech is considered the most controllable channel, while the voice is considered the least controllable (15). This study did not involve an auditory component, so potential gender differences may not have arisen because of the channels through which the nonverbal cues were presented.

There was a significant difference between male and female naïve raters on one of the fourteen attributes. The only significant difference (p < .01) was for the likeability attribute. Female raters were closer to consensus with supervisors than male raters. This may pertain to the different expectations by gender regarding participation in sport. Previous studies have shown that females emphasize friendship and social interaction over competition and achievement more than males do (1,34,36,56). Dubois (22) found that the longer youth participate in sport, the greater the divergence by gender in the values placed on outcomes. Experienced males place greater importance on outcomes, whereas females consistently place emphasis on social aspects of sport. Female raters in this study may have been more attuned to characteristics that embody the outcomes they desire in a sport setting. The other two attributes in which females differed significantly from males were enthusiastic and optimistic. All three of the differences between variables could be explained by the greater emphasis females place on these attributes and potentially the greater awareness they have of these attributes.

Overall, there were no differences between African-Americans and Caucasians on target accuracy or consensus. Little research has examined racial differences in perception of naïve raters. Previous research has found race of target to affect accuracy and consensus (17,37). In this research, race of raters did not affect target accuracy or consensus. Perhaps the sport context is different due to the length of participation of different races in sport and public acceptance of different races in sport over other areas in society. Edwards (23) suggests that lack of opportunities in mainstream society due to discrimination has led a disproportionately high number of blacks to pursue sport. Bledsoe (12) highlighted the practice in which young blacks pursue sport because of the lack of successful black role models in other areas. Sport is an area that has provided opportunity for those lower on the socioeconomic ladder to gain recognition and money when other avenues were closed off to them (18). This can be supported by statistics: Blacks make up 77% of the NBA, 64% of the WNBA, and 65% of the NFL, yet they are only 4.2% of our physicians, 2.7% of our lawyers, and 2.2% of our civil engineers (16). In NCAA Division I athletics, blacks comprise 23.5% of student athletes (black males = 29.5% of male athletes; black females = 14.2% of female athletes). Black males comprise 60% of basketball players, 51% of football players, and 27% of track athletes, while black females constitute 35% of basketball players and 31% of track athletes (53).

Perceptions of the race of the coaches may have also played a role in the lack of significant differences between races. There were eight Caucasian coaches and one African American coach. Statistics show a disproportionate number of non-Latino white males in coaching positions in the professional leagues and NCAA (50). “Stacking” theories in sport studies suggest that blacks are placed in positions that require more speed and stamina but less cognitive processes. One result of this is less opportunity to coach for minorities because of the positions they played that required less understanding of the overall game (18). There is a pattern found in professional sports and college sports of a disproportionately high number of blacks playing on teams coached by whites (18).

Overall there were no differences among levels of sport participation of raters on consensus of effectiveness. There was no correlation with the criterion variable between sport participation groups. Eight of the nine coaches were rated by supervisors as a four or a five out of five on effectiveness. The ninth coach was rated a three. Naïve raters overall rated coaches less effective than the supervisors. This could be a function of expectations of effective coaches at different levels. These coaches are fulfilling a requirement of an undergraduate coaching course which meets 3 hours a week. These coaches may experience more instruction which affects their ratings by supervisors.

While there were not significant differences in most of the attributional categories, there were significant differences on two of the fourteen attributes among levels of sport participation of raters. The higher the level of sport participation, the greater the consensus with the expert judge on the competence attribute. The raters with college participation were significantly different than raters with varsity/elite experience, junior varsity experience, recreation level experience, and no sport experience. The college level athletes had greater consensus than all the other groups. One explanation could be the greater participation of these raters in sport and their level of attunement to competence of coaches. These raters possibly had a greater exposure to a number of coaches and are more sensitive to competence. Millard (51) posits that the higher the level an athlete pursues, the greater the need for winning and the greater the need for technical instruction from a coach. She found that coaches who provided more instruction-based feedback were perceived as more competent. High-experience coaches are noted to provide more technical feedback and less general encouragement than low-experience coaches (61). This difference could also account for the awareness of competence of the college level raters.

The college level raters were also significantly different than varsity/elite athletes and recreation level athletes on confidence. The college level raters showed more consensus with supervisors’ ratings. They could also be attuned to the confidence level of coaches. Research shows that male coaches are generally more confident in their abilities than female coaches (51). This study used eight male coaches and one female coach. College level raters, because of their longer involvement in sport, may be more attuned to the confidence level of a coach.

Researchers attempt to define the moderators surrounding the rater, the channel, the judgments, and the target that could affect accuracy. It is also valuable to learn in what scenarios judgments are not accurate. Evans (25) notes that it is more important to know in what contexts people do not make good decisions. Previous research suggests that the degree to which a judge cares about the judgment he or she is making can affect the accuracy and consensus (27,31). The environment observed may have also affected consensus on personality judgments. Previous research found that less structured situations yield greater correlations on personality (32,68). This research involved judgments of targets in a classroom setting observing video clips instead of directly observing the targets in the sport environment.

This research is promising because it is the first to examine thin-slicing in the sport environment. It suggests that the sport context may have more variables to control for when doing zero acquaintance research. Future research should attempt to control variables, look at particular sports, and use naïve raters who have experienced that sport. Future research could also examine zero acquaintance situations at different levels, like the collegiate or elite level. Looking at moderators of consensus based on the demographics of the coach, like gender and race, would be valuable. Qualitative studies could further explore the personal biases that underlie perceivers’ views of effective coaches, and whether gender, sport level and type, or race affect those views.

### Application in Sport

This thought of split second decision making about a coach could be very critical in developing the most cohesive team possible. With further research necessary based on the above suggestions, thin-slicing could potentially benefit the cohesion of the team. By reversing this idea, coaches might be able to more effectively choose players that fit their team when recruiting. Stats are very important, but if there were other intangible ways to ‘correctly’ choose athletes that fit the mold of their team, coaches might be able to more effectively choose a cohesive, talented team.

### Tables

#### Table 1
Descriptive Statistics

Attribute	M	SD	Skewness (SE = 0.17)	Kurtosis (SE = 0.34)	M	SD	Skewness (SE = 0.17)	Kurtosis (SE = 0.34)
	Target Accuracy
Effectiveness	-.27	0.25	0.65	0.69
	Consensus				Self-Other Agreement
Acceptance	-.33	0.28	0.65	0.65	.03	0.30	-0.57	0.33
Active	-.16	0.25	0.10	-0.44	.16	0.28	-0.08	0.30
Attentive	.23	0.27	-0.69	0.91	.11	0.28	-0.20	0.03
Competent	-.15	0.23	0.69	1.40	.19	0.28	-0.15	-0.19
Confidence	.15	0.25	-0.07	0.18	-.05	0.28	0.23	-0.12
Dominance	-.11	0.24	0.30	1.10	.27	0.25	-0.90	-0.05
Empathic	-.17	0.28	0.45	0.56	.42	0.32	-1.20	0.60
Enthusiastic	.45	0.24	-0.99	1.50	-.11	0.30	0.64	1.80
Honesty	-.07	0.27	0.25	-0.10	-.08	0.26	0.33	0.42
Likeability	.20	0.23	-0.22	0.21	.01	0.29	0.48	0.02
Optimistic	.00	0.23	0.13	0.15	.18	0.28	-0.59	0.43
Professional	-.09	0.25	0.02	-0.35	.22	0.27	-0.42	0.10
Supportive	-.17	0.25	0.35	0.11	.01	0.27	0.00	0.10
Warm	-.13	0.28	0.16	-0.10	-.09	0.29	0.23	-0.08
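
The means and standard deviations in Table 1 read as summaries of per-rater correlations. A minimal sketch of that kind of computation follows, assuming each naïve rater's ratings of the nine coaches are correlated (Pearson r) with the supervisor ratings and the resulting coefficients are then averaged across raters. This section does not confirm the exact procedure, and all ratings and rater names below are hypothetical values invented for illustration, not data from the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences.

    Returns None when either sequence has zero variance, because the
    coefficient is undefined in that case.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Hypothetical supervisor effectiveness ratings for nine coaches (1-5 scale).
supervisor = [5, 4, 4, 5, 3, 4, 5, 4, 5]

# Hypothetical naive-rater ratings of the same nine coaches.
naive_ratings = {
    "rater_a": [3, 4, 2, 4, 3, 3, 4, 2, 3],
    "rater_b": [2, 3, 4, 2, 4, 3, 2, 3, 3],
    "rater_c": [4, 4, 3, 3, 2, 4, 4, 3, 4],
}

# One accuracy coefficient per rater, then the group-level mean and range.
per_rater = {name: pearson_r(r, supervisor) for name, r in naive_ratings.items()}
defined = [r for r in per_rater.values() if r is not None]
mean_accuracy = mean(defined)

for name, r in per_rater.items():
    print(name, None if r is None else round(r, 2))
print("mean accuracy:", round(mean_accuracy, 2))
```

Because the supervisor ratings cluster at four and five, the criterion has little variance, which limits how large any such correlation can become.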

#### Table 2
Descriptive Statistics for Target Accuracy and Consensus Differentiated by Gender

	Gender
	Males		Females
Attributes	M	SD	M	SD
Effectiveness	-.28	0.28	-.26	0.23
Acceptance	-.33	0.30	-.33	0.26
Active	-.18	0.23	-.14	0.25
Attentive	.20	0.30	.25	0.24
Competent	-.14	0.24	-.16	0.23
Dominance	-.12	0.25	-.11	0.23
Empathic	-.18	0.33	-.17	0.24
Enthusiastic	.40	0.24	.48	0.23
Honesty	-.08	0.30	-.05	0.24
Likeability*	.14	0.23	.25	0.22
Optimistic	-.04	0.25	.03	0.22
Professional	-.13	0.26	-.07	0.24
Supportive	-.18	0.28	-.15	0.22
Warm	-.13	0.30	-.13	0.27

* p < .01

#### Table 3
Descriptive Statistics for Target Accuracy and Consensus Differentiated by Race

	Race
	African-Americans		Caucasians
Attributes	M	SD	M	SD
Effectiveness	-.26	0.25	-.30	0.25
Acceptance	-.31	0.29	-.39	0.24
Active	-.31	0.29	-.39	0.24
Attentive	.24	0.28	.20	0.22
Competent	-.15	0.22	-.13	0.26
Dominance	-.09	0.24	-.17	0.18
Empathic	-.17	0.28	-.19	0.29
Enthusiastic	.45	0.25	.42	0.23
Honesty	-.06	0.28	-.09	0.23
Likeability	.19	0.23	.26	0.24
Optimistic	-.02	0.23	.06	0.23
Professional	-.10	0.26	-.07	0.22
Supportive	-.18	0.25	-.15	0.25
Warm	.14	0.28	-.11	0.27

#### Table 4
Analysis of Variance for Attributes between Levels of Sport Participation Groups

Attributes	df	F	p
Acceptance	4	0.85	0.50
Active	4	0.29	0.89
Attentive	4	0.96	0.43
Competent*	4	3.57	0.01
Confidence*	4	3.67	0.01
Dominance	4	0.31	0.87
Empathic	4	0.32	0.86
Enthusiastic	4	3.22	0.01
Honesty	4	0.70	0.59
Likeability	4	1.14	0.23
Optimistic	4	0.94	0.45
Professional	4	0.71	0.59
Supportive	4	1.51	0.20
Warm	4	1.45	0.22

* p < .01
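
For readers who want to see how an F value like those in Table 4 is formed, the following is a minimal sketch of a one-way ANOVA across five sport participation groups. It assumes the dependent variable is each rater's consensus coefficient on one attribute. The group labels follow the text, but every number is a hypothetical value invented for illustration, and the function name `one_way_anova` is not from the study.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need >= 2 non-empty groups and more cases than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variance")
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical consensus coefficients on one attribute, by participation level.
participation_groups = {
    "none": [-0.30, -0.10, -0.25, -0.05],
    "recreation": [-0.20, -0.15, -0.05, -0.30],
    "junior varsity": [-0.10, -0.25, -0.20, 0.00],
    "varsity/elite": [-0.15, -0.05, -0.20, -0.10],
    "college": [0.10, 0.05, 0.20, -0.02],
}

f_value, df_between, df_within = one_way_anova(list(participation_groups.values()))
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

With five groups the between-groups degrees of freedom equal 4, matching the df column in Table 4. Converting F to a p value additionally requires the within-groups degrees of freedom, which the table does not report.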

### References

1. Alderman, R. B. (1988). Strategies for motivating young athletes. In W. F. Straub (Ed.), Sport psychology: An analysis of athlete behavior (pp. 49-61). Ithaca, NY: Movement.
2. Abromovitch, R. (1977). Children’s recognition of situational aspects of facial expressions. Child Development, 48, 459-463.
3. Allport, G. W. (1937). Personality: A psychological interpretation. New York: Holt.
4. Allport, G. W., & Vernon, P. E. (1933). Studies in expressive movement. New York: Haffner.
5. Ambady, N., Conroy, M., Mullins, J., & Tobia, A. (2001). Friends, lovers, and strangers: Judging dyadic relationships from thin slices. Manuscript submitted for publication.
6. Ambady, N., & Gray, H. M. (2002). On being sad and mistaken: Mood effects on the accuracy of thin-slice judgments. Journal of Personality and Social Psychology, 83, 947-961.
7. Ambady, N., Hallahan, M., & Rosenthal, R. (1995). On judging and being judged accurately in zero-acquaintance situations. Journal of Personality and Social Psychology, 69, 518-529.
8. Ambady, N., & Rosenthal, R. (1992). Thin slices of expressive behavior as predictors of interpersonal consequences: A meta-analysis. Psychological Bulletin, 111, 256-274.
9. Ambady, N., & Rosenthal, R. (1993). Half a minute: Predicting teacher evaluations from thin slices of nonverbal behavior and physical attractiveness. Journal of Personality and Social Psychology, 64, 431-441.
10. Archer, D., & Akert, R. M. (1977). Words and everything else: Verbal and nonverbal cues in social interpretation. Journal of Personality and Social Psychology, 35, 443-449.
11. Bem, D. J. (1972). Self-perception theory. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 6, pp. 1-62). New York: Academic Press.
12. Bledsoe, T. (1973). Black dominance in sports: Strictly from hunger. The Progressive, 37, 16-19.
13. Bloom, B. (1985). Developing talent in young people. New York: Ballantines.
14. Borkenau, P., & Liebler, A. (1992). Trait inferences: Sources of validity at zero acquaintance. Journal of Personality and Social Psychology, 62, 645-657.
15. Brown, R. (1986). Social psychology: The second edition. New York: Free Press.
16. Bureau of Labor Statistics (1997). Retrieved June 6, 2006 from ftp://ftp.bls.gov/pub/special.requests/lf/aat11.txt
17. Chiao, J. Y., Heck, H. E., Nakayama, K., & Ambady, N. (2006). Priming race in biracial observers affects visual search for black and white faces. Psychological Science, 17, 387-392.
18. Coakley, J. (2001). Sport in society: Issues and controversy. New York: McGraw-Hill.
19. Conroy, D. E., & Coatsworth, J. D. (2006). Coach training as a strategy for promoting youth social development. The Sport Psychologist, 20, 124-144.
20. Csikszentmihalyi, M., & McCormack, J. (1986). The influence of teachers. Phi Delta Kappan, 66, 415-419.
21. Cumming, S. P., Eisenmann, J. C., Smoll, F. L., Smith, R. E., & Malina, R. M. (2005). Body size and perceptions of coaching behaviors by adolescent female athletes. Psychology of Sport and Exercise, 6, 693-705.
22. Dubois, P. (1990). Gender differences in value orientation towards sports: A longitudinal analysis. Journal of Sport Behavior, 13, 3-14.
23. Edwards, H. (1973). Sociology of sport. Dorsey Press: Homewood, IL.
24. Eyal, N., Bar-Eli, M., Tenenbaum, G., & Pie, J. S. (1995). Manipulated outcome expectations and competitive performance in motor tasks with gradually increasing difficulty. The Sport Psychologist, 9, 188-200.
25. Evans, J. St. B. T. (1984). In defense of the citation bias in the judgment literature. American Psychologist, 39, 1500-1501.
26. Feltz, D. L., & Riessinger, C. A. (1990). Effects of in vivo emotive imagery and performance feedback on self-efficacy and muscular endurance. Journal of Sport and Exercise Psychology, 12, 132-143.
27. Flink, C., & Park, B. (1991). Increasing consensus in trait judgments through outcome dependency. Journal of Experimental Social Psychology, 27, 453-467.
28. Flora, C. (2004). The once-over: Can you trust first impressions? Psychology Today, 37, 60-65.
29. Fortunato, V. J., & Mincy, M. D. (2003). The interactive effects of dispositional affectivity, sex, and a positive mood induction of student evaluations of teachers. Journal of Applied Sport Psychology, 33, 1945-1968.
30. Funder, D. C. (1991). Global traits: A neo-Allportian approach to personality. Psychological Science, 2, 31-39.
31. Funder, D. C. (1995). On the accuracy of personality judgment: A realistic approach. Psychological Review, 102, 652-670.
32. Funder, D. C., & Colvin, C. R. (1991). Explorations in behavioral consistency: Properties of persons, situations, and behaviors. Journal of Personality and Social Psychology, 60, 773-794.
33. Funder, D. C., & Dobroth, K. M. (1987). Differences between traits: Properties associated with interjudge agreement. Journal of Personality and Social Psychology, 52, 409-418.
34. Gill, D. L., Gross, J. B., & Huddleston, S. (1983). Participation motivation in youth sports. International Journal of Sport Psychology, 14, 1-14.
35. Gladwell, M. (2005). Blink. New York: Little, Brown & Company.
36. Gould, D., Feltz, D., & Weiss, M. (1986). Motives for participating in competitive youth swimming. International Journal of Sport Psychology, 16, 126-140.
37. Gosling, S. D., Ko, S. J., Mannarelli, T., & Morris, M. E. (2002). A room with a cue: Personality judgments based on offices and bedrooms. Journal of Personality and Social Psychology, 82, 379-398.
38. Grahe, J. E., & Bernieri, F. J. (1999). The importance of nonverbal cues in judging rapport. Journal of Nonverbal Behavior, 23, 253-269.
39. Greenlees, I., Bradley, A., Holder, T., & Thelwell, R. (2005). The impact of opponents’ non-verbal behavior on the first impressions and outcome expectations of table-tennis players. Psychology of Sport & Exercise, 6, 103-115.
40. Hall, J. A. (1984). Nonverbal sex differences: Communication accuracy and expressive style. Baltimore: Johns Hopkins University Press.
41. Hofstee, W. K. B. (1994). Who should own the definition of personality? European Journal of Personality, 8, 149-162.
42. John, O. P., & Robins, R. W. (1993). Determinants of interjudge agreement on personality traits: The Big Five domains, observability, evaluativeness, and the unique perspective of the self. Journal of Personality, 61, 521-551.
43. Jones, E. E., & Nisbett, R. E. (1971). The actor and the observer: Divergent perceptions of the causes of behavior. Morristown, NJ: General Learning Press.
44. Kahlbaugh, P. E., & Haviland, J. M. (1994). Nonverbal communication between parents and adolescents: A study of approach and avoidance behaviors. Journal of Nonverbal Behavior, 18, 91-113.
45. Kenny, D. A. (1994). Interpersonal perception: A social relations analysis. New York, New York: Guilford Publications. Retrieved March 16, 2006 from http://davidakenny.net/ip/assim.htm
46. Kenny, D. A., Albright, L., Malloy, T. E., & Kashy, D. A. (1994). Consensus and interpersonal perception: Acquaintanceship and the Big Five. Psychological Bulletin, 116, 245-258.
47. Kenny, D. A., Horner, C., Kashy, D. A., & Chu, L. (1992). Consensus at zero acquaintance: Replication, behavioral cues, and stability. Journal of Personality and Social Psychology, 62, 88-97.
48. Kolar, D. W., Funder, D. C., & Colvin, C. R. (1996). Comparing the accuracy of personality judgments by the self and knowledgeable others. Journal of Personality, 64, 311-337.
49. Kruglanski, A. W. (1989). The psychology of being “right”: The problem of accuracy in social perception and cognition. Psychological Bulletin, 106, 395-409.
50. Lapchick, R., & Mathews, K. (2000). 1998 Racial and gender Report Card. The Center for the Study of Sport in Society: Boston, MA.
51. Millard, L. (1996). Differences in coaching behaviors of male and female high school soccer coaches. Journal of Sport Behavior, 19, 19-32.
52. McGinnis, A. L. (1985). Bring out the best in people. Minneapolis, MN: Augsburg Publishing House.
53. NCAA (1998). Race demographics of NCAA member institutions’ athletics personnel. The National Collegiate Athletics Association: Overland Park, Kansas.
54. Norman, W. T., & Goldberg, L. R. (1966). Raters, ratees, and randomness in personality structure. Journal of Personality and Social Psychology, 4, 681-691.
55. Park, B., & Judd, C. M. (1989). Agreement of initial impressions: Differences due to perceivers, trait dimensions, and target behaviors. Journal of Personality and Social Psychology, 56, 493-505.
56. Petrie, B. M. (1971). Achievement orientations in adolescent attitudes toward play. International Review of Sport Sociology, 6, 89-101.
57. Puccinelli, N. M. (2006). Putting Your Best Face Forward: The Impact of Customer Mood on Salesperson Evaluation. Journal of Consumer Psychology, 16, 156-162.
58. Robins, R. W., & John, O. P. (1997). The quest for self-insight: Theory and research on accuracy and bias in self-perception. In R. Hogan, J. A. Johnson, & S. R. Briggs (Eds.), Handbook of personality psychology (pp. 649-679). San Diego, CA: Academic Press.
59. Rosenthal, R. (1987). Judgment studies: Design, analysis, and meta-analysis. New York: Cambridge University Press.
60. Rosenthal, R., & DePaulo, B. M. (1979). Sex differences in accommodation in nonverbal communication. In R. Rosenthal (Ed.), Skill in nonverbal communication. Cambridge, MA: Oelgeschlager, Gunn, & Hain.
61. Sherman, M. A., & Hassan, J. S. (1986). In M. Pieron, & G. Graham (Eds.), The 1984 Olympic scientific congress proceedings (Vol. 6) (pp. 103-108). Champaign, IL: Human Kinetics.
62. Slavin, R. E. (1980). Cooperative learning. Review of Educational Research, 50, 312-342.
63. Smith, R. E., & Smoll, F. L. (1997). Coaching the coaches: Youth sports as a scientific and applied behavioral setting. Current Directions in Psychological Science, 6, 16-21.
64. Smith, R. E., & Smoll, F. L. (1990). Self-esteem and children’s reaction to youth sport behaviors: A field study of self-enhancement processes. Developmental Psychology, 26, 987-993.
65. Smith, R. E., Smoll, F. L., & Curtis, B. (1978). Coaching behaviors in Little League baseball. In F.L. Smoll & R.E. Smith (Eds.), Psychological perspectives in youth sports (pp. 173-201). Washington, DC: Hemisphere.
66. Smith, R. E., Smoll, F. L., & Curtis, B. (1979). Coach Effectiveness Training: A cognitive-behavioral approach to enhancing relationship skills in youth sport coaches. Journal of Sport Psychology, 1, 59-75.
67. Smoll, F. L., Smith, R. E., Barnett, N. P., & Everett, J. J. (1993). Enhancement of children’s self-esteem through social support training for youth sport coaches. Journal of Applied Psychology, 78, 602-610.
68. Snyder, M., & Ickes, W. (1985). Personality and social behavior. In G. Lindzey & E. Aronson (Eds.), Handbook of social psychology (3rd ed., Vol. 2, pp. 883-948). New York: Random House.
69. Spain, J.S., Eaton, L.G, & Funder, D.C. (2000). Perspectives on personality: The relative accuracy of self versus others for the prediction of emotion and behavior. Journal of Personality, 68, 837-867.
70. Tickle-Degnan, L., & Rosenthal, R. (1987). Group rapport and nonverbal behavior. In C. Hendrick (Ed.), Group processes and intergroup relations. Review of personality and social psychology (Vol. 9, pp. 113-136). Beverly Hills, CA: Sage.
71. Tickle-Degnan, L., & Rosenthal, R. (1990). The nature of rapport and its nonverbal correlates. Psychological Inquiry, 4, 285-293.
72. Vallee, C. N., & Bloom, G. A. (2005). Building a successful university program: Key and common elements of expert coaches. Journal of Applied Sport Psychology, 17, 179-196.
73. Vrij, A., Dragt, A., & Koppelaar, L. (1992). Interviews with ethnic interviewee: Non-verbal communication errors in impression formation. Journal of Community & Applied Social Psychology, 2, 199-208.
74. Wandzilak, T., Ansorge, C.J., & Potter, G. (1988). Comparison between selected practice and game behaviors of youth sport soccer coaches. Journal of Sport Behavior, 2, 78-88.
75. Watson, D. (1989). Strangers’ ratings of the five robust personality factors: Evidence of a surprising convergence with self-report. Journal of Personality and Social Psychology, 57, 120-128.
76. Weinberg, T., Gould, D., & Jackson, A. (1979). Expectations and performance: An empirical test of Bandura’s self-efficacy theory. Journal of Sport Psychology, 3, 345-354.
77. Wiggins, J. S. (1973). Personality and prediction: Principles of personality assessment. Reading, MA: Addison-Wesley.
78. Williams, J. M., Jerome, G. J., Kenow, L. J., Rogers, T., Sartain, T. A., & Darland, G. (2003). Factor structure of the coaching behavior questionnaire and its relationship to athlete variables. Sport Psychologist, 17, 16-35.
79. Wooden, J. R. (1980). Practical modern basketball. New York: John Wiley and Sons.

### Corresponding Author

Dr. Daniel R. Czech, CC-AASP
Department of Health and Kinesiology
Box 8076
Georgia Southern University
Statesboro, Georgia 30460-8076
<drczech@georgiasouthern.edu>
(912) 478-5267