Relative Age Effect Among Elite Youth Female Soccer Players across the United States

The consequence of the relative age effect (RAE) has been an overrepresentation of athletes born early in the cohort and an underrepresentation of athletes born late in the cohort. There are significantly fewer studies that examine this phenomenon among female soccer players. Therefore, the purpose was to determine the existence of RAEs among elite youth female soccer players competing in the Elite Clubs National League (ECNL) during the 2012-2013 season. Player birthdates (U14-U18, N=7,294) were collected from the ECNL and compared to the birthdate distribution for the general population. Data revealed a RAE across all age groups (U14-U18), indicating a preference for the selection of the oldest in the cohort. An overrepresentation of players was observed in Q1 and an underrepresentation of players in Q4 among the U14-U17 age groups. Among the U18 age group, an overrepresentation of players was detected in Q2 and an underrepresentation of players in Q4. The birthdate distribution for the first and second halves of the playing season showed strong RAEs among the U14-U17 age groups. No statistically significant difference was found between the first and second halves of the playing season among players in the U18 age group.


Introduction
The relative age effect (RAE) in sport refers to a bias in the distribution of athletes selected to elite level teams. More specifically, there is an overrepresentation of athletes born at the beginning of the competition year and an underrepresentation of athletes born at the end of the competition year. The RAE phenomenon has been examined in sport since the early 1980s, when Grondin, Deshaies, and Nault found significant RAEs among all levels (recreational, competitive and senior) of ice hockey players and elite levels in volleyball players (Grondin, Deshaies, & Nault, 1984). The majority of studies on RAEs have been conducted on male athletes, and soccer has been characterized as a sport associated with significant RAEs (Cobley, Baker, Wattie, & McKenna, 2009; Musch & Grondin, 2001; Smith, Weir, Till, Romann, & Cobley, 2018).
The existence of RAEs among female athletes has yet to be confirmed. The recent systematic review and meta-analysis by Smith and colleagues on RAEs among female athletes indicated a small RAE across varying sport contexts (Smith et al., 2018). Yet, despite the growth in women's soccer around the world (FIFA Activity Report, 2017) and the wealth of research published on RAEs in soccer, there are significantly fewer studies that examine this phenomenon among female soccer players (Smith et al., 2018). The results of this study are unique among the research on RAEs in female athletes.
The discrepancy in the findings on RAEs among females is highlighted by the results of Delorme, Boiché and Raspaud (2010) and Goldschmied (2011) [...] overrepresentation of players born early in the cohort and an underrepresentation of younger players would indicate the presence of RAEs among this group of elite athletes. It was hypothesized that a statistically significant RAE would be present among this group of elite level youth soccer players, indicating a bias against the selection of soccer players born late in the cohort.

Participants
Amateur elite youth female soccer players competing in the U14, U15, U16, U17, and U18 age groups (N=7,294) in the ECNL during the 2012-2013 season were included in this study. The ECNL was considered the highest level of club soccer for females in the U.S. and was made up of 73 clubs across the country in the 2012-2013 season. The season for ECNL teams started in September 2012 and culminated in the championship finals on July 10-15, 2013. The players competing in the ECNL ranged from 13 to 18 years of age and had all been selected and placed on a team after a series of trials.

Procedures
The birthdate for each player was collected from the individual team web pages on the ECNL web site (www.eliteclubsnationalleague.com). The birthdate of each player is public information and no private details were recorded. The birthdates of the players were compared to the birthdates of females in the general U.S. population born during the same years as the players. The birthdate range for the 2012-2013 ECNL players is 1992-1999. It should be noted that a player born between 8/1/1992 and 7/31/1993 can participate in the U18 age group as long as she is still in high school. The census birthdates were collected from the Centers for Disease Control and Prevention (CDC) vital statistics reports, which can be found on the CDC website (www.cdc.gov/nchs/vitalstats.htm). The vital statistics reports are made available to the public and contain no private information. The birthdates for the players and females in the general population were organized into quartiles based upon the 2012-2013 ECNL competition year of August 1st-July 31st. All birthdates were coded as follows: Q1=August-October, Q2=November-January, Q3=February-April and Q4=May-July.
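The quartile coding described above can be sketched in a few lines of Python. This is only an illustration of the stated coding rule (competition year running from August 1 to July 31); the function name `birth_quartile` is hypothetical and is not part of the original analysis, which was run in SPSS.

```python
def birth_quartile(month: int) -> int:
    """Map a calendar birth month (1=January ... 12=December) to a
    competition-year quartile, assuming the year starts on August 1.

    Q1 = August-October, Q2 = November-January,
    Q3 = February-April, Q4 = May-July.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # Shift the calendar so that August becomes position 0, then
    # split the twelve positions into four blocks of three months.
    return ((month - 8) % 12) // 3 + 1


# Example: a player born in January falls in Q2, one born in July in Q4.
quartiles = {m: birth_quartile(m) for m in range(1, 13)}
```

Because the mapping depends only on the birth month, the same function can be applied to both the player birthdates and the monthly birth counts from the general population.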

Statistical Analyses
All data analyses were conducted using IBM SPSS predictive analytics software (Version 20; IBM Inc., USA). Each age group was analyzed for asymmetry in birthdate distributions. A series of chi-square (χ2) goodness-of-fit tests was used to determine differences between the observed birth months across the playing season (August-July) and the expected birth month distributions for females born in the U.S. from 1992-1999 (the same years as the players). The dependent variable for each analysis was the frequency of soccer players born in each quartile per age group. The level of significance was set at p<.05. Statistically significant chi-square (χ2) values were used to calculate an effect size w statistic to determine the strength of the RAE. According to Cohen (1992), the following w values indicate the effect sizes: small=0.1, medium=0.3, large=0.5. Post-hoc analyses were conducted for w values ≥0.1. Lastly, for statistically significant chi-square (χ2) values, standardized residuals were used to determine which observed birthdate quartiles differed from the expected distribution (Turnnidge, Hancock, & Côté, 2012). A value of ≥1.96 indicates an overrepresentation of births in the quartile and a value ≤-1.96 indicates an underrepresentation of births in the quartile (Sheskin, 2003).
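The analysis steps above can be illustrated with a short Python sketch. It assumes the common definitions: the goodness-of-fit statistic is the sum of (O - E)^2 / E, the effect size is w = sqrt(χ2 / n), and the standardized residual is (O - E) / sqrt(E). The counts and population proportions below are invented purely for illustration and are not the study's data; the function names are hypothetical, and the original analysis was run in SPSS rather than with this code.

```python
import math


def goodness_of_fit(observed, expected_props):
    """Return (chi2, w, residuals) for observed quartile counts tested
    against expected population proportions (which should sum to 1)."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    w = math.sqrt(chi2 / n)  # Cohen's w effect size
    residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]
    return chi2, w, residuals


def flag(residual, cutoff=1.96):
    """Label a quartile using the +/-1.96 cutoff on the standardized residual."""
    if residual >= cutoff:
        return "over"
    if residual <= -cutoff:
        return "under"
    return "none"


# Hypothetical example: 1,000 players across Q1-Q4 and made-up
# general-population proportions (NOT values from this study).
observed = [320, 270, 230, 180]
expected_props = [0.26, 0.24, 0.24, 0.26]

chi2, w, residuals = goodness_of_fit(observed, expected_props)
flags = [flag(r) for r in residuals]
```

In this invented example Q1 is flagged as overrepresented and Q4 as underrepresented, while Q2 falls just short of the 1.96 cutoff. That is the kind of pattern the residual analysis is designed to detect.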

Results
The birthdate distributions for the U14-U18 girls competing in the ECNL and the birthdate distributions for the general population are presented in Table 1, along with the results of the chi-square test, effect sizes, and standardized residuals. The chi-square analysis indicated a statistically significant difference between the observed and expected quartile distributions for all of the age groups, indicating significant RAEs: (U14) χ2 (3, n=1,443)=133.30, p<.001; (U15) χ2 (3, n=1,423)=103.47, p<.001; (U16) χ2 (3, n=1,458)=82.01, p<.001; (U17) χ2 (3, n=1,456)=70.00, p<.001; (U18) χ2 (3, n=1,514)=17.09, p<.001. When compared to the general population birth distribution, the chi-square test and the post hoc analyses revealed an overrepresentation of players born at the beginning of the cohort and an underrepresentation of players born at the end of the selection year for all age groups. The standardized residuals showed an overrepresentation of players born in Q1 and an underrepresentation of players born in Q4 for the U14, U15, U16, and U17 age groups. In the U18 age group, the residuals indicated an overrepresentation of players born in Q2 and an underrepresentation of players born in Q4. The effect sizes ranged from small (.10) to moderate (.30), with the largest effect size associated with the U14 age group (.30). According to the analysis, the magnitude of the effect size decreased as age increased. In Table 2, the birthdate distribution for each half of the season (August-January and February-July) is presented, together with the results of the chi-square tests, effect sizes, and standardized residuals. A comparison of the birthdate distribution for the first and second halves of the playing season with the general population birthdate distribution indicated a statistically significant difference for the following age groups: (U14) χ2 (1, n=1,443)=113.40, p<.001; (U15) χ2 (1, n=1,423)=85.48, p<.001; (U16) χ2 (1, n=1,458)=53.62, p<.001; and (U17) χ2 (1, n=1,456)=47.26, p<.001.
The analysis revealed that the majority of the players selected for the U14-U17 age groups were born between August and January. No statistically significant difference was observed in the U18 age group, χ2 (1, n=1,514)=3.63, p=.056. The standardized residuals for the U14, U15, U16, and U17 age groups showed an overrepresentation of players born in the first half of the cohort and an underrepresentation of players born in the second half of the cohort. The effect sizes for the half-season distributions for the U14-U17 age groups were small to moderate, and a decrease in the magnitude of the effect size was observed as age increased.
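As a rough cross-check, the effect sizes can be recomputed from the χ2 values and sample sizes reported in the text, assuming the usual relation w = sqrt(χ2 / n). The sketch below is not the authors' code, and the dictionary and function names are hypothetical. The recomputed values may differ slightly from those in Tables 1 and 2 because the inputs are rounded.

```python
import math

# (chi-square, n) pairs as reported in the Results text.
quartile_tests = {
    "U14": (133.30, 1443), "U15": (103.47, 1423), "U16": (82.01, 1458),
    "U17": (70.00, 1456), "U18": (17.09, 1514),
}
half_season_tests = {
    "U14": (113.40, 1443), "U15": (85.48, 1423), "U16": (53.62, 1458),
    "U17": (47.26, 1456), "U18": (3.63, 1514),
}


def cohen_w(chi2: float, n: int) -> float:
    """Cohen's w effect size for a chi-square goodness-of-fit test."""
    return math.sqrt(chi2 / n)


def p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of
    freedom, using the identity P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2))


w_quartile = {age: cohen_w(c, n) for age, (c, n) in quartile_tests.items()}
w_half = {age: cohen_w(c, n) for age, (c, n) in half_season_tests.items()}

# U18 half-season test: the recomputed p-value lies just above .05.
p_u18_half = p_value_df1(half_season_tests["U18"][0])
```

Under these assumptions the recomputed quartile effect sizes fall from about .30 (U14) to about .10 (U18), matching the range stated above. The half-season U18 p-value comes out close to the reported p=.056.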

Discussion
Previous research investigating the presence of RAEs among female athletes has been inconclusive. Therefore, the objective of this study was to determine if a RAE existed among girls competing in the ECNL during the 2012-2013 season. It was hypothesized that girls born earlier in the cohort would be overrepresented across the league and girls born later in the cohort would be underrepresented. The hypothesis was supported among all age groups (U14-U18) in the ECNL. Girls born closer to the beginning of the selection year were more likely to be selected to teams in the ECNL. In contrast, girls born toward the end of the selection year were less likely to be offered a spot on a team in the ECNL during the 2012-2013 season. Among the U14-U17 age groups, a traditional RAE existed with an overrepresentation of players born in Q1 and an underrepresentation of players born in Q4. In the U18 age group, players born in Q2 were overrepresented while the youngest in the cohort, players born in Q4, were underrepresented. The effect sizes indicated that the RAEs were strongest during the early stages of player development and decreased as players aged.
The results of this study indicate that RAEs exist among elite level youth female soccer players in the U.S. To our knowledge, this is one of few studies on RAEs that demonstrate such a strong pattern among elite youth female soccer players. Sedano and colleagues found RAEs among Spanish female soccer players competing at the Regional and National team levels (Sedano, Vaeyens, & Redondo, 2015). Yet other studies found little to no RAE among female soccer players (Nakata & Sakamoto, 2012; Romann & Fuchslocher, 2013; Vincent & Glamser, 2006; Van den Honert, 2012).
The reason for the systematic RAEs observed in this study is unclear. Traditional explanations for RAEs in sport have linked birthdate with maturation. Relative age effects among males indicate that those born earlier in the cohort are assumed to mature earlier than the youngest in the cohort. Maturation among males is associated with an increase in testosterone and muscle mass, resulting in an increase in power, strength, and speed, all beneficial for athletic performance (Rowland, 2005). Among females, however, maturation is associated with a decrement in athletic performance due to an increase in fat mass and a peak in aerobic capacity. The oldest females in a cohort have been shown to be disadvantaged in sport and reverse RAEs have been documented (Goldschmied, 2011; Till et al., 2010). The only anthropometric benefit gained with early maturation among females is height. The average age at peak height velocity for girls is 12 years, and girls who go through an early growth spurt may have an advantage in sport (Smith et al., 2018). Although speculative, when two soccer players look similar in skill level, it is possible the coach may perceive the taller of the two players as more talented.
Competition for spots on teams may be the leading cause of the RAEs detected among all of the age groups in the ECNL. Competition is a necessary component for RAEs to occur in sport, and the strongest RAEs have been detected among the most popular sports worldwide (Schorer, Cobley, Büsch, Bräutigam & Baker, 2009; Smith et al., 2018). The greater the number of potential players vying for selection to teams, the more likely there will be RAEs. According to the United States Youth Soccer Association (USYSA), which registers approximately 85% of all youth soccer players (aged 5-19 years) in the U.S., there were 3,000,000 youth soccer players competing across the country in the 2012-2013 season (USYSA, 2013). Almost half of these players are female, underlining the popularity of soccer among females in the U.S.
Furthermore, the 2014 FIFA Women's Football Survey indicated there were 4.8 million registered female soccer players worldwide and approximately half came from the USA and Canada (FIFA Women's Football Survey, 2014). Grondin et al. (1984) showed strong RAEs among youth ice hockey players from the largest cities in Canada, where competition was the highest. In contrast, weak RAEs were detected among volleyball players in Canada, where the selection pool was much smaller. Lidor, Côté, Arnon, Zeev and Cohen-Maoz (2010) found no RAEs among female basketball, handball, soccer, and volleyball players in Israel. The authors hypothesized the lack of competition, due to the small population size, would nullify any RAEs. Yet, in Brazil, where volleyball is considered a highly competitive sport, strong RAEs were found among elite female volleyball players competing at the highest level of play (Okazaki, Keller, Fontana, & Gallagher, 2011).
During the 2012-2013 season, the ECNL was considered the highest level of elite youth soccer among females in the U.S. Selection to an ECNL team can provide increased competition, where players get the opportunity to compete on a national level. In addition, the ECNL has been a primary setting for player identification and recruitment among college soccer coaches since the foundation of the league in 2009. As a result, young female soccer players with aspirations to compete at the collegiate level may feel the ECNL is the most effective way to gain exposure to college soccer coaches nationwide. The consequence is an increase in competition among the players for selection to the teams, and the outcome is an overrepresentation of players from Q1 and an underrepresentation of players from Q4. To further highlight the competitiveness of this league, the majority of girls listed among the player pools for the 2013 U.S. National U15, U17, U18, and U20 teams (U15 [64%], U17 [72%], U18 [71%], and U20 [82%]) were selected from ECNL clubs (US Soccer, 2013).
The birthdate distribution for the first and second halves of the playing season also indicated strong RAEs among the U14-U17 age groups. However, no statistically significant difference was found among the U18 age group. Players born between August and January were overrepresented among the U14-U17 teams, while players born between February and July were underrepresented, indicating a preference for the selection of older players. The proportion of players born in the first half of the season was highest among the U14s (65%) and decreased as age increased. The data also showed that the magnitude of the RAE decreased over time. The strength of the relationship was the largest among the U14s and the smallest among the U18s. Smith et al. (2018) observed a similar pattern among athletes in their systematic review and meta-analysis on RAEs within various female sports. It appears that the greatest level of bias against players born later in the cohort occurs during the early years of competition. However, this trend seems to dissipate as players get older.
The bias towards the selection of players born during the first half of the cohort among U14-U17 teams might be due to the coaches' perception that older (physically mature) players are more talented than their younger counterparts (Helsen, Starkes, & Van Winckel, 1998). As a result, players selected to elite-level soccer teams receive the benefits of increased competition, feedback, and higher levels of coaching (Smith et al., 2018; Wattie, Cobley, & Baker, 2008). Subsequently, players selected to elite teams at a younger age may have an advantage in future selections, as these players gain greater exposure to more high-level coaches, perpetuating the RAE during the formative years. In addition, self-motivation is highest among athletes who perceive themselves as competent in their sport. Highly motivated athletes typically invest a greater amount of time and effort into developing their skills compared to athletes who do not perceive themselves as competent to compete at a high level in sport (Vincent & Glamser, 2006).
The evidence from this study indicates a clear bias toward the selection of older girls among teams participating in the ECNL. Girls born late in the cohort are less likely to be given an opportunity to participate in this elite level league. The consequence is a loss of potential talent, which emphasizes the need for additional research to examine the mechanisms contributing to RAEs among females in sport. Helsen and colleagues have suggested coaching education as one approach to combat the negative impact of RAEs in elite youth soccer (Helsen, Van Winckel, & Williams, 2005). Cobley et al. (2009) suggested that player identification and selection place greater reliance on skill and movement rather than anthropometric components. It has also been suggested that athletes should be grouped according to their physical rather than chronological age, an approach known as bio-banding (Cumming, Lloyd, Oliver, Eisenmann, & Malina, 2017).
As evidenced in professional soccer in Europe, despite 10 years of research on RAEs, there seems to be no change in the prevalence of this phenomenon among elite athletes (Helsen et al., 2012). Although attempts have been made to reduce RAEs among soccer players, certain strategies have not proven successful. The Troendelag Regional Football Association (Norway) instituted a selection process where a minimum of 40% of the players must be born in the last six months of the playing year. Despite a clear attempt to reduce the magnitude of RAEs in the selection process, RAEs were prevalent among boys and girls at the higher levels of competition (Lagestad, Steen, & Dalen, 2018). Consequently, further research should focus on the underlying causes of RAEs among athletes, so that effective strategies can be implemented by practitioners to reduce and remove the disadvantages to those born late in the sport calendar.