The change in the lexical complexity of Van Halen songs
by Csikós Mátyás
Van Halen is an American hard rock band that was formed in Pasadena,
California, in 1972. The band had two distinct eras, each defined by its lead
singer: David Lee Roth and Sammy Hagar. Roth was replaced by Hagar because
Edward Van Halen, the leader of the band, wanted more lyrical depth in the
band's songs. This essay uses corpus linguistic tools to see whether the change
lived up to Edward's expectations, analysing how the complexity of the lyrics
changed with the new singer and whether there is a significant contrast between
the two eras' lyrics. To find out, it performs lexical complexity and
readability analyses on corpora built from sixteen songs in total, drawn from
each of Van Halen's albums to provide a wide enough array of lyrics. The
lexical complexity and readability indices of the two eras are then compared to
draw a conclusion. The expectation is that the Hagar era should have more
complex lyrics and a larger vocabulary, along with greater verb variation and
lexical sophistication, because Hagar was brought into the band to write more
complex and deeper lyrics. The research is based on Tim Murphey's similar study
of pop songs, conducted in the early nineties.
This research used the online lexical complexity analysis tools made by Lu
Xiaofei (2012), available at aihaiyang.com/synlex/lexical (currently being
migrated to another server), as well as a number of readability formulas. The
first lexical complexity measure used was lexical density, which is the ratio
of lexical (as opposed to grammatical) words to the total number of words in a
text, so a high lexical density value means that the text is more complex
(Xiaofei, 2012, p. 190). The next measure was lexical sophistication (also
known as lexical rareness), which is "the proportion of relatively unusual or
advanced words in the learner's text" (as cited in Xiaofei, 2012, p. 194).
Sophisticated words are defined as English words that are introduced at Grade 9
or later in the Swedish educational system (Xiaofei, p. 191).
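The lexical density ratio described above can be illustrated with a rough sketch. It is not how the online analyser works: a proper analysis would use a part-of-speech tagger, whereas this sketch simply treats every word outside a small, hand-picked list of grammatical words as lexical. The list `FUNCTION_WORDS` and the function name `lexical_density` are assumptions made for illustration.

```python
import re

# Assumption: a deliberately small, incomplete list of grammatical words.
# A real analysis would identify lexical words with a part-of-speech tagger.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "is", "are", "was", "were", "be", "i", "you", "he",
    "she", "it", "we", "they", "my", "your", "this", "that", "not",
}

def lexical_density(text):
    """Return the share of tokens that are not in FUNCTION_WORDS."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    if not tokens:
        raise ValueError("text contains no words")
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(lexical) / len(tokens)

# Made-up example sentence: 4 of its 6 words count as lexical here.
print(round(lexical_density("The band played a loud song."), 2))
```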
The number of different words (NDW), the number of different words in the
first fifty words (NDW-50) and the number of different words in fifty randomly
chosen words (NDWR-50) were also calculated. These are standard measures of
lexical complexity: a higher number of different words means more word
variation, and therefore a denser and more complex text. The type-token ratio
(TTR), the ratio of the number of word types to the number of words in a text,
was also calculated (Xiaofei, p. 193). However, the NDW and TTR measures are
heavily dependent on text length; TTR in particular yields much lower values
for longer texts, since the relative chance of repetition increases with text
length (Xiaofei, p. 197). The chosen lyrics were kept relatively short (about
200 words on average) to compensate for this effect in the NDW and TTR
analyses; even so, other lexical variation formulae were also used to get more
convincing results.
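The counts described above are simple enough to sketch in a few lines. This is a minimal illustration, not the code of the online analyser: the tokenizer, the function names and the single random sample used for NDWR-50 are all assumptions, and the example sentence is made up.

```python
import random
import re

def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def ndw(tokens):
    """Number of different words (word types)."""
    return len(set(tokens))

def ttr(tokens):
    """Type-token ratio: word types divided by word tokens."""
    return ndw(tokens) / len(tokens)

def ndw_first(tokens, n=50):
    """NDW counted over the first n tokens only."""
    return ndw(tokens[:n])

def ndw_random(tokens, n=50, seed=0):
    """NDW counted over n randomly chosen tokens (one sample, an assumption)."""
    rng = random.Random(seed)
    return ndw(rng.sample(tokens, min(n, len(tokens))))

tokens = tokenize("We rock all night, we rock all day, we rock.")
print(ndw(tokens), ttr(tokens))  # 5 types over 10 tokens
```

Fixing the sample size at fifty words is what makes NDW-50 and NDWR-50 comparable across songs of different lengths.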
These were verb variation, corrected verb variation, noun variation, adverb
variation, adjective variation and modifier variation. They are essentially the
part-of-speech subcategories of lexical variation, which is the range of the
writer's vocabulary used in a text (Xiaofei, p. 195). The Uber index was also
used to describe lexical richness in the lyrics:
$U = (\log N)^2 / (\log N - \log V)$, where $N$ is the total number of tokens
and $V$ is the number of different words.
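As a small worked sketch of the Uber index formula given above: the function name `uber_index` is hypothetical, and the use of base-10 logarithms is an assumption, since the base is not stated here and the value of U depends on it.

```python
import math

def uber_index(n_tokens, n_types):
    """Uber index U = (log N)^2 / (log N - log V), with base-10 logs (assumed)."""
    if n_types >= n_tokens:
        # With no repetition the denominator is zero and U is undefined.
        raise ValueError("U is undefined when every token is a different word")
    log_n = math.log10(n_tokens)
    log_v = math.log10(n_types)
    return log_n ** 2 / (log_n - log_v)

# Round numbers chosen for illustration: N = 100 tokens, V = 10 types,
# so log N = 2, log V = 1 and U = 4 / 1.
print(uber_index(100, 10))
```

A text with more different words for the same length has a smaller denominator, so a higher U indicates a richer vocabulary.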
Finally, the research used a number of readability formulas to determine the
complexity and density of the lyrics. These were the Flesch Reading Ease score,
the Gunning fog index, the Flesch-Kincaid grade level, the Coleman-Liau index
and the SMOG index, all of them available at readabilityformulas.com. Each
formula gives a number that can be placed on a scale to see how complex the
analysed text is.
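To show what such a formula looks like in practice, here is a minimal sketch of the Coleman-Liau index, which needs only letter, word and sentence counts: CLI = 0.0588 L - 0.296 S - 15.8, where L is the number of letters per 100 words and S the number of sentences per 100 words. The tokenization and sentence splitting are assumptions, so the values may differ slightly from those of the website used in this research.

```python
import re

def coleman_liau(text):
    """Coleman-Liau index from letters and sentences per 100 words."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        raise ValueError("text contains no words")
    letters = sum(ch.isalpha() for word in words for ch in word)
    # Assumption: every run of ., ! or ? ends one sentence.
    sentences = len(re.findall(r"[.!?]+", text)) or 1
    letters_per_100 = letters / len(words) * 100
    sentences_per_100 = sentences / len(words) * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

# Made-up example: short words and short sentences give a very low grade level.
print(coleman_liau("The cat sat. The dog ran."))
```

This also shows why the text had to be punctuated first: without sentence boundaries, the sentence count in the formula would be meaningless.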
The corpus for the research was built from sixteen songs, eight from the David
Lee Roth era and eight from the Sammy Hagar era. The songs chosen from the
David Lee Roth era were Panama, Runnin' with the Devil, Little Dreamer, Light
Up the Sky, And the Cradle Will Rock, Unchained, Little Guitars, and Jump;
those from the Sammy Hagar era were Feelin', Aftershock, Seventh Seal, The
Dream is Over, Runaround, Cabo Wabo, Love Walks In, and Summer Nights.
All song lyrics were punctuated and transformed into continuous text. This was
necessary because the lexical complexity tools only work on continuous,
punctuated bodies of text. Lyric length was also an important factor when
compiling the corpus: songs of similar length give more comparable results, and
they limit the text-length dependence of the NDW and TTR formulae mentioned
earlier.
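The preparation step can be pictured with the sketch below. It is only one possible approach and an assumption on every point: it treats each lyric line as a sentence, adds a full stop where a line has no closing punctuation, and joins the lines into one paragraph. The function name `lyrics_to_text` is hypothetical, and the punctuation of the actual corpus may well have been done by hand.

```python
def lyrics_to_text(lyrics):
    """Turn line-broken lyrics into one continuous, punctuated paragraph."""
    sentences = []
    for line in lyrics.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the blank lines between verses
        if line[-1] not in ".!?":
            line += "."  # assumption: each lyric line is one sentence
        sentences.append(line[0].upper() + line[1:])
    return " ".join(sentences)

# Made-up lines, not taken from any song in the corpus.
print(lyrics_to_text("first line\n\nsecond line!\n  third line  "))
```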
After the punctuated, formatted lyrics had been entered into each analysis
tool individually, the data were summarized in a spreadsheet. A few additional
but crucially important values were then calculated: the mean value of each
index for each era (e.g. the mean NDW of each era), and a significance value
from a paired t-test based on Student's t-distribution. The t-test was
necessary to determine whether the difference between the two groups is
statistically significant. The significance threshold was set at .05, which
means that all indices that yielded a p (significance) value below .05 were
considered statistically significant. Consequently, from here on this essay
deals only with the indices that yielded a low enough p value.
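The test described above can be sketched with the standard library alone. The numbers below are made-up placeholders, not the values from the spreadsheet, and the pairing of the first Roth song with the first Hagar song (and so on) is an assumption of the sketch. Instead of computing an exact p value, it compares the t statistic with 2.365, the two-tailed critical value at the .05 level for 7 degrees of freedom (eight pairs).

```python
import math
import statistics

def paired_t(first, second):
    """Return the paired t statistic and its degrees of freedom."""
    if len(first) != len(second):
        raise ValueError("a paired test needs equally long samples")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

T_CRITICAL_DF7 = 2.365  # two-tailed, alpha = .05, 7 degrees of freedom

# Placeholder NDW values for eight songs per era (invented for illustration).
roth = [80, 85, 90, 78, 88, 84, 92, 86]
hagar = [100, 104, 111, 95, 110, 101, 115, 108]

t, df = paired_t(roth, hagar)
print(df, abs(t) > T_CRITICAL_DF7)  # significant if |t| exceeds the critical value
```

An index counts as significant at the .05 level when the absolute t value exceeds the critical value, which is equivalent to a p value below .05.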
The overall results of the analysis show that the expectation described in the
introduction, namely that the lyrics from the Hagar era would be more complex
and dense, was fulfilled; i.e. Edward Van Halen made the right choice when he
wanted more lyrical depth and replaced David Lee Roth.
The results of the NDW analysis show that the number of different words is
significantly higher in the lyrics of the Hagar era, which had a mean NDW of
105.88 compared to 85.38 for the David Lee Roth era. The analysis of fifty
randomly chosen words (NDWR-50) yielded similar results, favouring the Hagar
era.
Of the word variation subcategories, verb and noun variation were
statistically significant, and both favour the Hagar era: verb variation was
0.24 compared to the earlier era's 0.19, and noun variation was 0.69 compared
to the David Lee Roth era's 0.50. Within word variation, then, the difference
was largest for nouns, while for verbs it was fairly small. This is probably
because the band's early lyrics contain many more repetitions than the Hagar
era's do. In the song Panama, for example, the one-word sentence "Panama!" is
repeated numerous times, which contributes a lot to the low noun variation
value of the David Lee Roth era.
Verb sophistication was also much higher for the second era of the band, with
a value of 0.61, more than one and a half times the 0.37 of the David Lee Roth
era. This shows that after Hagar joined the band, the lyrics became
considerably more complex, using more sophisticated, rarer words.
All lyrics yielded fairly similar results in terms of readability. The
Coleman-Liau index yielded the most similar results, averaging at the reading
level of a fifth-grader. However, the Gunning fog index, which approximates the
school grade level needed to understand a text, revealed that the David Lee
Roth era had much simpler lyrics than the Hagar era: the score of the Hagar
lyrics is 6.26, compared to 4.85 for the David Lee Roth lyrics.
As lexical richness is multidimensional (Xiaofei, 2012, p. 190), the analysis
should yield consistent results on all dimensions: lexical density, lexical
sophistication, lexical variation, and number of errors in vocabulary use. The
last of these can be excluded, since lyrics are expected to have correct word
usage. As all four lexical sophistication formulae, three out of five density
indices and all ten variation indices show higher values for the later lyrics
of Van Halen, it can be concluded that the Hagar lyrics are lexically richer
than the band's early lyrics.
Although the type-token ratio differences turned out not to be significant in
the t-test, they must be mentioned here, as the word-frequency count yielded
fairly positive results: the type-token ratio generally ranged between 0.43 and
0.47. This means that a word occurs roughly twice on average within a song,
which is fairly impressive; in Murphey's (1992) research, for example, this
value was 0.29, much lower than the TTR of the Van Halen songs (p. 773).
However, it must be noted that the songs analysed were fairly short in order to
obtain worthwhile results from the NDW analysis, so this TTR value is not far
above average levels.
It must be mentioned, though, that lyrics are considered spoken texts
(although they are pre-written). Spoken texts have a much lower lexical density
than their written counterparts, and they may be affected by factors such as
the degree of interactiveness (Xiaofei, p. 195).
To conclude, the analysis showed that the expectations of the essay (and of
Edward Van Halen) were fulfilled: comparing the lexical complexity and
readability values of the two distinct eras of the band showed that the second
era, with Sammy Hagar, had considerably more lyrical depth, with more
sophisticated words, less repetition and higher lexical density.
Analysis spreadsheet here: https://docs.google.com/spreadsheets/d/1VQgHpyz_xCV8uyvI0qaroa2uG_Xrb1zo4-6y_zJseiU/edit?usp=sharing
References
Murphey, Tim. (1992). The Discourse of Pop Songs. TESOL Quarterly, 26, 770-74.
My Byline Media. (2012). Free Readability Formulas [online computer software]. readabilityformulas.com/free-readability-formula-tests.php
Xiaofei, Lu. (2013). Lexical Complexity Analyser [online computer software]. aihaiyang.com/synlex/lexical
Xiaofei, Lu. (2012). The Relationship of Lexical Richness to the Quality of ESL Learners’ Oral Narratives. The Modern Language Journal, 96, 190-208.