Email : editor.ijarmjournals@gmail.com

ISSN : 2583-9667, Impact Factor: 6.49

Contact : +91 7053938407

Article Abstract

International Journal of Advance Research in Multidisciplinary, 2025;3(4):131-133

The Data-Mined Canon: A Computational Analysis of Thematic Evolution in 20th Century British Poetry

Author : Dr. Bala Rani

Abstract

This study employs computational text mining and Natural Language Processing (NLP) techniques to map and analyze the thematic evolution of 20th-century British poetry. Moving beyond traditional, subjective literary historiography, we construct a quantitative "data-mined canon" from a corpus of over 50,000 poems by 120 canonical and marginalized poets, sourced from digital archives. Using Latent Dirichlet Allocation (LDA) for topic modeling and diachronic word embedding alignment, we identify dominant thematic clusters (e.g., War & Trauma, Urban Modernity, Nature & Ecology, Myth & Archetype, Domestic & Introspective) and trace their flux across four temporal periods: Edwardian/Georgian (1900-1918), Modernist (1919-1945), Post-War (1946-1979), and Late-Century (1980-1999). Our analysis reveals: 1) a sharp, quantifiable thematic rupture caused by World War I, with the War & Trauma cluster displacing Pastoral Idealism; 2) the persistent, albeit transforming, presence of nature poetry, shifting from romantic escapism to environmental anxiety; 3) the rise of a distinct Domestic & Introspective cluster post-1950, correlating with the "Movement" poets and late-century explorations of identity; and 4) evidence of thematic "echoes," where earlier themes re-emerge in mutated forms. This computational approach challenges rigid periodization, demonstrating a more fluid and recursive model of literary change. It also surfaces overlooked thematic continuities in the work of women and post-colonial poets, prompting a re-evaluation of the canonical narrative. The paper argues for a complementary partnership between distant reading and close reading, where data-driven patterns generate new questions for qualitative interpretation.
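The period-level tracing described above can be illustrated with a minimal sketch. It assumes each poem already has a publication year and a topic distribution (for example, from a fitted LDA model); the period boundaries and topic labels come from the abstract, while the function names and the toy proportions are hypothetical and are not the study's data or code.

```python
from collections import defaultdict

# Period boundaries as stated in the abstract (inclusive years).
PERIODS = [
    ("Edwardian/Georgian", 1900, 1918),
    ("Modernist", 1919, 1945),
    ("Post-War", 1946, 1979),
    ("Late-Century", 1980, 1999),
]


def period_for(year):
    """Return the period name for a year, or None if it falls outside 1900-1999."""
    for name, start, end in PERIODS:
        if start <= year <= end:
            return name
    return None


def mean_topic_share(poems):
    """Average topic proportions per period.

    poems: iterable of (year, {topic_label: proportion}) pairs.
    A topic missing from a poem's distribution counts as 0 for that poem.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for year, dist in poems:
        period = period_for(year)
        if period is None:
            continue  # skip poems outside the four periods
        counts[period] += 1
        for topic, share in dist.items():
            sums[period][topic] += share
    return {
        period: {topic: total / counts[period] for topic, total in topics.items()}
        for period, topics in sums.items()
    }


# Toy input (hypothetical values, for illustration only).
poems = [
    (1912, {"Pastoral Idealism": 0.75, "War & Trauma": 0.25}),
    (1917, {"Pastoral Idealism": 0.25, "War & Trauma": 0.75}),
    (1922, {"War & Trauma": 0.5, "Urban Modernity": 0.5}),
]
shares = mean_topic_share(poems)
```

Comparing a cluster's mean share across consecutive periods is one simple way to quantify a displacement such as War & Trauma overtaking Pastoral Idealism; the study itself may use a different aggregation.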

Keywords

Computational Literary Studies, Digital Humanities, Topic Modeling, 20th Century British Poetry, Thematic Evolution, Distant Reading, Literary History, Natural Language Processing (NLP), Canon Formation