MANEUVERING ROLE OF ADJECTIVES IN PUNJABI GRAMMAR: A CORPUS BASED COMPUTATIONAL GRAMMATICAL ANALYSIS

Authors

  • Sarwat Sohail
  • Dr. Ghulam Ali
  • Ansar Ali

Keywords:

Adjective, corpus-based, computational analysis, Punjabi fiction, Shahmukhi script, Usage-Based Model, descriptive analysis, representativeness, corpus linguistics, preprocessing, text normalization, tokenization, POS tagging, natural language processing

Abstract

This study presents a corpus-based computational analysis of Punjabi prose written in the Shahmukhi script, aiming to explore the grammatical patterns of the language. In particular, it identifies, traces, and describes a distinctive behaviour of the adjective. The Usage-Based Model of grammar, following Bybee (2001) and Langacker (1987), provides the framework for a descriptive analysis of the linguistic components of Punjabi. A corpus of approximately 3 million words was compiled from contemporary Punjabi fiction to ensure representativeness and stylistic diversity. The research integrates computational methods with corpus linguistics to examine how grammatical structures emerge from actual language use. The methodology involves systematic preprocessing, including text normalization and tokenization, followed by the development and application of a Part-of-Speech (POS) tagging model adapted for Shahmukhi Punjabi. Within this framework, the findings are expected to reveal the dominant grammatical constructions, high-frequency lexical classes, and patterns of morphological productivity specific to Punjabi fiction. The role of the adjective in signalling gender and number receives particular attention: adjectives such as ‘patla’ ('پتلا'), ‘lamman’ ('لماں'), and ‘chota’ ('چھوٹا'), among many others, inflect to show gender and number distinctions. This research adds to the limited body of knowledge on Punjabi grammatical organization and develops foundational tools for future natural language processing. The study brings out striking features of the native language and serves as a stepping stone for research on the computational analysis of neglected minority languages.
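
The preprocessing stage named in the abstract (text normalization followed by tokenization) could look roughly like the sketch below. It is an illustration only, not the authors' pipeline: the function names `normalize` and `tokenize`, the character mappings, and the separator set are all assumptions chosen for this example, and a real Shahmukhi normalizer would likely need a larger mapping table.

```python
import re
import unicodedata
from collections import Counter

# Assumed mappings: Arabic code points that are often typed in place of
# the letters conventionally used in Shahmukhi text.
CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})

# Optional vowel marks (U+064B..U+0652) and the tatweel stretch mark (U+0640).
DIACRITICS = re.compile("[\u064B-\u0652\u0640]")

# Assumed separator set: whitespace, Arabic-script punctuation
# (comma, semicolon, question mark, full stop), quotes, basic ASCII marks.
PUNCT = "\u060C\u061B\u061F\u06D4'\"\u2018\u2019.,!?()"
SEPARATORS = re.compile(r"[\s" + re.escape(PUNCT) + r"]+")


def normalize(text: str) -> str:
    """Apply NFC, unify look-alike letters, and drop optional diacritics."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(CHAR_MAP)
    return DIACRITICS.sub("", text)


def tokenize(text: str) -> list[str]:
    """Normalize the text, then split it on the separator set."""
    return [tok for tok in SEPARATORS.split(normalize(text)) if tok]


if __name__ == "__main__":
    sample = "'پتلا'، 'لماں'، 'چھوٹا'"
    tokens = tokenize(sample)
    # Token frequencies are the raw material for the frequency-based
    # observations a usage-based analysis relies on.
    print(Counter(tokens).most_common())
```

Normalizing before tokenizing means that spelling variants of the same adjective are counted as one type, which matters when frequency is used as evidence for grammatical patterns. The tokens produced here would then be passed to a POS tagger, which this sketch does not attempt to model.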

Published

2026-06-11