procedure
The pipeline for transforming raw governance documents into structured institutional data consists of five stages: (1) data normalization and pairing rules that align governance snapshots across versions, (2) coreference resolution to reduce pronoun ambiguity, (3) Semantic Role Labeling (SRL) to map each sentence to a predicate-argument structure, (4) embedding and clustering with BERTopic to capture governance topologies, and (5) evaluation with metrics such as cluster entropy and per-project cluster counts.
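The evaluation stage (5) can be illustrated with a minimal sketch. It assumes that stage (4) has already produced one cluster label per sentence, and that entropy means Shannon entropy over each project's label distribution; the source does not specify the exact definition. The function names, the `(project, cluster_label)` input shape, and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def cluster_entropy(labels):
    """Shannon entropy (in bits) of a list of cluster labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


def per_project_stats(assignments):
    """Summarize cluster usage per project.

    assignments: iterable of (project, cluster_label) pairs, one per sentence.
    Returns a dict mapping each project to its distinct-cluster count and
    the entropy of its cluster label distribution.
    """
    by_project = defaultdict(list)
    for project, label in assignments:
        by_project[project].append(label)
    return {
        project: {
            "n_clusters": len(set(labels)),
            "entropy": cluster_entropy(labels),
        }
        for project, labels in by_project.items()
    }


# Hypothetical toy data: cluster labels for sentences from two projects.
assignments = [
    ("project_a", 0), ("project_a", 0), ("project_a", 1), ("project_a", 1),
    ("project_b", 2), ("project_b", 2), ("project_b", 2),
]
stats = per_project_stats(assignments)
for project, summary in sorted(stats.items()):
    print(project, summary)
```

In this toy input, a project whose sentences split evenly across two clusters has an entropy of 1 bit, while a project whose sentences all fall in a single cluster has an entropy of 0. Higher values indicate governance text spread across more topics.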
Sources
- Patterns in the Transition From Founder-Leadership to Community ... (arxiv.org)
Referenced by nodes (1)
- coreference resolution concept